Unlike standard meta-RL, which trains a policy to handle a distribution of tasks, this work trains on a distribution of tasks that are each underpinned by a different causal structure. It focuses on one question: can meta-RL produce an agent capable of causal reasoning? The work has three settin...
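To make the idea of "a distribution of tasks, each with its own causal structure" concrete, here is a minimal sketch of a task sampler. It is an illustration only, not the setup of the work summarized above. It assumes linear-Gaussian causal graphs over five nodes with edge weights drawn from {-1, 0, 1}. The names `sample_task` and `sample_values`, the noise level, and the intervention value are all hypothetical.

```python
import random


def sample_task(n_nodes=5, rng=None):
    """Sample one task: a random DAG given as a weight matrix.

    weights[j][i] is the linear effect of node j on node i. Only j < i is
    allowed, so the graph is acyclic by construction (assumed node ordering).
    """
    rng = rng or random.Random()
    weights = [[0] * n_nodes for _ in range(n_nodes)]
    for j in range(n_nodes):
        for i in range(j + 1, n_nodes):
            weights[j][i] = rng.choice([-1, 0, 1])
    return weights


def sample_values(weights, rng, intervention=None, noise_std=0.1):
    """Ancestral sampling from the graph.

    intervention=(node, value) implements do(node=value): the node is set
    directly and its incoming edges are ignored.
    """
    n = len(weights)
    values = [0.0] * n
    for i in range(n):
        if intervention is not None and intervention[0] == i:
            values[i] = intervention[1]
            continue
        mean = sum(weights[j][i] * values[j] for j in range(i))
        values[i] = rng.gauss(mean, noise_std)
    return values


rng = random.Random(0)
# Each meta-training episode would draw a fresh causal structure.
tasks = [sample_task(rng=rng) for _ in range(3)]
for weights in tasks:
    observed = sample_values(weights, rng)                           # passive observation
    intervened = sample_values(weights, rng, intervention=(0, 5.0))  # do(X0 = 5)
```

In a meta-RL loop, an agent would see data like `observed` or `intervened` within an episode and would have to infer the hidden `weights` of that episode's task to act well. The graph changes between episodes, so memorizing a single structure does not help.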
W. Zhang, X. Zhang, H. W. Deng, M. L. Zhang. Multi-instance causal representation learning for instance label prediction and out-of-distribution generalization. In Proceedings of the 36th Conference on Neural Information Processing Systems, New Orleans, USA, 2022. J. Brehmer, P. de Haan, P. Li...