3.2. Hindsight Bias Dealing with sparse task-solving rewards, HER utilizes the hindsight—the pseudo goal is going to be achieved in the near future—to learn from failures. Concretely, it relabels experience 𝑒=(𝑠,𝑎,𝑠′,𝑔,𝑟)e=(s,a,s′,g,r) with a pseudo goal 𝑔ℎ...