We also report human evaluation for LANI by asking raters if the generated path follows the instruction on a Likert-type scale of 1–5. Raters were shown the generated path, the reference path, and the instruction.

Parameters We use a horizon of 40 for both domains. During training, ...