All RFCL training results from the paper are publicly available on Weights & Biases for you to download and compare against: https://wandb.ai/stonet2000/RFCL-Sparse. See the reports section for more organized views of the results.
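
If you would rather pull the runs programmatically than browse the web UI, here is a minimal sketch using the W&B public API (`wandb.Api().runs(...)`). It only lists each run's name, state, and the names of its logged summary metrics, since the metric keys used in this project are not documented here. The helper names `fetch_runs` and `summarize_runs` are hypothetical and not part of the RFCL code, and depending on your setup `wandb.Api()` may ask for an API key even for a public project.

```python
from typing import Any, Dict, Iterable, List


def fetch_runs(project_path: str = "stonet2000/RFCL-Sparse") -> Iterable[Any]:
    """Return the runs of a W&B project (needs `pip install wandb` and network access)."""
    import wandb  # imported lazily so summarize_runs works without wandb installed

    api = wandb.Api()
    return api.runs(project_path)


def summarize_runs(runs: Iterable[Any]) -> List[Dict[str, Any]]:
    """Collect each run's name, state, and the names of its summary metrics."""
    rows = []
    for run in runs:
        # Keys starting with "_" (e.g. _step, _runtime) are W&B bookkeeping, not metrics.
        metric_names = sorted(k for k in run.summary.keys() if not k.startswith("_"))
        rows.append({"name": run.name, "state": run.state, "metrics": metric_names})
    return rows


if __name__ == "__main__":
    for row in summarize_runs(fetch_runs()):
        print(row["name"], row["state"], row["metrics"])
```

Once you know which metric names a run logs, `run.history()` returns the logged values over time for comparison against your own runs.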
Not in the paper but supported in the code: you can train on demonstrations that have no action labels, only environment states. Sample efficiency and wall-clock time will be worse, but this makes it easy to train on demos that lack action labels...