I trained a 33B model with `train.py` using DeepSpeed, but when I saved it with the `safe_save_model_for_hf_trainer` function, the resulting checkpoint was only about 400 MB. The DeepSpeed config is:

```json
{
  "bf16": {
    "enabled": "auto"
  },
  "optimizer": {
    "type": "AdamW",
    "...
```
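Is this possibly related to ZeRO parameter partitioning? My understanding is that with ZeRO stage 3 each rank only holds a shard of the parameters, so a plain `state_dict()` on rank 0 would be far smaller than the full model. Below is a minimal sketch (my own code, not from safe-rlhf) of the consolidation step I would expect to be needed before saving; `consolidate_and_save`, `checkpoint_dir`, and `hf_model` are hypothetical names for my setup, and it assumes a ZeRO checkpoint has already been written with `engine.save_checkpoint()`:

```python
# Sketch only: gather the sharded ZeRO partitions into one full state dict,
# then save in the standard Hugging Face format. Assumes DeepSpeed's
# zero_to_fp32 utilities and a checkpoint written by engine.save_checkpoint().
from deepspeed.utils.zero_to_fp32 import get_fp32_state_dict_from_zero_checkpoint
from transformers import PreTrainedModel


def consolidate_and_save(hf_model: PreTrainedModel, checkpoint_dir: str, output_dir: str) -> None:
    # Reassemble the fp32 weights from the per-rank ZeRO shards on CPU.
    state_dict = get_fp32_state_dict_from_zero_checkpoint(checkpoint_dir)
    # Write the consolidated weights so the saved model matches the full 33B size.
    hf_model.save_pretrained(output_dir, state_dict=state_dict)
```

Is `safe_save_model_for_hf_trainer` expected to do something equivalent, or is there a flag I am missing in the `zero_optimization` section of the config above?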
```text
safe_rlhf
├── algorithms
│   ├── ppo
│   │   └── main.py
│   ├── ppo_lag
│   │   ├── main.py
│   │   └── trainer.py
│   └── ppo_reward_shaping
│       ├── main.py
│       └── trainer.py
├── evaluate
│   ├── cost.py
│   └── reward.py
├── logger.py
├── models
│   ├── normalizer.py
│   ├── pretrained.py
│   └── score_model
│       ├── __init__.py
│       ├── bloom
│       │   └── modeling_bloom.py
│       ├── gpt2
│       │   └── modeling_gpt2.py
│       ├── gpt_neo
│       │   └── mode...
```
```text
  275 │ trainer.train()
  276 │ trainer.save()
  277 │

/yyy/xxxx/safe-rlhf/safe_rlhf/trainers/supervised_trainer.py:63 in __init__

   60 │   self.init_models()
   61 │   self.init_datasets()
   62 │
❱  63 │   ...
```