Ongoing research training transformer language models at scale, including: BERT & GPT-2 - Add Megatron-LM pretrain function for the core. · gurpreet-dhami/Megatron-DeepSpeed@8a85d59
microsoft/DeepSpeed - fixes in _partition_param_sec function #5613 (Merged): samadejacobs merged 1 commit into microsoft:master from mmhab:fix_partition_...
Ongoing research training transformer language models at scale, including: BERT & GPT-2 - Add Megatron-LM pretrain function for the core. · argonne-lcf/Megatron-DeepSpeed@8a85d59
Ongoing research training transformer language models at scale, including: BERT & GPT-2 - Add Megatron-LM pretrain function for the core. · xinyu-intel/Megatron-DeepSpeed@8a85d59