    return _no_grad_trunc_normal_(tensor, mean, std, a, b)
  File "/opt/conda/lib/python3.8/site-packages/torch/nn/init.py", line 46, in _no_grad_trunc_normal_
    tensor.erfinv_()
RuntimeError: "erfinv_cuda" not implemented for 'BFloat16'

"--deepspeed", "/data_share/share/csc/Qwen-VL-mas...