Hello, thank you for this interesting script. I am trying to test DeepSpeed inference, but it gives me an error (I am just using a small BLOOM model to start with). Any hint on what is going wrong here? Looks like some NCCL iss...
It fails with the following backtrace:
[hali001:51586:0:51586] Caught signal 7 (Bus error: nonexistent physical address)
=== backtrace (tid: 51586) ===
0 /root/ucx/install/lib/libucs.so.0(ucs_handle_error+0x78) [0xffff9117b368]
1 /root/ucx/install/lib/libucs.so.0(+0x2b184) ...
Credit the nonexistent mathematician Bourbaki, writing about “le théorème de Zorn”. By 1940, John Tukey, celebrated for the Fast Fourier Transform, wrote of “Zorn’s Lemma”. Tukey’s impression was that this was how people in Princeton spoke of it at the time. He seems to have been ...