           O1 algorithm  6.48s/iteration  ↓ 10.87%  ↓ 0.00%  < 0.2%   910B*8P
Bloom-7B   baseline      5.45s/iteration  --        --       --       910B*8P
           O1 algorithm  5.49s/iteration  ↓ 12.68%  ↓ 0.7%   < 0.01%  910B*8P
LLama-32B  baseline      5.23s/iteration  --        --       --       910B*16P
           O1 algorithm  5.28s/iteration  ↓ 15.93%  ↓ 0.95%  < 0.02%  910B*16P
LLama-7B   distributed...
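The column headers are not part of this excerpt, so what each column means is an assumption here. One reading that fits the numbers is that the second ↓ column reports the drop in throughput (iterations per second) of the O1 algorithm relative to its baseline. The sketch below checks that reading against the Bloom-7B and LLama-32B rows. The helper name `throughput_drop_percent` is hypothetical and only illustrates the arithmetic.

```python
def throughput_drop_percent(baseline_s: float, optimized_s: float) -> float:
    """Relative drop in iterations per second, in percent.

    Throughput is 1 / (seconds per iteration), so the drop is
    1 - (1 / optimized_s) / (1 / baseline_s) = 1 - baseline_s / optimized_s.
    """
    return (1.0 - baseline_s / optimized_s) * 100.0


# Seconds per iteration copied from the rows above: (baseline, O1 algorithm).
timings = {
    "Bloom-7B": (5.45, 5.49),
    "LLama-32B": (5.23, 5.28),
}

for model, (baseline_s, o1_s) in timings.items():
    print(f"{model}: {throughput_drop_percent(baseline_s, o1_s):.2f}%")
# Bloom-7B: 0.73%
# LLama-32B: 0.95%
```

Under this assumption the computed values agree with the reported ↓ 0.7% and ↓ 0.95%. The first ↓ column cannot be rederived from the timings alone.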