…36397) Summary: This PR adds `torch.float8_e4m3fn` support to cuSPARSELt and `to_sparse_semi_structured`. This lets users run fp8 + 2:4 sparse matmuls on Hopper GPUs with cuSPARSELt >= 0.6.2, via the `scaled_mm` API. ``` A = rand_sparse_semi_structured_mask(256, 128...
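For readers unfamiliar with the 2:4 pattern mentioned in the summary, here is a minimal pure-Python sketch of magnitude-based 2:4 pruning: in every contiguous group of four values, only the two with the largest magnitude are kept. This is an illustration only, not code from the PR; `prune_2_to_4` is a hypothetical helper name, and the sketch does not model the compressed values-plus-metadata layout that a sparse kernel library would actually consume.

```python
def prune_2_to_4(row):
    """Zero the two smallest-magnitude values in every group of four.

    Hypothetical helper for illustration; not part of PyTorch or cuSPARSELt.
    """
    if len(row) % 4 != 0:
        raise ValueError("row length must be a multiple of 4")
    pruned = []
    for start in range(0, len(row), 4):
        group = row[start:start + 4]
        # Indices of the two largest-magnitude entries in this group.
        keep = sorted(range(4), key=lambda i: abs(group[i]), reverse=True)[:2]
        # Keep those two entries, replace the other two with zero.
        pruned.extend(v if i in keep else 0.0 for i, v in enumerate(group))
    return pruned


row = [0.5, -2.0, 0.1, 3.0, 1.0, 0.2, -4.0, 0.0]
print(prune_2_to_4(row))
# → [0.0, -2.0, 0.0, 3.0, 1.0, 0.0, -4.0, 0.0]
```

Each group of four ends up with exactly two nonzero slots, which is the structure that makes a 2:4 sparse matmul possible.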
Tensors and Dynamic neural networks in Python with strong GPU acceleration - Update on "[sparse][semi-structured] Add float8 dtype support to 24 s… · pytorch/pytorch@d2434eb
Update on "[sparse][semi-structured] Add float8 dtype support to 24 s… · pytorch/pytorch@4497986