feat(triton-linalg): update triton-linalg and update to triton3.0.x(7… Oct 12, 2024 include feat(triton-linalg): update triton-linlag Feb 7, 2025 lib feat(triton-linalg): update triton-linlag Feb 7, 2025 test feat(triton-linalg): update triton-linlag Feb 7, 2025 tools feat(triton-...
RegisterTritonLinalgDialects.h triton-linalg-opt.cpp include lib test tools triton .gitignore .gitmodules ACKNOWLEDGMENTS CMakeLists.txt CODE_OF_CONDUCT.md LICENSE README.md triton-linalg/bin/CMakeLists.txt Latest commit cambricon feat(triton): init repo 4d27159· May 28, 2024...
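1. Triton programming optimization: - The author uses a "two-step" strategy for kernel optimization: first shallow optimization (operator replacement, merging kernels, etc.), then deep optimization (analyzing the IR and using performance analysis tools such as perf). If shallow optimization already comes close to operator-library performance, deep optimization tunes further by examining memory-access behavior and the assembly code. - Triton's execution flow goes from Triton-lang to the Triton GPU dialect, then to LLVM IR, and finally generates...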
triton-shared-opt --triton-to-linalg %file 2. Backend Component The Triton middle layer is intended to be used as a component in a Triton back-end. This can be accomplished by adding the CMake targets it produces and its header files to that back-end. An example back-end...
python Merge openai/triton-to-linalg Jun 20, 2023 test Add TritonToLinalg Jun 17, 2023 unittest [DOC] Fix syntax errors, typos, formatting; increase consistency (tri… Mar 17, 2023 .clang-format Merge triton-mlir branch - Complete rewrite of the backend from scr… Dec 21, 2022 .editorconf...
triton-linalg Public Development repository for the Triton-Linalg conversion C++ 185 Apache-2.0 19 0 0 Updated Feb 7, 2025 Top languages C++ Python ...
rust algebra matrix multiplication linear linalg matmul Updated Mar 30, 2019 Rust alprn42 / Instruction-Counter Star 1 Code Issues Pull requests In this project, the number of instructions executed by a C program is counted with Pin and C++. counter cpp pin registers instruction matmul instruction-...
Add @nikitaved to torch.linalg CODEOWNERS/persons_of_interest (#1… Feb 5, 2025 CODE_OF_CONDUCT.md Create CODE_OF_CONDUCT.md Feb 29, 2020 CONTRIBUTING.md [Easy] Add ROCm support to nightly pull tool (#141282) Dec 27, 2024 Dockerfile docker: Use miniforge, install from pip (#134274...
🐛 Describe the bug The torch.norm and torch.linalg.norm functions give wrong results for a torch.complex32 tensor. The results are correct for the torch.complex64 and torch.complex128 types. The problem exists for the p=2 and p=1 norms as I have te...
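As a point of comparison for the report above, the expected p=1 and p=2 values of a complex vector can be computed independently of PyTorch. The following is a minimal pure-Python sketch; `vector_norm` and `sample` are hypothetical names chosen for illustration and are not taken from the issue, and the sketch does not reproduce the complex32 behavior itself.

```python
import math


def vector_norm(values, p):
    """Reference p-norm of a sequence of complex numbers (p = 1 or 2 only)."""
    # The norm of a complex vector is defined on the magnitudes of its entries.
    magnitudes = [abs(v) for v in values]
    if p == 1:
        return sum(magnitudes)
    if p == 2:
        return math.sqrt(sum(m * m for m in magnitudes))
    raise ValueError("this sketch only covers p=1 and p=2")


# Illustrative input: the magnitudes are 5 and 10.
sample = [3 + 4j, 6 - 8j]
print(vector_norm(sample, 1))  # 5 + 10
print(vector_norm(sample, 2))  # sqrt(25 + 100)
```

A result from any dtype that differs from these reference values by more than that dtype's rounding error would indicate a problem of the kind described.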
Test_case Generator for mlu-ops (https://github.com/Cambricon/mlu-ops). ffmpeg-mlu Public Integrated MLU-accelerated video processing into ffmpeg on Ubuntu/CentOS triton-linalg Public Development repository for the Triton-Linalg conversion tensorflow_modelzoo Public...