Fixes LostRuins#1390. The logic for the combination of V100s and GGML_CUDA_FORCE_MMQ seems to be wrong on master. By default, when compiling without GGML_CUDA_FORCE_MMQ, the MMQ kernels should only …
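A minimal standalone sketch of the selection logic in question, assuming the intended default is to take the tensor-core/cuBLAS path on GPUs that have tensor cores (such as V100) unless the build forces MMQ. The helpers `gpu_has_tensor_cores()` and `use_mmq()` are hypothetical stand-ins that only illustrate the shape of the check, not the actual ggml-cuda code:

```cpp
// Sketch only: not the real ggml-cuda kernel selection, just the intended shape of it.
#include <cstdio>

// Assumption for illustration: compute capability 7.0+ (e.g. V100) has tensor cores.
static bool gpu_has_tensor_cores(int compute_capability_major) {
    return compute_capability_major >= 7;
}

// Intended default: use the quantized MMQ kernels only when the build explicitly
// forces them (GGML_CUDA_FORCE_MMQ defined), otherwise prefer the tensor-core/cuBLAS
// path on GPUs that have tensor cores.
static bool use_mmq(int compute_capability_major) {
#ifdef GGML_CUDA_FORCE_MMQ
    (void) compute_capability_major;
    return true;                        // compile-time override: always MMQ
#else
    return !gpu_has_tensor_cores(compute_capability_major);
#endif
}

int main() {
    const int cc_major = 7;             // e.g. a V100
    printf("use MMQ on cc %d.x: %s\n", cc_major, use_mmq(cc_major) ? "yes" : "no");
    return 0;
}
```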
From a related Ollama issue report (OS: Linux): older versions print `ggml_cuda_init: GGML_CUDA_FORCE_MMQ: YES`, while new versions (after 0.1.31) print `ggml_cuda_init: GGML_CUDA_FORCE_MMQ: no`. "I tried to force this via environment variables, but it did not help. Is there a way to configure this via Ollama?"
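GGML_CUDA_FORCE_MMQ is a compile-time define rather than a runtime setting, which is presumably why environment variables have no effect here. A minimal standalone sketch (not the actual ggml source) of that distinction:

```cpp
// Sketch of why an environment variable cannot flip this setting: the YES/no line
// reported at init time depends only on how the binary was compiled.
#include <cstdio>
#include <cstdlib>

int main() {
    // Runtime environment variables are readable here...
    const char * env = std::getenv("GGML_CUDA_FORCE_MMQ");
    printf("env GGML_CUDA_FORCE_MMQ: %s\n", env ? env : "(unset)");

    // ...but the reported value is fixed at build time by the preprocessor define.
#ifdef GGML_CUDA_FORCE_MMQ
    printf("ggml_cuda_init: GGML_CUDA_FORCE_MMQ: YES\n");
#else
    printf("ggml_cuda_init: GGML_CUDA_FORCE_MMQ: no\n");
#endif
    return 0;
}
```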
Commits that referenced this pull request:
- Clarify default MMQ for CUDA and LLAMA_CUDA_FORCE_MMQ flag (ggml-org#…) — 8591377
- arthw pushed a commit to arthw/llama.cpp that referenced this pull request (Jun 30, 2024): Clarify default MMQ for CUDA and LLAMA_CUDA_FORCE_MMQ flag (ggml-org#…) — b1776ff
- MagnusS0 pushed a commi...