```python
from transformers import BitsAndBytesConfig

# Create a BitsAndBytesConfig object
quantization_config = BitsAndBytesConfig(
    load_in_4bit=True,               # quantize the model to 4-bit while loading
    bnb_4bit_quant_type="nf4",       # use the NF4 quantization type
    bnb_4bit_use_double_quant=True,  # nested (double) quantization for extra memory savings
)
```
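For context, this is what load-time quantization looks like with BnB: the config is passed straight to `from_pretrained`, and the weights are quantized as they are loaded. A minimal sketch; the checkpoint name is just a placeholder, any model works:

```python
from transformers import AutoModelForCausalLM

# from_pretrained picks up quantization_config and quantizes the
# weights to 4-bit on the fly, during load itself.
model = AutoModelForCausalLM.from_pretrained(
    "facebook/opt-350m",  # placeholder checkpoint
    quantization_config=quantization_config,
)
```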
The ask is to also support Hugging Face's own Optimum Quanto. Right now it is possible to use it, but only as post-load, on-demand quantization; there is no option to use it like BnB or TorchAO, where the quantization is applied automatically during load itself.

@yiyixuxu @sayakpaul @DN6 @asomoza ...
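To make the gap concrete, here is a minimal sketch of both flows. The post-load path uses the existing `optimum.quanto` API (`quantize` / `freeze`); the load-time path is hypothetical: `QuantoConfig` and its `weights` argument are assumed names that mirror how `BitsAndBytesConfig` plugs into `from_pretrained`, not an existing API.

```python
from diffusers import FluxTransformer2DModel
from optimum.quanto import freeze, qint8, quantize

# Today: the full-precision checkpoint is materialized first,
# then quantized on demand. Peak memory is paid for the
# unquantized weights before quantize() ever runs.
model = FluxTransformer2DModel.from_pretrained(
    "black-forest-labs/FLUX.1-dev", subfolder="transformer"
)
quantize(model, weights=qint8)
freeze(model)

# Desired: quantize during load itself, like BnB / TorchAO.
# NOTE: QuantoConfig is a hypothetical config used for illustration;
# it mirrors BitsAndBytesConfig and does not exist today.
#
# quant_config = QuantoConfig(weights="int8")
# model = FluxTransformer2DModel.from_pretrained(
#     "black-forest-labs/FLUX.1-dev",
#     subfolder="transformer",
#     quantization_config=quant_config,
# )
```

The load-time path would avoid ever materializing the full-precision weights, which is exactly what the BnB and TorchAO backends already do.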