[pip3] vector-quantize-pytorch==1.0.1
[conda] blas 1.0 mkl https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
[conda] cudatoolkit 9.0 h41a26b3_0 https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
[conda] libblas 3.9.0 12_osx64_mkl conda-forge
[conda] libcblas 3.9.0 12...
[2023-08-09 11:18:04,324] [INFO] [logging.py:96:log_dist] [Rank -1] quantize_bits = 8 mlp_extra_grouping = False, quantize_groups = 1
Using /home/bryce/.cache/torch_extensions/py39_cu117 as PyTorch extensions root...
Detected CUDA files, patching ldflags
Emitting ninja build file...
I finetuned (or rather, further pretrained) the model OpenChat (a Mistral 7B finetune) on my own data. This worked well, and inference produces good results. Now I want to merge the adapter weights into the original model so that I can quantize it in a further step. The issue is that c...
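For context, "merging" a LoRA adapter means folding the low-rank update `B @ A` (scaled by `alpha / r`) into the frozen base weights, after which the adapter is no longer needed; in the Hugging Face PEFT library this is what `merge_and_unload()` does. A minimal numerical sketch of that operation (all shapes and values here are illustrative assumptions, not from the post):

```python
import numpy as np

rng = np.random.default_rng(0)
d_out, d_in, r, alpha = 8, 8, 2, 16  # illustrative dimensions

W = rng.standard_normal((d_out, d_in))     # frozen base weight
A = rng.standard_normal((r, d_in)) * 0.01  # LoRA down-projection
B = rng.standard_normal((d_out, r)) * 0.01 # LoRA up-projection

# Merging folds the scaled low-rank update into the base weight,
# leaving one plain matrix that can be quantized like any other.
W_merged = W + (alpha / r) * (B @ A)

# Sanity check: a forward pass through the merged weight equals the
# base path plus the adapter path.
x = rng.standard_normal(d_in)
assert np.allclose(W_merged @ x, W @ x + (alpha / r) * (B @ (A @ x)))
```

Once merged and saved, the resulting checkpoint looks like an ordinary dense model to downstream quantization tooling.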
    mod: IRModule,
    named_params: List[Tuple[str, nn.Parameter]],
    args: CompileArgs,
    model_config,
) -> None:
    def _metadata():
        metadata = {
            "quantization": args.quantization.name,
            "model_type": args.model.name,
            "params": [
                {
                    "name": name,
                    "shape": list(param.shape),
                    "dtype": par...