Model loader: llama.cpp n-gpu-layers: 40 n_ctx: 3072 The error should happen. Now go to your device manager and disable one of the GPUs. Repeat step 1, and the issue does not happen. Screenshot No response Logs llm_load_print_meta: max token length = 48 llm_load_tensors: ggml...
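For context, those two settings map directly onto the llama-cpp-python parameters that the llama.cpp loader passes through. A minimal sketch, assuming a hypothetical local GGUF file at models/model.gguf and the llama-cpp-python bindings the webui wraps:

```python
from llama_cpp import Llama

# Hypothetical path; the webui fills this in from the selected model file.
llm = Llama(
    model_path="models/model.gguf",
    n_gpu_layers=40,   # "n-gpu-layers" in the UI: layers offloaded to the GPU(s)
    n_ctx=3072,        # "n_ctx" in the UI: context window size
    # tensor_split=[0.5, 0.5],  # optional: how to divide offloaded layers across two GPUs
)
```

Disabling one GPU in Device Manager changes how those offloaded layers are split, which is presumably why the repro above depends on both GPUs being visible.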
Bug/problem Screenshot This is the screenshot of the missing AutoAWQ loader. Logs Traceback (most recent call last): File "D:\StableDiffution\text-generation-webui\installer_files\env\Lib\site-packages\gradio\queueing.py", line 527, in process_events response = await route_utils.call_process_a...
Fixed in v1.9.1, I confirm; credits to @oobabooga for taking serious time to fix. bitsnaps closed this as completed on Jul 6, 2024
"G:\Oobabooga Text UI\oobabooga-windows\oobabooga-windows\text-generation-webui\modules\GPTQ_loader.py", line 32, in _load_quant model = AutoModelForCausalLM.from_config(config) File "G:\Oobabooga Text UI\oobabooga-windows\oobabooga-windows\installer_files\env\lib\site-packages\transformers...
yet. Traceback (most recent call last): File "F:\Programme\oobabooga_windows\text-generation-webui\server.py", line 70, in load_model_wrapper shared.model, shared.tokenizer = load_model(shared.model_name) File "F:\Programme\oobabooga_windows\text-generation-webui\modules\models.py", ...
shared.model, shared.tokenizer = load_model(selected_model, loader) ^^^ File "K:\oobabooga\modules\models.py", line 93, in load_model output = load_func_map[loader](model_name) ^^^ File "K:\oobabooga\modules\models.py", line 172, in huggingface_loader model = LoaderClass...
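That frame comes from the dispatch step where models.py routes the request to the selected backend. A rough sketch of the pattern, with simplified names rather than the actual module contents:

```python
from typing import Callable, Dict

def huggingface_loader(model_name: str):
    ...  # build the model with transformers

def llamacpp_loader(model_name: str):
    ...  # build the model with llama-cpp-python

# Maps the loader chosen in the UI to the function that constructs the model.
load_func_map: Dict[str, Callable] = {
    "Transformers": huggingface_loader,
    "llama.cpp": llamacpp_loader,
}

def load_model(model_name: str, loader: str):
    # The traceback above originates from this style of call.
    return load_func_map[loader](model_name)
```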
(model_name) File "E:\oobabooga_windows\text-generation-webui\modules\models.py", line 258, in llamacpp_loader model, tokenizer = LlamaCppModel.from_pretrained(model_file) File "E:\oobabooga_windows\text-generation-webui\modules\llamacpp_model.py", line 50, in from_pretrained self.model...
oobabooga/text-generation-webui withcatai/catai Here is a typical run using LLaMA v2 13B on M2 Ultra: $ make -j && ./main -m models/llama-13b-v2/ggml-model-q4_0.gguf -p "Building a website can be done in 10 simple steps:\nStep 1:" -n 400 -e I llama.cpp build info: ...
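The same run can be reproduced through the llama-cpp-python bindings that text-generation-webui uses instead of the ./main binary; a small sketch, assuming the same GGUF file is available locally:

```python
from llama_cpp import Llama

llm = Llama(model_path="models/llama-13b-v2/ggml-model-q4_0.gguf")

# Rough equivalent of the -p and -n flags above: prompt plus a 400-token budget.
out = llm("Building a website can be done in 10 simple steps:\nStep 1:", max_tokens=400)
print(out["choices"][0]["text"])
```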
oobabooga commented Sep 25, 2023 I have added the disable_exllama option for Transformers here: 36c38d7 LoRA training with GPTQ models should work now. Make sure to load the model using the Transformers loader with both auto-devices and disable_exllama checked. ...
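Outside the webui, the same combination can be approximated directly with transformers; a hedged sketch, assuming a GPTQ checkpoint at a hypothetical local path and a transformers version from that era (newer releases renamed disable_exllama to use_exllama):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, GPTQConfig

path = "models/my-gptq-model"  # hypothetical local checkpoint

# "auto-devices" roughly corresponds to device_map="auto";
# "disable_exllama" maps to the GPTQConfig flag of the same name.
quant_config = GPTQConfig(bits=4, disable_exllama=True)

model = AutoModelForCausalLM.from_pretrained(
    path,
    device_map="auto",
    quantization_config=quant_config,
)
tokenizer = AutoTokenizer.from_pretrained(path)
```

With the model loaded this way, PEFT/LoRA adapters can be attached for training; disabling the exllama kernels matters because they do not support the backward pass.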
Fork from text-generation-webui https://github.com/oobabooga/text-generation-webui/blob/main/modules/llamacpp_model.py """ import logging import re from typing import Dict @@ -62,11 +63,11 @@ def __del__(self): self.model.__del__() @classmethod def from_pr...
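For readers skimming the diff, the surrounding class follows a small wrapper pattern: a classmethod constructor owns a llama_cpp.Llama instance and __del__ releases it. A simplified sketch, not the actual module, with placeholder parameter values:

```python
from llama_cpp import Llama

class LlamaCppModel:
    def __init__(self):
        self.model = None

    def __del__(self):
        # Free the llama.cpp context explicitly when the wrapper is collected.
        if self.model is not None:
            self.model.__del__()

    @classmethod
    def from_pretrained(cls, path: str):
        result = cls()
        # Placeholder settings; the real loader reads them from the UI / shared args.
        result.model = Llama(model_path=path, n_ctx=2048, n_gpu_layers=0)
        return result
```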