System Info

I'm running into an issue where I'm unable to load a 4-bit or 8-bit quantized version of the Falcon or LLaMA models. This was working a couple of weeks ago. I'm running on Colab. I'm wondering if anyone knows of...
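For reference, here is a minimal sketch of the kind of load that fails, assuming a recent `transformers` with `bitsandbytes` and `accelerate` installed on a CUDA runtime; the model id and compute dtype are assumptions for illustration, not taken from the report:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "tiiuae/falcon-7b"  # assumption: any causal LM hub id could stand in here

# 4-bit quantization config; requires bitsandbytes and a CUDA GPU
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,  # assumed dtype, fp16 is a common choice
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",        # let accelerate place layers on the available GPU
    trust_remote_code=True,   # Falcon shipped custom modeling code at the time
)
```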
device map for the 4-bit model: {'': 0}

Traceback (most recent call last):
  File "C:\Users\xxxx\Deep\text-diffusion-webui\text-generation-webui\server.py", line 302, in <module>
    shared.model, shared.tokenizer = load_model(shared.model_name)
  File "C:\Users\xxxx\Deep\text-diffusion-webui\...
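The `{'': 0}` device map printed above is accelerate's notation for placing the entire module tree (the empty-string key is the model root) on GPU 0. A hedged sketch of passing such a map explicitly, with a hypothetical model id:

```python
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# {'': 0} maps the whole model to CUDA device 0, matching the log line above
model = AutoModelForCausalLM.from_pretrained(
    "tiiuae/falcon-7b",  # hypothetical model id for illustration
    quantization_config=BitsAndBytesConfig(load_in_4bit=True),
    device_map={"": 0},  # explicit equivalent of the printed device map
)
```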