Once the download finishes, move the chatglm-6b folder into the wenda/model folder. Note that you need to edit line 91 of the config.yml file and change cuda fp16 to a setting that fits your GPU's VRAM, otherwise you will run out of video memory.
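The setting in question lives in the glm6b block of config.yml. Below is a minimal sketch of what the choices look like, assuming the stock wenda layout (key names and the exact line number vary between versions); the rough VRAM figures follow the ChatGLM-6B README:

    llm_models:
      glm6b:
        path: "model/chatglm-6b"
        strategy: "cuda fp16"        # full precision, roughly 13 GB VRAM
        # strategy: "cuda fp16i8"    # int8 quantization, roughly 8 GB VRAM
        # strategy: "cuda fp16i4"    # int4 quantization, roughly 6 GB VRAM

Pick the lightest quantization your card requires; quantized weights cost some answer quality but avoid the out-of-memory crash.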
warnings.warn(
Building extension module wkv6...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
[1/2] /usr/bin/nvcc --generate-dependencies-with-compile --dependency-output wkv6_cuda.cuda.o.d -DTORCH_EXTENSION_NAME=wkv6...
I installed the CUDA toolkit following the link you provided and was then able to install deepspeed, but when I tried to run read.py I got page after page of warnings and error messages that whizzed by, ending with "RuntimeError: Error building extension 'transforme...
Please comment out line 96 of the config.yml file. Also, if you want to use RWKV, check your CUDA version: you must use the cuda_11.8.0_522.06 installer the author provides, otherwise it will error out. In addition, do not use the author-provided chatglm-6b-int4 (v1.1 English-enhanced) model; launching GLM6B with it will fail. To use GLM6B, download the model from here: THUDM/chatglm-6b at main (huggingface.co)...
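To make the model swap concrete, here is a hedged sketch of the same glm6b block pointing at the full checkpoint cloned from THUDM/chatglm-6b instead of the int4 variant (the paths are assumptions; use wherever you actually placed the folder):

    llm_models:
      glm6b:
        # path: "model/chatglm-6b-int4"   # v1.1 English-enhanced int4 build; reported to break GLM6B startup
        path: "model/chatglm-6b"          # full checkpoint downloaded from THUDM/chatglm-6b (huggingface.co)

The leading # is also what "comment out line 96" means: prefix that line with # so wenda's YAML loader skips it.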
building 'torch_scatter._segment_coo_cpu' extension
x86_64-linux-gnu-gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -g -fstack-protector-strong -Wformat -Werror=format-security -W...
cuda-cuxxfilt   169.2kB @   3.4kB/s  1.2s
cuda-compiler     1.3kB @  27.0 B/s  0.6s
filelock         14.1kB @ 283.0 B/s  0.1s
libffi           42.1kB @ 845.0 B/s  0.0s
blas             14.1kB @ 282.0 B/s  0.0s
libcurand         3.1kB @  61.0 B/s  0.6s
libcurand-dev    51.5MB @ 777.8kB/s  1m:3.6s
pip               1.4MB @  20.6kB/s  ...