Supports calling from Python; ChatGLM-6B-class models can reach 10000+ tokens/s on a single GPU; supports GLM, LLaMA, and MOSS base models; runs smoothly on mobile...
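A figure such as "10000+ tokens/s" is a throughput number: tokens generated divided by wall-clock time. Figures of this size are commonly aggregated over a batch of sequences, though the snippet does not say how it was measured. A minimal sketch of the arithmetic, where the helper name `tokens_per_second` and the sample numbers are hypothetical and not taken from any benchmark:

```python
def tokens_per_second(token_counts, elapsed_seconds):
    """Aggregate throughput: total tokens over all sequences / wall-clock time."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return sum(token_counts) / elapsed_seconds


# Hypothetical run: 80 sequences of 256 generated tokens each, finished in 2.0 s.
rate = tokens_per_second([256] * 80, 2.0)
print(f"{rate:.0f} tokens/s")
```

Dividing by the batch size gives the per-sequence rate, which is the number a single user would perceive.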
6B
./bin/gpt-j -m models/gpt-j-6B/ggml-model.bin -p "This is an example"
# Install Python dependencies
python3 -m pip install -r ../requirements.txt
# Run the Cerebras-GPT 111M model
# Download from: https://huggingface.co/cerebras
python3 ../examples/gpt-2/convert-cerebras-to...
bin/quantize ../models/Llama-2-7b-chat/ggml-model-f16.GGUF ../models/Llama-2-7b-chat/ggml...
Push the corresponding binaries and library files:
adb push bin/* /data/local/tmp/bin/
adb push src/libggml.so /dat...