python3 -m pip install -r requirements.txt # generate GGUF model python convert-hf-to-gguf.py <MODEL_PATH> --outfile <GGUF_PATH> --model-name deepseekcoder # use q4_0 quantization as an example ./quantize <GGUF_PATH> <OUTPUT_PATH> q4_0 ./main -m <OUTPUT_PATH> -n 128 -p ...
10.8% and 5.9% respectively on HumanEval Python, HumanEval Multilingual, MBPP and DS-1000. Surprisingly, our DeepSeek-Coder-Base-7B reaches the performance of CodeLlama-34B. The DeepSeek-Coder-Instruct-33B model after instruction tuning outperforms GPT35-turbo on HumanEval and achieves comparable...
python3 -m pip install -r requirements.txt # generate GGUF model python convert-hf-to-gguf.py <MODEL_PATH> --outfile <GGUF_PATH> --model-name deepseekcoder # use q4_0 quantization as an example ./quantize <GGUF_PATH> <OUTPUT_PATH> q4_0 ./main -m <OUTPUT_PATH> -n 128 -p ...
Step 2: Further Pre-training using an extended 16K window size on an additional 200B tokens, resulting in foundational models (DeepSeek-Coder-Base). Step 3: Instruction Fine-tuning on 2B tokens of instruction data, resulting in instruction-tuned models (DeepSeek-Coder-Instruct). ...
Surprisingly, our DeepSeek-Coder-Base-7B reaches the performance of CodeLlama-34B. The DeepSeek-Coder-Instruct-33B model after instruction tuning outperforms GPT35-turbo on HumanEval and achieves comparable results with GPT35-turbo on MBPP. More evaluation details can be found in the Detailed ...
python3 -m pip install -r requirements.txt # generate GGUF model python convert-hf-to-gguf.py <MODEL_PATH> --outfile <GGUF_PATH> --model-name deepseekcoder # use q4_0 quantization as an example ./quantize <GGUF_PATH> <OUTPUT_PATH> q4_0 ./main -m <OUTPUT_PATH> -n 128 -p ...
Surprisingly, our DeepSeek-Coder-Base-7B reaches the performance of CodeLlama-34B. The DeepSeek-Coder-Instruct-33B model after instruction tuning outperforms GPT35-turbo on HumanEval and achieves comparable results with GPT35-turbo on MBPP. More evaluation details can be found in the Detailed ...
python3 -m pip install -r requirements.txt # generate GGUF model python convert-hf-to-gguf.py <MODEL_PATH> --outfile <GGUF_PATH> --model-name deepseekcoder # use q4_0 quantization as an example ./quantize <GGUF_PATH> <OUTPUT_PATH> q4_0 ./main -m <OUTPUT_PATH> -n 128 -p ...
10.8% and 5.9% respectively on HumanEval Python, HumanEval Multilingual, MBPP and DS-1000. Surprisingly, our DeepSeek-Coder-Base-7B reaches the performance of CodeLlama-34B. The DeepSeek-Coder-Instruct-33B model after instruction tuning outperforms GPT35-turbo on HumanEval and achieves comparable...
Surprisingly, our DeepSeek-Coder-Base-7B reaches the performance of CodeLlama-34B. The DeepSeek-Coder-Instruct-33B model after instruction tuning outperforms GPT35-turbo on HumanEval and achieves comparable results with GPT35-turbo on MBPP. More evaluation details can be found in the Detailed ...