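Introduction to lm-eval-harness. lm-eval-harness is an open-source framework used mainly to evaluate the performance of language models across different tasks. Its design goal is to give researchers and developers a unified, extensible evaluation tool that supports standardized testing on a range of natural language processing tasks. The framework's modular design simplifies the evaluation workflow: users can quickly configure experiment parameters, load datasets, and generate detailed evaluation reports. lm-eval-harness's...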
To install the lm-eval package from the GitHub repository, run:
    git clone https://github.com/EleutherAI/lm-evaluation-harness
    cd lm-evaluation-harness
    pip install -e .
We also provide a number of optional dependencies for extended functionality. Extras can be installed via pip install -e "....
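After an editable install like the one above, it can be useful to confirm that the package is importable from the current environment. The following is a minimal sketch using only the Python standard library; the helper name `is_installed` is hypothetical and not part of lm-eval, and it assumes the package is imported as `lm_eval`.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be located without importing it."""
    # find_spec returns None when no top-level module of that name is found.
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # Assumption: the editable install exposes the import name "lm_eval".
    print("lm_eval importable:", is_installed("lm_eval"))
```

This only checks that the module can be found; it does not verify that any optional extras were installed.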
lm_eval --model vllm --model_args "pretrained=$model_identifier,tensor_parallel_size=$number_of_gpus,dtype=auto" --tasks $task_name --batch_size auto --log_samples --output_path "output/${model_identifier}_${task_name}" © 2024 GitHub, Inc.
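The command above can also be assembled programmatically, which avoids shell-quoting mistakes when the model identifier or task name comes from a variable. This is a sketch only: the function name `build_lm_eval_command` and the `output_root` parameter are hypothetical, and the flags are copied from the command as quoted rather than checked against any particular lm-eval version.

```python
import shlex


def build_lm_eval_command(model_identifier, task_name, number_of_gpus,
                          output_root="output"):
    """Build the argv list for the lm_eval vllm invocation quoted above."""
    # Same comma-separated key=value string that the shell command passes
    # to --model_args.
    model_args = (
        f"pretrained={model_identifier},"
        f"tensor_parallel_size={number_of_gpus},"
        "dtype=auto"
    )
    return [
        "lm_eval",
        "--model", "vllm",
        "--model_args", model_args,
        "--tasks", task_name,
        "--batch_size", "auto",
        "--log_samples",
        "--output_path", f"{output_root}/{model_identifier}_{task_name}",
    ]


if __name__ == "__main__":
    # Placeholder values for illustration; substitute a real model and task.
    argv = build_lm_eval_command("my-org/my-model", "lambada", 2)
    # shlex.join (Python 3.8+) renders the list as a safely quoted shell line.
    print(shlex.join(argv))
```

The resulting list can be passed to `subprocess.run(argv, check=True)` once lm_eval and vllm are installed.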
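During the evaluation, we first benchmarked the C-Eval model using lm evaluation harness. Comparing C-Eval with other mainstream large language models across the evaluation metrics, we found that C-Eval has some advantages in text generation and language understanding. In particular, C-Eval showed strong generalization and robustness when handling long texts and complex contexts. Next, we used vllm to carry out a more... on the C-Eval model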
Official implementation of Phi-Mamba. A MOHAWK-distilled model (Transformers to SSMs: Distilling Quadratic Knowledge to Subquadratic Models) - lm-eval-harness submodule · goombalab/phi-mamba@aeccfa7
lm-eval-harness/requirements.txt (branch main, 1 line, 5 bytes): -e .
examples_deepspeed/MoE/ds_evalharness.sh (3 changes: 2 additions & 1 deletion) @@ -28,7 +28,7 @@ TASKS="lambada" VOCAB_FILE=/data/Megatron-LM/data/gpt2-vocab.json MERGE_FILE=/data/Megatron-LM/data/gpt2-merges.txt export HF...
Official implementation of Phi-Mamba. A MOHAWK-distilled model (Transformers to SSMs: Distilling Quadratic Knowledge to Subquadratic Models) - phi-mamba/lm_harness_eval.py at main · goombalab/phi-mamba
Command: python eval/lm_eval_harness.py --checkpoint_dir checkpoints/stabilityai/stablelm-base-alpha-3b --precision "bf16-true" --eval_tasks "[gsm8k]" --batch_size 4 --save_filepath "results-stablelm-3b_gsm8k.json" Running greedy_until requests ...
Repo for "LoLCATs: On Low-Rank Linearizing of Large Language Models" - lolcats/lm_eval_harness/models.py at main · HazyResearch/lolcats