lm_eval --model vllm --model_args "pretrained=$model_identifier,tensor_parallel_size=$number_of_gpus,dtype=auto" --tasks $task_name --batch_size auto --log_samples --output_path "output/${model_identifier}_${task_name}"
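For illustration, the shell variables above might be filled in like this; the model and task names here are placeholders, not values from the source:

model_identifier="meta-llama/Meta-Llama-3-8B"  # hypothetical model
number_of_gpus=2
task_name="gsm8k"
lm_eval --model vllm \
  --model_args "pretrained=$model_identifier,tensor_parallel_size=$number_of_gpus,dtype=auto" \
  --tasks $task_name --batch_size auto --log_samples \
  --output_path "output/${model_identifier}_${task_name}"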
Run the eval harness as usual with the wandb_args flag. Use this flag to pass arguments for initializing a wandb run (wandb.init) as a comma-separated string.
lm_eval \
  --model hf \
  --model_args pretrained=microsoft/phi-2,trust_remote_code=True \
  --tasks hellaswag,mmlu_abstract_algebr...
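A complete invocation might look like the following sketch; the un-truncated task list and the wandb project name are assumptions, not taken from the snippet above:

lm_eval \
  --model hf \
  --model_args pretrained=microsoft/phi-2,trust_remote_code=True \
  --tasks hellaswag \
  --batch_size 8 \
  --wandb_args project=lm-eval-harness-integration \
  --log_samples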
During evaluation, we first benchmarked the C-Eval model using the lm-evaluation-harness. Comparing C-Eval's performance against other mainstream large language models across the evaluation metrics, we found that C-Eval holds certain advantages in text generation and language understanding. In particular, it showed strong generalization and robustness when handling long texts and complex contexts. Next, we used vLLM to run the C-Eval model for more...
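A sketch of what such a run through the harness's vLLM backend could look like; the model path is a placeholder, and ceval-valid is assumed to be the C-Eval task group name shipped with lm-evaluation-harness:

lm_eval --model vllm \
  --model_args "pretrained=/path/to/model,tensor_parallel_size=1,dtype=auto" \
  --tasks ceval-valid \
  --batch_size auto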
Official implementation of Phi-Mamba. A MOHAWK-distilled model (Transformers to SSMs: Distilling Quadratic Knowledge to Subquadratic Models) - phi-mamba/lm_harness_eval.py at main · goombalab/phi-mamba
Command: python eval/lm_eval_harness.py --checkpoint_dir checkpoints/stabilityai/stablelm-base-alpha-3b --precision "bf16-true" --eval_tasks "[gsm8k]" --batch_size 4 --save_filepath "results-stablelm-3b_gsm8k.json"
Running greedy_until requests ...
[submodule "3rdparty/lm-evaluation-harness"] path = 3rdparty/lm-evaluation-harness url = https://github.com/EleutherAI/lm-evaluation-harness.git 1 change: 1 addition & 0 deletions 1 3rdparty/lm-evaluation-harness Submodule lm-evaluation-harness added at ca3d86 0 comments on commit aeccfa...
Ongoing research training transformer language models at scale, including: BERT & GPT-2 - Fix lm_eval_harness for GPT models (#292) · hubert-lee/Megatron-DeepSpeed@37050b8
lm-eval-harness/requirements.txt (1 line, 5 bytes): -e .
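The single -e . line tells pip to install the repository itself in editable (development) mode, so installing from the requirements file amounts to an editable install of the package:

pip install -r requirements.txt   # here the same as: pip install -e .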
Repo for "LoLCATs: On Low-Rank Linearizing of Large Language Models" - lolcats/lm_eval_harness/models.py at main · HazyResearch/lolcats
Adds Eleuther's LM Eval Harness as a callback in Levanter. It's much slower than it needs to be because I'm not doing any sequence packing, but it gets the job done. Scores on Llama 3 seem reasonab...