General benchmarks: Based on the Language Model Evaluation Harness, the Open LLM Leaderboard is the main benchmark for general-purpose LLMs (like ChatGPT). Other popular benchmarks include BigBench and MT-Bench.
Task-specific benchmarks: Tasks like summarization, translation, question answerin...
Amazon SageMaker Studio Lab: Studio Lab is a free service that gives you access to AWS compute resources, in an environment based on open-source JupyterLab, without requiring an AWS account.
Amazon SageMaker Canvas: Gives you the ability to use machine learning to generate predictions without nee...