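GPT-Code-Clippy (GPT-CC) is an open source version of GitHub Copilot, a language model based on GPT-3, called GPT-Codex, that is fine-tuned on publicly available code from GitHub. The dataset used to train GPT-CC was obtained from SEART GitHub Search. Datasets: datasets containing high-quality source code. Possible links to publicly available datasets include: ...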
GPT-Code-Clippy (GPT-CC) is an open source version of GitHub Copilot, a language model -- based on GPT-3, called GPT-Codex -- that is fine-tuned on publicly available code from GitHub. Datasets: The dataset used to train GPT-CC is obtained from SEART GitHub Search using the following...
code-clippy-data-processing: Data processing utility for gpt-code-clippy. Parser: The aim of the parser_module is to parse the code used for training GPT-Code-Clippy so that it can go through multiple levels of pre-processing and evaluation. About...
GPT-Code-Clippy (GPT-CC) is an open source version of GitHub Copilot, a language model -- based on GPT-3, called GPT-Codex -- that is fine-tuned on publicly available code from GitHub. The dataset used to train GPT-CC is obtained from SEART GitHub Search using the following criteria: ...