XGen-7B comes in three variants: XGen-7B-4K-base, XGen-7B-8K-base, and XGen-7B-8K-inst. XGen-7B-4K-base was first trained on 800 billion tokens with a 2K input sequence length, then trained for a further 400 billion tokens with the input length raised to 4K. XGen-7B-8K-base was initialized from XGen-7B-4K-base and trained on another 300 billion tokens with an 8K input sequence length.
These additional 300 billion tokens bring the total training data to 1.5 trillion tokens; like the 4K model, XGen-7B-8K-base is released under the Apache 2.0 license. XGen-7B-inst was finetuned on public-domain instructional data, including databricks-dolly-15k, oasst1, Baize, and GPT-related datasets, at both 4K and 8K sequence lengths, and is released for research purposes only.
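To make the staged training schedule concrete, the following sketch tabulates the curriculum described above. The token counts come straight from the text; the exact context lengths (2048/4096/8192) and the stage labels are illustrative assumptions.

```python
# Sketch of XGen-7B's stage-wise sequence-length curriculum, using the
# token counts stated above; context lengths in tokens are assumed values.
stages = [
    {"stage": "XGen-7B-4K-base, phase 1", "seq_len": 2048, "tokens": 800e9},
    {"stage": "XGen-7B-4K-base, phase 2", "seq_len": 4096, "tokens": 400e9},
    {"stage": "XGen-7B-8K-base",          "seq_len": 8192, "tokens": 300e9},
]

total = sum(s["tokens"] for s in stages)
assert total == 1.5e12  # 800B + 400B + 300B = 1.5T tokens in total

for s in stages:
    print(f'{s["stage"]}: {s["seq_len"]}-token context, {s["tokens"] / 1e9:.0f}B tokens')
```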
The code release lists XGen-7B-8K-Base, with support for 8K sequence length, and XGen-7B-8K-Inst, with instruction finetuning (for research purposes only). Tokenization uses the OpenAI Tiktoken package, which can be installed via pip, and the models can then be used as auto-regressive samplers, as sketched below.
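A minimal usage sketch via the standard Hugging Face transformers API: the Hub model ID (Salesforce/xgen-7b-8k-base) is inferred from the release naming above, and trust_remote_code=True is assumed to be required for the custom Tiktoken-based tokenizer.

```python
# Assumed prerequisites: pip install torch transformers tiktoken
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# Hub ID inferred from the release naming; trust_remote_code loads the
# custom Tiktoken-based tokenizer shipped with the checkpoint.
tokenizer = AutoTokenizer.from_pretrained("Salesforce/xgen-7b-8k-base", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("Salesforce/xgen-7b-8k-base", torch_dtype=torch.bfloat16)

# Auto-regressive sampling: feed a prompt, generate a continuation.
inputs = tokenizer("The world is", return_tensors="pt")
sample = model.generate(**inputs, max_new_tokens=64, do_sample=True, top_p=0.95)
print(tokenizer.decode(sample[0], skip_special_tokens=True))
```

Swapping in the instruction-tuned checkpoint should work the same way, subject to its research-only license.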
To address this, we have trained XGen, a series of 7B-parameter models, on up to 8K sequence length for up to 1.5T tokens. We have also finetuned the XGen models on public-domain instructional data, creating their instruction-tuned counterparts (XGen-Inst). We open-source our models.