Pre-trained language models (PLMs) are first trained on a large dataset and then directly transferred to downstream tasks, or further fine-tuned on another small dataset for specific NLP tasks. Early PLMs, such as Skip-Gram [1] and GloVe [2], are shallow neural networks, and their word e...
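As a concrete illustration of the pretrain-then-fine-tune workflow described above, the sketch below loads a pre-trained encoder and fine-tunes it with a fresh classification head on a tiny downstream dataset. It assumes the Hugging Face transformers library; the model name, toy examples, and hyperparameters are illustrative only.

```python
# A minimal sketch of the pretrain-then-fine-tune paradigm, assuming the
# Hugging Face `transformers` library; model name, data, and hyperparameters
# are placeholders, not a prescription.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2)  # pre-trained encoder + new task head

texts = ["the movie was great", "the movie was terrible"]  # toy downstream data
labels = torch.tensor([1, 0])
batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")

optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
model.train()
for _ in range(3):  # a few fine-tuning steps on the small labeled dataset
    outputs = model(**batch, labels=labels)
    outputs.loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```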
Pre-trained Language Models Can be Fully Zero-Shot Learners. Xuandong Zhao, Siqi Ouyang, Zhiguo Yu, Ming Wu, Lei Li. ACL 2023 | July 2023. How can we extend a pre-trained model to many language understanding tasks, without labeled or additional unlabeled data? Pre-trained language models (PLMs) have been effective for a wide range of NLP tasks. However, existing approaches either require fine-tuning on downstream labeled datasets or manua...
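Because the abstract above is cut off before the paper's own method is described, the following is only a generic, hedged sketch of what prompt-based zero-shot classification with a PLM looks like: the input is rewritten as a cloze template and hand-picked label words are scored at the mask position. The template and verbalizer words are hypothetical, and this is not the specific technique proposed in the paper.

```python
# A generic illustration of zero-shot classification with a masked LM via a
# cloze prompt and a hand-written verbalizer; not the paper's method.
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("roberta-base")
model = AutoModelForMaskedLM.from_pretrained("roberta-base")

text = "The food was cold and the service was slow."
prompt = f"{text} It was {tokenizer.mask_token}."
# Leading spaces matter for RoBERTa's BPE vocabulary.
verbalizer = {"positive": " great", "negative": " terrible"}

inputs = tokenizer(prompt, return_tensors="pt")
mask_pos = (inputs.input_ids == tokenizer.mask_token_id).nonzero()[0, 1]
with torch.no_grad():
    logits = model(**inputs).logits[0, mask_pos]

scores = {label: logits[tokenizer.encode(word, add_special_tokens=False)[0]].item()
          for label, word in verbalizer.items()}
print(max(scores, key=scores.get))  # predicted label without any labeled data
```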
Pre-trained language models (PLMs) for Tagalog can be categorized into two kinds: monolingual models and multilingual models. However, existing monolingual models are trained only on a small-scale Wikipedia corpus, and multilingual models fail to deal with the Tagalog-specific knowledge needed for various ...
Report Abstract: Pre-trained Language Models (PLMs), such as BERT and ERNIE, have led to remarkable headway in many Natural Language Processing tasks. In the information retrieval (IR) community, PLMs have also attracted much atte...
Pre-trained language models (PLMs) aim to learn universal language representations by conducting self-supervised training tasks on large-scale corpora. Since PLMs capture word semantics in different contexts, the quality of word representations highly depends on word frequency, which usually follows a ...
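To make the frequency argument concrete, the sketch below counts word frequencies in an arbitrary corpus file (the path is a placeholder) and reports the head/tail split; under a Zipf-like distribution, a few head words cover most tokens while most word types are rare, which is why representations of rare words tend to be poorly trained.

```python
# A small sketch of the long-tailed word frequency distribution referred to
# above, using a plain frequency count; "corpus.txt" is a placeholder path.
from collections import Counter

with open("corpus.txt", encoding="utf-8") as f:
    tokens = f.read().lower().split()

counts = Counter(tokens)
ranked = counts.most_common()
total = sum(counts.values())

# A handful of head words typically cover a large share of all tokens,
# while most of the vocabulary appears only a few times (the long tail).
head_share = sum(c for _, c in ranked[:100]) / total
rare_types = sum(1 for _, c in ranked if c <= 5) / len(ranked)
print(f"top-100 words cover {head_share:.1%} of tokens; "
      f"{rare_types:.1%} of word types occur at most 5 times")
```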
Pre-trained language models (PLMs) serve as backbones for various real-world systems. For high-stakes applications, it is equally essential to have reliable confidence estimates for their predictions. While the vanilla confidence scores of PLMs can already be effectively utilized, PLMs consistently become ...
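For reference, the "vanilla confidence score" mentioned above is typically the maximum softmax probability of the classifier built on the PLM. The minimal sketch below extracts it with an off-the-shelf sentiment model from the Hugging Face hub; whether this score is well calibrated is exactly what the abstract questions.

```python
# A minimal sketch of the vanilla confidence score: the maximum softmax
# probability of a fine-tuned PLM classifier. Model name and input are
# illustrative placeholders.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

name = "distilbert-base-uncased-finetuned-sst-2-english"  # off-the-shelf classifier
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForSequenceClassification.from_pretrained(name)

batch = tokenizer(["A surprisingly touching film."], return_tensors="pt")
with torch.no_grad():
    probs = torch.softmax(model(**batch).logits, dim=-1)

confidence, prediction = probs.max(dim=-1)
print(prediction.item(), confidence.item())  # predicted label and its confidence
```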
Impossible Triangle: What's Next for Pre-trained Language Models? Institution: Microsoft Cognitive Services Research Group. Abstract: Recent developments in large-scale pre-trained language models (PLMs) have greatly improved model capability on various NLP tasks, i.e., performance after task-specific fine-tuning and zero-/few-shot learning. However, many such models come with a staggeringly large...
Pre-trained language models (PLMs) have played an increasing role in multimedia research. In terms of vision-language (VL) tasks, they often serve as a language encoder and still require an additional fusion network for VL reasoning, resulting in excessive memory overhead. In this paper, we ...
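The sketch below is a schematic of the common two-stream design the abstract refers to, not the paper's own model: a PLM acts purely as a language encoder, pre-extracted image features are projected into the same space, and a separate fusion module (here a single cross-attention layer, an illustrative assumption) performs the vision-language reasoning. The extra fusion parameters are the memory overhead being criticized.

```python
# A schematic two-stream vision-language setup: PLM as language encoder plus a
# separate fusion network. Dimensions and the fusion choice are assumptions.
import torch
import torch.nn as nn
from transformers import AutoTokenizer, AutoModel

class TwoStreamVLModel(nn.Module):
    def __init__(self, text_model="bert-base-uncased", vision_dim=2048, hidden=768):
        super().__init__()
        self.text_encoder = AutoModel.from_pretrained(text_model)  # the PLM
        self.vision_proj = nn.Linear(vision_dim, hidden)           # pre-extracted image features
        self.fusion = nn.MultiheadAttention(hidden, num_heads=8, batch_first=True)
        self.classifier = nn.Linear(hidden, 2)                     # e.g. image-text matching

    def forward(self, input_ids, attention_mask, image_feats):
        text = self.text_encoder(input_ids=input_ids,
                                 attention_mask=attention_mask).last_hidden_state
        vision = self.vision_proj(image_feats)
        fused, _ = self.fusion(query=text, key=vision, value=vision)
        return self.classifier(fused[:, 0])  # predict from the [CLS] position

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
batch = tokenizer(["a dog playing in the snow"], return_tensors="pt")
image_feats = torch.randn(1, 36, 2048)  # stand-in for 36 region features
logits = TwoStreamVLModel()(batch.input_ids, batch.attention_mask, image_feats)
```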
Pre-trained Language Models (PLMs) have proven to be beneficial for various downstream NLP tasks. Recently, GPT-3, with 175 billion parameters and 570GB of training data, drew a lot of attention due to its capacity for few-shot (even zero-shot) learning. However, applying GPT-3 to address ...
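As a reminder of what few-shot (in-context) learning means here, the sketch below conditions a small local causal LM on two labeled demonstrations inside the prompt and asks it to complete a third, with no parameter updates; GPT-2 via transformers stands in for GPT-3, and the prompt format is illustrative.

```python
# A small sketch of few-shot (in-context) prompting, using GPT-2 locally as a
# stand-in for GPT-3; the demonstrations and template are illustrative.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

prompt = (
    "Review: The plot was dull and predictable.\nSentiment: negative\n"
    "Review: An absolute delight from start to finish.\nSentiment: positive\n"
    "Review: I would happily watch it again.\nSentiment:"
)
# The model is conditioned only on the labeled demonstrations in the prompt;
# no parameters are updated, which is what "few-shot learning" means here.
output = generator(prompt, max_new_tokens=2, do_sample=False)
print(output[0]["generated_text"][len(prompt):].strip())
```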