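Section 1: Recap: LMs and decoding algorithms 1.1 Natural language generation (NLG) Natural language generation refers to any task in which we generate (i.e., write) new text. NLG includes: machine translation; summarization; dialogue (chit-chat and task-based); creative writing (storytelling, poetry generation); free-form question answering (i.e., the answer is generated rather than extracted from a text or knowledge base); image captioning. 1.2 Key points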
More on decoding algorithms; NLG tasks and neural approaches to them; NLG evaluation: a tricky situation; Concluding thoughts on NLG research, current trends, and the future. 1. Language models and ...
Text summarization: NLP algorithms can automatically generate concise summaries of lengthy articles or documents, allowing users to quickly grasp the main points without reading the entire text. This feature is particularly useful in news apps, research tools, or content curation platforms. S...
Parsing algorithms; Grammar and knowledge-based approaches; Multi-task approaches; Massively multilingual oriented approaches; Low-resource languages: POS tagging, parsing and related tasks; Morphology...
Video summarization enables efficient storage and quick browsing of large video collections without losing the important ones. The videos are summarized from their subtitles using several text summarization algorithms. The proposed technique generates the ...
import gensim
from gensim import corpora

text1 = ["""Gensim is a free open-source Python library for representing documents as semantic vectors, as efficiently and painlessly as possible. Gensim is designed to process raw, unstructured digital texts using unsupervised machine learning algorithms.""...
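Because NLG emits words one at a time in the decoder stage, the article closes by introducing some commonly used decoding algorithms. 1. Summarization: Technically, summarization is usually divided into extractive summarization and abstractive summarization. Extractive summarization selects some important sentences from a document to serve as its summary; this approach is relatively simple but not flexible enough. Abstractive...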
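The extractive approach described above (pick important sentences from the document and use them as the summary) can be sketched in a few lines. This is a minimal illustration, not the method of any particular paper: the function name `extractive_summary`, the regex-based sentence splitter, and the "average word frequency" importance score are all assumptions chosen for brevity.

```python
import re
from collections import Counter


def extractive_summary(document: str, num_sentences: int = 2) -> list[str]:
    """Return the highest-scoring sentences, kept in document order.

    Hypothetical helper: scores each sentence by the average
    document-level frequency of its words.
    """
    # Rough sentence split: break after ., ! or ? followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", document) if s.strip()]

    # Word frequencies over the whole document, lowercased.
    word_freq = Counter(re.findall(r"[a-z']+", document.lower()))

    def score(sentence: str) -> float:
        words = re.findall(r"[a-z']+", sentence.lower())
        if not words:
            return 0.0
        return sum(word_freq[w] for w in words) / len(words)

    # Rank sentence indices by score, keep the top k, then restore original order.
    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    chosen = sorted(ranked[:num_sentences])
    return [sentences[i] for i in chosen]


doc = "Cats sleep a lot. Cats eat fish and cats chase mice. The weather was mild."
print(extractive_summary(doc, num_sentences=1))
```

Because the output is copied verbatim from the input, it can never rephrase or merge ideas, which is the inflexibility noted above.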
Tokenization is the initial step in NLP, where the text is divided into individual words or phrases called tokens. By dividing the text into tokens, the algorithms get a basic understanding of the structure and context of the text, making it easier to process and analyze. The word tokens are...
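As a rough illustration of the tokenization step described above, the sketch below splits text into word and punctuation tokens with a regular expression. The function name `tokenize` and the specific pattern are assumptions made for this example; production tokenizers (for instance subword tokenizers) handle many more cases.

```python
import re


def tokenize(text: str) -> list[str]:
    """Split text into word tokens and single punctuation tokens.

    Hypothetical helper: \\w+ matches runs of letters, digits, or
    underscores; [^\\w\\s] matches one non-space punctuation character.
    """
    return re.findall(r"\w+|[^\w\s]", text)


print(tokenize("Hello, world!"))
```

Keeping punctuation as separate tokens preserves sentence boundaries for later steps; dropping the second alternative of the pattern would yield word tokens only.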