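1. Background and problem. The article argues that most sequence-to-sequence (seq2seq) models have two drawbacks when applied to grammatical error correction (GEC): they are limited by the size of the training data, and they generalize poorly. Seq2seq models are trained only on error-correction text pairs, so models with millions of parameters do not generalize well. That is, when an erroneous sentence differs slightly from the training instances...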
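1. In the first approach, the authors use the y and x of the annotated training set to build a correction-to-error mapping, which stores the words that can replace the current token together with the sentences those words appear in. Then, to preserve diversity while staying semantically close, the authors compute weights from the edit-distance similarity between the current token's sentence and every candidate sentence; these weights w_i are used to select a sentence. If the similarity between two sentences...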
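The mapping and weighting described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. All function names are hypothetical. It assumes that x and y are token-aligned and equal in length (so only substitutions are captured), that the stored sentence is the erroneous sentence x, and that w_i is proportional to a normalised edit-distance similarity. The snippet is cut off before it states how w_i depends on similarity, so that last choice is a guess.

```python
import random
from collections import defaultdict


def edit_distance(a, b):
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(b) + 1))
    for i, tok_a in enumerate(a, start=1):
        curr = [i]
        for j, tok_b in enumerate(b, start=1):
            cost = 0 if tok_a == tok_b else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def similarity(a, b):
    """Edit-distance similarity in [0, 1]; 1.0 means identical sequences."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - edit_distance(a, b) / longest


def build_mapping(pairs):
    """Map each corrected token to (error token, erroneous sentence) entries.

    Assumption: x and y are token-aligned and equal in length,
    so only substitution errors are recorded.
    """
    mapping = defaultdict(list)
    for x, y in pairs:
        for err_tok, cor_tok in zip(x, y):
            if err_tok != cor_tok:
                mapping[cor_tok].append((err_tok, x))
    return mapping


def candidate_weights(sentence, candidates):
    """Normalised weights w_i, assumed proportional to similarity."""
    sims = [similarity(sentence, ctx) for _, ctx in candidates]
    total = sum(sims)
    if total == 0:
        # No candidate resembles the sentence: fall back to uniform weights.
        return [1.0 / len(candidates)] * len(candidates)
    return [s / total for s in sims]


def corrupt_token(sentence, index, mapping, rng=random):
    """Replace sentence[index] with an error token sampled according to w_i."""
    candidates = mapping.get(sentence[index])
    if not candidates:
        return list(sentence)
    weights = candidate_weights(sentence, candidates)
    err_tok, _ = rng.choices(candidates, weights=weights, k=1)[0]
    return sentence[:index] + [err_tok] + sentence[index + 1:]
```

Calling `corrupt_token(clean_tokens, i, build_mapping(train_pairs))` on a clean sentence yields one synthetic erroneous sentence. Sampling by weight, rather than always taking the most similar candidate, is one way to keep the diversity the passage mentions.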
Grammatical Error Correction (GEC) is the task of correcting different kinds of errors in text such as spelling, punctuation, grammatical, and word choice errors. GEC is typically formulated as a sentence correction task. A GEC system takes a potentially erroneous sentence as input and is expected...
HillZhang1999/MuCGEC: open-source release of the MuCGEC Chinese error-correction dataset and SOTA text-correction models; Code & Data for our NAACL 2022 Paper "MuCGEC: a Multi-Referenc...
However, neural seq2seq grammatical error correction models are computationally expensive in both training and translation inference. They also tend to generalize poorly and have limited capability because error-corrected data is scarce, and are thus incapable of effectively ...
1. First place in the competition: the Youdao team's paper, Youdao’s Winning Solution to the NLPCC-2018 Task 2 Challenge: A Neural Machine Translation Approach to Chinese Grammatical Error Correction 2. Second place in the competition: the Alibaba team's paper, Chinese Grammatical Error Correction Using Statistical and Neural Models ...
Grammatical error correction (GEC) systems strive to correct both global errors in word order and usage, and local errors in spelling and inflection. Further developing upon recent work on neural machine translation, we propose a new hybrid neural model with nested attention layers for GE...
In this work, we reinvestigate the classifier-based approach to article and preposition error correction going beyond linguistically motivated factors. We show that state-of-the-art results can be achieved without relying on a plethora of heuristic rules, complex feature engineering and advance...
Abstract: Data augmentation through synthetic data is now widely used in Grammatical Error Correction (GEC) to alleviate data scarcity. However, synthetic data is mainly used in the pre-training phase rather than the data-limited fine-tuning phase due...