Exploration and Exploitation: Two Ways to Improve Chinese Spelling Correction Models

Li, Chong; Zhang, Cenyuan; Zheng, Xiaoqing; Huang, Xuanjing

arXiv:2105.14813 (cs)

[Submitted on 31 May 2021 (v1), last revised 1 Jun 2021 (this version, v2)]

Abstract: Sequence-to-sequence learning with neural networks has empirically proven to be an effective framework for Chinese Spelling Correction (CSC), which takes a sentence with some spelling errors as input and outputs the corrected sentence. However, CSC models may fail to correct spelling errors covered by the confusion sets, and will also encounter unseen errors. We propose a method that continually identifies the weak spots of a model to generate more valuable training instances, and we apply a task-specific pre-training strategy to enhance the model. The generated adversarial examples are gradually added to the training set. Experimental results show that this adversarial training method, combined with the pre-training strategy, improves both the generalization and robustness of multiple CSC models across three different datasets, achieving state-of-the-art performance on the CSC task.
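The abstract describes generating training instances at a model's weak spots using confusion sets. The following is only a minimal illustrative sketch of that general idea, not the authors' algorithm: it assumes a hypothetical per-character confidence score from the model and a hypothetical toy confusion set, and it corrupts the least-confident character that has a confusable alternative. The names `CONFUSION_SET`, `generate_adversarial_pair`, `augment`, and `score_fn` are all assumptions made for this example.

```python
import random
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical toy confusion set: each character maps to characters that
# look or sound similar. Real confusion sets are far larger.
CONFUSION_SET: Dict[str, List[str]] = {
    "天": ["夫", "无"],
    "气": ["汽", "器"],
}


def generate_adversarial_pair(
    sentence: str,
    confidences: List[float],
    confusion_set: Dict[str, List[str]],
    rng: random.Random,
) -> Optional[Tuple[str, str]]:
    """Corrupt the least-confident character that has a confusion-set entry.

    `confidences[i]` is an assumed model confidence for character i.
    Returns (corrupted, original), or None if nothing can be replaced.
    """
    candidates = [i for i, ch in enumerate(sentence) if ch in confusion_set]
    if not candidates:
        return None
    # The "weak spot" here is simply the replaceable position with the
    # lowest confidence (an assumption, not the paper's criterion).
    weakest = min(candidates, key=lambda i: confidences[i])
    replacement = rng.choice(confusion_set[sentence[weakest]])
    corrupted = sentence[:weakest] + replacement + sentence[weakest + 1:]
    return corrupted, sentence


def augment(
    training_set: List[Tuple[str, str]],
    clean_sentences: List[str],
    score_fn: Callable[[str], List[float]],
    confusion_set: Dict[str, List[str]],
    seed: int = 0,
) -> List[Tuple[str, str]]:
    """Return a copy of the training set extended with generated pairs."""
    rng = random.Random(seed)
    augmented = list(training_set)
    for sentence in clean_sentences:
        pair = generate_adversarial_pair(
            sentence, score_fn(sentence), confusion_set, rng
        )
        if pair is not None:
            augmented.append(pair)
    return augmented
```

In a training loop, `augment` would be called repeatedly with `score_fn` backed by the current model, so that newly generated pairs are added to the training set gradually as the model changes.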

Comments:	Accepted by ACL 2021
Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2105.14813 [cs.CL]
	(or arXiv:2105.14813v2 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2105.14813

Submission history

From: Chong Li
[v1] Mon, 31 May 2021 09:17:33 UTC (320 KB)
[v2] Tue, 1 Jun 2021 15:18:14 UTC (320 KB)
