Universal Dependencies for Learner English

Berzak, Yevgeni; Kenney, Jessica; Spadine, Carolyn; Wang, Jing Xian; Lam, Lucia; Mori, Keiko Sophie; Garza, Sebastian; Katz, Boris

arXiv:1605.04278 (cs)

[Submitted on 13 May 2016 (v1), last revised 8 Jun 2016 (this version, v2)]

Abstract: We introduce the Treebank of Learner English (TLE), the first publicly available syntactic treebank for English as a Second Language (ESL). The TLE provides manually annotated POS tags and Universal Dependency (UD) trees for 5,124 sentences from the Cambridge First Certificate in English (FCE) corpus. The UD annotations are tied to a pre-existing error annotation of the FCE, so that full syntactic analyses are provided for both the original and the error-corrected version of each sentence. We further delineate ESL annotation guidelines that allow for consistent syntactic treatment of ungrammatical English. Finally, we benchmark POS tagging and dependency parsing performance on the TLE dataset and measure the effect of grammatical errors on parsing accuracy. We envision that the treebank will support a wide range of linguistic and computational research on second language acquisition as well as automatic processing of ungrammatical language. The treebank is available at this http URL. The annotation manual used in this project and a graphical query engine are available at this http URL.

Comments: Updated parsing experiments to EWT v1.3, improved grammatical error marking, minor revisions. To appear in ACL 2016
Subjects: Computation and Language (cs.CL)
Cite as: arXiv:1605.04278 [cs.CL] (or arXiv:1605.04278v2 [cs.CL] for this version)
DOI: https://doi.org/10.48550/arXiv.1605.04278

Submission history

From: Yevgeni Berzak
[v1] Fri, 13 May 2016 18:45:22 UTC (78 KB)
[v2] Wed, 8 Jun 2016 02:33:34 UTC (65 KB)
