Information and Media Technologies
Online ISSN : 1881-0896
ISSN-L : 1881-0896
Computer Networks and Broadcasting
Phrase Structure Annotation and Parsing for Learner English
Keisuke Sakaguchi, Ryo Nagata

2017 Volume 12 Pages 316-339

Abstract

Learner English often contains grammatical errors with structural characteristics, such as omissions, insertions, substitutions, and word order errors. These errors are not covered by existing context-free grammar (CFG) rules, so it is far from straightforward to annotate learner English with phrase structures. Because of this limitation, there has been almost no work on phrase structure annotation for learner corpora, despite its importance and usefulness. To address this issue, we propose a phrase structure annotation scheme for learner English that consists of five principles. We apply the annotation scheme to two different learner corpora and show (i) its effectiveness at consistently annotating learner English with phrase structure (i.e., high inter-annotator agreement); (ii) the structural characteristics (CFG rules) of learner English obtained from the annotated corpora; and (iii) phrase structure parsing performance on learner English, reported here for the first time. We also release the annotation guidelines, the annotated data, and the parser model to the public.

© 2017 The Association for Natural Language Processing