Improved part-of-speech tagging for online conversational text with word clusters
Human Language Technologies: Conference of the North American …, 2013•research.ed.ac.uk
Abstract
We consider the problem of part-of-speech tagging for informal, online conversational text. We systematically evaluate the use of large-scale unsupervised word clustering and new lexical features to improve tagging accuracy. With these features, our system achieves state-of-the-art tagging results on both Twitter and IRC POS tagging tasks; Twitter tagging is improved from 90% to 93% accuracy (more than 3% absolute). Qualitative analysis of these word clusters yields insights about NLP and linguistic phenomena in this genre. Additionally, we contribute the first POS annotation guidelines for such text and release a new dataset of English language tweets annotated using these guidelines. Tagging software, annotation guidelines, and large-scale word clusters are available at: http://www.ark.cs.cmu.edu/TweetNLP