NILE: Fast Natural Language Processing for Electronic Health Records

Yu, Sheng; Cai, Tianrun; Cai, Tianxi

arXiv:1311.6063v5 (cs)

[Submitted on 23 Nov 2013 (v1), last revised 16 Jul 2019 (this version, v5)]

Abstract: Objective: Narrative text in electronic health records (EHRs) contains rich information for medical and data science studies. This paper introduces the design and performance of Narrative Information Linear Extraction (NILE), a natural language processing (NLP) package for EHR analysis that we share with the medical informatics community. Methods: NILE uses a modified prefix-tree search algorithm for named entity recognition, which can detect prefix and suffix sharing. The semantic analyses are implemented as rule-based finite state machines. Analyses include negation, location, modification, family history, and ignoring. Results: The processing speed of NILE is hundreds to thousands of times faster than that of existing NLP software for medical text. The accuracy of NILE's presence analysis is on par with the best-performing models on the 2010 i2b2/VA NLP challenge data. Conclusion: Its speed, accuracy, and ability to operate via an API make NILE a valuable addition to the NLP software available for medical informatics and data science.
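As a rough illustration of the two ideas named in the Methods, the sketch below pairs a plain token-level prefix tree (longest-match dictionary lookup) with a two-state finite state machine for negation. It is not NILE's code or API: the function names, the toy dictionary, and the cue lists are all hypothetical, and the "modified" search that detects prefix and suffix sharing is not reproduced here.

```python
from typing import Dict, List, Tuple

_END = object()  # sentinel key marking a complete phrase; cannot collide with a token


def build_trie(dictionary: Dict[str, str]) -> dict:
    """Build a token-level prefix tree from a {phrase: label} dictionary."""
    root: dict = {}
    for phrase, label in dictionary.items():
        node = root
        for token in phrase.lower().split():
            node = node.setdefault(token, {})
        node[_END] = label
    return root


def find_entities(tokens: List[str], trie: dict) -> List[Tuple[int, int, str]]:
    """Greedy longest-match scan; returns (start, end_exclusive, label) spans."""
    matches = []
    i = 0
    while i < len(tokens):
        node, best, j = trie, None, i
        # Walk down the tree for as long as the next token continues a phrase.
        while j < len(tokens) and tokens[j] in node:
            node = node[tokens[j]]
            j += 1
            if _END in node:
                best = (i, j, node[_END])  # remember the longest match so far
        if best:
            matches.append(best)
            i = best[1]  # resume after the matched phrase
        else:
            i += 1
    return matches


# Hypothetical cue lists, for illustration only.
NEGATION_CUES = {"no", "denies", "without"}
SCOPE_ENDERS = {"but", ".", ";"}


def tag_presence(tokens: List[str],
                 entities: List[Tuple[int, int, str]]) -> List[Tuple[str, bool]]:
    """Two-state machine: a cue switches to NEGATED, a scope ender switches back."""
    starts = {start: (end, label) for start, end, label in entities}
    state = "AFFIRMED"
    results = []
    i = 0
    while i < len(tokens):
        if i in starts:
            end, label = starts[i]
            results.append((label, state == "AFFIRMED"))
            i = end
            continue
        if tokens[i] in NEGATION_CUES:
            state = "NEGATED"
        elif tokens[i] in SCOPE_ENDERS:
            state = "AFFIRMED"
        i += 1
    return results


# Toy dictionary and sentence (assumptions, not data from the paper).
toy_dictionary = {
    "chest pain": "CHEST_PAIN",
    "pain": "PAIN",
    "shortness of breath": "DYSPNEA",
}
toy_tokens = "patient denies chest pain but has shortness of breath .".split()
toy_entities = find_entities(toy_tokens, build_trie(toy_dictionary))
print(tag_presence(toy_tokens, toy_entities))
# prints [('CHEST_PAIN', False), ('DYSPNEA', True)]
```

Both passes read the token sequence once from left to right, which matches the "linear" in the package name; the other analyses listed in the abstract (location, modification, family history, ignoring) would need their own states and cue lists.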

Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:1311.6063 [cs.CL]
	(or arXiv:1311.6063v5 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.1311.6063

Submission history

From: Sheng Yu
[v1] Sat, 23 Nov 2013 22:39:52 UTC (99 KB)
[v2] Fri, 14 Feb 2014 23:21:43 UTC (99 KB)
[v3] Wed, 2 Apr 2014 03:34:41 UTC (99 KB)
[v4] Mon, 13 Oct 2014 20:43:09 UTC (99 KB)
[v5] Tue, 16 Jul 2019 14:12:22 UTC (610 KB)
