Effective mapping of biomedical text to the UMLS Metathesaurus: the MetaMap program

Proc AMIA Symp. 2001:17-21.

Abstract

The UMLS Metathesaurus, the largest thesaurus in the biomedical domain, provides a representation of biomedical knowledge consisting of concepts classified by semantic type and both hierarchical and non-hierarchical relationships among the concepts. This knowledge has proved useful for many applications including decision support systems, management of patient records, information retrieval (IR) and data mining. Gaining effective access to the knowledge is critical to the success of these applications. This paper describes MetaMap, a program developed at the National Library of Medicine (NLM) to map biomedical text to the Metathesaurus or, equivalently, to discover Metathesaurus concepts referred to in text. MetaMap uses a knowledge intensive approach based on symbolic, natural language processing (NLP) and computational linguistic techniques. Besides being applied for both IR and data mining applications, MetaMap is one of the foundations of NLM's Indexing Initiative System which is being applied to both semi-automatic and fully automatic indexing of the biomedical literature at the library.

MeSH terms

  • Abstracting and Indexing / methods
  • Algorithms
  • Information Storage and Retrieval
  • Natural Language Processing*
  • Unified Medical Language System*
  • Vocabulary, Controlled