Hierarchical concept indexing of full-text documents in the Unified Medical Language System information sources map

1999 
Full-text documents are a vital and rapidly growing part of online biomedical information. A single large document can contain as much information as a small database, but normally lacks the tight structure and consistent indexing of a database. Retrieval systems will often miss highly relevant parts of a document if the document as a whole appears irrelevant. Access to full-text information is further complicated by the need to search many disparate information resources separately. This research explores how these problems can be addressed by the combined use of two techniques: 1) natural language processing for automatic concept-based indexing of full text, and 2) methods for exploiting the structure and hierarchy of full-text documents. We describe methods for applying these techniques to a large collection of full-text documents drawn from the Health Services/Technology Assessment Text (HSTAT) database at the National Library of Medicine (NLM), and examine how this hierarchical concept indexing can assist both document- and source-level retrieval in the context of NLM's Information Sources Map project.
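The general idea of hierarchical concept indexing can be illustrated with a toy sketch. It is not the authors' implementation: the Section class, the small VOCAB table, and the plain substring matching that stands in for natural language processing are all assumptions made for illustration. Each section is tagged with the concepts found in its own text, concepts are rolled up to every ancestor, and retrieval returns the deepest sections that cover the query, so a relevant subsection can be found even when the document as a whole looks off-topic.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Section:
    """One node in a document's section hierarchy (hypothetical structure)."""
    title: str
    text: str = ""
    children: list[Section] = field(default_factory=list)
    concepts: set[str] = field(default_factory=set)


# Hypothetical vocabulary mapping surface phrases to concept labels.
# A real system would use NLP against a controlled vocabulary instead.
VOCAB = {
    "heart attack": "Myocardial Infarction",
    "myocardial infarction": "Myocardial Infarction",
    "aspirin": "Aspirin",
    "smoking": "Smoking",
}


def index(section: Section) -> set[str]:
    """Tag a section with concepts matched in its own text, then add the
    concepts of all descendants so each ancestor covers its whole subtree."""
    lowered = section.text.lower()
    section.concepts = {c for phrase, c in VOCAB.items() if phrase in lowered}
    for child in section.children:
        section.concepts |= index(child)
    return section.concepts


def most_specific(section: Section, query: set[str], path: tuple = ()) -> list[tuple]:
    """Return title paths of the deepest sections covering every query concept."""
    if not query <= section.concepts:
        return []
    here = path + (section.title,)
    deeper = [hit for child in section.children
              for hit in most_specific(child, query, here)]
    # Prefer matching descendants; fall back to this section if none match.
    return deeper or [here]


doc = Section("Guideline", "Overview of cardiac care.", [
    Section("Prevention", "Smoking cessation advice."),
    Section("Treatment", "", [
        Section("Acute care", "Give aspirin after a heart attack."),
    ]),
])
index(doc)
print(most_specific(doc, {"Aspirin", "Myocardial Infarction"}))
```

In this sketch a query that only one subsection satisfies resolves to that subsection, while a query whose concepts are spread across sibling sections resolves to their nearest common ancestor.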