Natural Language Processing Techniques for Extracting and Categorizing Finding Measurements in Narrative Radiology Reports

2015 
Background: Accumulating quantitative outcome parameters may contribute to constructing a healthcare organization in which outcomes of clinical procedures are reproducible and predictable. In imaging studies, measurements are the principal category of quantitative parameters. Objectives: The purpose of this work is to develop and evaluate two natural language processing engines that extract finding and organ measurements from narrative radiology reports and to categorize extracted measurements by their “temporality“. Methods: The measurement extraction engine is developed as a set of regular expressions. The engine was evaluated against a manually created ground truth. Automated categorization of measurement temporality is defined as a machine learning problem. A ground truth was manually developed based on a corpus of radiology reports. A maximum entropy model was created using features that characterize the measurement itself and its narrative context. The model was evaluated in a ten-fold cross validation protocol. Results: The measurement extraction engine has precision 0.994 and recall 0.991. Accuracy of the measurement classification engine is 0.960. Conclusions:The work contributes to machine understanding of radiology reports and may find application in software applications that process medical data.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    24
    References
    27
    Citations
    NaN
    KQI
    []