Semi-Automatic Terminology Generation for Information Extraction from German Chest X-Ray Reports.
2017
: Extraction of structured data from textual reports is an important subtask for building medical data warehouses for research and care. Many medical and most radiology reports are written in a telegraphic style with a concatenation of noun phrases describing the presence or absence of findings. Therefore a lexico-syntactical approach is promising, where key terms and their relations are recognized and mapped on a predefined standard terminology (ontology). We propose a two-phase algorithm for terminology matching: In the first pass, a local terminology for recognition is derived as close as possible to the terms used in the radiology reports. In the second pass, the local terminology is mapped to a standard terminology. In this paper, we report on an algorithm for the first step of semi-automatic generation of the local terminology and evaluate the algorithm with radiology reports of chest X-ray examinations from Wurzburg university hospital. With an effort of about 20 hours work of a radiologist as domain expert and 10 hours for meetings, a local terminology with about 250 attributes and various value patterns was built. In an evaluation with 100 randomly chosen reports it achieved an F1-Score of about 95% for information extraction.
Keywords:
- Correction
- Source
- Cite
- Save
- Machine Reading By IdeaReader
0
References
4
Citations
NaN
KQI