A Strategy for Annotating Clinical Records with Phenotypic Information relating to the Chronic Obstructive Pulmonary Disease

2014 
Background: Chronic obstructive pulmonary disease (COPD) is a life-threatening lung disorder whose recent prevalence has led to an increasing burden on public healthcare. Phenotypic information in electronic clinical records is essential in providing suitable personalised treatment to patients with COPD. However, as phenotypes are often “hidden” within free text in clinical records, clinicians could benefit from text mining systems that facilitate their prompt recognition. This paper reports on a semi-automatic methodology for producing a corpus that can ultimately support the development of text mining tools that, in turn, will expedite the process of identifying groups of COPD patients.

Methods and Results: A corpus of 1,000 clinical records was formed based on selection criteria informed by the expertise of two COPD specialists. We developed an annotation scheme that aims to produce fine-grained, expressive and computable COPD annotations without burdening our curators with a highly complicated task. This scheme was implemented in the Argo platform by means of a semi-automatic annotation workflow that integrates several text mining tools, including a graphical user interface for marking up documents. The automatically generated annotations show that around 40% of phenotypic expressions can be decomposed into granular concept types by our selected recognisers.

Conclusion: In this work, we describe the means by which we aim to support the process of COPD phenotype curation from a clinical corpus, i.e., by applying various text mining tools integrated into an annotation workflow. Although the corpus being described is a work in progress, our initial results are encouraging and have accordingly guided our ongoing development work.