Extracting Gene-Disease Relations from Text to Support Biomarker Discovery

2017 
The biomedical literature constitutes a rich source of evidence to support the discovery of biomarkers. However, locating evidence in huge volumes of text can be difficult, as typical keyword queries cannot account for the meaning and structure of text. Text mining (TM) methods carry out automated semantic analysis of documents, to facilitate structured searching that can more precisely match users' information needs. We describe our TM approach to the detection of sentence-level associations between genes and diseases, as a first step towards developing a sophisticated search system targeted at locating biomarker evidence in the literature. We vary the sophistication of our detection methodology according to sentence complexity, using either co-occurring mentions of genes and diseases, or linguistic patterns obtained using evidence from approximately 1 million biomedical abstracts. We demonstrate that this method can detect associations more successfully than applying a single technique, with an accuracy that compares highly favourably to related efforts. We also show that the identified relations can complement those detected using alternative approaches.
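The two-tier strategy described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration only: the toy lexicons `GENES` and `DISEASES`, the `TRIGGERS` expression standing in for learned linguistic patterns, and the rule that a sentence counts as "simple" when it mentions exactly one gene and one disease. The abstract does not specify how sentence complexity is measured or what the patterns look like.

```python
import re
from itertools import product

# Hypothetical toy lexicons; a real system would rely on named entity recognition.
GENES = {"BRCA1", "TP53"}
DISEASES = {"breast cancer", "lung cancer"}

# Hypothetical trigger phrases standing in for patterns learned from a corpus.
TRIGGERS = r"(?:is associated with|is linked to|causes)"


def find_mentions(sentence, lexicon):
    """Return the lexicon terms that occur in the sentence, in sorted order."""
    return [
        term
        for term in sorted(lexicon)
        if re.search(r"\b" + re.escape(term) + r"\b", sentence, re.IGNORECASE)
    ]


def extract_relations(sentence):
    """Return (gene, disease, method) tuples for one sentence."""
    genes = find_mentions(sentence, GENES)
    diseases = find_mentions(sentence, DISEASES)
    if not genes or not diseases:
        return []

    # Assumed notion of a "simple" sentence: one gene and one disease.
    # Here co-occurrence alone is treated as evidence of an association.
    if len(genes) == 1 and len(diseases) == 1:
        return [(genes[0], diseases[0], "co-occurrence")]

    # Otherwise the sentence is treated as complex, and a pair is only
    # reported when a linguistic pattern explicitly connects it.
    relations = []
    for gene, disease in product(genes, diseases):
        pattern = (
            r"\b" + re.escape(gene) + r"\b\s+" + TRIGGERS + r"\s+"
            + re.escape(disease) + r"\b"
        )
        if re.search(pattern, sentence, re.IGNORECASE):
            relations.append((gene, disease, "pattern"))
    return relations
```

In this sketch, a sentence mentioning several genes and diseases yields only the pairs that a pattern explicitly links, which avoids the spurious pairs that plain co-occurrence would produce in such sentences.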