Extracting Information from Molecular Pathway Diagrams

2017 
Research fields in the health and life sciences, such as personalized medicine, drug discovery, pharmacovigilance and systems biology, make intensive use of graphical information to represent knowledge in the form of domain-specific diagrams, such as molecular pathways. These diagrams are intended to add value to the written text in scientific literature and related documents. Making the full body of existing literature accessible for further research therefore requires access to the information contained in these diagrams. Molecular pathways are very different from more conventional diagrams (e.g. flowcharts), so interpreting molecular pathway diagrams requires domain-specific knowledge to remove ambiguity. In this paper, we propose a method that automatically extracts information from molecular pathways using computer vision techniques. To the best of our knowledge, this is the first attempt to retrieve information from images depicting molecular pathway diagrams. Because no significant, publicly available dataset with annotated ground truth exists, the experimental evaluation was performed on synthetic data. Results show high precision and recall values for the detection of entities and relations. We compare and describe the substantial differences between the proposed method and prior art on the closest diagram type, using the CLEF-IP flowchart summarization task.