BiLSTM-CRF for geological named entity recognition from the geoscience literature

2019 
Many detailed geoscience reports lie unused, offering both challenges and opportunities for information extraction. Geological named entity recognition (GNER) is an important task in geoscience information extraction, yet research on extracting information from numerical geoscience data remains limited. Most conventional NER approaches depend heavily on feature engineering, and such sentence-level methods suffer from tagging inconsistency. Based on these observations, this paper proposes a neural network approach, namely attention-based bidirectional long short-term memory with a conditional random field layer (Att-BiLSTM-CRF), for named entity recognition, to extract entities describing geoscience information from geoscience reports. The approach leverages global information learned through an attention mechanism to enforce tagging consistency across multiple instances of the same token in a document. Experiments on the constructed dataset show that our method achieves performance comparable to that of other state-of-the-art systems, with an average F1 score of 91.47% on the NER task.
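As an illustration of how an F1 score for an NER task is commonly computed, the following is a minimal sketch of entity-level F1 over BIO-tagged sequences. It assumes exact-match scoring of (start, end, type) spans and a BIO tagging scheme; the abstract does not state the tagging scheme or how the reported average was taken, and the function names and the `ROCK` and `MIN` labels are hypothetical.

```python
def extract_entities(tags):
    """Return the set of (start, end, type) spans in a BIO tag sequence.

    `end` is exclusive. An I- tag that does not continue an open span of the
    same type is ignored (strict decoding, an assumption of this sketch).
    """
    spans = set()
    start, etype = None, None
    # A trailing "O" sentinel closes any span that runs to the end.
    for i, tag in enumerate(list(tags) + ["O"]):
        continues_span = tag.startswith("I-") and etype == tag[2:]
        if continues_span:
            continue
        if etype is not None:
            spans.add((start, i, etype))
        if tag.startswith("B-"):
            start, etype = i, tag[2:]
        else:
            start, etype = None, None
    return spans


def entity_f1(gold_tags, pred_tags):
    """Exact-match entity-level F1 between gold and predicted tag sequences."""
    gold_spans = extract_entities(gold_tags)
    pred_spans = extract_entities(pred_tags)
    true_positives = len(gold_spans & pred_spans)
    precision = true_positives / len(pred_spans) if pred_spans else 0.0
    recall = true_positives / len(gold_spans) if gold_spans else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels: ROCK (rock name) and MIN (mineral name).
gold = ["B-ROCK", "I-ROCK", "O", "B-MIN", "O"]
pred = ["B-ROCK", "I-ROCK", "O", "O", "B-MIN"]
score = entity_f1(gold, pred)
```

In this example one of the two predicted entities matches a gold entity exactly, so precision and recall are both 0.5 and `score` is 0.5. A predicted entity with the right type but shifted boundaries counts as an error under exact-match scoring.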