Using Convolutional Neural Networks to Extract Keywords and Keyphrases: A Case Study for Foodborne Illnesses

2019 
Keywords and keyphrases are important for Natural Language Processing (NLP) applications such as document classification, information retrieval, and topic identification. They are also useful for capturing different classes of entities from content in fields such as healthcare, biology, food science, and journalism. There are several approaches to extracting keywords and keyphrases, and deep learning approaches have achieved strong results on this task. Among deep learning approaches, however, the potential of Convolutional Neural Networks (CNNs) for extracting keywords and keyphrases has not been fully explored. In this work, we performed a comparative study on a benchmark dataset, the IEEE Xplore collection, to test the ability of a CNN to generalize in selecting keywords and keyphrases. In addition, we collected a corpus in the field of foodborne illness outbreaks and used it to develop a CNN-based approach for identifying keywords and keyphrases related to foodborne illnesses. Results were compared with several supervised (KEA, GuidedLDA) and unsupervised (LDA) machine learning algorithms. The CNN outperformed these algorithms in selecting relevant keywords and keyphrases for foodborne illnesses. The findings of this study also confirm the superiority of the CNN-based algorithm over the other machine learning approaches for keyphrase extraction.