A Deep Learning Approach for Chinese Tourism Field Attribute Extraction

2019 
Attribute extraction is a fundamental task in Natural Language Processing (NLP). It aims to extract missing attribute knowledge for entities from unstructured text. Attribute knowledge is an important part of a knowledge graph, and attribute extraction for the Chinese tourism field can help in constructing a knowledge graph for that field. In this paper, we crawl attraction information from different travel websites as raw data for annotation. We formalize the problem as a sequence labeling task and propose a novel deep sequence labeling model, called BERT-ResCNNs-BLSTM-CRF. First, we fine-tune the pre-trained BERT model to obtain character embeddings. We then employ multiple convolutional layers to capture local features from the character embeddings. To tackle the vanishing gradient problem in deep convolutional networks, we use deep residual learning. Next, the local features are concatenated with the character embedding vectors and fed into a bidirectional long short-term memory network (BLSTM) to obtain local information of each sentence. Finally, we use a conditional random field (CRF) to predict a label sequence for an input sentence. As far as we know, we are the first to propose this joint model for the sequence labeling task. Experimental results demonstrate the effectiveness of our method compared with several state-of-the-art baselines.
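As an illustration of the final CRF step only, the following is a minimal sketch of Viterbi decoding, the standard way a linear-chain CRF picks its highest-scoring label sequence. It is not the authors' implementation. The label set (a BIO-style scheme with a single hypothetical `ATTR` type), the emission scores (standing in for per-character BLSTM outputs), and the transition scores are all made-up values chosen for the example.

```python
from typing import List, Sequence

# Hypothetical label set; the abstract does not specify the tagging scheme.
LABELS = ["O", "B-ATTR", "I-ATTR"]

# Toy per-character label scores, standing in for BLSTM outputs (assumed values).
EMISSIONS = [
    [2.0, 0.5, 0.1],
    [0.2, 1.5, 1.0],
    [0.3, 0.2, 1.8],
    [1.0, 0.1, 0.9],
]

# TRANSITIONS[i][j] scores moving from label i to label j (assumed values).
# The large negative entry forbids the invalid move O -> I-ATTR.
TRANSITIONS = [
    [0.5, 0.0, -1e4],
    [0.0, -1.0, 1.0],
    [0.0, -1.0, 0.5],
]


def viterbi_decode(emissions: Sequence[Sequence[float]],
                   transitions: Sequence[Sequence[float]]) -> List[int]:
    """Return the label indices of the highest-scoring path."""
    if not emissions:
        return []
    num_labels = len(emissions[0])
    # best[j] is the best score of any path ending in label j so far.
    best = list(emissions[0])
    backpointers = []
    for t in range(1, len(emissions)):
        new_best = []
        pointers = []
        for j in range(num_labels):
            # Pick the previous label that maximizes the path score into j.
            prev = max(range(num_labels),
                       key=lambda i: best[i] + transitions[i][j])
            new_best.append(best[prev] + transitions[prev][j] + emissions[t][j])
            pointers.append(prev)
        best = new_best
        backpointers.append(pointers)
    # Trace back from the best final label to recover the full path.
    last = max(range(num_labels), key=lambda j: best[j])
    path = [last]
    for pointers in reversed(backpointers):
        last = pointers[last]
        path.append(last)
    path.reverse()
    return path


decoded = [LABELS[i] for i in viterbi_decode(EMISSIONS, TRANSITIONS)]
print(decoded)  # ['O', 'B-ATTR', 'I-ATTR', 'I-ATTR']
```

With these toy scores, the transition matrix steers the decoder away from label sequences that are locally attractive but structurally invalid, which is the usual motivation for placing a CRF on top of per-character scores.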