Technical Phrase Extraction for Patent Mining: A Multi-level Approach

2020 
Recent years have seen rapid growth in patent applications, which offers an opportunity to uncover the underlying patterns of innovation but also places higher demands on patent mining techniques. Because patent mining relies heavily on patent document analysis, this paper focuses on constructing a technology portrait for each patent, i.e., recognizing the technical phrases it involves, which can summarize and represent the patent from a technology angle. To this end, we first give a clear and detailed description of technical phrases in patents, based on various prior works and analyses. Then, combining the characteristics of technical phrases with the multi-level structure of patent documents, we develop an Unsupervised Multi-level Technical Phrase Extraction (UMTPE) model. In addition, we design a novel evaluation metric called Information Retrieval Efficiency (IRE) to evaluate the extracted phrases from a new perspective, complementing traditional metrics such as Precision and Recall. Finally, extensive experiments on real-world patent data demonstrate the effectiveness of the UMTPE model.