Salience Rank: Efficient Keyphrase Extraction with Topic Modeling.

2017 
Topical PageRank (TPR) uses latent topic distribution inferred by Latent Dirichlet Allocation (LDA) to perform ranking of noun phrases extracted from documents. The ranking procedure consists of running PageRank K times, where K is the number of topics used in the LDA model. In this paper, we propose a modification of TPR, called Salience Rank. Salience Rank only needs to run PageRank once and extracts comparable or better keyphrases on benchmark datasets. In addition to quality and efficiency benefit, our method has the flexibility to extract keyphrases with varying tradeoffs between topic specificity and corpus specificity.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    19
    References
    30
    Citations
    NaN
    KQI
    []