A novel approach for ranking web documents based on query-optimized personalized PageRank

2020 
Ranking plays an important role when searching web documents in a huge corpus: it not only reduces search time but also surfaces the documents most useful to the user. In this paper, we extend our earlier query-optimized PageRank approach by combining TF-IDF with the personalized PageRank algorithm to produce a robust ranking mechanism. In our earlier approach, we modeled a ranking scheme that considers the link structure of the documents along with their content. We also propose a novel feature selection technique named ‘Term-term correlation-based feature selection’ (TCFS), which removes noise terms from the documents before the ranking process starts. We believe that incorporating TCFS and the personalized PageRank of the documents, along with their relevance, will improve the retrieval results. The aim is to modify the link structure based on the similarity score between the content of each document and the user query. Experimental results show that the proposed feature selection technique can outperform conventional feature selection techniques, and that the performance of the combined TF-IDF and personalized PageRank approach is promising compared to traditional approaches.
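To make the general idea concrete, the following is a minimal sketch of one way TF-IDF similarity could steer a personalized PageRank computation. It is not the paper's algorithm, and it does not implement TCFS. The function names (`tfidf_vectors`, `cosine`, `query_personalized_pagerank`), the smoothed IDF formula, the damping factor of 0.85, and the toy corpus are all assumptions made for illustration. The sketch assumes that both the teleport distribution and the outgoing link weights are made proportional to each document's cosine similarity to the query.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one sparse TF-IDF dict per tokenized document, plus the IDF table."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    # Smoothed IDF (an assumed variant; the abstract does not specify one).
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    vecs = [{t: tf * idf[t] for t, tf in Counter(doc).items()} for doc in docs]
    return vecs, idf


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def query_personalized_pagerank(links, sims, damping=0.85, iters=100, tol=1e-10):
    """Power iteration where teleport and link weights follow query similarity.

    links: dict mapping a document index to the list of indices it links to.
    sims:  list of non-negative query-document similarity scores.
    """
    n = len(sims)
    total = sum(sims)
    # Teleport distribution: proportional to similarity, uniform as a fallback.
    teleport = [s / total for s in sims] if total > 0 else [1.0 / n] * n
    rank = teleport[:]
    for _ in range(iters):
        new = [(1 - damping) * t for t in teleport]
        for u in range(n):
            outs = links.get(u, [])
            weights = [sims[v] for v in outs]
            wsum = sum(weights)
            if not outs or wsum == 0:
                # Dangling node, or no relevant targets: follow the teleport vector.
                for v in range(n):
                    new[v] += damping * rank[u] * teleport[v]
            else:
                # Reweight each outgoing link by the similarity of its target.
                for v, w in zip(outs, weights):
                    new[v] += damping * rank[u] * w / wsum
        delta = sum(abs(a - b) for a, b in zip(new, rank))
        rank = new
        if delta < tol:
            break
    return rank


# Toy corpus (hypothetical): three tokenized documents and a link graph.
docs = [
    ["pagerank", "link", "analysis"],
    ["cooking", "recipes", "pasta"],
    ["personalized", "pagerank", "query"],
]
links = {0: [1, 2], 1: [0], 2: [0]}
query = ["personalized", "pagerank"]

doc_vecs, idf = tfidf_vectors(docs)
query_vec = {t: tf * idf.get(t, 0.0) for t, tf in Counter(query).items()}
sims = [cosine(query_vec, v) for v in doc_vecs]
scores = query_personalized_pagerank(links, sims)
print(sorted(range(len(docs)), key=lambda i: scores[i], reverse=True))
```

In this toy setup the off-topic document shares no terms with the query, so it receives no teleport mass and no reweighted link mass. Whether to zero out such links entirely or keep a small floor weight is a design choice the abstract does not settle.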