Stylistic Characteristics and Retrieval of Chinese Academic Literatures: A Comparative Study on the Stylistic Characteristics between Academic Papers and Press Reports of Computer Science

2014 
Computer science academic documents have distinctive stylistic features that can be exploited to facilitate the automatic identification and retrieval of Chinese computer science academic papers on the web. This paper builds a corpus of computer science academic literature and a corpus of IT news, compares the two in terms of typical expressions, average sentence length, and the ratio of Chinese characters to Roman letters, and assigns a weight to each of these features. Finally, the results are applied to the Baidu-based NSIRS system. A precision evaluation on NSIRS shows a significant advantage of this approach over a previous study that used the same system. 2 figs. 4 tabs. 14 refs.