Stylistic Characteristics and Retrieval of Chinese Academic Literatures: A Comparative Study on the Stylistic Characteristics between Academic Papers and Press Reports of Computer Science
2014
Computer science academic documents have distinctive stylistic features that can be exploited to facilitate the automatic identification and retrieval of Chinese computer science academic papers on the web. This paper builds a corpus of computer science academic literature and a corpus of IT news, compares the two on typical expressions, average sentence length, and the ratio of Chinese characters to Roman letters, and assigns each feature a weight. The results are then applied to the Baidu-based NSIRS system. Precision evaluation on NSIRS shows a significant advantage for this approach over a previous study that used the same system. 2 figs. 4 tabs. 14 refs.
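As a rough illustration of the three features named in the abstract, the following Python sketch computes average sentence length, the ratio of Chinese characters to Roman letters, and a count of typical expressions, then combines them with weights. It is not the paper's implementation: the marker phrases in `ACADEMIC_MARKERS`, the function names, the weights passed to `weighted_score`, and the choice to treat a zero Roman-letter count as one are all assumptions made for the example.

```python
import re

# Hypothetical marker phrases; the paper's actual expression list is not given here.
ACADEMIC_MARKERS = ("本文", "实验结果表明", "提出了")

SENTENCE_END = re.compile(r"[。！？!?]+")   # Chinese and ASCII sentence terminators
CJK = re.compile(r"[\u4e00-\u9fff]")        # CJK Unified Ideographs (basic block)
ROMAN = re.compile(r"[A-Za-z]")             # unaccented Roman letters


def stylistic_features(text: str) -> dict:
    """Return the three stylistic features for one document."""
    sentences = [s.strip() for s in SENTENCE_END.split(text) if s.strip()]
    cjk_count = len(CJK.findall(text))
    roman_count = len(ROMAN.findall(text))
    avg_len = sum(len(s) for s in sentences) / len(sentences) if sentences else 0.0
    return {
        # Mean number of characters per sentence.
        "avg_sentence_len": avg_len,
        # Assumption: a document with no Roman letters is treated as having one,
        # so the ratio stays finite.
        "cjk_to_roman": cjk_count / max(roman_count, 1),
        # Total occurrences of the (hypothetical) marker phrases.
        "marker_hits": sum(text.count(m) for m in ACADEMIC_MARKERS),
    }


def weighted_score(features: dict, weights: dict) -> float:
    """Linear combination of features; features without a weight count as zero."""
    return sum(weights.get(name, 0.0) * value for name, value in features.items())


if __name__ == "__main__":
    sample = "本文提出了一种新方法。实验结果表明该方法有效。"
    feats = stylistic_features(sample)
    print(feats)
    # Illustrative weights only; the paper derives its own from the two corpora.
    print(weighted_score(feats, {"avg_sentence_len": 0.1, "marker_hits": 1.0}))
```

In practice the weights would come from comparing feature distributions across the academic and news corpora, and a score threshold would decide whether a retrieved page is treated as an academic paper.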