A Cross Language Text Categorization Algorithm from the Perspective of Information Retrieval

Yue Liu, Ming Tian, Weitao Zhou, Lin Dai

2012

In this paper, we propose a novel method that performs Cross Language Text Categorization (CLTC) from the perspective of Information Retrieval. We represent an input document in the target language as a query in the source language. We then run this query against the training documents in the source language and retrieve the K most relevant results. Finally, we use the class labels of those K results to predict the class of the input document. The only external resource our method requires is a bilingual dictionary. Experimental results show that our method gives promising performance, outperforming a translation-based method.
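
The pipeline described in the abstract (dictionary-based query construction, retrieval over source-language training documents, and label prediction from the K best results) can be illustrated with a minimal sketch. The abstract does not specify the retrieval model or how the K labels are combined, so the TF-IDF cosine ranking and the majority vote below are assumptions, as are all function names and the toy data layout; this is not the authors' implementation.

```python
import math
from collections import Counter


def build_query(target_tokens, dictionary):
    """Map target-language tokens to source-language terms.

    `dictionary` is a hypothetical bilingual dictionary:
    target word -> list of source-language translations.
    Words missing from the dictionary are skipped.
    """
    query = []
    for token in target_tokens:
        query.extend(dictionary.get(token, []))
    return query


def tfidf_index(docs):
    """Build TF-IDF vectors (assumed retrieval model) for tokenized docs."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    vectors = [{t: tf * idf[t] for t, tf in Counter(doc).items()} for doc in docs]
    return vectors, idf


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def classify(target_tokens, dictionary, train_docs, train_labels, k=3):
    """Predict a class by majority vote (an assumption) over the top-K hits.

    Returns None when no training document matches the query.
    """
    query = build_query(target_tokens, dictionary)
    vectors, idf = tfidf_index(train_docs)
    # Query terms unseen in the training collection carry no weight.
    q_vec = {t: tf * idf[t] for t, tf in Counter(query).items() if t in idf}
    scored = sorted(
        ((cosine(q_vec, vec), label) for vec, label in zip(vectors, train_labels)),
        key=lambda pair: pair[0],
        reverse=True,
    )
    top = [label for score, label in scored[:k] if score > 0.0]
    if not top:
        return None
    return Counter(top).most_common(1)[0][0]
```

Listing every dictionary translation of a word in the query, rather than picking a single one, sidesteps translation disambiguation; whether the paper takes this route is not stated in the abstract.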

Keywords:

CLTC
Language identification
Information retrieval
RDF query language
Universal Networking Language
Document retrieval
Data control language
Concept search
Language model
Computer science
Natural language processing
Question answering
Information extraction
Artificial intelligence
