DUTIR at TREC 2007 Blog Track.

Rui Song, Qin Tang, Daming Shi, Hongfei Lin, Zhihao Yang

2007

This paper describes the DUTIR submission to the TREC 2007 Blog Track. In data preprocessing, a non-English language list created from the corpus was used to remove non-English blogs, and blog templates were used to extract posts and comments. In the Opinion Retrieval task, information in the meta tags was also indexed. In the polarity subtask, an SVM-based method was used, with Information Gain attribute selection assisting the SVM. In the Feed Distillation task, three types of feeds were analyzed according to their tag structure, and information extracted from particular tags of the feeds was then indexed.
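
The Information Gain attribute selection mentioned for the polarity subtask can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy documents, and the binary term-presence representation are all assumptions made for illustration. It ranks terms by how much knowing their presence reduces the entropy of the polarity labels; the top-ranked terms would then serve as the features passed to an SVM.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def information_gain(docs, labels, term):
    """Entropy reduction obtained by splitting documents on presence of `term`.

    `docs` is a list of term sets (hypothetical representation), aligned with `labels`.
    """
    with_term = [y for d, y in zip(docs, labels) if term in d]
    without_term = [y for d, y in zip(docs, labels) if term not in d]
    total = len(labels)
    conditional = (
        (len(with_term) / total) * entropy(with_term)
        + (len(without_term) / total) * entropy(without_term)
    )
    return entropy(labels) - conditional


def select_terms(docs, labels, k):
    """Return the k terms with the highest information gain."""
    vocabulary = sorted(set().union(*docs))
    ranked = sorted(
        vocabulary,
        key=lambda t: information_gain(docs, labels, t),
        reverse=True,
    )
    return ranked[:k]


# Toy data, invented for illustration only.
docs = [
    {"great", "phone"},
    {"terrible", "phone"},
    {"great", "camera"},
    {"terrible", "camera"},
]
labels = ["positive", "negative", "positive", "negative"]

top_terms = select_terms(docs, labels, 2)
print(top_terms)
```

In this toy set the sentiment-bearing words split the labels perfectly while the topic words carry no label information, so only the former survive selection. A real system would compute the scores over the full training vocabulary and choose the cutoff empirically.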

Keywords:

Feature selection
Data mining
Information retrieval
Support vector machine
Meta element
Information extraction
Data pre-processing
Computer science
Indexation
English language
Information gain
