Organising documents based on standard-example split test

Kenta Fukuoka, Tomofumi Nakano, Nobuhiro Inuzuka

2005

One purpose of text mining is to summarise a large collection of documents. This paper proposes a new method for viewing a summary of a large document set. The method consists of two techniques: one constructs classification trees using a split test called the standard-example (standard-document) split test, and the other displays the features of each class of documents classified by the trees. The standard-example split test divides examples by their distance (or similarity) from a standard example, which is selected by a criterion. This is the first method to apply this test to text mining. The display method exhibits representative words for each document class that emphasise its features.
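The split test described above can be illustrated with a minimal sketch. The abstract does not state the similarity measure or the selection criterion, so this example makes several assumptions: documents are bag-of-words counts, similarity is cosine similarity, each document carries a class label, and the standard example is the one whose best threshold gives the highest information gain. All function names here are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine similarity between two bag-of-words Counters (assumed measure)."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def information_gain(labels, near_labels, far_labels):
    """Entropy reduction obtained by splitting labels into near and far groups."""
    total = len(labels)
    remainder = (
        (len(near_labels) / total) * entropy(near_labels)
        + (len(far_labels) / total) * entropy(far_labels)
    )
    return entropy(labels) - remainder


def best_standard_example_split(docs, labels):
    """Try each document as the standard example and each observed similarity
    as the threshold. Return (gain, standard_index, threshold) for the split
    with the highest information gain (assumed selection criterion)."""
    bags = [Counter(doc.lower().split()) for doc in docs]
    best = (0.0, None, None)
    for i, standard in enumerate(bags):
        sims = [cosine_similarity(standard, bag) for bag in bags]
        for threshold in sorted(set(sims)):
            # Documents at least as similar as the threshold go to "near".
            near = [lab for lab, s in zip(labels, sims) if s >= threshold]
            far = [lab for lab, s in zip(labels, sims) if s < threshold]
            if not near or not far:
                continue  # a one-sided split carries no information
            gain = information_gain(labels, near, far)
            if gain > best[0]:
                best = (gain, i, threshold)
    return best
```

Calling `best_standard_example_split(docs, labels)` on a small labelled collection returns the gain, the index of the chosen standard document, and the similarity threshold. A tree could then be grown by applying the same search recursively to the near and far subsets, although the paper's actual construction may differ.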

Keywords:

Data mining
Machine learning
Similitude
Decision tree learning
Text mining
Artificial intelligence
Computer science
Information gain
