A Splitting Criteria Based on Similarity in Decision Tree Learning

2012 
Decision trees are considered to be among the most effective and widely used data mining techniques for classification; their representation is intuitive and generally easy for humans to comprehend. The most critical issue in the learning process of decision trees is the splitting criterion. In this paper, we first define the similarity computation commonly used in data clustering and apply it to the learning process of decision trees. We then propose a novel splitting criterion that chooses the split with maximum similarity; the resulting decision tree is called mstree. We also suggest a pruning methodology. Experiments conducted on benchmark datasets show that the algorithm outperforms classic algorithms such as ID3 and C4.5 in classification precision and is less affected by the size of the training set.
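
The idea of choosing the split with maximum similarity can be illustrated with a minimal sketch. The abstract does not give the paper's similarity formula, so the measure below is an assumption made purely for illustration: the similarity of a subset is the fraction of example pairs that share a class label, and the similarity of a split is the size-weighted mean over its subsets. The function names `pair_similarity`, `split_similarity`, and `best_split`, as well as the toy data, are hypothetical and are not taken from the paper.

```python
from collections import Counter, defaultdict


def pair_similarity(labels):
    """Fraction of example pairs sharing a class label (assumed measure, not the paper's)."""
    n = len(labels)
    if n < 2:
        return 1.0  # a single example is trivially similar to itself
    same = sum(c * (c - 1) for c in Counter(labels).values())
    return same / (n * (n - 1))


def split_similarity(rows, labels, attr):
    """Size-weighted mean similarity of the subsets produced by splitting on `attr`."""
    groups = defaultdict(list)
    for row, label in zip(rows, labels):
        groups[row[attr]].append(label)
    total = len(labels)
    return sum(len(g) / total * pair_similarity(g) for g in groups.values())


def best_split(rows, labels, attrs):
    """Return the attribute whose split has maximum similarity."""
    return max(attrs, key=lambda a: split_similarity(rows, labels, a))


# Hypothetical toy data: `outlook` separates the classes perfectly, `windy` does not.
rows = [
    {"outlook": "sunny", "windy": "no"},
    {"outlook": "sunny", "windy": "yes"},
    {"outlook": "rain", "windy": "no"},
    {"outlook": "rain", "windy": "yes"},
]
labels = ["play", "play", "stay", "stay"]
print(best_split(rows, labels, ["outlook", "windy"]))
```

On this toy data the split on `outlook` yields two pure subsets, so it is selected over `windy`. A full tree learner would apply the same selection recursively to each subset and then prune, which this sketch does not attempt.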