K-nearest neighbor and C4.5 algorithms as data mining methods: advantages and difficulties

2003 
Summary form only given. Data mining is considered a fast-growing technology that results from combining existing technologies such as machine learning, database systems, statistics and visualization. Some data mining algorithms have been used to solve classification problems in databases. To illustrate this task, the k-nearest neighbor (K-NN) and C4.5 algorithms are compared in terms of their performance as classifiers. While K-NN is a supervised learning algorithm, C4.5 is an inductive learning algorithm. It is shown that the K-NN algorithm offers options for weight setting, normalization and editing the data, and that it can be used to develop hybrid systems for data mining. It is also shown that the C4.5 algorithm can generate rules from a single tree, can transform multiple decision trees into a set of classification rules, and can be used to better scale up rule generation in terms of the size and number of rules and learning time.
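
The K-NN options named in the abstract (normalization and weight setting) can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `min_max_normalize` and `knn_predict`, the choice of min-max scaling, Euclidean distance and inverse-distance vote weighting, and the toy data are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def min_max_normalize(rows):
    """Rescale each feature column to [0, 1]; constant columns map to 0."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    # A zero span (constant column) is replaced by 1.0 to avoid division by zero.
    spans = [(max(c) - lo) or 1.0 for c, lo in zip(cols, lows)]
    scaled = [[(v - lo) / s for v, lo, s in zip(row, lows, spans)] for row in rows]
    return scaled, lows, spans


def knn_predict(train_x, train_y, query, k=3, weighted=True):
    """Classify `query` by a vote among its k nearest training points.

    With weighted=True each neighbor votes with weight 1 / distance
    (an assumed weighting scheme); otherwise every neighbor counts equally.
    """
    scaled, lows, spans = min_max_normalize(train_x)
    # Apply the training-set scaling to the query point.
    q = [(v - lo) / s for v, lo, s in zip(query, lows, spans)]
    neighbors = sorted(
        ((math.dist(row, q), label) for row, label in zip(scaled, train_y)),
        key=lambda pair: pair[0],
    )
    votes = defaultdict(float)
    for dist, label in neighbors[:k]:
        votes[label] += 1.0 / (dist + 1e-9) if weighted else 1.0
    return max(votes, key=votes.get)


# Toy data (hypothetical): the second feature has a much larger range,
# which is why normalization matters before computing distances.
train_x = [[1.0, 100.0], [1.2, 110.0], [3.0, 300.0], [3.2, 310.0]]
train_y = ["a", "a", "b", "b"]
prediction = knn_predict(train_x, train_y, [1.1, 105.0], k=3)
```

Without the scaling step, the second feature would dominate the Euclidean distance simply because of its units; the weighting step lets closer neighbors count for more than distant ones within the same k.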