Dynamic k-NN Classification Based on Region Homogeneity

2020 
The effectiveness of the k-NN classifier depends heavily on the value of the parameter k, which is chosen in advance and remains fixed during classification. Different values suit different datasets, so parameter tuning is usually unavoidable. A dataset may simultaneously contain well-separated and poorly separated classes, as well as noise in certain regions of the metric space. Thus, a different k value should be used depending on the region where the unclassified instance lies. This paper proposes a new algorithm with five heuristics for dynamic k determination. The heuristics rely on a fast clustering pre-processing procedure that builds an auxiliary data structure, which provides information about the region where the unclassified instance lies. The heuristics exploit this information to determine dynamically how many neighbours will be examined. Neither the construction of the data structure nor the heuristics require any input parameters. The proposed heuristics are tested on several datasets. The experimental results show that in many cases they achieve higher classification accuracy than the k-NN classifier using the best-tuned k value.
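To make the idea of region-dependent k concrete, the following is a minimal sketch and not the paper's algorithm or any of its five heuristics, which the abstract does not specify. It assumes regions are formed by assigning each training item to the nearest class mean, and it uses a hypothetical rule: k = 1 in a homogeneous region, otherwise k = 2m + 1, where m is the number of minority-class items in that region. All function names and the toy data are illustrative assumptions.

```python
from collections import Counter, defaultdict
from math import dist


def class_means(points, labels):
    """One centre per class: the mean of that class's training items."""
    groups = defaultdict(list)
    for p, y in zip(points, labels):
        groups[y].append(p)
    return [tuple(sum(c) / len(g) for c in zip(*g)) for g in groups.values()]


def nearest(centers, p):
    """Index of the centre closest to point p."""
    return min(range(len(centers)), key=lambda i: dist(p, centers[i]))


def build_regions(points, labels):
    """Pre-processing: record the class counts of each region (assumed structure)."""
    centers = class_means(points, labels)
    counts = [Counter() for _ in centers]
    for p, y in zip(points, labels):
        counts[nearest(centers, p)][y] += 1
    return centers, counts


def choose_k(region_counts, n_train):
    """Hypothetical rule: k = 1 if the region is pure, else 2 * minority + 1."""
    total = sum(region_counts.values())
    if total == 0:
        return 1
    minority = total - max(region_counts.values())
    return min(2 * minority + 1, n_train)


def predict(query, points, labels, centers, counts):
    """Classify query with a k chosen from the region it falls in."""
    k = choose_k(counts[nearest(centers, query)], len(points))
    ranked = sorted(zip(points, labels), key=lambda pl: dist(query, pl[0]))
    votes = Counter(y for _, y in ranked[:k])
    return votes.most_common(1)[0][0], k


# Toy data: a clean 'a' region and a 'b' region containing one noisy 'a' item.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (5, 5), (5, 6), (6, 5), (6, 6), (5.5, 5.5)]
labels = ['a', 'a', 'a', 'a', 'b', 'b', 'b', 'b', 'a']
centers, counts = build_regions(points, labels)
```

With this toy data, a query in the clean region is classified with a single neighbour, while a query next to the noisy item is classified with three neighbours, so the noisy item is outvoted where a fixed k = 1 would follow it.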