Group Labeling Methodology Using Distance-based Data Grouping Algorithms

2020 
Clustering algorithms are often used to form groups based on the similarity of their members. In this context, understanding a group is just as important as its composition. Identifying, or labeling groups can assist with their interpretation and, consequently, guide decision-making efforts by taking into account the features from each group. Interpreting groups can be beneficial when it is necessary to know what makes an element a part of a given group, what are the main features of a group, and what are the differences and similarities among them. This work describes a method for finding relevant features and generate labels for the elements of each group, uniquely identifying them. This way, our approach solves the problem of finding relevant definitions that can identify groups. The proposed method transforms the standard output of an unsupervised distance-based clustering algorithm into a Pertinence Degree (GP), where each element of the database receives a GP concerning each formed group. The elements with their GPs are used to formulate ranges of values for their attributes. Such ranges can identify the groups uniquely. The labels produced by this approach averaged  94 . 83%  of correct answers for the analyzed databases, allowing a natural interpretation of the generated definitions.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    0
    References
    1
    Citations
    NaN
    KQI
    []