Density-based clustering with side information and active learning

Viet Vu Vu, Hong-Quan Do

2017

Data clustering is one of the most important tasks in machine learning and data mining; it aims to discover structure and the relationships between observations in data sets. In many situations, side information about the clusters is available in addition to the feature values. For example, the cluster labels of some observations may be known (called seeds), or certain pairs of observations may be known to belong (or not) to the same cluster (pairwise constraints). Many semi-supervised clustering algorithms have been presented in the literature to improve clustering accuracy by effectively exploiting this available side information. However, each algorithm usually uses only one kind of side information. In this paper, we propose a new semi-supervised density-based clustering algorithm, named MCSSDBS, which integrates both kinds of side information and embeds an active learning strategy in the process of finding clusters. Experiments conducted on real data sets from UCI show the effectiveness of our algorithm compared with semi-supervised density-based clustering (SSDBSCAN).
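
The two kinds of side information mentioned above are related: seeds with the same label imply a must-link pair, and seeds with different labels imply a cannot-link pair. The following is a minimal sketch of that relationship only, not the MCSSDBS algorithm itself; the function name `constraints_from_seeds` and the dictionary representation of seeds are assumptions made for illustration.

```python
from itertools import combinations


def constraints_from_seeds(seeds):
    """Derive pairwise constraints from seeds.

    seeds: dict mapping an observation index to its known cluster label
           (a hypothetical representation chosen for this sketch).
    Returns (must_link, cannot_link), two sets of index pairs (i, j) with i < j.
    """
    must_link, cannot_link = set(), set()
    # Visit every unordered pair of labelled observations once.
    for (i, label_i), (j, label_j) in combinations(sorted(seeds.items()), 2):
        if label_i == label_j:
            must_link.add((i, j))      # same known label: same cluster
        else:
            cannot_link.add((i, j))    # different known labels: different clusters
    return must_link, cannot_link


# Three labelled observations out of a larger data set.
seeds = {0: "a", 3: "a", 7: "b"}
must_link, cannot_link = constraints_from_seeds(seeds)
```

The reverse direction does not hold in general: pairwise constraints say whether two observations share a cluster, but not which cluster it is, which is one reason the two kinds of side information are usually handled separately.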
