Clustering from Constraint Graphs.
2008
In constrained clustering it is common to model the pairwise constraints as edges on the graph of observations. Using results from graph theory, we analyze such constraint graphs in two contexts, both of immediate value to practitioners. First, we explore the issue of constraint noise under several intuitive noise models. We apply results from random graph theory, which facilitate the analysis of finite-sized graphs and realistic data partitions and noise levels, to obtain a quantification of the effect noisy edges may have on any constrained clustering algorithm under a set of commonly-used assumptions. We also demonstrate the dangers in the common practice of connected-component constraint set augmentation, when used in the presence of noise. Second, we describe two practical randomized algorithms that estimate the number of induced clusters using only a small number of constraints. We conclude with an experimental evaluation that shows the effect of noise on common UCI data sets, as well as some aspects of the behavior of our algorithms.
Keywords:
- Machine learning
- Pattern recognition
- Artificial intelligence
- Constrained clustering
- Butterfly graph
- Constraint graph
- Coxeter graph
- Distance-hereditary graph
- Voltage graph
- Factor-critical graph
- Simplex graph
- Mathematics
- Cluster analysis
- Graph theory
- Random graph
- Modular decomposition
- Block graph
- Line graph
- Computer science
- Correction
- Source
- Cite
- Save
- Machine Reading By IdeaReader
14
References
3
Citations
NaN
KQI