GBAGC: A General Bayesian Framework for Attributed Graph Clustering
2014
Graph clustering, also known as community detection, is a long-standing problem in data mining. In recent years, with the proliferation of rich attribute information available for objects in real-world graphs, how to leverage not only structural but also attribute information for clustering attributed graphs has become a new challenge. Most existing work takes a distance-based approach: it proposes various distance measures to fuse structural and attribute information and then applies standard graph clustering techniques based on these distance measures. In this article, we take an alternative view and propose a novel Bayesian framework for attributed graph clustering. Our framework provides a general and principled solution to modeling both the structural and the attribute aspects of a graph. It avoids the artificial design of a distance measure required by existing methods and, furthermore, can seamlessly handle graphs with different types of edges and vertex attributes. We develop an efficient variational method for graph clustering under this framework and derive two concrete algorithms for clustering unweighted and weighted attributed graphs. Experimental results on large real-world datasets show that our algorithms significantly outperform the state-of-the-art distance-based method in terms of both effectiveness and efficiency.
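The abstract does not spell out the model, so the following is only an illustrative sketch of what a model-based (rather than distance-based) view of an attributed graph can look like. It assumes a simple generative process: each vertex draws a cluster label, its single categorical attribute is drawn from a cluster-specific distribution, and each unweighted edge is drawn from a Bernoulli whose parameter depends on the two endpoint clusters. The function name `sample_attributed_graph` and the parameters `alpha`, `theta`, and `phi` are hypothetical and not taken from the article; the article's variational inference step is not shown.

```python
import random

def sample_attributed_graph(n, alpha, theta, phi, seed=0):
    """Sample cluster labels, one categorical attribute per vertex,
    and an undirected, unweighted edge list (all names are assumptions)."""
    rng = random.Random(seed)
    clusters = range(len(alpha))
    # Cluster label of each vertex ~ Categorical(alpha)
    z = [rng.choices(clusters, weights=alpha)[0] for _ in range(n)]
    # Attribute value of each vertex ~ Categorical(theta[z_i])
    attrs = [rng.choices(range(len(theta[zi])), weights=theta[zi])[0] for zi in z]
    # Edge between i < j ~ Bernoulli(phi[z_i][z_j])
    edges = [(i, j)
             for i in range(n)
             for j in range(i + 1, n)
             if rng.random() < phi[z[i]][z[j]]]
    return z, attrs, edges

# Two clusters with distinct attribute profiles and dense within-cluster links.
alpha = [0.5, 0.5]
theta = [[0.9, 0.1], [0.1, 0.9]]
phi = [[0.3, 0.02], [0.02, 0.3]]
z, attrs, edges = sample_attributed_graph(40, alpha, theta, phi)
```

Under a model of this kind, clustering amounts to inferring the hidden labels `z` from the observed attributes and edges, so structure and attributes are combined through the likelihood instead of through a hand-designed distance.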
Keywords:
- Machine learning
- Clustering coefficient
- k-medians clustering
- Data mining
- Hierarchical clustering
- CURE data clustering algorithm
- Correlation clustering
- Cluster analysis
- Artificial intelligence
- Canopy clustering algorithm
- Mathematics
- Pattern recognition
- Brown clustering
- Single-linkage clustering
- Constrained clustering
- Computer science
- Fuzzy clustering