Scalable Betweenness Centrality Maximization via Sampling

2016 
Betweenness centrality is a fundamental centrality measure in social network analysis. Given a large-scale network, how can we find the most central nodes? This question is of key importance to numerous important applications that rely on betweenness centrality, including community detection and understanding graph vulnerability. Despite the large amount of work on designing scalable approximation algorithms for betweenness centrality, estimating it on large-scale networks remains a computational challenge. In this paper, we study the Betweenness Centrality Maximization problem: given a graph $G=(V,E)$ and a positive integer $k$, find a set $S^* \subseteq V$ that maximizes betweenness centrality subject to the cardinality constraint $|S^*| \leq k$. We present an efficient randomized algorithm that provides a $(1-1/e-\epsilon)$-approximation with high probability, where $\epsilon>0$. Our results improve the current state-of-the-art result by Yoshida~\cite{yoshida2014almost}. Furthermore, we provide theoretical evidence for the validity of a crucial assumption in the literature of betweenness centrality estimation, namely that in real-world networks $O(|V|^2)$ shortest paths pass through the top-$k$ central nodes, where $k$ is a constant. On the experimental side, we perform an extensive experimental analysis of our method on real-world networks, demonstrate its accuracy and scalability, and study different properties of central nodes. Finally, we provide three graph mining applications of our method.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    43
    References
    13
    Citations
    NaN
    KQI
    []