Minimizing Dependence between Graphs

2017 
In recent years, modeling the relation between two graphs has received unprecedented attention from researchers due to its wide applications in many areas, such as social analysis and bioinformatics. The nature of relations between two graphs can be divided into two categories: the vertex relation and the link relation. Many studies focus on modeling the vertex relation between graphs and try to find the vertex correspondence between two graphs. However, the link relation between graphs has not been fully studied. Specifically, we model the cross-graph link relation as cross-graph dependence, which reflects the dependence of a vertex in one graph on a vertex in the other graph. A generic problem, called Graph Dependence Minimization (GDM), is defined as: given two graphs with cross-graph dependence, how to select a subset of vertexes from one graph and copy them to the other, so as to minimize the cross-graph dependence. Many real applications can benefit from the solution to GDM. Examples include reducing the cross-language links in online encyclopedias, optimizing the cross-platform communication cost between different cloud services, and so on. This problem is trivial if we can select as many vertexes as we want to copy. But what if we can only choose a limited number of vertexes to copy so as to make the two graphs as independent as possible? We formulate GDM with a budget constraint into a combinatorial optimization problem, which is proven to be NP-hard. We propose two algorithms to solve GDM. Firstly, we prove the submodularity of the objective function of GDM and adopt the size-constrained submodular minimization (SSM) algorithm to solve it. Since the SSM-based algorithm cannot scale to large graphs, we design a heuristic algorithm with a provable approximation guarantee. We prove that the error achieved by the heuristic algorithm is bounded by an additive factor which is proportional to the square of the given budget. Extensive experiments on both synthetic and real-world graphs show that the proposed algorithms consistently outperform the well-studied graph centrality measure based solutions. Furthermore, we conduct a case study on the Wikipedia graphs with millions of vertexes and links to demonstrate the potential of GDM to solve real-world problems.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    23
    References
    0
    Citations
    NaN
    KQI
    []