NoLeaks: Differentially Private Causal Discovery Under Functional Causal Model

2022 
Causal inference is widely used in clinical research, economic analysis, and other fields. Like many statistical analyses, the output of causal discovery (i.e., a causal graph) may leak demographic information about participants. For example, a causal link between a genetic variant and a rare disease can reveal that a minority patient participated in a genome-wide association study. To date, differential privacy has served as the de facto foundation for guaranteeing the privacy of causal discovery algorithms. However, existing approaches to protecting causal discovery from privacy leakage rely heavily on private conditional independence tests, which inject a considerable amount of noise and are therefore prone to inaccuracy. Owing to their limited accuracy and scalability, they are insufficient for non-trivial datasets (e.g., those with more than ten variables). In this paper, we advocate a novel focus on enforcing privacy for causal discovery algorithms based on functional causal models. First, we propose NoLeaks, a differentially private causal discovery algorithm that achieves both high accuracy and high efficiency compared with prior works. Second, we design a quasi-Newton numerical optimization algorithm that solves the NoLeaks formulation efficiently. Third, we evaluate NoLeaks on both public benchmarks and synthetic data, observing that it matches or even surpasses state-of-the-art (non-private) approaches. We also find that NoLeaks scales smoothly to large datasets on which existing works fail. A case study and a downstream application further illustrate the versatility of NoLeaks.
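
The abstract does not describe NoLeaks's internals, so the following is only a minimal, illustrative Python sketch of the general recipe it points at: score a candidate functional causal model (here, a linear structural equation model for simplicity), privatize per-record gradients with the Gaussian mechanism, and drive the search with a BFGS-style quasi-Newton update. All names (`dp_gradient`, `noleaks_sketch`), the least-squares score, the clipping and budget-splitting choices, and the omission of an acyclicity constraint are assumptions for illustration, not the paper's actual method.

```python
import numpy as np

def dp_gradient(X, W, epsilon, delta, clip=1.0, rng=None):
    """Differentially private gradient of a linear-SEM least-squares score.

    Illustrative only: per-sample gradients are L2-clipped to bound
    sensitivity, averaged, then perturbed with Gaussian-mechanism noise.
    """
    if rng is None:
        rng = np.random.default_rng(0)
    n, d = X.shape
    R = X - X @ W                            # residuals of x ≈ W^T x, per row
    G = -np.einsum("ni,nj->nij", X, R)       # per-sample grads: -x r^T, (n, d, d)
    norms = np.linalg.norm(G.reshape(n, -1), axis=1)
    G = G / np.maximum(1.0, norms / clip)[:, None, None]   # clip each sample
    g = G.mean(axis=0)
    # Gaussian mechanism calibrated to the clipped mean's sensitivity (clip / n).
    sigma = clip * np.sqrt(2.0 * np.log(1.25 / delta)) / (epsilon * n)
    return g + rng.normal(0.0, sigma, size=g.shape)

def noleaks_sketch(X, epsilon=1.0, delta=1e-5, n_iter=40, lr=0.5):
    """BFGS-style quasi-Newton search over the SEM weight matrix W.

    The privacy budget is split evenly across iterations via basic
    composition; a real method would also enforce acyclicity of W.
    """
    n, d = X.shape
    rng = np.random.default_rng(0)
    eps_t, delta_t = epsilon / n_iter, delta / n_iter
    w = np.zeros(d * d)                      # flattened weight matrix
    H = np.eye(d * d)                        # inverse-Hessian approximation
    g = dp_gradient(X, w.reshape(d, d), eps_t, delta_t, rng=rng).ravel()
    for _ in range(n_iter - 1):
        s = lr * (-H @ g)                    # quasi-Newton step
        w_new = w + s
        g_new = dp_gradient(X, w_new.reshape(d, d), eps_t, delta_t, rng=rng).ravel()
        y = g_new - g
        sy = s @ y
        if sy > 1e-10:                       # BFGS curvature condition
            rho, I = 1.0 / sy, np.eye(d * d)
            H = (I - rho * np.outer(s, y)) @ H @ (I - rho * np.outer(y, s)) \
                + rho * np.outer(s, s)
        w, g = w_new, g_new
    W = w.reshape(d, d)
    np.fill_diagonal(W, 0.0)                 # drop self-loops
    return W

# Tiny usage example on synthetic data from a chain X0 -> X1 -> X2.
rng = np.random.default_rng(1)
x0 = rng.normal(size=1000)
x1 = 0.8 * x0 + 0.1 * rng.normal(size=1000)
x2 = 0.8 * x1 + 0.1 * rng.normal(size=1000)
X = np.column_stack([x0, x1, x2])
print(np.round(noleaks_sketch(X), 2))        # noisy estimate of edge weights
```

Note the key privacy-accuracy trade-off this sketch makes visible: noise scales with the clipping bound and the number of iterations, which is why a fast-converging quasi-Newton solver (fewer iterations, smaller total noise) is attractive for a differentially private score-based search.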