Aggregation for Sensitive Data

2019 
In many modern applications, privacy and security considerations, along with regulations such as the GDPR, limit how data can be stored and shared with third parties. Specifically, access to individual-level data points is restricted, and machine learning models must be trained on aggregated versions of the datasets. Learning with aggregated data is a new and relatively unexplored form of semi-supervision. We tackle this problem by designing aggregation paradigms that conform to certain kinds of privacy or non-identifiability requirements. We further develop novel learning algorithms that can learn from these aggregates alone. We motivate our framework for the case of Gaussian regression, and subsequently extend our techniques to subsume arbitrary binary classifiers and generalised linear models. We provide theoretical results and empirical evaluation of our methods on real data from healthcare and telecom.
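To make the Gaussian regression setting concrete, here is a minimal sketch of the simplest possible case. It is not the algorithm developed in this work. It assumes that both features and targets are released only as per-group means together with group sizes, and that the underlying model is linear with Gaussian noise. Because averaging preserves the linear relationship, weighted least squares on the group means can recover the coefficients. The function name `fit_from_group_means`, the group assignment, and the synthetic data are all hypothetical.

```python
import numpy as np

def fit_from_group_means(x_means, y_means, group_sizes):
    """Weighted least squares on group-level means (illustrative only).

    A group of size n has mean-noise variance sigma^2 / n, so each row
    is scaled by sqrt(n) to weight larger groups more heavily.
    """
    w = np.sqrt(np.asarray(group_sizes, dtype=float))
    design = np.column_stack([np.ones(len(x_means)), x_means]) * w[:, None]
    target = np.asarray(y_means, dtype=float) * w
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return coef  # [intercept, weight_1, ..., weight_d]

# Synthetic individual-level data (hypothetical, never shown to the learner).
rng = np.random.default_rng(0)
n, d, n_groups = 6000, 3, 200
true_w = np.array([1.5, -2.0, 0.5])
true_b = 0.7
X = rng.normal(size=(n, d))
y = true_b + X @ true_w + rng.normal(scale=0.5, size=n)

# Aggregation step: only per-group means and sizes are released.
groups = np.arange(n) % n_groups
sizes = np.bincount(groups, minlength=n_groups)
x_sums = np.zeros((n_groups, d))
np.add.at(x_sums, groups, X)
x_means = x_sums / sizes[:, None]
y_means = np.bincount(groups, weights=y, minlength=n_groups) / sizes

# Learning step: uses the aggregates only.
coef = fit_from_group_means(x_means, y_means, sizes)
print("intercept:", coef[0], "weights:", coef[1:])
```

The estimate is noisier than one fitted on individual records, since averaging shrinks the spread of the features. Aggregation schemes that release less information, for example targets only, would need different estimators.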