Exploratory Study of Privacy Preserving Fraud Detection

2018 
With the wide adoption of the Internet, digital transactions have surged exponentially, and so has impersonation fraud. While machine learning techniques show strong promise as building blocks for digital fraud detection systems, clients may be reluctant to share raw data with such systems because of privacy concerns. Emerging privacy-preserving machine learning techniques that employ homomorphic encryption resolve this conundrum, but unfortunately increase the computational overhead of detection. In this paper, we present a first-of-a-kind empirical performance study of a private fraud detection system developed at SiS ID, a French business security platform. A privacy-preserving decision tree that classifies transactions into four risk classes (safe, moderately risky, very risky and fraud) is trained on more than 160,000 real-world transactions. We quantitatively compare classification accuracy, latency and network bandwidth under various combinations of encryption parameters and learning hyper-parameters, in order to explore the impact of configuration on performance. Our results show that the computation and communication overhead of processing encrypted data increases by an order of magnitude of 5, and depends heavily on the configuration of the encryption key and the number of nodes in the decision tree.