Chaotic geometric data perturbed and ensemble gradient homomorphic privacy preservation over big healthcare data

2021 
Data privacy protection has become a multifaceted problem with the advent of advanced data analysis tools and data mining methods. Several privacy preservation methods have therefore been developed to safeguard sensitive data in the big healthcare domain. To prevent deterministic disclosure of quasi-identifiers and to minimize the overhead of preserving privacy over big healthcare data, a novel method called Chaotic Geometric Data Perturbation and Ensemble Gradient Homomorphic (CGDP-EGH) is proposed. In this paper, we examine the privacy preservation problem in which discrete patient records in the big healthcare domain can be uniquely recognized without using identity-correlated attributes. To address it, a data perturbation method is applied to big healthcare data to ward off deterministic disclosure of quasi-identifiers while still providing the data to the essential persons for further processing. The proposed method preserves the geometrical properties of the data, with privacy protection parameters derived from a chaotic function. The essential idea is to preserve geometrical properties by means of chaotic perturbation. This is done with two models. First, a Chaotic Geometric Data Perturbation Quasi Identifier model is used to preserve the identity-correlated attributes via quasi-identifiers. Next, with the acquired quasi-identifiers, an Ensemble Gradient Homomorphic privacy preservation model is applied to protect the privacy of big healthcare data. The performance of the proposed CGDP-EGH is evaluated in terms of information loss, accuracy, overhead and true positive rate. The experimental results show that the proposed method is effective and performs better than state-of-the-art methods on the same Diabetes 130-US hospitals dataset.
Simulation results show that the proposed method improves privacy preservation accuracy and true positive rate over big healthcare data, with the analysis performed on the Diabetes 130-US hospitals dataset.
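To make the idea of geometry-preserving chaotic perturbation concrete, the following is a minimal sketch and not the authors' CGDP-EGH implementation. The abstract does not say which chaotic function or which geometric transformation is used, so the logistic map, the rotation-plus-translation transform, the function names (`logistic_map`, `perturb`, `distance`), the seed, and the sample records are all assumptions chosen for illustration. The homomorphic stage is not covered.

```python
import math

def logistic_map(seed, r=3.99, n=1, burn_in=100):
    """Return n values of the logistic map x -> r*x*(1-x) after a burn-in.

    The logistic map is an assumed stand-in for the unspecified chaotic function.
    """
    x = seed
    for _ in range(burn_in):
        x = r * x * (1.0 - x)
    values = []
    for _ in range(n):
        x = r * x * (1.0 - x)
        values.append(x)
    return values

def perturb(records, seed=0.37):
    """Rotate and translate 2-D numeric records using chaos-derived parameters.

    Rotation and translation are rigid motions, so pairwise Euclidean
    distances between records are unchanged (assumed transform).
    """
    a, tx, ty = logistic_map(seed, n=3)
    theta = 2.0 * math.pi * a            # rotation angle taken from the chaotic sequence
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    shift_x, shift_y = 100.0 * tx, 100.0 * ty
    return [
        (cos_t * x - sin_t * y + shift_x, sin_t * x + cos_t * y + shift_y)
        for x, y in records
    ]

def distance(p, q):
    """Euclidean distance between two 2-D points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

# Hypothetical quasi-identifier pairs, e.g. (age, days in hospital).
records = [(54.0, 3.0), (61.0, 7.0), (47.0, 2.0)]
masked = perturb(records)
```

In this sketch the seed acts as a secret key: the same seed reproduces the same perturbation, while the masked values differ from the originals but keep the distances that distance-based analyses rely on.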