CometCloudCare (C3): Distributed Machine Learning Platform-as-a-Service with Privacy Preservation

2014 
The growth of data sharing initiatives in neuroscience and genomics [14, 16, 19, 25] represents an exciting opportunity to confront the “small N” problem plaguing contemporary studies [20]. When possible, open data sharing provides the greatest benefit. However, some data cannot be shared at all due to privacy concerns and/or the risk of re-identification. Sharing other data sets is hampered by the proliferation of complex data use agreements (DUAs), which preclude truly automated data mining. These DUAs arise from concerns about the privacy and confidentiality of subjects; though many do permit direct access to data, they often require a cumbersome approval process that can take months. Additionally, some researchers have expressed doubts about the efficiency and scalability of centralized data storage and analysis for large-volume datasets [18]. In response, distributed cloud solutions have been suggested [23]; however, the task of transferring large volumes of imaging data (processed or unprocessed) to and from the cloud is far from trivial. More worrisome than the challenges of data transfer and storage is the tendency for labs to collect, label, and maintain neuroimaging data in idiosyncratic ways. Developing standardized data collection and storage is a recent trend [26], and achieving such a standard may take years, or may never happen at all.