Proceedings of the fifth international workshop on Data-Intensive Distributed Computing

Date: 2011
It is our great pleasure to welcome you to the Sixth International Workshop on Data-intensive Distributed Computing (DIDC 2014), which is held in conjunction with the International ACM Symposium on High Performance Distributed Computing (HPDC 2014). The data needs of scientific as well as commercial applications from a diverse range of fields have been increasing exponentially in recent years. Digital data generated from sources such as scientific instruments, sensors, internet transactions, email, video, and click streams can be large, diverse, longitudinal, and distributed. This poses new challenges and requirements for offline and real-time processing, where the extraction of meaningful information can open novel application areas and lead to new breakthroughs. This data deluge and the growing demand for large-scale data processing have necessitated collaboration and the sharing of data collections among the world's leading education, research, and industrial institutions, as well as the use of distributed resources owned by collaborating parties. In a widely distributed environment, data is often not locally accessible and must therefore be retrieved and stored remotely. While traditional distributed systems work well for computation that requires limited data handling, they may fail in unexpected ways when the computation accesses, creates, and moves large amounts of data, especially over wide-area networks. Further, the data accessed and created is often poorly described, lacking both metadata and provenance. Scientists, researchers, and application developers are often forced to solve basic data-handling issues, such as physically locating data, accessing it, and moving it to visualization or compute resources for further analysis. Although many efforts have been made to develop new programming paradigms and models that can handle the data needs of an application automatically, the results are still far from optimal.
DIDC focuses on the challenges imposed by data-intensive applications on distributed systems, and on the different state-of-the-art solutions proposed to overcome these challenges. It brings together the collaborative and distributed computing community and the data management community in an effort to generate productive conversations on the planning, management, and scheduling of data handling tasks and data storage resources. This year's workshop continues the tradition of gathering distinguished speakers and providing a diverse program, with topics ranging from parallel programming models for data-intensive applications to Cloud architecture and testbed design. We also include a Hot Topics session that presents and discusses current trends and upcoming challenges in DIDC, such as data distribution for GPU processing models in exascale computing, and challenges and future directions in big data processing.