Azure Data Lake Store: A Hyperscale Distributed File Service for Big Data Analytics

2017 
Azure Data Lake Store (ADLS) is a fully-managed, elastic, scalable, and secure file system that supports Hadoop distributed file system (HDFS) and Cosmos semantics. It is specifically designed and optimized for a broad spectrum of Big Data analytics that depend on a very high degree of parallel reads and writes, as well as collocation of compute and data for high bandwidth and low-latency access. It brings together key components and features of Microsoft?s Cosmos file system-long used by internal customers at Microsoft and HDFS, and is a unified file storage solution for analytics on Azure. Internal and external workloads run on this unified platform. Distinguishing aspects of ADLS include its design for handling multiple storage tiers, exabyte scale, and comprehensive security and data sharing features. We present an overview of ADLS architecture, design points, and performance.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    22
    References
    67
    Citations
    NaN
    KQI
    []