Traffic Optimization for ExaScale Science Applications
2017
Massive datasets continue to be acquired, simulated, processed and
analyzed by globally distributed scientific collaborations, and the
volume of this data is growing exponentially. These datasets need to
be exchanged through a global network infrastructure. Applications
that manage and analyze such massive data volumes can benefit
substantially from information about the networking, computing and
storage resources at each member site, and more directly from
network-resident services that optimize and load-balance resource
usage among multiple data transfer and analytic requests and achieve
better utilization of the multiple resources in clusters. The
Application-Layer Traffic Optimization (ALTO) protocol can, via
extensions, provide network information about different
clusters/sites to both users and proactive network management
services where applicable, with the goal of improving both
application performance and network resource utilization. However, it
has been verified in both science networks
and commercial data center networks that network resources are, in
many cases, not the bottleneck limiting the efficiency of large
dataset transfers and data-intensive analytics. To achieve greater
overall efficiency in the science programs' workflows, information
about different resources, such as computing, storage and networking,
should be provided to data-intensive applications simultaneously. In
this document, we propose that it is feasible to use existing ALTO
services to provide not only network information, but also
information about other resources in science networks, including
computing and storage.
We introduce an Exascale Science Application Orchestrator (ExaO),
which achieves efficient multi-resource allocation to support
low-latency dataset transfer and data-intensive analytics in exascale
science networks. ExaO provides simple APIs for users to submit and
manage dataset transfer and analytic requests and to monitor the
status of each request, along with fine-grained local and global
network and site state information in real time. It collects cluster
information from multiple ALTO services utilizing topology extensions
and leverages emerging SDN control capabilities to orchestrate the
resource allocation for dataset transfers and analytic tasks, leading
to improved transfer and analytic latency as well as more efficient
utilization of the multiple resources in clusters/sites.
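
The observation that the network is often not the bottleneck can be illustrated with a minimal sketch. It is not ExaO's allocation algorithm, which this abstract does not specify; the class `SiteResources`, the functions `bottleneck` and `pick_site`, and all capacity figures are hypothetical and chosen only for illustration. The sketch assumes each site's achievable rate for a request is bounded by its slowest resource, and compares a site choice based on that bound with one based on network bandwidth alone.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SiteResources:
    """Hypothetical per-site capacities, all expressed in Gbit/s."""
    name: str
    network_gbps: float
    storage_gbps: float
    compute_gbps: float  # rate at which the site can process incoming data


def bottleneck(site):
    """Return (resource_name, rate) for the slowest resource at a site."""
    rates = {
        "network": site.network_gbps,
        "storage": site.storage_gbps,
        "compute": site.compute_gbps,
    }
    resource = min(rates, key=rates.get)
    return resource, rates[resource]


def pick_site(sites):
    """Choose the site whose slowest resource is fastest."""
    return max(sites, key=lambda s: bottleneck(s)[1])


# Illustrative numbers only; they are not measurements from any network.
sites = [
    SiteResources("site-a", network_gbps=100.0, storage_gbps=20.0, compute_gbps=40.0),
    SiteResources("site-b", network_gbps=40.0, storage_gbps=35.0, compute_gbps=30.0),
]

best = pick_site(sites)
network_only = max(sites, key=lambda s: s.network_gbps)
print(best.name, bottleneck(best))
print(network_only.name, bottleneck(network_only))
```

With these made-up figures, a network-only ranking prefers site-a, although its storage limits it to a lower rate than site-b can sustain. This is the kind of case in which exposing computing and storage information alongside network information would change the decision.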