Today's scientific applications increasingly involve large amounts of input/output data that must be moved among multiple computing facilities via wide-area networks (WANs). WAN bandwidth, however, is growing at a much slower rate and is thus becoming a bottleneck. Moreover, network bandwidth has not been viewed as a limited resource, so coordinated allocation is lacking. Uncoordinated scheduling of competing data transfers over shared network links results in suboptimal system performance and poor user experiences. To address these problems, we propose a data transfer scheduler to coordinate and schedule data transfers between distributed computing facilities over WANs. Specifically, the scheduler prioritizes and allocates resources to data transfer requests based on user-centric utility functions in order to achieve maximum overall user satisfaction. We conducted trace-based simulations and demonstrated that our data transfer scheduling algorithms can considerably improve both data transfer performance and quantified user satisfaction compared with traditional first-come, first-served and shortest-job-first approaches.
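The utility-driven idea can be sketched roughly as follows; this is an illustrative toy, not the paper's actual algorithm, and the deadline-based utility function, request fields, and greedy allocation policy are all assumptions:

```python
# Toy sketch of utility-based transfer scheduling (assumed utility
# function and request fields, not the paper's actual design).

def utility(request, now):
    """Hypothetical utility: satisfaction falls as the time remaining
    before the user's deadline shrinks relative to the transfer time
    needed at the request's minimum acceptable rate."""
    remaining = request["deadline"] - now
    needed = request["bytes"] / request["min_rate"]
    if remaining <= 0:
        return 0.0
    return min(1.0, remaining / needed)

def schedule(requests, total_bw, now=0.0):
    """Greedily grant each request its minimum rate in order of
    decreasing utility until the shared WAN budget is exhausted."""
    granted = {}
    budget = total_bw
    for req in sorted(requests, key=lambda r: utility(r, now), reverse=True):
        if req["min_rate"] <= budget:
            granted[req["id"]] = req["min_rate"]
            budget -= req["min_rate"]
    return granted

reqs = [
    {"id": "a", "bytes": 100.0, "min_rate": 10.0, "deadline": 5.0},
    {"id": "b", "bytes": 100.0, "min_rate": 10.0, "deadline": 50.0},
    {"id": "c", "bytes": 100.0, "min_rate": 10.0, "deadline": 500.0},
]
plan = schedule(reqs, total_bw=20.0)
```

Under this toy policy, requests whose deadlines are already nearly unachievable contribute little utility, so the scheduler spends the limited bandwidth where it maximizes aggregate satisfaction.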
Query routing is an intelligent service that can direct query requests to appropriate servers that are capable of answering the queries. The goal of a query routing system is to provide efficient associative access to a large, heterogeneous, distributed collection of information providers by routing a user query to the most relevant information sources that can provide the best answer. Effective query routing not only minimizes the query response time and the overall processing cost, but also eliminates much unnecessary communication overhead over the global networks and over the individual information sources. The AQR-Toolkit divides the query routing task into two cooperating processes: query refinement and source selection. It is well known that a broadly defined query inevitably produces many false positives. Query refinement provides mechanisms to help the user formulate queries that will return more useful results and that can be processed efficiently. As a complementary process, source selection reduces false negatives by identifying and locating a set of relevant information providers from a large collection of available sources. By pruning irrelevant information sources, source selection also reduces the overhead of contacting information servers that do not contribute to the answer of the query. The system architecture of AQR-Toolkit consists of a hierarchical network (a directed acyclic graph) with external information providers at the leaves and query routers as mediating nodes. The end-point information providers support query-based access to their documents. At a query router node, a user may browse and query the meta-information about information providers registered at that query router or make use of the router's facilities for query refinement and source selection.
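A minimal sketch of the source-selection step, assuming a simple term-overlap relevance score (the scoring scheme and provider metadata format are illustrative assumptions, not the AQR-Toolkit API):

```python
# Illustrative source selection: prune providers whose metadata shares
# no terms with the query, then rank the rest by overlap so servers
# that cannot contribute to the answer are never contacted.

def select_sources(query_terms, providers, min_overlap=1):
    """Return provider names sharing at least `min_overlap` terms
    with the query, ranked by descending overlap."""
    scored = []
    for name, terms in providers.items():
        overlap = len(set(query_terms) & set(terms))
        if overlap >= min_overlap:
            scored.append((overlap, name))
    return [name for overlap, name in sorted(scored, reverse=True)]

providers = {
    "medline": {"cancer", "gene", "protein"},
    "arxiv":   {"physics", "math", "gene"},
    "uspto":   {"patent", "trademark"},
}
ranked = select_sources({"gene", "protein"}, providers)
```

In a hierarchical router network, each mediating node would apply a check of this kind against the metadata of the providers registered beneath it.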
As systems scale toward exascale, many resources will become increasingly constrained. While some of these resources have historically been explicitly allocated, many -- such as network bandwidth, I/O bandwidth, or power -- have not. As systems continue to evolve, we expect many such resources to become explicitly managed. This change will pose critical challenges to resource management and job scheduling. In this paper, we explore the potential of relaxing network allocation constraints for Blue Gene systems. Our objective is to improve batch scheduling performance, where the partition-based interconnect architecture provides a unique opportunity to explicitly allocate network resources to jobs. This paper makes three major contributions. The first is substantial benchmarking of parallel applications, focusing on assessing application sensitivity to communication bandwidth at large scale. The second is two new scheduling schemes using relaxed network allocation, targeted at balancing individual job performance with overall system performance. The third is a comparative study of our scheduling schemes versus the existing scheme under different workloads, using job traces collected from the 48-rack Mira, an IBM Blue Gene/Q system at Argonne National Laboratory.
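One way to picture a relaxed network allocation policy is the placement decision below; the job fields, partition categories, and fallback rule are illustrative assumptions, not the schemes evaluated in the paper:

```python
# Toy relaxed-allocation placement: communication-sensitive jobs demand
# network-isolated partitions, while insensitive jobs may fall back to
# partitions with shared links, improving packing at a small cost.

def place_job(job, free_isolated, free_shared):
    """Return 'isolated', 'shared', or None (job must wait), given the
    free node counts in each partition category."""
    if job["comm_sensitive"]:
        if job["nodes"] <= free_isolated:
            return "isolated"
        return None              # only isolation preserves its performance
    if job["nodes"] <= free_isolated:
        return "isolated"        # take isolation when it is free anyway
    if job["nodes"] <= free_shared:
        return "shared"          # relaxed allocation: accept shared links
    return None
```

The benchmarking contribution in the abstract is what would supply the `comm_sensitive` classification in practice: applications shown to be insensitive to communication bandwidth are the safe candidates for relaxed placement.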
Increasingly large-scale simulations are generating an unprecedented amount of data. However, the growing gap between computation and I/O capacity on high-end computing machines creates a severe bottleneck for data analysis. As a solution, in-situ analytics processes output data while simulations are running and before the data are placed on disk. Data movement between simulation and analytics, however, adds overhead to in-situ analytics at scale. This paper tries to answer the following question: can we use compression technology to reduce the data movement cost and improve the performance of in-situ analytics for petascale applications? In particular, we explore when, where, and how to use compression techniques to reduce the data movement cost between simulation and analytics. To determine the best algorithm and the best place to compress data in a given situation, we introduce an adaptive data compression algorithm. The adaptive compression service is developed and analyzed for the in-situ analytics middleware. Experimental results demonstrate that the compression service increases data transfer bandwidth and improves the application's end-to-end transfer performance.
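The "when" decision reduces to a cost comparison of the kind sketched below; the parameter names and the simple two-term cost model are assumptions for illustration, not the paper's adaptive algorithm:

```python
# Compress only when compression time plus the reduced transfer time
# beats sending the raw data. (Simplified model: ignores decompression
# on the analytics side and any overlap of compression with transfer.)

def should_compress(size_bytes, bw_bytes_s, comp_rate_bytes_s, ratio):
    """Compare end-to-end cost with and without compression.
    `ratio` is compressed_size / original_size (e.g. 0.4)."""
    raw_cost = size_bytes / bw_bytes_s
    comp_cost = (size_bytes / comp_rate_bytes_s
                 + (size_bytes * ratio) / bw_bytes_s)
    return comp_cost < raw_cost
```

The model makes the adaptivity intuitive: over a slow link (100 MB/s) compressing 1 GB at 500 MB/s to 40% of its size wins, while over a 10 GB/s link the same compression only adds delay, so the service should pass data through uncompressed.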
The research literature to date has mainly aimed at reducing energy consumption in HPC environments. In this paper we propose a job power-aware scheduling mechanism to reduce the electricity bill of HPC systems without degrading system utilization. The novelty of our job scheduling mechanism is its ability to take the variation of electricity prices into consideration as a means to make better decisions about when to schedule jobs with diverse power profiles. We verified the effectiveness of our design by conducting trace-based experiments on an IBM Blue Gene/P and a cluster system, as well as a case study on Argonne's 48-rack IBM Blue Gene/Q system. Our preliminary results show that our power-aware algorithm can reduce the electricity bill of HPC systems by as much as 23%.
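A stripped-down sketch of the price-aware decision; the threshold values, job fields, and gating rule are illustrative assumptions, not the mechanism evaluated in the paper:

```python
# Toy price-aware dispatch: during peak electricity pricing, only jobs
# under a power cap may start, deferring power-hungry jobs to off-peak
# hours while keeping nodes busy with low-power work.

def pick_next(queue, price, peak_price=0.10, power_cap_w=50_000):
    """Return the id of the first queued job allowed to start under the
    current electricity price, or None if every job must wait."""
    for job in queue:
        if price < peak_price or job["power_w"] <= power_cap_w:
            return job["id"]
    return None
```

The point of such gating is that utilization need not drop: low-power jobs still flow during expensive hours, and the high-power jobs are merely time-shifted to cheaper ones.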
With the rapid growth of datasets, more and more traditional applications are turning into big data applications running on cloud platforms. This trend has continued, and even accelerated. Compared with traditional applications, big data applications spend considerable time transferring data among computing nodes. Hence, communication optimization is essential for big data applications. The network topology and routing algorithm of the underlying system are two major factors in determining the communication performance of big data applications. Once the system is deployed, the network topology is fixed, and static or dynamic routing protocols are preinstalled; users cannot change them. Therefore, it is hard for application developers to identify the optimal network configuration for their applications with distinct communication patterns. In this study, we design a flexible container-based tuning system (Flex Tuner) allowing users to create a farm of lightweight virtual machines (containers) on host machines. In addition, we use software-defined networking (SDN) techniques to connect and direct the network traffic among these containers. Users can soft-tune the network topology and network traffic of Flex Tuner, thereby enabling application developers to analyze their applications on the same system with different network configurations. The preliminary experimental results have shown that Flex Tuner can reproduce application performance variations caused by network topology and routing algorithm. Case studies with both synthetic big data programs and benchmarks have indicated that Flex Tuner enables researchers to analyze the communication cost of their big data applications and to find a suitable network topology and routing algorithm.
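The kind of what-if topology comparison this enables can be sketched offline as follows; modeling topologies as adjacency lists and comparing average hop counts is an illustrative simplification, not the Flex Tuner implementation:

```python
# Compare candidate topologies by average shortest-path hop count, the
# kind of what-if analysis a soft-tunable network makes cheap to run.
from collections import deque

def avg_hops(adj):
    """Mean shortest-path length (in hops) over all ordered node pairs,
    computed by breadth-first search from every node."""
    total, pairs = 0, 0
    for src in adj:
        dist = {src: 0}
        q = deque([src])
        while q:
            u = q.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    q.append(v)
        for node, d in dist.items():
            if node != src:
                total += d
                pairs += 1
    return total / pairs

# Two candidate 4-node topologies for the same container farm.
ring = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
star = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}
```

A real deployment would measure application-level traffic rather than hop counts, but the workflow is the same: instantiate a topology, run the workload, compare, and re-tune without redeploying hardware.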