Distributed Data Management for Grid Computing
Citations (34)
References (0)
Related Papers (20)
Abstract:
Foreword. Preface. Acknowledgements.
PART I: AN OVERVIEW OF GRID COMPUTING.
1. What is Grid Computing? The Basics of Grid Computing. Leveling the Playing Field of Buzzword Mania. Paradigm Shift. Beyond Client/Server. New Topology.
2. Why Are Businesses Looking at Grid Computing? History Repeats Itself. Early Needs. Artists and Engineers. The Whys and Wherefores of Grid. Financial Factors. Business Drivers. Technology's Role.
3. Service-Oriented Architectures. What is Service-Oriented Architecture (SOA)? Driving Forces Behind SOA. Maturing Technology. Business. World Events. Enter Basic Supply and Demand Economics. Fundamental Shift in Computing.
4. Parallel Grid Planes. Using Art to Describe Life: Grid is the Borg. Grid Planes. Compute Grids. Data Grids. Compute and Data Grid - Parallel Planes. True Grid Must Include Data Management. Basic Data Management Requirements. Evolving the Data Grid.
PART II: DATA MANAGEMENT IN GRID COMPUTING.
5. Scaling in the Grid Topology. Evolution in Data Management. Client/Server Evolution. Grid Evolution. Different Implementations of a Data Grid. Level Zero Data Grids. FTP in Grid. Distributed Filing Systems. Faster Servers. MetaData Hubs and Distributed Data Integration. Level 1 Data Grids. Foundations. Case Study: Integrasoft Grid Fabric (IGF). Application Characteristics for Grid.
6. Traditional Data Management. Data Management. History. Features. Key for Usability.
7. Relational Data Management as a Baseline for Understanding Data Grid. Evolution of the Relational Model. Parallels to Data Management in Grid. Analysis of the Functional Tiers. Engines Determine the Type of Data Grid. Data Management Features.
8. Foundation of Comparing Data Grids. Core Engine Determines Performance and Flexibility. Replicated vs. Distributed. Centralized vs. Peer-to-Peer Synchronization. Access to the Data Grid. Support for Traditional Data Management Features. Support for Data Management Features Specific to Grid.
9. Data Regionalization. What are Data Regions? Data Regions in Traditional Terms. Data Management in a Data Grid. Data Distribution Policy. Data Distribution Policy Expression. Data Replication Policy. Data Replication Policy Expression. Synchronization Policy. Load and Store Policy. Data Load Policy Expression. Data Store Policy Expression. Event Notification Policy. Event Notification Policy Expression. Quality of Service (QoS) Levels.
10. Data Synchronization. Intra-Region Synchronization. Inter-Region Synchronization. Synchronization Architectures. Centralized Synchronization Manager. Peer-to-Peer Synchronization. Synchronization Patterns. Synchronization Granularity. Synchronization Policy Expression. Synchronization Pattern Simulations. Synchronization Policy as a Standard Interface.
11. Data Integration. Enterprise Application/Information Integration in Grid. STP, EAI, and EII. EII in Grid. Natural Separation of Process and Data. Data Load Policy. Data Store Policy. Load, Store, and Synchronization. Enterprise Data Grid Integration.
12. Data Affinity. A Measurable Quantity. What to Expect from Data Affinity. How to Achieve Data Affinity. Regionalization, Synchronization, Distribution and Data Affinity. Data Distribution is Key to Data Affinity. Data Affinity and Task Routing. Integration of Compute and Data Grids. Examples.
PART III: PRACTICAL APPLICATIONS OF GRID COMPUTING.
13. Which Applications are Good Candidates for the Grid. Grid Enabling Application Characteristics. Grid'able Applications. Use Case Presentations.
14. Calculation Intensive Applications. Description. Use Cases. General Architecture. Data Grid Analysis.
15. Data Mining, Data Warehouses. Description. Use Cases. General Architecture. Data Grid Analysis. Benefits and Data Grid Specifics.
16. Geographic Boundary Problems. Description. Business Use Cases. General Architecture. Data Grid Analysis. Benefits and Data Grid Specifics.
17. Command and Control. Problem Description. Solution Architecture. Data Grid Analysis. Application Spin Offs.
18. Web Services's Role in the SOA/SONA Evolution. Definition of Web Services. Description. Data Management: The Key Stone to Web Services. Web Services, Grid Infrastructures, and SONA. The Undiscovered Past. The SONA Model.
19. The Compute Utility. Overview. Architecture.
PART IV: REFERENCE MATERIAL.
20. Language Interface. Programmatic. Query Based. XML Based.
21. Basic Programming Examples. Hello World Example. Coarse Granularity. Coarse Data Atom. Writer Program. Reader Program. Fine Granularity Example. Writer Program. Reader Program. Random Number Surface Example.
22. Additional Reading. Useful Information Sources. White Papers. Grid. GridFTP. Distributed File Systems. Standards Bodies. Globus - Data Grid. Global Grid Forum. W3C. Public and University Grid Efforts. Scientific Research Use of Grid. Web Services. Distributed Computing. Compute Utility. Service Oriented Architectures. Data Affinity.
23. White Paper: Natural Attraction Forces of Data Bodies within a Data Grid to Describe Efficient Data Distribution Patterns. Introduction. Observation. Hypothesis. Laws of Attraction. How does this fit in with Data Distribution Patterns of Single Data Bodies within a Data Grid Fabric? Collision of Single Data Bodies. The Effects of the Data Grid on Single Data Body. Conclusions.
24. Glossary of Terms.
References. Index.
Keywords:
Data grid
Grid file
DRMAA
Preface
Foreword
1. Grids in Context
2. Computational Grids
I. Applications
3. Distributed Supercomputing Applications
4. Real-Time Widely Distributed Instrumentation Systems
5. Data-Intensive Computing
6. Teleimmersion
II. Programming Tools
7. Application-Specific Tools
8. Compilers, Languages, and Libraries
9. Object-Based Approaches
10. High-Performance Commodity Computing
III. Services
11. The Globus Toolkit
12. High-Performance Schedulers
13. High-Throughput Resource Management
14. Instrumentation and Measurement
15. Performance Analysis and Visualization
16. Security, Accounting, and Assurance
IV. Infrastructure
17. Computing Platforms
18. Network Protocols
19. Network Quality of Service
20. Operating Systems and Network Interfaces
21. Network Infrastructure
22. Testbed Bridges from Research to Infrastructure
Glossary
Bibliography
Contributor Biographies
Testbed
Blueprint
Instrumentation
Citations (7,832)
Data grid
Metadata management
e-Science
Distributed database
Citations (1,143)
Increasingly, parallel processing is being seen as the only cost-effective method for the fast solution of computationally large and data-intensive problems. The emergence of inexpensive parallel computers such as commodity desktop multiprocessors and clusters of workstations or PCs has made such parallel methods generally applicable, as have software standards for portable parallel programming. This sets the stage for substantial growth in parallel software. Data-intensive applications such as transaction processing and information retrieval, data mining and analysis, and multimedia services have provided a new challenge for the modern generation of parallel platforms. Emerging areas such as computational biology and nanotechnology have implications for algorithms and systems development, while changes in architectures, programming models and applications have implications for how parallel platforms are made available to users in the form of grid-based services. This book takes into account these new developments as well as covering the more traditional problems addressed by parallel computers. Where possible it employs an architecture-independent view of the underlying platforms and designs algorithms for an abstract model. Message Passing Interface (MPI), POSIX threads and OpenMP have been selected as programming models and the evolving application mix of parallel computing is reflected in various examples throughout the book.
Citations (1,929)
I. Parallelism
1. Introduction
2. Parallel Computer Architectures
3. Parallel Programming Considerations
II. Applications
4. General Application Issues
5. Parallel Computing in CFD
6. Parallel Computing in Environment and Energy
7. Parallel Computational Chemistry
8. Application Overviews
III. Software Technologies
9. Software Technologies
10. Message Passing and Threads
11. Parallel I/O
12. Languages and Compilers
13. Parallel Object-Oriented Libraries
14. Problem-Solving Environments
15. Tools for Performance Tuning and Debugging
16. The 2-D Poisson Problem
IV. Enabling Technologies and Algorithms
17. Reusable Software and Algorithms
18. Graph Partitioning for Scientific Simulations
19. Mesh Generation
20. Templates and Numerical Linear Algebra
21. Software for the Scalable Solutions of PDEs
22. Parallel Continuous Optimization
23. Path Following in Scientific Computing
24. Automatic Differentiation
V. Conclusion
25. Wrap-up and Features
Citations (412)
DRMAA
Directory service
Data grid
Grid file
Citations (0)
The Grid approach allows collaborative pooling of distributed resources across multiple domains. However, the benefits of the Grid are limited to those offered by the commodity application development framework used. Several elegant and flexible application development frameworks support only specific Grid architectures, thereby not allowing the applications to exploit the full potential of the Grid. In order to initiate community interest to standardize a high-level abstraction layer for different Grid architectures, we introduce a collection of abstractions and data structures that collectively build a basis for an open Grid computing environment.
Pooling
Abstraction
DRMAA
Abstraction layer
Citations (31)
In Grid environments, accessing a set of heterogeneous data resources in a uniform way is a well-studied subject. The two most important problems are (1) how to hide the different access methods of different data resources and (2) how to map real data to virtual data that can be used by various applications. In this paper, a Data Grid virtual access and integration tool, the virtual Data Center, is designed to meet these challenges. It addresses new paradigms and constraints that deeply affect data access. The paper also proposes virtual data access options and an automated method for integrating and accessing heterogeneous data resources.
Data access
Data grid
Data center
Citations (3)
Computer cluster
Grid system
Citations (47)
Multi-core processors are emerging as a new industry trend as single-core processors rapidly reach the physical limits of possible complexity and speed. In the new Top500 supercomputer list, more than 20% of processors belong to the multi-core processor family. However, without an in-depth study of application behaviors and trends on multi-core clusters, we might not be able to understand the characteristics of multi-core clusters in a comprehensive manner and hence might not be able to get optimal performance. In this paper, we take on these challenges and design a set of experiments to study the impact of multi-core architecture on cluster computing. We use one of the most advanced multi-core servers, the Intel Bensley system with Woodcrest processors, as our evaluation platform, and use benchmarks including HPL, NAMD, and NAS as the applications to study. From our message distribution experiments, we find that on average about 50% of messages are transferred through intra-node communication, which is much higher than intuition suggests. This trend indicates that optimizing intra-node communication is as important as optimizing inter-node communication in a multi-core cluster. We also observe that cache and memory contention may be a potential bottleneck in multi-core clusters, and that communication middleware and applications should be multi-core aware to alleviate this problem. We demonstrate that a multi-core aware algorithm, e.g. data tiling, improves benchmark execution time by up to 70%. We also compare the scalability of a multi-core cluster with that of a single-core cluster and find that the scalability of the multi-core cluster is promising.
Multi-core processor
GPU cluster
Computer cluster
Benchmark (surveying)
Many core
Citations (175)