Mining Sequential Patterns on a Grid-Computing Environment
Citations: 3. References: 26. Related papers: 10.
Abstract:
This paper presents the design and implementation of a grid-computing environment for mining sequential patterns. An Apriori-like algorithm for mining sequential patterns is deployed in the proposed environment. Although Apriori-like algorithms do not perform as well as some alternatives, their loosely coupled processes make them easier to distribute across a grid-computing environment. The proposed environment comprises two types of grids, a computing grid and a data grid. Every grid is installed with the full set of functions, each of which is wrapped by the Globus Toolkit. Grid services can be invoked by users or by other grids and respond to the invoking side. Ten computers with differing hardware, distributed across two campuses, serve as grid nodes. The experimental results show that the proposed grid-computing environment provides a flexible and efficient platform for mining sequential patterns from large datasets.
Keywords:
Grid file
Distributed Computing Environment
Data grid
DRMAA
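The abstract above does not give the algorithm's details, so the following is only a minimal sketch of how a level-wise, Apriori-like sequential pattern miner can be split into loosely coupled counting tasks. It assumes each sequence is an ordered tuple of single items (not itemsets), and the partitions merely stand in for grid nodes. All function names and the toy data are hypothetical and are not taken from the paper.

```python
from collections import Counter
from itertools import product


def is_subsequence(pattern, sequence):
    """True if the pattern's items appear in the sequence in order (gaps allowed)."""
    it = iter(sequence)
    return all(item in it for item in pattern)


def count_local_support(candidates, partition):
    """Work for one hypothetical grid node: count candidates in its own data slice."""
    counts = Counter()
    for sequence in partition:
        for cand in candidates:
            if is_subsequence(cand, sequence):
                counts[cand] += 1
    return counts


def mine_sequential_patterns(partitions, min_support):
    """Level-wise (Apriori-like) mining; each partition is counted independently."""
    items = sorted({item for part in partitions for seq in part for item in seq})
    candidates = [(item,) for item in items]
    frequent = {}
    while candidates:
        # Nodes share nothing while counting; only the counts are merged.
        total = Counter()
        for part in partitions:
            total.update(count_local_support(candidates, part))
        level = {c: n for c, n in total.items() if n >= min_support}
        frequent.update(level)
        # Join step: extend a pattern with one whose prefix matches its suffix.
        candidates = sorted(
            {a + b[-1:] for a, b in product(level, repeat=2) if a[1:] == b[:-1]}
        )
    return frequent


# Hypothetical toy data: two partitions, as if held by two grid nodes.
partitions = [
    [("a", "b", "c"), ("a", "c")],
    [("a", "b", "c", "d"), ("b", "c")],
]
patterns = mine_sequential_patterns(partitions, min_support=3)
print(patterns)
```

Because each call to `count_local_support` touches only one partition, the counting step could in principle run as a separate job on each node, with a coordinator merging the counters between levels.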
Grid computing has emerged in recent years as a framework for supporting complex computations over large data sets. In general, grids enable the efficient sharing and management of computing resources for the purpose of performing large, complex tasks. In practice, grids have been defined as anything from batch schedulers to Peer-to-Peer (P2P) platforms. This chapter serves as an introduction to Grid Computing, Data Management in Grid, and the various discussion threads carried through this book. It begins by clearing up the confusion caused by the overuse of technology buzzwords. The threads introduced here and carried through the following chapters include the parallels between data management in Grid and in client/server, and the data access bottlenecks, caused by mismatches between Grid and client/server topologies, that a true Data Grid must alleviate. Grid Computing represents a fundamental paradigm shift away from client/server, just as client/server was a shift away from the mainframe/mini. As a result, data management in this highly distributed compute environment needs to meet the challenges of the Grid topology.
DRMAA
Data grid
Grid file
Confusion
Data Sharing
Citations (11)
The data grid is an emerging technology for building ultra-large-scale data storage and management in grid computing. This paper presents a new kind of data grid middleware for discovering and dynamically managing data storage resources in a grid environment. The middleware has a self-adaptive software architecture. An architecture for grid storage resource discovery and dynamic management is presented that discovers data storage resources across different computer organizational structures, different grids, and different storage media. The middleware provides the functions needed by ultra-large-scale applications in a data grid environment and could be applied to ultra-large-scale data storage management in grid computing on the next-generation Internet.
Data grid
DRMAA
Grid file
Citations (0)
The current generation of Grid infrastructures designed for production activity is strongly computing-oriented and tuned to the needs of applications that require intensive computation. Problems arise when such Grids are used to share data-oriented and service-oriented resources, as in the IVOA community. We have designed, developed, and implemented a Grid query element to access data sources from an existing production Grid environment. We have also enhanced the Grid middleware model (collective resources and sites) to manage data sources, extending the Grid semantics. The query element and the modified Grid Information System can connect the Grid environment to Virtual Observatory resources. A specialized query element is designed to work as a Virtual Observatory resource in the Grid so that an astronomer can access Virtual Observatory data using the IVOA standards.
Virtual Observatory
Grid file
Data grid
DRMAA
Data access
Citations (2)
A well-organized and easy-to-use Grid management system is very important for executing Grid applications and managing a Grid computing environment. An information system that supports the Grid management system by providing information about the Grid environment is also one of the most interesting issues in Grid middleware. Effective cooperation between the Grid management system and the information system can yield a novel Grid middleware system. In particular, a Grid management system based on a service-oriented architecture is flexible and extensible enough to provide various types of Grid services. Likewise, an information system based on a data mining process that spans domains such as users, resources, and applications can make the Grid management system more precise and efficient. In this paper, we propose a semantic Grid middleware system that combines a Grid management system with a semantic information system.
DRMAA
Data grid
USable
Grid file
Citations (0)
Grid Computing is a platform for coordinated resource sharing and problem solving on a global scale among virtual organizations. A Grid uses Grid Services to access and use a set of Grid resources. These Grid Services therefore need to be discovered, selected, and invoked quickly and efficiently to satisfy the needs of an environment as demanding as the Grid. In this paper, we show how Grid Computing can take advantage of the Semantic Web to manipulate Grid Services, obtaining better searches, results, and performance.
DRMAA
Data grid
Shared resource
Grid file
Virtual organization
Citations (1)
Developing grid applications directly against the Globus Toolkit API is difficult, and the types of resources that can be discovered are restricted. To address these problems, this paper introduces several technologies for the computing grid: the grid container, grid plug-in, grid object pool, grid resource adapter, and subgrid communication protocol. It also discusses the kernel methods and hierarchy of the computing grid distributed middleware (CGDM). Using these technologies, the middleware is implemented independently of the computing grid portal and of grid applications. A real deployment in a campus computing grid shows that the middleware makes it easy to discover new types of grid resources, speeds up grid application development, and supports interconnection between subgrids.
DRMAA
Data grid
Grid file
Citations (13)
DRMAA
Data grid
Grid file
Testbed
Citations (6)
Foreword. Preface. Acknowledgements.
PART I: AN OVERVIEW OF GRID COMPUTING.
1. What is Grid Computing? The Basics of Grid Computing. Leveling the Playing Field of Buzzword Mania. Paradigm Shift. Beyond Client/Server. New Topology.
2. Why Are Businesses Looking at Grid Computing? History Repeats Itself. Early Needs. Artists and Engineers. The Whys and Wherefores of Grid. Financial Factors. Business Drivers. Technology's Role.
3. Service-Oriented Architectures. What is Service-Oriented Architecture (SOA)? Driving Forces Behind SOA. Maturing Technology. Business. World Events. Enter Basic Supply and Demand Economics. Fundamental Shift in Computing.
4. Parallel Grid Planes. Using Art to Describe Life: Grid is the Borg. Grid Planes. Compute Grids. Data Grids. Compute and Data Grid - Parallel Planes. True Grid Must Include Data Management. Basic Data Management Requirements. Evolving the Data Grid.
PART II: DATA MANAGEMENT IN GRID COMPUTING.
5. Scaling in the Grid Topology. Evolution in Data Management. Client/Server Evolution. Grid Evolution. Different Implementations of a Data Grid. Level Zero Data Grids. FTP in Grid. Distributed Filing Systems. Faster Servers. MetaData Hubs and Distributed Data Integration. Level 1 Data Grids. Foundations. Case Study: Integrasoft Grid Fabric (IGF). Application Characteristics for Grid.
6. Traditional Data Management. Data Management. History. Features. Key for Usability.
7. Relational Data Management as a Baseline for Understanding Data Grid. Evolution of the Relational Model. Parallels to Data Management in Grid. Analysis of the Functional Tiers. Engines Determine the Type of Data Grid. Data Management Features.
8. Foundation of Comparing Data Grids. Core Engine Determines Performance and Flexibility. Replicated vs. Distributed. Centralized vs. Peer-to-Peer Synchronization. Access to the Data Grid. Support for Traditional Data Management Features. Support for Data Management Features Specific to Grid.
9. Data Regionalization. What are Data Regions? Data Regions in Traditional Terms. Data Management in a Data Grid. Data Distribution Policy. Data Distribution Policy Expression. Data Replication Policy. Data Replication Policy Expression. Synchronization Policy. Load and Store Policy. Data Load Policy Expression. Data Store Policy Expression. Event Notification Policy. Event Notification Policy Expression. Quality of Service (QoS) Levels.
10. Data Synchronization. Intra-Region Synchronization. Inter-Region Synchronization. Synchronization Architectures. Centralized Synchronization Manager. Peer-to-Peer Synchronization. Synchronization Patterns. Synchronization Granularity. Synchronization Policy Expression. Synchronization Pattern Simulations. Synchronization Policy as a Standard Interface.
11. Data Integration. Enterprise Application/Information Integration in Grid. STP, EAI, and EII. EII in Grid. Natural Separation of Process and Data. Data Load Policy. Data Store Policy. Load, Store, and Synchronization. Enterprise Data Grid Integration.
12. Data Affinity. A Measurable Quantity. What to Expect from Data Affinity. How to Achieve Data Affinity. Regionalization, Synchronization, Distribution and Data Affinity. Data Distribution is Key to Data Affinity. Data Affinity and Task Routing. Integration of Compute and Data Grids. Examples.
PART III: PRACTICAL APPLICATIONS OF GRID COMPUTING.
13. Which Applications are Good Candidates for the Grid. Grid Enabling Application Characteristics. Grid'able Applications. Use Case Presentations.
14. Calculation Intensive Applications. Description. Use Cases. General Architecture. Data Grid Analysis.
15. Data Mining, Data Warehouses. Description. Use Cases. General Architecture. Data Grid Analysis. Benefits and Data Grid Specifics.
16. Geographic Boundary Problems. Description. Business Use Cases. General Architecture. Data Grid Analysis. Benefits and Data Grid Specifics.
17. Command and Control. Problem Description. Solution Architecture. Data Grid Analysis. Application Spin Offs.
18. Web Services's Role in the SOA/SONA Evolution. Definition of Web Services. Description. Data Management: The Key Stone to Web Services. Web Services, Grid Infrastructures, and SONA. The Undiscovered Past. The SONA Model.
19. The Compute Utility. Overview. Architecture.
PART IV: REFERENCE MATERIAL.
20. Language Interface. Programmatic. Query Based. XML Based.
21. Basic Programming Examples. Hello World Example. Coarse Granularity. Coarse Data Atom. Writer Program. Reader Program. Fine Granularity Example. Writer Program. Reader Program. Random Number Surface Example.
22. Additional Reading. Useful Information Sources. White Papers. Grid. GridFTP. Distributed File Systems. Standards Bodies. Globus - Data Grid. Global Grid Forum. W3C. Public and University Grid Efforts. Scientific Research Use of Grid. Web Services. Distributed Computing. Compute Utility. Service Oriented Architectures. Data Affinity.
23. White Paper: Natural Attraction Forces of Data Bodies within a Data Grid to Describe Efficient Data Distribution Patterns. Introduction. Observation. Hypothesis. Laws of Attraction. How does this fit in with Data Distribution Patterns of Single Data Bodies within a Data Grid Fabric? Collision of Single Data Bodies. The Effects of the Data Grid on Single Data Body. Conclusions.
24. Glossary of Terms.
References. Index.
Data grid
Grid file
DRMAA
Citations (34)
The representation, description, organization, deployment, discovery, access, maintenance, and eventual destruction of resources are important research problems in grid computing, as they directly affect the design and implementation of grid protocols, grid languages, and grid software. In this paper, a grid is viewed as a virtual computer, and these problems are studied from a computer architecture viewpoint, using the methodology accumulated in past address space studies. A formal model of grid resource space with three layers is proposed: effective resource space, virtual resource space, and physical resource space; hence it is called the EVP model. This model is used in the design and implementation of the GSML software suite and the grid operating system of the Vega grid project. The three-layer approach could enhance the usability, modularity, transparency, and autonomy of grid systems. The EVP model could be used not only for computing grids, but also for data grids, information grids, business grids, and peer-to-peer systems.
DRMAA
Vega
Grid file
Data grid
Citations (18)
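The EVP abstract above names three layers but gives no interfaces, so the following is only an illustrative sketch of the layering idea, by analogy with address translation: a user-facing (effective) name resolves to a grid-wide (virtual) name, which in turn resolves to a concrete (physical) location. The class, method names, and example identifiers are all hypothetical and are not taken from the Vega grid project or GSML.

```python
from dataclasses import dataclass, field


@dataclass
class ResourceSpace:
    """Hypothetical three-layer lookup: effective -> virtual -> physical."""

    effective_to_virtual: dict = field(default_factory=dict)
    virtual_to_physical: dict = field(default_factory=dict)

    def bind(self, effective, virtual, physical):
        """Register a resource under all three names."""
        self.effective_to_virtual[effective] = virtual
        self.virtual_to_physical[virtual] = physical

    def relocate(self, virtual, new_physical):
        """Move a resource; only the lowest layer changes."""
        self.virtual_to_physical[virtual] = new_physical

    def resolve(self, effective):
        """Translate a user-facing name down to a physical location."""
        virtual = self.effective_to_virtual[effective]
        return self.virtual_to_physical[virtual]


space = ResourceSpace()
space.bind("my-dataset", "vres:0042", "host-a:/data/set1")
before = space.resolve("my-dataset")
space.relocate("vres:0042", "host-b:/mirror/set1")
after = space.resolve("my-dataset")
print(before, "->", after)
```

The point of the indirection in this sketch is that relocating a resource touches only the virtual-to-physical mapping, so names held by users remain valid.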
Data management is one of the most important problems in grid environments. One important challenge facing grid computing is the design of a grid file system. The Global Grid Forum defines a grid file system as a human-readable resource namespace for managing heterogeneous distributed data resources that can span multiple autonomous administrative domains. This paper evaluates Expand, a new grid file system that follows the Global Grid Forum recommendations and integrates heterogeneous data storage resources in grids using standard grid technologies: GridFTP and the OGSA ByteIO interface defined by the Open Grid Forum.
Namespace
Grid file
Data grid
DRMAA
File transfer
Citations (0)
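The grid file system abstract above describes a human-readable namespace spanning heterogeneous storage resources. As a rough illustration of that idea only, the sketch below maps logical path prefixes to backend URLs using longest-prefix matching. It is not Expand's design or API; the class name, mount points, and `example.org` hosts are hypothetical, and the `gsiftp://` scheme is used only because it is the conventional GridFTP URL scheme.

```python
class GridNamespace:
    """Hypothetical mount table mapping logical paths to storage backends."""

    def __init__(self):
        self._mounts = {}

    def mount(self, prefix, backend_url):
        """Attach a backend under a logical directory prefix."""
        self._mounts[prefix.rstrip("/") + "/"] = backend_url.rstrip("/")

    def resolve(self, path):
        """Return the backend URL for a logical path (longest prefix wins)."""
        matches = [p for p in self._mounts if path.startswith(p)]
        if not matches:
            raise KeyError(path)
        prefix = max(matches, key=len)
        return self._mounts[prefix] + "/" + path[len(prefix):]


ns = GridNamespace()
ns.mount("/grid/physics", "gsiftp://node-a.example.org/data")
ns.mount("/grid/physics/archive", "gsiftp://node-b.example.org/tape")
print(ns.resolve("/grid/physics/run1.dat"))
print(ns.resolve("/grid/physics/archive/old.dat"))
```

Longest-prefix matching lets a subtree live on a different storage resource than its parent while users still see one directory tree.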