Massive Data Query Optimization on Large Clusters

2012 
The growing demand for massive data processing and analysis applications has led both academia and industry to design many new types of highly scalable, data-intensive computing platforms based on large clusters in the cloud environment. Achieving fast query response times, especially for ad hoc queries, is becoming very important in large-cluster environments. In this paper, we design a series of algorithms for query optimization and present SemanQuery, an efficient massive data query and optimization mechanism. SemanQuery has two characteristics. First, it has better semantics: a semantic matching algorithm gives it some intelligence when processing massive data queries. Second, to reduce query cost, we construct a very large query network in SemanQuery and optimize it. Simulation experiments show that SemanQuery improves query efficiency on large clusters.