Progressive Clustering for Database Distribution on a Grid

2005 
The increasing availability of clusters and grids of workstations provides cheap and powerful resources for distributed data mining. Exploiting these resources requires new algorithms adapted to this kind of environment, in particular with respect to how data is fragmented and how that fragmentation is used. An "intelligent" distribution of data is required and can be obtained by clustering. Most existing parallel clustering methods were developed for shared-memory supercomputers and hence cannot be used on a grid. This paper presents a new clustering algorithm, called progressive clustering, which performs clustering in an efficient, incremental, and distributed way. The data clusters resulting from this algorithm can subsequently be used in distributed data mining tasks.