    Lingo: Search Results Clustering Algorithm Based on Singular Value Decomposition
    284
    Citation
    12
    Reference
    10
    Related Paper
    This paper compares three kinds of clustering, identifies their cost and loss functions, and calculates them. The error rate of a clustering method, and how to calculate the error percentage, is always an important factor in evaluating clustering methods, so this paper introduces one way to calculate the error rate. Clustering algorithms can be divided into several categories, including partitioning algorithms, hierarchical algorithms, and density-based algorithms. Generally speaking, clustering algorithms should be compared by scalability, the ability to work with different attribute types, the kind of clusters formed, the minimal knowledge needed to determine the input parameters, the handling of noise and outliers, a consistent error rate when clustering new data, insensitivity to the input data, and the ability to handle high dimensionality. K-means is one of the simplest approaches to clustering, which is an unsupervised problem.
    Hierarchical clustering
    Single-linkage clustering
    Brown clustering
    Consensus clustering
    Spectral Clustering
    Citations (4)
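The abstract above says the paper introduces a way to calculate a clustering error rate but does not state the formula. As a minimal sketch only, one common convention (an assumption here, not necessarily the paper's method) maps each cluster to its majority ground-truth label and counts the remaining points as errors. The function name `clustering_error_rate` and its arguments are hypothetical.

```python
from collections import Counter, defaultdict


def clustering_error_rate(true_labels, cluster_ids):
    """Fraction of points that disagree with their cluster's majority label.

    Assumption: each cluster is mapped to the ground-truth label that
    occurs most often inside it; this is one common convention, not the
    definition used by the paper summarized above.
    """
    if len(true_labels) != len(cluster_ids):
        raise ValueError("true_labels and cluster_ids must have equal length")
    if not true_labels:
        raise ValueError("at least one point is required")

    # Group the ground-truth labels by the cluster each point was assigned to.
    members = defaultdict(list)
    for label, cluster in zip(true_labels, cluster_ids):
        members[cluster].append(label)

    # Points carrying the majority label of their cluster count as correct.
    correct = sum(
        Counter(labels).most_common(1)[0][1] for labels in members.values()
    )
    return 1 - correct / len(true_labels)


if __name__ == "__main__":
    truth = ["a", "a", "a", "b", "b", "b"]
    found = [0, 0, 1, 1, 1, 1]
    print(clustering_error_rate(truth, found))
```

In the usage example, cluster 1 holds one "a" and three "b" points, so exactly one of the six points counts as an error. Note that this measure needs ground-truth labels, so it applies only to benchmark data.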
    To deal with large-scale data clustering problems, a fast parallel K-means clustering method is presented that first samples the data randomly and then uses the max-min distance method to carry out parallel K-means clustering. The sampling-based step avoids getting trapped in local solutions, and the max-min distance step makes the initial cluster centers tend toward the optimum. Results of a large number of experiments show that the proposed method is less affected by the initial cluster centers and improves clustering precision in both stand-alone and cluster environments. It also reduces the number of clustering iterations and the clustering time.
    Data stream clustering
    Single-linkage clustering
    k-medians clustering
    Clustering high-dimensional data
    Citations (0)
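The sample-then-seed idea in the abstract above can be sketched as follows. This is an illustrative reading, assuming Euclidean distance and the usual farthest-point form of max-min seeding; the function `max_min_centers` and its parameters are hypothetical, and the paper's parallel K-means step is not shown.

```python
import math
import random


def max_min_centers(points, k, sample_size=None, seed=0):
    """Pick k initial centers by max-min distance on a random sample.

    Assumptions: points are equal-length numeric tuples, distance is
    Euclidean, and the first point of the sample is the first center.
    """
    rng = random.Random(seed)
    if sample_size is not None and sample_size < len(points):
        sample = rng.sample(list(points), sample_size)
    else:
        sample = list(points)
    if not 1 <= k <= len(sample):
        raise ValueError("k must be between 1 and the sample size")

    centers = [sample[0]]
    # nearest[i] is the distance from sample[i] to its closest chosen center.
    nearest = [math.dist(p, centers[0]) for p in sample]
    while len(centers) < k:
        # The next center is the point farthest from all chosen centers.
        idx = max(range(len(sample)), key=nearest.__getitem__)
        centers.append(sample[idx])
        nearest = [
            min(d, math.dist(p, sample[idx])) for d, p in zip(nearest, sample)
        ]
    return centers


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (10, 10), (10, 11), (20, 0), (21, 0)]
    print(max_min_centers(data, 3))
```

The returned centers would then serve as the starting point for an ordinary K-means run. Spreading the seeds apart in this way is what the abstract credits for the reduced sensitivity to initialization.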
    A selective clustering ensemble algorithm can eliminate the influence of inferior-quality clustering members and can achieve a better clustering solution than a plain clustering ensemble algorithm. For high-dimensional data clustering, this paper proposes a novel selective ensemble algorithm based on semi-supervised K-means clustering. A large number of experiments verify the validity of the proposed algorithm for high-dimensional data clustering. The new algorithm achieves a statistically significant performance improvement over other clustering algorithms.
    Data stream clustering
    Single-linkage clustering
    Clustering high-dimensional data
    The clustering aggregation problem is a formal description of the clustering ensemble problem, and techniques for solving it can be used to construct a clustering division with better clustering performance when the performance of each original clustering division is fluctuating or weak. In this paper, an approach to the clustering aggregation problem based on a genetic algorithm, named GeneticCA, is presented. To estimate the clustering performance of a clustering division, clustering precision is defined and its features are discussed. In our experiments on the clustering performance of GeneticCA for document clustering, a Hamming neural network is used to construct clustering divisions with fluctuating and weak clustering performance. Experimental results show that, with clustering precision as the criterion, the clustering division constructed by GeneticCA performs better than the original clustering divisions.
    Single-linkage clustering
    Data stream clustering
    Clustering high-dimensional data
    Citations (31)
    Clustering is the process of partitioning data objects into different groups according to some similarity or dissimilarity measure, e.g., a distance criterion. In a high-dimensional dataset, all objects are almost equidistant, so the distance criterion fails to group the objects and becomes meaningless. The literature presents numerous algorithms for clustering high-dimensional datasets, which select the relevant dimensions and cluster the objects on the selected dimensions. Because these algorithms produce different clustering results on the same dataset, it is unclear which algorithm to select for a given high-dimensional dataset. In this paper, we present a comparative study of conventional feature-selection-based clustering algorithms and propose a new feature-selection-based clustering method, IQRAM (inter quartile range and median based clustering of high dimensional dataset). We perform our experiments on two real datasets and analyse the clustering results using five well-known clustering quality measures and Student's t-test. The qualitative results show that IQRAM outperforms ten competitive clustering algorithms.
    Clustering high-dimensional data
    Single-linkage clustering
    Data stream clustering
    Consensus clustering
    Constrained clustering
    Hierarchical clustering
    The primary motivation for this study was to obtain better clustering results from the K-means clustering algorithm. Several studies have provided relevant findings on the standard clustering algorithm and on the scope for improving clustering accuracy. With the aim of increasing clustering accuracy, concepts from the genetic algorithm are applied to the K-means clustering algorithm. Based on these concepts, an improved clustering technique is proposed for obtaining more accurate and more precise clustering outcomes. The contribution of this study could be valuable for topic-modelled data and for clustering text documents.
    Data stream clustering
    Single-linkage clustering
    Clustering high-dimensional data
    Single-linkage clustering
    Constrained clustering
    Clustering high-dimensional data
    Data stream clustering
    k-medians clustering
    Single-linkage clustering
    Data stream clustering
    k-medians clustering
    Consensus clustering
    Constrained clustering
    Clustering high-dimensional data