Scalable Collaborative Filtering Based on Splitting-Merging Clustering Algorithm

2018 
Recommender systems apply information filtering technologies to identify a set of items that could be of interest to a user. Collaborative filtering (CF) is one of the best-known and most successful filtering techniques in recommender systems and has been widely applied. However, the usual CF techniques face issues that limit their application, especially when dealing with highly sparse and large-scale data. For instance, CF algorithms using the k-Nearest Neighbor approach are very effective at filtering interesting items for users, but at the same time they are computationally expensive, with a cost that grows non-linearly with the number of users and items in a database. To address these scalability issues, some researchers propose using clustering methods. K-means is among the well-known clustering algorithms, but it has the shortcoming of depending on the number of clusters and on the initial centroids, which leads to inaccurate recommendations and increases computation time. In this paper, we show, by comparison with K-means based approaches, how a clustering algorithm called K-means+ that considers the statistical nature of the data can improve recommendation performance with reasonable computation time. The results show that predictions of substantially better quality are obtained with the proposed K-means+ method. These results also provide significant evidence that the proposed Splitting-Merging clustering based CF is more scalable than the conventional one.