Data sparsity: a key disadvantage of user-based collaborative filtering?

2012 
Traditionally, data sparsity is seen as a key disadvantage of user-based collaborative filtering (CF). It is often assumed that data sparsity leaves two users with only a small number of co-rated items, or none at all, which makes similarity information unreliable or unavailable and in turn degrades recommendation quality. However, this line of reasoning is rarely verified experimentally. To analyze it in detail, we examine the effects of data sparsity on user-based CF in three steps. First, the relationship between data sparsity and the number of co-rated items is investigated. Second, the characteristics of that number are explored. Third, the effect of that number on recommendation quality is evaluated. Experimental results show that: a) as data sparsity increases, the number of co-rated items does not drop, and b) recommendation quality does not drop as the number of co-rated items decreases. These results show that the traditional analysis of the effects of data sparsity is problematic. We hope that this new conclusion about the effects of data sparsity can inform the design of CF algorithms.
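
To make the two quantities in the abstract concrete, the following is a minimal sketch of how data sparsity and the number of co-rated items between user pairs might be computed. It is not the authors' experimental code: the function names `sparsity` and `co_rated_counts`, the nested-dictionary data layout, and the toy ratings are all assumptions made for illustration.

```python
from itertools import combinations


def sparsity(ratings, n_items):
    """Fraction of cells in the user-item matrix that have no rating."""
    n_rated = sum(len(items) for items in ratings.values())
    return 1.0 - n_rated / (len(ratings) * n_items)


def co_rated_counts(ratings):
    """Map each unordered user pair to the number of items both users rated."""
    return {
        (u, v): len(ratings[u].keys() & ratings[v].keys())
        for u, v in combinations(sorted(ratings), 2)
    }


# Hypothetical toy data: user -> {item: rating}, with 5 items in the catalogue.
ratings = {
    "alice": {"i1": 5, "i2": 3, "i3": 4},
    "bob": {"i2": 4, "i3": 2, "i4": 5},
    "carol": {"i5": 1},
}

print(sparsity(ratings, n_items=5))
print(co_rated_counts(ratings))
```

In this toy data, alice and bob share two co-rated items, while carol shares none with either user, so a user-based similarity measure would have nothing to compare for her. Tracking both quantities over datasets of varying density is one way to test the assumed link between them.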