Mining unstructured content for recommender systems: an ensemble approach

2016 
Recommendation of textual documents requires indexing mechanisms to extract structured metadata for attribute-aware recommender systems. Applying a variety of text mining algorithms has the advantage of capturing different aspects of unstructured content, resulting in richer descriptions. However, it is difficult to integrate them into a unique model so that these descriptions can efficiently improve recommendation accuracy. This article proposes a generic model based on ensemble learning that combines simple text mining methods in a post-processing approach. After executing each text mining technique, each set of metadata of a particular type is applied to the recommender module, which generates attribute-specific rankings. Then, the resulting recommendations are ensembled to generate a final personalized ranking to the user. We evaluated our ensemble technique with two attribute-aware collaborative recommenders (k-Nearest Neighbors and BPR-Mapping) and we demonstrate its generality by means of comparisons among different types of ensembles. We used two datasets from different domains, the first is from the Brazilian Embrapa Agency of Technology Information website, whose documents are written in Portuguese language, and the second is the HetRec MovieLens 2k, published by the GroupLens Research Group, whose movies' storylines are written in English. The experiments show that, particularly to the k-NN recommender, better accuracy can be obtained when multiple metadata types are combined. The proposed approach is extensible and flexible to new indexing and recommendation techniques.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    38
    References
    9
    Citations
    NaN
    KQI
    []