Using automatic keyword extraction to detect off-topic posts in online discussion boards

2009 
Online discussion boards represent a rich repository of knowledge organized as a collection of user-generated content. These conversational cyberspaces allow users to express opinions and ideas and to pose questions and answers without strict limitations on content. This freedom, in turn, creates an environment in which discussions are unbounded and often stray from the initial topic. In this paper we focus on approaches for assessing the relevance of posts to a thread and detecting when discussions have been steered off-topic. A set of metrics estimating the level of novelty in online discussion posts is presented. These metrics are based on topical estimation and contextual similarity between posts within a given thread. The metrics are aggregated to rank posts by the degree of relevance they maintain. The aggregation scheme is data-dependent and is normalized relative to post length.
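
The abstract does not specify the metrics themselves, so the following is only an illustrative sketch of the general idea, not the paper's method. It assumes a simple term-frequency keyword extractor for the topical score, cosine similarity against the rest of the thread for the contextual score, and an equal-weight average as the aggregation. The names `tokenize`, `cosine`, `rank_posts`, the stopword list, and the 0.5 weights are all hypothetical.

```python
import math
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption, not from the paper).
STOPWORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "is", "it", "for", "on", "my", "i"}


def tokenize(text):
    """Lowercase the text, keep alphabetic tokens, and drop stopwords."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def rank_posts(posts, top_k=5):
    """Return (post_index, score) pairs sorted from least to most relevant."""
    bags = [Counter(tokenize(p)) for p in posts]
    thread = sum(bags, Counter())  # term frequencies over the whole thread
    keywords = {term for term, _ in thread.most_common(top_k)}
    scores = []
    for i, bag in enumerate(bags):
        length = sum(bag.values())
        # Topical score: share of the post's tokens that are thread keywords,
        # which normalizes by post length.
        topical = sum(bag[k] for k in keywords) / length if length else 0.0
        # Contextual score: similarity to all other posts in the thread.
        contextual = cosine(bag, thread - bag)
        # Equal weights are a placeholder; the paper's scheme is data-dependent.
        scores.append((i, 0.5 * topical + 0.5 * contextual))
    return sorted(scores, key=lambda s: s[1])


posts = [
    "How do I tune the carburetor on my motorcycle engine?",
    "Adjust the carburetor idle screw while the engine is warm.",
    "The engine carburetor jets may need cleaning too.",
    "Anyone watch the football game last night?",
]
ranking = rank_posts(posts)
print(ranking[0])  # the lowest-scoring post is the most likely off-topic candidate
```

In this toy thread the football post shares no vocabulary with the other posts, so it lands at the bottom of the ranking. A real system would need a stronger keyword extractor and weights tuned on data.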