Measures for topical cohesion of user communities on Twitter

2017 
Online Social Networks (OSN) are now commonly used by groups of users to communicate: members of a family, colleagues, fans of a brand, political groups. Demand for a precise identification of these groups is growing in brand monitoring, business intelligence and e-reputation management. However, a gap can be observed between the communities detected by many data analytics algorithms on OSN and the actual groups that exist in real life: the detected communities often lack meaning and internal semantic cohesion. Most of the existing literature on OSN either focuses on the community detection problem in graphs without considering the topic of the messages exchanged, or concentrates exclusively on the messages without taking the social links into account. In this article, we support the hypothesis that communities extracted from OSN should be topically coherent. We therefore propose a model to represent the interactions between users on Twitter, the reference micro-blogging OSN, and metrics to evaluate the topical cohesion of the detected communities. As an evaluation, we measure the topical cohesion of the groups of users detected by a baseline community detection algorithm, using two measures inspired by the classification domain and one measure inspired by the NLP domain. A detailed analysis is performed on a large tweet dataset, from which a user graph is built. The introduced measures are compared with statistics to better picture the experiment, and yield insights into a social and textual corpus.