Unsupervised Discovery Of Semantically Aware Communities With Tensor Kruskal Decomposition: A Case Study In Twitter

2020 
Substantial empirical evidence, including the success of synthetic graph generation models as well as of analytical methodologies, suggests that large, real graphs have a recursive community structure. This structure accounts, at least in part, for other important properties of these graphs such as low diameter, high clustering coefficient values, a heavy-tailed degree distribution, and a clustered graph spectrum. Notice that this structure need not be official or moderated like Facebook groups; it can also take an ad hoc and unofficial form depending on the functionality of the social network under study, such as the follow relationship on Twitter or the connections between news aggregators on Reddit. Community discovery is paramount in numerous applications such as political campaigns, digital marketing, crowdfunding, and fact checking. Here a tensor representation for Twitter subgraphs is proposed which takes into consideration both the follow-follower relationships and the coherency of hashtags. Community structure discovery then reduces to the computation of the Tucker tensor decomposition, a higher order counterpart of the well-known unsupervised learning method of singular value decomposition (SVD). In experiments done in Julia on a Twitter subgraph, Tucker decomposition clearly outperforms the SVD in terms of finding a more compact community size distribution. This can be attributed to two facts: the proposed methodology combines both structural and functional Twitter elements, and hashtags carry an increased semantic weight in comparison to ordinary tweets.
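
As a rough illustration of the idea summarized above, the following sketch builds a small user-by-user-by-hashtag tensor and computes a Tucker decomposition of it through the truncated higher-order SVD (HOSVD), which is one standard way of obtaining Tucker factors and not necessarily the algorithm used in the paper. The tensor construction rule, the toy data, and the helper names (`unfold`, `mode_product`, `hosvd`, `reconstruct`) are all assumptions made for this example. The reported experiments were done in Julia; Python with NumPy is used here only for illustration.

```python
import numpy as np


def unfold(tensor, mode):
    """Mode-n unfolding: rows follow `mode`, columns the remaining modes."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)


def mode_product(tensor, matrix, mode):
    """n-mode product: multiply `tensor` by `matrix` along axis `mode`."""
    moved = np.moveaxis(tensor, mode, 0)
    result = np.tensordot(matrix, moved, axes=(1, 0))
    return np.moveaxis(result, 0, mode)


def hosvd(tensor, ranks):
    """Truncated HOSVD, returning a Tucker core and one factor per mode."""
    factors = []
    for mode, rank in enumerate(ranks):
        # Leading left singular vectors of each unfolding span that mode.
        u, _, _ = np.linalg.svd(unfold(tensor, mode), full_matrices=False)
        factors.append(u[:, :rank])
    core = tensor
    for mode, u in enumerate(factors):
        core = mode_product(core, u.T, mode)
    return core, factors


def reconstruct(core, factors):
    """Multiply the core by every factor to approximate the tensor."""
    out = core
    for mode, u in enumerate(factors):
        out = mode_product(out, u, mode)
    return out


# Hypothetical toy data: 6 users, 2 hashtags, two groups of 3 users each.
# Assumed rule: entry (i, j, k) is 1 when user i follows user j and both
# use hashtag k. The paper's actual construction may differ.
n_users, n_tags = 6, 2
tensor = np.zeros((n_users, n_users, n_tags))
for tag, members in enumerate([(0, 1, 2), (3, 4, 5)]):
    for i in members:
        for j in members:
            if i != j:
                tensor[i, j, tag] = 1.0

# Full multilinear ranks give an exact Tucker representation.
core, factors = hosvd(tensor, ranks=(6, 6, 2))
approx = reconstruct(core, factors)

# Smaller ranks give a compressed core; rows of the user-mode factors can
# then be fed to a clustering step to propose communities.
small_core, small_factors = hosvd(tensor, ranks=(2, 2, 2))
```

With full ranks the reconstruction matches the tensor up to floating point error. With smaller ranks the core shrinks to the requested size and the factor matrices keep orthonormal columns, which is the sense in which Tucker decomposition generalizes the SVD to more than two modes.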