Triangular clustering in document networks

2009 
Document networks have the characteristic that a document node, e.g. a webpage or an article, carries meaningful content. Properties of document networks are affected not only by the topological connectivity between nodes, but also strongly by the semantic relation between the content of the nodes. We observed that document networks have a large number of triangles and a high clustering coefficient. There is also a strong correlation between the probability that a triangle forms and the content similarity among the three nodes involved. We propose the degree-similarity product (DSP) model, which reproduces these properties well. The model achieves this by using a preferential attachment mechanism that favours links between nodes that are both popular and similar. This work is a step towards a better understanding of the structure and evolution of document networks.
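The attachment mechanism described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the exact functional form, so the code assumes that a new node links to an existing node j with probability proportional to the plain product of j's degree and the content similarity between the two nodes. The Jaccard word-overlap similarity, the parameter `m` (links per new node), the fully connected seed graph, and all function names are hypothetical choices made for illustration. The helpers `count_triangles` and `average_clustering` use the standard definitions of triangle count and local clustering coefficient.

```python
import random
from itertools import combinations


def jaccard(a, b):
    """Assumed similarity measure: word-set overlap between two texts."""
    sa, sb = set(a.split()), set(b.split())
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def grow_dsp_network(contents, similarity=jaccard, m=2, seed=0):
    """Grow an undirected network, one node per document in `contents`.

    Assumption: attachment weight of existing node j is degree(j) * similarity.
    """
    rng = random.Random(seed)
    # Hypothetical seed graph: the first m + 1 documents, fully connected.
    adj = {i: set() for i in range(m + 1)}
    for a, b in combinations(range(m + 1), 2):
        adj[a].add(b)
        adj[b].add(a)
    for new in range(m + 1, len(contents)):
        weights = {
            j: len(adj[j]) * similarity(contents[new], contents[j]) for j in adj
        }
        targets = set()
        while len(targets) < m:
            remaining = [j for j in adj if j not in targets]
            w = [weights[j] for j in remaining]
            if sum(w) > 0:
                targets.add(rng.choices(remaining, weights=w, k=1)[0])
            else:
                # No similar candidate is left: fall back to a uniform choice.
                targets.add(rng.choice(remaining))
        adj[new] = set()
        for t in targets:
            adj[new].add(t)
            adj[t].add(new)
    return adj


def count_triangles(adj):
    """Number of triangles; each one is seen once from each of its 3 corners."""
    seen = 0
    for u in adj:
        for v, w in combinations(adj[u], 2):
            if w in adj[v]:
                seen += 1
    return seen // 3


def average_clustering(adj):
    """Mean local clustering coefficient (0 for nodes of degree below 2)."""
    coeffs = []
    for u, nbrs in adj.items():
        k = len(nbrs)
        if k < 2:
            coeffs.append(0.0)
            continue
        links = sum(1 for v, w in combinations(nbrs, 2) if w in adj[v])
        coeffs.append(2 * links / (k * (k - 1)))
    return sum(coeffs) / len(coeffs)


if __name__ == "__main__":
    docs = [
        "network triangle clustering",
        "network degree clustering",
        "triangle degree similarity",
        "network similarity content",
        "cooking pasta recipe",
        "clustering content similarity",
    ]
    g = grow_dsp_network(docs, m=2, seed=1)
    print("triangles:", count_triangles(g))
    print("average clustering:", round(average_clustering(g), 3))
```

Because the weight is a product, a node needs both a high degree and similar content to be a likely target; a popular but unrelated document gets a weight near zero, as does a similar but poorly connected one.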