Understanding Site-Based Inference Potential for Identifying Hidden Attributes

2013 
The popularity of social networking sites has led to the creation of massive online databases containing (potentially sensitive) personal information, portions of which are often publicly accessible. Although most popular social networking sites allow users to customize the degree to which their information is publicly exposed, the disclosure of even a small, seemingly innocuous set of profile attributes may be sufficient to infer a surprisingly revealing set of attribute-value pairings. This paper analyzes the predictive accuracy of existing and ensemble inference algorithms to infer hidden attributes using publicly exposed attribute-values. For our tested population, we find that (i) certain attributes are more accurately predicted than others, (ii) each tested inference algorithm is well-suited for inferring a particular subset of attributes, and (iii) these subsets of inferable attributes often have little overlap. Taken collectively, our results indicate that the amount of information one can extract from a given user's public profile is often greater than the sum of the attributes that the user has chosen to publish.
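Findings (ii) and (iii) can be illustrated with a minimal sketch. It is not the paper's method: the algorithm names, attribute names, accuracy values, and the 0.7 threshold below are all hypothetical toy values chosen for illustration. The sketch computes which attributes each algorithm predicts well, how much those sets overlap, and what an adversary could infer by combining them.

```python
from itertools import combinations


def inferable_sets(accuracy, threshold):
    """Map each algorithm to the attributes it predicts at or above threshold."""
    return {
        algo: {attr for attr, acc in scores.items() if acc >= threshold}
        for algo, scores in accuracy.items()
    }


def jaccard(a, b):
    """Overlap of two sets: |a & b| / |a | b|, defined as 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical toy accuracies (NOT results from the paper).
TOY_ACCURACY = {
    "algo_a": {"gender": 0.90, "hometown": 0.40, "employer": 0.30},
    "algo_b": {"gender": 0.50, "hometown": 0.80, "employer": 0.20},
    "algo_c": {"gender": 0.40, "hometown": 0.30, "employer": 0.75},
}

per_algo = inferable_sets(TOY_ACCURACY, threshold=0.7)

# Attributes exposed when an adversary may pick the best algorithm per attribute.
combined = set().union(*per_algo.values())

# Pairwise overlap between the algorithms' inferable subsets.
overlaps = {
    (x, y): jaccard(per_algo[x], per_algo[y])
    for x, y in combinations(per_algo, 2)
}
```

In this toy setting each algorithm infers only one attribute and the subsets are disjoint, so the combined set is three times larger than what any single algorithm recovers. That is the sense in which low overlap increases total exposure.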