Empirical study on lexical sentiment in passwords from Chinese websites

2019 
Abstract Passwords are ubiquitous in authentication systems. Although much research has analyzed passwords, the statistical characteristics studied so far are limited to explicit properties such as string length and the use of numeric characters. We explore an implicit property, lexical sentiment in passwords, using Natural Language Processing techniques. First, we construct several dictionaries, including frequently used words, General Inquirer polarities, and extended Ekman sentiment words. Then, we design an algorithm to split passwords into meaningful units. Finally, we identify statistical characteristics of lexical sentiment in three large-scale password sets leaked from Internet websites. The results show that sentiment words occur with higher probability than several known patterns, such as [country] and [number][female name]. Considering sentiment polarity, we find that people tend to use positive words in passwords: the percentage of positive words is greater than that of negative words. Further analysis reveals that the joy type of sentiment is more popular in passwords than other types, such as surprise and sadness. These findings suggest that lexical sentiment, especially the positive polarity and the joy type, can be used as a component of password patterns when measuring password strength.
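The abstract does not spell out how passwords are split into meaningful units, so the following is only a minimal sketch of one plausible approach: a dictionary-driven dynamic-programming segmentation followed by a polarity lookup. The tiny lexicon, the cost values, and the names `segment` and `sentiment_units` are all illustrative assumptions, not the authors' dictionaries or algorithm.

```python
from typing import Dict, List, Tuple

# Toy lexicon (hypothetical entries; the dictionaries described in the
# abstract, e.g. General Inquirer polarities, are far larger).
POLARITY: Dict[str, str] = {
    "love": "positive",
    "happy": "positive",
    "hate": "negative",
    "sad": "negative",
}
COMMON_WORDS = {"i", "you", "me", "my"}
VOCAB = set(POLARITY) | COMMON_WORDS
MAX_WORD_LEN = max(len(w) for w in VOCAB)

WORD_COST = 1     # assumed cost of a dictionary word
UNKNOWN_COST = 2  # assumed cost of a single non-dictionary character


def segment(password: str) -> List[str]:
    """Split a password into dictionary words and runs of other characters.

    Dynamic programming picks the cheapest split, so dictionary words are
    preferred over treating their letters as unknown characters.
    """
    s = password.lower()
    n = len(s)
    cost = [0] * (n + 1)          # cost[i]: cheapest split of s[:i]
    back = [0] * (n + 1)          # back[i]: start index of the last unit
    is_word = [False] * (n + 1)   # whether that last unit is a dictionary word

    for end in range(1, n + 1):
        # Default: treat the last character as an unknown single character.
        cost[end] = cost[end - 1] + UNKNOWN_COST
        back[end] = end - 1
        is_word[end] = False
        # Try every dictionary word that ends at this position.
        for start in range(max(0, end - MAX_WORD_LEN), end):
            if s[start:end] in VOCAB and cost[start] + WORD_COST < cost[end]:
                cost[end] = cost[start] + WORD_COST
                back[end] = start
                is_word[end] = True

    # Walk the back-pointers to recover the units in order.
    units: List[Tuple[str, bool]] = []
    end = n
    while end > 0:
        start = back[end]
        units.append((s[start:end], is_word[end]))
        end = start
    units.reverse()

    # Merge adjacent unknown characters into one run, e.g. "1","2","3" -> "123".
    merged: List[str] = []
    prev_unknown = False
    for text, word in units:
        if not word and prev_unknown:
            merged[-1] += text
        else:
            merged.append(text)
        prev_unknown = not word
    return merged


def sentiment_units(password: str) -> List[Tuple[str, str]]:
    """Return (unit, polarity) pairs for units found in the polarity lexicon."""
    return [(u, POLARITY[u]) for u in segment(password) if u in POLARITY]
```

Under these assumptions, `segment("iloveyou123")` yields the units `i`, `love`, `you`, and `123`, and `sentiment_units` reports `love` as positive. Counting such tags over a whole password set would give the kind of polarity percentages the abstract reports.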