The development and validation of the Romanian version of Linguistic Inquiry and Word Count 2015 (Ro-LIWC2015)

2020 
Today, performing automatic language analysis to extract meaning from natural language is one of the top-notch directions in social science research, but it can be challenging. Linguistic Inquiry and Word Count 2015 (LIWC2015; Pennebaker et al. 2015) is one of the most versatile, yet easy to master instruments to transform any text into data, meeting the needs of psychologists who are not usually proficient in data science. Moreover, LIWC2015 is already available in multiple languages, which opens the door to exciting intercultural quests. The current article introduces the first Romanian version of LIWC2015, Ro-LIWC2015, and thus, contributes to the line of research concerning multilingual analysis. Throughout the paper, we describe the challenges of creating the Romanian dictionary and discuss other linguistics aspects, which could be useful for new adaptations of LIWC2015. Also, we present the results of two studies for assessing the criterion validity of Ro-LIWC2015. The first study focuses on the consistency between the Romanian and the English dictionaries in analyzing a corpus of books. The second study tests whether Ro-LIWC2015 can acquire linguistic differences in contrasting corpora. For this purpose, we analyzed posts from help-seeking forums for anxiety, depression, and health issues, and leveraged supervised learning to address several classification problems. The selected algorithm allows feature ranking, which facilitates more thorough interpretations. The linguistic markers extracted with Ro-LIWC2015 mirrored a number of disorder-specific features of depression and anxiety. Given the obtained results, this research encourages the use of Ro-LIWC2015 for hypothesis testing.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    58
    References
    1
    Citations
    NaN
    KQI
    []