Using Speech Acts to Categorize Email and Identify Email Genres

2006 
We define genres of email as well as a subset of "speech acts" relevant to email, enhanced for email-specific discourse. After creating a ground truth set of emails based on these email acts, we compare the performance of two classifiers (Random Forests and SVM-light) in identifying the primary communicative intent of an email and its corresponding genre. We experiment with feature sets derived from two verb lexicons as well as a feature set containing selected characteristics of email. Results show better classifier accuracy with the verb lexicon that has fewer classes than with the one that has more, and that using part-of-speech tagging to select only verbs causes a slight drop in performance. Using the email characteristics set alone results in better performance than either verb lexicon alone, but the best results are obtained by combining the smaller verb lexicon with the email characteristics set.
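To make the combined feature set concrete, the following is a minimal sketch of how verb-lexicon class counts might be concatenated with a few email characteristics into one feature vector. Everything in it is an assumption for illustration: the lexicon `VERB_LEXICON`, its three classes, the three characteristics, and all function names are hypothetical and are not the lexicons or features used in the paper.

```python
import re
from collections import Counter

# Hypothetical toy lexicon mapping verb classes to verbs (not from the paper).
VERB_LEXICON = {
    "request": {"send", "forward", "review"},
    "commit": {"will", "promise", "schedule"},
    "inform": {"attached", "note", "announce"},
}

def verb_class_features(text):
    """Count tokens falling in each lexicon class, in sorted class order."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for tok in tokens:
        for cls, verbs in VERB_LEXICON.items():
            if tok in verbs:
                counts[cls] += 1
    return [counts[cls] for cls in sorted(VERB_LEXICON)]

def email_characteristic_features(text):
    """Hypothetical surface characteristics: length, questions, reply marker."""
    return [
        len(text.split()),          # whitespace-delimited word count
        text.count("?"),            # number of question marks
        int("re:" in text.lower()), # 1 if a reply marker appears anywhere
    ]

def combined_features(text):
    """Concatenate lexicon counts with email characteristics."""
    return verb_class_features(text) + email_characteristic_features(text)

vector = combined_features("Please send the report. Will you review it?")
```

A vector like this could then be passed to any classifier, such as a random forest or an SVM, to predict the email act or genre.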