Annotating Subsets of the Enron Email Corpus

2006 
We present an annotation project for two subsets of the Enron email corpus. The first is a subset of the UC Berkeley Enron Email Analysis Project, and the second consists of a portion of the emails from the Voice Transcripts Email Correlated Corpora. Parts of the Automatic Content Extraction (ACE) annotation guidelines, extended for the email domain, are used for annotation. We also categorize the emails with email speech acts, mark whether the text contains discussions of meetings or conversations, and determine the degree of correlation between the subject line and the text body.
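As an illustration only, the annotation layers described above could be held in one record per email. The following is a minimal sketch, not the project's actual format: the class names, field names, label strings, and the correlation scale are all hypothetical placeholders chosen for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class EntityMention:
    """One ACE-style mention, stored as character offsets into the body."""
    start: int
    end: int
    entity_type: str  # placeholder label such as "PER" or "ORG"


@dataclass
class EmailAnnotation:
    """Hypothetical container for the annotation layers of a single email."""
    message_id: str
    subset: str  # which of the two annotated subsets the email came from
    entities: List[EntityMention] = field(default_factory=list)
    speech_acts: List[str] = field(default_factory=list)  # assumed label strings
    discusses_meeting: bool = False  # meetings/conversations flag
    subject_body_correlation: str = "unknown"  # assumed scale, e.g. "low"/"high"


def summarize(annotation: EmailAnnotation) -> Dict[str, object]:
    """Collapse one annotated email into simple counts and flags."""
    return {
        "message_id": annotation.message_id,
        "n_entities": len(annotation.entities),
        "n_speech_acts": len(annotation.speech_acts),
        "discusses_meeting": annotation.discusses_meeting,
        "subject_body_correlation": annotation.subject_body_correlation,
    }


example = EmailAnnotation(
    message_id="msg-001",  # made-up identifier
    subset="subset-A",     # made-up subset name
    entities=[EntityMention(start=0, end=4, entity_type="PER")],
    speech_acts=["request"],
    discusses_meeting=True,
    subject_body_correlation="high",
)
summary = summarize(example)
```

Keeping every layer on the same record makes it straightforward to check that each email has been annotated on all four dimensions before any counts are aggregated.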