Annotating a corpus of clinical text records for learning to recognize symptoms automatically

Rob Koeling, John A. Carroll, A Rosemary Tate, Amanda Nicholson

2011

We report on a research effort to create a corpus of clinical free text records enriched with annotation for symptoms of a particular disease (ovarian cancer). We describe the original data, the annotation procedure and the resulting corpus. The data (approximately 192K words) was annotated by three clinicians and a procedure was devised to resolve disagreements. We are using the corpus to investigate the amount of symptom-related information in clinical records that is not coded, and to develop techniques for recognizing these symptoms automatically in unseen text.

Keywords:

Information retrieval
Annotation
Data mining
Computer science
Original data
Artificial intelligence
Natural language processing
Text messaging
Clinical record
