Validating candidate gene-mutation relations in MEDLINE abstracts via crowdsourcing

John D. Burger, Emily Doughty, Samuel Bayer, David Tresner-Kirsch, Ben Wellner, John S. Aberdeen, Kyungjoon Lee, Maricel G. Kann, Lynette Hirschman

2012

We describe an experiment to elicit judgments on the validity of gene-mutation relations in MEDLINE abstracts via crowdsourcing. The biomedical literature contains rich information on such relations, but the correct pairings are difficult to extract automatically because a single abstract may mention multiple genes and mutations. We presented candidate gene-mutation relations to reviewers as Amazon Mechanical Turk HITs (human intelligence tasks). We extracted candidate mutations from a corpus of 250 MEDLINE abstracts using EMU combined with curated gene lists from NCBI. The resulting document-level annotations were projected into the abstract text to highlight mentions of genes and mutations for review. Reviewers returned results within 36 hours. Initial weighted results, evaluated against a gold standard of expert-curated gene-mutation relations, achieved 85% accuracy, with the best reviewer achieving 91% accuracy. We expect performance to increase with further experimentation, providing a scalable approach for rapid manual curation of important biological relations.
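
The abstract reports "weighted results" scored against a gold standard but does not say how reviewer judgments were weighted. As a minimal sketch only, assuming each reviewer has a single reliability weight and each judgment is a yes/no vote on one candidate relation, weighted aggregation and accuracy scoring could look like the following. The function names, reviewer IDs, weights, and toy data are all hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def weighted_vote(judgments, weights):
    """Aggregate yes/no judgments per candidate by summed reviewer weight.

    judgments: iterable of (candidate_id, reviewer_id, is_valid) tuples.
    weights:   dict of reviewer_id -> non-negative weight (hypothetical values).
    Returns a dict of candidate_id -> aggregated bool.
    """
    score = defaultdict(float)
    for candidate, reviewer, is_valid in judgments:
        w = weights.get(reviewer, 1.0)  # unknown reviewers get weight 1.0
        score[candidate] += w if is_valid else -w
    # A zero score (tie) is treated as "not valid"; this is an arbitrary choice.
    return {c: s > 0 for c, s in score.items()}


def accuracy(predicted, gold):
    """Fraction of gold-standard candidates whose aggregated label matches."""
    shared = [c for c in gold if c in predicted]
    if not shared:
        return 0.0
    return sum(predicted[c] == gold[c] for c in shared) / len(shared)


# Toy data: two candidate gene-mutation pairs, three reviewers.
judgments = [
    ("pair1", "r1", True), ("pair1", "r2", False), ("pair1", "r3", True),
    ("pair2", "r1", False), ("pair2", "r2", True), ("pair2", "r3", True),
]
weights = {"r1": 0.9, "r2": 0.5, "r3": 0.3}
gold = {"pair1": True, "pair2": False}

consensus = weighted_vote(judgments, weights)
print(consensus, accuracy(consensus, gold))
```

In this toy data an unweighted majority would accept "pair2" (two of three votes), whereas the weighted vote rejects it because the single dissenting reviewer carries more weight than the other two combined.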
