Creating a focused corpus of factual outcomes from biomedical experiments

James Eales, George Demetriou, Robert Stevens

2011

The results of an experiment are often described in a series of textual statements, the most concise of which is the title of the article. Here we implement a novel approach, using standard data mining techniques, to collect a set of concise 'factual' statements about a research area. We compare two standard text classification approaches for separating 'factual' from 'non-factual' article titles: the first uses a statistical language-modelling approach, and the second a more sophisticated semantic and grammatical approach. We find that the simple approach classifies titles more accurately, achieving 92% overall accuracy compared to 90% for the complex approach. We also implement a strategy to convert the phrasal dependencies in a 'factual' title into subject-predicate-object structures (triples). These triples can then be organised according to a schema provided by domain ontologies, which is done by mapping URIs to entities found in the textual labels.
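
The abstract does not say how the statistical language-modelling classifier was built, so the following is only a minimal sketch of one common way to do it: train one add-one-smoothed bigram model per class and assign a title to the class whose model gives it the higher log-probability. The class name `BigramModel`, the function `classify`, and the six toy titles are all hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def tokenize(title):
    """Lowercase and split on whitespace, adding sentence boundary markers."""
    return ["<s>"] + title.lower().split() + ["</s>"]


class BigramModel:
    """Bigram language model with add-one (Laplace) smoothing."""

    def __init__(self, titles):
        self.context_counts = Counter()  # how often each token starts a bigram
        self.bigram_counts = Counter()   # how often each (prev, cur) pair occurs
        self.vocab = set()
        for title in titles:
            tokens = tokenize(title)
            self.vocab.update(tokens)
            for prev, cur in zip(tokens, tokens[1:]):
                self.context_counts[prev] += 1
                self.bigram_counts[(prev, cur)] += 1

    def log_prob(self, title, vocab_size):
        """Smoothed log-probability of a title under this model."""
        tokens = tokenize(title)
        total = 0.0
        for prev, cur in zip(tokens, tokens[1:]):
            numerator = self.bigram_counts[(prev, cur)] + 1
            denominator = self.context_counts[prev] + vocab_size
            total += math.log(numerator / denominator)
        return total


def classify(title, factual_lm, non_factual_lm):
    """Pick the class whose model scores the title higher (equal priors assumed;
    ties fall to 'non-factual')."""
    # Shared vocabulary size, plus one slot for unseen tokens.
    vocab_size = len(factual_lm.vocab | non_factual_lm.vocab) + 1
    factual_score = factual_lm.log_prob(title, vocab_size)
    non_factual_score = non_factual_lm.log_prob(title, vocab_size)
    return "factual" if factual_score > non_factual_score else "non-factual"


# Toy training titles, invented purely for illustration.
factual_lm = BigramModel([
    "Protein A inhibits kinase B in yeast",
    "Gene X regulates cell growth",
    "Drug Y inhibits tumour growth in mice",
])
non_factual_lm = BigramModel([
    "A study of kinase activity in yeast",
    "Analysis of gene expression in mice",
    "A review of tumour growth models",
])

if __name__ == "__main__":
    for title in ["Gene X inhibits kinase B", "A study of gene expression"]:
        print(title, "->", classify(title, factual_lm, non_factual_lm))
```

With so little training data the scores are dominated by smoothing, so the sketch only illustrates the mechanics; the accuracy figures quoted in the abstract come from the authors' own corpus and setup.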


References

Citations