Creating a focused corpus of factual outcomes from biomedical experiments
2011
The results of an experiment are often described in a series of textual statements, the most concise of which is the title of the article. Here we implemented a novel approach, using standard data mining techniques, to collect a set of concise 'factual' statements about a research area. We compare two standard text classification approaches for distinguishing 'factual' from 'non-factual' article titles: the first uses a statistical language-modelling approach, and the second a more sophisticated semantic and grammatical approach. We find that the simple approach classifies titles more accurately, achieving 92% overall accuracy compared to 90% for the complex approach. We also implement a strategy to convert the phrasal dependencies in a 'factual' title into subject-predicate-object structures (triples). These triples can then be organised according to a schema provided by domain ontologies; this is done by mapping URIs to entities found in the textual labels.
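To make the statistical language-modelling idea concrete, the following is a minimal sketch and not the method used in the paper. It assumes one unigram model per class with add-one smoothing, and labels a title by whichever model gives it the higher log-probability. The class and function names (`UnigramModel`, `classify`) and the toy titles are invented for illustration; the abstract does not specify the model order, the smoothing scheme, or the training data.

```python
import math
from collections import Counter


def tokenize(title):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return title.lower().split()


class UnigramModel:
    """Unigram language model over titles (hypothetical helper, not from the paper)."""

    def __init__(self, titles):
        self.counts = Counter(tok for t in titles for tok in tokenize(t))
        self.total = sum(self.counts.values())

    def log_prob(self, title, vocab_size):
        # Add-one (Laplace) smoothing so unseen tokens get non-zero probability.
        return sum(
            math.log((self.counts[tok] + 1) / (self.total + vocab_size))
            for tok in tokenize(title)
        )


def classify(title, factual_model, non_factual_model, vocab_size):
    """Return the label whose model assigns the title the higher log-probability."""
    factual_score = factual_model.log_prob(title, vocab_size)
    non_factual_score = non_factual_model.log_prob(title, vocab_size)
    return "factual" if factual_score >= non_factual_score else "non-factual"


# Toy training titles, invented purely for illustration.
factual_titles = [
    "Protein A inhibits kinase B in yeast",
    "Gene X regulates apoptosis in neurons",
]
non_factual_titles = [
    "A review of kinase signalling",
    "Methods for studying apoptosis",
]

factual_model = UnigramModel(factual_titles)
non_factual_model = UnigramModel(non_factual_titles)
vocab_size = len(set(factual_model.counts) | set(non_factual_model.counts))

print(classify("Gene Y inhibits kinase B in mice",
               factual_model, non_factual_model, vocab_size))
print(classify("A review of methods",
               factual_model, non_factual_model, vocab_size))
```

A real system would train on a labelled corpus of titles and would probably use higher-order n-grams. The sketch only shows the decision rule: score the title under each class model and keep the more likely class.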