Semantic representation of scientific literature: bringing claims, contributions and named entities onto the Linked Open Data cloud
2015
Processing (NLP) for Rhetorical Entity (RE) detection; (ii) Named Entity (NE) recognition based on the Linked Open Data (LOD) cloud; and (iii) automatic knowledge base construction for both NEs and REs using semantic web ontologies that interconnect entities in documents with the machine-readable LOD cloud.

Results. We present a complete workflow to transform scientific literature into a semantic knowledge base, based on the W3C standards RDF and RDFS. A text mining pipeline, implemented based on the GATE framework, automatically extracts rhetorical entities of type Claims and Contributions from full-text scientific literature. These REs are further enriched with named entities, represented as URIs to the linked open data cloud, by integrating the DBpedia Spotlight tool into our workflow. Text mining results are stored in a knowledge base through a flexible export process that provides for a dynamic mapping of semantic annotations to LOD vocabularies through rules stored in the knowledge base. We created a gold standard corpus from computer science conference proceedings and journal articles, where Claim and Contribution sentences are manually annotated with their respective types using LOD URIs. The performance of the RE detection phase is evaluated against this corpus, where it achieves an average F-measure of 0.73. We further demonstrate a number of semantic queries that show how the generated knowledge base can provide support for numerous use cases in managing scientific literature.

Availability. All software presented in this paper is available under open source licenses at http://www.semanticsoftware.info/semantic-scientific-literature-peerj-20... [19]. Development releases of individual components are additionally available on our GitHub page at https://github.com/SemanticSoftwareLab [20].

URL https://peerj.com/articles/cs-37/ [21]
DOI 10.7717/peerj-cs.37 [22]
Copyright © 2015 Sateli and Witte. Distributed under Creative Commons CC-BY 4.0.
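The export step described in the abstract (rhetorical entities enriched with named-entity URIs, stored as triples, then queried) can be illustrated with a minimal sketch. This is not the authors' pipeline, which is built on GATE and DBpedia Spotlight. The `ex:` vocabulary, the document namespace, the helper names, and the second example sentence are all hypothetical and chosen only for illustration; the paper's actual LOD vocabularies and mapping rules are not reproduced here.

```python
# Minimal sketch: rhetorical entities (Claim / Contribution) enriched with
# named-entity URIs, stored as RDF-style triples and queried by pattern matching.
# ASSUMPTION: EX and DOC are hypothetical namespaces, not the paper's vocabularies.

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
EX = "http://example.org/vocab#"        # hypothetical vocabulary
DOC = "http://example.org/doc/1#"       # hypothetical document namespace
DBPEDIA = "http://dbpedia.org/resource/"


def export_sentence(sentence_id, rhetorical_type, text, entity_uris):
    """Turn one annotated sentence into (subject, predicate, object) triples."""
    subject = DOC + sentence_id
    triples = [
        (subject, RDF_TYPE, EX + rhetorical_type),
        (subject, EX + "hasText", text),
    ]
    # One triple per named entity linked to the sentence.
    for uri in entity_uris:
        triples.append((subject, EX + "mentions", uri))
    return triples


def match(triples, s=None, p=None, o=None):
    """Return triples matching a pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def sentences_of_type_mentioning(triples, rhetorical_type, entity_uri):
    """Find sentences of a rhetorical type that mention a given entity URI."""
    typed = {t[0] for t in match(triples, p=RDF_TYPE, o=EX + rhetorical_type)}
    mentioning = {t[0] for t in match(triples, p=EX + "mentions", o=entity_uri)}
    return sorted(typed & mentioning)


kb = []
kb += export_sentence(
    "s1", "Contribution",
    "We present a complete workflow to transform scientific literature "
    "into a semantic knowledge base.",
    [DBPEDIA + "Resource_Description_Framework"],
)
kb += export_sentence(
    "s2", "Claim",
    "An invented claim sentence, used only for this example.",
    [DBPEDIA + "Text_mining"],
)

hits = sentences_of_type_mentioning(
    kb, "Contribution", DBPEDIA + "Resource_Description_Framework"
)
print(hits)
```

In a real deployment, the triples would presumably be serialized to an RDF store and queried with SPARQL instead of the in-memory `match` helper used here.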
History. Submitted 4 August 2015. Accepted 13 November 2015. Published 9 December 2015.

Acknowledgments. This work was partially funded by an NSERC Discovery Grant. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Attachment: peerj-cs-37.pdf [23] (8.69 MB)

Source URL (retrieved on 2016-07-30 23:58): http://www.semanticsoftware.info/biblio/semantic-representation-scientific-literature-peerj-compsci-2015

Links:
[1] http://www.semanticsoftware.info/users/bahar
[2] http://www.semanticsoftware.info/taxonomy/term/418
[3] http://www.semanticsoftware.info/taxonomy/term/391
[4] http://www.semanticsoftware.info/category/blog-tags/natural-language-processing
[5] http://www.semanticsoftware.info/taxonomy/term/419
[6] http://www.semanticsoftware.info/taxonomy/term/390
[7] http://www.semanticsoftware.info/category/blog-tags/semantic-publishing
[8] http://www.semanticsoftware.info/category/blog-tags/semantic-web
[9] http://www.semanticsoftware.info/category/topic/semantic-web
[10] http://www.semanticsoftware.info/category/topic/semantic-computing
[11] http://www.semanticsoftware.info/category/topic/nlp
[12] http://www.semanticsoftware.info/category/topic/text-mining
[13] http://www.semanticsoftware.info/biblio/author/73
[14] http://www.semanticsoftware.info/biblio/author/1
[15] http://www.semanticsoftware.info/biblio/author/161
[16] http://www.semanticsoftware.info/biblio/keyword/16
[17] http://www.semanticsoftware.info/biblio/keyword/104
[18] http://www.semanticsoftware.info/biblio/keyword/2
[19] http://www.semanticsoftware.info/semantic-scientific-literature-peerj-2015-supplements
[20] https://github.com/SemanticSoftwareLab
[21] https://peerj.com/articles/cs-37/
[22] http://dx.doi.org/10.7717/peerj-cs.37
[23] http://www.semanticsoftware.info/system/files/peerj-cs-37.pdf