The Canonical Model of Structure for Data Extraction in Systematic Reviews of Scientific Research Articles
2018
Systematic reviews are time-consuming, error-prone and labour-intensive because of the manual processes involved, and data extraction is an especially difficult and cognitively demanding step. Automation can save a significant amount of time and reduce the workload. However, there is no unified approach to automatic data extraction in systematic reviews. This paper presents a canonical model of the structure of research papers, which serves as a unified approach and a foundation for the subsequent automatic extraction of information from scientific research articles. The model was developed by applying text mining and natural language processing techniques to 1,000 published research papers. A novel approach was used to identify the section headings in the papers, achieving an accuracy of 82%. A statistical analysis of the most frequent words and phrases in the section headings was then used to build the canonical model of paper structure.
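The statistical step described above, counting the most frequent words and phrases in section headings, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `normalize_heading` and `heading_frequencies`, the numbering-stripping rule, and the sample headings are all assumptions made for illustration, and the sketch does not reproduce the heading identification approach that achieved the reported accuracy.

```python
import re
from collections import Counter

# Hypothetical helper: strips leading numbering such as "1.", "3.1" or "IV."
# and lowercases the heading so that variants can be counted together.
_NUMBERING = re.compile(r"^(\d+(\.\d+)*\.?|[ivxlc]+\.)\s+")

def normalize_heading(heading: str) -> str:
    text = heading.strip().lower()
    text = _NUMBERING.sub("", text)
    # Collapse any run of whitespace into a single space.
    return re.sub(r"\s+", " ", text)

def heading_frequencies(headings):
    """Return (phrase_counts, word_counts) over normalized headings."""
    normalized = [normalize_heading(h) for h in headings]
    phrase_counts = Counter(normalized)
    word_counts = Counter(word for h in normalized for word in h.split())
    return phrase_counts, word_counts

# Illustrative input only; these headings are made up, not taken from the paper.
sample_headings = [
    "1. Introduction",
    "2 Related Work",
    "INTRODUCTION",
    "IV. Results and Discussion",
    "3.1 Results",
    "Conclusion",
]

phrases, words = heading_frequencies(sample_headings)
print(phrases.most_common(3))
print(words.most_common(3))
```

Ranking the normalized headings by frequency across a large corpus would give candidate section names for a canonical structure; how the paper groups variants such as "Conclusion" and "Conclusions" is not stated in the abstract, so that step is left out here.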