The Canonical Model of Structure for Data Extraction in Systematic Reviews of Scientific Research Articles

Muhammad Bello Aliyu, Rahat Iqbal, Anne E. James

2018

Systematic reviewing is a time-consuming, error-prone and labour-intensive activity because of the manual processes involved, with data extraction being an extremely difficult and cognitively demanding step. Automation can save a significant amount of time and reduce the workload. However, there is no unified approach to automatic data extraction in systematic reviews. This paper presents a canonical model of the structure of papers that serves as a unified approach and as a foundation for the subsequent automatic extraction of information from scientific research articles. The model was developed by applying text mining and natural language processing techniques to one thousand (1,000) published research papers. A novel approach was used to identify the section headings in the papers; it achieved an accuracy of 82%. A statistical analysis of the most frequent words and phrases in the section headings was then used to build the canonical model of the structure of the papers.
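As a rough illustration of the last step, the sketch below counts how often each word occurs in section headings across a set of papers and keeps the words that appear in most of them. It is not the authors' implementation: the sample headings, the stop-word list, the numbering pattern, the function names and the `min_share` threshold are all assumptions made for this example, and the headings are taken as already extracted.

```python
import re
from collections import Counter

# Hypothetical input: section headings already extracted from three papers.
SAMPLE_HEADINGS = [
    ["1. Introduction", "2. Related Work", "3. Methodology",
     "4. Results", "5. Conclusion"],
    ["Introduction", "Background", "Methods",
     "Results and Discussion", "Conclusions"],
    ["I. INTRODUCTION", "II. RELATED WORK", "III. METHOD",
     "IV. EXPERIMENTAL RESULTS", "V. CONCLUSION"],
]

# Assumed stop-word list; a real analysis would use a fuller one.
STOP_WORDS = {"and", "of", "the", "for", "in", "on"}

# Leading numbering such as "1.", "2.1" or "IV." (assumed heading styles).
_NUMBERING = re.compile(r"^\s*(?:\d+(?:\.\d+)*\.?|[IVXLC]+\.)\s+")


def normalise(heading):
    """Strip leading numbering and lowercase the heading text."""
    return _NUMBERING.sub("", heading).strip().lower()


def heading_term_frequencies(papers):
    """Count, for each word, the number of papers whose headings contain it."""
    counts = Counter()
    for headings in papers:
        terms = set()
        for heading in headings:
            words = re.findall(r"[a-z]+", normalise(heading))
            terms.update(w for w in words if w not in STOP_WORDS)
        counts.update(terms)  # each paper contributes at most 1 per word
    return counts


def canonical_sections(papers, min_share=0.6):
    """Return the words found in at least `min_share` of the papers."""
    counts = heading_term_frequencies(papers)
    total = len(papers)
    return sorted(t for t, c in counts.items() if c / total >= min_share)


if __name__ == "__main__":
    print(canonical_sections(SAMPLE_HEADINGS))
```

The sketch does no stemming, so "method", "methods" and "methodology" are counted as separate words; merging such variants would be a natural refinement before drawing any conclusions from the counts.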
