Automatic Pattern Generator of Natural Language Text Applied in Public Health
2015
A huge number of scientific articles is currently available, covering a wide variety of topics such as medicine, technology, economics, and finance. Scientific papers report results of scientific interest and also present the evaluation and interpretation of relevant arguments. Because these papers are produced at a high rate, it is feasible to analyze how people write in a given domain. Within natural language processing there are different approaches to analyzing large text corpora. Identifying patterns with semantic elements in a text lets us classify and examine the corpus, which facilitates the interpretation and management of information by computers. At present, a semiautomatic or automatic way to generate natural language patterns is either unavailable or quite complicated. This paper shows how a tool developed for this research is tested in a public health domain. The results obtained with the tool, aided by graphs, provide the groups of words that are used (to determine whether they come from a specific vocabulary), the most common grammatical categories, the most repeated words in the domain, the patterns found, and the frequency of those patterns. The selected public health domain contains 800 papers on different topics related to genetics, including mutations, genetic deafness, DNA, trinucleotide, and suppressor genes, among others. An ontology of public health provides the basis of the study.
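The kinds of output listed in the abstract (most repeated words, most common grammatical categories, patterns and their frequencies) can be illustrated with a minimal sketch. This is not the tool described in the paper: the toy corpus, the part-of-speech tags, the function names, and the choice to treat a "pattern" as a sequence of three consecutive tags are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical hand-tagged sentences as (word, part-of-speech tag) pairs.
# A real pipeline would obtain tags from a tagger; the abstract names none.
TAGGED_SENTENCES = [
    [("the", "DET"), ("mutation", "NOUN"), ("causes", "VERB"), ("deafness", "NOUN")],
    [("the", "DET"), ("gene", "NOUN"), ("suppresses", "VERB"), ("tumors", "NOUN")],
    [("a", "DET"), ("mutation", "NOUN"), ("alters", "VERB"), ("the", "DET"), ("gene", "NOUN")],
]


def word_frequencies(sentences):
    """Count how often each word occurs across all sentences."""
    return Counter(word for sentence in sentences for word, _ in sentence)


def tag_frequencies(sentences):
    """Count how often each grammatical category (tag) occurs."""
    return Counter(tag for sentence in sentences for _, tag in sentence)


def tag_patterns(sentences, n=3):
    """Count every run of n consecutive tags within a sentence.

    Assumption: a 'pattern' is modeled here as a tag n-gram; the paper
    may define patterns differently (for example, with semantic elements).
    """
    patterns = Counter()
    for sentence in sentences:
        tags = [tag for _, tag in sentence]
        for i in range(len(tags) - n + 1):
            patterns[tuple(tags[i:i + n])] += 1
    return patterns


if __name__ == "__main__":
    print(word_frequencies(TAGGED_SENTENCES).most_common(3))
    print(tag_frequencies(TAGGED_SENTENCES).most_common())
    print(tag_patterns(TAGGED_SENTENCES).most_common(2))
```

Counting tag sequences rather than word sequences is one simple way to surface recurring sentence structures that do not depend on the exact vocabulary; checking words against a domain ontology, as the abstract mentions, would be a separate step not shown here.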