Ontology-based HTML to XML conversion
2005
Current wrapper approaches break down in extracting data from differently structured and frequently changing Web pages. To tackle this challenge, this paper defines domain-specific ontology, captures the semantic hierarchy in Web pages automatically by exploiting both structural information and common formatting information, and recognizes and extracts data by using ontology-based semantic matching without relying on page-specific formatting. It is adaptive to differently structured and frequently changing Web pages for a domain of interest.
Keywords:
- Correction
- Source
- Cite
- Save
- Machine Reading By IdeaReader
11
References
1
Citations
NaN
KQI