An Algebraic Approach to Rule-Based Information Extraction
2008
Traditional approaches to rule-based information extraction (IE) have primarily been based on regular expression grammars. However, these grammar-based systems have difficulty scaling to large datasets and large numbers of rules. Inspired by traditional database research, we propose an algebraic approach to rule-based IE that addresses these scalability issues through query optimization. The operators of our algebra are motivated by our experience in building several rule-based extraction programs over diverse datasets. We present the operators of our algebra and propose several optimization strategies motivated by the text-specific characteristics of our operators. Finally, we validate the potential benefits of our approach by extensive experiments over real-world blog data.