FAQ Extracting and Domain Filtering Based on Improved Bayes

Zhengtao Yu, Huanyun Zong, Yangbo Xu, Jianyi Guo, Yu Mao, Xiangyan Meng

2009

Collections of frequently asked questions (FAQs) are the basis of question answering (QA) systems built on FAQ databases. Because FAQs are difficult to collect and organize, this paper proposes a method for automatically acquiring domain FAQs based on an improved Bayes classifier. HTML pages are parsed into DOM trees, and the node information and structural characteristics of each DOM tree, combined with a restricted-domain knowledge base, are extracted as classification features. An improved Bayesian learning algorithm is then used to construct the classification model, which acquires FAQs from HTML pages automatically and selects those belonging to the target domain. Experimental results show that the method is effective.
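The pipeline outlined in the abstract (parse HTML into a DOM-like structure, derive node and structural features, classify with Bayes) can be illustrated with a minimal sketch. The abstract does not specify the paper's improvement to Bayes, its feature set, or its domain knowledge base, so everything here is an assumption: the sketch uses a plain multinomial naive Bayes with Laplace smoothing, hypothetical features (tag name, depth, trailing question mark, words), and hypothetical labels `question` and `answer`. Domain filtering is omitted.

```python
import math
from collections import Counter, defaultdict
from html.parser import HTMLParser


class NodeFeatureParser(HTMLParser):
    """Collects (features, text) for each text-bearing node of a page."""

    def __init__(self):
        super().__init__()
        self.stack = []  # open tags from the root down to the current node
        self.nodes = []  # list of (feature list, text)

    def handle_starttag(self, tag, attrs):
        self.stack.append(tag)

    def handle_endtag(self, tag):
        # Pop back to the matching open tag; unclosed inner tags are dropped.
        if tag in self.stack:
            while self.stack.pop() != tag:
                pass

    def handle_data(self, data):
        text = data.strip()
        if not text or not self.stack:
            return
        # Hypothetical features, not the paper's actual feature set.
        features = [
            "tag=" + self.stack[-1],
            "depth=" + str(len(self.stack)),
            "ends_with_qmark=" + str(text.endswith("?")),
        ]
        features += ["word=" + w.lower() for w in text.split()]
        self.nodes.append((features, text))


def extract_nodes(html):
    parser = NodeFeatureParser()
    parser.feed(html)
    parser.close()
    return parser.nodes


class NaiveBayes:
    """Plain multinomial naive Bayes with Laplace (add-one) smoothing."""

    def fit(self, samples, labels):
        self.class_counts = Counter(labels)
        self.feature_counts = defaultdict(Counter)
        self.vocab = set()
        for features, label in zip(samples, labels):
            self.feature_counts[label].update(features)
            self.vocab.update(features)
        return self

    def predict(self, features):
        total = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, count in self.class_counts.items():
            score = math.log(count / total)  # log prior
            denom = sum(self.feature_counts[label].values()) + len(self.vocab)
            for f in features:
                if f in self.vocab:  # features never seen in training are skipped
                    score += math.log((self.feature_counts[label][f] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy training page with hand-assigned labels (illustrative data only).
train_html = (
    "<html><body>"
    "<dt>How do I reset my password?</dt>"
    "<dd>Open the settings page and choose reset.</dd>"
    "<dt>What is the refund policy?</dt>"
    "<dd>Refunds are issued within ten days.</dd>"
    "</body></html>"
)
train_nodes = extract_nodes(train_html)
train_labels = ["question", "answer", "question", "answer"]
model = NaiveBayes().fit([f for f, _ in train_nodes], train_labels)

new_html = (
    "<html><body>"
    "<dt>Where can I download the manual?</dt>"
    "<dd>The manual is on the support page.</dd>"
    "</body></html>"
)
new_nodes = extract_nodes(new_html)
predictions = [model.predict(f) for f, _ in new_nodes]
for (_, text), label in zip(new_nodes, predictions):
    print(label, "|", text)
```

Consecutive nodes labelled `question` and `answer` could then be paired into FAQ entries. A real system would need far more training data than this toy page, along with handling for void elements such as `br`, which this sketch leaves on the tag stack until the parent closes.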

Keywords:

Naive Bayes classifier
Information retrieval
Bayes' theorem
Data mining
Parsing
Question answering
Computer science
Statistical classification
Document Object Model
Domain knowledge
Feature extraction
Pattern recognition
Artificial intelligence
Filter (signal processing)
