FAQ Extracting and Domain Filtering Based on Improved Bayes

2009 
FAQ(Frequently Asked Questions) is the basis of Question Answering System (QA) that oriented frequently asked questions database. For the FAQ is difficult to collect and organize, this paper proposed an automatic acquisition method of domain FAQ based on improved Bayes. Parsing HTML pages into DOM tree, combining with the restricted domain knowledge base, extracting the node information and structural characteristics of DOM tree as the classified feature, using the improved Bayesian classified learning algorithm, constructing the classification model, acquiring FAQ from the HTML page automatically and filtering out the domain FAQ , the experimental results of this method show that it has a remarkable effect.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    4
    References
    1
    Citations
    NaN
    KQI
    []