AIML Knowledge Base Construction from Text Corpora

2013 
Text mining (TM) and computational linguistics (CL) are computationally intensive fields in which a growing number of tools are available for studying large text corpora and exploiting them for various purposes. In this chapter we address the problem of building conversational agents, or chatbots, from corpora for domain-specific educational purposes. After addressing some linguistic issues relevant to developing chatbot tools from corpora, we present a methodology for systematically analyzing large text corpora about a limited knowledge domain. Treating the Artificial Intelligence Markup Language (AIML) as the “assembly language” of artificial intelligence conversational agents, we present a way of using text corpora as a seed from which a set of “source files” can be derived. More specifically, we illustrate how to use corpus data to extract relevant keywords, multiword expressions and text patterns, and to build a glossary, in order to construct an AIML knowledge base that can later be used to build interactive conversational systems. The approach we propose does not require deep understanding techniques for the analysis of text.
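As a rough illustration of what deriving AIML “source files” from corpus data might look like, the following minimal sketch turns a small glossary into AIML categories. It is not the chapter's method: the function name `glossary_to_aiml`, the "WHAT IS ..." pattern shape and the sample glossary entry are all assumptions made for this example.

```python
import xml.etree.ElementTree as ET


def glossary_to_aiml(glossary):
    """Turn {term: definition} pairs into an AIML document string.

    Hypothetical helper for illustration only; the question shape
    "WHAT IS <TERM>" is an assumed pattern, not one from the chapter.
    """
    root = ET.Element("aiml", version="1.0.1")
    for term, definition in sorted(glossary.items()):
        category = ET.SubElement(root, "category")
        # AIML patterns are conventionally written in upper case.
        ET.SubElement(category, "pattern").text = f"WHAT IS {term.upper()}"
        ET.SubElement(category, "template").text = definition
    return ET.tostring(root, encoding="unicode")


# Hypothetical glossary entry, not taken from the chapter's corpus.
glossary = {"corpus": "A large, structured collection of texts."}
aiml_source = glossary_to_aiml(glossary)
print(aiml_source)
```

Each glossary term becomes one `category` whose `pattern` is the expected user question and whose `template` is the stored definition, which is the basic unit an AIML interpreter matches against.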