Semantic web and language resources for e-government: linguistically motivated data mining

2011 
Language Technologies (LT) perform well when they rely on a previous Language Resources (LR) development. Hence, in this paper we illustrate how to build an efficient data mining system based on a coherent formalization of natural language and on a lingware (in Machine-Readable Form) built on the universal concepts of "lexical unit", "meaning unit" and "morphosyntactic context". From a linguistic and semantic point of view we get more coherent results through a linguistically motivated data mining than a statistical approach. We developed LR for Natural Language Processing (NLP) applications, composed by electronic dictionaries made of terminological multiword-expressions (Machine-Readable Form) and by local grammars (in the form of finite-state automata and transducers -- FSA/FST). Both parts of this lingware were built and applied according to Lexicon-Grammar (LG) formalization principles and methods. The mentioned language resources form the basis to develop an Information Retrieval System for e-Government. This device is a Semantic Web (SW) application, which will automatically recognize a given set of frequently-asked questions (from here on, FAQs) on European Community Information, previously formalized as syntactic patterns inside local grammars.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    16
    References
    1
    Citations
    NaN
    KQI
    []