Patterns of syntactic trees for parsing Arabic texts

2010 
To parse Arabic texts, we have chosen a machine learning approach that learns from an Arabic Treebank. The knowledge contained in this Treebank is structured as patterns of syntactic trees. These patterns are representative models of the syntactic components of the Arabic language. They are layered and rich in both structural and contextual information, and they serve as an informational source for guiding the parsing process. Our parser is progressive: it processes a sentence in a number of stages equal to the number of its words. At each step, the parser assigns to the target word the patterns most likely to represent it in its context. It then joins the selected patterns with those collected in the previous steps to construct the representative syntactic tree(s) of the whole sentence. Preliminary tests yielded an accuracy of 84.78% and an F-score of 77.52%.
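The progressive procedure described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how patterns are scored or how trees are joined, so this sketch assumes relative-frequency scoring over a (previous tag, current tag) context and models "joining" as appending a pattern label to a partial analysis. The names `most_likely_patterns`, `parse_progressively`, the `UNK` fallback, and the toy counts are all hypothetical.

```python
from collections import Counter
from typing import Dict, List, Tuple

# Hypothetical pattern store: for each (previous tag, current tag) context,
# how often each pattern label was observed. Real patterns would be tree
# fragments extracted from a treebank; plain labels stand in for them here.
PatternCounts = Dict[Tuple[str, str], Counter]
Analysis = Tuple[List[str], float]


def most_likely_patterns(counts: PatternCounts, prev_tag: str, tag: str,
                         k: int = 1) -> List[Tuple[str, float]]:
    """Return up to k pattern labels with their relative frequency in this context."""
    seen = counts.get((prev_tag, tag), Counter())
    total = sum(seen.values())
    return [(label, n / total) for label, n in seen.most_common(k)]


def parse_progressively(tags: List[str], counts: PatternCounts,
                        k: int = 1) -> List[Analysis]:
    """Run one stage per word: select patterns, then extend each partial analysis."""
    analyses: List[Analysis] = [([], 1.0)]
    prev_tag = "<s>"  # assumed sentence-start marker
    for tag in tags:
        candidates = most_likely_patterns(counts, prev_tag, tag, k)
        if not candidates:
            candidates = [("UNK", 1.0)]  # assumed fallback for unseen contexts
        # "Joining" is simplified to appending the label to every partial analysis.
        analyses = [(path + [label], score * p)
                    for path, score in analyses
                    for label, p in candidates]
        analyses.sort(key=lambda a: a[1], reverse=True)
        analyses = analyses[:k]  # keep only the k best partial analyses
        prev_tag = tag
    return analyses


# Toy counts, invented purely for illustration.
toy_counts: PatternCounts = {
    ("<s>", "VERB"): Counter({"VP-head": 3, "S-head": 1}),
    ("VERB", "NOUN"): Counter({"NP-subj": 3, "NP-obj": 1}),
}
best = parse_progressively(["VERB", "NOUN"], toy_counts, k=1)
```

With `k` greater than 1 the sketch keeps several competing analyses per stage, which mirrors the abstract's mention of one or more representative trees per sentence.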