Sentence Realization with Unlexicalized Tree Linearization Grammars

2012 
Sentence realization, an important component of natural language generation, has taken a statistical turn in recent years. While most previous approaches rely heavily on lexical information in the form of N-gram language models, we propose a novel method based on unlexicalized tree linearization grammars. We formally define the grammar representation and demonstrate learning from either treebanks with gold-standard annotations or automatically parsed corpora. For the testing phase, we present a linear-time deterministic algorithm that obtains the 1-best word order, and we extend it to perform exact search for n-best linearizations. We carry out experiments on various languages and report state-of-the-art performance. In addition, we discuss the advantages of our method in terms of both its empirical performance and its linguistic interpretability.