Transformation-based tree-to-tree alignment

2012 
Previous experiments suggest that a rule-based approach to tree alignment error correction is an effective complement to statistical alignment. We show how, using relatively few features, an implementation of Brill’s Transformation-Based Learning algorithm improves the results of a high-precision model of the statistical aligner Lingua-Align. Using our system to correct already tree-aligned data, we achieve balanced F-scores of 80.6 on our test set and 85.2 on our development test set. Using it as a tree aligner on word-aligned data, our best F-scores using the same model amount to 78.7 and 83.0, respectively. Finally, we apply a pipeline of alignment and error correction tools to create several versions of a large Dutch-to-English parallel treebank covering various domains, for use in a syntax-based MT system. We conclude that transformation-based learning is a promising approach for the large-scale creation of parallel treebanks for various NLP purposes.
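For readers unfamiliar with the metric, the balanced F-score reported above is the harmonic mean of precision and recall over alignment links. The following is a minimal sketch, assuming each link is represented as a (source node, target node) pair and that all links are weighted equally; the function name, the node identifiers, and the toy data are hypothetical, and the evaluation actually used in the paper may differ in detail.

```python
def balanced_f_score(predicted, gold):
    """Return (precision, recall, F1) for two collections of alignment links.

    Each link is assumed to be a hashable (source_node, target_node) pair.
    """
    predicted, gold = set(predicted), set(gold)
    correct = len(predicted & gold)  # links present in both sets
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Toy data (hypothetical): four gold links, three predicted, two of them correct.
gold_links = {("s1", "t1"), ("s2", "t2"), ("s3", "t3"), ("s4", "t4")}
predicted_links = {("s1", "t1"), ("s2", "t2"), ("s3", "t9")}

precision, recall, f1 = balanced_f_score(predicted_links, gold_links)
print(f"P={precision:.3f} R={recall:.3f} F1={f1:.3f}")
```

In this toy case two of the three predicted links are correct and two of the four gold links are recovered, so precision is 2/3, recall is 1/2, and the balanced F-score is 4/7.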