Lexicalized stochastic modeling of constraint-based grammars using log-linear measures and EM training

Stefan Riezler, Jonas Kuhn, Detlef Prescher, Mark Johnson

2000

We present a new approach to stochastic modeling of constraint-based grammars that is based on log-linear models and uses EM for estimation from unannotated data. The techniques are applied to an LFG grammar for German. Evaluation on an exact match task yields 86% precision for an ambiguity rate of 5.4, and 90% precision on a subcat frame match for an ambiguity rate of 25. Experimental comparison to training from a parsebank shows a 10% gain from EM training. Also, a new class-based grammar lexicalization is presented, showing a 10% gain over unlexicalized models.
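As a rough illustration of the kind of model the abstract refers to, the sketch below computes a log-linear distribution over the candidate analyses of a single sentence, together with the expected feature values that an EM-style estimation step would typically need. It is a minimal sketch under stated assumptions, not the authors' implementation: the feature vectors, the weights, and the function names `loglinear_distribution` and `expected_features` are all hypothetical and chosen only for illustration.

```python
import math

def loglinear_distribution(feature_vectors, weights):
    """Normalize exp(weights . features) over the candidate analyses of one sentence."""
    scores = [sum(w * f for w, f in zip(weights, fv)) for fv in feature_vectors]
    top = max(scores)  # subtract the maximum score for numerical stability
    exps = [math.exp(s - top) for s in scores]
    z = sum(exps)  # normalizing constant over this sentence's analyses
    return [e / z for e in exps]

def expected_features(feature_vectors, weights):
    """Expected value of each feature under the current model (E-step quantity)."""
    probs = loglinear_distribution(feature_vectors, weights)
    return [
        sum(p * fv[i] for p, fv in zip(probs, feature_vectors))
        for i in range(len(weights))
    ]

# Hypothetical toy data: three candidate analyses, two binary features.
candidates = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights = [0.5, -0.5]

probs = loglinear_distribution(candidates, weights)
expectations = expected_features(candidates, weights)
```

With these toy weights, the first analysis receives the highest probability because its only active feature has a positive weight. When training from unannotated data, the correct analysis is unknown, so expectations of this kind over all analyses of a sentence stand in for observed feature counts.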

Keywords:

Log-linear model
Ambiguity
Machine learning
Natural language processing
Computer science
Artificial intelligence
Grammar
Lexicalization
Stochastic modelling
Rule-based machine translation
Pattern recognition
exact match
German
