Structured Discriminative Models for Sequential Data Classification

2010 
The use of discriminative models for structured classification tasks, such as automatic speech recognition (ASR), is becoming increasingly popular. The major contribution of this first-year work is a large margin structured log-linear model for noise-robust continuous ASR. An important aspect of log-linear models is the form of the features. The features used in the proposed structured log-linear model are derived from generative kernels, which provides an elegant way of combining generative and discriminative models to handle time-varying data. In addition, since the features are based on generative models, model-based compensation can easily be applied for noise robustness. Furthermore, the joint feature space is designed to decompose at the arc level, which allows efficient lattice-based decoding and training and is important for any large vocabulary extension. Previous work in this area is extended in two important directions. First, instead of conditional maximum likelihood (CML) training, which is commonly used for discriminative models, this report describes efficient large margin training based on lattices. Second, efficient lattice-based classification of continuous data is performed using the joint feature space. Depending on the nature of the joint feature space and the labels, this form of model is shown to be closely related to structured SVMs and multi-class SVMs. The model is evaluated on a noise-corrupted continuous digit task, AURORA 2.0; the results demonstrate that modelling structural information yields significant improvements. The current joint features are constructed from the "most likely" alignment given by the Viterbi likelihood. An exponential mixture model is therefore proposed as a latent-variable extension of the structured log-linear model, together with a new joint large margin training framework in which the discriminative and generative parameters are estimated together, in a large margin fashion, with optimal alignments. An augmented Viterbi search algorithm is also proposed for efficient training and decoding. The implementation of these extensions will form part of future work.
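
As a brief sketch of the model form summarised above (the notation is illustrative, not taken verbatim from the report): for an observation sequence O and label sequence w, a structured log-linear model with parameters \alpha and joint feature space \phi(O, w) typically takes the form

  P(w \mid O; \alpha) = \frac{\exp\big(\alpha^{\top} \phi(O, w)\big)}{\sum_{w'} \exp\big(\alpha^{\top} \phi(O, w')\big)},

where the components of \phi(O, w) may be generative-kernel scores such as HMM log-likelihoods. The arc-level decomposition mentioned above corresponds, under the usual assumption, to \phi(O, w) = \sum_{a \in w} \phi(O_a, a) over the arcs a of a lattice path, which is what makes lattice-based training and decoding tractable.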
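
The lattice-based large margin training referred to above can be sketched, under common assumptions, as minimising a regularised hinge loss over competing lattice hypotheses; here C is a regularisation constant and L(w, w_i) a label loss such as the word error, both assumed rather than quoted from the report:

  \min_{\alpha} \; \frac{1}{2}\lVert\alpha\rVert^{2} + C \sum_{i} \Big[ \max_{w \neq w_i} \big( L(w, w_i) - \alpha^{\top}\big(\phi(O_i, w_i) - \phi(O_i, w)\big) \big) \Big]_{+}.

With the arc-level decomposition of \phi, the inner maximisation can be carried out over the lattice with a Viterbi-style search, which reflects the close relationship to structured SVMs noted in the abstract.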