The matching pursuit algorithm revisited: A variant for big data and new stopping rules

2019 
Abstract The matching pursuit algorithm (MPA) is used in many applications for selecting the best predictors for a vector of measurements of size n from a dictionary that contains p n atoms, where usually n  ≤  p n . A major unsolved problem is to determine the optimal stopping rule. In this work, we investigate various stopping rules which are modifications of the information theoretic (IT) criteria derived for Gaussian linear regression. Because all of them involve the degrees of freedom (df) given by the trace of the hat matrix, we provide some theoretical results concerning this matrix. We also propose novel stopping rules. An important contribution of this paper is a method for computing the df efficiently when big data ( n  ≫  p n ) are processed. The significance of the auxiliary variables appearing in MPA for big data is clarified via a theoretical analysis. The superiority of the new stopping rules in comparison with the traditional approaches is demonstrated in simulations involving big data ( n  ≫  p n ) or overcomplete dictionaries ( n p n ) and in experiments with air pollution data.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    29
    References
    1
    Citations
    NaN
    KQI
    []