Learning with BOT - Bregman and Optimal Transport divergences
2021
The introduction of the Kullback-Leibler divergence in PAC-Bayesian theory can be traced back to the work of [1]. It allows the design of learning procedures whose generalization errors rest on an optimal trade-off between accuracy on the training set and complexity. This complexity is penalized through the Kullback-Leibler divergence from a prior distribution, which models domain knowledge over the set of candidates or weak learners. In the context of high-dimensional statistics, this gives rise to sparsity oracle inequalities or, more recently, sparsity regret bounds, where the complexity is measured through $\ell_0$ or $\ell_1$ norms. In this paper, we propose to extend the PAC-Bayesian theory to obtain more generic regret bounds for sequential weighted averages, where (1) the measure of complexity is based on any ad hoc criterion and (2) the prior distribution can be very simple. These results arise by introducing a new measure of divergence from the prior in terms of Bregman divergences or Optimal Transport.
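To make the "accuracy plus divergence-from-prior" trade-off concrete, the following minimal sketch (not the authors' code; all names and the uniform prior are illustrative assumptions) computes a Gibbs posterior over a finite candidate set and evaluates its distance to the prior with two Bregman divergences: the negative-entropy potential, which recovers the Kullback-Leibler penalty of classical PAC-Bayes, and a squared-norm potential, which yields a different complexity measure of the kind the paper generalizes to.

```python
# Illustrative sketch only: Bregman divergences on finite probability vectors,
# with KL recovered as the special case generated by negative Shannon entropy.
import numpy as np

def bregman(p, q, phi, grad_phi):
    """Bregman divergence D_phi(p || q) = phi(p) - phi(q) - <grad_phi(q), p - q>."""
    return phi(p) - phi(q) - np.dot(grad_phi(q), p - q)

# Negative Shannon entropy as the potential: its Bregman divergence is KL(p || q).
neg_entropy = lambda p: np.sum(p * np.log(p))
grad_neg_entropy = lambda p: np.log(p) + 1.0

# Half squared Euclidean norm as an alternative potential: its Bregman divergence
# is the squared L2 distance, a non-KL complexity measure.
sq_norm = lambda p: 0.5 * np.dot(p, p)
grad_sq_norm = lambda p: p

rng = np.random.default_rng(0)
prior = np.full(5, 1.0 / 5)        # simple (uniform) prior over 5 candidates
losses = rng.uniform(size=5)       # hypothetical empirical losses of the candidates
lam = 2.0                          # trade-off (inverse-temperature) parameter

# Gibbs posterior: minimizer of  <rho, losses> + (1/lam) * KL(rho || prior).
gibbs = prior * np.exp(-lam * losses)
gibbs /= gibbs.sum()

kl = bregman(gibbs, prior, neg_entropy, grad_neg_entropy)
l2 = bregman(gibbs, prior, sq_norm, grad_sq_norm)
print(f"KL(gibbs || prior)        = {kl:.4f}  (Bregman with negative entropy)")
print(f"0.5*||gibbs - prior||^2   = {l2:.4f}  (Bregman with squared norm)")
```

Swapping the potential (or replacing the Bregman term by an Optimal Transport cost) changes the complexity penalty while keeping the same weighted-average aggregation scheme, which is the flavor of generalization the abstract describes.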