Empirical Algorithms for General Stochastic Systems with Continuous States and Actions

2019 
In this paper, we present the Randomized Empirical Value Learning (RAEVL) algorithm for MDPs with continuous state and action spaces. The algorithm combines random search over the action space with a randomized function approximation method that generalizes value functions over the state space. Our theoretical analysis uses a random operator framework combined with a stochastic dominance argument. This yields a finite-time analysis of the proposed algorithm as well as its sample complexity.
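
To make the two ingredients named in the abstract concrete, the following is a minimal sketch and not the paper's exact algorithm. It assumes a cost-minimizing discounted MDP on the unit cube, uniform sampling of states and actions, random Fourier features as one possible randomized function approximator, and a ridge-regularized least-squares fit. The names `empirical_value_learning`, `random_features`, and `toy_step`, along with all parameter values, are hypothetical and chosen only for illustration.

```python
import numpy as np

def random_features(states, W, b):
    # Random Fourier features (an assumed choice of randomized approximator).
    # states: (n, state_dim), W: (state_dim, n_features), b: (n_features,)
    return np.cos(states @ W + b)

def empirical_value_learning(step, state_dim, action_dim, gamma=0.9,
                             n_states=100, n_actions=20, n_next=10,
                             n_features=50, n_iters=30, ridge=1e-3, seed=0):
    """Sketch of value iteration with sampled states, sampled actions,
    and a randomly parameterized value-function approximator."""
    rng = np.random.default_rng(seed)
    # Feature parameters are drawn once at random and then held fixed.
    W = rng.normal(size=(state_dim, n_features))
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
    theta = np.zeros(n_features)
    for _ in range(n_iters):
        # Sample states uniformly from [0, 1]^state_dim (assumption).
        S = rng.uniform(0.0, 1.0, size=(n_states, state_dim))
        targets = np.empty(n_states)
        for i, s in enumerate(S):
            # Random search: minimize over a finite random sample of actions.
            A = rng.uniform(0.0, 1.0, size=(n_actions, action_dim))
            best = np.inf
            for a in A:
                costs, next_states = step(s, a, n_next, rng)
                # Empirical (sample-average) Bellman backup for (s, a).
                q = np.mean(costs + gamma * random_features(next_states, W, b) @ theta)
                best = min(best, q)
            targets[i] = best
        # Fit the new value function by ridge-regularized least squares.
        Phi = random_features(S, W, b)
        theta = np.linalg.solve(Phi.T @ Phi + ridge * np.eye(n_features),
                                Phi.T @ targets)
    # The returned function expects an array of shape (n, state_dim).
    return lambda states: random_features(np.atleast_2d(states), W, b) @ theta

def toy_step(s, a, n, rng):
    # Hypothetical simulator: noisy contraction toward the action, quadratic cost.
    noise = rng.normal(scale=0.05, size=(n, s.shape[0]))
    next_states = np.clip(0.5 * s + 0.5 * a + noise, 0.0, 1.0)
    costs = np.full(n, np.sum(s ** 2) + 0.1 * np.sum(a ** 2))
    return costs, next_states

if __name__ == "__main__":
    value = empirical_value_learning(toy_step, state_dim=1, action_dim=1, n_iters=10)
    print(value(np.array([[0.1], [0.9]])))
```

In this sketch the sample sizes `n_states`, `n_actions`, and `n_next` play the role of the quantities a sample-complexity bound would constrain; the values used here are arbitrary.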