An Anytime Algorithm for Reachability on Uncountable MDP.

2020 
We provide an algorithm for reachability on Markov decision processes with uncountable state and action spaces which, under mild assumptions, approximates the optimal value to any desired precision. It is the first such anytime algorithm, meaning that at any point in time it can return the current approximation together with its precision. Moreover, it is the first algorithm able to use learning approaches without sacrificing guarantees, and it can be combined with existing heuristics.
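
To make the "anytime" property concrete, the following is a minimal sketch on a tiny finite MDP. It is not the algorithm of the paper, which targets uncountable state and action spaces. It swaps in plain interval (bounded) value iteration, which keeps a lower and an upper bound on the optimal reachability probability, so it can be stopped after any sweep and still report an approximation with a guaranteed precision. The toy model `TOY`, the state names, and the functions `anytime_reachability` and `solve` are all hypothetical and introduced only for illustration.

```python
from typing import Dict, Iterator, List, Tuple

# Hypothetical toy MDP: state -> action -> list of (probability, successor).
# "goal" and "sink" are absorbing and carry no actions of their own.
MDP = Dict[str, Dict[str, List[Tuple[float, str]]]]

TOY: MDP = {
    "s0": {"a": [(0.5, "goal"), (0.5, "sink")],
           "b": [(0.9, "s1"), (0.1, "sink")]},
    "s1": {"a": [(0.8, "goal"), (0.2, "s0")]},
}


def anytime_reachability(mdp: MDP, goal: str, sink: str,
                         init: str) -> Iterator[Tuple[float, float]]:
    """Yield ever-tighter (lower, upper) bounds on the maximal probability
    of reaching `goal` from `init`.

    Assumption: apart from `goal` and `sink`, the MDP has no end
    components; otherwise the upper bound may fail to converge.
    """
    lower = {s: 0.0 for s in mdp}
    upper = {s: 1.0 for s in mdp}
    for bounds in (lower, upper):
        bounds[goal] = 1.0  # the goal is reached with certainty
        bounds[sink] = 0.0  # the sink never reaches the goal
    while True:
        for bounds in (lower, upper):
            for state, actions in mdp.items():
                # Bellman update: best expected bound over all actions.
                bounds[state] = max(
                    sum(p * bounds[succ] for p, succ in dist)
                    for dist in actions.values()
                )
        # The caller may stop here at any time: the true value lies
        # between the two bounds, so upper - lower is the precision.
        yield lower[init], upper[init]


def solve(eps: float) -> Tuple[float, float]:
    """Run the sketch on TOY until the bounds are at most `eps` apart."""
    for lo, hi in anytime_reachability(TOY, "goal", "sink", "s0"):
        if hi - lo <= eps:
            return lo, hi
    raise RuntimeError("unreachable: the generator never terminates")


if __name__ == "__main__":
    lo, hi = solve(1e-6)
    print(f"value in [{lo:.6f}, {hi:.6f}], precision {hi - lo:.1e}")
```

Because the generator yields after every sweep, a caller can interrupt it whenever its time budget runs out and use the midpoint of the last interval as the estimate, with half the interval width as the error bound.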