Maximizing the Conditional Expected Reward for Reaching the Goal

2017 
The paper addresses the problem of computing maximal conditional expected accumulated rewards until reaching a target state (briefly called maximal conditional expectations) in finite-state Markov decision processes where the condition is given as a reachability constraint. Conditional expectations of this type can, e.g., stand for the maximal expected termination time of probabilistic programs with non-determinism, under the condition that the program eventually terminates, or for the worst-case expected penalty to be paid, assuming that at least three deadlines are missed. The main results of the paper are (i) a polynomial-time algorithm to check the finiteness of maximal conditional expectations, (ii) PSPACE-completeness for the threshold problem in acyclic Markov decision processes, where the task is to check whether the maximal conditional expectation exceeds a given threshold, (iii) a pseudo-polynomial-time algorithm for the threshold problem in the general (cyclic) case, and (iv) an exponential-time algorithm for computing the maximal conditional expectation and an optimal scheduler.
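To make the quantity concrete, the following minimal sketch computes a conditional expected accumulated reward in a tiny acyclic Markov chain, i.e. for one fixed scheduler. It is not one of the paper's algorithms, and it does not maximize over schedulers. The states `s0`, `s1`, `goal`, `fail`, the transition probabilities, and the rewards are hypothetical, chosen only for illustration; the sketch also assumes that a state's reward is collected when the state is left, and that the condition is reaching `goal` itself. It relies on the identity that the conditional expectation equals the expected reward accumulated on goal-reaching paths divided by the probability of reaching the goal.

```python
from fractions import Fraction
from functools import lru_cache

HALF = Fraction(1, 2)

# Hypothetical acyclic Markov chain (one fixed scheduler already applied).
TRANSITIONS = {
    "s0": {"goal": HALF, "s1": HALF},
    "s1": {"goal": HALF, "fail": HALF},
}
# Assumption: the reward of a state is earned when the state is left.
REWARD = {"s0": Fraction(1), "s1": Fraction(2)}


@lru_cache(maxsize=None)
def reach_and_partial_expectation(state):
    """Return (p, e) for `state`, where
    p = probability of eventually reaching "goal", and
    e = expected accumulated reward counted only on paths that reach "goal".
    """
    if state == "goal":
        return Fraction(1), Fraction(0)
    if state == "fail":
        return Fraction(0), Fraction(0)
    p = Fraction(0)
    e = Fraction(0)
    for successor, prob in TRANSITIONS[state].items():
        p_succ, e_succ = reach_and_partial_expectation(successor)
        p += prob * p_succ
        e += prob * e_succ
    # The reward of `state` counts on every goal-reaching path through it.
    e += REWARD[state] * p
    return p, e


def conditional_expectation(state):
    """Expected accumulated reward until "goal", given that "goal" is reached."""
    p, e = reach_and_partial_expectation(state)
    if p == 0:
        raise ValueError("condition has probability zero; value is undefined")
    return e / p


print(conditional_expectation("s0"))  # prints 5/3
```

In this example the goal is reached with probability 3/4 (reward 1 with probability 1/2, reward 3 with probability 1/4), so the conditional expectation is (5/4) / (3/4) = 5/3. In a Markov decision process the same ratio would have to be maximized over all schedulers, which is the harder problem the abstract refers to.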