Maximizing the Conditional Expected Reward for Reaching the Goal
2017
The paper addresses the problem of computing maximal conditional expected accumulated rewards until reaching a target state (briefly called maximal conditional expectations) in finite-state Markov decision processes, where the condition is given as a reachability constraint. Conditional expectations of this type can, e.g., stand for the maximal expected termination time of probabilistic programs with non-determinism, under the condition that the program eventually terminates, or for the worst-case expected penalty to be paid, assuming that at least three deadlines are missed. The main results of the paper are (i) a polynomial-time algorithm to check the finiteness of maximal conditional expectations, (ii) PSPACE-completeness for the threshold problem in acyclic Markov decision processes, where the task is to check whether the maximal conditional expectation exceeds a given threshold, (iii) a pseudo-polynomial-time algorithm for the threshold problem in the general (cyclic) case, and (iv) an exponential-time algorithm for computing the maximal conditional expectation and an optimal scheduler.
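To make the quantity concrete, the following is a minimal sketch on a hypothetical two-state acyclic MDP; it is not one of the paper's algorithms. For a fixed scheduler, the conditional expectation is the expected reward accumulated on paths that reach the goal, divided by the probability of reaching the goal. The sketch simply enumerates one action per state, which covers all deterministic schedulers here only because each state of this toy example has a unique history. The state names, actions, probabilities, and rewards are assumptions chosen for illustration.

```python
from fractions import Fraction
from itertools import product

# Hypothetical toy MDP: state -> action -> list of (probability, reward, successor).
MDP = {
    "s0": {
        "a": [(Fraction(1, 2), 1, "goal"), (Fraction(1, 2), 0, "fail")],
        "b": [(Fraction(1), 0, "s1")],
    },
    "s1": {
        "c": [(Fraction(1, 10), 10, "goal"), (Fraction(9, 10), 0, "fail")],
        "d": [(Fraction(1), 2, "goal")],
    },
}


def evaluate(state, choice):
    """Return (Pr(reach goal), E[reward * 1{reach goal}]) under a fixed choice."""
    if state == "goal":
        return Fraction(1), Fraction(0)
    if state == "fail":
        return Fraction(0), Fraction(0)
    p_total, e_total = Fraction(0), Fraction(0)
    for prob, reward, succ in MDP[state][choice[state]]:
        p, e = evaluate(succ, choice)
        p_total += prob * p
        # The transition reward only counts on continuations that reach the goal.
        e_total += prob * (reward * p + e)
    return p_total, e_total


def max_conditional_expectation(init="s0"):
    """Brute-force maximum over all one-action-per-state schedulers."""
    states = sorted(MDP)
    best = None
    for actions in product(*(sorted(MDP[s]) for s in states)):
        choice = dict(zip(states, actions))
        p, e = evaluate(init, choice)
        if p == 0:
            continue  # the condition has probability zero; scheduler is excluded
        value = e / p
        if best is None or value > best[0]:
            best = (value, choice)
    return best


value, choice = max_conditional_expectation()
print(value, choice)
```

In this example the scheduler choosing `b` then `c` reaches the goal with probability 1/10 but collects reward 10 whenever it does, so its conditional expectation exceeds that of `b` then `d`, which would be the better choice for the unconditional expected reward. This illustrates why conditioning changes which scheduler is optimal.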