Solving Continuous-Time Transition-Independent DEC-MDP with Temporal Constraints

2011 
Despite the impact of DEC-MDPs over the past decade, scal-ing to large problem domains has been dicult to achieve.The scale-up problem is exacerbated in DEC-MDPs withcontinuous states, which are critical in domains involvingtime; the latest algorithm (M-DPFP) does not scale-up be-yond two agents and a handful of unordered tasks per agent.This paper is focused on meeting this challenge in contin-uous resource DEC-MDPs with two predominant contribu-tions. First, it introduces a novel continuous time model formulti-agent planning problems that exploits transition in-dependence in domains with graphical agent dependenciesand temporal constraints. More importantly, it presents anew, iterative, locally optimal algorithm called SPAC thatis a combination of the following key ideas: (1) de ninga novel augmented CT-MDP such that solving this single-agent continuous time MDP provably provides an automaticbest response to neighboring agents’ policies; (2) fast con-volution to eciently generate such augmented MDPs; (3)new enhanced lazy approximation algorithm to solve theseaugmented MDPs; (4) intelligent seeding of initial policiesin the iterative process; (5) exploiting graph structure ofreward dependencies to exploit local interactions for scala-bility. Our experiments show SPAC not only nds solutionssubstantially faster than M-DPFP with comparable quality,but also scales well to large teams of agents.
    • Correction
    • Cite
    • Save
    • Machine Reading By IdeaReader
    16
    References
    1
    Citations
    NaN
    KQI
    []