In multi-agent MDPs, it is generally necessary to consider the joint state space of all agents, making the size of the problem and the solution exponential in the number of agents. However, often interactions between the agents are only local, which suggests a more compact problem representation. We consider a subclass of multi-agent MDPs with local interactions where dependencies between agents are asymmetric, meaning that agents can affect others in a unidirectional manner. This asymmetry, which often occurs in domains with authority-driven relationships between agents, allows us to make better use of the locality of agents' interactions. We present and analyze a graphical model of such problems and show that, for some classes of problems, it can be exploited to yield significant (sometimes exponential) savings in problem and solution size, as well as in computational efficiency of solution algorithms.
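As a rough illustration of the savings that one-way, local interactions can allow (using hypothetical agents, state sizes, and dependency structure rather than the paper's actual model), the sketch below compares the joint state space size with the total size of per-agent local contexts on a small dependency DAG; only the Python standard library is used.

```python
import math
from graphlib import TopologicalSorter

# Hypothetical agents: each entry is the size of that agent's local state space.
state_sizes = {"boss": 4, "worker1": 6, "worker2": 6, "helper": 3}
# Asymmetric interactions: an agent is affected only by the agents listed here,
# never the other way around, so the dependency structure is a DAG.
influences = {
    "boss": [],
    "worker1": ["boss"],
    "worker2": ["boss"],
    "helper": ["worker1"],
}

# One-way dependencies admit a consistent ordering of the agents.
print("dependency order:", list(TopologicalSorter(influences).static_order()))

# Joint model: exponential in the number of agents.
joint_size = math.prod(state_sizes.values())
# Decomposed model: each agent's local model ranges only over its own state
# and the states of the agents that influence it.
local_size = sum(state_sizes[a] * math.prod(state_sizes[p] for p in influences[a])
                 for a in state_sizes)

print("joint state space size:", joint_size)
print("sum of local model sizes:", local_size)
```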
Moments of light cone quark density, helicity, and transversity distributions are calculated in unquenched lattice QCD at β = 5.5 and β = 5.3 using Wilson fermions on 16³ × 32 lattices. These results are combined with earlier calculations at β = 5.6 using SESAM configurations to study the continuum limit.
The problem of optimal policy formulation for teams of resource-limited agents in stochastic environments is composed of two strongly coupled subproblems: a resource allocation problem and a policy optimization problem. We show how to combine the two problems into a single constrained optimization problem that yields optimal resource allocations and policies that are optimal under these allocations. We model the system as a multiagent Markov decision process (MDP), with the social welfare of the group as the optimization criterion. The straightforward approach of modeling both the resource allocation and the actual operation of the agents as a multiagent MDP on the joint state and action spaces of all agents is not feasible, because of the exponential increase in the size of the state space. As an alternative, we describe a technique that exploits problem structure by recognizing that agents are only loosely coupled via the shared resource constraints. This allows us to formulate a constrained policy optimization problem that yields policies that are optimal among the class of policies realizable under the shared resource limitations. Although our complexity analysis shows the constrained optimization problem to be NP-complete, our results demonstrate that, by exploiting problem structure and via a reduction to a mixed integer program, we are able to solve problems orders of magnitude larger than is possible with a traditional multiagent MDP formulation.
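To give a concrete sense of how resource allocation and policy optimization can be coupled in one mixed integer program, the sketch below uses hypothetical toy data, a generic occupancy-measure (dual) MDP LP, and PuLP as an off-the-shelf MIP interface; it is an illustrative reconstruction of the general idea, not the paper's exact formulation.

```python
import pulp

gamma = 0.95
# Hypothetical two-agent example: each agent has a tiny MDP, and the "go" action
# requires a shared resource ("truck") of which only one unit exists.
agents = {
    "a1": {"S": ["s0", "s1"], "A": ["wait", "go"],
           "P": {"s0": {"wait": {"s0": 1.0}, "go": {"s1": 1.0}},
                 "s1": {"wait": {"s1": 1.0}, "go": {"s1": 1.0}}},
           "R": {"s0": {"wait": 0.0, "go": 1.0}, "s1": {"wait": 0.0, "go": 0.5}},
           "needs": {"wait": set(), "go": {"truck"}}},
    "a2": {"S": ["s0", "s1"], "A": ["wait", "go"],
           "P": {"s0": {"wait": {"s0": 1.0}, "go": {"s1": 1.0}},
                 "s1": {"wait": {"s1": 1.0}, "go": {"s1": 1.0}}},
           "R": {"s0": {"wait": 0.0, "go": 2.0}, "s1": {"wait": 0.0, "go": 0.2}},
           "needs": {"wait": set(), "go": {"truck"}}},
}
resources = {"truck": 1}
alpha = {"s0": 1.0, "s1": 0.0}          # initial state distribution (same for both agents)

prob = pulp.LpProblem("coupled_allocation_and_policies", pulp.LpMaximize)
x = {(i, s, a): pulp.LpVariable(f"x_{i}_{s}_{a}", lowBound=0)     # occupancy measures
     for i, m in agents.items() for s in m["S"] for a in m["A"]}
z = {(i, r): pulp.LpVariable(f"z_{i}_{r}", cat="Binary")          # resource granted to agent?
     for i in agents for r in resources}

# Objective: social welfare = sum of the agents' expected discounted rewards.
prob += pulp.lpSum(m["R"][s][a] * x[i, s, a]
                   for i, m in agents.items() for s in m["S"] for a in m["A"])

M = 1.0 / (1.0 - gamma)                 # upper bound on any occupancy measure
for i, m in agents.items():
    # Per-agent "flow" constraints of the dual (occupancy-measure) MDP LP.
    for s in m["S"]:
        prob += (pulp.lpSum(x[i, s, a] for a in m["A"])
                 - gamma * pulp.lpSum(m["P"][s2][a].get(s, 0.0) * x[i, s2, a]
                                      for s2 in m["S"] for a in m["A"])
                 == alpha[s])
    # Coupling: an action that needs resource r is usable only if r is allocated to i.
    for s in m["S"]:
        for a in m["A"]:
            for r in m["needs"][a]:
                prob += x[i, s, a] <= M * z[i, r]

# Shared resource limits couple the otherwise independent agent LPs.
for r, amount in resources.items():
    prob += pulp.lpSum(z[i, r] for i in agents) <= amount

prob.solve(pulp.PULP_CBC_CMD(msg=False))
print("social welfare:", pulp.value(prob.objective))
print("allocation:", {i: [r for r in resources if z[i, r].value() > 0.5] for i in agents})
```

The per-agent flow constraints stay independent; only the binary allocation variables and the resource limit tie the agents together, which is what keeps the program far smaller than a joint-state multiagent MDP.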
Distributing scarce resources among agents in a way that maximizes the social welfare of the group is a computationally hard problem when the value of a resource bundle is not linearly decomposable. Furthermore, the problem of determining the value of a resource bundle can be a significant computational challenge in itself, such as for an agent operating in a stochastic environment, where the value of a resource bundle is the expected payoff of the optimal policy realizable given these resources. Recent work has shown that the structure in agents' preferences induced by stochastic policy-optimization problems (modeled as MDPs) can be exploited to solve the resource-allocation and the policy-optimization problems simultaneously, leading to drastic (often exponential) improvements in computational efficiency. However, previous work used a flat MDP model that scales very poorly. In this work, we present and empirically evaluate a resource-allocation mechanism that achieves much better scaling by using factored MDP models, thus exploiting both the structure in agents' MDP-induced preferences and the structure within agents' MDPs.
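To illustrate what a factored model buys relative to a flat one (independently of the paper's specific mechanism), the sketch below uses hypothetical state features and parent sets to compare the parameter count of a flat transition model with that of a DBN-style factored model for a single agent's MDP.

```python
# Hypothetical state features of a single agent's MDP and their domain sizes.
features = {"fuel": 4, "location": 5, "has_tool": 2, "task_done": 2}
# DBN-style structure: each feature's next value depends only on a few parent
# features (plus the action), rather than on the whole state.
parents = {
    "fuel": ["fuel", "location"],
    "location": ["location"],
    "has_tool": ["has_tool"],
    "task_done": ["task_done", "location", "has_tool"],
}
n_actions = 3

# Flat model: one transition distribution per (state, action) pair over all states.
n_states = 1
for d in features.values():
    n_states *= d
flat_params = n_states * n_actions * n_states

# Factored model: one conditional table per feature, indexed by its parents and the action.
factored_params = 0
for f, pa in parents.items():
    rows = n_actions
    for p in pa:
        rows *= features[p]
    factored_params += rows * features[f]

print("flat transition parameters:    ", flat_params)
print("factored transition parameters:", factored_params)
```

The flat representation grows with the square of an already exponential state count, while the factored one grows only with the sizes of each feature's parent set, which is the structure the mechanism exploits.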
We consider the problem of policy optimization for a resource-limited agent with multiple time-dependent objectives, represented as an MDP with multiple discount factors in the objective function and constraints. We show that limiting the search to stationary deterministic policies, coupled with a novel problem reduction to mixed integer programming, yields a computationally feasible algorithm for finding such policies, where no such algorithm had previously been identified. In the simpler case where the constrained MDP has a single discount factor, our technique provides a new way of finding an optimal deterministic policy, where previous methods could only find randomized policies. We analyze the properties of our approach and describe implementation results.
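The sketch below illustrates only the simpler single-discount-factor case mentioned above, with hypothetical toy data and a generic occupancy-measure LP: binary indicator variables select exactly one action per state, restricting the optimization to deterministic policies, while an additional linear constraint enforces a resource budget. It is an illustrative reconstruction of that idea, not the paper's exact program.

```python
import pulp

gamma = 0.9
S, A = ["s0", "s1"], ["a0", "a1"]
P = {"s0": {"a0": {"s0": 1.0}, "a1": {"s1": 1.0}},
     "s1": {"a0": {"s0": 1.0}, "a1": {"s1": 1.0}}}
R = {"s0": {"a0": 0.0, "a1": 1.0}, "s1": {"a0": 0.0, "a1": 0.3}}   # rewards
C = {"s0": {"a0": 0.0, "a1": 2.0}, "s1": {"a0": 0.0, "a1": 1.0}}   # resource costs
budget = 5.0
alpha = {"s0": 1.0, "s1": 0.0}                                      # initial distribution

prob = pulp.LpProblem("constrained_mdp_deterministic", pulp.LpMaximize)
x = {(s, a): pulp.LpVariable(f"x_{s}_{a}", lowBound=0) for s in S for a in A}   # occupancy
d = {(s, a): pulp.LpVariable(f"d_{s}_{a}", cat="Binary") for s in S for a in A}  # chosen action

prob += pulp.lpSum(R[s][a] * x[s, a] for s in S for a in A)        # expected discounted reward
for s in S:                                                          # occupancy "flow" constraints
    prob += (pulp.lpSum(x[s, a] for a in A)
             - gamma * pulp.lpSum(P[s2][a].get(s, 0.0) * x[s2, a] for s2 in S for a in A)
             == alpha[s])
for s in S:
    prob += pulp.lpSum(d[s, a] for a in A) == 1                     # exactly one action per state
    for a in A:
        prob += x[s, a] <= (1.0 / (1.0 - gamma)) * d[s, a]          # x > 0 only for the chosen action
prob += pulp.lpSum(C[s][a] * x[s, a] for s in S for a in A) <= budget   # resource constraint

prob.solve(pulp.PULP_CBC_CMD(msg=False))
policy = {s: max(A, key=lambda a: d[s, a].value()) for s in S}
print("deterministic policy:", policy, " value:", pulp.value(prob.objective))
```

Without the binary indicators, the same LP can return policies whose optimal realization is randomized; the indicator constraints are what restrict the search to stationary deterministic policies at the cost of making the program a MIP.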