Efficient Solution Algorithms for Factored MDPs (2002)

Carlos Guestrin, Daphne Koller, Ronald Parr, and Shobha Venkataraman

Abstract -- Markov Decision Processes (MDPs) provide a coherent mathematical framework for planning under uncertainty. In this framework, the system is modeled via a set of states which evolve stochastically. However, in virtually any real-life domain, the state space is too large for such a representation to be feasible. Fortunately, many large MDPs have significant internal structure, and can be modeled compactly by a factored MDP. Here, a state is implicitly described by an assignment to some set of state variables. A dynamic Bayesian network (DBN)\/ can then allow for a compact representation of the transition model. Although, a factored MDP can represent a planning problem compactly, finding the exact solution may still be intractable. In this paper, we propose three new algorithms for finding linear approximate solution to factored MDPs, where we use a linear combination of basis functions as a compact approximation to the value function. These algorithms can be divided into two approaches: approximate dynamic programming and approximate linear programming. In terms of approximate dynamic programming, almost all of the previous algorithms use an approximation based on the (weighted) L_2-norm (Euclidean distance); this approach prevents the application of standard convergence results for MDP algorithms, all of which are based on max-norm. Here, this paper makes two contributions. First, it presents the first approximate MDP solution algorithms - both value and policy iteration - that use max-norm projection, thereby directly optimizing the quantity required to obtain the best error bounds. Second, it shows how these algorithms can be applied efficiently in the context of factored MDPs. In terms of approximate linear programming, we propose a new algorithm, which also exploits the structure in factored MDPs. This new algorithm builds on the basic ideas of our approximate dynamic programming approaches, but it is simpler, more efficient and more general than the previous methods. Our experimental results demonstrate that our algorithms scale well to problems with over 10^{15} states.

download information

Carlos Guestrin, Daphne Koller, Ronald Parr, and Shobha Venkataraman (2002). "Efficient Solution Algorithms for Factored MDPs." Accepted in Journal of Artificial Intelligence Research (JAIR).   ps  

bibtex citation

@article{Guestrin+al:2002f,
   author = {Carlos Guestrin and Daphne Koller and Ronald Parr and Shobha Venkataraman},
   title = {Efficient Solution Algorithms for Factored {MDP}s},
   journal = {Accepted in Journal of Artificial Intelligence Research (JAIR)},
   year = {2002},
   volume = {},
   number = {},
   pages = {},
   month = {},
}