Algorithm-Directed Exploration for Model-Based Reinforcement Learning in Factored MDPs (2002)

Carlos Guestrin, Relu Patrascu, and Dale Schuurmans

Abstract -- One of the central challenges in reinforcement learning is to balance the exploration/exploitation tradeoff while scaling up to large problems. Although model-based reinforcement learning has been less prominent than value-based methods in addressing these challenges, recent progress has generated renewed interest in pursuing model-based approaches: theoretical work on the exploration/exploitation tradeoff has yielded provably sound model-based algorithms such as E³ and R-max, while work on factored MDP representations has yielded model-based algorithms that can scale up to large problems. Recently, the benefits of both achievements have been combined in the Factored E³ algorithm of Kearns and Koller. In this paper, we address a significant shortcoming of Factored E³, namely that it requires an oracle planner that cannot be feasibly implemented. We propose an alternative approach that uses a practical approximate planner, approximate linear programming, that maintains desirable properties. Further, we develop an exploration strategy that is targeted toward improving the performance of the linear programming algorithm, rather than an oracle planner. This leads to a simple exploration strategy that visits states relevant to tightening the LP solution, and achieves sample efficiency logarithmic in the size of the problem description. Our experimental results show that the targeted approach performs better than using approximate planning for implementing either Factored E³ or Factored R-max.

download information

Carlos Guestrin, Relu Patrascu, and Dale Schuurmans (2002). "Algorithm-Directed Exploration for Model-Based Reinforcement Learning in Factored MDPs." The Nineteenth International Conference on Machine Learning (ICML-2002) (pp. 235-242).

bibtex citation

@inproceedings{Guestrin+al:2002c,
   author = {Carlos Guestrin and Relu Patrascu and Dale Schuurmans},
   title = {Algorithm-Directed Exploration for Model-Based Reinforcement Learning in Factored {MDP}s},
   year = {2002},
   month = {July},
   booktitle = {The Nineteenth International Conference on Machine Learning (ICML-2002)},
   address = {Sydney, Australia},
   pages = {235--242},
}