R-MAX - A General Polynomial Time Algorithm for Near-Optimal Reinforcement Learning
Ronen I. Brafman and Moshe Tennenholtz


Send Feedback