Jun 12, 2019 · We look at model-based approaches to Reinforcement Learning.We discuss State-value and State-action value functions, Model-based iterative policy evaluation, and improvement, MDP R examples of moving a pawn, how the discount factor, gamma, “works” and an R example illustrating how the discount factor and relative rewards affect policy. Current development includes MDPs, POMDPs and related algorithms. This toolbox was originally developed taking inspiration from the Matlab MDPToolbox, which you can find here, and from the pomdp-solve software written by A. R. Cassandra, which you can find here. If you use this toolbox for research, please consider citing our JMLR article: The list of algorithms that have been implemented includes backwards induction, linear programming, policy iteration, q-learning and value iteration along with several variations. The classes and functions were developped based on the MATLAB MDP toolbox by the Biometry and Artificial Intelligence Unit of INRA Toulouse (France).