mcres/rl-mdatos

rl-mdatos

This repository contains my final project for the Data Mining subject (Minería de Datos in Spanish, hence mdatos), taught in the Master's Degree in Systems and Control Engineering at UNED (Universidad Nacional de Educación a Distancia) and UCM (Universidad Complutense de Madrid), in Spain.

It implements several tabular Reinforcement Learning algorithms and applies them to OpenAI Gym environments. The following table shows which algorithm is implemented for each environment:

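| Environment    | Sarsa | Q-Learning | n-step Sarsa | Dyna-Q |
|----------------|-------|------------|--------------|--------|
| NChain-v0      | ✔️    | ✔️         | ✔️           | ✔️     |
| FrozenLake-v0  | ✔️    | ✔️         | ✔️           | ✔️     |
| CartPole-v0    | ✔️    | ✔️         | ✔️           | ✖️     |
| MountainCar-v0 | ✔️    | ✔️         | ✔️           | ✖️     |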
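
For readers new to these methods, the difference between Sarsa and Q-Learning comes down to the target each one bootstraps from. The block below is a minimal sketch of the two textbook update rules, not the code used in this repo: the function names, the list-of-lists Q-table, and the values of `alpha` and `gamma` are assumptions made for illustration, and terminal-state handling is left out.

```python
# Textbook tabular TD updates (illustrative sketch, not this repo's code).
# alpha is the step size and gamma the discount factor; both values are assumed.

def sarsa_update(q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.9):
    """On-policy update: bootstrap from the action actually chosen in s_next."""
    target = r + gamma * q[s_next][a_next]
    q[s][a] += alpha * (target - q[s][a])


def q_learning_update(q, s, a, r, s_next, alpha=0.1, gamma=0.9):
    """Off-policy update: bootstrap from the greedy action in s_next."""
    target = r + gamma * max(q[s_next])
    q[s][a] += alpha * (target - q[s][a])


# Toy Q-tables with 2 states and 2 actions, indexed as q[state][action].
q_sarsa = [[0.0, 0.0], [1.0, 2.0]]
q_qlearn = [[0.0, 0.0], [1.0, 2.0]]

# Same transition for both agents: state 0, action 0, reward 1, next state 1.
# Sarsa's next action is 0 (value 1.0); Q-Learning uses the max (value 2.0).
sarsa_update(q_sarsa, s=0, a=0, r=1.0, s_next=1, a_next=0)
q_learning_update(q_qlearn, s=0, a=0, r=1.0, s_next=1)

print(q_sarsa[0][0], q_qlearn[0][0])
```

With the same transition, the Q-Learning entry ends up larger because its target uses the best next action rather than the one that was actually taken.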

The goal of this repo is purely educational:

  • A Jupyter Notebook, written in Spanish, that uses this repo to give basic explanations of RL concepts can be found here.
  • The bibliography I used is probably the most common entry point if you want to learn Reinforcement Learning.

How to use this repo

In order to train and evaluate the agents in this repo, follow these steps:

Create and activate a virtual environment:

$ cd rl-mdatos
$ virtualenv .venv
$ source .venv/bin/activate

Install the required packages:

$ (.venv) pip install -r requirements.txt

Install this very repo in editable mode:

$ (.venv) pip install -e .

Go to the desired environment. For each environment, there's a script to train, execute and/or record a specific algorithm:

$ (.venv) cd rl_mdatos/envs/desired_env

To train a Q-Learning agent in CartPole-v0:

$ (.venv) python cp_q_learning.py --train

To execute the trained agent:

$ (.venv) python cp_q_learning.py --run

To record the execution (this only works for CartPole-v0 and MountainCar-v0):

$ (.venv) python cp_q_learning.py --run --record
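
The `--train`, `--run` and `--record` flags shown above could be handled with `argparse` from the standard library. This is only a sketch of one way to do it: how the actual scripts parse their arguments is not shown here, and `build_parser` and the help strings are hypothetical.

```python
import argparse


def build_parser():
    """Build a parser for the three flags used in the commands above.

    Hypothetical helper: the real scripts may parse arguments differently.
    """
    parser = argparse.ArgumentParser(
        description="Train, run or record a tabular RL agent (illustrative)."
    )
    parser.add_argument("--train", action="store_true", help="train the agent")
    parser.add_argument("--run", action="store_true", help="run a trained agent")
    parser.add_argument("--record", action="store_true", help="record the run")
    return parser


# Parse an explicit argument list instead of sys.argv so the example is
# reproducible; this mirrors `python cp_q_learning.py --run --record`.
args = build_parser().parse_args(["--run", "--record"])
print(args.train, args.run, args.record)
```

Using `store_true` flags lets them be combined freely, which matches the `--run --record` usage above.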

Three types of files are stored in rl-mdatos/data:

  • logs: data generated during training, which can be visualized with TensorBoard (tensorboard --logdir data/...).
  • trained_agents: files with the final parameters of the trained agents, which are loaded at execution time.
  • videos: videos of the recorded episodes.

Output

After successfully training the agents, these should be the results.

NChain-v0

INFO:root:Running Q-Learning agent
INFO:root:Episode 1
INFO:root:Total reward: 9960
INFO:root:Mean reward: 9.96
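
The two figures in this log are related by simple arithmetic. The sketch below assumes that the mean reward is the total reward divided by the number of steps in the episode; that definition is an assumption, since the log itself does not state it.

```python
# Values copied from the log above.
total_reward = 9960
mean_reward = 9.96

# Assumption: mean reward = total reward / number of steps in the episode.
steps = round(total_reward / mean_reward)
print(steps)
```

Under that assumption the episode lasted 1000 steps, or roughly 10 reward units per step on average.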

FrozenLake-v0

CartPole-v0

MountainCar-v0

Bibliography

[1] Richard S. Sutton and Andrew G. Barto. Reinforcement Learning: An Introduction. MIT Press, 2018.

[2] David Silver. Lectures on Reinforcement Learning. 2015. URL: https://www.davidsilver.uk/teaching/

[3] Stuart J. Russell and Peter Norvig. Artificial Intelligence: A Modern Approach, Third International Edition. Pearson Education, London, 2010.
