This Repository Is a Reinforcement Learning Agent Framework

This repository provides a simple demonstration framework for people studying deep reinforcement learning.

The framework is built on TensorFlow. The basic models are implemented in the example_model directory. If you want to use your own model, refer to the models provided in the example_model directory.

We provide tutorials for training an agent in each environment. The tutorials are organized by action type and input shape as follows.

Environment

Continuous Action MLP - BipedalWalker, Pendulum
Discrete Action MLP - LunarLander
Discrete Action CNN - Breakout

Algorithms

Continuous Action MLP - DDPG, TD3, PPO, PPO2
Discrete Action MLP - Vanilla PG, A2C, PPO, DQN, QRDQN, IQN
Discrete Action CNN - Vanilla PG, A2C, PPO, DQN, QRDQN, IQN
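
The two lists above can be read as one lookup table. The sketch below restates them as a Python dictionary so that the algorithms listed for an environment's category can be looked up by name. The dictionary layout and the helper name `algorithms_for` are illustrative only and are not part of this repository.

```python
# Tutorial coverage as listed above, keyed by (action type, network type).
# The layout and the helper below are illustrative, not repository code.
TUTORIALS = {
    ("continuous", "MLP"): {
        "environments": ["BipedalWalker", "Pendulum"],
        "algorithms": ["DDPG", "TD3", "PPO", "PPO2"],
    },
    ("discrete", "MLP"): {
        "environments": ["LunarLander"],
        "algorithms": ["Vanilla PG", "A2C", "PPO", "DQN", "QRDQN", "IQN"],
    },
    ("discrete", "CNN"): {
        "environments": ["Breakout"],
        "algorithms": ["Vanilla PG", "A2C", "PPO", "DQN", "QRDQN", "IQN"],
    },
}


def algorithms_for(environment):
    """Return the algorithms listed for the category containing `environment`."""
    wanted = environment.lower()
    for entry in TUTORIALS.values():
        if wanted in (name.lower() for name in entry["environments"]):
            return list(entry["algorithms"])
    raise KeyError(f"no tutorial listed for environment {environment!r}")


if __name__ == "__main__":
    print(algorithms_for("LunarLander"))
```

Note that the table describes categories. Not every algorithm in a category necessarily has a ready-made script for every environment in it.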

The tutorials run in the Gym environments provided by OpenAI, so you need to install OpenAI Gym and Box2D to run the tutorial code.
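
As a quick way to see what interacting with a Gym environment looks like, here is a minimal sketch of a random-action episode loop. It assumes the classic Gym API, in which `reset()` returns an observation and `step()` returns `(observation, reward, done, info)`. Newer Gym releases changed these signatures. The function names `run_random_episode` and `demo` are hypothetical and do not come from this repository.

```python
def run_random_episode(env, max_steps=200):
    """Roll out one episode with random actions and return the total reward.

    Assumes the classic Gym API: reset() returns an observation and
    step() returns (observation, reward, done, info).
    """
    env.reset()
    total_reward = 0.0
    for _ in range(max_steps):
        action = env.action_space.sample()  # random action, no learning
        _, reward, done, _ = env.step(action)
        total_reward += float(reward)
        if done:
            break
    return total_reward


def demo():
    """Run one random episode on Pendulum-v0 (requires gym to be installed)."""
    import gym  # imported here so the helper above works without gym

    env = gym.make("Pendulum-v0")
    print("random-policy return:", run_random_episode(env))
    env.close()
```

Calling `demo()` after installing the requirements should print a single return value. A trained agent replaces `env.action_space.sample()` with its own action selection.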

Installation

From the git repository

git clone https://github.com/RLOpensource/tensorflow_RL
cd tensorflow_RL
pip install .

CPU version

pip install tensorflow-rl[tf-cpu]

GPU version

pip install tensorflow-rl[tf-gpu]

If you install the package with only

pip install tensorflow-rl

TensorFlow is not installed.

Requirements

tensorflow
box2d
gym
numpy
tensorboardX
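
To confirm that the requirements above are available before running a tutorial script, a small check like the following may help. It is a sketch that uses only the standard library. The import names are assumptions: `Box2D` is assumed to be the module name installed by the box2d package, and the others are assumed to match their package names.

```python
import importlib.util

# Import names for the packages listed above (assumed, see the note above).
REQUIRED_MODULES = ["tensorflow", "Box2D", "gym", "numpy", "tensorboardX"]


def missing_modules(modules=REQUIRED_MODULES):
    """Return the module names that cannot be located for import."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All requirements are importable.")
```

If `tensorflow` is reported as missing after a plain `pip install tensorflow-rl`, install one of the CPU or GPU variants described in the Installation section.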

Implemented

Demonstration

1. Continuous Action BipedalWalker

Script : bipedalwalker_td3.py, bipedalwalker_ddpg.py, bipedalwalker_ppo.py, bipedalwalker_ppo2.py
Environment : BipedalWalker-v2
Orange : td3, Blue: ddpg, SkyBlue: ppo, Pink: ppo2
Episode : 600
Image : td3

BipedalWalker

2. Continuous Action Pendulum

Script : pendulum_td3.py, pendulum_ddpg.py
Environment : Pendulum-v0
Orange : ddpg, Blue: td3
Episode : 300
Image : td3

Pendulum

3. Discrete Action CNN Breakout

Script : breakout_rollout_a2c.py, breakout_rollout_ppo.py, breakout_rollout_vpg.py
Environment : BreakoutDeterministic-v4 with Multi-processing
Blue : ppo, Orange : a2c, Red : vpg
Episode : 600
Image : PPO

Breakout

4. Discrete Action MLP LunarLander

Script : lunarLander_rollout_a2c.py, lunarLander_rollout_ppo.py, lunarLander_rollout_vpg.py
Environment : LunarLander-v2 with Multi-processing
Blue : ppo, Orange : a2c, Red : vpg
Episode : 350
Image : PPO

LunarLander

5. Value Based Reinforcement Learning with CNN

Script : breakout_value_dqn.py, breakout_value_qrdqn.py, breakout_value_iqn.py
Environment : BreakoutDeterministic-v4 with Multi-processing
Green : IQN, Blue : QRDQN, Pink : DQN
Episode : 280
Image : IQN

Breakout

6. Value Based Reinforcement Learning with MLP

Script : lunarLander_value_dqn.py, lunarLander_value_qrdqn.py, lunarLander_value_iqn.py
Environment : LunarLander-v2 with Multi-processing
Orange : IQN, Blue : QRDQN, Red : DQN
Episode : 250
Image : IQN

LunarLander

7. Discrete Action CNN LSTM Breakout, inspired by DRQN

Script : breakout_rollout_ppo_1stack_lstm.py, breakout_rollout_ppo_1stack.py
Environment : BreakoutDeterministic-v4 with Multi-processing
Orange : PPOLSTM, Blue : PPO-1stack
Episode : 1000
Image : PPOLSTM

Breakout

Member

License

We do not hold the copyright to this repository.

Please feel free to use this code. We only ask that you reference the repository URL in some form.

MIT License

Reference

[1] mario_rl

[2] Proximal Policy Optimization

[3] Efficient Parallel Methods for Deep Reinforcement Learning

[4] High-Dimensional Continuous Control Using Generalized Advantage Estimation

[5] Asynchronous Methods for Deep Reinforcement Learning

[6] Continuous Control With Deep Reinforcement Learning

[7] Vanilla Policy Gradient

[8] Deep Recurrent Q-Learning for Partially Observable MDPs

[9] Playing Atari with Deep Reinforcement Learning

[10] Distributional Reinforcement Learning with Quantile Regression

[11] Implicit Quantile Networks for Distributional Reinforcement Learning

[12] OpenAI Spinningup

[13] Reinforcement Learning Korea PG Travel

[14] Medipixel Reinforcement Learning Repository

Please fork this repository and contribute to strengthen the TensorFlow reinforcement learning ecosystem.

Support us in any form. Thank you.

Contact us at chagmgang@gmail.com
