PG Agents: Policy Gradient Algorithms with TensorFlow

The idea behind pg_agents is to provide an easy-to-understand Python package containing state-of-the-art policy gradient algorithms.

Implemented algorithms

VPG: Vanilla Policy Gradient. Also known as REINFORCE.
TNPG: Truncated Natural Policy Gradient. Reformulates the batch RL problem as a constrained optimization problem.
TRPO: Trust Region Policy Optimization. Extension of TNPG to ensure robustness.
GAE: Generalized Advantage Estimator. Method to estimate the advantage function from experience; helps reduce the variance of the gradient estimator.
PPO: Proximal Policy Optimization. Simple but efficient extension of VPG.
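
To make two of the entries above more concrete, here is a minimal NumPy sketch of the GAE recursion and the PPO clipped surrogate objective. It is an illustration only, not the pg_agents API: the function names `compute_gae` and `ppo_clip_objective`, their parameters, and the default values are hypothetical, and the GAE function assumes a single trajectory with no terminal state inside it.

```python
import numpy as np


def compute_gae(rewards, values, last_value, gamma=0.99, lam=0.95):
    """Generalized Advantage Estimation for one trajectory (hypothetical helper).

    rewards[t] and values[t] are the reward and value estimate at step t;
    last_value is the value estimate of the state reached after the final step
    (use 0.0 if the trajectory ended in a terminal state).
    """
    rewards = np.asarray(rewards, dtype=np.float64)
    values = np.asarray(values, dtype=np.float64)
    next_values = np.append(values[1:], last_value)

    # One-step TD residuals: delta_t = r_t + gamma * V(s_{t+1}) - V(s_t)
    deltas = rewards + gamma * next_values - values

    # Discounted sum of residuals, accumulated backwards in time:
    # A_t = delta_t + gamma * lam * A_{t+1}
    advantages = np.zeros_like(deltas)
    running = 0.0
    for t in reversed(range(len(deltas))):
        running = deltas[t] + gamma * lam * running
        advantages[t] = running
    return advantages


def ppo_clip_objective(ratio, advantages, clip_eps=0.2):
    """PPO clipped surrogate objective, to be maximized (hypothetical helper).

    ratio[t] is pi_new(a_t | s_t) / pi_old(a_t | s_t).
    """
    ratio = np.asarray(ratio, dtype=np.float64)
    advantages = np.asarray(advantages, dtype=np.float64)
    clipped = np.clip(ratio, 1.0 - clip_eps, 1.0 + clip_eps)
    # Take the pessimistic (smaller) of the unclipped and clipped terms.
    return np.mean(np.minimum(ratio * advantages, clipped * advantages))
```

With `lam=0` the estimator reduces to the one-step TD residual, and with `lam=1` it becomes the discounted return minus the value baseline; intermediate values trade bias against variance.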
