tensorflow-policy-gradient

Still under construction...

Dependencies

  • Python 2.7
  • TensorFlow >= 0.8.0
  • NumPy >= 1.10.0
  • OpenAI Gym
  • matplotlib

Quick try

Run

python gym_experiment.py

to train a softmax policy (without bias) using vanilla policy gradient on the CartPole task. The return should rise noisily over training until it reaches the maximum (200).
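For readers who want a rough idea of what "vanilla policy gradient with a bias-free softmax policy" means, here is a minimal sketch. It is not the code in gym_experiment.py: it uses plain NumPy instead of TensorFlow, and the function names (action_probs, discounted_returns, policy_gradient), the discount factor gamma, and the use of discounted returns-to-go are all assumptions made for illustration.

```python
import numpy as np

def softmax(logits):
    # Subtract the max before exponentiating for numerical stability.
    e = np.exp(logits - np.max(logits))
    return e / e.sum()

def action_probs(W, state):
    # Linear softmax policy without a bias term: logits = state . W
    # W has shape (state_dim, num_actions).
    return softmax(state.dot(W))

def discounted_returns(rewards, gamma=0.99):
    # G_t = r_t + gamma * G_{t+1}, accumulated from the end of the episode.
    returns = np.zeros(len(rewards))
    running = 0.0
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns

def policy_gradient(W, states, actions, rewards, gamma=0.99):
    # REINFORCE estimate for one episode:
    #   sum_t G_t * grad_W log pi(a_t | s_t)
    # For a linear softmax policy, grad_W log pi(a | s) is
    #   outer(s, one_hot(a) - pi(. | s)).
    grad = np.zeros_like(W)
    returns = discounted_returns(rewards, gamma)
    for s, a, g in zip(states, actions, returns):
        p = action_probs(W, s)
        one_hot = np.zeros_like(p)
        one_hot[a] = 1.0
        grad += g * np.outer(s, one_hot - p)
    return grad
```

After each episode, a gradient ascent step would look like W += learning_rate * policy_gradient(W, states, actions, rewards), where learning_rate is a value you choose. For CartPole, W would have shape (4, 2): four state features and two actions.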