Pendulum_PPO

Quickstart

Run this command to use the pretrained model to play the game

>python pendulum.py play

Or run this command, with any argument other than play, to train the model

>python pendulum.py anything-(not-play)
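
As a rough illustration of the two modes above, the argument handling could look something like the sketch below. This is an assumption about how such a script might dispatch, not a copy of what pendulum.py actually does; select_mode is a hypothetical helper name, and treating a missing argument as "train" is also an assumption.

```python
import sys


def select_mode(argv):
    """Return 'play' if the first CLI argument is exactly 'play', else 'train'.

    Hypothetical helper for illustration only; pendulum.py may parse
    its arguments differently.
    """
    # argv[0] is the script name, so the mode (if given) is argv[1].
    if len(argv) > 1 and argv[1] == "play":
        return "play"
    # Assumption: anything else, including no argument at all, means training.
    return "train"


if __name__ == "__main__":
    print(select_mode(sys.argv))
```

With this sketch, "python pendulum.py play" selects play mode and any other argument selects training.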

The model in pendulum.py was able to solve Pendulum-v0 after about 110 episodes

Total rewards over 140 steps of training:

You're free to edit the model hyperparameters and some of the constants to improve it

Special thanks to Morvan Zhou for his explanation of PPO