A minimalistic and friendly implementation of DDPG using PyTorch.
https://spinningup.openai.com/en/latest/algorithms/ddpg.html
TODO:
- simplify buffer
- requirements
- figure out how to switch between different variants more easily
- important features and variants
- hyperparameters
- training curves and standard errors
- action wrapper
- some design choices and justifications
Credit:
- I checked my code against https://github.com/Pechckin/MountainCar and borrowed its replay buffer implementation, which samples faster.
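
For readers wondering why the buffer layout affects sampling speed: the borrowed code is not reproduced here, so the following is only a minimal sketch of one common approach, not the actual implementation. It assumes a preallocated list used as a ring buffer, where each sampled index is an O(1) lookup (random indexing into a `collections.deque` is O(n) in the middle). The class name `RingReplayBuffer` and its methods are hypothetical.

```python
import random


class RingReplayBuffer:
    """Hypothetical fixed-capacity replay buffer (illustration only)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.storage = [None] * capacity  # preallocated slots
        self.position = 0                 # next slot to write
        self.size = 0                     # number of valid transitions

    def push(self, state, action, reward, next_state, done):
        # Once the buffer is full, overwrite the oldest transition.
        self.storage[self.position] = (state, action, reward, next_state, done)
        self.position = (self.position + 1) % self.capacity
        self.size = min(self.size + 1, self.capacity)

    def sample(self, batch_size):
        # Draw distinct indices uniformly; each list lookup is O(1).
        indices = random.sample(range(self.size), batch_size)
        batch = [self.storage[i] for i in indices]
        # Transpose into (states, actions, rewards, next_states, dones).
        return tuple(zip(*batch))

    def __len__(self):
        return self.size
```

In a training loop, the sampled tuples would typically be converted to tensors (for example with `torch.as_tensor`) before the critic and actor updates.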
Leaderboard performance: https://github.com/openai/gym/wiki/Leaderboard#pendulum-v0