Vanilla REINFORCE implementation #200

Open
alek5k opened this issue May 8, 2019 · 2 comments

Comments

@alek5k
Contributor

alek5k commented May 8, 2019

Hello,

Would there be any benefit to including a vanilla REINFORCE implementation for people trying to learn the concepts? REINFORCE with Baseline includes a value function approximator, which makes it very similar to Actor-Critic.

I think being able to see a pure policy gradient method could be useful as a learning tool. Otherwise, people may assume that policy gradient methods must include some form of value function approximation.
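
For reference, here is a minimal sketch of what a baseline-free REINFORCE could look like. It is not code from this repository. It assumes a tabular softmax policy and a tiny made-up chain environment (4 states, actions left/right, reward 1 on reaching the right end), and every name in it (`softmax`, `returns_to_go`, `reinforce_update`, `run_episode`, `train`) is hypothetical. The point is only that the update uses the sampled Monte Carlo return directly, with no value function anywhere.

```python
import math
import random


def softmax(prefs):
    """Numerically stable softmax over a list of action preferences."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def returns_to_go(rewards, gamma):
    """Discounted return G_t from each time step to the end of the episode."""
    g, out = 0.0, []
    for r in reversed(rewards):
        g = r + gamma * g
        out.append(g)
    out.reverse()
    return out


def reinforce_update(theta, episode, alpha, gamma):
    """One vanilla REINFORCE pass over an episode of (state, action, reward).

    Per step: theta += alpha * gamma^t * G_t * grad log pi(a_t | s_t).
    For a tabular softmax policy, the gradient of log pi with respect to
    the preference of action b is (1 if b == a_t else 0) - pi(b | s_t).
    No baseline and no value function are used.
    """
    returns = returns_to_go([r for _, _, r in episode], gamma)
    for t, ((s, a, _), g) in enumerate(zip(episode, returns)):
        probs = softmax(theta[s])
        for b in range(len(theta[s])):
            grad_log = (1.0 if b == a else 0.0) - probs[b]
            theta[s][b] += alpha * (gamma ** t) * g * grad_log


def run_episode(theta, rng, n_states=4, max_steps=50):
    """Sample one episode in the assumed chain environment.

    Action 0 moves left (stays put in state 0), action 1 moves right.
    Reaching state n_states ends the episode with reward 1.
    """
    s, episode = 0, []
    for _ in range(max_steps):
        probs = softmax(theta[s])
        a = 0 if rng.random() < probs[0] else 1
        s_next = max(0, s - 1) if a == 0 else s + 1
        done = s_next == n_states
        episode.append((s, a, 1.0 if done else 0.0))
        if done:
            break
        s = s_next
    return episode


def train(n_episodes=2000, alpha=0.1, gamma=0.9, seed=0):
    """Train the tabular policy; returns the learned preferences."""
    rng = random.Random(seed)
    theta = [[0.0, 0.0] for _ in range(4)]
    for _ in range(n_episodes):
        reinforce_update(theta, run_episode(theta, rng), alpha, gamma)
    return theta


if __name__ == "__main__":
    learned = train()
    for state, prefs in enumerate(learned):
        print(state, [round(p, 3) for p in softmax(prefs)])
```

Because each update is scaled by a single sampled return, the step sizes vary a lot from episode to episode. REINFORCE with Baseline subtracts a learned state-value estimate from `g` to reduce that variance, which is where the value function approximator comes in.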

@makaveli10

Look at this if you want to see the high-variance results of vanilla REINFORCE.

@vieveks

vieveks commented Feb 7, 2023

Can I implement vanilla REINFORCE?
