[question] HER and prioritized experience replay #751

Closed
johannes-dornheim opened this issue Mar 20, 2020 · 4 comments
Labels
enhancement New feature or request question Further information is requested v3 Discussion about V3

Comments

@johannes-dornheim

Hi

In the stable-baselines implementation, HER does not support a prioritized replay buffer. The HER paper states: "Prioritized experience replay (....) is orthogonal to our work and both approaches can be easily combined". So my question is: are there deeper reasons for the lack of support, or is it simply a feature that has not been implemented yet?

Best Regards,
Johannes

@johannes-dornheim johannes-dornheim changed the title HER and prioritized experience replay [question] HER and prioritized experience replay Mar 20, 2020
@Miffyli
Collaborator

Miffyli commented Mar 20, 2020

I believe there is no deeper reason for the lack of support, other than that nobody has implemented it. It would require coming up with priorities for the samples in the buffer, and then updating replay_buffer.py in HER. I am not familiar enough with HER to know how easy this would be. At first glance it does not sound as straightforward as with DQN.
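To make the "priorities for the samples in the buffer" point concrete, here is a minimal sketch of hindsight relabeling with the "future" strategy, followed by the usual PER convention of giving new entries the current maximum priority. It is not the stable-baselines code: `sparse_reward`, `relabel_episode`, the tolerance `tol`, and the toy trajectory are all hypothetical names and values chosen for illustration.

```python
import numpy as np


def sparse_reward(achieved_goal, desired_goal, tol=0.05):
    """Hypothetical sparse reward: 0 when the goal is reached, -1 otherwise."""
    return 0.0 if np.linalg.norm(achieved_goal - desired_goal) <= tol else -1.0


def relabel_episode(achieved_goals, rng, k=4):
    """Relabel each transition with k goals achieved later in the same episode.

    achieved_goals[t + 1] is the goal achieved after the action taken at step t,
    so an episode with T transitions has T + 1 entries.
    """
    relabeled = []
    horizon = len(achieved_goals)
    for t in range(horizon - 1):
        for _ in range(k):
            future = rng.integers(t + 1, horizon)  # index in [t + 1, horizon)
            new_goal = achieved_goals[future]
            reward = sparse_reward(achieved_goals[t + 1], new_goal)
            relabeled.append((t, new_goal, reward))
    return relabeled


rng = np.random.default_rng(0)
achieved = np.linspace(0.0, 1.0, 6).reshape(-1, 1)  # toy 1-D trajectory (assumption)
transitions = relabel_episode(achieved, rng, k=2)

# Assumption: as in standard PER, new entries start at the current maximum
# priority (1.0 for an empty buffer) so each is sampled at least once.
priorities = np.full(len(transitions), 1.0)
```

The open design question is what the priority of a relabeled transition should be after its first update, since its TD error depends on the substituted goal rather than on the goal the agent was actually pursuing.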

@Miffyli Miffyli added the question Further information is requested label Mar 20, 2020
@RyanRizzo96

RyanRizzo96 commented Mar 27, 2020

Actually, PER has been shown not to improve performance over HER, so there is no real motivation to implement it. Not only does PER not improve performance, it also increases computation time substantially.

PER prioritises transitions with a higher TD error, so the TD error must be computed for each transition, which is where the extra computational cost comes from.
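As a rough illustration of the mechanism, the sketch below implements the proportional variant of PER with NumPy: priorities are derived from absolute TD errors, and importance-sampling weights correct for the non-uniform sampling. The function names and the values of `alpha`, `beta`, and `eps` are assumptions for this example, not taken from stable-baselines.

```python
import numpy as np


def per_probabilities(td_errors, alpha=0.6, eps=1e-6):
    """Turn absolute TD errors into sampling probabilities (proportional variant)."""
    # eps keeps transitions with zero TD error sampleable.
    priorities = (np.abs(td_errors) + eps) ** alpha
    return priorities / priorities.sum()


def importance_weights(probs, indices, beta=0.4):
    """Importance-sampling weights, normalised by the largest weight in the buffer."""
    n = len(probs)
    weights = (n * probs[indices]) ** (-beta)
    max_weight = (n * probs.min()) ** (-beta)  # weight of the least likely transition
    return weights / max_weight


rng = np.random.default_rng(0)
td_errors = np.array([0.1, 2.0, 0.5, 0.0])  # toy values (assumption)
probs = per_probabilities(td_errors)
batch_idx = rng.choice(len(probs), size=2, p=probs)
weights = importance_weights(probs, batch_idx)
```

This naive version recomputes all probabilities in O(N) per call; practical implementations usually keep the priorities in a sum tree, and the priorities still have to be refreshed with new TD errors after every update, which is the overhead described above.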

Prioritised Sequence Experience Replay (PSER) outperforms PER but has not been implemented with HER.

There are other methods which improve the sampling efficiency of HER (such as Energy Based Prioritisation), but PER is not one of them. I have put my name forward to implement this in the new PyTorch version.

@Miffyli
Collaborator

Miffyli commented Mar 27, 2020

Ok, thanks for the info! Indeed any such new features would be things for the PyTorch version :).

@araffin
Collaborator

araffin commented May 9, 2020

I added it to possible features for Stable-Baselines3 1.1+ in DLR-RM/stable-baselines3#1

@araffin araffin closed this as completed May 9, 2020
@araffin araffin added enhancement New feature or request v3 Discussion about V3 labels May 9, 2020