How many training steps used to obtain the pre-trained model? #70

xinghua-qu · 2020-03-19T00:41:00Z

Is there any documentation stating how many training steps were used to obtain the pre-trained models? Some pre-trained models seem to score far below the state of the art. For instance, the DQN models on BeamRider and Qbert only achieve rewards of 948.0 and 550.0, respectively. However, with other algorithms (e.g., PPO2 and ACKTR), such reward values can be 10,000+.
It would be better if you could provide pre-trained models that serve as a trustworthy baseline for benchmarking.

araffin · 2020-03-19T07:38:45Z

Duplicate of #38.

araffin closed this as completed Mar 19, 2020

araffin reopened this Mar 19, 2020

araffin closed this as completed Mar 19, 2020
