
Training SAC with raw image as input #25

Open
ChunJyeBehBeh opened this issue Mar 7, 2020 · 8 comments
Labels: question (Further information is requested)

Comments

@ChunJyeBehBeh

ChunJyeBehBeh commented Mar 7, 2020

The policies that I have tried are DDPG and SAC. I used the master branch, and below are the two commands to reproduce the error:
python train.py --algo sac -n 5000
python train.py --algo ddpg -n 5000

  • TensorFlow version == 1.15.0
  • stable-baselines == 2.9.0

Thanks for this great repo. It is a very good starting point for learning reinforcement learning in the autonomous driving area. I have successfully trained a SAC model using the VAE as input.

Now I want to try using raw images as input. I have set N_COMMAND_HISTORY to zero, and I am using the master branch. For the first 300 steps, the steering and throttle vary between -1 and 1 because random actions are sampled:
https://github.com/araffin/learning-to-drive-in-5-minutes/blob/fb82bc77593605711289e03f95dcfb6d3ea9e6c3/algos/custom_sac.py#L89
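For context, the linked line implements the usual off-policy warm-up: before `learning_starts` steps, actions are sampled uniformly from the action space instead of coming from the policy. A rough paraphrase (the function and argument names here are illustrative, not the repo's exact code):

```python
def select_action(model, obs, step, learning_starts=300):
    # Warm-up phase: sample uniformly from the action space, so steering
    # and throttle vary randomly in [-1, 1].
    if step < learning_starts:
        return model.env.action_space.sample()
    # Afterwards, the (possibly still untrained) policy picks the action.
    action, _ = model.predict(obs)
    return action
```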

But after that, the policy keeps outputting an extreme value, either -1 or 1, for the steering. So the donkey car goes out of the lane quickly, and this keeps repeating without showing any learning progress.

The image below shows that the episode length drops from 95 to 50 steps after the policy starts to output the actions.
[image: episode length plot]

Below is the plot of the throttle value output [SAC with raw image input]. It stays constant at 1 after a few episodes.
[figure: throttle output, SAC with raw image input]

Below is the plot of the throttle value output [SAC with VAE input]. The model tries to learn how to steer and varies the output between -1 and 1.
[figure: throttle output, SAC with VAE input]

Sorry for opening so many issues.

@araffin
Owner

araffin commented Mar 7, 2020

Hello,
What policy are you using?
Please fill in the issue template completely.

@ChunJyeBehBeh
Author

The policies that I used are DDPG and SAC. I have updated the issue above. Thanks for your reply!

@araffin
Owner

araffin commented Mar 7, 2020

I meant to say "policy architecture". It seems that you are not using a CNN if you are using the default hyperparameters... This explains your results.

araffin added the question (Further information is requested) label on Mar 7, 2020
@ChunJyeBehBeh
Author

ChunJyeBehBeh commented Mar 7, 2020

Yes, I am using the default hyperparameters... May I know which part I should change in order to use raw images to train a SAC model?

In sac.yml, should I change the policy from policy: 'MlpPolicy' to policy: 'CnnPolicy'?

@araffin
Owner

araffin commented Mar 7, 2020

I would recommend that you read the stable-baselines documentation and look at the rl zoo; there are plenty of examples of RL with images.
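For instance, a minimal sketch of training SAC directly on image observations with stable-baselines 2.x (CarRacing-v0 is only a stand-in for the donkey car env here, and the hyperparameter values are illustrative):

```python
import gym

from stable_baselines import SAC
from stable_baselines.sac.policies import CnnPolicy

# Any env with image observations and a continuous action space will do.
env = gym.make('CarRacing-v0')

# CnnPolicy replaces the default MLP with a CNN feature extractor.
model = SAC(CnnPolicy, env, buffer_size=50000, learning_starts=300, verbose=1)
model.learn(total_timesteps=5000)
```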

@ChunJyeBehBeh
Author

ChunJyeBehBeh commented Mar 21, 2020

Hello, I changed the policy to CnnPolicy and increased the layers with policy_kwargs: "dict(layers=[64, 64, 64, 64])". However, I still didn't manage to train the agent with raw image input... Are there any other parameters that I missed?
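One thing worth checking (an assumption based on the stable-baselines custom-policy docs, not verified on this repo): with CnnPolicy, the layers kwarg only sizes the fully-connected layers after the CNN feature extractor, so increasing it does not change the convolutional part. The extractor itself can be swapped via the cnn_extractor policy kwarg, roughly like this:

```python
import gym
import numpy as np
import tensorflow as tf

from stable_baselines import SAC
from stable_baselines.a2c.utils import conv, conv_to_fc, linear
from stable_baselines.sac.policies import CnnPolicy

def custom_cnn(scaled_images, **kwargs):
    # A CNN extractor in the style of the default nature_cnn (illustrative).
    activ = tf.nn.relu
    layer_1 = activ(conv(scaled_images, 'c1', n_filters=32, filter_size=8,
                         stride=4, init_scale=np.sqrt(2), **kwargs))
    layer_2 = activ(conv(layer_1, 'c2', n_filters=64, filter_size=4,
                         stride=2, init_scale=np.sqrt(2), **kwargs))
    flattened = conv_to_fc(layer_2)
    return activ(linear(flattened, 'fc1', n_hidden=256, init_scale=np.sqrt(2)))

env = gym.make('CarRacing-v0')  # stand-in for the donkey car env
model = SAC(CnnPolicy, env, policy_kwargs=dict(cnn_extractor=custom_cnn))
```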

@Adnan-annan

@ChunJyeBehBeh Did you manage to train without the VAE?

@eliork

eliork commented Jan 21, 2021

@ChunJyeBehBeh @Adnan-annan I am also trying to train without the VAE. Did you have any success yet? Would you mind sharing your results and the methods you've tried?
