Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BadZipFile when running PPO2. #109

Closed
lubosz opened this issue Oct 5, 2020 · 5 comments
Closed

BadZipFile when running PPO2. #109

lubosz opened this issue Oct 5, 2020 · 5 comments

Comments

@lubosz
Copy link

lubosz commented Oct 5, 2020

Describe the bug
I can't run anything using the ppo2 algo.
The file that it tries to open trained_agents/ppo2/CartPole-v1.pkl is not a valid zip file.
It produces a zipfile.BadZipFile: File is not a zip file exception.

a2c, acer and acktr seem work, even though the pkl files there are also invalid zip files.
The problem seems to be that it wants to open the pkls as zip file for ppo2.

$ python enjoy.py
WARNING:tensorflow:From /usr/lib/python3.8/site-packages/tensorflow/python/compat/v2_compat.py:96: disable_resource_variables (from tensorflow.python.ops.variable_scope) is deprecated and will be removed in a future version.
Instructions for updating:
non-resource variables are not supported in the long term
Traceback (most recent call last):
  File "stable_baselines/common/base_class.py", line 657, in _load_from_file
    with zipfile.ZipFile(load_path, "r") as file_:
  File "/usr/lib/python3.8/zipfile.py", line 1269, in __init__
    self._RealGetContents()
  File "/usr/lib/python3.8/zipfile.py", line 1336, in _RealGetContents
    raise BadZipFile("File is not a zip file")
zipfile.BadZipFile: File is not a zip file

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "enjoy.py", line 182, in <module>
    main()
  File "enjoy.py", line 101, in main
    model = ALGOS[algo].load(model_path, env=load_env)
  File "stable_baselines/common/base_class.py", line 938, in load
    data, params = cls._load_from_file(load_path, custom_objects=custom_objects)
  File stable_baselines/common/base_class.py", line 689, in _load_from_file
    data, params = BaseRLModel._load_from_file_cloudpickle(load_path)
  File "stable_baselines/common/base_class.py", line 622, in _load_from_file_cloudpickle
    data, params = cloudpickle.load(file_)
TypeError: an integer is required (got type bytes)

System Info
Describe the characteristic of your environment:

  • Describe how stable baselines was installed (pip, docker, source, ...)
    pip install -e . from git, with TF2 compatibility patches
  • GPU models and configuration
    I use CPU
  • Python version
    3.8.5
  • Tensorflow version
    2.3.0
  • Gym version
    0.15.3
@araffin
Copy link
Owner

araffin commented Oct 5, 2020

Hello,
Stable-Baselines does not support TF 2.0+... but issue might come from your python version: see DLR-RM/stable-baselines3#171

The issue seems to come from cloudpickle due to a change in internal python3 API.

It should work with python 3.6 and tensorflow 1.8.0+

@lubosz
Copy link
Author

lubosz commented Oct 5, 2020

Thanks for your reply.

Stable-Baselines does not support TF 2.0+...
Actually you can pretty easily run most TF 1.X code by changing an import:

-import tensorflow as tf
+import tensorflow.compat.v1 as tf

I currently patched stable-baselines, since the original baselines TF2 branch doesn't seem to go anywhere.
I also found that you have your own baselines fork, I will look into it.

but issue might come from your python version: see DLR-RM/stable-baselines3#171

I will try to run it with an older Python version.

The issue seems to come from cloudpickle due to a change in internal python3 API.

It's really weird that only ppo2 and sac seem to fail because of this and I am getting the extra zip exception, in addition to the one from cloudpickle.

It should work with python 3.6 and tensorflow 1.8.0+

@araffin
Copy link
Owner

araffin commented Oct 5, 2020

Actually you can pretty easily run most TF 1.X code by changing an import:

nice, but this probably won't work with the contrib imports.

You might take a look at hill-a/stable-baselines#366 and hill-a/stable-baselines#984

you have your own baselines fork,

?

I am getting the extra zip exception

Probably because the model was saved using the old format: https://github.com/hill-a/stable-baselines/blob/master/stable_baselines/common/base_class.py#L676

See documentation: https://stable-baselines.readthedocs.io/en/master/guide/save_format.html

It's really weird that only ppo2 and sac seem to fail because of this and I am getting the extra zip exception, in addition to the one from cloudpickle.

Look at the linked issue and at DLR-RM/stable-baselines3#172, cloudpickle fail for only some objects like learning rate schedules (defined as lambdas) which are not present in a2c, acer or acktr.

@lubosz
Copy link
Author

lubosz commented Oct 5, 2020

Thanks for the info.

Probably because the model was saved using the old format: https://github.com/hill-a/stable-baselines/blob/master/stable_baselines/common/base_class.py#L676

This is the solution to my issue. Version incompatibility for playing back the recorded plks. I assume that it will work when I record new ones on my setup.

It is very unfortunate that baselines sees this extensive forking without a clear upstream and maintainership.

?

I was describing stable-baselines3 as "your" fork, since you were linking the issue from it.
https://github.com/DLR-RM/stable-baselines3

This looks interesting, unfortunately this is again some fork and not a branch that will get upstreamed:
https://github.com/Stable-Baselines-Team/stable-baselines-tf2

@lubosz lubosz closed this as completed Oct 5, 2020
@araffin
Copy link
Owner

araffin commented Oct 5, 2020

It is very unfortunate that baselines sees this extensive forking without a clear upstream and maintainership.

?

I was describing stable-baselines3 as "your" fork, since you were linking the issue from it.
https://github.com/DLR-RM/stable-baselines3

Yep, it is not a fork but the next version of Stable-Baselines, same API but much cleaner codebase (and so much easier to maintain and extend).
It is now actively developped, v1 of SB3 is coming soon.

This looks interesting, unfortunately this is again some fork and not a branch that will get upstreamed:
https://github.com/Stable-Baselines-Team/stable-baselines-tf2

oh, it's not a fork but an experimental repo that I made while we were thinking about the future of SB (see hill-a/stable-baselines#733)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants