Checkpoint: OverflowError: cannot serialize a string larger than 4GiB #1769

Closed
chnsh opened this issue May 9, 2020 · 2 comments
Labels
help wanted Open to be worked on question Further information is requested won't fix This will not be worked on

Comments

chnsh commented May 9, 2020

🐛 Bug

Model checkpointing fails with `OverflowError: cannot serialize a string larger than 4GiB`, which halts training.

  • PyTorch version: 1.5
  • OS: Linux
  • How PyTorch was installed (conda, pip, source): conda
  • Build command you used (if compiling from source):
  • Python version: 3.7
  • CUDA/cuDNN version: 10.1
  • GPU models and configuration: RTX 2080 Ti
  • Any other relevant information:

Additional context

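This is a known limitation of Python's pickle module: protocols below 4 cannot serialize a single object larger than 4 GiB. See pytorch/pytorch#12085.

Possible fix: set the pickle protocol to 4 or higher when the checkpoint is saved.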
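
To illustrate the protocol fix, here is a minimal sketch using only the standard library. `dump_checkpoint` and `LARGE_OBJECT_PROTOCOL` are hypothetical names made up for this example, not part of PyTorch or Lightning, and the tiny dictionary stands in for a real checkpoint. It only shows how the protocol is selected; it does not reproduce the 4 GiB failure itself.

```python
import io
import pickle

# Protocol 4 (Python 3.4+) added support for objects larger than 4 GiB.
LARGE_OBJECT_PROTOCOL = 4


def dump_checkpoint(state, fileobj, protocol=LARGE_OBJECT_PROTOCOL):
    """Hypothetical helper: pickle `state` with a protocol that allows >4 GiB objects."""
    if protocol < 4:
        # Older protocols use 4-byte length fields, which is where the
        # OverflowError comes from once a single object exceeds 4 GiB.
        raise ValueError("pickle protocols below 4 cannot serialize objects larger than 4 GiB")
    pickle.dump(state, fileobj, protocol=protocol)


# Round trip with a small stand-in for a checkpoint dictionary.
buf = io.BytesIO()
dump_checkpoint({"epoch": 3, "weights": [0.1, 0.2]}, buf)
payload = buf.getvalue()
restored = pickle.loads(payload)
```

When saving directly with PyTorch, the equivalent is the `pickle_protocol` argument of `torch.save`, for example `torch.save(state, path, pickle_protocol=4)`. Whether the checkpoint callback in use exposes that argument is an assumption that would need to be checked against the installed version.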

@chnsh chnsh added bug Something isn't working help wanted Open to be worked on labels May 9, 2020
@Borda Borda changed the title **Checkpoint**: OverflowError: cannot serialize a string larger than 4GiB Checkpoint: OverflowError: cannot serialize a string larger than 4GiB May 11, 2020
Borda (Member) commented Jun 5, 2020

I guess this is a limitation of pickle or of the file system you are using.
Can you specify your development environment?

@Borda Borda added information needed question Further information is requested and removed bug Something isn't working labels Jun 5, 2020
stale bot commented Aug 4, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the won't fix This will not be worked on label Aug 4, 2020
@stale stale bot closed this as completed Aug 13, 2020