FFCV with CIFAR10 #13668

Closed
salelkafrawy opened this issue Jul 15, 2022 · 1 comment
Labels
3rd party Related to a 3rd-party

Comments

salelkafrawy commented Jul 15, 2022

🐛 Bug

ValueError: Tried to step 527 times. The specified number of total steps is 525

A bug surfaced when I used the FFCV loader with the CIFAR10 dataset. I read that FFCV isn't fully supported by PL yet, but as @carmocca mentioned here, we can use FFCV if we remove the ToDevice transformations, with the caveat that bugs could surface. This issue is one of two bugs I've been facing.

I tried FFCV with MosaicML and it worked fine (the code is in the same notebook), but it raised errors when used with PL.
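For context, the wording of the error above matches the check in PyTorch's `torch.optim.lr_scheduler.OneCycleLR`, which raises a `ValueError` once `step()` has been called more often than `total_steps` allows. The snippet below is a minimal sketch of that mechanism only, assuming the notebook uses this scheduler. The model, learning rate, and `TOTAL_STEPS = 10` are placeholders rather than the notebook's settings, and the exact count printed in the message can differ between PyTorch versions.

```python
import torch

# Placeholder model and optimizer; only the scheduler behaviour matters here.
model = torch.nn.Linear(4, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

TOTAL_STEPS = 10  # stand-in for the 525 in the report (assumption)
scheduler = torch.optim.lr_scheduler.OneCycleLR(
    optimizer, max_lr=0.1, total_steps=TOTAL_STEPS
)

error_message = None
# Simulate a loop that runs for more batches than the scheduler was told about.
for _ in range(TOTAL_STEPS + 2):
    optimizer.step()
    try:
        scheduler.step()
    except ValueError as err:
        error_message = str(err)
        break

print(error_message)
```

If this is the scheduler in use, the error means the training loop stepped it for more batches than the number used to compute `total_steps`, so comparing the loader's reported length with the number of batches it actually yields is a reasonable first check.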

To Reproduce

The last part in this Colab has the bug:
https://colab.research.google.com/drive/1DKJfDsDLAGrJLcSB9-kn7G29vc_6EHuU?usp=sharing

Expected behavior

I expected training to run as smoothly as it does with PyTorch's DataLoader.

Environment

  • CUDA:
    - GPU:
        - Quadro RTX 8000
    - available: True
    - version: 11.3
  • Packages:
    - numpy: 1.22.4
    - pyTorch_debug: False
    - pyTorch_version: 1.12.0
    - pytorch-lightning: 1.6.5
    - mosaicml: 0.8.0
    - tqdm: 4.64.0
  • System:
    - OS: Linux
    - architecture:
        - 64bit
        - ELF
    - processor: x86_64
    - python: 3.9.13
    - version: #189-Ubuntu SMP Wed May 18 14:13:57 UTC 2022

Additional context

If you could explain how PL differs from MosaicML in handling data loading, I'd appreciate it.

salelkafrawy added the "needs triage" label (Waiting to be triaged by maintainers) Jul 15, 2022
salelkafrawy changed the title from "FFCV with CIFAR10 bug" to "FFCV with CIFAR10" Jul 15, 2022
@salelkafrawy
Author

I figured out that using ffcv.transforms.NormalizeImage instead of torchvision.transforms.Normalize caused the errors, and now it works with PL.

akihironitta added the "3rd party" label (Related to a 3rd-party) and removed the "needs triage" label (Waiting to be triaged by maintainers) Aug 15, 2022