
amp & channels_last #50

Closed
xoiga123 opened this issue Jul 8, 2022 · 0 comments · Fixed by #68

xoiga123 commented Jul 8, 2022

channels_last:

While PyTorch operators expect all tensors to be in channels-first (NCHW) dimension order, they support 3 output memory formats:

- Contiguous: tensor memory is in the same order as the tensor's dimensions.
- ChannelsLast: irrespective of the dimension order, the 2d (image) tensor is laid out as an HWC or NHWC (N: batch, H: height, W: width, C: channels) tensor in memory. The dimensions could be permuted in any order.
- ChannelsLast3d: the same idea for 3d (video) tensors, laid out as a THWC or NTHWC tensor in memory.
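A minimal sketch of opting in (a bare `Conv2d` stands in for the real model here); converting both the module and the input is what lets the format propagate through supported ops:

```python
import torch
import torch.nn as nn

# A single conv layer is enough to see the memory format propagate.
conv = nn.Conv2d(3, 16, kernel_size=3, padding=1)

# Convert both the module's weights and the input to channels_last (NHWC layout).
conv = conv.to(memory_format=torch.channels_last)
x = torch.randn(8, 3, 32, 32).to(memory_format=torch.channels_last)

y = conv(x)
# The dimension order is still NCHW; only the underlying strides changed.
print(y.shape)                                             # torch.Size([8, 16, 32, 32])
print(y.is_contiguous(memory_format=torch.channels_last))  # True
```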

amp:
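For reference, the standard `torch.cuda.amp` recipe is roughly the following: `autocast` for the forward pass, `GradScaler` to avoid gradient underflow in the half-precision backward pass. The tiny model, loss, and fake loader below are placeholders just so the sketch runs (CUDA device required):

```python
import torch
import torch.nn as nn

# Minimal stand-ins; any model/optimizer/loader works the same way.
model = nn.Conv2d(3, 16, 3, padding=1).cuda()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
criterion = nn.MSELoss()
loader = [(torch.randn(4, 3, 32, 32), torch.randn(4, 16, 32, 32))]

scaler = torch.cuda.amp.GradScaler()

for inputs, targets in loader:
    optimizer.zero_grad(set_to_none=True)
    # Run the forward pass in mixed precision.
    with torch.cuda.amp.autocast():
        outputs = model(inputs.cuda(non_blocking=True))
        loss = criterion(outputs, targets.cuda(non_blocking=True))
    scaler.scale(loss).backward()  # backward on the scaled loss
    scaler.step(optimizer)         # unscales grads; skips the step on inf/nan
    scaler.update()                # adjusts the loss scale for the next step
```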

extra:

prefetch:

Going to train from scratch to see what's good, with a working log this time.
UPDATE 12/07/2022: Seems like the bottleneck is in data loading, which takes an unholy amount of time even though I cached everything in RAM. Currently profiling CPU & GPU and trying out this dataloader, which allegedly actually does prefetch.
UPDATE: It all makes sense now: PyTorch's DataLoader only prefetches batches within the current running epoch. For the next epoch, there is apparently no prefetch whatsoever.
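For context on that last point: with `num_workers > 0`, the worker processes are torn down when an epoch's iterator is exhausted and respawned for the next epoch, so the prefetch queue starts cold each time. A sketch of the flags that soften this, assuming a toy `TensorDataset` as a stand-in (`persistent_workers` needs PyTorch >= 1.7):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Toy in-memory dataset; the point here is the loader flags.
dataset = TensorDataset(torch.randn(1024, 3, 32, 32),
                        torch.randint(0, 10, (1024,)))

loader = DataLoader(
    dataset,
    batch_size=64,
    shuffle=True,
    num_workers=4,
    pin_memory=True,          # faster, async host-to-GPU copies
    prefetch_factor=2,        # batches fetched ahead per worker (default 2)
    persistent_workers=True,  # keep workers alive across epochs instead of
                              # tearing them down at every epoch boundary
)
```

Note that `persistent_workers=True` removes the per-epoch worker startup cost; it does not pre-build the next epoch's batches before that epoch's iterator is created, which matches the observation above.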

@xoiga123 xoiga123 added the enhancement New feature or request label Jul 8, 2022
@xoiga123 xoiga123 self-assigned this Jul 8, 2022
@xoiga123 xoiga123 linked a pull request Oct 1, 2022 that will close this issue