
🍱 Extra data and pre-batch shuffle on train datapipe #14

Merged · 3 commits · Jun 1, 2023

Conversation

weiji14 (Member)

@weiji14 weiji14 commented May 30, 2023

What I am changing

How I did it

How you can test it

  • Run python trainer.py fit --trainer.max_epochs=30 --data.batch_size=6 locally.

Related Issues

Note that the shuffling operation is slower than in-batch shuffling. There is a longer delay at the start as the image chips are added to the shuffle buffer, and each mini-batch is now processing about 2x slower (one iteration used to take ~1s, now it takes ~2s).

Randomize the order of the chips before creating mini-batches, because train_eval.hdf5 contains all the non-zero labels while the california_*.hdf5 files contain all-zero labels. The shuffling causes a roughly 2x slowdown, from ~1 s/it to ~2 s/it. Also cherry-picked a9b3b95 to use a buffer_size of -1 in the demux DataPipe.
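For context on the buffer_size=-1 change: torchdata's demux DataPipe splits one stream into several children lazily, so items destined for a child that is not yet being read are held in an internal buffer, and a skewed split (like all non-zero labels in one file) can overflow it; buffer_size=-1 makes that buffer unlimited. A simplified, eager stand-in (an illustration only, not torchdata's actual implementation) shows the splitting logic:

```python
def demux(stream, num_instances, classifier_fn):
    """Simplified, eager stand-in for torchdata's demux DataPipe.

    The real demux is lazy: items routed to a child that is not being
    consumed yet are held in a bounded internal buffer, and a skewed
    split can exhaust it. Passing buffer_size=-1 (the cherry-picked
    a9b3b95 change) makes that buffer unlimited, trading memory for
    robustness. This eager version just materializes every child list.
    """
    outputs = [[] for _ in range(num_instances)]
    for item in stream:
        # classifier_fn maps each item to the index of its output stream
        outputs[classifier_fn(item)].append(item)
    return outputs

# e.g. split items into two streams by index parity (hypothetical rule)
train, val = demux(range(6), 2, lambda x: x % 2)
```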
@weiji14 weiji14 added the enhancement New feature or request label May 30, 2023
@weiji14 weiji14 self-assigned this May 30, 2023
@weiji14 weiji14 requested a review from srmsoumya May 30, 2023 08:27
    self.datapipe_train = (
        dp_train.map(fn=_pre_post_mask_tuple)
        dp_train.shuffle(buffer_size=100)
weiji14 (Member Author):

The default buffer size of 10000 was too slow (I waited for minutes but the model never started training). @srmsoumya, could you try a few other values of buffer_size and see how performant the model is?
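To make the buffer_size tradeoff concrete, here is a minimal pure-Python sketch of the shuffle-buffer idea behind the DataPipe's shuffle (an illustration, not torchdata's actual Shuffler): no item is emitted until the buffer has filled, which is why a 10000-chip buffer of large image tensors delays the start of training, while a smaller buffer starts sooner but mixes the stream less thoroughly.

```python
import random


def buffered_shuffle(stream, buffer_size, seed=None):
    """Yield items from `stream` in randomized order via a bounded buffer.

    Sketch of the shuffle-buffer technique: fill a buffer of
    `buffer_size` items, then emit one random element per new item read.
    A larger buffer mixes better but delays the first output and holds
    more items in memory at once.
    """
    rng = random.Random(seed)
    buffer = []
    for item in stream:
        buffer.append(item)
        if len(buffer) >= buffer_size:
            # emit a random buffered element, keeping the buffer full
            yield buffer.pop(rng.randrange(len(buffer)))
    # drain whatever remains once the input stream is exhausted
    rng.shuffle(buffer)
    yield from buffer
```

With buffer_size=1 the order is unchanged; larger values approach a full shuffle at the cost of latency and memory.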

srmsoumya (Member):

Sure, I was facing some errors with buffers and set it to -1 in my experiment; I will look at other options as well.

@srmsoumya srmsoumya (Member) left a review comment:

Looks good to me, feel free to merge.


Comment on lines 137 to 141
    "https://huggingface.co/datasets/chabud-team/chabud-extra/resolve/main/california_0.hdf5",
    "https://huggingface.co/datasets/chabud-team/chabud-extra/resolve/main/california_1.hdf5",
    "https://huggingface.co/datasets/chabud-team/chabud-extra/resolve/main/california_2.hdf5",
    "https://huggingface.co/datasets/chabud-team/chabud-extra/resolve/main/california_3.hdf5",
    "https://huggingface.co/datasets/chabud-team/chabud-extra/resolve/main/california_4.hdf5",
srmsoumya (Member):

@weiji14 we can ignore the california_*.hdf5 files for now, as they make the dataset imbalanced. We can add them back once we implement the mixup and cutmix augmentations.

weiji14 (Member Author), Jun 1, 2023:

Ok, I'll comment those lines out for now, done at e9b7255. The data imbalance can be tracked at #11 or #12.

Commented out the extra california_*.hdf5 data for now.
@weiji14 weiji14 marked this pull request as ready for review June 1, 2023 02:58
@weiji14 weiji14 merged commit 6ca3381 into main Jun 1, 2023
@weiji14 weiji14 deleted the extra-data-and-pre-batch-shuffle branch June 1, 2023 02:59