Commit
🍱 Extra data and pre-batch shuffle on train datapipe (#14)
* 🍱 Extra datasets california_3.hdf5 and california_4.hdf5

More sample imagery datasets for training, added in https://huggingface.co/datasets/chabud-team/chabud-extra/commit/7da36fcb240ef39beed1f877acc837b98746f35b.

* 👔 Shuffle chips before batching instead of in-batch shuffling

Randomize the order of the chips before creating mini-batches, because train_eval.hdf5 contains all the non-zero labels while the california_*.hdf5 files contain only zero labels. Without the pre-batch shuffle, each mini-batch would be drawn from a single file and hold only one kind of label. The shuffling causes a roughly 2x slowdown, from 1s/it to 2s/it.
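To illustrate why the shuffle has to happen before batching, here is a minimal pure-Python sketch. It is not the datapipe code from this commit: `make_batches`, the toy `labels` list, and the batch size of 4 are all hypothetical, with `1` standing in for a chip that has a non-zero mask and `0` for an all-zero chip.

```python
import random


def make_batches(chips, batch_size, shuffle_before_batching, seed=0):
    """Group chips into mini-batches (hypothetical helper for illustration).

    shuffle_before_batching=True  -> shuffle the whole chip list, then batch.
    shuffle_before_batching=False -> batch in file order, then shuffle
                                     only within each mini-batch.
    """
    rng = random.Random(seed)
    chips = list(chips)
    if shuffle_before_batching:
        rng.shuffle(chips)  # chips from different files can now mix
    batches = [
        chips[i : i + batch_size] for i in range(0, len(chips), batch_size)
    ]
    if not shuffle_before_batching:
        for batch in batches:
            rng.shuffle(batch)  # reorders a batch, but never changes its members
    return batches


# Toy stand-in for the file order described above: all non-zero chips
# first, then all zero-label chips.
labels = [1] * 8 + [0] * 8

in_batch = make_batches(labels, batch_size=4, shuffle_before_batching=False)
pre_batch = make_batches(labels, batch_size=4, shuffle_before_batching=True)

print("in-batch shuffle :", in_batch)
print("pre-batch shuffle:", pre_batch)
```

With in-batch shuffling every mini-batch stays all-ones or all-zeros, since membership is fixed by file order. Shuffling first lets both kinds of chip land in the same mini-batch, at the cost of having to randomize across the full stream of chips before any batch can be formed.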