
Unify algorithms part 2: mixup, cutmix, label smoothing #658

Merged · 39 commits into dev from davis/unify-functional-2 · Mar 8, 2022

Conversation

@dblalock (Contributor) commented on Mar 3, 2022

Follow-up to #524 to continue addressing #343. This is mostly cleaning up the docs to be consistent across methods + match the code's behavior, fixing the doctests, and modifying the docstrings in accordance with some slight API changes (mostly param renaming). These changes are:

Mixup:

  • n_classes -> num_classes in mixup_batch, for consistency with torch and the rest of composer rather than sklearn.
  • interpolation_lambda -> mixing in mixup_batch
  • x -> input in mixup_batch for consistency with torch
  • y -> target in mixup_batch for consistency with torch
  • Removed the wrapper properties for attributes in the algorithm. These weren't documented and were seemingly (?) only there so that the tests didn't access private attrs. (A minimal sketch of the renamed mixup_batch-style signature follows this list.)
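
For illustration, here is a minimal standalone sketch of mixup using the renamed arguments (input, target, num_classes, mixing). This is only a sketch of the technique under assumed shapes; the function name, exact signature, and return values are assumptions, not composer's actual mixup_batch:

```python
import torch
import torch.nn.functional as F

def mixup_sketch(input: torch.Tensor, target: torch.Tensor,
                 num_classes: int, mixing: float = 0.2):
    """Illustrative mixup: blend each example with a permuted partner.

    `mixing` plays the role of the old `interpolation_lambda`; the real
    mixup_batch may differ in signature and return values.
    """
    permutation = torch.randperm(input.shape[0])
    # Interpolate inputs with their permuted partners.
    x_mix = (1.0 - mixing) * input + mixing * input[permutation]
    # Interpolate one-hot targets the same way.
    y_onehot = F.one_hot(target, num_classes).float()
    y_mix = (1.0 - mixing) * y_onehot + mixing * y_onehot[permutation]
    return x_mix, y_mix, permutation
```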

Cutmix:

  • n_classes -> num_classes again
  • X -> input in cutmix_batch for consistency with torch
  • y -> target in cutmix_batch for consistency with torch
  • cutmix_lambda -> length in cutmix_batch, matching cutout's length argument. As part of this change, I also changed the logic to 1) correctly compute the internal cutmix_lambda when bbox is provided, and 2) raise if both length and bbox are provided (see the sketch after this list).
  • cutmix_batch now returns the permutation it used for consistency with mixup_batch. @coryMosaicML , do we need to return these? We apparently didn't here, but seemingly did in mixup_batch?
  • Removed the wrapper properties for attributes for same reasons as in mixup
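
As above, a minimal standalone sketch of cutmix with the renamed arguments, showing the length/bbox exclusivity and how cutmix_lambda is recomputed from the box area. The function name, signature, and (N, C, H, W) input shape are illustrative assumptions, not composer's cutmix_batch:

```python
import torch
import torch.nn.functional as F

def cutmix_sketch(input: torch.Tensor, target: torch.Tensor, num_classes: int,
                  length: float = None, bbox: tuple = None):
    """Illustrative cutmix: paste a box from a permuted batch into `input`."""
    if length is not None and bbox is not None:
        raise ValueError("Provide either `length` or `bbox`, not both.")
    H, W = input.shape[-2], input.shape[-1]
    if bbox is not None:
        x1, y1, x2, y2 = bbox
    else:
        frac = 0.5 if length is None else length  # default box size for this sketch
        cut_h, cut_w = int(frac * H), int(frac * W)
        cy, cx = torch.randint(H, (1,)).item(), torch.randint(W, (1,)).item()
        y1, y2 = max(cy - cut_h // 2, 0), min(cy + cut_h // 2, H)
        x1, x2 = max(cx - cut_w // 2, 0), min(cx + cut_w // 2, W)
    # Recompute the mixing coefficient from the (possibly user-provided) box area.
    cutmix_lambda = 1.0 - ((y2 - y1) * (x2 - x1)) / (H * W)
    permutation = torch.randperm(input.shape[0])
    x_cutmix = input.clone()
    x_cutmix[..., y1:y2, x1:x2] = input[permutation][..., y1:y2, x1:x2]
    y_onehot = F.one_hot(target, num_classes).float()
    y_cutmix = cutmix_lambda * y_onehot + (1.0 - cutmix_lambda) * y_onehot[permutation]
    return x_cutmix, y_cutmix, permutation
```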

Label smoothing:

  • alpha -> interpolation in smooth_labels and algorithm
  • targets -> target in smooth_labels for consistency with torch (minimal sketch after this list)
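
A minimal sketch of label smoothing with the renamed arguments (target, interpolation); the standalone function below is illustrative and may not match composer's smooth_labels exactly:

```python
import torch
import torch.nn.functional as F

def smooth_labels_sketch(target: torch.Tensor, interpolation: float = 0.1,
                         num_classes: int = 10) -> torch.Tensor:
    """Illustrative label smoothing: blend one-hot targets toward uniform.

    `interpolation` plays the role of the old `alpha`.
    """
    one_hot = F.one_hot(target, num_classes).float()
    uniform = torch.full_like(one_hot, 1.0 / num_classes)
    return (1.0 - interpolation) * one_hot + interpolation * uniform
```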

Parts I'm not sure about (updated after slack discussion):

  • Whether to return the permutation in functional forms (see above)
  • How best to handle cutmix allowing 3 different ways to specify the box to use (alpha, length / cutmix_lambda, and bbox). Right now alpha is just ignored if others are provided, while the other two now raise a ValueError if both are not None. Seems like the least bad way to do this, but a little inconsistent. Maybe we could remove bbox as a parameter?
  • For removed wrapper properties, we'd ideally just not have the tests depend on class internals. I went with having tests access the private attrs rather than cluttering the code + public API, but we could also just make these attrs public and document them. Probably should have put this change in a separate PR, but I was already renaming the attrs and I was on a roll / changing a lot anyway, so I just went with it.

NOTE: still need to fix the Jenkins error and double-check the docstring output in the docs, but this is ready for feedback, especially regarding the API questions above.

@hanlint (Contributor) commented on Mar 3, 2022

I like where this is going! Agreed on interpolation... how strictly do we want to maintain cross-functional consistency versus following the original paper's/authors' terminology, e.g. alpha for label smoothing?

dblalock marked this pull request as ready for review on March 3, 2022, 20:44
@coryMosaicML (Contributor) commented, quoting the PR description:
cutmix_batch now returns the permutation it used for consistency with mixup_batch. @coryMosaicML , do we need to return these? We apparently didn't here, but seemingly did in mixup_batch?

I think this behavior in mixup_batch is a holdover from the old repo that got ported over. I don't think the original reason for doing that exists in composer, and not returning the permutation should be fine. It makes the functional interface feel a lot cleaner to me. However, two possible reasons to return permutations: 1) If someone wants to do further analysis/debugging of what mixup_batch and cutmix_batch are doing to the inputs, this could be handy, and 2) returning the permutation makes it possible to use the loss interpolation trick people often do to use mixup and cutmix with index labels after using mixup_batch and cutmix_batch if they so choose.
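
For context, the loss-interpolation trick mentioned here looks roughly like the following. This is a hedged sketch: the names logits, target, permutation, and mixing refer to the illustrative functions above, not necessarily composer's actual return values:

```python
import torch
import torch.nn.functional as F

def mixup_index_label_loss(logits: torch.Tensor, target: torch.Tensor,
                           permutation: torch.Tensor, mixing: float) -> torch.Tensor:
    """Interpolate the losses instead of the labels, so plain index labels work."""
    return ((1.0 - mixing) * F.cross_entropy(logits, target)
            + mixing * F.cross_entropy(logits, target[permutation]))
```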

Resolved review threads on: composer/algorithms/cutmix/cutmix.py, composer/algorithms/mixup/mixup.py, tests/algorithms/test_mixup.py
@hanlint (Contributor) left a comment:

I noticed two instances of significant test setup code to create the model and dataloaders. Are those needed, given that some of these are auto-loaded in https://github.com/mosaicml/composer/blob/dev/docs/source/doctest_fixtures.py ?

Resolved review threads on composer/algorithms/cutmix/cutmix.py
@hanlint (Contributor) commented on Mar 7, 2022

Also -- update yamls?

@dblalock (Contributor, Author) commented on Mar 8, 2022, quoting the question above:

Also -- update yamls?

The algorithm yamls? The label_smoothing one is updated, but cutmix and mixup fortunately don't have to change anything since they only use alpha and num_classes, and these are unchanged.

@hanlint (Contributor) left a comment:

LGTM

dblalock merged commit 7bfac7a into dev on Mar 8, 2022
dblalock deleted the davis/unify-functional-2 branch on March 8, 2022, 20:00
dblalock added a commit referencing this pull request on Mar 10, 2022:
Addresses #343, or at least well enough for present purposes.

Changes:

* colout.py: rename X -> input
* cutout.py: rename X -> input, n_holes -> num_holes, make cutout_batch length fractional-only to match cutmix.py (see discussion in #658; sketch at the end of this message)
* account for the n_holes -> num_holes rename, as well as the removal of integer lengths, in:
    * test_cutout.py
    * algorithms/hparams.py
    * yamls/algorithms/cutout.yaml
    * test_algorithm_registry.py
    * test_load.py
    * test_trainer.py
    * examples/adding_custom_models.py
    * notebooks/medical_image_segmentation_composer.ipynb
* progressive_resizing.py: rename X -> input, y -> target
* selective_backprop.py: rename X -> input, y -> target

Minor docstring cleanup:
- augmix_image now has correct input type shown in docstring
- randaugment_image now has correct input type shown in docstring
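
To illustrate the cutout change mentioned above, here is a minimal standalone sketch with the renamed num_holes argument and a fractional length. The function name and signature are assumptions for illustration, not composer's cutout_batch:

```python
import torch

def cutout_sketch(input: torch.Tensor, num_holes: int = 1,
                  length: float = 0.5) -> torch.Tensor:
    """Illustrative cutout: zero out `num_holes` square regions.

    `length` is a fraction of the image side (matching cutmix), not pixels.
    """
    out = input.clone()
    H, W = out.shape[-2], out.shape[-1]
    cut_h, cut_w = int(length * H), int(length * W)
    for _ in range(num_holes):
        cy, cx = torch.randint(H, (1,)).item(), torch.randint(W, (1,)).item()
        y1, y2 = max(cy - cut_h // 2, 0), min(cy + cut_h // 2, H)
        x1, x2 = max(cx - cut_w // 2, 0), min(cx + cut_w // 2, W)
        out[..., y1:y2, x1:x2] = 0.0  # zero out the hole
    return out
```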