
Support for Stacking Recipes #1897

Merged
Satrat merged 25 commits into main from sparse_auto_recipe on Dec 21, 2023
Conversation

@Satrat (Contributor) commented Dec 12, 2023

First PR towards alternating finetuning/one-shot support. Covers the recipe and staging changes necessary to run recipes stage by stage, alternating between training and one-shot. Previously there was no way to differentiate stages that had already been applied from the active stage.

Summary of Changes

  • Add an applied attribute to StageModifiers, used to differentiate stages that are initialized for structure only (already-applied stages) from active ones. If a stage has already been applied, we skip it during initialization and updates (see the first sketch after this list)
  • Keep track of applied_stages in the RecipeContainer, so that when we reload the recipe after a stage is added we remember which stages have already been run
  • Add helper functions to the QuantizationModifier for determining whether a module is already quantized. This allows us to stack quantization modifiers as long as they don't affect a common module; attempting to quantize an already-quantized module raises an exception (see the second sketch below)
  • Update the recipe serialization code to add indices to stages that share a name, so they aren't overwritten during recipe export (see the third sketch below)
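
A minimal sketch of the staging bookkeeping described in the first two bullets. The `applied` attribute and `applied_stages` names come from this PR description; the surrounding class structure is illustrative, not the actual SparseML implementation:

```python
class StageModifiers:
    """One named stage of a recipe, holding its modifiers (sketch only)."""

    def __init__(self, name, modifiers):
        self.name = name
        self.modifiers = modifiers
        # True once this stage has been fully run; such stages are only
        # initialized for structure and are skipped on later passes
        self.applied = False

    def initialize(self, model):
        if self.applied:
            return  # already-applied stages are skipped
        for modifier in self.modifiers:
            modifier.initialize(model)


class RecipeContainer:
    """Holds the stages of the currently loaded recipe (sketch only)."""

    def __init__(self):
        self.stages = []
        # Names of stages that have already been run, so that reloading the
        # recipe after appending a new stage does not re-apply old ones.
        self.applied_stages = []

    def mark_applied(self, stage):
        stage.applied = True
        if stage.name not in self.applied_stages:
            self.applied_stages.append(stage.name)
```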
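A hedged sketch of the quantization-overlap check from the third bullet. The helper names here are hypothetical, not necessarily the ones added to QuantizationModifier, and the "already quantized" test assumes a quantization scheme is attached to a torch.nn.Module once it has been quantized:

```python
def is_module_quantized(module):
    # assumption: quantization attaches a scheme attribute to the module
    return getattr(module, "quantization_scheme", None) is not None


def check_for_quantization_overlap(model, target_module_names):
    """Raise if a new quantization modifier targets a module that a
    previously applied modifier already quantized."""
    for name, module in model.named_modules():
        if name in target_module_names and is_module_quantized(module):
            raise RuntimeError(
                f"Module '{name}' is already quantized; stacked quantization "
                "modifiers must not affect a common module"
            )
```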
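And a sketch of the renaming idea from the last bullet: when exported stages share a name, append an index so later stages don't overwrite earlier ones in the serialized recipe. The suffix format is illustrative:

```python
def deduplicate_stage_names(stage_names):
    counts = {}
    unique_names = []
    for name in stage_names:
        if name in counts:
            counts[name] += 1
            unique_names.append(f"{name}_{counts[name]}")
        else:
            counts[name] = 0
            unique_names.append(name)
    return unique_names


# Example:
# deduplicate_stage_names(["test_stage", "test_stage", "quant_stage"])
# -> ["test_stage", "test_stage_1", "quant_stage"]
```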

Testing

Added new unit tests in tests/sparseml/transformers/obcq/test_repeats.py:

  • Test that one-shot SparseGPT recipes can be stacked, even if their stage names clash
  • Test that we error out on clashing quantization modifiers (a hedged sketch of the shape of such a test follows this list)
  • Test that we allow multiple quantization modifiers as long as they don't overlap
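
A self-contained stand-in for the clash test, using a toy apply function rather than the real one-shot entrypoint; the actual tests in test_repeats.py run real recipes against a model:

```python
import pytest


def test_clashing_quantization_modifiers_error():
    applied = set()

    def apply_quantization(targets):
        # toy stand-in for applying a quantization modifier
        overlap = applied & targets
        if overlap:
            raise RuntimeError(f"modules already quantized: {overlap}")
        applied.update(targets)

    apply_quantization({"model.layers.0.mlp"})
    with pytest.raises(RuntimeError):
        # second modifier targets a module the first already quantized
        apply_quantization({"model.layers.0.mlp", "model.layers.1.mlp"})
```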

@Satrat Satrat marked this pull request as ready for review December 13, 2023 15:40
@Satrat Satrat mentioned this pull request Dec 15, 2023
@rahul-tuli (Member) left a comment
LGTM!

@Satrat Satrat merged commit 0eaf565 into main Dec 21, 2023
12 checks passed
@Satrat Satrat deleted the sparse_auto_recipe branch December 21, 2023 16:57