Layer-Wise Distillation #1272

Merged: 31 commits from layer-wise-distillation into main on Jan 10, 2023
Conversation

rahul-tuli (Member)

This PR represents the main branch for all layer-wise distillation work
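
For context, layer-wise distillation adds per-layer losses that pull selected student layer outputs toward the matching teacher layer outputs, on top of the usual task loss. The snippet below is only a minimal PyTorch sketch of that idea, assuming forward hooks and a per-layer MSE loss; the helper names and the layer-name pairing are illustrative and are not the actual `modifier_per_layer.py` implementation from this PR.

```python
# Minimal sketch of the layer-wise distillation idea (illustrative only; not the
# modifier_per_layer.py implementation from this PR).
import torch.nn.functional as F


def register_output_caches(module, layer_names):
    """Capture outputs of the named submodules via forward hooks."""
    cache = {}

    def make_hook(name):
        def hook(_module, _inputs, output):
            cache[name] = output
        return hook

    handles = [
        module.get_submodule(name).register_forward_hook(make_hook(name))
        for name in layer_names
    ]
    return cache, handles


def layer_wise_distillation_loss(student_cache, teacher_cache, layer_pairs):
    """Sum of per-layer MSE losses between cached student and teacher outputs."""
    return sum(
        F.mse_loss(student_cache[student_name], teacher_cache[teacher_name])
        for student_name, teacher_name in layer_pairs
    )
```

In a training loop, one would run the teacher under `torch.no_grad()`, add a weighted `layer_wise_distillation_loss(...)` term to the task loss, and remove the hook handles once distillation ends.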

@KSGulin (Contributor) left a comment:

Looks great. Just a few minor comments.

@bfineran (Contributor) left a comment:

Looks great @rahul-tuli @corey-nm. A few small comments, and we need to add that small change for serialization; then LGTM!

@bfineran (Contributor) left a comment:

The new state dict logic looks much better. LGTM pending comments.

bfineran previously approved these changes on Jan 10, 2023.

@bfineran (Contributor) left a comment:

Great work @rahul-tuli @corey-nm!

@corey-nm (Contributor) left a comment:

Woohoo!

rahul-tuli merged commit 112753a into main on Jan 10, 2023.
rahul-tuli deleted the layer-wise-distillation branch on January 10, 2023 at 21:58.

rahul-tuli added a commit that referenced this pull request on Jan 10, 2023:
* Initial Commit with Alex's Work

* Update `student_names` -> `student_layer_names`
Update `teacher_names` -> `teacher_layer_names`

* Intermediate commit

* Styling

* Reorg initialize

* More cleanups

* Update docstring

* Moving finalize logic to update

* Tests passing a bit

* Fixing lifecycle tests

* Changing projection to dict

* Cleanup

* Adding quantization hooks test

* Add failing test for optimizer serialization

* Monkey patching optimizer state_dict method (see the sketch after this commit list)

* Apply suggestions from code review

Co-authored-by: Konstantin Gulin <66528950+KSGulin@users.noreply.github.com>

* Update src/sparseml/pytorch/sparsification/distillation/modifier_per_layer.py

* Adding missing docstrings

* Respond to review on modifier/optimizer state_dict

* Add a test for modifier load before forward pass

* Updating comments

* Fix failing test

* Add more asserts based on @bfineran's comments

* * Rename `_DISTILL_PARAM_GROUP_KEY` -> `DISTILL_PARAM_GROUP_KEY`
  * Add `DISTILL_PARAM_GROUP_KEY` to `__all__`

* Move state dict patching to a helper function

* Quality

Co-authored-by: Corey Lowman <corey@neuralmagic.com>
Co-authored-by: corey-nm <109536191+corey-nm@users.noreply.github.com>
Co-authored-by: Konstantin Gulin <66528950+KSGulin@users.noreply.github.com>
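
Several commits above mention monkey patching the optimizer's state_dict method and later moving that patching into a helper so the distillation parameter group survives serialization. The helper below is a hedged sketch of that general technique only, assuming a marker key on the modifier-owned param group; the `DISTILL_PARAM_GROUP_KEY` value, the tagging scheme, and the helper name are assumptions rather than the code added in this PR.

```python
# Hedged sketch of wrapping an optimizer's state_dict so modifier-owned param
# groups can be recognized on save/load (illustrative; not this PR's actual logic).
import types

DISTILL_PARAM_GROUP_KEY = "distillation_param_group"  # assumed marker name


def patch_optimizer_state_dict(optimizer):
    """Monkey patch optimizer.state_dict to tag distillation param groups."""
    original_state_dict = optimizer.state_dict  # bound method, captured pre-patch

    def state_dict(self):
        state = original_state_dict()
        # param_groups in the saved dict line up with self.param_groups
        for group, saved_group in zip(self.param_groups, state["param_groups"]):
            if group.get(DISTILL_PARAM_GROUP_KEY):
                # mark the serialized group so a load-time helper can recognize
                # and restore the distillation group
                saved_group[DISTILL_PARAM_GROUP_KEY] = True
        return state

    optimizer.state_dict = types.MethodType(state_dict, optimizer)
    return optimizer
```

A matching patch (or check) on `load_state_dict` would look for the same key to decide how to restore the distillation group when a checkpoint is loaded.
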
corey-nm added a commit that referenced this pull request on Jan 11, 2023:
* Saving all hooks during quantization block fusing (#1280)

* Saving all hooks during quantization block fusing

* Clean up delete get block hooks

* Layer-Wise Distillation (#1272)

Co-authored-by: corey-nm <109536191+corey-nm@users.noreply.github.com>
Co-authored-by: Corey Lowman <corey@neuralmagic.com>
Co-authored-by: Konstantin Gulin <66528950+KSGulin@users.noreply.github.com>