Split SparseGPT and GPTQ modifiers #2272

Merged: 2 commits from create-gptq-modifier into quant-modifier-ux, May 20, 2024

Conversation

@rahul-tuli (Member) commented on May 8, 2024

This PR introduces a structural change by separating concerns between quantization and sparsification. A new GPTQModifier is extracted from the existing SparseGPTModifier. This ensures that each class now has a focused responsibility — GPTQModifier manages quantization, while SparseGPTModifier is dedicated to sparsification.
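
For illustration, here is a minimal sketch of how the two modifiers can now be configured independently. The import paths are inferred from the file layout discussed in this PR; the specific parameter names and values are assumptions for illustration, not a verbatim API reference.

```python
# Illustrative sketch only -- parameter names/values are assumptions.
from sparseml.modifiers.obcq import SparseGPTModifier            # sparsification only
from sparseml.modifiers.quantization.gptq import GPTQModifier    # quantization only

# Sparsify with SparseGPT; no quantization arguments live here anymore.
sparsifier = SparseGPTModifier(
    sparsity=0.5,      # target sparsity level (assumed field name)
    block_size=128,    # column block size processed per step (assumed field name)
)

# Quantize with GPTQ, configured entirely separately.
quantizer = GPTQModifier(
    block_size=128,    # column block size processed per step (assumed field name)
)
```

In a recipe-driven flow the two modifiers simply appear as separate entries, so a sparsification-only run no longer has to reason about quantization arguments at all.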

Changes

  • Extraction of GPTQModifier: Carved out of SparseGPTModifier, this new class handles everything related to quantization, including the arguments specific to the quantization process.

  • Refinement of SparseGPTModifier and SparseGPTWrapper: These have been updated to focus solely on sparsification; all quantization-related functionality has been removed.

  • Creation of GPTQWrapper: Implemented to apply quantization using OBQ (Optimal Brain Quantization).

  • Update on Test Recipes: Modified the OBCQ test recipes to align with the new modifiers (see the recipe sketch after this list).

  • Addition of Tests: Introduced new tests specifically for GPTQModifier to ensure functionality and stability.
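
To make the test-recipe change concrete, here is a hedged before/after sketch of what an OBCQ-style recipe might look like once the modifiers are split. The stage names, field names, and values are illustrative assumptions, not the exact recipes touched in this PR.

```python
# Illustrative only: the stage/field names below are assumptions.

# Before: a single modifier covered both sparsification and quantization.
old_recipe = """
test_stage:
  obcq_modifiers:
    SparseGPTModifier:
      sparsity: 0.5
      block_size: 128
      quantize: true        # quantization folded into the same modifier
"""

# After: responsibilities are split across two dedicated modifiers.
new_recipe = """
test_stage:
  obcq_modifiers:
    GPTQModifier:
      block_size: 128
    SparseGPTModifier:
      sparsity: 0.5
      block_size: 128
"""
```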

Test Plan

  • Automated Tests: The changes have been covered by automated tests, which are currently passing without issues (green status).

@rahul-tuli force-pushed the create-gptq-modifier branch 2 times, most recently from 863b7a7 to 49cd9e5 on May 9, 2024 15:00
@rahul-tuli marked this pull request as ready for review on May 9, 2024 16:05
@rahul-tuli self-assigned this on May 9, 2024
@rahul-tuli changed the title from [WIP] Split SparseGPT and GPTQ modifiers to Split SparseGPT and GPTQ modifiers on May 9, 2024
@Satrat (Contributor) left a comment


Overall structure looks good to me, just saw a couple of small issues. Also, now that SparseGPT is broken out, it's not really OBCQ anywhere since there is no Q. Could we rename all those folders from obcq -> sparsegpt?

Review threads (resolved):
  • src/sparseml/modifiers/quantization/gptq/base.py
  • src/sparseml/modifiers/quantization/gptq/pytorch.py
  • tests/sparseml/modifiers/quantization/gptq/test_base.py
  • tests/sparseml/transformers/obcq/test_consecutive_runs.py
@rahul-tuli force-pushed the quant-modifier-ux branch 2 times, most recently from 4230fb7 to d4d85ff on May 20, 2024 13:09
@rahul-tuli (Member, Author) commented

> Overall structure looks good to me, just saw a couple of small issues. Also, now that SparseGPT is broken out, it's not really OBCQ anywhere since there is no Q. Could we rename all those folders from obcq -> sparsegpt?

Absolutely, good catch! I'll open a follow-up PR for the rename once everything is in the feature branch, to keep this PR from blowing up.

@rahul-tuli merged commit 5dd9985 into quant-modifier-ux on May 20, 2024
@rahul-tuli deleted the create-gptq-modifier branch on May 20, 2024 18:56
bfineran pushed a commit that referenced this pull request May 22, 2024
* Split WandaPruningModifier and SparseGPTModifier
Make sparsegpt not inherit from wanda modifier
Decouple SparseGPTModifierPyTorch from WandaPruningModifier
Fix docstrings

* Split SparseGPT and GPTQ modifiers (#2272)

* Update OBCQ

* Extract GPTQ Modifier

* [GPTQ Modifier UX] Update tests to use GPTQModifier for obcq style quantization (#2294)

* Update OBCQ

* Extract GPTQ Modifier

* Update test recipes

* GPTQ UX config groups support (#2273)

* Update OBCQ

* Extract GPTQ Modifier

* Update test recipes

* Add config_groups support to GPTQModifier

* mask_structure preservation test (#2284)

* test

* Preserve weight sparsity if greater than threshold

* Add argument to preserve sparsity mask in SparseGPT

* Fix case when mask is None

* Add test to check mask_structure
- initial mask structure should be preserved between consecutive runs; added test to check this

* Update tensor_follows_mask_structure to check for at least n zeros

---------

Co-authored-by: Sara Adkins <sara@neuralmagic.com>

* PR comments

---------

Co-authored-by: Sara Adkins <sara@neuralmagic.com>

* Fix default case

* Update test to use new vLLMQuantizationModifier

* Style

---------

Co-authored-by: Sara Adkins <sara@neuralmagic.com>
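
As a side note on the mask_structure commits above, here is a minimal PyTorch sketch of the kind of check a helper like tensor_follows_mask_structure performs, under the assumed semantics that every group of m consecutive values contains at least n zeros. It is a sketch of the idea, not the actual test utility.

```python
import torch


def follows_n_m_mask_structure(tensor: torch.Tensor, mask_structure: str = "2:4") -> bool:
    """Check that every group of m consecutive elements has at least n zeros.

    Sketch only: assumed semantics, not the real tensor_follows_mask_structure.
    """
    n, m = (int(part) for part in mask_structure.split(":"))
    # Group the flattened tensor into chunks of m along the last dimension.
    groups = tensor.reshape(-1, m)
    zeros_per_group = (groups == 0).sum(dim=1)
    # "At least n zeros" (not exactly n), matching the commit message above.
    return bool((zeros_per_group >= n).all())


# Example: a 2:4-structured weight passes; a fully dense one does not.
structured = torch.tensor([[0.0, 0.0, 1.0, 2.0],
                           [3.0, 0.0, 0.0, 4.0]])
assert follows_n_m_mask_structure(structured, "2:4")
assert not follows_n_m_mask_structure(torch.ones(2, 4), "2:4")
```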