Split SparseGPT and GPTQ modifiers #2272
Conversation
Overall structure looks good to me; I just saw a couple of small issues. Also, now that SparseGPT is broken out, it's not really OBCQ anywhere since there is no Q. Could we rename all those folders from obcq -> sparsegpt?
Absolutely, good catch! I will add another PR once everything is in the feature branch, to prevent this PR from blowing up.
* Split WandaPruningModifier and SparseGPTModifier
  - Make SparseGPT not inherit from the Wanda modifier
  - Decouple SparseGPTModifierPyTorch from WandaPruningModifier
  - Fix docstrings
* Split SparseGPT and GPTQ modifiers (#2272)
  - Update OBCQ
  - Extract GPTQ Modifier
* [GPTQ Modifier UX] Update tests to use GPTQModifier for OBCQ-style quantization (#2294)
  - Update OBCQ
  - Extract GPTQ Modifier
  - Update test recipes
* GPTQ UX config groups support (#2273)
  - Update OBCQ
  - Extract GPTQ Modifier
  - Update test recipes
  - Add config_groups support to GPTQModifier
* mask_structure preservation test (#2284)
  - test
  - Preserve weight sparsity if greater than threshold
  - Add argument to preserve sparsity mask in SparseGPT
  - Fix case when mask is None
  - Add test to check mask_structure: the initial mask structure should be preserved between consecutive runs
  - Update tensor_follows_mask_structure to check for at least n zeros (see the sketch after this list)
  - Co-authored-by: Sara Adkins <sara@neuralmagic.com>
* PR comments
  - Co-authored-by: Sara Adkins <sara@neuralmagic.com>
* Fix default case
* Update test to use new vLLMQuantizationModifier
* Style
  - Co-authored-by: Sara Adkins <sara@neuralmagic.com>
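The `mask_structure` commits above concern N:M sparsity patterns (e.g. 2:4, meaning at least 2 zeros in every group of 4 weights). As a rough illustration of the check named in the last commit, here is a minimal torch-based sketch; the function name comes from the commit list, but the signature and behavior shown are assumptions, not the repository's actual helper:

```python
import torch


def tensor_follows_mask_structure(tensor: torch.Tensor, mask: str = "2:4") -> bool:
    """Return True if every group of m consecutive values has at least n zeros.

    Illustrative reimplementation of the test helper named in the commit
    list above; the real signature in the repository may differ.
    """
    n, m = (int(part) for part in mask.split(":"))
    if tensor.numel() % m != 0:  # groups of m must tile the tensor exactly
        return False
    zeros_per_group = (tensor.reshape(-1, m) == 0).sum(dim=1)
    # "at least n zeros" per group, matching the commit message above
    return bool((zeros_per_group >= n).all())
```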
This PR introduces a structural change by separating concerns between quantization and sparsification. A new `GPTQModifier` is extracted from the existing `SparseGPTModifier`. This ensures that each class now has a focused responsibility: `GPTQModifier` manages quantization, while `SparseGPTModifier` is dedicated to sparsification.
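As a rough picture of the intended separation, here is a minimal sketch; the class names follow this PR, but the constructor arguments shown are simplified assumptions, not the modifiers' real fields:

```python
from typing import Optional


class SparseGPTModifier:
    """After the split: one-shot sparsification only, no quantization args."""

    def __init__(self, sparsity: float = 0.5, mask_structure: str = "0:0"):
        self.sparsity = sparsity
        self.mask_structure = mask_structure


class GPTQModifier:
    """Carved out of the old SparseGPTModifier: quantization only."""

    def __init__(self, config_groups: Optional[dict] = None, block_size: int = 128):
        self.config_groups = config_groups or {}
        self.block_size = block_size
```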
Changes

- Extraction of `GPTQModifier`: carved out from `SparseGPTModifier`, this new class handles all aspects related to quantization, including the arguments specific to the quantization process.
- Refinement of `SparseGPTModifier` and `SparseGPTWrapper`: these have been updated to focus solely on sparsification; all quantization-related functionality has been removed.
- Creation of `GPTQWrapper`: implemented to apply quantization using OBQ.
- Update on test recipes: modified the OBCQ test recipes to align with the new modifiers.
- Addition of tests: introduced new tests specifically for `GPTQModifier` to ensure functionality and stability (see the sketch after this list).
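For a sense of what such a test might assert, here is a hedged sketch that builds on the toy classes above; the attributes checked are the illustrative ones from that sketch, not the real API:

```python
def test_gptq_modifier_owns_only_quantization_settings():
    # Uses the toy GPTQModifier sketched earlier; assertions are illustrative.
    modifier = GPTQModifier(config_groups={}, block_size=128)
    # Quantization settings live on GPTQModifier after the split...
    assert modifier.block_size == 128
    # ...while sparsity settings were removed along with the inheritance.
    assert not hasattr(modifier, "sparsity")
```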
Test Plan