
[Bug Fix] Fix pruning for partially quantized models #1792

Merged 1 commit into main on Oct 27, 2023

Conversation

@Satrat (Contributor) commented Oct 25, 2023

Previously, if quantize was set to True in the SparseGPTModifier, only quantized layers would be compressed by the algorithm. This fix also cleans up the code by using the existing get_prunable_layers utility function to find both quantized and unquantized prunable layers. Now, if a layer is ignored by the QuantizationModifier, the SparseGPTModifier will still prune it.
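
For illustration only, here is a minimal sketch of the selection idea, assuming behavior similar to get_prunable_layers (the actual SparseML utility may differ). Iterating named_modules recurses through quantization wrappers, so a quantized layer is found as its inner Linear (reported as "<name>.module" in the log below) while an ignored, unquantized layer is found under its plain name, and both end up in the prunable set. The function name find_prunable_layers is a stand-in, not the real implementation.

import torch.nn as nn

def find_prunable_layers(module: nn.Module):
    # named_modules() recurses through quantization wrappers, so a quantized
    # layer shows up as "<name>.module" while an ignored (unquantized) layer
    # keeps its plain name -- both are collected here.
    return [
        (name, layer)
        for name, layer in module.named_modules()
        if isinstance(layer, (nn.Linear, nn.Conv2d))
    ]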

Example Recipe:

test_stage:
  obcq_modifiers:
    QuantizationModifier:
      ignore:
        - LlamaRotaryEmbedding
        - LlamaRMSNorm
        - SiLUActivation
        - model.layers.1.mlp.down_proj
        - model.layers.5.mlp.down_proj
      post_oneshot_calibration: True
      scheme_overrides:
        Embedding:
          input_activations: null
          weights:
            num_bits: 8
            symmetric: False
    SparseGPTModifier:
      sparsity: 0.5
      block_size: 128
      sequential_update: False
      quantize: True
      percdamp: 0.01
      prunen: 0
      prunem: 0
      targets: [
        "model.layers.0",
        "model.layers.1",
        "model.layers.2",
        "model.layers.3",
        "model.layers.4",
        "model.layers.5"
      ]
      target_ids: ["attention_mask", "position_ids"]  

Testing

python src/sparseml/transformers/sparsification/obcq/obcq.py Xenova/llama2.c-stories15M open_platypus --recipe tiny_recipe.yaml

Even though model.layers.5.mlp.down_proj is ignored during quantization, it is still pruned by SparseGPT (a quick sanity check is sketched after the log below):

===== Compressing layer 5/5 =====
2023-10-25 18:00:40 sparseml.modifiers.obcq.utils.layer_compressor INFO     Compressing self_attn.q_proj.module...
2023-10-25 18:00:40 sparseml.modifiers.obcq.utils.sparsegpt INFO     time 0.07
2023-10-25 18:00:40 sparseml.modifiers.obcq.utils.sparsegpt INFO     error 1215.84
2023-10-25 18:00:40 sparseml.modifiers.obcq.utils.layer_compressor INFO     Compressing self_attn.k_proj.module...
2023-10-25 18:00:40 sparseml.modifiers.obcq.utils.sparsegpt INFO     time 0.07
2023-10-25 18:00:40 sparseml.modifiers.obcq.utils.sparsegpt INFO     error 1160.68
2023-10-25 18:00:40 sparseml.modifiers.obcq.utils.layer_compressor INFO     Compressing self_attn.v_proj.module...
2023-10-25 18:00:40 sparseml.modifiers.obcq.utils.sparsegpt INFO     time 0.07
2023-10-25 18:00:40 sparseml.modifiers.obcq.utils.sparsegpt INFO     error 1167.86
2023-10-25 18:00:40 sparseml.modifiers.obcq.utils.layer_compressor INFO     Compressing self_attn.o_proj.module...
2023-10-25 18:00:40 sparseml.modifiers.obcq.utils.sparsegpt INFO     time 0.07
2023-10-25 18:00:40 sparseml.modifiers.obcq.utils.sparsegpt INFO     error 48.98
2023-10-25 18:00:40 sparseml.modifiers.obcq.utils.layer_compressor INFO     Compressing mlp.gate_proj.module...
2023-10-25 18:00:40 sparseml.modifiers.obcq.utils.sparsegpt INFO     time 0.07
2023-10-25 18:00:40 sparseml.modifiers.obcq.utils.sparsegpt INFO     error 1703.80
2023-10-25 18:00:40 sparseml.modifiers.obcq.utils.layer_compressor INFO     Compressing mlp.up_proj.module...
2023-10-25 18:00:40 sparseml.modifiers.obcq.utils.sparsegpt INFO     time 0.07
2023-10-25 18:00:40 sparseml.modifiers.obcq.utils.sparsegpt INFO     error 1946.11
2023-10-25 18:00:40 sparseml.modifiers.obcq.utils.layer_compressor INFO     Compressing mlp.down_proj...
2023-10-25 18:00:41 sparseml.modifiers.obcq.utils.sparsegpt INFO     time 0.15
2023-10-25 18:00:41 sparseml.modifiers.obcq.utils.sparsegpt INFO     error 128.81
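
As a quick sanity check, the pruned model can be reloaded and the zero fraction of the ignored layer's weights inspected; with sparsity: 0.5 in the recipe it should be roughly 50%. This sketch is not part of the PR, and the output directory "obcq_deployment" is an assumption; substitute whatever path your obcq.py run actually saved the model to.

from transformers import AutoModelForCausalLM

# Load the pruned model from the (assumed) output directory of the one-shot run.
model = AutoModelForCausalLM.from_pretrained("obcq_deployment")

# Measure the fraction of exactly-zero weights in the layer that was ignored
# by the QuantizationModifier but still pruned by SparseGPT.
weight = dict(model.named_parameters())["model.layers.5.mlp.down_proj.weight"]
sparsity = (weight == 0).float().mean().item()
print(f"model.layers.5.mlp.down_proj sparsity: {sparsity:.2%}")  # expect ~50%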

@rahul-tuli (Member) left a comment

Let's go!

@mgoin (Member) left a comment

works for me, thanks!

@mgoin mgoin merged commit 63e7740 into main Oct 27, 2023
10 of 11 checks passed
@mgoin mgoin deleted the mixed_quant_bug_fix branch October 27, 2023 15:24
bfineran pushed a commit that referenced this pull request Nov 16, 2023