Fix for NaNs in Smooth Quant #1872

Merged: 4 commits merged into main from smooth_nan_fix on Dec 1, 2023
Conversation

@Satrat Satrat (Contributor) commented Dec 1, 2023

Issue first noticed on teknium/OpenHermes-2.5-Mistral-7B. When calculating the activation scales, it's possible to get a scale of 0, which produces NaN weights that error out when running the forward pass during quantization calibration.

The fix is to set a minimum scale of 1e-5 to avoid a divide by 0.
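For illustration, a minimal sketch (not the actual diff) of the kind of guard described above: clamping the per-channel scales to a small epsilon before the smoothing division keeps the resulting weights finite. The function and argument names below are placeholders, not sparseml's API.

import torch

# Hypothetical sketch of the guard: clamp observed scales to a small epsilon
# so the smoothing division can never produce inf/NaN values.
MIN_SCALE = 1e-5  # minimum scale from the PR description

def compute_smoothing_factors(activation_scales: torch.Tensor,
                              weight_scales: torch.Tensor,
                              alpha: float = 0.5) -> torch.Tensor:
    # Guard channels whose observed range is exactly zero
    activation_scales = torch.clamp(activation_scales, min=MIN_SCALE)
    weight_scales = torch.clamp(weight_scales, min=MIN_SCALE)
    # SmoothQuant-style balancing: s = act^alpha / weight^(1 - alpha)
    return activation_scales.pow(alpha) / weight_scales.pow(1.0 - alpha)

acts = torch.tensor([3.0, 0.0, 1.5])   # one channel never activated during calibration
wts = torch.tensor([0.8, 0.2, 0.0])
print(compute_smoothing_factors(acts, wts))  # finite everywhere thanks to the clamp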

This PR also adds a seqlen argument to the OBCQ script; with the maximum sequence length, Mistral was running out of memory during the perplexity eval.

See the Slack thread for more info on the bug: https://neuralmagic.slack.com/archives/C04SRPGT5MW/p1700515011493959

Testing

src/sparseml/transformers/sparsification/obcq/obcq.py teknium/OpenHermes-2.5-Mistral-7B open_platypus --recipe recipe_mistral.yaml --precision float16 --seqlen 512 --eval wikitext2

Runs to completion now; previously it failed with:

assert min_val <= max_val, "min {} should be less than max {}".format(
AssertionError: min nan should be less than max nan
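For context, a standalone illustration (plain PyTorch, not sparseml code) of how a zero activation scale leads to non-finite weights and ultimately to the NaN min/max statistics in the assertion above:

import torch

# A zero per-channel scale means the balanced layer's weights get divided by 0,
# producing inf/nan entries that propagate NaN through the calibration forward
# pass and break the observer's "min <= max" check.
layernorm_weight = torch.tensor([1.0, 0.0, 2.0])  # a zero weight entry is common
scales = torch.tensor([0.5, 0.0, 0.0])            # one channel saw no activation range

balanced = layernorm_weight / scales
print(balanced)                                   # tensor([2., nan, inf]): 0/0 is nan, 2/0 is inf
print(torch.isnan(balanced).any(), torch.isinf(balanced).any())  # True True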

recipe_mistral.yaml

test_stage:
  obcq_modifiers:
    LogarithmicEqualizationModifier:
      mappings: [
        [["re:.*q_proj", "re:.*k_proj", "re:.*v_proj"], "re:.*input_layernorm"],
        [["re:.*gate_proj", "re:.*up_proj"], "re:.*post_attention_layernorm"]
      ]
    QuantizationModifier:
      ignore:
        # These operations don't make sense to quantize
        - MistralRotaryEmbedding
        - MistralRMSNorm
        - SiLUActivation
        # Skip quantizing the BMMs
        # - QuantizableMatMul
        # Skip quantizing the layers with the most sensitive activations
        - model.layers.1.mlp.down_proj
        - model.layers.31.mlp.down_proj
        - model.layers.30.mlp.down_proj
        - model.layers.30.mlp.gate_proj
        - model.layers.30.mlp.up_proj
      post_oneshot_calibration: true
      scheme_overrides:
        Embedding:
          input_activations: null
          weights:
            num_bits: 8
            symmetric: false
    SparseGPTModifier:
      sparsity: 0.5
      block_size: 128
      sequential_update: true
      quantize: true
      percdamp: 0.01
      mask_structure: "0:0"
      targets: ["re:model.layers.\\d*$"]

Perplexity results:

2023-12-01 16:07:27 sparseml.modifiers.obcq.utils.helpers INFO     Evaluating perplexity...
2023-12-01 16:07:34 sparseml.modifiers.obcq.utils.helpers INFO     tensor(16.5364, device='cuda:4')
2023-12-01 16:07:41 sparseml.modifiers.obcq.utils.helpers INFO     tensor(19.9614, device='cuda:4')
2023-12-01 16:07:49 sparseml.modifiers.obcq.utils.helpers INFO     tensor(17.2977, device='cuda:4')
2023-12-01 16:07:56 sparseml.modifiers.obcq.utils.helpers INFO     tensor(14.8696, device='cuda:4')
2023-12-01 16:08:04 sparseml.modifiers.obcq.utils.helpers INFO     tensor(15.0391, device='cuda:4')
2023-12-01 16:08:11 sparseml.modifiers.obcq.utils.helpers INFO     tensor(15.0188, device='cuda:4')

@Satrat Satrat marked this pull request as ready for review December 1, 2023 15:05
@Satrat Satrat requested a review from anmarques December 1, 2023 16:15
@mgoin mgoin merged commit c722fc3 into main Dec 1, 2023
12 checks passed
@mgoin mgoin deleted the smooth_nan_fix branch December 1, 2023 18:22