[Export][Transformers] Implementation of correctness validation #1935

Conversation

@dbogunowicz (Contributor) commented on Jan 3, 2024

Feature description

Implements the validate_correctness function, which asserts that, given the same input, the outputs of the torch and ONNX models are the same. The function uses a top-k predictions match with respect to the ground truth to assert correctness.
This will sometimes not hold for quantized models, due to the different rounding behavior of torch and ONNX quantization ops.
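
Conceptually, for each exported sample the validation runs the same input through both models and checks that their top-k predictions agree, roughly like the following sketch (a minimal illustration; the model path, the torch_logits/sample_input variables, and k=2 are placeholders, not values taken from this PR):

import numpy
import onnxruntime

# torch_logits: the torch model's output on one exported sample, as a numpy array
# sample_input: dict of numpy arrays keyed by the ONNX graph's input names
session = onnxruntime.InferenceSession("model.onnx", providers=["CPUExecutionProvider"])
onnx_logits = session.run(None, sample_input)[0]

k = 2  # number of top predictions that must agree between the two models
torch_top_k = numpy.argsort(torch_logits.flatten())[-k:]
onnx_top_k = numpy.argsort(onnx_logits.flatten())[-k:]
assert numpy.all(torch_top_k == onnx_top_k)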

Testing

Tests are in place for transformers and image-classification models.
Note: tests for "generative transformers" will also be added once the parallel PR #1938 is approved.

Example

For LLMs:

from sparseml.export.export import export
from huggingface_hub import snapshot_download

hf_model = "roneneldan/TinyStories-1M"
source_path = snapshot_download(hf_model)
target_path = "."
export(
    source_path=source_path,
    target_path=target_path,
    task='text-generation',
    num_export_samples=2,
    validate_correctness=True,
    **dict(
        data_args=dict(
            dataset_name="wikitext", dataset_config_name="wikitext-2-raw-v1"
        )
    ),
)
Fetching 10 files: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 10/10 [00:00<00:00, 29.52it/s]
2024-01-05 20:00:13 sparseml.pytorch.image_classification.utils.helpers WARNING  Model: /home/damian/.cache/huggingface/hub/models--roneneldan--TinyStories-1M/snapshots/8cd14d5339178f1b285f55baee14a0deff7103ac/model.pth not an image classification model: [Errno 2] No such file or directory: '/home/damian/.cache/huggingface/hub/models--roneneldan--TinyStories-1M/snapshots/8cd14d5339178f1b285f55baee14a0deff7103ac/model.pth'
2024-01-05 20:00:13 sparseml.export.export INFO     Starting export for transformers model...
2024-01-05 20:00:13 sparseml.export.export INFO     Creating model for the export...
2024-01-05 20:00:13 sparseml.transformers.integration_helper_functions WARNING  trust_remote_code is set to False. It is possible, that the model will not be loaded correctly.
2024-01-05 20:00:14 sparseml.pytorch.model_load.helpers INFO     Loaded model from /home/damian/.cache/huggingface/hub/models--roneneldan--TinyStories-1M/snapshots/8cd14d5339178f1b285f55baee14a0deff7103ac with 3745984 total params. Of those there are 3609664 prunable params which have 0.0 avg sparsity.
2024-01-05 20:00:15 sparseml.pytorch.model_load.helpers INFO     dense model detected, all sparsification info: {"params_summary": {"total": 3745984, "sparse": 0, "sparsity_percent": 0.0, "prunable": 3609664, "prunable_sparse": 0, "prunable_sparsity_percent": 0.0, "quantizable": 3612736, "quantized": 0, "quantized_percent": 0.0}, "params_info": {"transformer.h.0.attn.attention.k_proj.weight": {"numel": 4096, "sparsity": 0.0, "quantized": false}, "transformer.h.0.attn.attention.v_proj.weight": {"numel": 4096, "sparsity": 0.0, "quantized": false}, "transformer.h.0.attn.attention.q_proj.weight": {"numel": 4096, "sparsity": 0.0, "quantized": false}, "transformer.h.0.attn.attention.out_proj.weight": {"numel": 4096, "sparsity": 0.0, "quantized": false}, "transformer.h.0.mlp.c_fc.weight": {"numel": 16384, "sparsity": 0.0, "quantized": false}, "transformer.h.0.mlp.c_proj.weight": {"numel": 16384, "sparsity": 0.0, "quantized": false}, "transformer.h.1.attn.attention.k_proj.weight": {"numel": 4096, "sparsity": 0.0, "quantized": false}, "transformer.h.1.attn.attention.v_proj.weight": {"numel": 4096, "sparsity": 0.0, "quantized": false}, "transformer.h.1.attn.attention.q_proj.weight": {"numel": 4096, "sparsity": 0.0, "quantized": false}, "transformer.h.1.attn.attention.out_proj.weight": {"numel": 4096, "sparsity": 0.0, "quantized": false}, "transformer.h.1.mlp.c_fc.weight": {"numel": 16384, "sparsity": 0.0, "quantized": false}, "transformer.h.1.mlp.c_proj.weight": {"numel": 16384, "sparsity": 0.0, "quantized": false}, "transformer.h.2.attn.attention.k_proj.weight": {"numel": 4096, "sparsity": 0.0, "quantized": false}, "transformer.h.2.attn.attention.v_proj.weight": {"numel": 4096, "sparsity": 0.0, "quantized": false}, "transformer.h.2.attn.attention.q_proj.weight": {"numel": 4096, "sparsity": 0.0, "quantized": false}, "transformer.h.2.attn.attention.out_proj.weight": {"numel": 4096, "sparsity": 0.0, "quantized": false}, "transformer.h.2.mlp.c_fc.weight": {"numel": 16384, "sparsity": 0.0, "quantized": false}, "transformer.h.2.mlp.c_proj.weight": {"numel": 16384, "sparsity": 0.0, "quantized": false}, "transformer.h.3.attn.attention.k_proj.weight": {"numel": 4096, "sparsity": 0.0, "quantized": false}, "transformer.h.3.attn.attention.v_proj.weight": {"numel": 4096, "sparsity": 0.0, "quantized": false}, "transformer.h.3.attn.attention.q_proj.weight": {"numel": 4096, "sparsity": 0.0, "quantized": false}, "transformer.h.3.attn.attention.out_proj.weight": {"numel": 4096, "sparsity": 0.0, "quantized": false}, "transformer.h.3.mlp.c_fc.weight": {"numel": 16384, "sparsity": 0.0, "quantized": false}, "transformer.h.3.mlp.c_proj.weight": {"numel": 16384, "sparsity": 0.0, "quantized": false}, "transformer.h.4.attn.attention.k_proj.weight": {"numel": 4096, "sparsity": 0.0, "quantized": false}, "transformer.h.4.attn.attention.v_proj.weight": {"numel": 4096, "sparsity": 0.0, "quantized": false}, "transformer.h.4.attn.attention.q_proj.weight": {"numel": 4096, "sparsity": 0.0, "quantized": false}, "transformer.h.4.attn.attention.out_proj.weight": {"numel": 4096, "sparsity": 0.0, "quantized": false}, "transformer.h.4.mlp.c_fc.weight": {"numel": 16384, "sparsity": 0.0, "quantized": false}, "transformer.h.4.mlp.c_proj.weight": {"numel": 16384, "sparsity": 0.0, "quantized": false}, "transformer.h.5.attn.attention.k_proj.weight": {"numel": 4096, "sparsity": 0.0, "quantized": false}, "transformer.h.5.attn.attention.v_proj.weight": {"numel": 4096, "sparsity": 0.0, "quantized": false}, 
"transformer.h.5.attn.attention.q_proj.weight": {"numel": 4096, "sparsity": 0.0, "quantized": false}, "transformer.h.5.attn.attention.out_proj.weight": {"numel": 4096, "sparsity": 0.0, "quantized": false}, "transformer.h.5.mlp.c_fc.weight": {"numel": 16384, "sparsity": 0.0, "quantized": false}, "transformer.h.5.mlp.c_proj.weight": {"numel": 16384, "sparsity": 0.0, "quantized": false}, "transformer.h.6.attn.attention.k_proj.weight": {"numel": 4096, "sparsity": 0.0, "quantized": false}, "transformer.h.6.attn.attention.v_proj.weight": {"numel": 4096, "sparsity": 0.0, "quantized": false}, "transformer.h.6.attn.attention.q_proj.weight": {"numel": 4096, "sparsity": 0.0, "quantized": false}, "transformer.h.6.attn.attention.out_proj.weight": {"numel": 4096, "sparsity": 0.0, "quantized": false}, "transformer.h.6.mlp.c_fc.weight": {"numel": 16384, "sparsity": 0.0, "quantized": false}, "transformer.h.6.mlp.c_proj.weight": {"numel": 16384, "sparsity": 0.0, "quantized": false}, "transformer.h.7.attn.attention.k_proj.weight": {"numel": 4096, "sparsity": 0.0, "quantized": false}, "transformer.h.7.attn.attention.v_proj.weight": {"numel": 4096, "sparsity": 0.0, "quantized": false}, "transformer.h.7.attn.attention.q_proj.weight": {"numel": 4096, "sparsity": 0.0, "quantized": false}, "transformer.h.7.attn.attention.out_proj.weight": {"numel": 4096, "sparsity": 0.0, "quantized": false}, "transformer.h.7.mlp.c_fc.weight": {"numel": 16384, "sparsity": 0.0, "quantized": false}, "transformer.h.7.mlp.c_proj.weight": {"numel": 16384, "sparsity": 0.0, "quantized": false}, "lm_head.weight": {"numel": 3216448, "sparsity": 0.0, "quantized": false}}}
Running tokenizer on dataset: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3760/3760 [00:00<00:00, 4770.75 examples/s]
Adding labels: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3760/3760 [00:00<00:00, 5633.39 examples/s]
2024-01-05 20:00:17 sparseml.core.logger INFO     Logging all SparseML modifier-level logs to sparse_logs/05-01-2024_20.00.17.log
No recipes were applied for /home/damian/.cache/huggingface/hub/models--roneneldan--TinyStories-1M/snapshots/8cd14d5339178f1b285f55baee14a0deff7103ac, check to make sure recipe(s) are stored in the model_path
2024-01-05 20:00:17 sparseml.export.export INFO     Created additional items that will be used for the export: ['trainer', 'tokenizer', 'input_names']
2024-01-05 20:00:17 sparseml.export.export INFO     Exporting model.onnx to ....
/nm/drive0/damian/sparseml/src/sparseml/pytorch/torch_to_onnx_exporter.py:132: UserWarning: Sample inputs passed into the ONNX exporter should be in the same order defined in the model forward function. Consider using OrderedDict for this purpose.
  warnings.warn(
/nm/drive0/damian/sparseml/venv/lib/python3.10/site-packages/transformers/models/gpt_neo/modeling_gpt_neo.py:557: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if batch_size <= 0:
/nm/drive0/damian/sparseml/venv/lib/python3.10/site-packages/transformers/models/gpt_neo/modeling_gpt_neo.py:196: TracerWarning: torch.tensor results are registered as constants in the trace. You can safely ignore this warning if you use this function to create tensors out of constant variables that would be the same every time you call this function. In any other case, this might cause the trace to be incorrect.
  mask_value = torch.tensor(mask_value, dtype=attn_weights.dtype).to(attn_weights.device)
2024-01-05 20:00:36 sparseml.exporters.transforms.onnx_transform INFO     [FoldIdentityInitializers] Transformed 0 matches
2024-01-05 20:00:36 sparseml.exporters.transforms.onnx_transform INFO     [FlattenQParams] Transformed 0 matches
2024-01-05 20:00:36 sparseml.exporters.transforms.onnx_transform INFO     [UnwrapBatchNorms] Transformed 0 matches
2024-01-05 20:00:36 sparseml.exporters.transforms.onnx_transform INFO     [DeleteTrivialOnnxAdds] Transformed 0 matches
2024-01-05 20:00:36 sparseml.exporters.transforms.onnx_transform INFO     [ConstantsToInitializers] Transformed 411 matches
2024-01-05 20:00:36 sparseml.exporters.transforms.onnx_transform INFO     [FoldIdentityInitializers] Transformed 0 matches
2024-01-05 20:00:36 sparseml.exporters.transforms.onnx_transform INFO     [InitializersToUint8] Transformed 0 matches
2024-01-05 20:00:36 sparseml.exporters.transforms.onnx_transform INFO     [FlattenQParams] Transformed 0 matches
2024-01-05 20:00:36 sparseml.exporters.transforms.onnx_transform INFO     [FoldConvDivBn] Transformed 0 matches
2024-01-05 20:00:36 sparseml.exporters.transforms.onnx_transform INFO     [DeleteRepeatedQdq] Transformed 0 matches
2024-01-05 20:00:36 sparseml.exporters.transforms.onnx_transform INFO     [QuantizeQATEmbedding] Transformed 0 matches
2024-01-05 20:00:37 sparseml.exporters.transforms.onnx_transform INFO     [PropagateEmbeddingQuantization] Transformed 0 matches
2024-01-05 20:00:37 sparseml.exporters.transforms.onnx_transform INFO     [PropagateDequantThroughSplit] Transformed 0 matches
2024-01-05 20:00:37 sparseml.exporters.transforms.onnx_transform INFO     [MatMulAddToMatMulIntegerAddCastMul] Transformed 0 matches
2024-01-05 20:00:37 sparseml.exporters.transforms.onnx_transform INFO     [MatMulToMatMulIntegerCastMul] Transformed 0 matches
2024-01-05 20:00:37 sparseml.exporters.transforms.onnx_transform INFO     [FoldReLUQuants] Transformed 0 matches
2024-01-05 20:00:37 sparseml.exporters.transforms.onnx_transform INFO     [ConvToConvIntegerAddCastMul] Transformed 0 matches
2024-01-05 20:00:37 sparseml.exporters.transforms.onnx_transform INFO     [GemmToQLinearMatMul] Transformed 0 matches
2024-01-05 20:00:37 sparseml.exporters.transforms.onnx_transform INFO     [GemmToMatMulIntegerAddCastMul] Transformed 0 matches
2024-01-05 20:00:37 sparseml.exporters.transforms.onnx_transform INFO     [QuantizeResiduals] Transformed 0 matches
2024-01-05 20:00:37 sparseml.exporters.transforms.onnx_transform INFO     [RemoveDuplicateQConvWeights] Transformed 0 matches
2024-01-05 20:00:37 sparseml.exporters.transforms.onnx_transform INFO     [RemoveDuplicateQuantizeOps] Transformed 0 matches
2024-01-05 20:00:37 sparseml.export.export INFO     Successfully exported model.onnx to ./model.onnx...
2024-01-05 20:00:37 sparseml.export.export INFO     Exporting 2 samples...
2it [00:12,  6.38s/it]
2024-01-05 20:00:50 sparseml.export.export_data INFO     Exporting sample-inputs to ....
2024-01-05 20:00:50 sparseml.export.export_data INFO     Successfully exported sample-inputs to .!
2024-01-05 20:00:50 sparseml.export.export_data INFO     Exporting sample-outputs to ....
2024-01-05 20:00:57 sparseml.export.export_data INFO     Successfully exported sample-outputs to .!
2024-01-05 20:00:57 sparseml.export.export INFO     Creating deployment folder deployment at directory: ....
2024-01-05 20:00:57 sparseml.export.helpers WARNING  Optional file tokenizer.model not found in source path /home/damian/.cache/huggingface/hub/models--roneneldan--TinyStories-1M/snapshots/8cd14d5339178f1b285f55baee14a0deff7103ac
2024-01-05 20:00:57 sparseml.export.export INFO     Validating model structure...
2024-01-05 20:00:57 sparseml.export.validators WARNING  File ./deployment/tokenizer.model is missing.
2024-01-05 20:00:57 sparseml.export.validators WARNING  File ./sample-labels is missing.
2024-01-05 20:00:57 sparseml.export.export INFO     Validating model correctness...
2024-01-05 20:01:11 sparseml.export.validators INFO     Successfully validated the exported model on all 2 samples.
2024-01-05 20:01:11 sparseml.export.export INFO     Applying optimizations: all to the exported model...
2024-01-05 20:01:11 sparseml.export.helpers INFO     Attempting to apply optimization: kv_cache_injection...
2024-01-05 20:01:11 sparseml.exporters.transforms.kv_cache.configs INFO     Loaded config file deployment/config.json for model: gpt_neo
2024-01-05 20:01:11 sparseml.exporters.transforms.kv_cache.configs INFO     Properly configured arguments for KV Cache Transformation
2024-01-05 20:01:11 sparseml.exporters.transforms.onnx_transform INFO     [CacheKeysAndValues] Transformed 16 matches
2024-01-05 20:01:11 sparseml.exporters.transforms.kv_cache.transforms_base INFO     Inserted positions input to the ONNX model
2024-01-05 20:01:11 sparseml.exporters.transforms.kv_cache.transforms_base INFO     Inserted causal_mask input to the ONNX model
2024-01-05 20:01:11 sparseml.exporters.transforms.kv_cache.transforms_base INFO     Successfully swapped 1 nodes for input 'positions'
2024-01-05 20:01:11 sparseml.exporters.transforms.kv_cache.transforms_base INFO     Successfully swapped 8 nodes for input 'causal_mask'
2024-01-05 20:01:11 sparseml.exporters.transforms.kv_cache.transforms_codegen INFO     Successfully adjusted the causal_mask input
2024-01-05 20:01:11 sparseml.exporters.transforms.onnx_transform INFO     [AdditionalTransformsCodeGen] Transformed 10 matches
2024-01-05 20:01:11 sparseml.export.helpers INFO     Optimization: kv_cache_injection has been successfully applied to the ONNX model: ./deployment/model.onnx
2024-01-05 20:01:11 sparseml.export.export INFO     Successfully exported model from:
.
to
./deployment
for integration: transformers

@dbogunowicz changed the base branch from main to feature/damian/feature_branch_export on January 3, 2024 11:37
@@ -45,7 +45,6 @@ def export(
opset: int = TORCH_DEFAULT_ONNX_OPSET,
single_graph_file: bool = True,
num_export_samples: int = 0,
batch_size: int = 1,
@dbogunowicz (Contributor Author) commented on Jan 3, 2024

Removing the batch_size argument from the export.
It does not matter for the model export.
It also does not matter for the sample export (by convention, all our sample inputs/outputs/labels are stored as "batchless" arrays, e.g. inp-0000.npz has shape (3, 244, 244)).
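
For illustration, the "batchless" convention means a stored sample carries no leading batch dimension. A minimal sketch (the file name follows the inp-0000.npz example above; the array key is an assumption, not necessarily the exporter's exact naming):

import numpy

# a single sample stored without a leading batch dimension:
# shape (3, 244, 244) rather than (1, 3, 244, 244)
sample = numpy.random.rand(3, 244, 244).astype(numpy.float32)
numpy.savez("inp-0000.npz", input=sample)

loaded = numpy.load("inp-0000.npz")["input"]
assert loaded.shape == (3, 244, 244)  # batching, if needed, is applied downstream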

@dbogunowicz force-pushed the feature/damian/validate_correctness_finish branch from 3936104 to 5fe442c on January 3, 2024 12:52
…port' into feature/damian/validate_correctness_finish
@dbogunowicz marked this pull request as ready for review on January 4, 2024 11:21
@dbogunowicz changed the title from "Feature/damian/validate correctness finish" to "[Export][Transformers] Implementation of correctness validation" on Jan 4, 2024
top_k_ground_truth = numpy.argsort(ground_truth.flatten())[-k:]
return numpy.all(top_k_prediction == top_k_ground_truth)


def validate_correctness(
@dbogunowicz (Contributor Author) commented:

@bfineran this could in the future be moved to integration_helper_functions, but top_k_match feels like the right validation metric for all our use cases so far (to the best of my knowledge).
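
For context, a complete version of the top_k_match helper excerpted above could look roughly like this (a sketch reconstructed from the diff excerpt; the exact signature and defaults in the PR may differ):

import numpy

def top_k_match(ground_truth, prediction, k: int = 2):
    # passes if the indices of the k largest values agree between the reference
    # (torch) output and the exported (ONNX) output
    top_k_prediction = numpy.argsort(prediction.flatten())[-k:]
    top_k_ground_truth = numpy.argsort(ground_truth.flatten())[-k:]
    return numpy.all(top_k_prediction == top_k_ground_truth)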

Comment on lines +114 to +119
def test_export_validate_correctness(self, caplog, setup):
    if self.is_model_quantized:
        pytest.skip(
            "Skipping since quantized models may not pass this test "
            "due to differences in rounding between quant ops in PyTorch and ONNX"
        )
A reviewer (Contributor) commented:

Is there an expected error range here that we could check for rather than skipping entirely?
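
For illustration, the kind of bounded-error check the reviewer is suggesting might look roughly like this (a sketch; the tolerance values are placeholders, not numbers from the PR):

import numpy

def outputs_within_tolerance(torch_output, onnx_output, atol=1e-3, rtol=1e-2):
    # instead of skipping quantized models outright, allow a bounded numerical
    # difference between the PyTorch and ONNX outputs
    return numpy.allclose(torch_output, onnx_output, atol=atol, rtol=rtol)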

…o feature/damian/validate_correctness_finish
@dbogunowicz changed the base branch from feature/damian/feature_branch_export to feature/damian/samples_llms on January 5, 2024 16:39
outputs = outputs[0]
# outputs_ contains (logits, scores)
outputs = OrderedDict(logits=outputs[0], scores=outputs[1])
if len(inputs.size()) == 4:
A reviewer (Member) commented:

let's add a comment that this is IC specific
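
For illustration, the requested in-code comment could read something like this (a sketch, not the exact code merged in the PR):

# image-classification (IC) specific: IC sample inputs are 4-D tensors of shape
# (batch, channels, height, width), so a 4-D input signals the IC handling path
if len(inputs.size()) == 4:
    ...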

@dbogunowicz merged commit e0c1068 into feature/damian/samples_llms on Jan 5, 2024
@dbogunowicz deleted the feature/damian/validate_correctness_finish branch on January 5, 2024 20:12
dbogunowicz added a commit that referenced this pull request Jan 5, 2024
* add suport for past_key_values in sample-outputs

* [Export][Transformers] Implementation of correctness validation (#1935)

* fix tests with help from sara

* Update src/sparseml/transformers/utils/initializers.py

* swap sparsezoo validator for custom one (top k match)

* add more informative error message

* add correctness validation for LLMs

* remove past_key_values from outputs

* remove past_key_values from outputs (2)

* small note comment for the future
bfineran added a commit that referenced this pull request Jan 10, 2024
* initial commit

* respond to PR comments

* [Export Refactor][Image Classification] `create_model` function (#1878)

* initial commit

* looking good, time to cleanup

* Delete src/sparseml/export/helpers.py

* Delete tests/sparseml/export/test_helpers.py

* ready for review

* improve design

* tests pass

* reuse _validate_dataset_num_classes

* [Export Refactor][Image Classification] `create_dummy_input` function (#1880)

* initial commit

* looking good, time to cleanup

* Delete src/sparseml/export/helpers.py

* Delete tests/sparseml/export/test_helpers.py

* ready for review

* improve design

* tests pass

* reuse _validate_dataset_num_classes

* initial commit

* Update src/sparseml/pytorch/image_classification/integration_helper_functions.py

* Update src/sparseml/pytorch/image_classification/integration_helper_functions.py

* ready for review

* Update src/sparseml/export/export.py

* Update src/sparseml/integration_helper_functions.py

* [Export Refactor][Image Classification] `export_model` function (#1883)

* initial commit

* looking good, time to cleanup

* Delete src/sparseml/export/helpers.py

* Delete tests/sparseml/export/test_helpers.py

* ready for review

* improve design

* tests pass

* reuse _validate_dataset_num_classes

* initial commit

* Update src/sparseml/pytorch/image_classification/integration_helper_functions.py

* Update src/sparseml/pytorch/image_classification/integration_helper_functions.py

* ready for review

* Update src/sparseml/export/export.py

* Update src/sparseml/integration_helper_functions.py

* initial commit

* fixes

* ready for review

* nit

* add return

* make export function more general

* [Export Refactor][Image Classification] `apply_optimizations` function (#1884)

* initial commit

* looking good, time to cleanup

* Delete src/sparseml/export/helpers.py

* Delete tests/sparseml/export/test_helpers.py

* ready for review

* improve design

* tests pass

* reuse _validate_dataset_num_classes

* initial commit

* Update src/sparseml/pytorch/image_classification/integration_helper_functions.py

* Update src/sparseml/pytorch/image_classification/integration_helper_functions.py

* ready for review

* Update src/sparseml/export/export.py

* Update src/sparseml/integration_helper_functions.py

* initial commit

* fixes

* ready for review

* nit

* add return

* initial commit

* [Export Refactor][Image Classification] `export_sample_inputs_outputs` function (#1888)

* initial commit

* looking good, time to cleanup

* Delete src/sparseml/export/helpers.py

* Delete tests/sparseml/export/test_helpers.py

* ready for review

* improve design

* tests pass

* reuse _validate_dataset_num_classes

* initial commit

* Update src/sparseml/pytorch/image_classification/integration_helper_functions.py

* Update src/sparseml/pytorch/image_classification/integration_helper_functions.py

* ready for review

* Update src/sparseml/export/export.py

* Update src/sparseml/integration_helper_functions.py

* initial commit

* fixes

* ready for review

* nit

* add return

* initial commit

* initial commit

* PR comments

* beautification

* remove duplicated function

* [Export Refactor][Image Classification] `create_deployment_folder` function (#1889)

* initial commit

* looking good, time to cleanup

* Delete src/sparseml/export/helpers.py

* Delete tests/sparseml/export/test_helpers.py

* ready for review

* improve design

* tests pass

* reuse _validate_dataset_num_classes

* initial commit

* Update src/sparseml/pytorch/image_classification/integration_helper_functions.py

* Update src/sparseml/pytorch/image_classification/integration_helper_functions.py

* ready for review

* Update src/sparseml/export/export.py

* Update src/sparseml/integration_helper_functions.py

* initial commit

* fixes

* ready for review

* nit

* add return

* initial commit

* initial commit

* initial commit

* fix rebase, tests_work

* ready to push

* [Export Refactor][Image Classification] `validate_correctness` function (#1890)

* initial commit

* looking good, time to cleanup

* Delete src/sparseml/export/helpers.py

* Delete tests/sparseml/export/test_helpers.py

* ready for review

* improve design

* tests pass

* reuse _validate_dataset_num_classes

* initial commit

* Update src/sparseml/pytorch/image_classification/integration_helper_functions.py

* Update src/sparseml/pytorch/image_classification/integration_helper_functions.py

* ready for review

* Update src/sparseml/export/export.py

* Update src/sparseml/integration_helper_functions.py

* initial commit

* fixes

* ready for review

* nit

* add return

* initial commit

* initial commit

* initial commit

* initial commit

* Delete tests/sparseml/test_integration_helper_functions.py

* ready to merge

* [Export Refactor] End to end testing (#1898)

* initial commit

* looking good, time to cleanup

* Delete src/sparseml/export/helpers.py

* Delete tests/sparseml/export/test_helpers.py

* ready for review

* improve design

* tests pass

* reuse _validate_dataset_num_classes

* initial commit

* Update src/sparseml/pytorch/image_classification/integration_helper_functions.py

* Update src/sparseml/pytorch/image_classification/integration_helper_functions.py

* ready for review

* Update src/sparseml/export/export.py

* Update src/sparseml/integration_helper_functions.py

* initial commit

* fixes

* ready for review

* nit

* add return

* initial commit

* initial commit

* initial commit

* initial commit

* Delete tests/sparseml/test_integration_helper_functions.py

* ready to merge

* add structure validator

* ready for review

* Delete tests/sparseml/export/model.onnx

* Delete tests/sparseml/export/image_classification/model.onnx

* Delete tests/sparseml/export/image_classification/conftest.py

* PR comments

* remove onnx

* [Export Refactor] Prepare the module to be more general (before including `transformers`) (#1908)

* adapt the export script to handle transformers

* Update src/sparseml/pytorch/image_classification/integration_helper_functions.py

* Delete tests/sparseml/export/transformers/__init__.py

* Delete tests/sparseml/export/transformers/test_generative_transformers.py

* Delete tests/sparseml/export/transformers/test_transformers.py

* Update src/sparseml/export/export.py

Co-authored-by: Benjamin Fineran <bfineran@users.noreply.github.com>

* addressing review comments

* [Export Refactor] Export `transformers` (#1909)

* cleanup

* Delete src/sparseml/transformers/integration_helper_functions_generative.py

* Delete src/sparseml/transformers/utils/optimizations.py

* Delete tests/sparseml/export/transformers/test_generative_transformers.py

* Delete tests/sparseml/transformers/test_integration_helper_functions_generative.py

* addressing PR reviews

* [Export Refactor] Export generative transformers(#1910)

* make tests green, remove using task to resolve the integration type

* fix all the tests after the merge, make integration resolution independent of the task name

* fold generative transformers into transformer helper functions

* complete tests for export_data.py

* Update src/sparseml/export/export.py

* add tests that confirms that kv cache injection has been added

* move applying optimizations into integration helper functions

---------

Co-authored-by: Benjamin Fineran <bfineran@users.noreply.github.com>

* [Export Refactor][Transformers] Enable loading SparseModels (#1921)

* initial commit

* adressing review comments

* Fix the tests

* fix tests with help from sara

* [Export][Transformers] Enable loading `text-generation` datasets (#1938)

* add suport for past_key_values in sample-outputs

* [Export][Transformers] Implementation of correctness validation (#1935)

* fix tests with help from sara

* Update src/sparseml/transformers/utils/initializers.py

* swap sparsezoo validator for custom one (top k match)

* add more informative error message

* add correctness validation for LLMs

* remove past_key_values from outputs

* remove past_key_values from outputs (2)

* small note comment for the future

* tests fixed

* fix test

* [Export refactor] final manual testing fixes (#1948)

* [Export refactor] final manual testing fixes

* review

---------

Co-authored-by: Benjamin Fineran <bfineran@users.noreply.github.com>