[Cherry Pick 1.5.4] Fix Index error with nm-transformers 1.5.1 upgrade #1708

Merged: 4 commits into release/1.5 from index-error-cp on Aug 21, 2023

Conversation

rahul-tuli (Member) commented on Aug 21, 2023

During the 1.5.1 upgrade of nm-transformers, the corresponding changes on the sparseml side were not cherry-picked to the release/1.5 branch; this PR is a minimal version of those changes, needed for the latest sparseml ~1.5 wheels to work with the latest nm-transformers ~1.5 wheels.

The original commit that was missed in the cherry-pick: 4ec5133

Test command:

#!/usr/bin/env bash

# Exit on error (-e), treat unset variables as errors (-u), disable globbing (-f), and fail on errors in piped commands (pipefail)

set -euf -o pipefail

export SPARSEZOO_TEST_MODE="true"
export NM_BIND_THREADS_TO_CORES=1
export NM_DISABLE_ANALYTICS=1

sparseml.transformers.train.text_classification \
    --output_dir sparse_quantized_bert-text_classification_sst2 \
    --model_name_or_path "zoo:nlp/sentiment_analysis/obert-base/pytorch/huggingface/sst2/pruned90_quant-none" \
    --task_name sst2 --max_seq_length 128 --per_device_train_batch_size 32 --per_device_eval_batch_size 32 --preprocessing_num_workers 6 \
    --do_eval 2>&1 | tee test-final.log

Error before this PR:

2023-08-18 13:52:43 sparseml.pytorch.utils.logger INFO     Logging all SparseML modifier-level logs to sparse_logs/18-08-2023_13.52.43.log
2023-08-18 13:52:43 sparseml.transformers.sparsification.trainer INFO     Loaded 1 SparseML checkpoint recipe stage(s) from /home/rahul/.cache/sparsezoo/1cd4de14-8c1a-471a-860f-213ed8d9ed54/training/recipe.yaml to replicate model sparse state
2023-08-18 13:52:45 sparseml.transformers.sparsification.trainer INFO     Applied structure from 1 previous recipe stage(s) to model and finalized (recipes saved with model_path)
Traceback (most recent call last):
  File "/home/rahul/projects/.venv/bin/sparseml.transformers.train.text_classification", line 8, in <module>
    sys.exit(main())
  File "/home/rahul/projects/.venv/lib/python3.8/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 346, in wrapper
    return f(*args, **kwargs)
  File "/home/rahul/projects/.venv/lib/python3.8/site-packages/sparseml/transformers/text_classification.py", line 575, in main
    metrics = trainer.evaluate(eval_dataset=eval_dataset)
  File "/home/rahul/projects/.venv/lib/python3.8/site-packages/sparseml/transformers/sparsification/trainer.py", line 868, in evaluate
    applied = self.apply_manager(epoch=math.inf, checkpoint=None)
  File "/home/rahul/projects/.venv/lib/python3.8/site-packages/sparseml/transformers/sparsification/trainer.py", line 221, in apply_manager
    self._reload_model_state(load_path, orig_state_dict)
  File "/home/rahul/projects/.venv/lib/python3.8/site-packages/sparseml/transformers/sparsification/trainer.py", line 676, in _reload_model_state
    _, missing, unexpected, _, _ = self.model._load_pretrained_model(
  File "/home/rahul/projects/.venv/lib/python3.8/site-packages/transformers/modeling_utils.py", line 3150, in _load_pretrained_model
    folder = os.path.sep.join(resolved_archive_file[0].split(os.path.sep)[:-1])
IndexError: list index out of range
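For context, a minimal reproduction of the failing pattern (a sketch of the same lookup transformers performs at the line above, not the library code itself): with resolved_archive_file left as an empty list, the first-element access fails before the path join runs.

import os

# Old sparseml release/1.5 behavior: resolved_archive_file defaulted to []
resolved_archive_file = []

# transformers' _load_pretrained_model derives the checkpoint folder from
# the first resolved archive file; on an empty list this raises IndexError
folder = os.path.sep.join(resolved_archive_file[0].split(os.path.sep)[:-1])
# IndexError: list index out of range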

After this PR, together with the nm-transformers fix referenced above:

2023-08-21 13:07:15 sparseml.pytorch.utils.logger INFO     Logging all SparseML modifier-level logs to sparse_logs/21-08-2023_13.07.15.log
2023-08-21 13:07:15 sparseml.transformers.sparsification.trainer INFO     Loaded 1 SparseML checkpoint recipe stage(s) from /home/rahul/.cache/sparsezoo/1cd4de14-8c1a-471a-860f-213ed8d9ed54/training/recipe.yaml to replicate model sparse state
2023-08-21 13:07:17 sparseml.transformers.sparsification.trainer INFO     Applied structure from 1 previous recipe stage(s) to model and finalized (recipes saved with model_path)
2023-08-21 13:07:18 sparseml.transformers.sparsification.trainer INFO     Reloaded 1784 model params for SparseML Recipe from /home/rahul/.cache/sparsezoo/1cd4de14-8c1a-471a-860f-213ed8d9ed54/training
2023-08-21 13:07:18 sparseml.transformers.utils.model INFO     Loaded model from /home/rahul/.cache/sparsezoo/1cd4de14-8c1a-471a-860f-213ed8d9ed54/training with 109483778 total params. Of those there are 85526016 prunable params which have 89.3777046740959 avg sparsity.
2023-08-21 13:07:18 sparseml.transformers.utils.model INFO     sparse model detected, all sparsification info: {"params_summary": {"total": 109483778, "sparse": 76441190, "sparsity_percent": 69.819649446149, "prunable": 85526016, "prunable_sparse": 76441190, "prunable_sparsity_percent": 89.3777046740959, "quantizable": 85609730, "quantized": 85609730, "quantized_percent": 100.0}, "params_info": {"bert.encoder.layer.0.attention.self.query.module.weight": {"numel": 589824, "sparsity": 0.8644917607307434, "quantized": true}, "bert.encoder.layer.0.attention.self.key.module.weight": {"numel": 589824, "sparsity": 0.8680216670036316, "quantized": true}, "bert.encoder.layer.0.attention.self.value.module.weight": {"numel": 589824, "sparsity": 0.9312151074409485, "quantized": true}, "bert.encoder.layer.0.attention.output.dense.module.weight": {"numel": 589824, "sparsity": 0.9232262372970581, "quantized": true}, "bert.encoder.layer.0.intermediate.dense.module.weight": {"numel": 2359296, "sparsity": 0.9153103232383728, "quantized": true}, "bert.encoder.layer.0.output.dense.module.weight": {"numel": 2359296, "sparsity": 0.9356380105018616, "quantized": true}, "bert.encoder.layer.1.attention.self.query.module.weight": {"numel": 589824, "sparsity": 0.8620554804801941, "quantized": true}, "bert.encoder.layer.1.attention.self.key.module.weight": {"numel": 589824, "sparsity": 0.8625064492225647, "quantized": true}, "bert.encoder.layer.1.attention.self.value.module.weight": {"numel": 589824, "sparsity": 0.9279242753982544, "quantized": true}, "bert.encoder.layer.1.attention.output.dense.module.weight": {"numel": 589824, "sparsity": 0.9245097041130066, "quantized": true}, "bert.encoder.layer.1.intermediate.dense.module.weight": {"numel": 2359296, "sparsity": 0.8945091962814331, "quantized": true}, "bert.encoder.layer.1.output.dense.module.weight": {"numel": 2359296, "sparsity": 0.9265751242637634, "quantized": true}, "bert.encoder.layer.2.attention.self.query.module.weight": {"numel": 589824, "sparsity": 0.8451063632965088, "quantized": true}, "bert.encoder.layer.2.attention.self.key.module.weight": {"numel": 589824, "sparsity": 0.8532799482345581, "quantized": true}, "bert.encoder.layer.2.attention.self.value.module.weight": {"numel": 589824, "sparsity": 0.9295671582221985, "quantized": true}, "bert.encoder.layer.2.attention.output.dense.module.weight": {"numel": 589824, "sparsity": 0.9288228154182434, "quantized": true}, "bert.encoder.layer.2.intermediate.dense.module.weight": {"numel": 2359296, "sparsity": 0.8895581364631653, "quantized": true}, "bert.encoder.layer.2.output.dense.module.weight": {"numel": 2359296, "sparsity": 0.9237624406814575, "quantized": true}, "bert.encoder.layer.3.attention.self.query.module.weight": {"numel": 589824, "sparsity": 0.871110737323761, "quantized": true}, "bert.encoder.layer.3.attention.self.key.module.weight": {"numel": 589824, "sparsity": 0.8704121708869934, "quantized": true}, "bert.encoder.layer.3.attention.self.value.module.weight": {"numel": 589824, "sparsity": 0.9085676670074463, "quantized": true}, "bert.encoder.layer.3.attention.output.dense.module.weight": {"numel": 589824, "sparsity": 0.9130028486251831, "quantized": true}, "bert.encoder.layer.3.intermediate.dense.module.weight": {"numel": 2359296, "sparsity": 0.8868213295936584, "quantized": true}, "bert.encoder.layer.3.output.dense.module.weight": {"numel": 2359296, "sparsity": 0.9209082126617432, "quantized": true}, "bert.encoder.layer.4.attention.self.query.module.weight": {"numel": 589824, "sparsity": 
0.8635711669921875, "quantized": true}, "bert.encoder.layer.4.attention.self.key.module.weight": {"numel": 589824, "sparsity": 0.8665059208869934, "quantized": true}, "bert.encoder.layer.4.attention.self.value.module.weight": {"numel": 589824, "sparsity": 0.8824039101600647, "quantized": true}, "bert.encoder.layer.4.attention.output.dense.module.weight": {"numel": 589824, "sparsity": 0.8957400918006897, "quantized": true}, "bert.encoder.layer.4.intermediate.dense.module.weight": {"numel": 2359296, "sparsity": 0.8822059631347656, "quantized": true}, "bert.encoder.layer.4.output.dense.module.weight": {"numel": 2359296, "sparsity": 0.9172935485839844, "quantized": true}, "bert.encoder.layer.5.attention.self.query.module.weight": {"numel": 589824, "sparsity": 0.868516743183136, "quantized": true}, "bert.encoder.layer.5.attention.self.key.module.weight": {"numel": 589824, "sparsity": 0.8675944209098816, "quantized": true}, "bert.encoder.layer.5.attention.self.value.module.weight": {"numel": 589824, "sparsity": 0.8843333125114441, "quantized": true}, "bert.encoder.layer.5.attention.output.dense.module.weight": {"numel": 589824, "sparsity": 0.8958757519721985, "quantized": true}, "bert.encoder.layer.5.intermediate.dense.module.weight": {"numel": 2359296, "sparsity": 0.8838331699371338, "quantized": true}, "bert.encoder.layer.5.output.dense.module.weight": {"numel": 2359296, "sparsity": 0.91917884349823, "quantized": true}, "bert.encoder.layer.6.attention.self.query.module.weight": {"numel": 589824, "sparsity": 0.8699256181716919, "quantized": true}, "bert.encoder.layer.6.attention.self.key.module.weight": {"numel": 589824, "sparsity": 0.8717482089996338, "quantized": true}, "bert.encoder.layer.6.attention.self.value.module.weight": {"numel": 589824, "sparsity": 0.892473042011261, "quantized": true}, "bert.encoder.layer.6.attention.output.dense.module.weight": {"numel": 589824, "sparsity": 0.9078572392463684, "quantized": true}, "bert.encoder.layer.6.intermediate.dense.module.weight": {"numel": 2359296, "sparsity": 0.8827946782112122, "quantized": true}, "bert.encoder.layer.6.output.dense.module.weight": {"numel": 2359296, "sparsity": 0.9222526550292969, "quantized": true}, "bert.encoder.layer.7.attention.self.query.module.weight": {"numel": 589824, "sparsity": 0.8792538046836853, "quantized": true}, "bert.encoder.layer.7.attention.self.key.module.weight": {"numel": 589824, "sparsity": 0.8780840039253235, "quantized": true}, "bert.encoder.layer.7.attention.self.value.module.weight": {"numel": 589824, "sparsity": 0.8773871660232544, "quantized": true}, "bert.encoder.layer.7.attention.output.dense.module.weight": {"numel": 589824, "sparsity": 0.8886498212814331, "quantized": true}, "bert.encoder.layer.7.intermediate.dense.module.weight": {"numel": 2359296, "sparsity": 0.8991622924804688, "quantized": true}, "bert.encoder.layer.7.output.dense.module.weight": {"numel": 2359296, "sparsity": 0.9271066784858704, "quantized": true}, "bert.encoder.layer.8.attention.self.query.module.weight": {"numel": 589824, "sparsity": 0.8583950400352478, "quantized": true}, "bert.encoder.layer.8.attention.self.key.module.weight": {"numel": 589824, "sparsity": 0.856842041015625, "quantized": true}, "bert.encoder.layer.8.attention.self.value.module.weight": {"numel": 589824, "sparsity": 0.8692152500152588, "quantized": true}, "bert.encoder.layer.8.attention.output.dense.module.weight": {"numel": 589824, "sparsity": 0.8843451738357544, "quantized": true}, "bert.encoder.layer.8.intermediate.dense.module.weight": {"numel": 
2359296, "sparsity": 0.9019758701324463, "quantized": true}, "bert.encoder.layer.8.output.dense.module.weight": {"numel": 2359296, "sparsity": 0.9253442883491516, "quantized": true}, "bert.encoder.layer.9.attention.self.query.module.weight": {"numel": 589824, "sparsity": 0.8518574833869934, "quantized": true}, "bert.encoder.layer.9.attention.self.key.module.weight": {"numel": 589824, "sparsity": 0.8523983359336853, "quantized": true}, "bert.encoder.layer.9.attention.self.value.module.weight": {"numel": 589824, "sparsity": 0.8783705234527588, "quantized": true}, "bert.encoder.layer.9.attention.output.dense.module.weight": {"numel": 589824, "sparsity": 0.8867306113243103, "quantized": true}, "bert.encoder.layer.9.intermediate.dense.module.weight": {"numel": 2359296, "sparsity": 0.9011484980583191, "quantized": true}, "bert.encoder.layer.9.output.dense.module.weight": {"numel": 2359296, "sparsity": 0.91860032081604, "quantized": true}, "bert.encoder.layer.10.attention.self.query.module.weight": {"numel": 589824, "sparsity": 0.8570064902305603, "quantized": true}, "bert.encoder.layer.10.attention.self.key.module.weight": {"numel": 589824, "sparsity": 0.8588087558746338, "quantized": true}, "bert.encoder.layer.10.attention.self.value.module.weight": {"numel": 589824, "sparsity": 0.873399555683136, "quantized": true}, "bert.encoder.layer.10.attention.output.dense.module.weight": {"numel": 589824, "sparsity": 0.8768836259841919, "quantized": true}, "bert.encoder.layer.10.intermediate.dense.module.weight": {"numel": 2359296, "sparsity": 0.9072990417480469, "quantized": true}, "bert.encoder.layer.10.output.dense.module.weight": {"numel": 2359296, "sparsity": 0.9249801635742188, "quantized": true}, "bert.encoder.layer.11.attention.self.query.module.weight": {"numel": 589824, "sparsity": 0.8541887402534485, "quantized": true}, "bert.encoder.layer.11.attention.self.key.module.weight": {"numel": 589824, "sparsity": 0.8596123456954956, "quantized": true}, "bert.encoder.layer.11.attention.self.value.module.weight": {"numel": 589824, "sparsity": 0.8792198896408081, "quantized": true}, "bert.encoder.layer.11.attention.output.dense.module.weight": {"numel": 589824, "sparsity": 0.8855014443397522, "quantized": true}, "bert.encoder.layer.11.intermediate.dense.module.weight": {"numel": 2359296, "sparsity": 0.8986706137657166, "quantized": true}, "bert.encoder.layer.11.output.dense.module.weight": {"numel": 2359296, "sparsity": 0.9309417009353638, "quantized": true}, "bert.pooler.dense.module.weight": {"numel": 589824, "sparsity": 0.0, "quantized": true}, "classifier.module.weight": {"numel": 1536, "sparsity": 0.0, "quantized": true}}}
2023-08-21 13:07:18 sparseml.transformers.sparsification.trainer INFO     Reloaded model state after SparseML recipe structure modifications from /home/rahul/.cache/sparsezoo/1cd4de14-8c1a-471a-860f-213ed8d9ed54/training

100%|██████████| 28/28 [00:02<00:00, 11.17it/s]
***** eval metrics *****
  eval_accuracy           =     0.9128
  eval_loss               =     0.3192
  eval_runtime            = 0:00:03.10
  eval_samples            =        872
  eval_samples_per_second =    280.444
  eval_steps_per_second   =      9.005

Explanation: The fix was threefold (a sketch of the first two changes follows the list):

  • Default resolved_archive_file to None instead of an empty list
  • Unpack 6 return values from _load_pretrained_model instead of 5
  • Remove the unsupported tensorflow_v1 tests from the workflow file
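
A minimal sketch of the first two changes in sparseml's _reload_model_state, assuming the six-value return signature of _load_pretrained_model in nm-transformers ~1.5; the keyword names (current_state_dict in particular) and the exact argument list are illustrative, not the exact diff:

# Before (release/1.5): five return values expected, and an empty list
# passed for resolved_archive_file, which triggered the IndexError above:
#   _, missing, unexpected, _, _ = self.model._load_pretrained_model(...)

# After: default resolved_archive_file to None and unpack the sixth
# return value as well
_, missing, unexpected, _, _, _ = self.model._load_pretrained_model(
    model=self.model,
    state_dict=current_state_dict,  # illustrative name for the checkpoint state dict
    loaded_keys=list(current_state_dict.keys()),
    resolved_archive_file=None,  # was [] before this PR; None is handled by the paired nm-transformers fix
    pretrained_model_name_or_path=load_path,
)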

rahul-tuli assigned rahul-tuli and bfineran and unassigned bfineran on Aug 21, 2023
bfineran changed the title from "[Cherry Pick] Fix Index error with 1.5.1 transformers upgrade" to "[Cherry Pick 1.5.4] Fix Index error with nm-transformers 1.5.1 upgrade" on Aug 21, 2023
bfineran (Member) left a comment:

LGTM pending updating version.py

rahul-tuli marked this pull request as ready for review on August 21, 2023 20:05
bfineran merged commit d0abbf3 into release/1.5 on Aug 21, 2023
9 of 10 checks passed
bfineran deleted the index-error-cp branch on August 21, 2023 20:54