[Cherry Pick 1.5.4] Fix Index error with nm-transformers 1.5.1 upgrade #1708

Merged: 4 commits into release/1.5 from index-error-cp on Aug 21, 2023

Conversation

rahul-tuli (Member) commented on Aug 21, 2023

During the 1.5.1 upgrade of nm-transformers, the corresponding changes on the sparseml side were not cherry-picked to the release/1.5 branch; this PR is a minimal version of those changes, needed for the latest sparseml ~1.5 wheels to work with the latest nm-transformers ~1.5 wheels.

The original commit that was missed in the cherry-pick: 4ec5133

Test command:

#!/usr/bin/env bash

# Exit on error (-e), treat unset variables as errors (-u), disable globbing (-f), and fail on errors in piped commands (pipefail)

set -euf -o pipefail

export SPARSEZOO_TEST_MODE="true"
export NM_BIND_THREADS_TO_CORES=1
export NM_DISABLE_ANALYTICS=1

sparseml.transformers.train.text_classification \
    --output_dir sparse_quantized_bert-text_classification_sst2 \
    --model_name_or_path "zoo:nlp/sentiment_analysis/obert-base/pytorch/huggingface/sst2/pruned90_quant-none" \
    --task_name sst2 --max_seq_length 128 --per_device_train_batch_size 32 --per_device_eval_batch_size 32 --preprocessing_num_workers 6 \
    --do_eval 2>&1 | tee test-final.log

Error before this PR:

2023-08-18 13:52:43 sparseml.pytorch.utils.logger INFO     Logging all SparseML modifier-level logs to sparse_logs/18-08-2023_13.52.43.log
2023-08-18 13:52:43 sparseml.transformers.sparsification.trainer INFO     Loaded 1 SparseML checkpoint recipe stage(s) from /home/rahul/.cache/sparsezoo/1cd4de14-8c1a-471a-860f-213ed8d9ed54/training/recipe.yaml to replicate model sparse state
2023-08-18 13:52:45 sparseml.transformers.sparsification.trainer INFO     Applied structure from 1 previous recipe stage(s) to model and finalized (recipes saved with model_path)
Traceback (most recent call last):
  File "/home/rahul/projects/.venv/bin/sparseml.transformers.train.text_classification", line 8, in <module>
    sys.exit(main())
  File "/home/rahul/projects/.venv/lib/python3.8/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 346, in wrapper
    return f(*args, **kwargs)
  File "/home/rahul/projects/.venv/lib/python3.8/site-packages/sparseml/transformers/text_classification.py", line 575, in main
    metrics = trainer.evaluate(eval_dataset=eval_dataset)
  File "/home/rahul/projects/.venv/lib/python3.8/site-packages/sparseml/transformers/sparsification/trainer.py", line 868, in evaluate
    applied = self.apply_manager(epoch=math.inf, checkpoint=None)
  File "/home/rahul/projects/.venv/lib/python3.8/site-packages/sparseml/transformers/sparsification/trainer.py", line 221, in apply_manager
    self._reload_model_state(load_path, orig_state_dict)
  File "/home/rahul/projects/.venv/lib/python3.8/site-packages/sparseml/transformers/sparsification/trainer.py", line 676, in _reload_model_state
    _, missing, unexpected, _, _ = self.model._load_pretrained_model(
  File "/home/rahul/projects/.venv/lib/python3.8/site-packages/transformers/modeling_utils.py", line 3150, in _load_pretrained_model
    folder = os.path.sep.join(resolved_archive_file[0].split(os.path.sep)[:-1])
IndexError: list index out of range
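For context, a minimal reproduction of the failing pattern (a sketch of the same lookup transformers performs at the line above, not the library code itself): with resolved_archive_file left as an empty list, the first-element access fails before the path join runs.

import os

# Old sparseml release/1.5 behavior: resolved_archive_file defaulted to []
resolved_archive_file = []

# transformers' _load_pretrained_model derives the checkpoint folder from
# the first resolved archive file; on an empty list this raises IndexError
folder = os.path.sep.join(resolved_archive_file[0].split(os.path.sep)[:-1])
# IndexError: list index out of range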

After this PR, together with the nm-transformers fix referenced above:

2023-08-21 13:07:15 sparseml.pytorch.utils.logger INFO     Logging all SparseML modifier-level logs to sparse_logs/21-08-2023_13.07.15.log
2023-08-21 13:07:15 sparseml.transformers.sparsification.trainer INFO     Loaded 1 SparseML checkpoint recipe stage(s) from /home/rahul/.cache/sparsezoo/1cd4de14-8c1a-471a-860f-213ed8d9ed54/training/recipe.yaml to replicate model sparse state
2023-08-21 13:07:17 sparseml.transformers.sparsification.trainer INFO     Applied structure from 1 previous recipe stage(s) to model and finalized (recipes saved with model_path)
2023-08-21 13:07:18 sparseml.transformers.sparsification.trainer INFO     Reloaded 1784 model params for SparseML Recipe from /home/rahul/.cache/sparsezoo/1cd4de14-8c1a-471a-860f-213ed8d9ed54/training
2023-08-21 13:07:18 sparseml.transformers.utils.model INFO     Loaded model from /home/rahul/.cache/sparsezoo/1cd4de14-8c1a-471a-860f-213ed8d9ed54/training with 109483778 total params. Of those there are 85526016 prunable params which have 89.3777046740959 avg sparsity.
2023-08-21 13:07:18 sparseml.transformers.utils.model INFO     sparse model detected, all sparsification info: {"params_summary": {"total": 109483778, "sparse": 76441190, "sparsity_percent": 69.819649446149, "prunable": 85526016, "prunable_sparse": 76441190, "prunable_sparsity_percent": 89.3777046740959, "quantizable": 85609730, "quantized": 85609730, "quantized_percent": 100.0}, "params_info": {"bert.encoder.layer.0.attention.self.query.module.weight": {"numel": 589824, "sparsity": 0.8644917607307434, "quantized": true}, "bert.encoder.layer.0.attention.self.key.module.weight": {"numel": 589824, "sparsity": 0.8680216670036316, "quantized": true}, "bert.encoder.layer.0.attention.self.value.module.weight": {"numel": 589824, "sparsity": 0.9312151074409485, "quantized": true}, "bert.encoder.layer.0.attention.output.dense.module.weight": {"numel": 589824, "sparsity": 0.9232262372970581, "quantized": true}, "bert.encoder.layer.0.intermediate.dense.module.weight": {"numel": 2359296, "sparsity": 0.9153103232383728, "quantized": true}, "bert.encoder.layer.0.output.dense.module.weight": {"numel": 2359296, "sparsity": 0.9356380105018616, "quantized": true}, "bert.encoder.layer.1.attention.self.query.module.weight": {"numel": 589824, "sparsity": 0.8620554804801941, "quantized": true}, "bert.encoder.layer.1.attention.self.key.module.weight": {"numel": 589824, "sparsity": 0.8625064492225647, "quantized": true}, "bert.encoder.layer.1.attention.self.value.module.weight": {"numel": 589824, "sparsity": 0.9279242753982544, "quantized": true}, "bert.encoder.layer.1.attention.output.dense.module.weight": {"numel": 589824, "sparsity": 0.9245097041130066, "quantized": true}, "bert.encoder.layer.1.intermediate.dense.module.weight": {"numel": 2359296, "sparsity": 0.8945091962814331, "quantized": true}, "bert.encoder.layer.1.output.dense.module.weight": {"numel": 2359296, "sparsity": 0.9265751242637634, "quantized": true}, "bert.encoder.layer.2.attention.self.query.module.weight": {"numel": 589824, "sparsity": 0.8451063632965088, "quantized": true}, "bert.encoder.layer.2.attention.self.key.module.weight": {"numel": 589824, "sparsity": 0.8532799482345581, "quantized": true}, "bert.encoder.layer.2.attention.self.value.module.weight": {"numel": 589824, "sparsity": 0.9295671582221985, "quantized": true}, "bert.encoder.layer.2.attention.output.dense.module.weight": {"numel": 589824, "sparsity": 0.9288228154182434, "quantized": true}, "bert.encoder.layer.2.intermediate.dense.module.weight": {"numel": 2359296, "sparsity": 0.8895581364631653, "quantized": true}, "bert.encoder.layer.2.output.dense.module.weight": {"numel": 2359296, "sparsity": 0.9237624406814575, "quantized": true}, "bert.encoder.layer.3.attention.self.query.module.weight": {"numel": 589824, "sparsity": 0.871110737323761, "quantized": true}, "bert.encoder.layer.3.attention.self.key.module.weight": {"numel": 589824, "sparsity": 0.8704121708869934, "quantized": true}, "bert.encoder.layer.3.attention.self.value.module.weight": {"numel": 589824, "sparsity": 0.9085676670074463, "quantized": true}, "bert.encoder.layer.3.attention.output.dense.module.weight": {"numel": 589824, "sparsity": 0.9130028486251831, "quantized": true}, "bert.encoder.layer.3.intermediate.dense.module.weight": {"numel": 2359296, "sparsity": 0.8868213295936584, "quantized": true}, "bert.encoder.layer.3.output.dense.module.weight": {"numel": 2359296, "sparsity": 0.9209082126617432, "quantized": true}, "bert.encoder.layer.4.attention.self.query.module.weight": {"numel": 589824, "sparsity": 
0.8635711669921875, "quantized": true}, "bert.encoder.layer.4.attention.self.key.module.weight": {"numel": 589824, "sparsity": 0.8665059208869934, "quantized": true}, "bert.encoder.layer.4.attention.self.value.module.weight": {"numel": 589824, "sparsity": 0.8824039101600647, "quantized": true}, "bert.encoder.layer.4.attention.output.dense.module.weight": {"numel": 589824, "sparsity": 0.8957400918006897, "quantized": true}, "bert.encoder.layer.4.intermediate.dense.module.weight": {"numel": 2359296, "sparsity": 0.8822059631347656, "quantized": true}, "bert.encoder.layer.4.output.dense.module.weight": {"numel": 2359296, "sparsity": 0.9172935485839844, "quantized": true}, "bert.encoder.layer.5.attention.self.query.module.weight": {"numel": 589824, "sparsity": 0.868516743183136, "quantized": true}, "bert.encoder.layer.5.attention.self.key.module.weight": {"numel": 589824, "sparsity": 0.8675944209098816, "quantized": true}, "bert.encoder.layer.5.attention.self.value.module.weight": {"numel": 589824, "sparsity": 0.8843333125114441, "quantized": true}, "bert.encoder.layer.5.attention.output.dense.module.weight": {"numel": 589824, "sparsity": 0.8958757519721985, "quantized": true}, "bert.encoder.layer.5.intermediate.dense.module.weight": {"numel": 2359296, "sparsity": 0.8838331699371338, "quantized": true}, "bert.encoder.layer.5.output.dense.module.weight": {"numel": 2359296, "sparsity": 0.91917884349823, "quantized": true}, "bert.encoder.layer.6.attention.self.query.module.weight": {"numel": 589824, "sparsity": 0.8699256181716919, "quantized": true}, "bert.encoder.layer.6.attention.self.key.module.weight": {"numel": 589824, "sparsity": 0.8717482089996338, "quantized": true}, "bert.encoder.layer.6.attention.self.value.module.weight": {"numel": 589824, "sparsity": 0.892473042011261, "quantized": true}, "bert.encoder.layer.6.attention.output.dense.module.weight": {"numel": 589824, "sparsity": 0.9078572392463684, "quantized": true}, "bert.encoder.layer.6.intermediate.dense.module.weight": {"numel": 2359296, "sparsity": 0.8827946782112122, "quantized": true}, "bert.encoder.layer.6.output.dense.module.weight": {"numel": 2359296, "sparsity": 0.9222526550292969, "quantized": true}, "bert.encoder.layer.7.attention.self.query.module.weight": {"numel": 589824, "sparsity": 0.8792538046836853, "quantized": true}, "bert.encoder.layer.7.attention.self.key.module.weight": {"numel": 589824, "sparsity": 0.8780840039253235, "quantized": true}, "bert.encoder.layer.7.attention.self.value.module.weight": {"numel": 589824, "sparsity": 0.8773871660232544, "quantized": true}, "bert.encoder.layer.7.attention.output.dense.module.weight": {"numel": 589824, "sparsity": 0.8886498212814331, "quantized": true}, "bert.encoder.layer.7.intermediate.dense.module.weight": {"numel": 2359296, "sparsity": 0.8991622924804688, "quantized": true}, "bert.encoder.layer.7.output.dense.module.weight": {"numel": 2359296, "sparsity": 0.9271066784858704, "quantized": true}, "bert.encoder.layer.8.attention.self.query.module.weight": {"numel": 589824, "sparsity": 0.8583950400352478, "quantized": true}, "bert.encoder.layer.8.attention.self.key.module.weight": {"numel": 589824, "sparsity": 0.856842041015625, "quantized": true}, "bert.encoder.layer.8.attention.self.value.module.weight": {"numel": 589824, "sparsity": 0.8692152500152588, "quantized": true}, "bert.encoder.layer.8.attention.output.dense.module.weight": {"numel": 589824, "sparsity": 0.8843451738357544, "quantized": true}, "bert.encoder.layer.8.intermediate.dense.module.weight": {"numel": 
2359296, "sparsity": 0.9019758701324463, "quantized": true}, "bert.encoder.layer.8.output.dense.module.weight": {"numel": 2359296, "sparsity": 0.9253442883491516, "quantized": true}, "bert.encoder.layer.9.attention.self.query.module.weight": {"numel": 589824, "sparsity": 0.8518574833869934, "quantized": true}, "bert.encoder.layer.9.attention.self.key.module.weight": {"numel": 589824, "sparsity": 0.8523983359336853, "quantized": true}, "bert.encoder.layer.9.attention.self.value.module.weight": {"numel": 589824, "sparsity": 0.8783705234527588, "quantized": true}, "bert.encoder.layer.9.attention.output.dense.module.weight": {"numel": 589824, "sparsity": 0.8867306113243103, "quantized": true}, "bert.encoder.layer.9.intermediate.dense.module.weight": {"numel": 2359296, "sparsity": 0.9011484980583191, "quantized": true}, "bert.encoder.layer.9.output.dense.module.weight": {"numel": 2359296, "sparsity": 0.91860032081604, "quantized": true}, "bert.encoder.layer.10.attention.self.query.module.weight": {"numel": 589824, "sparsity": 0.8570064902305603, "quantized": true}, "bert.encoder.layer.10.attention.self.key.module.weight": {"numel": 589824, "sparsity": 0.8588087558746338, "quantized": true}, "bert.encoder.layer.10.attention.self.value.module.weight": {"numel": 589824, "sparsity": 0.873399555683136, "quantized": true}, "bert.encoder.layer.10.attention.output.dense.module.weight": {"numel": 589824, "sparsity": 0.8768836259841919, "quantized": true}, "bert.encoder.layer.10.intermediate.dense.module.weight": {"numel": 2359296, "sparsity": 0.9072990417480469, "quantized": true}, "bert.encoder.layer.10.output.dense.module.weight": {"numel": 2359296, "sparsity": 0.9249801635742188, "quantized": true}, "bert.encoder.layer.11.attention.self.query.module.weight": {"numel": 589824, "sparsity": 0.8541887402534485, "quantized": true}, "bert.encoder.layer.11.attention.self.key.module.weight": {"numel": 589824, "sparsity": 0.8596123456954956, "quantized": true}, "bert.encoder.layer.11.attention.self.value.module.weight": {"numel": 589824, "sparsity": 0.8792198896408081, "quantized": true}, "bert.encoder.layer.11.attention.output.dense.module.weight": {"numel": 589824, "sparsity": 0.8855014443397522, "quantized": true}, "bert.encoder.layer.11.intermediate.dense.module.weight": {"numel": 2359296, "sparsity": 0.8986706137657166, "quantized": true}, "bert.encoder.layer.11.output.dense.module.weight": {"numel": 2359296, "sparsity": 0.9309417009353638, "quantized": true}, "bert.pooler.dense.module.weight": {"numel": 589824, "sparsity": 0.0, "quantized": true}, "classifier.module.weight": {"numel": 1536, "sparsity": 0.0, "quantized": true}}}
2023-08-21 13:07:18 sparseml.transformers.sparsification.trainer INFO     Reloaded model state after SparseML recipe structure modifications from /home/rahul/.cache/sparsezoo/1cd4de14-8c1a-471a-860f-213ed8d9ed54/training

100%|██████████| 28/28 [00:02<00:00, 11.17it/s]
***** eval metrics *****
  eval_accuracy           =     0.9128
  eval_loss               =     0.3192
  eval_runtime            = 0:00:03.10
  eval_samples            =        872
  eval_samples_per_second =    280.444
  eval_steps_per_second   =      9.005

Explanation: The fix was threefold (a sketch of the first two changes follows the list):

  • Default resolved_archive_file to None instead of an empty list
  • Unpack 6 return values from _load_pretrained_model instead of 5
  • Remove the unsupported tensorflow_v1 tests from the workflow file
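
A minimal sketch of the first two changes in sparseml's _reload_model_state, assuming the six-value return signature of _load_pretrained_model in nm-transformers ~1.5; the keyword names (current_state_dict in particular) and the exact argument list are illustrative, not the exact diff:

# Before (release/1.5): five return values expected, and an empty list
# passed for resolved_archive_file, which triggered the IndexError above:
#   _, missing, unexpected, _, _ = self.model._load_pretrained_model(...)

# After: default resolved_archive_file to None and unpack the sixth
# return value as well
_, missing, unexpected, _, _, _ = self.model._load_pretrained_model(
    model=self.model,
    state_dict=current_state_dict,  # illustrative name for the checkpoint state dict
    loaded_keys=list(current_state_dict.keys()),
    resolved_archive_file=None,  # was [] before this PR; None is handled by the paired nm-transformers fix
    pretrained_model_name_or_path=load_path,
)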

rahul-tuli assigned rahul-tuli and bfineran and unassigned bfineran on Aug 21, 2023
bfineran changed the title from "[Cherry Pick] Fix Index error with 1.5.1 transformers upgrade" to "[Cherry Pick 1.5.4] Fix Index error with nm-transformers 1.5.1 upgrade" on Aug 21, 2023
bfineran (Member) left a comment:

LGTM pending updating version.py

rahul-tuli marked this pull request as ready for review on August 21, 2023 20:05
bfineran merged commit d0abbf3 into release/1.5 on Aug 21, 2023
9 of 10 checks passed
bfineran deleted the index-error-cp branch on August 21, 2023 20:54