[BugFix] Delay torch import until needed for deepsparse.transformers.eval_downstream #1187

Merged: 2 commits into main on Aug 22, 2023

Conversation

@rahul-tuli (Member) commented Aug 16, 2023

Bug Description

`torch` is not installed as part of the deepsparse[transformers] extra, but `deepsparse.transformers.eval_downstream` imports it at module load time for metrics calculation, so even `--help` fails. The following script reproduces the bug and tests the fix:

#!/usr/bin/env bash

# Exit on error (-e), treat unset variables as errors (-u),
# disable filename globbing (-f), and propagate failures in pipelines (pipefail)
set -euf -o pipefail

# This script is used to reproduce the bug in the following issue and test the fix:
#  https://app.asana.com/0/1205229323407165/1205234254788114/f

# The bug is that torch is not installed as part of deepsparse[transformers],
#  but is needed by `deepsparse.transformers.eval_downstream`
#  for metrics calculation.


# Delete existing virtual environment
rm -rf venv

# Create a new virtual environment
python3 -m venv venv

# Activate the virtual environment
source venv/bin/activate

# Upgrade pip
pip install --upgrade pip

# Install deepsparse with the transformers extra (editable install from the repo root)
pip install -e "./[transformers]"

# Invoke problem command
deepsparse.transformers.eval_downstream --help
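
A quicker spot check, assuming the same virtual environment, is to import the module the traceback below points at; before the fix this raises the same ModuleNotFoundError:

python -c "import deepsparse.transformers.metrics"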

Before this PR, the above script fails with the following error:

None of PyTorch, TensorFlow >= 2.0, or Flax have been found. Models won't be available and only tokenizers, configuration and file/data utilities can be used.
Traceback (most recent call last):
  File "/home/rahul/projects/deepsparse/venv/bin/deepsparse.transformers.eval_downstream", line 5, in <module>
    from deepsparse.transformers.eval_downstream import main
  File "/home/rahul/projects/deepsparse/src/deepsparse/transformers/eval_downstream.py", line 72, in <module>
    from deepsparse.transformers.metrics import Perplexity, PrecisionRecallF1
  File "/home/rahul/projects/deepsparse/src/deepsparse/transformers/metrics.py", line 26, in <module>
    import torch
ModuleNotFoundError: No module named 'torch'

After this PR, the above script succeeds with the following output:

None of PyTorch, TensorFlow >= 2.0, or Flax have been found. Models won't be available and only tokenizers, configuration and file/data utilities can be used.
usage: deepsparse.transformers.eval_downstream [-h] -d {squad,squad_v2,mnli,qqp,sst2,sst2_zero_shot,imdb,conll2003,go_emotions,openai_humaneval} [-v VAL_RATIO] [-s VAL_SPLIT_SEED] [-c NUM_CORES] [-e {deepsparse,onnxruntime}]
                                               [--max-sequence-length MAX_SEQUENCE_LENGTH] [--max-samples MAX_SAMPLES] [-o OUTPUT_DIR] [--profile] [--max-answer-length MAX_ANSWER_LENGTH]
                                               [--version-2-with-negative VERSION_2_WITH_NEGATIVE] [--pad-to-max-length PAD_TO_MAX_LENGTH] [--n-best-size N_BEST_SIZE] [--zero-shot ZERO_SHOT]
                                               model_path

Evaluate a Hugging Face Transformers ONNX model on a downstream dataset

positional arguments:
  model_path            The path to a directory containing model.onnx, config.json, and tokenizer.json files or SparseZoo stub to the model

optional arguments:
  -h, --help            show this help message and exit
  -d {squad,squad_v2,mnli,qqp,sst2,sst2_zero_shot,imdb,conll2003,go_emotions,openai_humaneval}, --dataset {squad,squad_v2,mnli,qqp,sst2,sst2_zero_shot,imdb,conll2003,go_emotions,openai_humaneval}
  -v VAL_RATIO, --val_ratio VAL_RATIO
                        Ratio between 0.0 and 1.0 representing the proportion of the dataset include in the validation set
  -s VAL_SPLIT_SEED, --val_split_seed VAL_SPLIT_SEED
                        Random seed used to split the validation set, used with the --val_ratio flag. Default to 42.
  -c NUM_CORES, --num-cores NUM_CORES
                        The number of physical cores to run the eval on, defaults to all physical cores available on the system
  -e {deepsparse,onnxruntime}, --engine {deepsparse,onnxruntime}
                        Inference engine backend to run eval on. Choices are 'deepsparse', 'onnxruntime'. Default is 'deepsparse'
  --max-sequence-length MAX_SEQUENCE_LENGTH
                        the max sequence length for model inputs. Default is 384
  --max-samples MAX_SAMPLES
                        the max number of samples to evaluate. Default is None or all samples
  -o OUTPUT_DIR, --output-dir OUTPUT_DIR
                        Folder to save output predictions, used for debugging
  --profile             Run with profiling, used for debugging
  --max-answer-length MAX_ANSWER_LENGTH
                        The maximum length of an answer that can be generated. This is needed because the start and end predictions are not conditioned on one another.
  --version-2-with-negative VERSION_2_WITH_NEGATIVE
                        Whether or not the underlying dataset contains examples with no answers
  --pad-to-max-length PAD_TO_MAX_LENGTH
                        Whether to pad all samples to `max_seq_length`. If False, will pad the samples dynamically when batching to the maximum length in the batch (which can be faster on GPU but will be slower on TPU).
  --n-best-size N_BEST_SIZE
                        The total number of n-best predictions to generate when looking for an answer.
  --zero-shot ZERO_SHOT
                        Whether to run the dataset with a zero shot pipeline. Currently supports sst2. Default is False
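
The diff itself is not reproduced on this page. As a minimal sketch of the deferred-import pattern the PR title describes (the Perplexity class and metrics module come from the traceback above; the helper and method body are illustrative, not the actual fix):

# metrics.py (sketch): move the module-level `import torch` into the code
# path that needs it, so importing the module no longer requires torch.

def _require_torch():
    """Import torch lazily, raising a helpful error if it is missing."""
    try:
        import torch
    except ImportError as err:
        raise ImportError(
            "torch is required for metrics calculation; install it with "
            "`pip install torch`"
        ) from err
    return torch


class Perplexity:
    def add_batch(self, predictions, references):
        torch = _require_torch()  # resolved on first use, not at import time
        ...

With the import deferred this way, `deepsparse.transformers.eval_downstream --help` only has to parse arguments and never touches torch.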

rahul-tuli self-assigned this on Aug 16, 2023
rahul-tuli added the "bug" label (Something isn't working) on Aug 16, 2023
@rahul-tuli (Member, Author) commented:
Failing test should be green after: neuralmagic/sparsezoo#357

rahul-tuli merged commit 8dbcb0c into main on Aug 22, 2023 (7 checks passed)
rahul-tuli deleted the bugfix-delay-torch-import branch on Aug 22, 2023 at 18:01