[BugFix] Delay torch import until needed for deepsparse.transformers.eval_downstream #1187

Merged: 2 commits into main on Aug 22, 2023

Conversation

@rahul-tuli (Member) commented Aug 16, 2023

Bug Description

`torch` is not installed as part of the deepsparse[transformers] extra, but `deepsparse.transformers.eval_downstream` imports it at module load time for metrics calculation, so even `--help` fails. The following script reproduces the bug and tests the fix:

#!/usr/bin/env bash

# Exit on error (-e), treat unset variables as errors (-u),
# disable filename globbing (-f), and propagate failures in pipelines (pipefail)
set -euf -o pipefail

# This script is used to reproduce the bug in the following issue and test the fix:
#  https://app.asana.com/0/1205229323407165/1205234254788114/f

# The bug is that torch is not installed as part of deepsparse[transformers],
#  but is needed by `deepsparse.transformers.eval_downstream`
#  for metrics calculation.


# Delete existing virtual environment
rm -rf venv

# Create a new virtual environment
python3 -m venv venv

# Activate the virtual environment
source venv/bin/activate

# Upgrade pip
pip install --upgrade pip

# Install deepsparse with the transformers extra (editable install from the repo root)
pip install -e "./[transformers]"

# Invoke problem command
deepsparse.transformers.eval_downstream --help
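
A quicker spot check, assuming the same virtual environment, is to import the module the traceback below points at; before the fix this raises the same ModuleNotFoundError:

python -c "import deepsparse.transformers.metrics"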

Before this PR, the above script fails with the following error:

None of PyTorch, TensorFlow >= 2.0, or Flax have been found. Models won't be available and only tokenizers, configuration and file/data utilities can be used.
Traceback (most recent call last):
  File "/home/rahul/projects/deepsparse/venv/bin/deepsparse.transformers.eval_downstream", line 5, in <module>
    from deepsparse.transformers.eval_downstream import main
  File "/home/rahul/projects/deepsparse/src/deepsparse/transformers/eval_downstream.py", line 72, in <module>
    from deepsparse.transformers.metrics import Perplexity, PrecisionRecallF1
  File "/home/rahul/projects/deepsparse/src/deepsparse/transformers/metrics.py", line 26, in <module>
    import torch
ModuleNotFoundError: No module named 'torch'

After this PR, the above script succeeds with the following output:

None of PyTorch, TensorFlow >= 2.0, or Flax have been found. Models won't be available and only tokenizers, configuration and file/data utilities can be used.
usage: deepsparse.transformers.eval_downstream [-h] -d {squad,squad_v2,mnli,qqp,sst2,sst2_zero_shot,imdb,conll2003,go_emotions,openai_humaneval} [-v VAL_RATIO] [-s VAL_SPLIT_SEED] [-c NUM_CORES] [-e {deepsparse,onnxruntime}]
                                               [--max-sequence-length MAX_SEQUENCE_LENGTH] [--max-samples MAX_SAMPLES] [-o OUTPUT_DIR] [--profile] [--max-answer-length MAX_ANSWER_LENGTH]
                                               [--version-2-with-negative VERSION_2_WITH_NEGATIVE] [--pad-to-max-length PAD_TO_MAX_LENGTH] [--n-best-size N_BEST_SIZE] [--zero-shot ZERO_SHOT]
                                               model_path

Evaluate a Hugging Face Transformers ONNX model on a downstream dataset

positional arguments:
  model_path            The path to a directory containing model.onnx, config.json, and tokenizer.json files or SparseZoo stub to the model

optional arguments:
  -h, --help            show this help message and exit
  -d {squad,squad_v2,mnli,qqp,sst2,sst2_zero_shot,imdb,conll2003,go_emotions,openai_humaneval}, --dataset {squad,squad_v2,mnli,qqp,sst2,sst2_zero_shot,imdb,conll2003,go_emotions,openai_humaneval}
  -v VAL_RATIO, --val_ratio VAL_RATIO
                        Ratio between 0.0 and 1.0 representing the proportion of the dataset include in the validation set
  -s VAL_SPLIT_SEED, --val_split_seed VAL_SPLIT_SEED
                        Random seed used to split the validation set, used with the --val_ratio flag. Default to 42.
  -c NUM_CORES, --num-cores NUM_CORES
                        The number of physical cores to run the eval on, defaults to all physical cores available on the system
  -e {deepsparse,onnxruntime}, --engine {deepsparse,onnxruntime}
                        Inference engine backend to run eval on. Choices are 'deepsparse', 'onnxruntime'. Default is 'deepsparse'
  --max-sequence-length MAX_SEQUENCE_LENGTH
                        the max sequence length for model inputs. Default is 384
  --max-samples MAX_SAMPLES
                        the max number of samples to evaluate. Default is None or all samples
  -o OUTPUT_DIR, --output-dir OUTPUT_DIR
                        Folder to save output predictions, used for debugging
  --profile             Run with profiling, used for debugging
  --max-answer-length MAX_ANSWER_LENGTH
                        The maximum length of an answer that can be generated. This is needed because the start and end predictions are not conditioned on one another.
  --version-2-with-negative VERSION_2_WITH_NEGATIVE
                        Whether or not the underlying dataset contains examples with no answers
  --pad-to-max-length PAD_TO_MAX_LENGTH
                        Whether to pad all samples to `max_seq_length`. If False, will pad the samples dynamically when batching to the maximum length in the batch (which can be faster on GPU but will be slower on TPU).
  --n-best-size N_BEST_SIZE
                        The total number of n-best predictions to generate when looking for an answer.
  --zero-shot ZERO_SHOT
                        Whether to run the dataset with a zero shot pipeline. Currently supports sst2. Default is False
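
The diff itself is not reproduced on this page. As a minimal sketch of the deferred-import pattern the PR title describes (the Perplexity class and metrics module come from the traceback above; the helper and method body are illustrative, not the actual fix):

# metrics.py (sketch): move the module-level `import torch` into the code
# path that needs it, so importing the module no longer requires torch.

def _require_torch():
    """Import torch lazily, raising a helpful error if it is missing."""
    try:
        import torch
    except ImportError as err:
        raise ImportError(
            "torch is required for metrics calculation; install it with "
            "`pip install torch`"
        ) from err
    return torch


class Perplexity:
    def add_batch(self, predictions, references):
        torch = _require_torch()  # resolved on first use, not at import time
        ...

With the import deferred this way, `deepsparse.transformers.eval_downstream --help` only has to parse arguments and never touches torch.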

rahul-tuli self-assigned this on Aug 16, 2023
rahul-tuli added the "bug" label (Something isn't working) on Aug 16, 2023
@rahul-tuli (Member, Author) commented:
Failing test should be green after: neuralmagic/sparsezoo#357

rahul-tuli merged commit 8dbcb0c into main on Aug 22, 2023 (7 checks passed)
rahul-tuli deleted the bugfix-delay-torch-import branch on Aug 22, 2023 at 18:01