Benchmark Script for Pipelines #1150

Satrat · 2023-07-27T21:44:16Z

Adding a script for benchmarking pipelines that reports the compute time spent in each phase of the pipeline. This lets us identify bottlenecks outside the engine, in pre- or post-processing.
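To make the idea of per-phase timing concrete, here is a minimal sketch of how time can be attributed to named phases. It is not the code from this PR: `PhaseTimer`, `run_once`, and the three phase names are hypothetical, and the stage bodies are trivial stand-ins for real pre-processing, engine, and post-processing work.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class PhaseTimer:
    """Accumulates wall-clock time per named phase (hypothetical helper)."""

    def __init__(self):
        self.totals = defaultdict(float)

    @contextmanager
    def phase(self, name):
        # Time everything executed inside the `with` block under `name`.
        start = time.perf_counter()
        try:
            yield
        finally:
            self.totals[name] += time.perf_counter() - start

    def percentages(self):
        # Share of the total measured time spent in each phase.
        total = sum(self.totals.values())
        if total == 0:
            return {name: 0.0 for name in self.totals}
        return {name: 100.0 * t / total for name, t in self.totals.items()}


def run_once(timer, text):
    # Stand-ins for real pipeline stages; each body is an assumption.
    with timer.phase("pre_process"):
        tokens = text.lower().split()
    with timer.phase("engine_forward"):
        scores = [len(tok) for tok in tokens]
    with timer.phase("post_process"):
        return max(scores) if scores else 0


timer = PhaseTimer()
for _ in range(100):
    result = run_once(timer, "A small example sentence for timing")
shares = timer.percentages()
```

A phase whose share dominates `shares` is the first place to look for a bottleneck.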

Example Usage

deepsparse.benchmark_pipeline "text_classification" "zoo:nlp/sentiment_analysis/distilbert-none/pytorch/huggingface/sst2/pruned90-none" -c "config.json"

Based on the pipeline argument, the script will infer what type of data to generate or search for (see get_input_schema_type for details).
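As a rough illustration of inferring the data type from the pipeline argument, the sketch below uses a plain lookup table keyed by task name. This is an assumption for illustration only: `InputKind`, `_TASK_TO_KIND`, and `infer_input_kind` are hypothetical names, and the actual get_input_schema_type may work differently (for example by inspecting the pipeline's input schema rather than its name).

```python
from enum import Enum


class InputKind(Enum):
    TEXT = "text"
    IMAGE = "image"


# Hypothetical table; the task names are the ones exercised in this PR.
_TASK_TO_KIND = {
    "text_classification": InputKind.TEXT,
    "token_classification": InputKind.TEXT,
    "question_answering": InputKind.TEXT,
    "text_generation": InputKind.TEXT,
    "image_classification": InputKind.IMAGE,
    "yolo": InputKind.IMAGE,
}


def infer_input_kind(task: str) -> InputKind:
    """Return the kind of input data a task needs, or raise for unknown tasks."""
    key = task.strip().lower().replace("-", "_")
    try:
        return _TASK_TO_KIND[key]
    except KeyError:
        raise ValueError(f"no input type known for task {task!r}") from None
```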

Config File Documentation

Configurations for generating or loading data to the pipeline are specified as JSON, documented in the README. A quick summary:

Set data_type to "real" to pull text or image data from data_folder, or set it to "dummy" to use randomly generated data.
In dummy mode, string lengths are set with gen_sequence_length and image shapes are set with input_image_shape.
In real mode, max_string_length truncates input text if it is greater than 0; set it to -1 for no truncation.
data_folder is a path to either image (.jpg, .jpeg, .gif) or text (.txt) files, to be read in real mode.
Set recursive_search to true to search data_folder recursively.
Additional keyword arguments to pipeline.Pipeline() can be added to pipeline_kwargs.
Additional keyword arguments to Pipeline.input_schema() can be added to input_schema_kwargs.
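The real-mode options can be pictured with a short sketch that collects files from a folder and applies truncation. It is not the PR's implementation: `find_files` and `load_text` are hypothetical helpers, and treating every non-positive max_string_length as "no truncation" is an assumption (the description only names -1).

```python
from pathlib import Path

# Extensions named in the PR description.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".gif"}
TEXT_SUFFIXES = {".txt"}


def find_files(data_folder, suffixes, recursive=False):
    """List files under data_folder whose extension is in suffixes."""
    root = Path(data_folder)
    # rglob descends into subfolders; glob only looks at the top level.
    candidates = root.rglob("*") if recursive else root.glob("*")
    return sorted(
        p for p in candidates if p.is_file() and p.suffix.lower() in suffixes
    )


def load_text(path, max_string_length=-1):
    """Read a text file, truncating only when max_string_length > 0."""
    text = Path(path).read_text(encoding="utf-8")
    if max_string_length > 0:
        return text[:max_string_length]
    return text
```

With recursive set to False only files directly inside the folder are returned, which mirrors what a recursive_search value of false would be expected to do.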

Example config:

{
    "data_type": "dummy",
    "gen_sequence_length": 100,
    "input_image_shape": [500,500,3],
    "data_folder": "/home/sadkins/imagenette2-320/",
    "recursive_search": true,
    "max_string_length": -1, 
    "pipeline_kwargs": {},
    "input_schema_kwargs": {}
}
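A config like the one above could be parsed and used to produce dummy inputs roughly as follows. This is a sketch under stated assumptions, not the script's code: `DEFAULTS`, `parse_config`, `dummy_text`, and `dummy_image` are hypothetical, the default values are guesses taken from the example, and input_image_shape is assumed to be ordered height, width, channels.

```python
import copy
import json
import random
import string

# Assumed defaults, copied from the example config for illustration.
DEFAULTS = {
    "data_type": "dummy",
    "gen_sequence_length": 100,
    "input_image_shape": [500, 500, 3],
    "data_folder": None,
    "recursive_search": False,
    "max_string_length": -1,
    "pipeline_kwargs": {},
    "input_schema_kwargs": {},
}


def parse_config(raw: str) -> dict:
    """Merge a JSON config string over the defaults and validate data_type."""
    config = copy.deepcopy(DEFAULTS)
    config.update(json.loads(raw))
    if config["data_type"] not in ("dummy", "real"):
        raise ValueError(f"unsupported data_type: {config['data_type']!r}")
    if config["data_type"] == "real" and not config["data_folder"]:
        raise ValueError("real mode requires data_folder")
    return config


def dummy_text(length, rng):
    """Random ASCII-letter string of the requested length."""
    return "".join(rng.choice(string.ascii_letters) for _ in range(length))


def dummy_image(shape, rng):
    """Nested lists of random 0-255 values; shape is (height, width, channels)."""
    height, width, channels = shape
    return [
        [[rng.randrange(256) for _ in range(channels)] for _ in range(width)]
        for _ in range(height)
    ]


config = parse_config(
    '{"data_type": "dummy", "gen_sequence_length": 16, "input_image_shape": [4, 4, 3]}'
)
rng = random.Random(0)  # seeded so repeated runs generate the same inputs
text = dummy_text(config["gen_sequence_length"], rng)
image = dummy_image(config["input_image_shape"], rng)
```

Seeding the generator keeps dummy inputs identical between runs, which makes timing comparisons less noisy.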

Testing

Added unit tests to test_pipeline_benchmark.py, and also manually tested the following pipelines:

text_classification: deepsparse.benchmark_pipeline text_classification zoo:nlp/sentiment_analysis/distilbert-none/pytorch/huggingface/sst2/pruned90-none -c tests/test_data/pipeline_bench_config.json
image_classification: deepsparse.benchmark_pipeline image_classification zoo:cv/classification/resnet_v1-50_2x/pytorch/sparseml/imagenet/base-none -c tests/test_data/pipeline_bench_config.json
text_generation: deepsparse.benchmark_pipeline text_generation zoo:nlg/text_generation/codegen_mono-350m/pytorch/huggingface/bigpython_bigquery_thepile/base_quant-none -c tests/test_data/pipeline_bench_config.json
yolo: deepsparse.benchmark_pipeline yolo zoo:cv/detection/yolov5-l/pytorch/ultralytics/coco/pruned_quant-aggressive_95 -c tests/test_data/pipeline_bench_config.json
question_answering: deepsparse.benchmark_pipeline question_answering zoo:nlp/question_answering/bert-base/pytorch/huggingface/squad/12layer_pruned80_quant-none-vnni -c tests/test_data/pipeline_bench_config.json
token_classification: deepsparse.benchmark_pipeline token_classification zoo:nlp/token_classification/distilbert-none/pytorch/huggingface/conll2003/pruned90-none -c tests/test_data/pipeline_bench_config.json
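For repeating manual runs like these, the commands can be assembled programmatically. The sketch below only builds the argument list in the form shown above (task, model stub, then -c and the config path); `build_command` is a hypothetical helper, and actually launching the command is left as a comment because it requires deepsparse to be installed.

```python
import shlex

CONFIG_PATH = "tests/test_data/pipeline_bench_config.json"


def build_command(task, model_stub, config_path=CONFIG_PATH):
    """Return the benchmark invocation as a list of separate arguments."""
    return ["deepsparse.benchmark_pipeline", task, model_stub, "-c", config_path]


cmd = build_command(
    "text_classification",
    "zoo:nlp/sentiment_analysis/distilbert-none/pytorch/huggingface/sst2/pruned90-none",
)
printable = shlex.join(cmd)  # shell-style string, useful for logging

# To execute (requires deepsparse on PATH):
# import subprocess
# subprocess.run(cmd, check=True)
```

Passing the arguments as a list keeps the model stub and the -c flag as separate tokens, so they cannot run together by accident.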

bfineran

LGTM overall - could you add an example of the expected output format either to the PR or readme?

Satrat · 2023-08-04T21:54:30Z

LGTM overall - could you add an example of the expected output format either to the PR or readme?

Added to the README!

src/deepsparse/benchmark/benchmark_pipeline.py

rahul-tuli

The code looks very close; I had a few comments and nits, mostly around docstrings.

Additionally, something to think about: as we generally support YAML for all our recipes and other configs (like deepsparse server), would it make sense to use YAML over JSON here too, for consistency? We could also support both; I'm in favor of that, but I really think we should at least support YAML.
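One way to support both formats, sketched here as an assumption rather than what was merged, is to choose the parser from the file extension. `load_config` is a hypothetical name; the YAML branch relies on the third-party PyYAML package and imports it lazily so that JSON-only use needs nothing beyond the standard library.

```python
import json
from pathlib import Path


def load_config(path):
    """Load a benchmark config from a .json, .yaml, or .yml file."""
    path = Path(path)
    text = path.read_text(encoding="utf-8")
    suffix = path.suffix.lower()
    if suffix in (".yaml", ".yml"):
        # PyYAML is third-party; imported only when a YAML file is given.
        import yaml

        return yaml.safe_load(text)
    if suffix == ".json":
        return json.loads(text)
    raise ValueError(f"unsupported config extension: {suffix!r}")
```

Because both parsers return plain dictionaries, the rest of the script would not need to know which format was used.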

src/deepsparse/benchmark/helpers.py

src/deepsparse/benchmark/data_creation.py

rahul-tuli

bfineran · 2023-08-17T21:08:29Z

GHA failures unrelated, merging

Satrat added 11 commits July 21, 2023 17:30

WIP pipeline benchmark script

bff0e03

simple script

e26eaa7

Merge branch 'main' into pipeline-benchmark

218db5f

share code and cleanup

7732296

adding additional cmd line arguments

956dbe8

image and text inputs

6cbc99e

json export of statistics

0143d31

clean up printed output

58edf05

adding support for real data

75bda3a

support for additional pipelines

b751e75

expanding input schemas, allowing for kwargs

76a5af9

Satrat requested review from bfineran, dsikka, rahul-tuli and dbogunowicz July 27, 2023 21:44

Satrat added 4 commits July 28, 2023 11:00

README, quality, additional args

6cb6bef

moving code around, update README

75f5173

Merge branch 'main' into pipeline-benchmark

6148962

adding unit tests

9202a6f

Satrat changed the title from "WIP: Benchmark Script for Pipelines" to "Benchmark Script for Pipelines" Jul 28, 2023

Satrat added 2 commits July 28, 2023 16:21

Merge branch 'main' into pipeline-benchmark

5c94bb1

adding missing test file

2ed0185

Satrat marked this pull request as ready for review July 28, 2023 21:13

Satrat added 7 commits July 31, 2023 10:58

skipping test w/high memory usage

729447e

skip test with high memory usage

abb4811

unit test memory

8cdbe9b

add tests back in

1058f0b

add tests back in

249e645

fix async percentages

ba8688b

fix new quality errors

ecf1559

cleanup code, replace argpase with click

50d5a74

bfineran reviewed Aug 4, 2023

Update README with example output

70f7440

Satrat added 2 commits August 4, 2023 17:55

Merge branch 'main' into pipeline-benchmark

4c0396b

Merge branch 'main' into pipeline-benchmark

9c398ca

rahul-tuli reviewed Aug 9, 2023

src/deepsparse/benchmark/benchmark_pipeline.py

Satrat added 8 commits August 9, 2023 16:02

support for multiple timers, adding docstrings

3afeec7

Merge branch 'pipeline-benchmark' of github.com:neuralmagic/deepsparse into pipeline-benchmark

86bb3d5

Merge branch 'main' into pipeline-benchmark

2dfcac2

docstrings

df9a3f7

Merge branch 'pipeline-benchmark' of github.com:neuralmagic/deepsparse into pipeline-benchmark

d8238bc

add text generation example to README

b0bc840

clean up timermanager usage

eba70d6

Merge branch 'main' into pipeline-benchmark

e9b2367

bfineran previously approved these changes Aug 10, 2023

rahul-tuli requested changes Aug 14, 2023

Satrat added 3 commits August 14, 2023 10:35

Merge branch 'main' into pipeline-benchmark

b5fb5b5

PR comments

1eb3202

style

289f545

Satrat dismissed bfineran’s stale review via 289f545 August 15, 2023 16:42

Satrat added 2 commits August 15, 2023 12:53

PR comments

749a752

Merge branch 'main' into pipeline-benchmark

427e9c0

rahul-tuli approved these changes Aug 17, 2023

Satrat requested a review from bfineran August 17, 2023 20:46

Merge branch 'main' into pipeline-benchmark

6264961

bfineran approved these changes Aug 17, 2023

bfineran merged commit 545348b into main Aug 17, 2023
5 of 7 checks passed

bfineran deleted the pipeline-benchmark branch August 17, 2023 21:08
