
Sparsify.alpha.auto #179

Merged
merged 19 commits into sparsify.alpha from sparsify.alpha.auto
Mar 1, 2023

Conversation

@rahul-tuli rahul-tuli commented Feb 22, 2023

This PR stitches together sparsify.auto with sparsifyml.

It consists of the following pre-approved pieces:

And the following unapproved pieces:

  • Run integration tests only if sparsifyml is installed locally; otherwise skip them (see the sketch below).
    Additionally, remove sparsifyml from nm_dependencies.
  • Max steps propagation to the teacher
  • Fix failing GHA (Fix: failing GHA #183)
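
For illustration, a minimal sketch of the conditional-skip pattern described above, using pytest; the guard and marker names here are assumptions for illustration, not necessarily what this PR implements:

import pytest

try:
    import sparsifyml  # optional dependency; no longer listed in nm_dependencies
except ImportError:
    sparsifyml = None

# Integration tests decorated with this marker run only when sparsifyml
# is installed locally; otherwise pytest skips them.
requires_sparsifyml = pytest.mark.skipif(
    sparsifyml is None, reason="sparsifyml is not installed locally"
)

@requires_sparsifyml
def test_auto_integration():
    ...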

Usage:

(sparsify) ~ sparsify.auto --help
usage: sparsify.auto [-h] --task TASK --dataset DATASET
                     [--save_directory SAVE_DIRECTORY]
                     [--performance PERFORMANCE] [--base_model BASE_MODEL]
                     [--recipe RECIPE] [--recipe_args RECIPE_ARGS]
                     [--distill_teacher DISTILL_TEACHER]
                     [--num_trials NUM_TRIALS]
                     [--max_train_time MAX_TRAIN_TIME]
                     [--maximum_trial_saves MAXIMUM_TRIAL_SAVES]
                     [--no_stopping] [--resume RESUME]
                     [--optimizing_metric OPTIMIZING_METRIC]
                     [--kwargs KWARGS] [--teacher_kwargs TEACHER_KWARGS]
                     [--tuning_parameters TUNING_PARAMETERS]
                     [--teacher_tuning_parameters TEACHER_TUNING_PARAMETERS]
                     [--teacher_only]

optional arguments:
  -h, --help            show this help message and exit
  --task TASK           task to train the sparsified model on
  --dataset DATASET     path to dataset to train on
  --save_directory SAVE_DIRECTORY
                        Absolute path to save directory
  --performance PERFORMANCE
                        Preferred tradeoff between accuracy and performance.
                        Can be a string or a float value in the range [0, 1].
                        Currently supported strings (and their respective
                        float values) are `accuracy` (0), `balanced` (0.5),
                        and `performant` (1.0).
  --base_model BASE_MODEL
                        path to base model to begin sparsification from
  --recipe RECIPE       file path to or zoo stub of sparsification recipe to
                        be applied
  --recipe_args RECIPE_ARGS
                        keyword args to override recipe variables with
  --distill_teacher DISTILL_TEACHER
                        teacher to use for distillation. Can be a path to a
                        model file or zoo stub, 'off' for no distillation,
                        and default value of 'auto' to auto-tune base model
                        as teacher
  --num_trials NUM_TRIALS
                        Number of tuning trials to be run before returning
                        best found model. Set to None to not impose a trial
                        limit. max_train_time may limit the actual num_trials
                        run
  --max_train_time MAX_TRAIN_TIME
                        Maximum number of hours to train before returning
                        best trained model.
  --maximum_trial_saves MAXIMUM_TRIAL_SAVES
                        Number of best trials to save on the drive. Items
                        saved for a trial include the trained model and
                        associated artifacts. If this value is set to n, then
                        at most n+1 models will be saved at any given time on
                        the machine. Default value of None allows for
                        unlimited model saving
  --no_stopping         Set to True to turn off tuning stopping condition,
                        which may end tuning early if no improvement was made
  --resume RESUME       To continue a tuning run, provide path to the high
                        level directory of run you wish to resume
  --optimizing_metric OPTIMIZING_METRIC
                        The criterion to search the model for; multiple
                        metrics can be specified, e.g. --optimizing_metric f1
                        --optimizing_metric latency. Supported metrics are
                        ['accuracy', 'f1', 'recall', 'mAP', 'latency',
                        'throughput', 'compression', 'file_size',
                        'memory_usage']
  --kwargs KWARGS       optional task specific arguments to add to config
  --teacher_kwargs TEACHER_KWARGS
                        optional task specific arguments to add to teacher
                        config
  --tuning_parameters TUNING_PARAMETERS
                        path to config file containing custom parameter
                        tuning settings. See example tuning config output for
                        expected format
  --teacher_tuning_parameters TEACHER_TUNING_PARAMETERS
                        path to config file containing custom teacher
                        parameter tuning settings. See example tuning config
                        output for expected format
  --teacher_only        set to True to only auto tune the teacher

Example Commands:

  • Image Classification
sparsify.auto --task image_classification --dataset /home/ubuntu/datasets/imagenette/imagenette-160

NOTE: Using experimental fast data loading logic. To disable, pass
    "--load_fast=false" and report issues on GitHub. More details:
    https://github.com/tensorflow/tensorboard/issues/4784

TensorBoard listening on http://localhost:6006/

*************************SPARSIFY AUTO**********************
Starting hyperparameter tuning on student model
************************************************************

INFO:auto_banner:Starting hyperparameter tuning on student model

*************************SPARSIFY AUTO**********************
Starting tuning trial #0
************************************************************

INFO:auto_banner:Starting tuning trial #0
Model id is set to sparsify_auto_image_classification
2023-02-22 18:38:59 __main__     INFO     created model with key resnet50: ResNet(
  (input): _Input(
    (conv): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
    (bn): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (act): ReLU(inplace=True)
    (pool): MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)
    
    .
    .
    .
    .
          (activation_post_process): Identity()
        )
      )
    )
  )
  (classifier): _Classifier(
    (avgpool): AdaptiveAvgPool2d(output_size=1)
    (fc): Linear(in_features=2048, out_features=10, bias=True)
    (softmax): Softmax(dim=1)
  )
)
2023-02-22 18:38:59 __main__     INFO     running on device cuda:0
INFO:__main__:running on device cuda:0
Test epoch -1/-1: 100%|██████████████████████████| 8/8 [00:01<00:00,  5.14it/s]
2023-02-22 18:39:01 __main__     INFO
Initial validation results: ModuleRunResults(__loss__=2.218540668487549, top1acc=12.199999809265137, top5acc=69.4000015258789)
INFO:__main__:
Initial validation results: ModuleRunResults(__loss__=2.218540668487549, top1acc=12.199999809265137, top5acc=69.4000015258789)
2023-02-22 18:39:01 __main__     INFO     Starting training from epoch 0
INFO:__main__:Starting training from epoch 0
Train epoch 0/206:  87%|██████████████████▎  | 176/202 [00:21<
  • Transformers QA
(sparsify) ~ sparsify.auto --task question_answering --dataset squad
.
.
.
.

  warnings.warn(
[INFO|trainer.py:1607] 2023-02-22 21:54:15,467 >> ***** Running training *****
[INFO|trainer.py:1608] 2023-02-22 21:54:15,467 >>   Num examples = 88524
[INFO|trainer.py:1609] 2023-02-22 21:54:15,467 >>   Num Epochs = 3
[INFO|trainer.py:1610] 2023-02-22 21:54:15,467 >>   Instantaneous batch size per device = 8
[INFO|trainer.py:1611] 2023-02-22 21:54:15,467 >>   Total train batch size (w. parallel, distributed & accumulation) = 8
[INFO|trainer.py:1612] 2023-02-22 21:54:15,467 >>   Gradient Accumulation steps = 1
[INFO|trainer.py:1613] 2023-02-22 21:54:15,467 >>   Total optimization steps = 33198
  0%|                                                | 0/33198 [00:00<?, ?it/s][W reducer.cpp:1258] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration,  which can adversely affect performance. If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. (function operator())
  0%|| 157/33198 [00:31<1:49:56,  5.01it/s
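
A more fully specified invocation, combining flags documented in the help output above (the paths and values here are illustrative, not taken from an actual run):

(sparsify) ~ sparsify.auto --task question_answering --dataset squad \
    --performance balanced \
    --num_trials 3 --max_train_time 12.0 \
    --optimizing_metric f1 --optimizing_metric latency \
    --save_directory /home/ubuntu/sparsify_auto_runs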

Depends on: https://github.com/neuralmagic/sparsifyml/pull/25

bfineran previously approved these changes Feb 23, 2023

@bfineran bfineran left a comment

LGTM pending GHA issues; are these related?

@rahul-tuli rahul-tuli self-assigned this Feb 28, 2023
@rahul-tuli rahul-tuli marked this pull request as ready for review February 28, 2023 15:55
@rahul-tuli rahul-tuli marked this pull request as draft February 28, 2023 15:56
@rahul-tuli rahul-tuli marked this pull request as ready for review February 28, 2023 16:12
KSGulin previously approved these changes Feb 28, 2023

@KSGulin KSGulin left a comment

LGTM, good stuff @rahul-tuli! We do need some verification that the end-to-end runs are producing the expected results. We can land without the full testing, but let's be ready to follow up with another PR in case there are updates to be made.

bfineran
bfineran previously approved these changes Feb 28, 2023
@rahul-tuli (Member Author)

LGTM, good stuff @rahul-tuli! We do need some verification that the end-to-end runs are producing the expected results. We can land without the full testing, but let's be ready to follow up with another PR in case there are updates to be made.

I agree. I ran a few runs that ended fine, but I will keep an eye on solidifying the code.

@rahul-tuli rahul-tuli closed this Feb 28, 2023
@rahul-tuli rahul-tuli reopened this Feb 28, 2023
@rahul-tuli (Member Author)

Note: closed this PR by mistake; it is reopened now.

KSGulin previously approved these changes Mar 1, 2023
@rahul-tuli rahul-tuli dismissed stale reviews from KSGulin and bfineran via a1e1fe4 March 1, 2023 14:26
implement own strtobool function
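
For context, distutils.util.strtobool (deprecated along with distutils) is the helper such hand-rolled functions usually replace. A minimal sketch with the conventional semantics, not necessarily this PR's exact code:

def strtobool(value: str) -> bool:
    # Accept the same truthy/falsy spellings distutils.util.strtobool accepted
    value = value.strip().lower()
    if value in ("y", "yes", "t", "true", "on", "1"):
        return True
    if value in ("n", "no", "f", "false", "off", "0"):
        return False
    raise ValueError(f"invalid truth value {value!r}")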
@rahul-tuli rahul-tuli merged commit 7600111 into sparsify.alpha Mar 1, 2023
@rahul-tuli rahul-tuli deleted the sparsify.alpha.auto branch March 1, 2023 15:44
bfineran pushed a commit that referenced this pull request Jun 30, 2023
* Update: sparsify.version to match with main

* Delete: sparsify.package

* Empty commit

* Add: stitch functions

* Update: Env var name
Update: stitch functions slightly

* Add: Sparsifyml to dependencies in setup.py

* Style: Fixes

* Some more fixers

* OLD IC integration working

* Run Integration Tests only when sparsifyml installed

* Fix yolov5 integration

* Propagate student args to teacher

* Update teacher kwargs only when key not present for safety

* Updated: integration_test

* Updated: num trials to 2

* Fix: failing GHA

* make sparsifyml optional
implement own strtobool function
KSGulin added a commit that referenced this pull request Jul 7, 2023
* Clear existing sparsify source

* Add back version file

* Port of sparsify.auto from private repository (#124)

* remove javascript deps

* Initial port of autosparse to sparsify.auto

* Initial port autosparse -> sparsify.auto

* Added tests and fixes

* Add back yarn

* Add github workflow for test checks

* Update workflows

Co-authored-by: Benjamin Fineran <bfineran@users.noreply.github.com>

workflow

* Add GHA tests for base, package, and auto (#133)

* `sparsify.package` base CLI (#125)

* bump up main to 1.2.0 (#128)

Co-authored-by: dhuang <dhuang@MacBook-Pro.local>

* Adds the following:

* Setup directory Structure
* `from sparsify import package` importable + callable function
* A constants file with supported tasks, criterions, and deployment scenarios (should probably be converted to `Enums` or something better than `python lists`)
* Add `click` as a required dependency
* Additional CLI helpers for updated acceptance criterion
* `sparsify.package` cli utility
* setup test directory
* Add tests for CLI
* Setup Entrypoints

* Remove old docstring

* - Moved utils outside `package`
- Renamed package_ to package
- Add more tests
- Update Usage command
- Rebased on `sparsify.alpha`
- Add typing
- Add version info to cli

Apply review comments from @corey-nm
- Remove `cli_helpers.py` and rely on `click`

* Remove unintended change added while resolving merge conflicts

* Style

* Add dataset registry
update cli to use dataset registry

* Fix failing tests

* Centralize task registry (#132)

* Centralize task name and alias handling

* Propagate TaskName updates to auto tasks

* Fix click parse args call

* Fix failing tests after TASK name updates

* Prevent auto install of integrations on sparsify import (#134)

* * Change `NO_VNNI` --> `DEFAULT`
* Refactor CLI arg parsing because `System.exit()` was originally thrown on invoking help
* Rename `scenario` --> `target`
* Remove single character shortcuts, as suggested by @bfineran
* Default directory to `None` for now, logic to choose an appropriate name will be added to diff #130
* Added show defaults at the top level `click.command()` decorator
* Added a `DEFAULT_OPTIMIZNG_METRIC`
* Added a `DEFAULT_DEPLOYMENT_SCENARIO`
* Changed `optimizing_metric` help message
* Updated Tests

* - Style
- Example Usage

Co-authored-by: dhuangnm <74931910+dhuangnm@users.noreply.github.com>
Co-authored-by: dhuang <dhuang@MacBook-Pro.local>
Co-authored-by: Konstantin Gulin <66528950+KSGulin@users.noreply.github.com>

* Add DDP support (#126)

* `sparsify.package` backend-call (#130)

* bump up main to 1.2.0 (#128)

Co-authored-by: dhuang <dhuang@MacBook-Pro.local>

* Adds the following:

* Setup directory Structure
* `from sparsify import package` importable + callable function
* A constants file with supported tasks, criterions, and deployment scenarios (should probably be converted to `Enums` or something better than `python lists`)
* Add `click` as a required dependency
* Additional CLI helpers for updated acceptance criterion
* `sparsify.package` cli utility
* setup test directory
* Add tests for CLI
* Setup Entrypoints

* Remove old docstring

* - Moved utils outside `package`
- Renamed package_ to package
- Add more tests
- Update Usage command
- Rebased on `sparsify.alpha`
- Add typing
- Add version info to cli

Apply review comments from @corey-nm
- Remove `cli_helpers.py` and rely on `click`

* Remove unintended change added while resolving merge conflicts

* Style

* Add dataset registry
update cli to use dataset registry

* Fix failing tests

* Centralize task registry (#132)

* Centralize task name and alias handling

* Propagate TaskName updates to auto tasks

* Fix click parse args call

* Fix failing tests after TASK name updates

* Prevent auto install of integrations on sparsify import (#134)

* * Change `NO_VNNI` --> `DEFAULT`
* Refactor CLI arg parsing because `System.exit()` was originally thrown on invoking help
* Rename `scenario` --> `target`
* Remove single character shortcuts, as suggested by @bfineran
* Default directory to `None` for now, logic to choose an appropriate name will be added to diff #130
* Added show defaults at the top level `click.command()` decorator
* Added a `DEFAULT_OPTIMIZNG_METRIC`
* Added a `DEFAULT_DEPLOYMENT_SCENARIO`
* Changed `optimizing_metric` help message
* Updated Tests

* - Style
- Example Usage

* Add proper commands + gha workflows

* Refactor package function to make a call to the backend service

* Add template function for output
Add importable Backend Base url
Remove unnecessary args from package function
Add end to end integration test

* Updated tests, addressed comments

* Base Cli + importable function

* Style

* Remove files added in faulty rebase

* Changed base url, styling

Co-authored-by: dhuangnm <74931910+dhuangnm@users.noreply.github.com>
Co-authored-by: dhuang <dhuang@MacBook-Pro.local>
Co-authored-by: Konstantin Gulin <66528950+KSGulin@users.noreply.github.com>
Co-authored-by: Konstantin <konstantin@neuralmagic.com>

* `sparsify.package` updates (#141)

* Update output to also print model metrics
Update `--optimizing_metrics` to take in a string containing comma-separated metrics, for example `--optimizing_metric "compression, accuracy"` (added a `_csv_callback` function for that)
Update Usage instructions accordingly
Add a log statement to package function
Added more tests

* Address comments

* Rename `normalized_metric` --> `metric_` to avoid potential confusion

* Add a getter for TASK_REGISTRY and DATASET_REGISTRY (#142)

* Add a getter for TASK_REGISTRY and DATASET_REGISTRY

* typing

* fix potential bug

* Add None to test

* Updated tests according to comments from @bfineran

* Make test cleaner based on feedback from @corey-nm

* Remove config creator (#136)

* [Auto] Add Tensorboard Support (#147)

* Support for Hyperparameter Tuning (#145)

* force convert yolov5 metric keys to float (#151)

* [Auto] Update function name and description to be more generic (#149)

* rename and flip logic for stopping_condition flag (#152)

* [Auto] Support for multi-stage tuning (#157)

* Support for updated tuning flow (#159)

* Support tuning of CLI args (#158)

* Support multiple optimizing metrics (#160)

* Log important updates with an easily visible format (#161)

* Update the user output for `sparsify.package` (#166)

* Add Dockerfile
Download deployment directory, and
Update instructions for user
Update tests

* Add volume mount to docker command

* [Auto] Update interface for sparsifyml (#173)

* Fix: remove debug line

* Update sparsify.auto interface for sparsifyml

* rename interface -> schemas

* Sparsify.alpha.auto (#179)

* Update: sparsify.version to match with main

* Delete: sparsify.package

* Empty commit

* Add: stitch functions

* Update: Env var name
Update: stitch functions slightly

* Add: Sparsifyml to dependencies in setup.py

* Style: Fixes

* Some more fixers

* OLD IC integration working

* Run Integration Tests only when sparsifyml installed

* Fix yolov5 integration

* Propagate student args to teacher

* Update teacher kwargs only when key not present for safety

* Updated: integration_test

* Updated: num trials to 2

* Fix: failing GHA

* make sparsifyml optional
implement own strtobool function

* [Create] alpha implementation (#181)

* [Create] alpha implementation

* Apply suggestions from code review

* Apply suggestions from code review

Co-authored-by: corey-nm <109536191+corey-nm@users.noreply.github.com>

---------

Co-authored-by: corey-nm <109536191+corey-nm@users.noreply.github.com>

* Adding one shot cli (#184)

* [Feature branch] standard clis (#187)

* Adding skeleton clis

* [CLI standardization] sparsify.run one-shot impl (#188)

* [CLI standardization] sparsify.run one-shot impl

* Fixing one-shot cli

---------

Co-authored-by: Corey Lowman <corey@neuralmagic.com>

* [WIP][CLI standardization] sparsify.run training-aware and sparse-transfer initial impl (#189)

* [CLI standardization] sparsify.run one-shot impl

* [WIP][CLI standardization] sparsify.run training-aware and sparse-transfer initial impl

* Fixing training-aware/sparse-transfer

---------

Co-authored-by: Corey Lowman <corey@neuralmagic.com>

* Adding docstring to sparsify.run

* Moving use case to top arg

* Removing apply/init

---------

Co-authored-by: Benjamin Fineran <bfineran@users.noreply.github.com>

* Style changes for sparsify.alpha (#194)

* Update: Minimum supported Python Version to `3.7` as it's consistent with our other repos (#193)

* [Add] `sparsify.login` CLI and function (#180)

* Adding sparsify.login entrypoint and function

* Adding docstring to exception

* Adding pip install of sparsifyml

* Respond to review

* Adding help message at top

* Adding setup python to workflow

* Adding checked sparsifyml import

* Apply suggestions from code review

Co-authored-by: Danny Guinther <dannyguinther@gmail.com>

* check against major minor version only

* add client_id and other bug fixes

* Fix: `--index` --> `--index-url`

* Update install command missed during rebase

* * Clean up code
* Remove Global variables
* Update PyPi Server link
* Add Logging
* Move exceptions to their own file

* Style fixes

* Apply suggestions from code review

Add: suggestion from @KSGulin

Co-authored-by: Konstantin Gulin <66528950+KSGulin@users.noreply.github.com>

* Update src/sparsify/login.py

Co-authored-by: Konstantin Gulin <66528950+KSGulin@users.noreply.github.com>

* remove comment

---------

Co-authored-by: Benjamin Fineran <bfineran@users.noreply.github.com>
Co-authored-by: Danny Guinther <dannyguinther@gmail.com>
Co-authored-by: Benjamin <ben@neuralmagic.com>
Co-authored-by: rahul-tuli <rahul@neuralmagic.com>
Co-authored-by: Konstantin Gulin <66528950+KSGulin@users.noreply.github.com>

* training aware and sparse transfer run mode support (#191)

* add sparsifyml dependencies to sparsify install (#195)

* update task registry + generalize matching (#201)

* rename performance to optim-level in legacy auto api (#199)

* [sparsify.run one-shot] CLI propagation of recipe_args (#198)

* Remove hardware optimization options (#200)

* Remove hardware optimization options

* Rename instead of remove optim_level

* Add OPTIM_LEVEL back to all list

* simple fixes in initial one-shot testing flow (#206)

* fixes for initial E2E runs of sparse transfer and training aware (#207)

* fixes for initial E2E runs of sparse transfer and training aware

* quality

* [Alpha] Rework Auto main script into Training-Aware and Sparse-Transfer script (#208)

* Initial scratch work

* Complete, but untested implementation

* Working yolov5

* Working across all integrations

* IC path fix

* Require model

* Remove debug adds

* make API KEY an argument (#211)

* Update integration and unit tests (#214)

* Update integration and unit tests

* Update IC base test model

* Add login step to test setup (#216)

* bump up version to 1.6.0 (#215) (#218)

Co-authored-by: dhuang <dhuang@MacBook-Pro-2.local>

(cherry picked from commit 699a476)

Co-authored-by: dhuangnm <74931910+dhuangnm@users.noreply.github.com>

* [BugFixes] Fix failing tests in `sparsify.alpha` (#223)

* Intermediate commit should be amended

* Remove failing test as synced with @KSGulin

* Explicitly pin protobuf dependencies. (#225)

* Default num_samples to None (#227)

* remove legacy UI cmds from `make build` (#229)

* Remove dev print statements from IC runner (#231)

* Remove dev print statements

* Remove logger

* Fix incomplete wheel build (#232)

* Fix incomplete wheel build

* Add license string

* Add environment checks

* Address review comments

* Catch generic Exception

* signal test

---------

Co-authored-by: Rahul Tuli <rahul@neuralmagic.com>
Co-authored-by: dhuangnm <74931910+dhuangnm@users.noreply.github.com>
Co-authored-by: dhuang <dhuang@MacBook-Pro.local>
Co-authored-by: Benjamin Fineran <bfineran@users.noreply.github.com>
Co-authored-by: corey-nm <109536191+corey-nm@users.noreply.github.com>
Co-authored-by: Danny Guinther <dannyguinther@gmail.com>
Co-authored-by: Benjamin <ben@neuralmagic.com>