
Sparsify.alpha.auto #179

Merged
merged 19 commits into sparsify.alpha from sparsify.alpha.auto
Mar 1, 2023

Conversation

@rahul-tuli rahul-tuli commented Feb 22, 2023

This PR stitches together sparsify.auto with sparsifyml.

It consists of the following pre-approved pieces:

And the following unapproved pieces:

  • Run integration tests only if sparsifyml is installed locally; otherwise skip them (see the sketch below).
    Additionally, remove sparsifyml from nm_dependencies.
  • Max steps propagation to the teacher
  • Fix failing GHA (Fix: failing GHA #183)
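
For illustration, a minimal sketch of the conditional-skip pattern described above, using pytest; the guard and marker names here are assumptions for illustration, not necessarily what this PR implements:

import pytest

try:
    import sparsifyml  # optional dependency; no longer listed in nm_dependencies
except ImportError:
    sparsifyml = None

# Integration tests decorated with this marker run only when sparsifyml
# is installed locally; otherwise pytest skips them.
requires_sparsifyml = pytest.mark.skipif(
    sparsifyml is None, reason="sparsifyml is not installed locally"
)

@requires_sparsifyml
def test_auto_integration():
    ...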

Usage:

(sparsify) ~ sparsify.auto --help
usage: sparsify.auto [-h] --task TASK --dataset DATASET
                     [--save_directory SAVE_DIRECTORY]
                     [--performance PERFORMANCE] [--base_model BASE_MODEL]
                     [--recipe RECIPE] [--recipe_args RECIPE_ARGS]
                     [--distill_teacher DISTILL_TEACHER]
                     [--num_trials NUM_TRIALS]
                     [--max_train_time MAX_TRAIN_TIME]
                     [--maximum_trial_saves MAXIMUM_TRIAL_SAVES]
                     [--no_stopping] [--resume RESUME]
                     [--optimizing_metric OPTIMIZING_METRIC]
                     [--kwargs KWARGS] [--teacher_kwargs TEACHER_KWARGS]
                     [--tuning_parameters TUNING_PARAMETERS]
                     [--teacher_tuning_parameters TEACHER_TUNING_PARAMETERS]
                     [--teacher_only]

optional arguments:
  -h, --help            show this help message and exit
  --task TASK           task to train the sparsified model on
  --dataset DATASET     path to dataset to train on
  --save_directory SAVE_DIRECTORY
                        Absolute path to save directory
  --performance PERFORMANCE
                        Preferred tradeoff between accuracy and performance.
                        Can be a string or a float value in the range [0, 1].
                        Currently supported strings (and their respective
                        float values) are `accuracy` (0), `balanced` (0.5),
                        and `performant` (1.0).
  --base_model BASE_MODEL
                        path to base model to begin sparsification from
  --recipe RECIPE       file path to or zoo stub of sparsification recipe to
                        be applied
  --recipe_args RECIPE_ARGS
                        keyword args to override recipe variables with
  --distill_teacher DISTILL_TEACHER
                        teacher to use for distillation. Can be a path to a
                        model file or zoo stub, 'off' for no distillation,
                        and default value of 'auto' to auto-tune base model
                        as teacher
  --num_trials NUM_TRIALS
                        Number of tuning trials to be run before returning
                        best found model. Set to None to not impose a trial
                        limit. max_train_time may limit the actual num_trials
                        run
  --max_train_time MAX_TRAIN_TIME
                        Maximum number of hours to train before returning
                        best trained model.
  --maximum_trial_saves MAXIMUM_TRIAL_SAVES
                        Number of best trials to save on the drive. Items
                        saved for a trial include the trained model and
                        associated artifacts. If this value is set to n, then
                        at most n+1 models will be saved at any given time on
                        the machine. Default value of None allows for
                        unlimited model saving
  --no_stopping         Set to True to turn off tuning stopping condition,
                        which may end tuning early if no improvement was made
  --resume RESUME       To continue a tuning run, provide path to the high
                        level directory of run you wish to resume
  --optimizing_metric OPTIMIZING_METRIC
                        The criterion to search the model for; multiple
                        metrics can be specified, e.g. --optimizing_metric f1
                        --optimizing_metric latency. Supported metrics are
                        ['accuracy', 'f1', 'recall', 'mAP', 'latency',
                        'throughput', 'compression', 'file_size',
                        'memory_usage']
  --kwargs KWARGS       optional task specific arguments to add to config
  --teacher_kwargs TEACHER_KWARGS
                        optional task specific arguments to add to teacher
                        config
  --tuning_parameters TUNING_PARAMETERS
                        path to config file containing custom parameter
                        tuning settings. See example tuning config output for
                        expected format
  --teacher_tuning_parameters TEACHER_TUNING_PARAMETERS
                        path to config file containing custom teacher
                        parameter tuning settings. See example tuning config
                        output for expected format
  --teacher_only        set to True to only auto tune the teacher

Example Commands:

  • Image Classification
sparsify.auto --task image_classification --dataset /home/ubuntu/datasets/imagenette/imagenette-160

NOTE: Using experimental fast data loading logic. To disable, pass
    "--load_fast=false" and report issues on GitHub. More details:
    https://github.com/tensorflow/tensorboard/issues/4784

TensorBoard listening on http://localhost:6006/

*************************SPARSIFY AUTO**********************
Starting hyperparameter tuning on student model
************************************************************

INFO:auto_banner:Starting hyperparameter tuning on student model

*************************SPARSIFY AUTO**********************
Starting tuning trial #0
************************************************************

INFO:auto_banner:Starting tuning trial #0
Model id is set to sparsify_auto_image_classification
2023-02-22 18:38:59 __main__     INFO     created model with key resnet50: ResNet(
  (input): _Input(
    (conv): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
    (bn): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (act): ReLU(inplace=True)
    (pool): MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)
    
    .
    .
    .
    .
          (activation_post_process): Identity()
        )
      )
    )
  )
  (classifier): _Classifier(
    (avgpool): AdaptiveAvgPool2d(output_size=1)
    (fc): Linear(in_features=2048, out_features=10, bias=True)
    (softmax): Softmax(dim=1)
  )
)
2023-02-22 18:38:59 __main__     INFO     running on device cuda:0
INFO:__main__:running on device cuda:0
Test epoch -1/-1: 100%|██████████████████████████| 8/8 [00:01<00:00,  5.14it/s]
2023-02-22 18:39:01 __main__     INFO
Initial validation results: ModuleRunResults(__loss__=2.218540668487549, top1acc=12.199999809265137, top5acc=69.4000015258789)
INFO:__main__:
Initial validation results: ModuleRunResults(__loss__=2.218540668487549, top1acc=12.199999809265137, top5acc=69.4000015258789)
2023-02-22 18:39:01 __main__     INFO     Starting training from epoch 0
INFO:__main__:Starting training from epoch 0
Train epoch 0/206:  87%|██████████████████▎  | 176/202 [00:21<
  • Transformers QA
(sparsify) ~ sparsify.auto --task question_answering --dataset squad
.
.
.
.

  warnings.warn(
[INFO|trainer.py:1607] 2023-02-22 21:54:15,467 >> ***** Running training *****
[INFO|trainer.py:1608] 2023-02-22 21:54:15,467 >>   Num examples = 88524
[INFO|trainer.py:1609] 2023-02-22 21:54:15,467 >>   Num Epochs = 3
[INFO|trainer.py:1610] 2023-02-22 21:54:15,467 >>   Instantaneous batch size per device = 8
[INFO|trainer.py:1611] 2023-02-22 21:54:15,467 >>   Total train batch size (w. parallel, distributed & accumulation) = 8
[INFO|trainer.py:1612] 2023-02-22 21:54:15,467 >>   Gradient Accumulation steps = 1
[INFO|trainer.py:1613] 2023-02-22 21:54:15,467 >>   Total optimization steps = 33198
  0%|                                                | 0/33198 [00:00<?, ?it/s][W reducer.cpp:1258] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration,  which can adversely affect performance. If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. (function operator())
  0%|| 157/33198 [00:31<1:49:56,  5.01it/s
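
A more fully specified invocation, combining flags documented in the help output above (the paths and values here are illustrative, not taken from an actual run):

(sparsify) ~ sparsify.auto --task question_answering --dataset squad \
    --performance balanced \
    --num_trials 3 --max_train_time 12.0 \
    --optimizing_metric f1 --optimizing_metric latency \
    --save_directory /home/ubuntu/sparsify_auto_runs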

Depends on: https://github.com/neuralmagic/sparsifyml/pull/25

bfineran previously approved these changes Feb 23, 2023

@bfineran bfineran left a comment

LGTM pending GHA issues; are these related?

@rahul-tuli rahul-tuli self-assigned this Feb 28, 2023
@rahul-tuli rahul-tuli marked this pull request as ready for review February 28, 2023 15:55
@rahul-tuli rahul-tuli marked this pull request as draft February 28, 2023 15:56
@rahul-tuli rahul-tuli marked this pull request as ready for review February 28, 2023 16:12
KSGulin previously approved these changes Feb 28, 2023

@KSGulin KSGulin left a comment

LGTM, good stuff @rahul-tuli! We do need some verification that the end-to-end runs are producing the expected results. We can land without the full testing, but let's be ready to follow up with another PR in case there are updates to be made.

bfineran
bfineran previously approved these changes Feb 28, 2023
@rahul-tuli (Member Author)

LGTM, good stuff @rahul-tuli! We do need some verification that the end-to-end runs are producing the expected results. We can land without the full testing, but let's be ready to follow up with another PR in case there are updates to be made.

I agree. I ran a few runs that ended fine, but I will keep an eye on solidifying the code.

@rahul-tuli rahul-tuli closed this Feb 28, 2023
@rahul-tuli rahul-tuli reopened this Feb 28, 2023
@rahul-tuli (Member Author)

Note: closed this PR by mistake; it is reopened now.

KSGulin previously approved these changes Mar 1, 2023
@rahul-tuli rahul-tuli dismissed stale reviews from KSGulin and bfineran via a1e1fe4 March 1, 2023 14:26
implement own strtobool function
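
For context, distutils.util.strtobool (deprecated along with distutils) is the helper such hand-rolled functions usually replace. A minimal sketch with the conventional semantics, not necessarily this PR's exact code:

def strtobool(value: str) -> bool:
    # Accept the same truthy/falsy spellings distutils.util.strtobool accepted
    value = value.strip().lower()
    if value in ("y", "yes", "t", "true", "on", "1"):
        return True
    if value in ("n", "no", "f", "false", "off", "0"):
        return False
    raise ValueError(f"invalid truth value {value!r}")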
@rahul-tuli rahul-tuli merged commit 7600111 into sparsify.alpha Mar 1, 2023
@rahul-tuli rahul-tuli deleted the sparsify.alpha.auto branch March 1, 2023 15:44
bfineran pushed a commit that referenced this pull request Jun 30, 2023
* Update: sparsify.version to match with main

* Delete: sparsify.package

* Empty commit

* Add: stitch functions

* Update: Env var name
Update: stitch functions slightly

* Add: Sparsifyml to dependencies in setup.py

* Style: Fixes

* Some more fixers

* OLD IC integration working

* Run Integration Tests only when sparsifyml installed

* Fix yolov5 integration

* Propagate student args to teacher

* Update teacher kwargs only when key not present for safety

* Updated: integration_test

* Updated: num trials to 2

* Fix: failing GHA

* make sparsifyml optional
implement own strtobool function
KSGulin added a commit that referenced this pull request Jul 7, 2023
* Clear existing sparsify source

* Add back version file

* Port of sparsify.auto from private repository (#124)

* remove javascript deps

* Initial port of autosparse to sparsify.auto

* Initial port autosparse -> sparsify.auto

* Added tests and fixes

* Add back yarn

* Add github workflow for test checks

* Update workflows

Co-authored-by: Benjamin Fineran <bfineran@users.noreply.github.com>

workflow

* Add GHA tests for base, package, and auto (#133)

* `sparsify.package` base CLI (#125)

* bump up main to 1.2.0 (#128)

Co-authored-by: dhuang <dhuang@MacBook-Pro.local>

* Adds the following:

* Setup directory Structure
* `from sparsify import package` importable + callable function
* A constants file with supported tasks, criterions, and deployment scenarios (should probably be converted to `Enums` or something better than `python lists`)
* Add `click` as a required dependency
* Additional CLI helpers for updated acceptance criterion
* `sparsify.package` cli utility
* setup test directory
* Add tests for CLI
* Setup Entrypoints

* Remove old docstring

* - Moved utils outside `package`
- Renamed package_ to package
- Add more tests
- Update Usage command
- Rebased on `sparsify.alpha`
- Add typing
- Add version info to cli

Apply review comments from @corey-nm
- Remove `cli_helpers.py` and rely on `click`

* Remove unintended change added while resolving merge conflicts

* Style

* Add dataset registry
update cli to use dataset registry

* Fix failing tests

* Centralize task registry (#132)

* Centralize task name and alias handling

* Propagate TaskName updates to auto tasks

* Fix click parse args call

* Fix failing tests after TASK name updates

* Prevent auto install of integrations on sparsify import (#134)

* * Change `NO_VNNI` --> `DEFAULT`
* Refactor CLI arg parsing because `System.exit()` was originally thrown on invoking help
* Rename `scenario` --> `target`
* Remove single character shortcuts, as suggested by @bfineran
* Default directory to `None` for now, logic to choose an appropriate name will be added to diff #130
* Added show defaults at the top level `click.command()` decorator
* Added a `DEFAULT_OPTIMIZNG_METRIC`
* Added a `DEFAULT_DEPLOYMENT_SCENARIO`
* Changed `optimizing_metric` help message
* Updated Tests

* - Style
- Example Usage

Co-authored-by: dhuangnm <74931910+dhuangnm@users.noreply.github.com>
Co-authored-by: dhuang <dhuang@MacBook-Pro.local>
Co-authored-by: Konstantin Gulin <66528950+KSGulin@users.noreply.github.com>

* Add DDP support (#126)

* `sparsify.package` backend-call (#130)

* bump up main to 1.2.0 (#128)

Co-authored-by: dhuang <dhuang@MacBook-Pro.local>

* Adds the following:

* Setup directory Structure
* `from sparsify import package` importable + callable function
* A constants file with supported tasks, criterions, and deployment scenarios (should probably be converted to `Enums` or something better than `python lists`)
* Add `click` as a required dependency
* Additional CLI helpers for updated acceptance criterion
* `sparsify.package` cli utility
* setup test directory
* Add tests for CLI
* Setup Entrypoints

* Remove old docstring

* - Moved utils outside `package`
- Renamed package_ to package
- Add more tests
- Update Usage command
- Rebased on `sparsify.alpha`
- Add typing
- Add version info to cli

Apply review comments from @corey-nm
- Remove `cli_helpers.py` and rely on `click`

* Remove unintended change added while resolving merge conflicts

* Style

* Add dataset registry
update cli to use dataset registry

* Fix failing tests

* Centralize task registry (#132)

* Centralize task name and alias handling

* Propagate TaskName updates to auto tasks

* Fix click parse args call

* Fix failing tests after TASK name updates

* Prevent auto install of integrations on sparsify import (#134)

* * Change `NO_VNNI` --> `DEFAULT`
* Refactor CLI arg parsing because `System.exit()` was originally thrown on invoking help
* Rename `scenario` --> `target`
* Remove single character shortcuts, as suggested by @bfineran
* Default directory to `None` for now, logic to choose an appropriate name will be added to diff #130
* Added show defaults at the top level `click.command()` decorator
* Added a `DEFAULT_OPTIMIZNG_METRIC`
* Added a `DEFAULT_DEPLOYMENT_SCENARIO`
* Changed `optimizing_metric` help message
* Updated Tests

* - Style
- Example Usage

* Add proper commands + gha workflows

* Refactor package function to make a call to the backend service

* Add template function for output
Add importable Backend Base url
Remove unnecessary args from package function
Add end to end integration test

* Updated tests, addressed comments

* Base Cli + importable function

* Style

* Remove files added in faulty rebase

* Changed base url, styling

Co-authored-by: dhuangnm <74931910+dhuangnm@users.noreply.github.com>
Co-authored-by: dhuang <dhuang@MacBook-Pro.local>
Co-authored-by: Konstantin Gulin <66528950+KSGulin@users.noreply.github.com>
Co-authored-by: Konstantin <konstantin@neuralmagic.com>

* `sparsify.package` updates (#141)

* Update output to also print model metrics
Update `--optimizing_metrics` to take in a string containing comma-separated metrics, for example `--optimizing_metric "compression, accuracy"` (added a `_csv_callback` function for that)
Update Usage instructions accordingly
Add a log statement to package function
Added more tests

* Address comments

* Rename `normalized_metric` --> `metric_` to avoid potential confusion

* Add a getter for TASK_REGISTRY and DATASET_REGISTRY (#142)

* Add a getter for TASK_REGISTRY and DATASET_REGISTRY

* typing

* fix potential bug

* Add None to test

* Updated tests according to comments from @bfineran

* Make test cleaner based on feedback from @corey-nm

* Remove config creator (#136)

* [Auto] Add Tensorboard Support (#147)

* Support for Hyperparameter Tuning (#145)

* force convert yolov5 metric keys to float (#151)

* [Auto] Update function name and description to be more generic (#149)

* rename and flip logic for stopping_condition flag (#152)

* [Auto] Support for multi-stage tuning (#157)

* Support for updated tuning flow (#159)

* Support tuning of CLI args (#158)

* Support multiple optimizing metrics (#160)

* Log important updates with an easily visible format (#161)

* Update the user output for `sparsify.package` (#166)

* Add Dockerfile
Download deployment directory, and
Update instructions for user
Update tests

* Add volume mount to docker command

* [Auto] Update interface for sparsifyml (#173)

* Fix: remove debug line

* Update sparsify.auto interface for sparsifyml

* rename interface -> schemas

* Sparsify.alpha.auto (#179)

* Update: sparsify.version to match with main

* Delete: sparsify.package

* Empty commit

* Add: stitch functions

* Update: Env var name
Update: stitch functions slightly

* Add: Sparsifyml to dependencies in setup.py

* Style: Fixes

* Some more fixers

* OLD IC integration working

* Run Integration Tests only when sparsifyml installed

* Fix yolov5 integration

* Propagate student args to teacher

* Update teacher kwargs only when key not present for safety

* Updated: integration_test

* Updated: num trials to 2

* Fix: failing GHA

* make sparsifyml optional
implement own strtobool function

* [Create] alpha implementation (#181)

* [Create] alpha implementation

* Apply suggestions from code review

* Apply suggestions from code review

Co-authored-by: corey-nm <109536191+corey-nm@users.noreply.github.com>

---------

Co-authored-by: corey-nm <109536191+corey-nm@users.noreply.github.com>

* Adding one shot cli (#184)

* [Feature branch] standard clis (#187)

* Adding skeleton clis

* [CLI standardization] sparsify.run one-shot impl (#188)

* [CLI standardization] sparsify.run one-shot impl

* Fixing one-shot cli

---------

Co-authored-by: Corey Lowman <corey@neuralmagic.com>

* [WIP][CLI standardization] sparsify.run training-aware and sparse-transfer initial impl (#189)

* [CLI standardization] sparsify.run one-shot impl

* [WIP][CLI standardization] sparsify.run training-aware and sparse-transfer initial impl

* Fixing training-aware/sparse-transfer

---------

Co-authored-by: Corey Lowman <corey@neuralmagic.com>

* Adding docstring to sparsify.run

* Moving use case to top arg

* Removing apply/init

---------

Co-authored-by: Benjamin Fineran <bfineran@users.noreply.github.com>

* Style changes for sparsify.alpha (#194)

* Update: Minimum supported Python Version to `3.7` as it's consistent with our other repos (#193)

* [Add] `sparsify.login` CLI and function (#180)

* Adding sparsify.login entrypoint and function

* Adding docstring to exception

* Adding pip install of sparsifyml

* Respond to review

* Adding help message at top

* Adding setup python to workflow

* Adding checked sparsifyml import

* Apply suggestions from code review

Co-authored-by: Danny Guinther <dannyguinther@gmail.com>

* check against major minor version only

* add client_id and other bug fixes

* Fix: `--index` --> `--index-url`

* Update install command missed during rebase

* * Clean up code
* Remove Global variables
* Update PyPi Server link
* Add Logging
* Move exceptions to their own file

* Style fixes

* Apply suggestions from code review

Add: suggestion from @KSGulin

Co-authored-by: Konstantin Gulin <66528950+KSGulin@users.noreply.github.com>

* Update src/sparsify/login.py

Co-authored-by: Konstantin Gulin <66528950+KSGulin@users.noreply.github.com>

* remove comment

---------

Co-authored-by: Benjamin Fineran <bfineran@users.noreply.github.com>
Co-authored-by: Danny Guinther <dannyguinther@gmail.com>
Co-authored-by: Benjamin <ben@neuralmagic.com>
Co-authored-by: rahul-tuli <rahul@neuralmagic.com>
Co-authored-by: Konstantin Gulin <66528950+KSGulin@users.noreply.github.com>

* training aware and sparse transfer run mode support (#191)

* add sparsifyml dependencies to sparsify install (#195)

* update task registry + generalize matching (#201)

* rename performance to optim-level in legacy auto api (#199)

* [sparsify.run one-shot] CLI propagation of recipe_args (#198)

* Remove hardware optimization options (#200)

* Remove hardware optimization options

* Rename instead of remove optim_level

* Add OPTIM_LEVEL back to all list

* simple fixes in initial one-shot testing flow (#206)

* fixes for initial E2E runs of sparse transfer and training aware (#207)

* fixes for initial E2E runs of sparse transfer and training aware

* quality

* [Alpha] Rework Auto main script into Training-Aware and Sparse-Transfer script (#208)

* Initial scratch work

* Complete, but untested implementation

* Working yolov5

* Working across all integrations

* IC path fix

* Require model

* Remove debug adds

* make API KEY an argument (#211)

* Update integration and unit tests (#214)

* Update integration and unit tests

* Update IC base test model

* Add login step to test setup (#216)

* bump up version to 1.6.0 (#215) (#218)

Co-authored-by: dhuang <dhuang@MacBook-Pro-2.local>

(cherry picked from commit 699a476)

Co-authored-by: dhuangnm <74931910+dhuangnm@users.noreply.github.com>

* [BugFixes] Fix failing tests in `sparsify.alpha` (#223)

* Intermediate commit should be amended

* Remove failing test as synced with @KSGulin

* Explicitly pin protobuf dependencies. (#225)

* Default num_samples to None (#227)

* remove legacy UI cmds from `make build` (#229)

* Remove dev print statements from IC runner (#231)

* Remove dev print statements

* Remove logger

* Fix incomplete wheel build (#232)

* Fix incomplete wheel build

* Add license string

* Add environment checks

* Address review comments

* Catch generic Exception

* signal test

---------

Co-authored-by: Rahul Tuli <rahul@neuralmagic.com>
Co-authored-by: dhuangnm <74931910+dhuangnm@users.noreply.github.com>
Co-authored-by: dhuang <dhuang@MacBook-Pro.local>
Co-authored-by: Benjamin Fineran <bfineran@users.noreply.github.com>
Co-authored-by: corey-nm <109536191+corey-nm@users.noreply.github.com>
Co-authored-by: Danny Guinther <dannyguinther@gmail.com>
Co-authored-by: Benjamin <ben@neuralmagic.com>