[Export Refactor] Feature Branch #1858

Merged: 21 commits, Jan 10, 2024
Changes from 18 commits

Commits
5058120
initial commit
dbogunowicz Nov 28, 2023
bb5887e
respond to PR comments
dbogunowicz Dec 4, 2023
2930dea
[Export Refactor][Image Classification] `create_model` function (#1878)
dbogunowicz Dec 11, 2023
5ac4d35
[Export Refactor][Image Classification] `create_dummy_input` function…
dbogunowicz Dec 11, 2023
59f3f5a
[Export Refactor][Image Classification] `export_model` function (#1883)
dbogunowicz Dec 11, 2023
9096b0d
[Export Refactor][Image Classification] `apply_optimizations` functio…
dbogunowicz Dec 11, 2023
16c9bf3
[Export Refactor][Image Classification] `export_sample_inputs_outputs…
dbogunowicz Dec 11, 2023
4d66402
remove duplicated function
dbogunowicz Dec 11, 2023
ed04d3f
[Export Refactor][Image Classification] `create_deployment_folder` fu…
dbogunowicz Dec 12, 2023
571aed8
[Export Refactor][Image Classification] `validate_correctness` functi…
dbogunowicz Dec 12, 2023
627ddd6
[Export Refactor] End to end testing (#1898)
dbogunowicz Dec 14, 2023
3da5f23
[Export Refactor] Prepare the module to be more general (before inclu…
dbogunowicz Dec 19, 2023
c65ab6e
[Export Refactor][Transformers] Enable loading SparseModels (#1921)
dbogunowicz Dec 21, 2023
e4770c8
Fix the tests
dbogunowicz Dec 29, 2023
7b28881
fix tests with help from sara
dbogunowicz Jan 2, 2024
6179cb2
[Export][Transformers] Enable loading `text-generation` datasets (#1938)
dbogunowicz Jan 5, 2024
7f166a1
tests fixed
dbogunowicz Jan 6, 2024
c3c90a4
fix test
dbogunowicz Jan 6, 2024
57a4dd0
[Export refactor] final manual testing fixes (#1948)
bfineran Jan 10, 2024
ee78625
Export Refactor CLI (#1949)
bfineran Jan 10, 2024
8c647b8
Merge branch 'main' into feature/damian/feature_branch_export
bfineran Jan 10, 2024
32 changes: 32 additions & 0 deletions .github/workflows/test-check.yaml
@@ -18,6 +18,7 @@ jobs:
deepsparse: ${{ steps.deepsparse-check.outputs.output }}
onnx: ${{ steps.onnx-check.outputs.output }}
pytorch: ${{ steps.pytorch-check.outputs.output }}
export: ${{ steps.export-check.outputs.output }}
steps:
- uses: actions/checkout@v2
with:
@@ -53,6 +54,12 @@ jobs:
((git diff --name-only origin/main HEAD | grep -E "[src|tests]/sparseml/pytorch|setup.py|.github")
|| (echo $GITHUB_REF | grep -E "refs/heads/[release/|main]"))
&& echo "::set-output name=output::1" || echo "::set-output name=output::0"
- name: "Checking if sparseml.export was changed"
id: export-check
run: >
((git diff --name-only origin/main HEAD | grep -E "[src|tests]/sparseml/export|setup.py|.github")
|| (echo $GITHUB_REF | grep -E "refs/heads/[release/|main]"))
&& echo "::set-output name=output::1" || echo "::set-output name=output::0"
base-tests:
runs-on: ubuntu-22.04
env:
@@ -221,3 +228,28 @@ jobs:
run: pip3 install .[dev,torch,transformers]
- name: "🔬 Running transformers tests"
run: make test TARGETS=transformers
export-tests:
runs-on: ubuntu-22.04
env:
SPARSEZOO_TEST_MODE: "true"
needs: test-setup
if: ${{needs.test-setup.outputs.export == 1}}
steps:
- uses: actions/setup-python@v4
with:
python-version: '3.11'
- uses: actions/checkout@v2
- uses: actions/checkout@v2
with:
repository: "neuralmagic/sparsezoo"
path: "sparsezoo"
ref: ${{needs.test-setup.outputs.branch}}
- name: "⚙️ Install sparsezoo dependencies"
run: pip3 install -U pip && pip3 install setuptools sparsezoo/
- name: "Clean sparsezoo directory"
run: rm -r sparsezoo/
- name: "⚙️ Install dependencies"
run: pip3 install .[dev,torch,transformers,torchvision,onnxruntime]
- name: "🔬 Running export tests"
run: make test TARGETS=export

5 changes: 4 additions & 1 deletion Makefile
@@ -9,7 +9,7 @@ MDCHECKFILES := CODE_OF_CONDUCT.md CONTRIBUTING.md DEVELOPING.md README.md
SPARSEZOO_TEST_MODE := "true"

BUILD_ARGS := # set nightly to build nightly release
TARGETS := "" # targets for running pytests: deepsparse,keras,onnx,pytorch,pytorch_models,pytorch_datasets,tensorflow_v1,tensorflow_v1_models,tensorflow_v1_datasets
TARGETS := "" # targets for running pytests: deepsparse,keras,onnx,pytorch,pytorch_models,export,pytorch_datasets,tensorflow_v1,tensorflow_v1_models,tensorflow_v1_datasets
PYTEST_ARGS ?= ""
PYTEST_INTEG_ARGS ?= ""
ifneq ($(findstring deepsparse,$(TARGETS)),deepsparse)
@@ -18,6 +18,9 @@ endif
ifneq ($(findstring transformers,$(TARGETS)),transformers)
PYTEST_ARGS := $(PYTEST_ARGS) --ignore tests/sparseml/transformers
endif
ifneq ($(findstring export,$(TARGETS)),export)
PYTEST_ARGS := $(PYTEST_ARGS) --ignore tests/sparseml/export
endif
ifneq ($(findstring keras,$(TARGETS)),keras)
PYTEST_ARGS := $(PYTEST_ARGS) --ignore tests/sparseml/keras
endif
2 changes: 1 addition & 1 deletion src/sparseml/core/session.py
@@ -100,7 +100,7 @@ def pre_initialize_structure(
This will run the pre-initialize structure method for each modifier in the
session's lifecycle. This will also set the session's state to the
pre-initialized state. Takes care of cases when the model(s) structure
has been previosuly modified by a modifier.
has been previously modified by a modifier.

:param model: the model to pre-initialize the structure for
:param recipe: the recipe to use for the sparsification, can be a path to a
13 changes: 13 additions & 0 deletions src/sparseml/export/__init__.py
@@ -0,0 +1,13 @@
# Copyright (c) 2021 - present / Neuralmagic, Inc. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
256 changes: 256 additions & 0 deletions src/sparseml/export/export.py
@@ -0,0 +1,256 @@
# Copyright (c) 2021 - present / Neuralmagic, Inc. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

import logging
import os
from pathlib import Path
from typing import Any, List, Optional, Union

from sparseml.export.export_data import export_data_samples
from sparseml.export.helpers import (
AVAILABLE_DEPLOYMENT_TARGETS,
ONNX_MODEL_NAME,
create_deployment_folder,
create_export_kwargs,
)
from sparseml.export.validators import validate_correctness as validate_correctness_
from sparseml.export.validators import validate_structure as validate_structure_
from sparseml.pytorch.opset import TORCH_DEFAULT_ONNX_OPSET
from sparseml.pytorch.utils.helpers import default_device
from sparseml.integration_helper_functions import (
IntegrationHelperFunctions,
resolve_integration,
)


_LOGGER = logging.getLogger(__name__)


def export(
source_path: Union[Path, str],
target_path: Union[Path, str],
onnx_model_name: str = ONNX_MODEL_NAME,
deployment_target: str = "deepsparse",
opset: int = TORCH_DEFAULT_ONNX_OPSET,
single_graph_file: bool = True,
num_export_samples: int = 0,
recipe: Optional[Union[Path, str]] = None,
deployment_directory_name: str = "deployment",
device: str = "cpu",
graph_optimizations: Union[str, List[str], None] = "all",
validate_correctness: bool = False,
validate_structure: bool = True,
integration: Optional[str] = None,
sample_data: Optional[Any] = None,
task: Optional[str] = None,
**kwargs,
):
"""
Export a PyTorch model located in source_path to target_path.
The deployment files will be located at target_path/deployment_directory_name.

The exporting logic consists of the following steps:
1. Create the model and validation dataloader (if needed) using the
integration-specific `create_model` function.
2. Export the model to ONNX using the integration-specific `export` function.
3. Apply the graph optimizations to the exported model.
4. Create the deployment folder at target_path/deployment_directory_name
using the integration-specific `create_deployment_folder` function.
5. Optionally, export samples using the integration-specific
`create_data_samples` function.
6. Optionally, validate the correctness of the exported model using
the integration-specific `validate_correctness` function.
7. Optionally, validate the structure of the exported model using
the integration-specific `validate_structure` function.

:param source_path: The path to the PyTorch model to export.
:param target_path: The path to save the exported model to.
:param onnx_model_name: The name of the exported model.
Defaults to ONNX_MODEL_NAME.
:param deployment_target: The deployment target to export
the model to. Defaults to 'deepsparse'.
:param opset: The ONNX opset to use for exporting the model.
Defaults to the latest supported opset.
:param recipe: The path to the recipe to use for exporting the model.
Defaults to None. If a recipe is found in the source_path, it will
be automatically used for export.
:param single_graph_file: Whether to save the model as a single
file. Defaults to True.
:param num_export_samples: The number of samples to create for
the exported model. Defaults to 0.
:param deployment_directory_name: The name of the deployment
directory to create for the exported model. Thus, the exported
model will be saved to `target_path/deployment_directory_name`.
Defaults to 'deployment'.
:param device: The device to use for exporting the model.
Defaults to 'cpu'.
:param graph_optimizations: The graph optimizations to apply
to the exported model. Defaults to 'all'.
:param validate_correctness: Whether to validate the correctness
of the exported model. Defaults to False.
:param validate_structure: Whether to validate the structure
of the exported model (contents of the target_path). Defaults to True.
:param integration: The name of the integration to use for
exporting the model. Defaults to None, which will infer
the integration from the source_path.
:param sample_data: Optional sample data to use for exporting
the model. If not provided, a dummy input will be created
for the model. Defaults to None.
:param task: Optional task to use for exporting the model.
Defaults to None.
"""
# TODO: Replace the manual normalization below with the following once sparsezoo #404 lands:
"""
from sparsezoo.utils.registry import standardize_lookup_name
task = standardize_lookup_name(task)
"""
if task is not None:
task = task.replace("_", "-").replace(" ", "-")

# TODO: Remove once sparsezoo: #404 lands
if integration is not None:
integration = integration.replace("_", "-").replace(" ", "-")

# create the target path if it doesn't exist
if not Path(target_path).exists():
Path(target_path).mkdir(parents=True, exist_ok=True)

# choose the appropriate device
device = default_device() if device == "auto" else device

# assert the valid deployment target
if deployment_target not in AVAILABLE_DEPLOYMENT_TARGETS:
raise ValueError(
"Argument: deployment_target must be "
f"one of {AVAILABLE_DEPLOYMENT_TARGETS}. "
f"Got {deployment_target} instead."
)

integration = resolve_integration(source_path, integration)

_LOGGER.info(f"Starting export for {integration} model...")

helper_functions: IntegrationHelperFunctions = (
IntegrationHelperFunctions.load_from_registry(integration, task=task)
)

_LOGGER.info("Creating model for the export...")

# loaded_model_kwargs may include any objects
# that were created along with the model and are needed
# for the export
model, loaded_model_kwargs = helper_functions.create_model(
source_path,
device=device,
task=task,
recipe=recipe,
**kwargs,
)
model.eval()

if loaded_model_kwargs:
_LOGGER.info(
"Created additional items that will "
f"be used for the export: {list(loaded_model_kwargs.keys())}"
)

sample_data = (
helper_functions.create_dummy_input(**loaded_model_kwargs, **kwargs)
if sample_data is None
else sample_data
)

_LOGGER.info(f"Exporting {onnx_model_name} to {target_path}...")

export_kwargs = create_export_kwargs(loaded_model_kwargs)

onnx_file_path = helper_functions.export(
model=model,
sample_data=sample_data,
target_path=target_path,
onnx_model_name=onnx_model_name,
deployment_target=deployment_target,
opset=opset,
**export_kwargs,
)
_LOGGER.info(f"Successfully exported {onnx_model_name} to {onnx_file_path}...")

if num_export_samples:
_LOGGER.info(f"Exporting {num_export_samples} samples...")
(
input_samples,
output_samples,
label_samples,
) = helper_functions.create_data_samples(
num_samples=num_export_samples,
model=model,
**loaded_model_kwargs,
)
export_data_samples(
input_samples=input_samples,
output_samples=output_samples,
label_samples=label_samples,
target_path=target_path,
as_tar=False,
)

_LOGGER.info(
f"Creating deployment folder {deployment_directory_name} "
f"at directory: {target_path}..."
)

deployment_path = create_deployment_folder(
source_path=source_path,
target_path=target_path,
deployment_directory_name=deployment_directory_name,
deployment_directory_files_mandatory=helper_functions.deployment_directory_files_mandatory, # noqa: E501
deployment_directory_files_optional=helper_functions.deployment_directory_files_optional, # noqa: E501
onnx_model_name=onnx_model_name,
)

if validate_structure:
_LOGGER.info("Validating model structure...")
validate_structure_(
target_path=target_path,
deployment_directory_name=deployment_directory_name,
onnx_model_name=onnx_model_name,
deployment_directory_files_mandatory=helper_functions.deployment_directory_files_mandatory, # noqa: E501
deployment_directory_files_optional=helper_functions.deployment_directory_files_optional, # noqa: E501
)

if validate_correctness:
_LOGGER.info("Validating model correctness...")
if not num_export_samples:
raise ValueError(
"To validate correctness sample inputs/outputs are needed."
"To enable the validation, set `num_export_samples`"
"to True"
)
validate_correctness_(target_path, deployment_path, onnx_model_name)

_LOGGER.info(
f"Applying optimizations: {graph_optimizations} to the exported model..."
)

if helper_functions.apply_optimizations is not None:
helper_functions.apply_optimizations(
exported_file_path=os.path.join(deployment_path, onnx_model_name),
optimizations=graph_optimizations,
single_graph_file=single_graph_file,
)

_LOGGER.info(
f"Successfully exported model from:\n{target_path}"
f"\nto\n{deployment_path}\nfor integration: {integration}"
)
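
A minimal usage sketch of the new entrypoint, for reviewers trying the branch locally. It assumes the function is importable as `sparseml.export.export.export` (the `__init__.py` added here does not re-export it); the paths, the integration name, and the sample count are illustrative placeholders, not values taken from this PR.

from sparseml.export.export import export

# Illustrative values only; point source_path at a real checkpoint directory.
export(
    source_path="./training_run",        # directory with the trained PyTorch checkpoint (and optionally a recipe)
    target_path="./exported",            # deployment folder is created at ./exported/deployment
    integration="image-classification",  # or None to let resolve_integration infer it from source_path
    num_export_samples=8,                # must be > 0 when validate_correctness=True
    validate_correctness=True,
    device="cpu",
)

Locally, the new test suite can be selected the same way the CI job added above does: make test TARGETS=export.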