OpenPifPaf DeepSparse Pipeline #788

Merged · 22 commits · Dec 1, 2022
Empty file added logs.txt
Empty file.
6 changes: 6 additions & 0 deletions setup.py
@@ -131,6 +131,10 @@ def _parse_requirements_file(file_path):
"torchvision>=0.3.0,<=0.12.0",
"opencv-python",
]
_openpifpaf_integration_deps = [
"openpifpaf==0.13.6",
"opencv-python",
]
# haystack dependencies are installed from a requirements file to avoid
# conflicting versions with NM's deepsparse/transformers
_haystack_requirements_file_path = os.path.join(
@@ -247,6 +251,7 @@ def _setup_extras() -> Dict:
"onnxruntime": _onnxruntime_deps,
"yolo": _yolo_integration_deps,
"haystack": _haystack_integration_deps,
"openpifpaf": _openpifpaf_integration_deps,
}


@@ -264,6 +269,7 @@ def _setup_entry_points() -> Dict:
"deepsparse.benchmark=deepsparse.benchmark.benchmark_model:main",
"deepsparse.server=deepsparse.server.cli:main",
"deepsparse.object_detection.annotate=deepsparse.yolo.annotate:main",
"deepsparse.pose_estimation.annotate=deepsparse.openpifpaf.annotate:main",
"deepsparse.image_classification.annotate=deepsparse.image_classification.annotate:main", # noqa E501
"deepsparse.instance_segmentation.annotate=deepsparse.yolact.annotate:main",
f"deepsparse.image_classification.eval={ic_eval}",
2 changes: 1 addition & 1 deletion src/deepsparse/image_classification/annotate.py
@@ -78,7 +78,7 @@
get_annotations_save_dir,
get_image_loader_and_saver,
)
from deepsparse.yolo.utils.cli_helpers import create_dir_callback
from deepsparse.utils.cli_helpers import create_dir_callback


ic_default_stub = (
82 changes: 82 additions & 0 deletions src/deepsparse/open_pif_paf/README.md
@@ -0,0 +1,82 @@
# OpenPifPaf Inference Pipelines

The DeepSparse integration of the OpenPifPaf model is a work in progress; please check back soon for updates. For now, this README serves as a placeholder for internal information that may be useful for further development.

## Example use in the DeepSparse Python API:

```python
from deepsparse import Pipeline

pipeline = Pipeline.create(task="open_pif_paf", batch_size=1)
predictions = pipeline(images=['dancers.jpg'])
# predictions have attributes 'data', 'keypoints', 'scores', 'skeletons'
predictions[0].scores
>> scores=[0.8542259724243828, 0.7930507659912109]
```
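The remaining fields can be read the same way. A short sketch, using only the attribute names listed in the comment above (output shapes and exact types are not verified here):

```python
first = predictions[0]
first.keypoints  # named joints for each detected person
first.skeletons  # joint connectivity, useful for drawing the pose
first.data       # raw annotation data backing the prediction
```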

## The necessity of an external OpenPifPaf helper function
<img width="678" alt="image" src="https://user-images.githubusercontent.com/97082108/203295520-42fa325f-8a94-4241-af6f-75938ef26b14.png">

As illustrated by the diagram from the original paper: once the input image has been encoded into PIF and PAF tensors (also called CIF and CAF; the two names are interchangeable conventions used by the original authors), those tensors need to be decoded into human-understandable annotations.

Once the neural network outputs the CIF and CAF tensors, they are processed by the algorithm described below:

<img width="337" alt="image" src="https://user-images.githubusercontent.com/97082108/203295686-91305e9c-e455-4ac8-9652-978f9ec8463d.png">

For speed reasons, the decoding in the original `OpenPifPaf` repository is implemented in [C++ and libtorch](https://github.com/openpifpaf/openpifpaf/issues/560): `https://github.com/openpifpaf/openpifpaf/src/openpifpaf/csrc`

Rewriting this functionality would be a significant engineering effort, so I reuse part of the original implementation in the pipeline:

### On the pipeline instantiation

```python
model_cpu, _ = network.Factory().factory(head_metas=None)
self.processor = decoder.factory(model_cpu.head_metas)
```

First, I fetch the default `model` object from the factory (the factory also returns a second value, the last epoch of the pre-trained model, which is discarded here). Note that this `model` is not used for inference; it only supplies information about the heads of the model: `model_cpu.head_metas: List[Cif, Caf]`. This information is consumed to create a set of decoders: objects that map `fields`, the raw network output, to human-understandable annotations.

Note: the `Cif` and `Caf` objects appear to be dataset-dependent. They hold, for example, information about the expected relationships between the joints of the pose (the skeleton).

Hint: instead of returning `Annotation` objects, the API supports returning annotations as JSON-serializable dicts. This is probably what we should aim for.
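
For reference, a hedged sketch of that option (openpifpaf's `Annotation` objects expose a `json_data()` method; the exact keys of the returned dict should be double-checked against the pinned version):

```python
# Hypothetical illustration: convert decoded Annotation objects to plain dicts
dicts = [annotation.json_data() for annotation in annotations]
# each entry is JSON-serializable, e.g. keypoints, bbox, score, category_id
```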

In the default scenario (and, I suspect, for all pose-estimation tasks), `self.processor` will be a `Multi` object that holds a single `CifCaf` decoder.

Other available decoders:

```python
{
openpifpaf.decoder.cifcaf.CifCaf,
openpifpaf.decoder.cifcaf.CifCafDense, # not sure what this does
openpifpaf.decoder.cifdet.CifDet, # I think this is just for the object detection task
openpifpaf.decoder.pose_similarity.PoseSimilarity, # for pose similarity task
openpifpaf.decoder.tracking_pose.TrackingPose # for tracking task
}
```
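
A minimal sketch (assuming `openpifpaf==0.13.6`, as pinned in `setup.py`) to confirm what the factory produces in the default scenario:

```python
from openpifpaf import decoder, network

# fetch the default model only to read its head metadata
model_cpu, _ = network.Factory().factory(head_metas=None)
print([type(meta).__name__ for meta in model_cpu.head_metas])  # expected: ['Cif', 'Caf']

# build the decoder(s) from the head metadata
processor = decoder.factory(model_cpu.head_metas)
print(type(processor).__name__)  # expected: 'Multi', wrapping a single CifCaf decoder
```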

## On the engine output processing

```python
def process_engine_outputs(self, fields):
    for idx, (cif, caf) in enumerate(zip(*fields)):
        annotations = self.processor._mappable_annotations(
            [torch.tensor(cif), torch.tensor(caf)], None, None
        )
```
I am passing the CIF and CAF values directly to the processor (through the private function; `self.processor` itself, by default, performs batching and inference). This is the functionality we may eventually want to fold into our own computational graph, as sketched below:
1. To avoid depending on the external library and its torch-dependent implementation.
2. To keep control over (and the possibility of improving upon) the generic decoder.
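
A hedged sketch of how the loop above might be completed; the return schema and the `json_data()` call are assumptions for illustration, not taken from this PR:

```python
import torch

def process_engine_outputs(self, fields):
    # fields: raw engine output, one CIF and one CAF array per image in the batch
    batch_annotations = []
    for cif, caf in zip(*fields):
        annotations = self.processor._mappable_annotations(
            [torch.tensor(cif), torch.tensor(caf)], None, None
        )
        # hypothetical: flatten each Annotation into a JSON-serializable dict
        batch_annotations.append([ann.json_data() for ann in annotations])
    return batch_annotations
```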






13 changes: 13 additions & 0 deletions src/deepsparse/open_pif_paf/__init__.py
@@ -0,0 +1,13 @@
# Copyright (c) 2021 - present / Neuralmagic, Inc. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
223 changes: 223 additions & 0 deletions src/deepsparse/open_pif_paf/annotate.py
@@ -0,0 +1,223 @@
# Copyright (c) 2021 - present / Neuralmagic, Inc. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

"""
Usage: deepsparse.open_pif_paf.annotate [OPTIONS]

Annotation Script for OpenPifPaf with DeepSparse

Options:
--model_filepath, --model-filepath TEXT
Path/SparseZoo stub to the model file to be
used for annotation [default: ...]
--source TEXT File path to an image or directory of image
files, a .mp4 video, or an integer (i.e. 0)
for webcam [required]
--engine [deepsparse|onnxruntime|torch]
Inference engine backend to run on. Choices
are 'deepsparse', 'onnxruntime', and
'torch'. Default is 'deepsparse'
--image_shape, --image-shape INTEGER
Image shape to use for inference, must be
two integers [default: 384, 384]
--num_cores, --num-cores INTEGER
The number of physical cores to run the
annotations with, defaults to using all
physical cores available on the system. For
DeepSparse benchmarks, this value is the
number of cores per socket
--save_dir, --save-dir DIRECTORY
The path to the directory for saving results
[default: annotation-results]
--name TEXT Name of directory in save-dir to write
results to. Defaults to
{engine}-annotations-{run_number}
--target_fps, --target-fps FLOAT
Target FPS when writing video files. Frames
will be dropped to closely match target FPS.
--source must be a video file and if target-
fps is greater than the source video fps
then it will be ignored
--no_save, --no-save Set flag when source is from webcam to not
save results. Not supported for non-webcam
sources [default: False]
--help Show this message and exit

#######
Examples:

1) deepsparse.open_pif_paf.annotate --source PATH/TO/IMAGE.jpg
2) deepsparse.open_pif_paf.annotate --source PATH/TO/VIDEO.mp4
3) deepsparse.open_pif_paf.annotate --source 0
4) deepsparse.open_pif_paf.annotate --source PATH/TO/IMAGE_DIR
"""
import logging
from typing import Optional, Tuple

import click

import cv2
from deepsparse.open_pif_paf.utils import annotate_image
from deepsparse.pipeline import Pipeline
from deepsparse.utils.annotate import (
annotate,
get_annotations_save_dir,
get_image_loader_and_saver,
)
from deepsparse.utils.cli_helpers import create_dir_callback


open_pif_paf_default_stub = None

DEEPSPARSE_ENGINE = "deepsparse"
ORT_ENGINE = "onnxruntime"
TORCH_ENGINE = "torch"

_LOGGER = logging.getLogger(__name__)


@click.command(
context_settings=dict(
token_normalize_func=lambda x: x.replace("-", "_"), show_default=True
),
)
@click.option(
"--model_filepath",
type=str,
default=open_pif_paf_default_stub,
help="Path/SparseZoo stub to the model file to be used for annotation",
)
@click.option(
"--source",
type=str,
required=True,
help="File path to image or directory of .jpg files, a .mp4 video, "
"or an integer (i.e. 0) for webcam",
)
@click.option(
"--engine",
type=click.Choice([DEEPSPARSE_ENGINE, ORT_ENGINE, TORCH_ENGINE]),
default=DEEPSPARSE_ENGINE,
help="Inference engine backend to run on. Choices are 'deepsparse', "
"'onnxruntime', and 'torch'. Default is 'deepsparse'",
)
@click.option(
"--image_shape",
type=int,
nargs=2,
default=(384, 384),
help="Image shape to use for inference, must be two integers",
)
@click.option(
"--num_cores",
type=int,
default=None,
help="The number of physical cores to run the annotations with, "
"defaults to using all physical cores available on the system."
" For DeepSparse benchmarks, this value is the number of cores "
"per socket",
)
@click.option(
"--save_dir",
type=click.Path(dir_okay=True, file_okay=False),
default="annotation-results",
callback=create_dir_callback,
help="The path to the directory for saving results",
)
@click.option(
"--name",
type=str,
default=None,
help="Name of directory in save-dir to write results to. defaults to "
"{engine}-annotations-{run_number}",
)
@click.option(
"--target_fps",
type=float,
default=None,
help="Target FPS when writing video files. Frames will be dropped to "
"closely match target FPS. --source must be a video file and if "
"target-fps is greater than the source video fps then it "
"will be ignored",
)
@click.option(
"--no_save",
is_flag=True,
help="Set flag when source is from webcam to not save results."
"Not supported for non-webcam sources",
)
def main(
model_filepath: str,
source: str,
engine: str,
image_shape: Tuple[int, int],
num_cores: Optional[int],
save_dir: str,
name: Optional[str],
target_fps: Optional[float],
no_save: bool,
) -> None:
"""
Annotation Script for OpenPifPaf with DeepSparse
"""
save_dir = get_annotations_save_dir(
initial_save_dir=save_dir,
tag=name,
engine=engine,
)

loader, saver, is_video = get_image_loader_and_saver(
path=source,
save_dir=save_dir,
image_shape=image_shape,
target_fps=target_fps,
no_save=no_save,
)

is_webcam = source.isnumeric()
open_pif_paf_pipeline = Pipeline.create(
task="open_pif_paf",
model_path=model_filepath,
engine_type=engine,
num_cores=num_cores,
)

for iteration, (input_image, source_image) in enumerate(loader):
# annotate
annotated_image = annotate(
pipeline=open_pif_paf_pipeline,
annotation_func=annotate_image,
image=input_image,
target_fps=target_fps,
calc_fps=is_video,
original_image=source_image,
model_resolution=image_shape,
)

if is_webcam:
cv2.imshow("annotated", annotated_image)
cv2.waitKey(1)

# save
if saver:
saver.save_frame(annotated_image)

if saver:
saver.close()

_LOGGER.info(f"Results saved to {save_dir}")


if __name__ == "__main__":
main()