
OpenPifPaf DeepSparse Pipeline #788

Merged: 22 commits into main from feature/damian/productionizing_open_pif_paf, Dec 1, 2022

Conversation

@dbogunowicz (Contributor) commented on Nov 21, 2022:

DeepSparse pipeline for OpenPifPaf

Sanity checks:

Use in the server:

[screenshot of server usage]
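For reference, a minimal sketch of querying the pipeline through the server. This assumes the server was started with the open_pif_paf task via deepsparse.server; the endpoint path and payload format are assumptions borrowed from other DeepSparse image pipelines, not verified for this task:

import requests

# hypothetical client call against a locally running deepsparse.server instance;
# endpoint and multipart payload format assumed from other DeepSparse image tasks
url = "http://localhost:5543/predict/from_files"
files = [("request", open("dancers.jpg", "rb"))]
response = requests.post(url=url, files=files)
print(response.json())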

Use by the user:

from deepsparse import Pipeline

pipeline = Pipeline.create(task="open_pif_paf", batch_size=1)
pred = pipeline(images=['/home/ubuntu/damian/deepsparse_copy/src/deepsparse/open_pif_paf/sample_images/dancers.jpg'])
print(pred)
data=[[[[126.87345886230469, 101.16527557373047, 0.9655829071998596], [0.0, 0.0, 0.0], [122.89710998535156, 97.64031219482422, 0.9702512621879578], [0.0, 0.0, 0.0], [111.99809265136719, 101.56838989257812, 0.9984070062637329], [131.0704345703125, 121.19110107421875, 0.936824381351471], [106.96343231201172, 133.9502716064453, 0.9553758502006531], [154.37930297851562, 145.5393829345703, 0.9516996741294861], [92.8624267578125, 184.04098510742188, 0.9412126541137695], [178.31959533691406, 165.24205017089844, 0.8640587329864502], [122.38360595703125, 189.9923858642578, 0.8330625295639038], [147.88308715820312, 183.24745178222656, 0.9424697756767273], [134.7152099609375, 191.03546142578125, 0.9262460470199585], [154.29681396484375, 264.6568603515625, 0.8947860598564148], [153.10012817382812, 265.38720703125, 0.9357694983482361], [140.0021514892578, 322.24884033203125, 0.8464810252189636], [168.99900817871094, 322.4757995605469, 0.8164876103401184]], [[212.71043395996094, 256.15692138671875, 0.8365193605422974], [208.12818908691406, 251.4163818359375, 0.8502614498138428], [209.69515991210938, 256.8374328613281, 0.5522475242614746], [211.00241088867188, 237.36216735839844, 0.7856449484825134], [0.0, 0.0, 0.0], [224.466064453125, 210.54421997070312, 0.9782769680023193], [240.1357879638672, 250.75091552734375, 0.8946046829223633], [219.88368225097656, 174.4077911376953, 0.9979187846183777], [238.06390380859375, 293.59661865234375, 0.861858069896698], [197.73731994628906, 169.2177276611328, 0.9272624850273132], [235.73577880859375, 334.4454040527344, 0.8017373085021973], [270.48651123046875, 176.56687927246094, 0.8050549626350403], [285.7158203125, 195.37460327148438, 0.7933297157287598], [250.8893585205078, 123.7066650390625, 0.6800427436828613], [312.3864440917969, 177.59234619140625, 0.5757677555084229], [241.41329956054688, 64.89651489257812, 0.5552536249160767], [357.16583251953125, 157.88079833984375, 0.5374707579612732]]]] 

keypoints=[[['nose', 'left_eye', 'right_eye', 'left_ear', 'right_ear', 'left_shoulder', 'right_shoulder', 'left_elbow', 'right_elbow', 'left_wrist', 'right_wrist', 'left_hip', 'right_hip', 'left_knee', 'right_knee', 'left_ankle', 'right_ankle'], ['nose', 'left_eye', 'right_eye', 'left_ear', 'right_ear', 'left_shoulder', 'right_shoulder', 'left_elbow', 'right_elbow', 'left_wrist', 'right_wrist', 'left_hip', 'right_hip', 'left_knee', 'right_knee', 'left_ankle', 'right_ankle']]] 

scores=[[0.8542259724243828, 0.7930507659912109]] 

skeletons=[[[(16, 14), (14, 12), (17, 15), (15, 13), (12, 13), (6, 12), (7, 13), (6, 7), (6, 8), (7, 9), (8, 10), (9, 11), (2, 3), (1, 2), (1, 3), (2, 4), (3, 5), (4, 6), (5, 7)], [(16, 14), (14, 12), (17, 15), (15, 13), (12, 13), (6, 12), (7, 13), (6, 7), (6, 8), (7, 9), (8, 10), (9, 11), (2, 3), (1, 2), (1, 3), (2, 4), (3, 5), (4, 6), (5, 7)]]]
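To make the schema above concrete, here is a minimal sketch of iterating over the returned fields; the field names follow the printed output above, and dancers.jpg yields two poses:

# pred.data[0]: per-pose (17, 3) arrays of [x, y, confidence] for the first image;
# pred.keypoints, pred.scores and pred.skeletons follow the same nesting
for pose, score, names in zip(pred.data[0], pred.scores[0], pred.keypoints[0]):
    print(f"pose score: {score:.3f}")
    for (x, y, confidence), name in zip(pose, names):
        print(f"  {name}: ({x:.1f}, {y:.1f}), confidence {confidence:.2f}")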

Simple private unit test (not in the repo, since we do not have any zoo model yet):

import numpy
import PIL

import pytest
from deepsparse import Pipeline


def _image_numpy(input_str, channel_swap=False):
    # load the image as an RGB numpy array; channel_swap turns HWC into CHW
    with open(input_str, "rb") as f:
        image = numpy.array(PIL.Image.open(f).convert("RGB"))
    if channel_swap:
        image = image.transpose(2, 0, 1)
    return image


input_str = "/home/ubuntu/damian/deepsparse_copy/src/deepsparse/open_pif_paf/sample_images/dancers.jpg"


@pytest.mark.parametrize(
    "input, batch_size",
    [
        (input_str, 1),
        (input_str, 2),
        (_image_numpy(input_str, False), 1),
        (_image_numpy(input_str, True), 1),
        (_image_numpy(input_str, False), 2),
        (_image_numpy(input_str, True), 2),
    ],
)
def test_pipeline_inputs(input, batch_size):
    pipeline = Pipeline.create(task="open_pif_paf", batch_size=batch_size)
    if batch_size == 1:
        pred = pipeline(images=[input])
    else:
        pred = pipeline(images=[input, input])
    # dancers.jpg contains two poses: 17 COCO keypoints each, (x, y, confidence) per keypoint
    assert numpy.array(pred.data[0]).shape == (2, 17, 3)

The test was green.

Investigating whether we need the external OpenPifPaf helper functions

As illustrated by the diagram from the original paper, once the image has been encoded into PIF and PAF tensors (or CIF and CAF; the two names are just different conventions from the authors' two papers), those tensors later need to be decoded into annotations.
[figure: encoder/decoder pipeline diagram from the original OpenPifPaf paper]

Once the neural network outputs the CIF and CAF tensors, they are processed by the algorithm illustrated below:

[figure: decoding algorithm, from the paper]

For speed reasons, the decoding is actually implemented in C++ and libtorch:

https://github.com/openpifpaf/openpifpaf/src/openpifpaf/csrc

This is the part that I am currently unable to implement properly in the DeepSparse repo, so I mock it in the following way:

On pipeline instantiation:

from openpifpaf import decoder, network

# fetch the default model only to read its head metadata (not used for inference)
self.model_cpu, _ = network.Factory().factory(head_metas=None)
# build the decoder(s) that map raw fields to annotations
self.processor = decoder.factory(self.model_cpu.head_metas)

First, we fetch the default model object from the factory (the discarded second return value is the last epoch of the pre-trained model). Note: this model will not be used for inference; we only need to pull the information about its heads, model_cpu.head_metas: List[Cif, Caf]. This information is consumed to create a (set of) decoder(s): objects that map "fields", the raw network output, to human-understandable annotations.

Note: The Cif and Caf objects seem to be dataset-dependent. They hold, for example, the information about the expected relationships between the joints of the pose (the skeleton).

Hint: Instead of returning Annotation objects, the API supports returning annotations as JSON-serializable dicts. This is probably what we should aim for, especially if we want to deploy OpenPifPaf inference in the Server.
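As a rough illustration, given a list of Annotation objects (like the annotations produced in the postprocessing snippet further below), the serialization could look like this; Annotation.json_data() is assumed from openpifpaf and worth verifying against the installed version:

# serialize Annotation objects into JSON-friendly dicts instead of returning them raw
dicts = [annotation.json_data() for annotation in annotations]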

In the default scenario (I suspect for all pose estimation tasks), self.processor will be a Multi object that holds a single CifCaf decoder.

Other available decoders:

{
    openpifpaf.decoder.cifcaf.CifCaf,
    openpifpaf.decoder.cifcaf.CifCafDense,  # not sure what this does
    openpifpaf.decoder.cifdet.CifDet,  # I think this is just for the object detection task
    openpifpaf.decoder.pose_similarity.PoseSimilarity,  # for the pose similarity task
    openpifpaf.decoder.tracking_pose.TrackingPose,  # for the tracking task
}

On engine output postprocessing:

import torch

def process_engine_outputs(self, fields, **kwargs):
    for idx, (cif, caf) in enumerate(zip(*fields)):
        # decode one image's CIF/CAF fields into Annotation objects
        annotations = self.processor._mappable_annotations(
            [torch.tensor(cif), torch.tensor(caf)], None, None
        )

We pass the CIF and CAF values directly to the processor through the private function (by default, self.processor also performs batching and inference on its own). This is the functionality that, I suppose, we would like to fold into our computational graph (see the sketch after the list):

  1. To avoid depending on the external library and its torch-based implementation.
  2. To make sure that decoding does not become a bottleneck for inference speed.
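For illustration, the folded postprocessing could end by flattening the Annotation objects into the plain lists of the output schema shown earlier. A sketch; the attribute names (data, keypoints, score, skeleton) are assumptions based on openpifpaf's Annotation class:

data = [annotation.data.tolist() for annotation in annotations]  # (17, 3) per pose: x, y, confidence
keypoints = [annotation.keypoints for annotation in annotations]  # joint names, e.g. 'nose'
scores = [annotation.score for annotation in annotations]  # one confidence per pose
skeletons = [annotation.skeleton for annotation in annotations]  # 1-indexed joint index pairs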

@dbogunowicz changed the title from "[OpenPifPaf] Productionizing the pipeline" to "[Open Pif Paf] Productionizing the pipeline" on Nov 21, 2022
@corey-nm previously approved these changes on Nov 22, 2022
@dbogunowicz changed the base branch from feature/damian/simple_open_pif_paf to main on November 28, 2022 16:50
@dbogunowicz dismissed corey-nm's stale review on November 28, 2022 16:50

The base branch was changed.

@corey-nm previously approved these changes on Nov 28, 2022

@corey-nm (Contributor) left a comment:

just some optional nits, looks great to me!

(review threads on src/deepsparse/open_pif_paf/annotate.py and src/deepsparse/open_pif_paf/utils/annotate.py; resolved)
@bfineran (Member) left a comment:

LGTM pending comment

Questions:

  • since we are returning the encoded outputs directly, what is the intended use for the user?
  • what is the intended server environment flow?
  • test plan?

(review thread on src/deepsparse/open_pif_paf/utils/annotate.py; resolved)
Co-authored-by: corey-nm <109536191+corey-nm@users.noreply.github.com>
@dbogunowicz (Contributor, Author) commented:

@bfineran

I moved some of the postprocessing from visualization into the pipeline, so the user receives human-readable output from the pipeline, and thus the pipeline can also be used efficiently in the server.

bogunowicz@arrival.com added 2 commits November 29, 2022 14:47
@corey-nm previously approved these changes on Nov 29, 2022

@corey-nm (Contributor) left a comment:

lgtm! really clean implementations, nice! 🚀

(review threads on src/deepsparse/open_pif_paf/annotate.py and src/deepsparse/open_pif_paf/utils/annotate.py; resolved)
Co-authored-by: corey-nm <109536191+corey-nm@users.noreply.github.com>
@corey-nm previously approved these changes on Nov 30, 2022

@corey-nm (Contributor) left a comment:

🔥

@bfineran previously approved these changes on Nov 30, 2022

@bfineran (Member) left a comment:

Great work @dbogunowicz

Following @corey-nm's comments in vit-pose, let's get a simple README up (can be a quick follow-up). Also, as discussed offline, let's do a brief check on postprocessing times using our new logging tools :)

@dbogunowicz (Contributor, Author) commented:

@bfineran, per PythonLogger, here are the mean and standard deviation over 20 consecutive forward passes, using a dense OpenPifPaf pose estimation model that takes 384x384 images as input (this is the resolution @anmarques is working with at the moment):

pre_process_latency[s] 0.0044 +- 0.0015
engine_forward_latency[s] 0.1699 +- 0.0173
post_process_latency[s] 0.0050 +- 0.0010

No red flags here: on average, both the pre- and post-processing steps are well over an order of magnitude faster than the engine forward pass.

@corey-nm (Contributor) left a comment:

Nice!!

@dbogunowicz changed the title from "[Open Pif Paf] Productionizing the pipeline" to "OpenPifPaf DeepSparse Pipeline" on Dec 1, 2022
@dbogunowicz merged commit af6a9fa into main on Dec 1, 2022
@dbogunowicz deleted the feature/damian/productionizing_open_pif_paf branch on December 1, 2022 15:22