ViT Pose Pipeline Example #794

Merged: 5 commits, Dec 1, 2022

@dbogunowicz dbogunowicz commented Nov 23, 2022

Stage 0 exploration for the ViT Pose model

Source: https://github.com/ViTAE-Transformer/ViTPose

Installation

Follow the instructions in the readme file. Note:

  • Installing one of the dependencies, mmcv, takes a long time and may often look as if it is stuck. Be patient; it will eventually finish successfully.
  • After the setup completes, it is also advisable to downgrade the default torch version from 1.3 to 1.2 to avoid CUDA errors (at least on my server, which does not support the CUDA setup for 1.3).

Export

Exporting the sample ONNX model is quite easy. Before running the export, manually install timm, onnx and onnxruntime. Then launch the export script:

python tools/deployment/pytorch2onnx.py /home/ubuntu/damian/ViTPose/configs/body/2d_kpt_sview_rgb_img/topdown_heatmap/coco/ViTPose_base_coco_256x192.py /home/ubuntu/damian/ViTPose/vitpose-b.pth 

The first argument is the config file (for ViTPose-B); the second argument is the .pth checkpoint (weights). Both can be found on the main page of the repository:
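The export command above hard-codes absolute paths from one machine. A minimal sketch of building the same two-argument invocation from a repository root follows; `build_export_command` and the relative checkpoint location are hypothetical names chosen for illustration, and only the two positional arguments shown in the command above are used.

```python
import shlex
from pathlib import Path


def build_export_command(repo_root, config_rel, checkpoint):
    """Return the argv list for the export script (hypothetical helper)."""
    repo_root = Path(repo_root)
    return [
        "python",
        "tools/deployment/pytorch2onnx.py",
        str(repo_root / config_rel),  # first positional argument: model config
        str(Path(checkpoint)),        # second positional argument: .pth weights
    ]


# Config path relative to the ViTPose checkout, as in the command above.
CONFIG_REL = (
    "configs/body/2d_kpt_sview_rgb_img/topdown_heatmap/coco/"
    "ViTPose_base_coco_256x192.py"
)

# Assumption: the checkout lives in ./ViTPose and the weights sit inside it.
cmd = build_export_command("ViTPose", CONFIG_REL, "ViTPose/vitpose-b.pth")
print(shlex.join(cmd))
```

The list can then be passed to `subprocess.run(cmd, cwd="ViTPose", check=True)` if the paths are made absolute or adjusted relative to that working directory.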
image

The resulting model is about 400 MB.
Screenshot 2022-11-23 at 13 28 29

Screenshot 2022-11-23 at 13 28 43
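As a rough sanity check on the file size, the weights of an fp32 model take four bytes per parameter. The sketch below is back-of-the-envelope arithmetic only: the parameter count is an assumption (roughly 86M, the size of a ViT-B backbone), not a number measured from the exported file, and the pose head and ONNX graph metadata add to it.

```python
def fp32_size_mb(num_params, bytes_per_param=4):
    """Approximate size of the raw weights in megabytes (10^6 bytes)."""
    return num_params * bytes_per_param / 1e6


# Assumption: ~86M parameters, as for a ViT-B-sized backbone.
ASSUMED_PARAMS = 86_000_000
print(f"{fp32_size_mb(ASSUMED_PARAMS):.0f} MB")
```

Under that assumption the weights alone land in the same ballpark as the observed file size.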

Benchmarking in DeepSparse:

Naive benchmarking shows that, for the dense model, the engine is roughly 2x faster than ONNX Runtime (ORT):
Screenshot 2022-11-23 at 13 06 23
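For reference, a naive latency comparison of this kind can be sketched as below. This is not the benchmarking tool used for the screenshot; `benchmark` and `speedup` are hypothetical helpers, and the two `fake_*` functions are stand-ins for the real ORT and DeepSparse forward passes.

```python
import statistics
import time


def benchmark(fn, warmup=5, iterations=50):
    """Return the median wall-clock latency of fn() in seconds."""
    for _ in range(warmup):  # warm-up runs are not timed
        fn()
    timings = []
    for _ in range(iterations):
        start = time.perf_counter()
        fn()
        timings.append(time.perf_counter() - start)
    return statistics.median(timings)


def speedup(baseline_latency, candidate_latency):
    """How many times faster the candidate is than the baseline."""
    return baseline_latency / candidate_latency


# Stand-in workloads (assumptions); replace with the real forward calls.
def fake_ort_forward():
    sum(range(20_000))


def fake_deepsparse_forward():
    sum(range(10_000))


ort_latency = benchmark(fake_ort_forward)
ds_latency = benchmark(fake_deepsparse_forward)
print(f"speedup: {speedup(ort_latency, ds_latency):.2f}x")
```

The median is used rather than the mean so that a single slow iteration does not dominate the result.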

Postprocessing

ViT-Pose might be our first candidate for a "composed" DeepSparse pipeline, since it is a top-down pose estimation approach:
  • We take an image and run object detection (e.g. to find all the humans in the image).
  • We pass the cropped bounding boxes to ViT to get an array of shape (batch, no_keypoints, h, w). According to the original paper, decoding this array requires a simple composition of transposed convolutions.
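The two-stage flow above can be sketched as follows. This is a minimal illustration, not the pipeline in this PR: `detect_fn` and `pose_fn` are hypothetical callables standing in for the detector and the ViT-Pose engine, `pose_fn` is assumed to return one (no_keypoints, h, w) array per crop, resizing the crop to the model input size is omitted, and the per-keypoint argmax is a common simplification swapped in for the decoding described in the paper.

```python
import numpy as np


def crop_boxes(image, boxes):
    """Crop each (x1, y1, x2, y2) box out of an (H, W, C) image."""
    return [image[y1:y2, x1:x2] for (x1, y1, x2, y2) in boxes]


def decode_argmax(heatmaps, box):
    """Map the peak of each (h, w) heatmap back to image coordinates."""
    num_kpts, h, w = heatmaps.shape
    x1, y1, x2, y2 = box
    flat_idx = heatmaps.reshape(num_kpts, -1).argmax(axis=1)
    ys, xs = np.unravel_index(flat_idx, (h, w))
    # Take the centre of the peak cell, scale it to the box, then shift by the box origin.
    img_x = x1 + (xs + 0.5) * (x2 - x1) / w
    img_y = y1 + (ys + 0.5) * (y2 - y1) / h
    return np.stack([img_x, img_y], axis=1)  # shape (num_kpts, 2)


def top_down_pose(image, detect_fn, pose_fn):
    """Stage 1: detect people. Stage 2: estimate keypoints per crop."""
    boxes = detect_fn(image)
    crops = crop_boxes(image, boxes)
    return [decode_argmax(pose_fn(crop), box) for crop, box in zip(crops, boxes)]
```

Keeping the detector and the pose model behind plain callables is one way to let either stage be swapped without touching the composition logic.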

What I do naively in this PR: I "squash" the array to (h, w) and then overlay it on the original image. We can see that the heatmap roughly coincides with the joints of the person in the image.
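A minimal sketch of such a squash-and-overlay step is shown below. It is not the code from this PR: taking the maximum over the keypoint axis is an assumption about what "squash" means (a sum would also work), and the nearest-neighbour upsampling and red-channel blend are illustrative choices.

```python
import numpy as np


def squash_heatmaps(heatmaps):
    """Collapse (no_keypoints, h, w) to (h, w) via a max over keypoints."""
    return heatmaps.max(axis=0)


def upsample_nearest(heatmap, out_h, out_w):
    """Nearest-neighbour resize of an (h, w) array to (out_h, out_w)."""
    h, w = heatmap.shape
    rows = np.arange(out_h) * h // out_h
    cols = np.arange(out_w) * w // out_w
    return heatmap[rows[:, None], cols[None, :]]


def overlay(image, heatmap, alpha=0.5):
    """Blend a normalised heatmap into the red channel of a uint8 image."""
    height, width = image.shape[:2]
    heat = upsample_nearest(heatmap, height, width).astype(np.float32)
    span = heat.max() - heat.min()
    # Normalise to [0, 1]; a constant heatmap contributes nothing.
    heat = (heat - heat.min()) / span if span > 0 else np.zeros_like(heat)
    out = image.astype(np.float32)
    out[..., 0] = (1 - alpha) * out[..., 0] + alpha * 255.0 * heat
    return out.clip(0, 255).astype(np.uint8)
```

For a single person crop, `overlay(crop, squash_heatmaps(heatmaps))` gives an image in which bright red regions mark where any keypoint heatmap is strongly activated.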

image

image

@bfineran bfineran left a comment

lgtm through engine forward

  • any idea on complexity of postprocessing?
  • for stage zero pipelines let's get them added in first into an examples directory potentially

corey-nm previously approved these changes Nov 29, 2022

@corey-nm corey-nm left a comment

nice! clean integration

@corey-nm

Perhaps include some of the contents of the PR description as a README in the example folder?

@bfineran bfineran left a comment

LGTM pending comment on removing changes to tasks.py

+1 to a simple README

src/deepsparse/tasks.py (review thread outdated and resolved)
@corey-nm corey-nm left a comment

🔥

@dbogunowicz dbogunowicz changed the title from "[ViT Pose] Feature Branch (Stage 0)" to "ViT Pose Pipeline Example" on Dec 1, 2022
@dbogunowicz dbogunowicz merged commit acb075f into main Dec 1, 2022
@dbogunowicz dbogunowicz deleted the feature/damian/vit_pose_stage_zero branch December 1, 2022 15:52