Feature/sg 1060 yolo nas pose (#1611)
* crowdpose_yolo_nas_pose_s

* crowdpose_yolo_nas_pose_s

* crowdpose_yolo_nas_pose_s

* crowdpose_yolo_nas_pose_s

* coco2017_yolo_nas_pose_s_ema_less_mosaic

* coco2017_yolo_nas_pose_s_less_mosaic

* coco2017_yolo_nas_pose_s_ema_less_mosaic_higher_final_lr_fp32

* coco2017_yolo_nas_pose_s_ema_less_mosaic_higher_final_lr_fp32

* coco2017_yolo_nas_pose_s_ema_less_mosaic_lr_focal

* shared head

* YoloNASPoseBoxesPostPredictionCallback

* New head design

* Another recipe with less zoom out, no crowd images

* Another recipe with less zoom out, no crowd images

* Another recipe with less zoom out, no crowd images

* coco2017_yolo_nas_pose_shared_s_ema_less_mosaic_lr_bce_local

* coco2017_yolo_nas_pose_shared_s_ema_less_mosaic_lr_bce_local

* Update scores

* Cleanup old configs, keep one config that gives best AP score

* Shortened recipe

* coco2017_yolo_nas_pose_shared_s_384_short

* Tune short recipe

* Tune short recipe

* Tune short recipe

* Tune short recipe

* coco2017_yolo_nas_pose_s_local

* Update settings of crowd_annotations_action to mask_as_normal since this is the setting we got the best result with

* coco2017_yolo_nas_pose_shared_s_local

* Update default params

* Update default params

* Update DEKR recipe

* M variant

* M variant

* Put more correct min_delta

* Put more correct min_delta

* Put more correct min_delta

* Adding placeholders for YOLO-NAS-POSE

* Rename detection model export test file

* Adding export API support for pose estimation

* Adding export API support for pose estimation

* Added tmp hack

* multiply_by_pose_oks

* assigner_multiply_by_pose_oks

* Experiment: Improve visualization

* Update CrowdPose dataset

* Crowdpose

* Crowdpose

* Crowdpose

* crowdpose_yolo_nas_pose_s_no_crowd_no_ema_local

* Lower LR

* Proxy recipe

* crowdpose_yolo_nas_pose_s_proxy

* crowdpose_yolo_nas_pose_s_proxy

* crowdpose_yolo_nas_pose_s_proxy

* New architectures

* Fix WANDB params

* Fix WANDB params

* Fix WANDB params

* Fix WANDB params

* New architectures

* M

* L

* M

* coco2017_yolo_nas_pose_l_resume

* coco2017_yolo_nas_pose_m_resume

* Added fix to _is_more_extreme which would ensure callback would not crash if observed loss/metric is NaN/Inf

* Reduce LR

* Reduce LR

* Change EMA params

* Export and scores

* Export

* Fix bug of not saving simplified model

* Optimize head return types for better inference efficiency

* Metrics

* Yolo NAS Pose N

* Only EarlyStop no batch visualization

* coco2017_yolo_nas_pose_l_no_ema

* Only EarlyStop no batch visualization

* Removing old architectures

* Notebook for evaluation on COCO

* Remove unnecessary recipes

* Simplify the metric -> pass entire sample to the metric

* Simplify recipe

* coco2017_yolo_nas_pose_n_resume

* Simplify recipe

* Transforms overhaul & refactoring

* Transforms overhaul & refactoring

* Remove KeypointsImageToTensor transform - this will be done in collate fn

* Fix collate fn to do image layout change HWC->CHW

* Attempt to optimize efficiency

* Attempt to optimize efficiency

* Attempt to optimize efficiency

* Attempt to optimize efficiency

* Attempt to optimize efficiency

* Attempt to optimize efficiency

* Attempt to optimize efficiency

* Attempt to optimize efficiency

* Attempt to optimize efficiency

* Attempt to optimize efficiency

* Refactor sample

* Simplify recipe

* New keypoint transform

* Lower dropout rates & heavy augs

* crowdpose_yolo_nas_pose_s

* Improve visualization of pose gt by showing whether it is crowd target or not

* Make convert_to_tensor a bit more efficient by avoiding creating a tensor on cpu and then moving it to target device.

* Compute metric on CPU (surprisingly, it is faster: the amount of data & compute is not that big, so data transfer takes more time than compute)

* Improve speed of computing focal loss

* New batch of training experiments

* New batch of training experiments

* Introduce sample-centric keypoint transforms

* Cleanup leftovers

* Update numbers

* Add benchmark results

* Fixed way of checking transforms that require additional samples

* Docstrings

* :attr -> :param

* Added docs clarifying behavior of mosaic & mixup

* Added docs clarifying behavior of mosaic & mixup

* Improved tests

* Additional docstrings & typing annotations

* Focal-EIOU loss

* Added missing additional_samples_count field

* Fixed predict implementation for pose

* Added docstrings

* KeypointsRemoveSmallObjects

* KeypointsRemoveSmallObjects

* Metric class to use data samples

* New dataset classes

* Reverting old files to keep & update dataset recipes

* Simplified rescoring dataset params YAML file by using coco_pose_common_dataset_params defaults

* Removed old docs

* Remove space

* Introduce AbstractPoseEstimationPostPredictionCallback interface and move PoseEstimationPredictions to a proper place

* Cherry pick changes to post-prediction, visualization and metric

* Remove unwanted references to new datasets

* Remove YoloNASPoseCollateFN

* Make heavy augs a default training param for M & L

* Remove dropout

* Fixed unit test

* Update YoloNAS-M score

* Feature/sg 1060 yolo nas pose release pr to add datasets and metric (#1506)

* Cherry pick changes to post-prediction, visualization and metric

* Remove unwanted references to new datasets

* Remove YoloNASPoseCollateFN

* Fixed unit test

* Improve clarity of bbox format by giving it a more explicit name and adding a bunch of docstrings

* Improve variable names

* Document YoloNASPose loss

* Squashed changes with YoloNASPose & Loss

* Fixed attribute name that was not renamed

* Remove print statement

* Remove print statement

* Fixed attribute name that was not renamed

* Improve docstrings to use 'Num Keypoints' instead of magic number 17

* Fixed PoseNMS export to work with custom number of keypoints

* Remove outdated test

* Update recipes

* Added docstrings

* Simplify forward/forward_eval

* Simplify forward/forward_eval

* Remove any2device_no_grad

* _insert_heads_list_params

* _insert_heads_list_params

* Update ExtremeBatchCaseVisualizationCallback

* Document visualization callback better

* Added YoloNASPose test

* Added tests

* Refactor the way we generate usage instructions. Should be easier to follow and update

* Revert rename

* Dataset & Visualization callback

* Improve docstrings

* Improved docstrings

* Improved docstrings

* Improved docstrings

* Improved docstrings

* Rename bboxes -> bboxes_xyxy

* Example colab for evaluation of ONNX model

* Rename bboxes -> bboxes_xyxy

* Fixed instructions text

* Improve efficiency of training

* Remove files

* Update numbers

* Update animal pose

* Added integration tests for YoloNASPose

* Fix bug in replace head

* Add pretrain weights

* Added export notebook example

* Update integration test

* Updating branch for merge

* Updating branch for merge

* Remove AnimalPoseDataset

* Update markdown text

* Update markdown text

* Revert

* Added license

* Improve debug text in transfer_weights

* Revert

* Cleanup recipes

* Revert

* Cleanup recipes

* Revert

* Update notebooks

* Update mkdocs to include pose estimation

* Update docs

* Revert test

* Put back YAML file

* Added check to print license for YoloNAS-POSE

* Fixed bug in _pad_image that did not support pad_value=(R,B,G) input

* Added images & updated links to notebooks

* Added images & updated links to notebooks

* Added pop dataset_class from dataloader params

* Update quickstart

* Added missing rgb2bgr conversion

* Added missing rgb2bgr conversion

* Disable visualization of samples by default

* Added docstrings

* Updated additional resources section with link to recipes docs

* successor -> derivative

* Re-run notebook

* Fixed recipe to code test

* Re-run notebook

---------

Co-authored-by: Shay Aharon <80472096+shaydeci@users.noreply.github.com>
BloodAxe and shaydeci committed Nov 6, 2023
1 parent b88e99b commit 46cdc18
Showing 35 changed files with 2,949 additions and 9 deletions.
16 changes: 16 additions & 0 deletions LICENSE.YOLONAS-POSE.md
@@ -0,0 +1,16 @@
# YOLO-NAS-POSE License

These model weights or any components comprising the model and the associated documentation (the "Software") is licensed to you by Deci.AI, Inc. ("Deci") under the following terms:
© 2023 – Deci.AI, Inc.

Subject to your full compliance with all of the terms herein, Deci hereby grants you a non-exclusive, revocable, non-sublicensable, non-transferable worldwide and limited right and license to use the Software. If you are using the Deci platform for model optimization, your use of the Software is subject to the Terms of Use available here (the "Terms of Use").

You shall not, without Deci's prior written consent:
(i) resell, lease, sublicense or distribute the Software to any person;
(ii) use the Software to provide third parties with managed services or provide remote access to the Software to any person or compete with Deci in any way;
(iii) represent that you possess any proprietary interest in the Software;
(iv) directly or indirectly, take any action to contest Deci's intellectual property rights or infringe them in any way;
(v) reverse-engineer, decompile, disassemble, alter, enhance, improve, add to, delete from, or otherwise modify, or derive (or attempt to derive) the technology or source code underlying any part of the Software;
(vi) use the Software (or any part thereof) in any illegal, indecent, misleading, harmful, abusive, harassing and/or disparaging manner or for any such purposes. Except as provided under the terms of any separate agreement between you and Deci, including the Terms of Use to the extent applicable, you may not use the Software for any commercial use, including in connection with any models used in a production environment.

DECI PROVIDES THE SOFTWARE "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE OR NON-INFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS OF THE SOFTWARE BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
1 change: 1 addition & 0 deletions Makefile
@@ -32,6 +32,7 @@ sweeper_test:

# Here you define a list of notebooks we want to execute and convert to markdown files
NOTEBOOKS_TO_RUN := src/super_gradients/examples/model_export/models_export.ipynb
NOTEBOOKS_TO_RUN += src/super_gradients/examples/model_export/models_export_pose.ipynb
NOTEBOOKS_TO_RUN += notebooks/what_are_recipes_and_how_to_use.ipynb
NOTEBOOKS_TO_RUN += notebooks/transfer_learning_classification.ipynb
NOTEBOOKS_TO_RUN += notebooks/how_to_use_knowledge_distillation_for_classification.ipynb
119 changes: 119 additions & 0 deletions YOLONAS-POSE.md
@@ -0,0 +1,119 @@
# YOLO-NAS-POSE
### A Next-Generation, Pose Estimation Foundational Model generated by Deci’s Neural Architecture Search Technology

Deci is thrilled to announce the release of a new pose estimation model, YOLO-NAS-POSE - a derivative of the [YOLO-NAS](YOLONAS.md) architecture -
providing superior real-time pose estimation capabilities and production-ready performance.
Deci's mission is to provide AI teams with tools to remove development barriers and attain efficient inference performance more quickly.

![YOLO-NAS-POSE](documentation/source/images/yolo_nas_pose_frontier_t4.png)

The new YOLO-NAS-POSE delivers state-of-the-art (SOTA) performance with an unparalleled accuracy-speed trade-off, outperforming other models such as YOLOv8-Pose and DEKR.

Deci's proprietary Neural Architecture Search technology, [AutoNAC™](https://deci.ai/technology/), generated the architecture of the YOLO-NAS-POSE model.
The AutoNAC™ engine lets you input any task, data characteristics (access to data is not required), inference environment, and performance targets,
and then guides you to the optimal architecture that delivers the best balance between accuracy and inference speed for your specific application.
In addition to being data- and hardware-aware, the AutoNAC engine considers other components in the inference stack, including compilers and quantization.

| Model           | AP    | Latency (ms) |
|-----------------|-------|--------------|
| YOLO-NAS-POSE N | 59.68 | 2.35 ms      |
| YOLO-NAS-POSE S | 64.15 | 3.29 ms      |
| YOLO-NAS-POSE M | 67.87 | 6.87 ms      |
| YOLO-NAS-POSE L | 68.24 | 8.86 ms      |

AP numbers in the table are reported for the COCO 2017 validation dataset, and latency was benchmarked for 640x640 images on an NVIDIA T4 GPU.
No flip TTA was used.

Similarly to YOLO-NAS, the YOLO-NAS-POSE architecture employs quantization-aware blocks and selective quantization for optimized performance.
In fact, YOLO-NAS-POSE is a derivative of YOLO-NAS and uses the same backbone and neck as YOLO-NAS.
Only the head is different, and it is optimized by AutoNAC for the pose estimation task.
This enables us to use transfer learning and fine-tune YOLO-NAS-POSE starting from YOLO-NAS weights.
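
As a minimal, hedged sketch of what that fine-tuning path looks like (assuming `models.get` accepts `num_classes` as the number of keypoints for pose models, following SuperGradients' usual convention - verify against the fine-tuning notebook linked under Additional resources):

```python
# A minimal sketch, assuming `num_classes` controls the number of keypoints
# for pose models in `models.get` - verify against your installed version.
from super_gradients.training import models

# Start from COCO-pretrained pose weights; the head is re-initialized for a
# hypothetical custom dataset with 20 keypoints.
model = models.get("yolo_nas_pose_s", num_classes=20, pretrained_weights="coco_pose")
```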


## Quickstart

### Extract predicted poses

```python
import super_gradients

yolo_nas_pose = super_gradients.training.models.get("yolo_nas_pose_l", pretrained_weights="coco_pose").cuda()
model_predictions = yolo_nas_pose.predict("https://deci-pretrained-models.s3.amazonaws.com/sample_images/beatles-abbeyroad.jpg", conf=0.5)
model_predictions.show()  # show() displays the result and returns None, so call it separately rather than chaining

prediction = model_predictions[0].prediction  # One prediction per image - here we work with a single image, so we take the first.

bboxes = prediction.bboxes_xyxy # [Num Instances, 4] List of predicted bounding boxes for each object
poses = prediction.poses # [Num Instances, Num Joints, 3] list of predicted joints for each detected object (x,y, confidence)
scores = prediction.scores # [Num Instances] - Confidence value for each predicted instance
```

![YOLO-NAS-POSE Predict Demo](documentation/source/images/yolo_nas_pose_predict_demo.jpg)
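
A small usage sketch that consumes only the three arrays documented above (the 0.5 per-joint confidence threshold is an arbitrary choice for illustration):

```python
# Iterate over detected people and print their confidently-detected joints.
# Uses only the `bboxes`, `poses` and `scores` arrays extracted above.
for instance_id, (box, pose, score) in enumerate(zip(bboxes, poses, scores)):
    print(f"Person {instance_id}: score={score:.2f}, box={box}")
    for joint_id, (x, y, joint_conf) in enumerate(pose):
        if joint_conf > 0.5:  # arbitrary per-joint confidence cutoff
            print(f"  joint {joint_id}: ({x:.1f}, {y:.1f}), conf={joint_conf:.2f}")
```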

### Recipes

We provide training recipes for YOLO-NAS-POSE on the COCO, CrowdPose and AnimalPose datasets; a launch sketch follows the list below.

#### COCO 2017

* [super_gradients/recipes/coco2017_yolo_nas_pose_n.yaml](src/super_gradients/recipes/coco2017_yolo_nas_pose_n.yaml)
* [super_gradients/recipes/coco2017_yolo_nas_pose_s.yaml](src/super_gradients/recipes/coco2017_yolo_nas_pose_s.yaml)
* [super_gradients/recipes/coco2017_yolo_nas_pose_m.yaml](src/super_gradients/recipes/coco2017_yolo_nas_pose_m.yaml)
* [super_gradients/recipes/coco2017_yolo_nas_pose_l.yaml](src/super_gradients/recipes/coco2017_yolo_nas_pose_l.yaml)
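
The recipes are standard [Hydra](https://hydra.cc) configuration files. As a hedged sketch (the entry-point module name follows SuperGradients' usual recipe runner and may differ between versions), a training run is typically launched like this:

```python
# A sketch, assuming SuperGradients' standard recipe runner - verify the
# module name and override syntax against your installed version.
#
#   python -m super_gradients.train_from_recipe --config-name=coco2017_yolo_nas_pose_s
#
# Recipes are Hydra configs, so individual hyperparameters can be overridden
# from the command line, e.g.:
#
#   python -m super_gradients.train_from_recipe --config-name=coco2017_yolo_nas_pose_s \
#       training_hyperparams.initial_lr=1e-4
```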


## Additional resources

<table>
<tr>
<td>
<a target="_blank" href="https://colab.research.google.com/drive/1O4N5Vbzv0rfkT81LQidPktX8RtoS5A40">
<img src="./documentation/assets/SG_img/colab_logo.png" /> Predict poses with YoloNAS Pose Model
</a>
</td>
</tr>
<tr>
<td>
<a target="_blank" href="https://colab.research.google.com/drive/1agLj0aGx48C_rZPrTkeA18kuncack6lF">
<img src="./documentation/assets/SG_img/colab_logo.png" /> Fine-Tune YoloNAS Pose on AnimalPose dataset Notebook
</a>
</td>
</tr>
<tr>
<td>
<a target="_blank" href="documentation/source/YoloNASPoseQuickstart.md">
Documentation: YOLO-NAS-POSE Quickstart
</a>
</td>
</tr>
<tr>
<td>
<a target="_blank" href="documentation/source/Recipes_Training.md">
Documentation: Recipes
</a>
</td>
</tr>
<tr>
<td>
<a target="_blank" href="documentation/source/models_export_pose.md">
Documentation: YOLO-NAS-POSE Export
</a>
</td>
</tr>


<tr>
<td>
Join our <a target="_blank" href="https://discord.gg/2v6cEGMREN">
Discord Community
</a>
</td>
</tr>
</table>


## LICENSE

The YOLO-NAS-POSE model is available under an open-source license, with pre-trained weights available for non-commercial use on SuperGradients, Deci's PyTorch-based, open-source computer vision training library.
With SuperGradients, users can train models from scratch or fine-tune existing ones, leveraging advanced built-in training techniques such as Distributed Data Parallel, Exponential Moving Average, Automatic Mixed Precision, and Quantization-Aware Training.

License file is available here: [YOLO-NAS-POSE WEIGHTS LICENSE](LICENSE.YOLONAS-POSE.md)
45 changes: 45 additions & 0 deletions documentation/source/YoloNASPoseQuickstart.md
@@ -0,0 +1,45 @@
# YOLO-NAS-POSE Quickstart
<div>
<img src="images/yolo_nas_pose_frontier_t4.png" width="750">
</div>

Deci leveraged its proprietary Neural Architecture Search engine (AutoNAC) to generate YOLO-NAS-POSE - a new pose
estimation architecture that delivers the world's best accuracy-latency performance.

The YOLO-NAS-POSE model incorporates quantization-aware RepVGG blocks to ensure compatibility with post-training
quantization, making it very flexible and usable for different hardware configurations.

In this tutorial, we will go over the basic functionality of the YOLO-NAS-POSE model.


## Instantiate a YOLO-NAS-POSE Model

```python
from super_gradients.training import models
from super_gradients.common.object_names import Models

yolo_nas_pose = models.get(Models.YOLO_NAS_POSE_L, pretrained_weights="coco_pose")
```
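
Smaller variants are instantiated the same way and trade accuracy for latency. The `YOLO_NAS_POSE_L` enum appears above; the N/S/M names below assume the same naming pattern - check `Models` in your version:

```python
# Hedged: the S variant, assuming the enum name mirrors YOLO_NAS_POSE_L.
yolo_nas_pose_s = models.get(Models.YOLO_NAS_POSE_S, pretrained_weights="coco_pose")
```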

## Predict

```python
prediction = yolo_nas_pose.predict("https://deci-pretrained-models.s3.amazonaws.com/sample_images/beatles-abbeyroad.jpg")
prediction.show()
```
<div>
<img src="images/yolo_nas_pose_predict_demo.jpg" width="750">
</div>
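
The `conf` argument shown in the README quickstart also works here to filter low-confidence detections. Saving the rendered result is sketched below; `save` is an assumed helper, so verify it exists on the prediction object in your SuperGradients version:

```python
# Hedged sketch: predict with a stricter confidence threshold and save the
# rendered visualization. `conf` appears in the README quickstart; `save` is
# an assumed helper whose exact signature may differ between versions.
prediction = yolo_nas_pose.predict(
    "https://deci-pretrained-models.s3.amazonaws.com/sample_images/beatles-abbeyroad.jpg",
    conf=0.6,
)
prediction.save("yolo_nas_pose_prediction.jpg")
```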

## Export to ONNX & TensorRT

```python
yolo_nas_pose.export("yolo_nas_pose.onnx")
```

Please follow our [Pose Estimation Models Export](models_export_pose.md) tutorial for more details.
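
As a quick, hedged sanity check of the exported file (input dtype and output layout depend on the export options used, so the sketch reads them from the session rather than assuming a fixed spec):

```python
# A minimal sketch for running the exported model with onnxruntime.
import numpy as np
import onnxruntime

session = onnxruntime.InferenceSession("yolo_nas_pose.onnx", providers=["CPUExecutionProvider"])

inp = session.get_inputs()[0]
print("input:", inp.name, inp.shape, inp.type)

# Dummy 640x640 batch just to verify the graph runs end to end.
dtype = np.uint8 if "uint8" in inp.type else np.float32
dummy = np.zeros((1, 3, 640, 640), dtype=dtype)

outputs = session.run(None, {inp.name: dummy})
for meta, out in zip(session.get_outputs(), outputs):
    print("output:", meta.name, out.shape)
```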

## Evaluation using pycocotools

We provide an example notebook for evaluating YOLO-NAS-POSE using the COCO evaluation protocol.
Please check the [YOLO-NAS-POSE evaluation with pycocotools](https://github.com/Deci-AI/super-gradients/blob/master/notebooks/yolo_nas_pose_eval_with_pycocotools.ipynb) notebook for more details.
12 changes: 9 additions & 3 deletions documentation/source/model_zoo.md
@@ -92,9 +92,13 @@ All the available models are listed in the column `Model name`.

### Pretrained Pose Estimation PyTorch Checkpoints

| Model          | Model Name      | Dataset     | Resolution | AP (No TTA / H-Flip TTA / H-Flip TTA+Rescoring) | Latency b1<sub>T4</sub> | Latency b1<sub>T4</sub> including IO | Latency (Production)**<sub>Jetson Xavier NX</sub> |
|----------------|-----------------|-------------|------------|-------------------------------------------------|-------------------------|--------------------------------------|:-------------------------------------------------:|
| DEKR_W32_NO_DC | dekr_w32_no_dc  | COCO2017 PE | 640x640    | 63.08 / 64.96 / 67.32                           | 13.29 ms                | 15.31 ms                             | 75.99 ms                                          |
| YoloNAS POSE N | yolo_nas_pose_n | COCO2017 PE | 640x640    | 59.68 / N/A / N/A                               | N/A                     | 2.35 ms                              | 15.99 ms                                          |
| YoloNAS POSE S | yolo_nas_pose_s | COCO2017 PE | 640x640    | 64.15 / N/A / N/A                               | N/A                     | 3.29 ms                              | 21.01 ms                                          |
| YoloNAS POSE M | yolo_nas_pose_m | COCO2017 PE | 640x640    | 67.87 / N/A / N/A                               | N/A                     | 6.87 ms                              | 38.40 ms                                          |
| YoloNAS POSE L | yolo_nas_pose_l | COCO2017 PE | 640x640    | 68.24 / N/A / N/A                               | N/A                     | 8.86 ms                              | 49.34 ms                                          |


## Implemented Model Architectures
@@ -141,4 +145,6 @@ Devices [https://arxiv.org/pdf/1807.11164](https://arxiv.org/pdf/1807.11164)


### Pose Estimation

- [HRNet DEKR](https://github.com/HRNet/HigherHRNet-Human-Pose-Estimation) - Bottom-Up Human Pose Estimation Via Disentangled Keypoint Regression [https://arxiv.org/pdf/2104.02300.pdf](https://arxiv.org/pdf/2104.02300.pdf)
- YoloNAS Pose