Feature/sg 1060 yolo nas pose (#1611)
* crowdpose_yolo_nas_pose_s

* crowdpose_yolo_nas_pose_s

* crowdpose_yolo_nas_pose_s

* crowdpose_yolo_nas_pose_s

* coco2017_yolo_nas_pose_s_ema_less_mosaic

* coco2017_yolo_nas_pose_s_less_mosaic

* coco2017_yolo_nas_pose_s_ema_less_mosaic_higher_final_lr_fp32

* coco2017_yolo_nas_pose_s_ema_less_mosaic_higher_final_lr_fp32

* coco2017_yolo_nas_pose_s_ema_less_mosaic_lr_focal

* shared head

* YoloNASPoseBoxesPostPredictionCallback

* New head design

* Another recipe with less zoom out, no crowd images

* Another recipe with less zoom out, no crowd images

* Another recipe with less zoom out, no crowd images

* coco2017_yolo_nas_pose_shared_s_ema_less_mosaic_lr_bce_local

* coco2017_yolo_nas_pose_shared_s_ema_less_mosaic_lr_bce_local

* Update scores

* Cleanup old configs, keep one config that gives best AP score

* Shortened recipe

* coco2017_yolo_nas_pose_shared_s_384_short

* Tune short recipe

* Tune short recipe

* Tune short recipe

* Tune short recipe

* coco2017_yolo_nas_pose_s_local

* Update settings of crowd_annotations_action to mask_as_normal since this is the setting we got the best result with

* coco2017_yolo_nas_pose_shared_s_local

* Update default params

* Update default params

* Update DEKR recipe

* M variant

* M variant

* Put more correct min_delta

* Put more correct min_delta

* Put more correct min_delta

* Adding placeholders for YOLO-NAS-POSE

* Rename detection model export test file

* Adding export API support for pose estimation

* Adding export API support for pose estimation

* Added tmp hack

* multiply_by_pose_oks

* assigner_multiply_by_pose_oks

* Experiment: Improve visualization

* Update CrowdPose dataset

* Crowdpose

* Crowdpose

* Crowdpose

* crowdpose_yolo_nas_pose_s_no_crowd_no_ema_local

* Lower LR

* Proxy recipe

* crowdpose_yolo_nas_pose_s_proxy

* crowdpose_yolo_nas_pose_s_proxy

* crowdpose_yolo_nas_pose_s_proxy

* New architectures

* Fix WANDB params

* Fix WANDB params

* Fix WANDB params

* Fix WANDB params

* New architectures

* M

* L

* M

* coco2017_yolo_nas_pose_l_resume

* coco2017_yolo_nas_pose_m_resume

* Added fix to _is_more_extreme which would ensure callback would not crash if observed loss/metric is NaN/Inf

* Reduce LR

* Reduce LR

* Change EMA params

* Export and scores

* Export

* Fix bug of not saving simplified model

* Optimize head return types for better inference efficiency

* Metrics

* Yolo NAS Pose N

* Only EarlyStop no batch visualization

* coco2017_yolo_nas_pose_l_no_ema

* Only EarlyStop no batch visualization

* Removing old architectures

* Notebook for evaluation on COCO

* Remove unnecessary recipes

* Simplify the metric -> pass entire sample to the metric

* Simplify recipe

* coco2017_yolo_nas_pose_n_resume

* Simplify recipe

* Transforms overhaul & refactoring

* Transforms overhaul & refactoring

* Remove KeypointsImageToTensor transform - this will be done in collate fn

* Fix collate fn to do image layout change HWC->CHW

* Attempt to optimize efficiency

* Attempt to optimize efficiency

* Attempt to optimize efficiency

* Attempt to optimize efficiency

* Attempt to optimize efficiency

* Attempt to optimize efficiency

* Attempt to optimize efficiency

* Attempt to optimize efficiency

* Attempt to optimize efficiency

* Attempt to optimize efficiency

* Refactor sample

* Simplify recipe

* New keypoint transform

* Lower dropout rates & heavy augs

* crowdpose_yolo_nas_pose_s

* Improve visualization of pose gt by showing whether it is crowd target or not

* Make convert_to_tensor a bit more efficient by avoiding creating a tensor on cpu and then moving it to target device.

* Compute metric on CPU (surprisingly, it is faster: the amount of data & compute is not that big, so data transfer takes more time than compute)

* Improve speed of computing focal loss

* New batch of training experiments

* New batch of training experiments

* Introduce sample-centric keypoint transforms

* Cleanup leftovers

* Update numbers

* Add benchmark results

* Fixed way of checking transforms that require additional samples

* Docstrings

* :attr -> :param

* Added docs clarifying behavior of mosaic & mixup

* Added docs clarifying behavior of mosaic & mixup

* Improved tests

* Additional docstrings & typing annotations

* Focal-EIOU loss

* Added missing additional_samples_count field

* Fixed predict implementation for pose

* Added docstrings

* KeypointsRemoveSmallObjects

* KeypointsRemoveSmallObjects

* Metric class to use data samples

* New dataset classes

* Reverting old files to keep & update dataset recipes

* Simplified rescoring dataset params YAML file by using coco_pose_common_dataset_params defaults

* Removed old docs

* Remove space

* Introduce AbstractPoseEstimationPostPredictionCallback interface and move PoseEstimationPredictions to a proper place

* Cherry pick changes to post-prediction, visualization and metric

* Remove unwanted references to new datasets

* Remove YoloNASPoseCollateFN

* Make heavy augs a default training param for M & L

* Remove dropout

* Fixed unit test

* Update YoloNAS-M score

* Feature/sg 1060 yolo nas pose release pr to add datasets and metric (#1506)

* Cherry pick changes to post-prediction, visualization and metric

* Remove unwanted references to new datasets

* Remove YoloNASPoseCollateFN

* Fixed unit test

* Improve clarity of bbox format by giving it a more explicit name and adding a bunch of docstrings

* Improve variable names

* Document YoloNASPose loss

* Squashed changes with YoloNASPose & Loss

* Fixed attribute name that was not renamed

* Remove print statement

* Remove print statement

* Fixed attribute name that was not renamed

* Improve docstrings to use 'Num Keypoints' instead of magic number 17

* Fixed PoseNMS export to work with custom number of keypoints

* Remove outdated test

* Update recipes

* Added docstrings

* Simplify forward/forward_eval

* Simplify forward/forward_eval

* Remove any2device_no_grad

* _insert_heads_list_params

* _insert_heads_list_params

* Update ExtremeBatchCaseVisualizationCallback

* Document visualization callback better

* Added YoloNASPose test

* Added tests

* Refactor the way we generate usage instructions. Should be easier to follow and update

* Revert rename

* Dataset & Visualization callback

* Improve docstrings

* Improved docstrings

* Improved docstrings

* Improved docstrings

* Improved docstrings

* Rename bboxes -> bboxes_xyxy

* Example colab for evaluation of ONNX model

* Rename bboxes -> bboxes_xyxy

* Fixed instructions text

* Improve efficiency of training

* Remove files

* Update numbers

* Update animal pose

* Added integration tests for YoloNASPose

* Fix bug in replace head

* Add pretrain weights

* Added export notebook example

* Update integration test

* Updating branch for merge

* Updating branch for merge

* Remove AnimalPoseDataset

* Update markdown text

* Update markdown text

* Revert

* Added license

* Improve debug text in transfer_weights

* Revert

* Cleanup recipes

* Revert

* Cleanup recipes

* Revert

* Update notebooks

* Update mkdocs to include pose estimation

* Update docs

* Revert test

* Put back YAML file

* Added check to print license for YoloNAS-POSE

* Fixed bug in _pad_image that did not support pad_value=(R,B,G) input

* Added images & updated links to notebooks

* Added images & updated links to notebooks

* Added pop dataset_class from dataloader params

* Update quickstart

* Added missing rgb2bgr conversion

* Added missing rgb2bgr conversion

* Disable visualization of samples by default

* Added docstrings

* Updated additional resources section with link to recipes docs

* successor -> derivative

* Re-run notebook

* Fixed recipe to code test

* Re-run notebook

---------

Co-authored-by: Shay Aharon <80472096+shaydeci@users.noreply.github.com>
BloodAxe and shaydeci committed Nov 6, 2023
1 parent b88e99b commit 46cdc18
Showing 35 changed files with 2,949 additions and 9 deletions.
16 changes: 16 additions & 0 deletions LICENSE.YOLONAS-POSE.md
@@ -0,0 +1,16 @@
# YOLO-NAS-POSE License

These model weights or any components comprising the model and the associated documentation (the "Software") is licensed to you by Deci.AI, Inc. ("Deci") under the following terms:
© 2023 – Deci.AI, Inc.

Subject to your full compliance with all of the terms herein, Deci hereby grants you a non-exclusive, revocable, non-sublicensable, non-transferable worldwide and limited right and license to use the Software. If you are using the Deci platform for model optimization, your use of the Software is subject to the Terms of Use available here (the "Terms of Use").

You shall not, without Deci's prior written consent:
(i) resell, lease, sublicense or distribute the Software to any person;
(ii) use the Software to provide third parties with managed services or provide remote access to the Software to any person or compete with Deci in any way;
(iii) represent that you possess any proprietary interest in the Software;
(iv) directly or indirectly, take any action to contest Deci's intellectual property rights or infringe them in any way;
(v) reverse-engineer, decompile, disassemble, alter, enhance, improve, add to, delete from, or otherwise modify, or derive (or attempt to derive) the technology or source code underlying any part of the Software;
(vi) use the Software (or any part thereof) in any illegal, indecent, misleading, harmful, abusive, harassing and/or disparaging manner or for any such purposes. Except as provided under the terms of any separate agreement between you and Deci, including the Terms of Use to the extent applicable, you may not use the Software for any commercial use, including in connection with any models used in a production environment.

DECI PROVIDES THE SOFTWARE "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE OR NON-INFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS OF THE SOFTWARE BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
1 change: 1 addition & 0 deletions Makefile
@@ -32,6 +32,7 @@ sweeper_test:

# Here you define a list of notebooks we want to execute and convert to markdown files
NOTEBOOKS_TO_RUN := src/super_gradients/examples/model_export/models_export.ipynb
NOTEBOOKS_TO_RUN += src/super_gradients/examples/model_export/models_export_pose.ipynb
NOTEBOOKS_TO_RUN += notebooks/what_are_recipes_and_how_to_use.ipynb
NOTEBOOKS_TO_RUN += notebooks/transfer_learning_classification.ipynb
NOTEBOOKS_TO_RUN += notebooks/how_to_use_knowledge_distillation_for_classification.ipynb
119 changes: 119 additions & 0 deletions YOLONAS-POSE.md
@@ -0,0 +1,119 @@
# YOLO-NAS-POSE
### A Next-Generation, Pose Estimation Foundational Model generated by Deci’s Neural Architecture Search Technology

Deci is thrilled to announce the release of a new pose estimation model, YOLO-NAS-POSE - a derivative of the [YOLO-NAS](YOLONAS.md) architecture -
providing superior real-time pose estimation capabilities and production-ready performance.
Deci's mission is to provide AI teams with tools to remove development barriers and attain efficient inference performance more quickly.

![YOLO-NAS-POSE](documentation/source/images/yolo_nas_pose_frontier_t4.png)

The new YOLO-NAS-POSE delivers state-of-the-art (SOTA) performance with an unparalleled accuracy-speed trade-off, outperforming other models such as YOLOv8-Pose and DEKR.

Deci's proprietary Neural Architecture Search technology, [AutoNAC™](https://deci.ai/technology/), generated the architecture of the YOLO-NAS-POSE model.
The AutoNAC™ engine lets you input any task, data characteristics (access to data is not required), inference environment, and performance targets,
and then guides you to the optimal architecture that delivers the best balance between accuracy and inference speed for your specific application.
In addition to being data- and hardware-aware, the AutoNAC engine considers other components in the inference stack, including compilers and quantization.

| Model           | AP    | Latency (ms) |
|-----------------|-------|--------------|
| YOLO-NAS-POSE N | 59.68 | 2.35 ms      |
| YOLO-NAS-POSE S | 64.15 | 3.29 ms      |
| YOLO-NAS-POSE M | 67.87 | 6.87 ms      |
| YOLO-NAS-POSE L | 68.24 | 8.86 ms      |

AP numbers in the table are reported for the COCO 2017 validation dataset, and latency was benchmarked for 640x640 images on an NVIDIA T4 GPU.
No flip TTA was used.

Similarly to YOLO-NAS, the YOLO-NAS-POSE architecture employs quantization-aware blocks and selective quantization for optimized performance.
In fact, YOLO-NAS-POSE is a derivative of YOLO-NAS and uses the same backbone and neck as YOLO-NAS.
Only the head is different, and it is optimized by AutoNAC for the pose estimation task.
This enables us to use transfer learning and fine-tune YOLO-NAS-POSE starting from YOLO-NAS weights.
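
As a minimal, hedged sketch of what that fine-tuning path looks like (assuming `models.get` accepts `num_classes` as the number of keypoints for pose models, following SuperGradients' usual convention - verify against the fine-tuning notebook linked under Additional resources):

```python
# A minimal sketch, assuming `num_classes` controls the number of keypoints
# for pose models in `models.get` - verify against your installed version.
from super_gradients.training import models

# Start from COCO-pretrained pose weights; the head is re-initialized for a
# hypothetical custom dataset with 20 keypoints.
model = models.get("yolo_nas_pose_s", num_classes=20, pretrained_weights="coco_pose")
```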


## Quickstart

### Extract predicted poses

```python
import super_gradients

yolo_nas_pose = super_gradients.training.models.get("yolo_nas_pose_l", pretrained_weights="coco_pose").cuda()
model_predictions = yolo_nas_pose.predict("https://deci-pretrained-models.s3.amazonaws.com/sample_images/beatles-abbeyroad.jpg", conf=0.5)
model_predictions.show()  # show() displays the result and returns None, so call it separately rather than chaining

prediction = model_predictions[0].prediction  # One prediction per image - here we work with a single image, so we take the first.

bboxes = prediction.bboxes_xyxy # [Num Instances, 4] List of predicted bounding boxes for each object
poses = prediction.poses # [Num Instances, Num Joints, 3] list of predicted joints for each detected object (x,y, confidence)
scores = prediction.scores # [Num Instances] - Confidence value for each predicted instance
```

![YOLO-NAS-POSE Predict Demo](documentation/source/images/yolo_nas_pose_predict_demo.jpg)
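
A small usage sketch that consumes only the three arrays documented above (the 0.5 per-joint confidence threshold is an arbitrary choice for illustration):

```python
# Iterate over detected people and print their confidently-detected joints.
# Uses only the `bboxes`, `poses` and `scores` arrays extracted above.
for instance_id, (box, pose, score) in enumerate(zip(bboxes, poses, scores)):
    print(f"Person {instance_id}: score={score:.2f}, box={box}")
    for joint_id, (x, y, joint_conf) in enumerate(pose):
        if joint_conf > 0.5:  # arbitrary per-joint confidence cutoff
            print(f"  joint {joint_id}: ({x:.1f}, {y:.1f}), conf={joint_conf:.2f}")
```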

### Recipes

We provide training recipes for YOLO-NAS-POSE on the COCO, CrowdPose and AnimalPose datasets; a launch sketch follows the list below.

#### COCO 2017

* [super_gradients/recipes/coco2017_yolo_nas_pose_n.yaml](src/super_gradients/recipes/coco2017_yolo_nas_pose_n.yaml)
* [super_gradients/recipes/coco2017_yolo_nas_pose_s.yaml](src/super_gradients/recipes/coco2017_yolo_nas_pose_s.yaml)
* [super_gradients/recipes/coco2017_yolo_nas_pose_m.yaml](src/super_gradients/recipes/coco2017_yolo_nas_pose_m.yaml)
* [super_gradients/recipes/coco2017_yolo_nas_pose_l.yaml](src/super_gradients/recipes/coco2017_yolo_nas_pose_l.yaml)
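
The recipes are standard [Hydra](https://hydra.cc) configuration files. As a hedged sketch (the entry-point module name follows SuperGradients' usual recipe runner and may differ between versions), a training run is typically launched like this:

```python
# A sketch, assuming SuperGradients' standard recipe runner - verify the
# module name and override syntax against your installed version.
#
#   python -m super_gradients.train_from_recipe --config-name=coco2017_yolo_nas_pose_s
#
# Recipes are Hydra configs, so individual hyperparameters can be overridden
# from the command line, e.g.:
#
#   python -m super_gradients.train_from_recipe --config-name=coco2017_yolo_nas_pose_s \
#       training_hyperparams.initial_lr=1e-4
```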


## Additional resources

<table>
<tr>
<td>
<a target="_blank" href="https://colab.research.google.com/drive/1O4N5Vbzv0rfkT81LQidPktX8RtoS5A40">
<img src="./documentation/assets/SG_img/colab_logo.png" /> Predict poses with YoloNAS Pose Model
</a>
</td>
</tr>
<tr>
<td>
<a target="_blank" href="https://colab.research.google.com/drive/1agLj0aGx48C_rZPrTkeA18kuncack6lF">
<img src="./documentation/assets/SG_img/colab_logo.png" /> Fine-Tune YoloNAS Pose on AnimalPose dataset Notebook
</a>
</td>
</tr>
<tr>
<td>
<a target="_blank" href="documentation/source/YoloNASPoseQuickstart.md">
Documentation: YOLO-NAS-POSE Quickstart
</a>
</td>
</tr>
<tr>
<td>
<a target="_blank" href="documentation/source/Recipes_Training.md">
Documentation: Recipes
</a>
</td>
</tr>
<tr>
<td>
<a target="_blank" href="documentation/source/models_export_pose.md">
Documentation: YOLO-NAS-POSE Export
</a>
</td>
</tr>


<tr>
<td>
Join our <a target="_blank" href="https://discord.gg/2v6cEGMREN">
Discord Community
</a>
</td>
</tr>
</table>


## LICENSE

The YOLO-NAS-POSE model is available under an open-source license, with pre-trained weights available for non-commercial use on SuperGradients, Deci's PyTorch-based, open-source computer vision training library.
With SuperGradients, users can train models from scratch or fine-tune existing ones, leveraging advanced built-in training techniques such as Distributed Data Parallel, Exponential Moving Average, Automatic Mixed Precision, and Quantization-Aware Training.

License file is available here: [YOLO-NAS-POSE WEIGHTS LICENSE](LICENSE.YOLONAS-POSE.md)
45 changes: 45 additions & 0 deletions documentation/source/YoloNASPoseQuickstart.md
@@ -0,0 +1,45 @@
# YOLO-NAS-POSE Quickstart
<div>
<img src="images/yolo_nas_pose_frontier_t4.png" width="750">
</div>

Deci leveraged its proprietary Neural Architecture Search engine (AutoNAC) to generate YOLO-NAS-POSE - a new pose
estimation architecture that delivers the world's best accuracy-latency performance.

The YOLO-NAS-POSE model incorporates quantization-aware RepVGG blocks to ensure compatibility with post-training
quantization, making it very flexible and usable for different hardware configurations.

In this tutorial, we will go over the basic functionality of the YOLO-NAS-POSE model.


## Instantiate a YOLO-NAS-POSE Model

```python
from super_gradients.training import models
from super_gradients.common.object_names import Models

yolo_nas_pose = models.get(Models.YOLO_NAS_POSE_L, pretrained_weights="coco_pose")
```
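
Smaller variants are instantiated the same way and trade accuracy for latency. The `YOLO_NAS_POSE_L` enum appears above; the N/S/M names below assume the same naming pattern - check `Models` in your version:

```python
# Hedged: the S variant, assuming the enum name mirrors YOLO_NAS_POSE_L.
yolo_nas_pose_s = models.get(Models.YOLO_NAS_POSE_S, pretrained_weights="coco_pose")
```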

## Predict

```python
prediction = yolo_nas_pose.predict("https://deci-pretrained-models.s3.amazonaws.com/sample_images/beatles-abbeyroad.jpg")
prediction.show()
```
<div>
<img src="images/yolo_nas_pose_predict_demo.jpg" width="750">
</div>
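
The `conf` argument shown in the README quickstart also works here to filter low-confidence detections. Saving the rendered result is sketched below; `save` is an assumed helper, so verify it exists on the prediction object in your SuperGradients version:

```python
# Hedged sketch: predict with a stricter confidence threshold and save the
# rendered visualization. `conf` appears in the README quickstart; `save` is
# an assumed helper whose exact signature may differ between versions.
prediction = yolo_nas_pose.predict(
    "https://deci-pretrained-models.s3.amazonaws.com/sample_images/beatles-abbeyroad.jpg",
    conf=0.6,
)
prediction.save("yolo_nas_pose_prediction.jpg")
```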

## Export to ONNX & TensorRT

```python
yolo_nas_pose.export("yolo_nas_pose.onnx")
```

Please follow our [Pose Estimation Models Export](models_export_pose.md) tutorial for more details.
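
As a quick, hedged sanity check of the exported file (input dtype and output layout depend on the export options used, so the sketch reads them from the session rather than assuming a fixed spec):

```python
# A minimal sketch for running the exported model with onnxruntime.
import numpy as np
import onnxruntime

session = onnxruntime.InferenceSession("yolo_nas_pose.onnx", providers=["CPUExecutionProvider"])

inp = session.get_inputs()[0]
print("input:", inp.name, inp.shape, inp.type)

# Dummy 640x640 batch just to verify the graph runs end to end.
dtype = np.uint8 if "uint8" in inp.type else np.float32
dummy = np.zeros((1, 3, 640, 640), dtype=dtype)

outputs = session.run(None, {inp.name: dummy})
for meta, out in zip(session.get_outputs(), outputs):
    print("output:", meta.name, out.shape)
```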

## Evaluation using pycocotools

We provide an example notebook for evaluating YOLO-NAS-POSE using the COCO evaluation protocol.
Please check the [YOLO-NAS-POSE evaluation with pycocotools](https://github.com/Deci-AI/super-gradients/blob/master/notebooks/yolo_nas_pose_eval_with_pycocotools.ipynb) notebook for more details.
12 changes: 9 additions & 3 deletions documentation/source/model_zoo.md
@@ -92,9 +92,13 @@ All the available models are listed in the column `Model name`.

### Pretrained Pose Estimation PyTorch Checkpoints

| Model          | Model Name      | Dataset     | Resolution | AP (No TTA / H-Flip TTA / H-Flip TTA+Rescoring) | Latency b1<sub>T4</sub> | Latency b1<sub>T4</sub> including IO | Latency (Production)**<sub>Jetson Xavier NX</sub> |
|----------------|-----------------|-------------|------------|-------------------------------------------------|-------------------------|--------------------------------------|:-------------------------------------------------:|
| DEKR_W32_NO_DC | dekr_w32_no_dc  | COCO2017 PE | 640x640    | 63.08 / 64.96 / 67.32                           | 13.29 ms                | 15.31 ms                             | 75.99 ms                                          |
| YoloNAS POSE N | yolo_nas_pose_n | COCO2017 PE | 640x640    | 59.68 / N/A / N/A                               | N/A                     | 2.35 ms                              | 15.99 ms                                          |
| YoloNAS POSE S | yolo_nas_pose_s | COCO2017 PE | 640x640    | 64.15 / N/A / N/A                               | N/A                     | 3.29 ms                              | 21.01 ms                                          |
| YoloNAS POSE M | yolo_nas_pose_m | COCO2017 PE | 640x640    | 67.87 / N/A / N/A                               | N/A                     | 6.87 ms                              | 38.40 ms                                          |
| YoloNAS POSE L | yolo_nas_pose_l | COCO2017 PE | 640x640    | 68.24 / N/A / N/A                               | N/A                     | 8.86 ms                              | 49.34 ms                                          |


## Implemented Model Architectures
@@ -141,4 +145,6 @@ Devices [https://arxiv.org/pdf/1807.11164](https://arxiv.org/pdf/1807.11164)


### Pose Estimation

- [HRNet DEKR](https://github.com/HRNet/HigherHRNet-Human-Pose-Estimation) - Bottom-Up Human Pose Estimation Via Disentangled Keypoint Regression [https://arxiv.org/pdf/2104.02300.pdf](https://arxiv.org/pdf/2104.02300.pdf)
- YoloNAS Pose