Squashed changes with YoloNASPose & Loss #1512

Merged
Changes from 24 commits
41 commits
238e41c
Introduce sample-centric keypoint transforms
BloodAxe Oct 2, 2023
efeb4ef
Cleanup leftovers
BloodAxe Oct 2, 2023
fa1f79d
Fixed way of checking transforms that require additional samples
BloodAxe Oct 3, 2023
88cef73
Docstrings
BloodAxe Oct 3, 2023
a29f58a
:attr -> :param
BloodAxe Oct 3, 2023
c797100
Added docs clarifying behavior of mosaic & mixup
BloodAxe Oct 3, 2023
7b1b9ee
Added docs clarifying behavior of mosaic & mixup
BloodAxe Oct 3, 2023
0d00668
Improved tests
BloodAxe Oct 3, 2023
a445917
Merge branch 'master' into feature/SG-1060-yolo-nas-pose-release
BloodAxe Oct 3, 2023
f388fcc
Additional docstrings & typing annotations
BloodAxe Oct 3, 2023
5b24d06
Merge remote-tracking branch 'origin/feature/SG-1060-yolo-nas-pose-re…
BloodAxe Oct 3, 2023
a7c97ce
Added missing additional_samples_count field
BloodAxe Oct 4, 2023
b84817d
Merge branch 'master' into feature/SG-1060-yolo-nas-pose-release
BloodAxe Oct 4, 2023
4a21438
KeypointsRemoveSmallObjects
BloodAxe Oct 4, 2023
53ecb3e
KeypointsRemoveSmallObjects
BloodAxe Oct 4, 2023
577804c
Merge branch 'master' into feature/SG-1060-yolo-nas-pose-release
BloodAxe Oct 5, 2023
fd41b71
Feature/sg 1060 yolo nas pose release pr to add datasets and metric (…
BloodAxe Oct 9, 2023
81091b7
Merge branch 'master' into feature/SG-1060-yolo-nas-pose-release
BloodAxe Oct 9, 2023
a1284be
Squashed changes with YoloNASPose & Loss
BloodAxe Oct 9, 2023
2d01377
Merge branch 'master' into feature/SG-1060-yolo-nas-pose-release-add-…
BloodAxe Oct 9, 2023
3ed26ff
Remove print statement
BloodAxe Oct 10, 2023
359dcf6
Fixed attribute name that was not renamed
BloodAxe Oct 10, 2023
928da84
Improve docstrings to use 'Num Keypoints' instead of magic number 17
BloodAxe Oct 10, 2023
0e540e5
Fixed PoseNMS export to work with custom number of keypoints
BloodAxe Oct 10, 2023
8dc7647
Added docstrings
BloodAxe Oct 10, 2023
0e23800
Simplify forward/forward_eval
BloodAxe Oct 10, 2023
15cef48
Simplify forward/forward_eval
BloodAxe Oct 10, 2023
295b0c7
_insert_heads_list_params
BloodAxe Oct 10, 2023
0474536
Merge master
shaydeci Oct 11, 2023
94a15fc
Added tests
BloodAxe Oct 11, 2023
74ea314
Refactor the way we generate usage instructions. Should be easier to …
BloodAxe Oct 11, 2023
d43cf2b
Merge branch 'master' into feature/SG-1060-yolo-nas-pose-release-add-…
BloodAxe Oct 12, 2023
408b5d0
Improve docstrings
BloodAxe Oct 12, 2023
e8b3f18
Improved docstrings
BloodAxe Oct 12, 2023
b2ff5e9
Improved docstrings
BloodAxe Oct 12, 2023
90bca43
Improved docstrings
BloodAxe Oct 12, 2023
84aa12b
Improved docstrings
BloodAxe Oct 12, 2023
fe00d56
Rename bboxes -> bboxes_xyxy
BloodAxe Oct 12, 2023
632d62d
Fixed instructions text
BloodAxe Oct 12, 2023
11d3eb2
Improve efficiency of training
BloodAxe Oct 12, 2023
91dd113
Merge branch 'master' into feature/SG-1060-yolo-nas-pose-release-add-…
BloodAxe Oct 13, 2023
11 changes: 10 additions & 1 deletion src/super_gradients/common/object_names.py
@@ -16,6 +16,7 @@ class Losses:
    DICE_CE_EDGE_LOSS = "DiceCEEdgeLoss"
    DEKR_LOSS = "DEKRLoss"
    RESCORING_LOSS = "RescoringLoss"
    YOLONAS_POSE_LOSS = "YoloNASPoseLoss"


class Metrics:
@@ -113,11 +114,13 @@ class Transforms:
    KeypointsImageNormalize = "KeypointsImageNormalize"
    KeypointsImageStandardize = "KeypointsImageStandardize"
    KeypointsImageToTensor = "KeypointsImageToTensor"
    KeypointTransform = "KeypointTransform"
    KeypointsPadIfNeeded = "KeypointsPadIfNeeded"
    KeypointsLongestMaxSize = "KeypointsLongestMaxSize"
    KeypointsRandomVerticalFlip = "KeypointsRandomVerticalFlip"
    KeypointsRandomHorizontalFlip = "KeypointsRandomHorizontalFlip"
    KeypointsRescale = "KeypointsRescale"
    KeypointsRandomRotate90 = "KeypointsRandomRotate90"
    KeypointsRemoveSmallObjects = "KeypointsRemoveSmallObjects"


class Optimizers:
@@ -312,6 +315,11 @@ class Models:
    POSE_RESCORING = "pose_rescoring_custom"
    POSE_RESCORING_COCO = "pose_rescoring_coco"

    YOLO_NAS_POSE_N = "yolo_nas_pose_n"
    YOLO_NAS_POSE_S = "yolo_nas_pose_s"
    YOLO_NAS_POSE_M = "yolo_nas_pose_m"
    YOLO_NAS_POSE_L = "yolo_nas_pose_l"


class ConcatenatedTensorFormats:
XYXY_LABEL = "XYXY_LABEL"
@@ -418,6 +426,7 @@ class Processings:
    DetectionLongestMaxSizeRescale = "DetectionLongestMaxSizeRescale"
    DetectionBottomRightPadding = "DetectionBottomRightPadding"
    DetectionRescale = "DetectionRescale"
    KeypointsRescale = "KeypointsRescale"
    KeypointsLongestMaxSizeRescale = "KeypointsLongestMaxSizeRescale"
    KeypointsBottomRightPadding = "KeypointsBottomRightPadding"
    ImagePermute = "ImagePermute"
339 changes: 339 additions & 0 deletions src/super_gradients/conversion/onnx/pose_nms.py

Large diffs are not rendered by default.

10 changes: 9 additions & 1 deletion src/super_gradients/module_interfaces/__init__.py
@@ -1,5 +1,8 @@
from .module_interfaces import HasPredict, HasPreprocessingParams, SupportsReplaceNumClasses
-from .exportable_detector import ExportableObjectDetectionModel, AbstractObjectDetectionDecodingModule, ModelHasNoPreprocessingParamsException
+from .exceptions import ModelHasNoPreprocessingParamsException
+from .exportable_detector import ExportableObjectDetectionModel, AbstractObjectDetectionDecodingModule
+from .exportable_pose_estimation import ExportablePoseEstimationModel, PoseEstimationModelExportResult, AbstractPoseEstimationDecodingModule
+from .pose_estimation_post_prediction_callback import AbstractPoseEstimationPostPredictionCallback, PoseEstimationPredictions

__all__ = [
"HasPredict",
@@ -8,4 +11,9 @@
    "ExportableObjectDetectionModel",
    "AbstractObjectDetectionDecodingModule",
    "ModelHasNoPreprocessingParamsException",
    "AbstractPoseEstimationPostPredictionCallback",
    "PoseEstimationPredictions",
    "ExportablePoseEstimationModel",
    "PoseEstimationModelExportResult",
    "AbstractPoseEstimationDecodingModule",
]
6 changes: 6 additions & 0 deletions src/super_gradients/module_interfaces/exceptions.py
@@ -0,0 +1,6 @@
class ModelHasNoPreprocessingParamsException(Exception):
    """
    Exception that is raised when model does not have preprocessing parameters.
    """

    pass
637 changes: 637 additions & 0 deletions src/super_gradients/module_interfaces/exportable_pose_estimation.py

Large diffs are not rendered by default.

@@ -0,0 +1,37 @@
import abc
import dataclasses
import numpy as np

from typing import Any, List
from typing import Union, Optional
from torch import Tensor

__all__ = ["PoseEstimationPredictions", "AbstractPoseEstimationPostPredictionCallback"]


@dataclasses.dataclass
class PoseEstimationPredictions:
    """
    A data class that encapsulates pose estimation predictions for a single image.

    :param poses:       Array of shape [N, K, 3] where N is number of poses and K is number of joints.
                        Last dimension is [x, y, score] where score is the confidence score for the specific joint
                        with [0..1] range.
    :param scores:      Array of shape [N] with scores for each pose with [0..1] range.
    :param bboxes_xyxy: Array of shape [N, 4] with bounding boxes for each pose in XYXY format.
                        Can be None if bounding boxes are not available (for instance, DEKR model does not output boxes).
    """

    poses: Union[Tensor, np.ndarray]
    scores: Union[Tensor, np.ndarray]
    bboxes_xyxy: Optional[Union[Tensor, np.ndarray]]


class AbstractPoseEstimationPostPredictionCallback(abc.ABC):
    """
    A protocol interface of a post-prediction callback for pose estimation models.
    """

    @abc.abstractmethod
    def __call__(self, predictions: Any) -> List[PoseEstimationPredictions]:
        ...
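
To illustrate the contract this interface appears to define (one result object per image, with `bboxes_xyxy` allowed to be `None`), here is a minimal sketch of a concrete callback. Everything in it is hypothetical: `PosePredictionsSketch`, `AbstractCallbackSketch` and `ThresholdCallback` are stand-ins that use plain lists instead of tensors or arrays so the snippet runs without torch or numpy, and the input format (a list of `(poses, scores)` pairs) is an assumption, since the diff types `predictions` as `Any`.

```python
import abc
import dataclasses
from typing import Any, List, Optional

# One pose is K joints, each joint is [x, y, score].
Pose = List[List[float]]


@dataclasses.dataclass
class PosePredictionsSketch:
    """Stand-in for PoseEstimationPredictions that stores plain lists."""

    poses: List[Pose]
    scores: List[float]
    bboxes_xyxy: Optional[List[List[float]]]


class AbstractCallbackSketch(abc.ABC):
    """Stand-in for AbstractPoseEstimationPostPredictionCallback."""

    @abc.abstractmethod
    def __call__(self, predictions: Any) -> List[PosePredictionsSketch]:
        ...


class ThresholdCallback(AbstractCallbackSketch):
    """Hypothetical callback that keeps poses scoring at least `min_score`."""

    def __init__(self, min_score: float):
        self.min_score = min_score

    def __call__(self, predictions: Any) -> List[PosePredictionsSketch]:
        results = []
        # Assumed input: one (poses, scores) pair per image in the batch.
        for image_poses, image_scores in predictions:
            keep = [i for i, s in enumerate(image_scores) if s >= self.min_score]
            results.append(
                PosePredictionsSketch(
                    poses=[image_poses[i] for i in keep],
                    scores=[image_scores[i] for i in keep],
                    bboxes_xyxy=None,  # no boxes available in this sketch
                )
            )
        return results
```

Calling `ThresholdCallback(0.5)` on a batch returns a list with one entry per image, which matches the `List[PoseEstimationPredictions]` return annotation above.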
@@ -0,0 +1,143 @@
in_channels: 3

backbone:
  NStageBackbone:

    stem:
      YoloNASStem:
        out_channels: 48

    stages:
      - YoloNASStage:
          out_channels: 96
          num_blocks: 2
          activation_type: relu
          hidden_channels: 96
          concat_intermediates: True

      - YoloNASStage:
          out_channels: 192
          num_blocks: 3
          activation_type: relu
          hidden_channels: 128
          concat_intermediates: True

      - YoloNASStage:
          out_channels: 384
          num_blocks: 5
          activation_type: relu
          hidden_channels: 256
          concat_intermediates: True

      - YoloNASStage:
          out_channels: 768
          num_blocks: 2
          activation_type: relu
          hidden_channels: 512
          concat_intermediates: True


    context_module:
      SPP:
        output_channels: 768
        activation_type: relu
        k: [5,9,13]

    out_layers: [stage1, stage2, stage3, context_module]

neck:
  YoloNASPANNeckWithC2:

    neck1:
      YoloNASUpStage:
        out_channels: 192
        num_blocks: 4
        hidden_channels: 128
        width_mult: 1
        depth_mult: 1
        activation_type: relu
        reduce_channels: True

    neck2:
      YoloNASUpStage:
        out_channels: 96
        num_blocks: 4
        hidden_channels: 128
        width_mult: 1
        depth_mult: 1
        activation_type: relu
        reduce_channels: True

    neck3:
      YoloNASDownStage:
        out_channels: 192
        num_blocks: 4
        hidden_channels: 128
        activation_type: relu
        width_mult: 1
        depth_mult: 1

    neck4:
      YoloNASDownStage:
        out_channels: 384
        num_blocks: 4
        hidden_channels: 256
        activation_type: relu
        width_mult: 1
        depth_mult: 1

heads:
  YoloNASPoseNDFLHeads:
    num_classes: 17
    reg_max: 16
    heads_list:
      - YoloNASPoseDFLHead:
          bbox_inter_channels: 128
          pose_inter_channels: 128
          pose_regression_blocks: 2
          shared_stem: False
          width_mult: 1
          pose_conf_in_class_head: True
          pose_block_use_repvgg: False
          first_conv_group_size: 0
          num_classes:
          stride: 8
          reg_max: 16
          cls_dropout_rate: 0.0
          reg_dropout_rate: 0.0

      - YoloNASPoseDFLHead:
          bbox_inter_channels: 256
          pose_inter_channels: 512
          pose_regression_blocks: 2
          shared_stem: False
          width_mult: 1
          pose_conf_in_class_head: True
          pose_block_use_repvgg: False
          first_conv_group_size: 0
          num_classes:
          stride: 16
          reg_max: 16
          cls_dropout_rate: 0.0
          reg_dropout_rate: 0.0

      - YoloNASPoseDFLHead:
          bbox_inter_channels: 512
          pose_inter_channels: 512
          pose_regression_blocks: 3
          shared_stem: False
          width_mult: 1
          pose_conf_in_class_head: True
          pose_block_use_repvgg: False
          first_conv_group_size: 0
          num_classes:
          stride: 32
          reg_max: 16
          cls_dropout_rate: 0.0
          reg_dropout_rate: 0.0

bn_eps: 1e-6
bn_momentum: 0.03
inplace_act: True

_convert_: all
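
The three heads in this config use strides 8, 16 and 32. As a rough worked example of what that implies, the sketch below counts grid cells per head for a given input size. It assumes a 640x640 input and one candidate prediction per grid cell (the usual anchor-free layout); neither assumption is stated in the diff, and `grid_cells_per_level` is a hypothetical helper, not part of the library.

```python
from typing import Dict, Sequence, Tuple


def grid_cells_per_level(image_size: Tuple[int, int], strides: Sequence[int] = (8, 16, 32)) -> Dict[int, int]:
    """Return the number of grid cells each head sees, keyed by stride."""
    height, width = image_size
    # Each head works on a feature map downsampled by its stride.
    return {stride: (height // stride) * (width // stride) for stride in strides}


cells = grid_cells_per_level((640, 640))
total_candidates = sum(cells.values())
print(cells)             # {8: 6400, 16: 1600, 32: 400}
print(total_candidates)  # 8400
```

Under these assumptions the stride-8 head contributes most of the candidates, which is why the finest level tends to dominate the cost of post-processing.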
@@ -0,0 +1,139 @@
in_channels: 3

backbone:
  NStageBackbone:

    stem:
      YoloNASStem:
        out_channels: 48

    stages:
      - YoloNASStage:
          out_channels: 96
          num_blocks: 2
          activation_type: relu
          hidden_channels: 64
          concat_intermediates: True

      - YoloNASStage:
          out_channels: 192
          num_blocks: 3
          activation_type: relu
          hidden_channels: 128
          concat_intermediates: True

      - YoloNASStage:
          out_channels: 384
          num_blocks: 5
          activation_type: relu
          hidden_channels: 256
          concat_intermediates: True

      - YoloNASStage:
          out_channels: 768
          num_blocks: 2
          activation_type: relu
          hidden_channels: 384
          concat_intermediates: False


    context_module:
      SPP:
        output_channels: 768
        activation_type: relu
        k: [5,9,13]

    out_layers: [stage1, stage2, stage3, context_module]

neck:
  YoloNASPANNeckWithC2:

    neck1:
      YoloNASUpStage:
        out_channels: 192
        num_blocks: 2
        hidden_channels: 192
        width_mult: 1
        depth_mult: 1
        activation_type: relu
        reduce_channels: True

    neck2:
      YoloNASUpStage:
        out_channels: 96
        num_blocks: 3
        hidden_channels: 64
        width_mult: 1
        depth_mult: 1
        activation_type: relu
        reduce_channels: True

    neck3:
      YoloNASDownStage:
        out_channels: 192
        num_blocks: 2
        hidden_channels: 192
        activation_type: relu
        width_mult: 1
        depth_mult: 1

    neck4:
      YoloNASDownStage:
        out_channels: 384
        num_blocks: 3
        hidden_channels: 256
        activation_type: relu
        width_mult: 1
        depth_mult: 1

heads:
  YoloNASPoseNDFLHeads:
    num_classes: 17
    reg_max: 16
    pose_offset_multiplier: 1.0
    compensate_grid_cell_offset: True
    inference_mode: False # True used only when benchmarking
    heads_list:
      - YoloNASPoseDFLHead:
          bbox_inter_channels: 128
          pose_inter_channels: 128
          pose_regression_blocks: 2
          shared_stem: False
          width_mult: 0.75
          pose_conf_in_class_head: True
          pose_block_use_repvgg: False
          first_conv_group_size: 0
          num_classes:
          stride: 8
          reg_max: 16
      - YoloNASPoseDFLHead:
          bbox_inter_channels: 256
          pose_inter_channels: 512
          pose_regression_blocks: 2
          shared_stem: False
          width_mult: 0.75
          pose_conf_in_class_head: True
          pose_block_use_repvgg: False
          first_conv_group_size: 0
          num_classes:
          stride: 16
          reg_max: 16
      - YoloNASPoseDFLHead:
          bbox_inter_channels: 512
          pose_inter_channels: 512
          pose_regression_blocks: 3
          shared_stem: False
          width_mult: 0.75
          pose_conf_in_class_head: True
          pose_block_use_repvgg: False
          first_conv_group_size: 0
          num_classes:
          stride: 32
          reg_max: 16


bn_eps: 1e-6
bn_momentum: 0.1
inplace_act: True

_convert_: all
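
The `reg_max: 16` setting and the "DFL" in the head names suggest that box sides are regressed as a distribution over discrete bins, as in Distribution Focal Loss heads. The head implementation is not rendered in this diff, so the following is only a sketch of the usual DFL decoding step under that assumption: a softmax over `reg_max + 1` bins followed by the expected bin index. `dfl_expected_distance` is a hypothetical helper, written in pure Python so it runs without torch.

```python
import math
from typing import List


def dfl_expected_distance(logits: List[float]) -> float:
    """Softmax over the bins, then the expectation of the bin index (0..len-1)."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return sum(index * weight / total for index, weight in enumerate(exps))


reg_max = 16  # value taken from the config above

# A flat distribution decodes to the middle of the range, about reg_max / 2.
uniform_distance = dfl_expected_distance([0.0] * (reg_max + 1))

# A sharply peaked distribution decodes to roughly the index of its peak.
peaked_logits = [0.0] * (reg_max + 1)
peaked_logits[3] = 50.0
peaked_distance = dfl_expected_distance(peaked_logits)
```

In typical DFL-based detectors the decoded value is expressed in grid cells and then multiplied by the head's stride to get pixels; whether these heads do exactly that is not confirmed by the diff.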