
Feature/sg 815 fix override dataset params #1092

Merged
merged 19 commits into master from feature/SG-815-fix-override-dataset-params
Jun 1, 2023

Conversation

BloodAxe
Collaborator

@BloodAxe BloodAxe commented May 26, 2023

This PR fixes an interpolation issue in our dataset configs.
There are two distinct, almost mutually exclusive use cases, but it looks like I managed to get both of them working.

Scenario 1: Colab users

In Colab, users override params via the dataloader factory methods as follows:

dataset = coco2017_train_yolo_nas(dataset_params=dict(data_dir="e:/coco2017", input_dim=[320,320]))

Under the hood, we load dataset config from YAML file and apply user-provided overrides.
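
Conceptually, the intended order of operations looks roughly like the following sketch, using OmegaConf directly (the keys and values are stand-ins, not the real factory internals or recipe contents):

from omegaconf import OmegaConf

# Stand-in for the default recipe YAML (structure only; real recipes have many more keys).
defaults = OmegaConf.create(
    {
        "dataset_params": {
            "train_dataset_params": {
                "data_dir": "/data/coco",
                "input_dim": [640, 640],
                "transforms": [
                    {"DetectionTargetsFormatTransform": {"input_dim": "${dataset_params.train_dataset_params.input_dim}"}}
                ],
            }
        }
    }
)

# User overrides, e.g. what coco2017_train_yolo_nas(dataset_params=...) receives in Colab.
overrides = OmegaConf.create(
    {"dataset_params": {"train_dataset_params": {"data_dir": "e:/coco2017", "input_dim": [320, 320]}}}
)

# Merge first, resolve interpolations afterwards, so the transform picks up the overridden value.
merged = OmegaConf.merge(defaults, overrides)
resolved = OmegaConf.to_container(merged, resolve=True)
train_params = resolved["dataset_params"]["train_dataset_params"]
print(train_params["transforms"][0]["DetectionTargetsFormatTransform"]["input_dim"])  # -> [320, 320]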

Scenario 2: Pycharm / Command line users

python train_from_recipe.py --config-name=coco2017_yolo_nas_s \
  dataset_params.train_dataset_params.input_dim.0=320 \
  dataset_params.train_dataset_params.input_dim.1=320

On the command line, users launch training with train_from_recipe, and the instantiation of the datasets is somewhat convoluted: we still call coco2017_train_yolo_nas, but we end up doing double work when merging the dataset configuration.

In the call to coco2017_train_yolo_nas(dataset_params), the dataset_params variable comes from the experiment configuration file, with all interpolations/instantiations already done for us by Hydra.
Inside the function, coco2017_train_yolo_nas still loads the default dataset params from the YAML file.
And only then do we do the "merging".

The interpolation did not work as intended because of the wrong order of instantiation/merging/overrides.
Looks like this PR addresses both scenarios. There are additional unit test cases that exercise the new config-merging logic.
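
As a rough, framework-independent illustration of the kind of check such a test can make (not the actual test code from this PR), one can assert that an input_dim override propagates into transforms that interpolate it:

import unittest

from omegaconf import OmegaConf


class TestDatasetParamsOverride(unittest.TestCase):
    def test_input_dim_override_propagates_into_transforms(self):
        # Minimal stand-in config with an interpolation, mimicking the recipe structure.
        defaults = OmegaConf.create(
            {
                "input_dim": [640, 640],
                "transforms": [{"DetectionTargetsFormatTransform": {"input_dim": "${input_dim}"}}],
            }
        )
        overrides = OmegaConf.create({"input_dim": [320, 320]})

        # Merging before resolving is what makes the override visible to the interpolation.
        merged = OmegaConf.to_container(OmegaConf.merge(defaults, overrides), resolve=True)

        self.assertEqual([320, 320], merged["transforms"][0]["DetectionTargetsFormatTransform"]["input_dim"])


if __name__ == "__main__":
    unittest.main()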

✅ Added a test case to verify that the override works
✅ Ran all unit tests to ensure they pass (some were actually failing, so I fixed them; that's why there are additional files in the PR)
✅ Ran all integration tests to ensure they pass as well (apart from failures caused by the missing DEKR pretrained weights, the rest are good and we did not break anything)

Not fixed yet

(And I'm not sure how)
The COCO YOLO-NAS dataset config has this strange image preprocessing: resize the image so its longest side is 636, then pad to 640. Why this is a problem:
We cannot use interpolation in this case, since input_dim (636) and the padding size (640) are two different values:

val_dataset_params:
  ...
  input_dim: [636, 636]
  transforms:
    - DetectionPadToSize:
        output_size: [640, 640]
    ...
    - DetectionTargetsFormatTransform:
        input_dim: [640, 640]

Solution 1: Change input_dim: [636, 636] to input_dim: [640, 640]
Pros: Can use interpolation
Cons: Lose a few thousandths of mAP

Solution 2: Leave as is
Pros: None?
Cons: This recipe would not support changing the image resolution via the command line or Colab

@BloodAxe BloodAxe marked this pull request as ready for review May 26, 2023 14:58
@shaydeci
Collaborator

Nice!
Regarding the COCO config issue you brought up - did we check how significant the mAP decrease is? I mean, it's such a tiny difference...

Also, from your command example:

dataset_params.train_dataset_params.input_dim.0=320
dataset_params.train_dataset_params.input_dim.1=320

I think the transforms that interpolate input_dim should handle single int values and turn them into [input_dim, input_dim]. I don't think we have too many of them in use tbh, and some already have this option implemented.
But... I do vaguely remember there might have been issues with overriding different data types inside Hydra...

Should we also add the interpolations for our other detection recipes while at it?

@BloodAxe
Collaborator Author

BloodAxe commented May 29, 2023

Regarding the COCO config issue you brought up - did we check how significant the mAP decrease is? I mean, it's such a tiny difference...

If we just resize + pad to 640x640, without the 636x636 step in between, we get this:

(First number is actual value, second is expected)

YOLO-NAS L
AssertionError: tensor(0.5184) != 0.5222 within 0.001 delta (tensor(0.0038) difference)

YOLO-NAS M
AssertionError: tensor(0.5113) != 0.5155 within 0.001 delta (tensor(0.0042) difference)

YOLO-NAS S
AssertionError: tensor(0.4705) != 0.475 within 0.001 delta (tensor(0.0045) difference)

@BloodAxe
Collaborator Author

Also, from your command example:
dataset_params.train_dataset_params.input_dim.0=320
dataset_params.train_dataset_params.input_dim.1=320
I think the transforms that interpolate input_dim should handle single int values and turn them into [input_dim, input_dim]. I don't think we have too many of them in use tbh, and some already have this option implemented.
But... I do vaguely remember there might have been issues with overriding different data types inside Hydra...

Should we also add the interpolations for our other detection recipes while at it?

I'm not aware of a Hydra command-line override grammar that can express a list override. None of these worked for me:

  • dataset_params.train_dataset_params.input_dim=320,320
  • dataset_params.train_dataset_params.input_dim=(320,320)
  • dataset_params.train_dataset_params.input_dim=[320,320]

I agree that:

dataset_params.train_dataset_params.input_dim.0=320
dataset_params.train_dataset_params.input_dim.1=320

Looks ugly.

I think what we can do with a single-scalar input_dim is the following:

We should allow input_dim: Union[int, Tuple[int, int]], and in our transforms we will call ensure_is_tuple_of_two to broadcast the input dim to (rows, cols) when a single scalar is given:

def __init__(self, input_dim: Union[int, Tuple[int, int]]):
    self.input_dim = ensure_is_tuple_of_two(input_dim)
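
For illustration, a minimal version of such a helper could look like this (the actual ensure_is_tuple_of_two in super-gradients may handle more cases):

from typing import Optional, Tuple, Union


def ensure_is_tuple_of_two(value: Union[int, Tuple[int, int], list, None]) -> Optional[Tuple[int, int]]:
    """Broadcast a scalar to (value, value); pass 2-element tuples/lists through; keep None as None."""
    if value is None:
        return None
    if isinstance(value, (tuple, list)):
        rows, cols = value  # raises if not exactly two elements
        return rows, cols
    return value, value


# ensure_is_tuple_of_two(320) -> (320, 320); ensure_is_tuple_of_two((320, 640)) -> (320, 640)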

Contributor

@Louis-Dupont Louis-Dupont left a comment

Concerning input_dim, there is a third option, which is to have two default params, _default_resizing_dim and _default_padding_dim, but only one public param, input_dim:

  • By default input_dim=None and we use _default_resizing_dim=636, _default_padding_dim=640.
  • When input_dim is set, it overrides both _default_resizing_dim and _default_padding_dim accordingly.

Similar logic to resizing_dim = input_dim if input_dim else _default_resizing_dim

Pro

  • Keeps our results in the default case
  • Single param to override when someone wants to set input_dim

Con

  • Increases the complexity of the recipe (and/or of a resolver, if we need one to implement this if ... else ... logic)

Not sure we want to go for that; it might not be worth it if we're fine with losing these few thousandths of mAP. But if we do care about them, I think this approach is worth considering.
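
A rough sketch of that fallback logic with a custom OmegaConf resolver (the resolver name and the _default_* keys are hypothetical, taken from the suggestion above; the values mirror the current config):

from omegaconf import OmegaConf

# Hypothetical resolver: use the override if it is set, otherwise fall back to the per-step default.
OmegaConf.register_new_resolver("first_not_none", lambda override, default: default if override is None else override)

cfg = OmegaConf.create(
    {
        "input_dim": None,                     # single public knob; None means "keep the tuned defaults"
        "_default_resizing_dim": [636, 636],   # hidden defaults from the suggestion above
        "_default_padding_dim": [640, 640],
        "resizing_dim": "${first_not_none:${input_dim},${_default_resizing_dim}}",
        "padding_dim": "${first_not_none:${input_dim},${_default_padding_dim}}",
    }
)

print(OmegaConf.to_container(cfg, resolve=True))  # resizing_dim=[636, 636], padding_dim=[640, 640] by default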

Louis-Dupont
Louis-Dupont previously approved these changes May 31, 2023
Contributor

@Louis-Dupont Louis-Dupont left a comment

LGTM

Contributor

@Louis-Dupont Louis-Dupont left a comment

LGTM

Collaborator

@shaydeci shaydeci left a comment

Feeling a bit uneasy about the fact that users won't know what happens if they mix non-primitive instantiated objects with values they wish to interpolate, but tbh I really don't have a better solution.

LGTM

@BloodAxe
Collaborator Author

BloodAxe commented Jun 1, 2023

I can think of only one case where this may go wrong:
when a user combines config-based transform declarations with already-instantiated transform objects in transforms:

dataset_params = {
    "input_dim": 512,
    "transforms": [
        DetectionRandomAffine(...),
        {"DetectionPaddedRescale": {"input_dim": "${dataset_params.train_dataset_params.input_dim}"}},
    ],
}

Frankly speaking, I'm not sure whether we should ever allow this mixing.
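
For illustration only (not part of this PR), a guard along these lines could at least make the failure explicit, since an already-instantiated transform object cannot carry an unresolved "${...}" interpolation string:

def assert_transforms_not_mixed(transforms: list) -> None:
    """Hypothetical guard: reject a transforms list that mixes dict-style configs with instantiated objects."""
    has_config_style = any(isinstance(t, dict) for t in transforms)
    has_instantiated = any(not isinstance(t, dict) for t in transforms)
    if has_config_style and has_instantiated:
        raise ValueError(
            "transforms mixes config-style (dict) entries with already-instantiated transform objects; "
            "interpolations like '${dataset_params...}' can only be resolved for the config-style entries."
        )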

@BloodAxe BloodAxe merged commit 7907c48 into master Jun 1, 2023
1 check passed
@BloodAxe BloodAxe deleted the feature/SG-815-fix-override-dataset-params branch June 1, 2023 12:34