Feature/sg 000 add adapter #1471

Louis-Dupont · 2023-09-19T06:25:06Z

Introduction of Adapter logic

Why Do We Need This?

We aim to allow seamless integration with SuperGradients (SG) for users who are already using DataGradients (DG).

Solution (updated)

If the user doesn't want to set anything in advance (in which case they will be asked questions), or if they already have a cache (from DG or a previous run):

val_loader = DetectionDataloaderAdapterFactory.from_dataset(
    dataset=train_set,
    config_path="<path>",
    batch_size=10,
    drop_last=True,
)

If the user wants to set some parameters in advance, they can still use DetectionDataConfig:

adapter_config = DetectionDataConfig(
    images_extractor=lambda sample: sample[0],
    labels_extractor=voc_format_to_bbox,
    cache_path=analyzer.data_config.cache_path,
)
train_loader = DetectionDataloaderAdapterFactory.from_dataset(
    dataset=train_set,
    adapter_config=adapter_config,
    batch_size=10,
    drop_last=True,
)

Now, the same is possible if the user already has a dataloader.

val_loader = DetectionDataloaderAdapterFactory.from_dataloader(
    dataloader=train_loader,
    config_path="...",
)
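
To make the idea concrete, here is a minimal sketch of what an adapter of this kind might do at the collate step. It is not the code from this PR: `AdapterCollate`, `default_list_collate` and the sample layout are hypothetical names chosen for illustration, and plain Python lists stand in for tensors.

```python
from typing import Any, Callable, List, Sequence, Tuple


def default_list_collate(samples: Sequence[Any]) -> List[Any]:
    """Hypothetical stand-in for a real collate function: gathers samples into a list."""
    return list(samples)


class AdapterCollate:
    """Hypothetical wrapper: collates a batch, then extracts (images, labels) from each sample."""

    def __init__(
        self,
        images_extractor: Callable[[Any], Any],
        labels_extractor: Callable[[Any], Any],
        base_collate: Callable[[Sequence[Any]], List[Any]] = default_list_collate,
    ) -> None:
        self.images_extractor = images_extractor
        self.labels_extractor = labels_extractor
        self.base_collate = base_collate

    def __call__(self, samples: Sequence[Any]) -> Tuple[List[Any], List[Any]]:
        # Run the user's original collate first, then apply the extractors per sample.
        batch = self.base_collate(samples)
        images = [self.images_extractor(sample) for sample in batch]
        labels = [self.labels_extractor(sample) for sample in batch]
        return images, labels


# Assumed sample layout: (image, {"boxes": [...]}); a real dataset may differ.
samples = [("img0", {"boxes": [[0, 0, 10, 10]]}), ("img1", {"boxes": []})]
collate = AdapterCollate(
    images_extractor=lambda sample: sample[0],
    labels_extractor=lambda sample: sample[1]["boxes"],
)
images, labels = collate(samples)
```

Wrapping the collate function rather than the dataset means the same wrapper can be applied whether the user starts from a dataset or from an existing dataloader.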

Notebook
https://colab.research.google.com/drive/1clgfrdCrg5cWSl7nrjY0zVndz06SIC0P?usp=sharing

src/super_gradients/training/sg_trainer/sg_trainer.py

src/super_gradients/recipes/cityscapes_stdc_seg50.yaml

src/super_gradients/training/__init__.py

src/super_gradients/training/dataloaders/adapters.py

tests/integration_tests/data_adapter/data_adapter_on_other.py

tests/integration_tests/data_adapter/test_data_adapter_on_sg.py

requirements.txt

src/super_gradients/training/dataloaders/dataloaders.py

BloodAxe · 2023-10-05T07:04:00Z

Thanks for the Colab notebook. It is really helpful to see the intended usage.
I have a few comments regarding the notebook; nothing critical or wrong, more like suggestions:

DetectionDataloaderAdapterFactory / SegmentationDataloaderAdapterFactory - may I suggest DataAdapterForDetection / DataAdapterForSegmentation ?
The adapter_cache_path argument name may be a bit confusing to users unfamiliar with the internals; one may assume we will store some data samples in that location. Perhaps renaming this argument to "config_path" would be clearer to the user.
As you can see, the labels are normalized (0-1). This is all right, but it is not the format expected by SuperGradients.
Actually, this was a ❓❓❓ moment for me: why does that dataset also normalize masks, and how and why does SG denormalize them? I lost track of why we modify targets. Maybe this should have been explained a bit better. Usually in segmentation datasets we don't normalize targets, so maybe the notebook should show a more common use case?
For the questions that have a fixed number of options to choose from, perhaps we can show some love to the user and heuristically pick a default option (which will be chosen if the user just hits Enter):

--------------------------------------------------------------------------------
Which dimension corresponds the image channel? 
--------------------------------------------------------------------------------
Image shape: torch.Size([1, 3, 512, 512])
Options:
[0] | 0
[1] | 1 [Suggested value, Enter to accept]
[2] | 2
[3] | 3

Sure, heuristics may not handle ALL use cases but would certainly save some time in 90% of the cases.
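
One possible shape for such a heuristic, as a rough sketch only: `suggest_channel_dim` and `resolve_choice` are hypothetical helpers, not functions from SG or DG, and the rule (prefer a unique dimension of size 3, then a unique dimension of size 1) is just an assumption about what "looks like" a channel axis.

```python
from typing import Optional, Sequence


def suggest_channel_dim(shape: Sequence[int]) -> Optional[int]:
    """Guess the channel dimension: a unique dim of size 3 (RGB), else a unique dim of size 1."""
    for channel_size in (3, 1):
        candidates = [dim for dim, size in enumerate(shape) if size == channel_size]
        if len(candidates) == 1:
            return candidates[0]
    # Ambiguous or no match: do not suggest anything, the user has to choose explicitly.
    return None


def resolve_choice(raw_answer: str, n_options: int, default: Optional[int]) -> int:
    """Turn the user's raw input into an option index; an empty answer accepts the default."""
    answer = raw_answer.strip()
    if answer == "":
        if default is None:
            raise ValueError("No default available, an explicit choice is required.")
        return default
    choice = int(answer)
    if not 0 <= choice < n_options:
        raise ValueError(f"Choice must be in [0, {n_options - 1}], got {choice}.")
    return choice


# The shape from the prompt above: torch.Size([1, 3, 512, 512]) written as a plain tuple.
suggested = suggest_channel_dim((1, 3, 512, 512))
accepted = resolve_choice("", n_options=4, default=suggested)
```

Returning `None` for ambiguous shapes keeps the current behaviour (an explicit question) as the fallback.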

yolonas_recipe = load_recipe(config_name="coco2017_yolo_nas_s", overrides=[f"arch_params.num_classes={20}", "training_hyperparams.max_epochs=1", "training_hyperparams.mixed_precision=False"])

As we've already asked the user about the number of classes when creating the adapter dataloader, we should be able to pull this information from that instance:

yolonas_recipe = load_recipe(config_name="coco2017_yolo_nas_s", overrides=[f"arch_params.num_classes={train_loader.adapter_config.num_classes}", "training_hyperparams.max_epochs=1", "training_hyperparams.mixed_precision=False"])
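
If the override list grows, building it from a dict may keep it readable. This is only a sketch: `build_overrides` is a hypothetical helper, and `num_classes` is a placeholder for whatever value the adapter config exposes.

```python
from typing import Dict, List, Union


def build_overrides(values: Dict[str, Union[int, float, bool, str]]) -> List[str]:
    """Format a mapping as the "key=value" strings used in the overrides list."""
    return [f"{key}={value}" for key, value in values.items()]


num_classes = 20  # placeholder; in practice this would be read from the adapter config
overrides = build_overrides(
    {
        "arch_params.num_classes": num_classes,
        "training_hyperparams.max_epochs": 1,
        "training_hyperparams.mixed_precision": False,
    }
)
```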

Louis-Dupont · 2023-10-05T10:17:15Z

DetectionDataloaderAdapterFactory / SegmentationDataloaderAdapterFactory - may I suggest DataAdapterForDetection / DataAdapterForSegmentation ?

Not fundamentally against the change, but I have a small concern: with DataAdapterForDetection.from_dataset(dataset=...), it's not clear that this generates a dataloader. DetectionDataloaderAdapterFactory was not 100% explicit, but the name already makes it a bit more intuitive.

We could change the method name to convey this idea, for instance DataAdapterForDetection.build_dataloader_from_dataset, but I can't find a good name. Here build_dataloader_from_dataset sounds bad, especially when we think about the from_dataloader version, which would be build_dataloader_from_dataloader.

Ultimately, my point is that I would like the name (class and method combined) to convey that this builds a "dataloader which adapts X for a task Y", where X is a dataset or a dataloader.
Any ideas for better naming? Or do you think it's fine not to refer to the dataloader?
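
For reference, a toy sketch of the naming pattern being discussed. None of these classes are the ones in this PR: `ToyLoader` and the `Toy...Factory` classes are hypothetical, and batching over a plain iterable stands in for a real dataloader. The point is only that the class name plus the return annotation can carry the "builds a dataloader" information, while `from_dataset` / `from_dataloader` stay short.

```python
from typing import Any, Iterable, Iterator, List


class ToyLoader:
    """Hypothetical stand-in for a dataloader: yields fixed-size batches from an iterable."""

    def __init__(self, source: Iterable[Any], batch_size: int, task: str) -> None:
        self.source = source
        self.batch_size = batch_size
        self.task = task

    def __iter__(self) -> Iterator[List[Any]]:
        batch: List[Any] = []
        for item in self.source:
            batch.append(item)
            if len(batch) == self.batch_size:
                yield batch
                batch = []
        if batch:  # keep the last, possibly smaller, batch
            yield batch


class ToyDetectionDataloaderAdapterFactory:
    """Hypothetical factory: both constructors return a loader adapted for `task`."""

    task = "detection"

    @classmethod
    def from_dataset(cls, dataset: Iterable[Any], batch_size: int = 1) -> ToyLoader:
        return ToyLoader(dataset, batch_size=batch_size, task=cls.task)

    @classmethod
    def from_dataloader(cls, dataloader: ToyLoader) -> ToyLoader:
        # Reuse the existing loader's source and batch size, only the task changes.
        return ToyLoader(dataloader.source, batch_size=dataloader.batch_size, task=cls.task)


class ToySegmentationDataloaderAdapterFactory(ToyDetectionDataloaderAdapterFactory):
    task = "segmentation"


det_loader = ToyDetectionDataloaderAdapterFactory.from_dataset([1, 2, 3], batch_size=2)
seg_loader = ToySegmentationDataloaderAdapterFactory.from_dataloader(det_loader)
```

Because the constructors are classmethods, the segmentation subclass inherits both of them and only overrides `task`.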

Louis-Dupont · 2023-10-05T10:19:21Z

The adapter_cache_path argument name may be a bit confusing to users unfamiliar with the internals; one may assume we will store some data samples in that location. Perhaps renaming this argument to "config_path" would be clearer to the user.

Yeah, I wasn't too sure whether to simplify or to be more explicit. But I think you're right: with the context it is (relatively) clear that the config_path is that of the current object (i.e. the adapter).

I changed both adapter_config -> config and adapter_cache_path -> config_path in 024da0c

Louis-Dupont · 2023-10-05T10:21:09Z

For the questions that have a fixed number of options to choose from, perhaps we can show some love to the user and heuristically pick a default option (which will be chosen if the user just hits Enter):

That's an interesting idea. I will create a task for this, and maybe extend the idea to other questions (if possible/useful).

Louis-Dupont · 2023-10-05T10:21:39Z

As we've already asked the user about the number of classes when creating the adapter dataloader, we should be able to pull this information from that instance:

yolonas_recipe = load_recipe(config_name="coco2017_yolo_nas_s", overrides=[f"arch_params.num_classes={train_loader.adapter_config.num_classes}", "training_hyperparams.max_epochs=1", "training_hyperparams.mixed_precision=False"])

Good point, I'll change that.

Louis-Dupont · 2023-10-13T10:02:16Z

Note: the build fails because we have not yet released the latest DG version.
We first need to release DG, and then update requirements.txt.

BloodAxe

LGTM

BloodAxe

LGTM

Louis-Dupont added 13 commits September 12, 2023 09:44

wip

1aafa67

somewhat working version - still need test + dd

035b3b3

wip

9c1bff6

first NICE draft for collate fn - still miss indepth testing

b669251

Merge branch 'master' into feature/SG-000-add_adapter_v2

34bf7f5

add tests and make it work for collate_fn - yaml and DDP remaining

2b37b0b

seemingly working version - still require yaml and ddp test in-depth

0438a86

remove __target__

0f83543

undo

2cf4167

wip

62e3880

Merge branch 'master' into feature/SG-000-add_adapter_v2

6b08948

add docstrings

6b52803

fix test

7122810

Louis-Dupont changed the title ~~Feature/sg 000 add adapter v2~~ Feature/sg 000 add adapter Sep 19, 2023

Louis-Dupont added 10 commits September 19, 2023 11:02

add check for double wrapping

df4a42e

fix logic

f097fdb

fix doc

12eb14d

Merge branch 'hotfix/SG-000-remove___target___for_collate_fn' into fe…

dadece1

…ature/SG-000-add_adapter_v2

refine

9627742

add manual tests

58a603b

add tests and fix collates

4f4ccf8

uncomment test

7c696c7

Merge branch 'master' into feature/SG-000-add_adapter_v2

953fe81

deprecate moved collate_fn

02f3fe7

Louis-Dupont marked this pull request as ready for review September 20, 2023 11:06

Louis-Dupont requested review from shaydeci, ofrimasad and BloodAxe as code owners September 20, 2023 11:06

Louis-Dupont commented Sep 20, 2023

src/super_gradients/training/sg_trainer/sg_trainer.py Outdated

BloodAxe reviewed Sep 20, 2023

src/super_gradients/recipes/cityscapes_stdc_seg50.yaml Outdated

Louis-Dupont added 3 commits October 4, 2023 10:13

update log

7aa7e21

update docstring

51cfb6b

use classmethod instead of staticmethod

a0cf771