Feature/sg 000 add adapter #1471

Merged
merged 64 commits into from
Oct 13, 2023

@Louis-Dupont Louis-Dupont commented Sep 19, 2023

Introduction of Adapter logic

Why Do We Need This?

We aim to allow seamless integration with SG for users who are already using DG.

Solution (updated)

If the user does not want to set anything in advance (in which case they will be asked questions), or if they already have a cache (from DG or a previous run):

val_loader = DetectionDataloaderAdapterFactory.from_dataset(
    dataset=train_set,
    config_path="<path>",
    batch_size=10,
    drop_last=True,
)
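To illustrate the "ask once, then reuse the cache" behaviour described above, here is a minimal sketch. It is not the DG or SG implementation: `load_or_ask`, the JSON file layout, and the question keys are all hypothetical names chosen for illustration.

```python
import json
import os
import tempfile


def load_or_ask(config_path, questions, ask=input):
    """Return cached answers from `config_path` if present, else ask and cache them.

    Hypothetical helper: `questions` maps an answer key to the prompt shown to the user.
    """
    if os.path.exists(config_path):
        with open(config_path) as f:
            return json.load(f)  # cache hit: no questions are asked

    # Cache miss: ask each question once, then persist the answers for the next run.
    answers = {key: ask(prompt) for key, prompt in questions.items()}
    with open(config_path, "w") as f:
        json.dump(answers, f, indent=2)
    return answers


if __name__ == "__main__":
    questions = {"n_classes": "How many classes? "}
    path = os.path.join(tempfile.mkdtemp(), "adapter_config.json")
    first = load_or_ask(path, questions, ask=lambda prompt: "20")   # answered once
    second = load_or_ask(path, questions, ask=lambda prompt: "99")  # read from the cache
    print(first == second)
```

Under this assumption, passing the same path on a later run would skip the questions entirely.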

If the user wants to set some parameters in advance, they can still use DetectionDataConfig:

adapter_config = DetectionDataConfig(
    images_extractor=lambda sample: sample[0],
    labels_extractor=voc_format_to_bbox,
    cache_path=analyzer.data_config.cache_path,
)
train_loader = DetectionDataloaderAdapterFactory.from_dataset(
    dataset=train_set,
    adapter_config=adapter_config,
    batch_size=10,
    drop_last=True,
)
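The snippet above passes two extractor callables without showing `voc_format_to_bbox`. As a rough sketch of what such callables might look like, assuming each sample is an `(image, annotation)` pair with a torchvision-style VOC annotation dict, and assuming a `[class_id, x1, y1, x2, y2]` row layout: the function names, the class list, and the output layout are illustrative assumptions, not the PR's actual helper.

```python
def first_item(sample):
    """Images extractor: assumes the sample is an (image, annotation) pair."""
    return sample[0]


def voc_like_to_bbox(sample, class_names=("cat", "dog")):
    """Labels extractor: VOC-style annotation dict -> [class_id, x1, y1, x2, y2] rows.

    Hypothetical stand-in for `voc_format_to_bbox`; `class_names` is a toy class list.
    """
    _, annotation = sample
    rows = []
    for obj in annotation["annotation"]["object"]:
        box = obj["bndbox"]
        rows.append([
            class_names.index(obj["name"]),  # class name -> integer id
            int(box["xmin"]), int(box["ymin"]),
            int(box["xmax"]), int(box["ymax"]),
        ])
    return rows


if __name__ == "__main__":
    sample = ("<image>", {"annotation": {"object": [
        {"name": "dog", "bndbox": {"xmin": "1", "ymin": "2", "xmax": "3", "ymax": "4"}},
    ]}})
    print(first_item(sample), voc_like_to_bbox(sample))
```

A real extractor would likely return a tensor or array rather than nested lists; plain lists keep the sketch dependency-free.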

Now, the same is possible if the user already has a dataloader.

val_loader = DetectionDataloaderAdapterFactory.from_dataloader(
    dataloader=train_loader,
    config_path="...",
)

Notebook
https://colab.research.google.com/drive/1clgfrdCrg5cWSl7nrjY0zVndz06SIC0P?usp=sharing

@Louis-Dupont Louis-Dupont changed the title Feature/sg 000 add adapter v2 Feature/sg 000 add adapter Sep 19, 2023
@Louis-Dupont Louis-Dupont marked this pull request as ready for review September 20, 2023 11:06
@BloodAxe
Collaborator

BloodAxe commented Oct 5, 2023

Thanks for the Colab notebook. It's really helpful to see the intended usage.
I have a few comments regarding the notebook. Nothing critical or wrong, more like suggestions:

  • DetectionDataloaderAdapterFactory / SegmentationDataloaderAdapterFactory - may I suggest DataAdapterForDetection / DataAdapterForSegmentation ?

  • The adapter_cache_path argument name may be a bit confusing to users unfamiliar with the internals, and one may assume we will store some data samples in that location. Perhaps renaming this argument to "config_path" would be clearer to the user.

  • As you can see, the labels are normalized (0-1). This is all right, but it is not the format expected by SuperGradients.
    This was actually a ❓❓❓ moment for me: why does that dataset also normalize masks, and how and why does SG denormalize them? I lost track of why we modify the targets. Maybe this should have been explained a bit better. Usually we don't normalize targets in segmentation datasets, so maybe the notebook should show a more common use case?

  • For the questions that have a fixed number of options to choose from, perhaps we can show some love to the user and heuristically pick a default option (which will be chosen if the user just hits Enter):

--------------------------------------------------------------------------------
Which dimension corresponds the image channel? 
--------------------------------------------------------------------------------
Image shape: torch.Size([1, 3, 512, 512])
Options:
[0] | 0
[1] | 1 [Suggested value, Enter to accept]
[2] | 2
[3] | 3

Sure, heuristics may not handle ALL use cases but would certainly save some time in 90% of the cases.

  • yolonas_recipe = load_recipe(config_name="coco2017_yolo_nas_s", overrides=[f"arch_params.num_classes={20}", "training_hyperparams.max_epochs=1", "training_hyperparams.mixed_precision=False"])

As we've already asked the user about the number of classes when creating the adapter data loader, we should be able to pull this information from that instance:

yolonas_recipe = load_recipe(config_name="coco2017_yolo_nas_s", overrides=[f"arch_params.num_classes={train_loader.adapter_config.num_classes}", "training_hyperparams.max_epochs=1", "training_hyperparams.mixed_precision=False"])

@Louis-Dupont
Contributor Author

DetectionDataloaderAdapterFactory / SegmentationDataloaderAdapterFactory - may I suggest DataAdapterForDetection / DataAdapterForSegmentation ?

I'm not fundamentally against the change, but I have a small concern: with DataAdapterForDetection.from_dataset(dataset=...), it's not clear that this generates a dataloader. DetectionDataloaderAdapterFactory was not 100% explicit either, but the name already makes it a bit more intuitive.

We could change the method name to convey this idea, for instance DataAdapterForDetection.build_dataloader_from_dataset, but I can't find a good name. build_dataloader_from_dataset sounds bad, especially when we think about the from_dataloader version, which would be build_dataloader_from_dataloader.

In the end, my point is that I would like the name (class and method combined) to convey that this builds a "dataloader which adapts X for a task Y", where X is a dataset or a dataloader.
Any ideas for better naming? Or do you think it's fine not to refer to the dataloader?

@Louis-Dupont
Contributor Author

Louis-Dupont commented Oct 5, 2023

The adapter_cache_path argument name may be a bit confusing to users unfamiliar with the internals, and one may assume we will store some data samples in that location. Perhaps renaming this argument to "config_path" would be clearer to the user.

Yeah, I wasn't too sure whether to simplify or to be more explicit. But I think you're right: with the context, it is (relatively) clear that config_path is that of the current object (i.e. the adapter).

I renamed both adapter_config -> config and adapter_cache_path -> config_path in commit 024da0c.

@Louis-Dupont
Contributor Author

For the questions that have a fixed number of options to choose from, perhaps we can show some love to the user and heuristically pick a default option (which will be chosen if the user just hits Enter):

That's an interesting idea. I will create a task for this, and maybe extend the idea to other questions (if possible/useful).
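For reference, the suggested-default idea could be sketched roughly as follows. This is only an illustration under stated assumptions: `suggest_channel_dim` and `ask_option` are hypothetical names, and the heuristic (skip the batch dimension of a 4D shape, then prefer a dimension of size 3, 1, or 4) is a guess at what "heuristically pick a default" could mean, not anything implemented in this PR.

```python
def suggest_channel_dim(shape, common_channel_counts=(3, 1, 4)):
    """Guess which dimension holds the image channels, or return None if unsure."""
    # Assumption: a 4D shape is batched, so dimension 0 is skipped.
    start = 1 if len(shape) == 4 else 0
    for count in common_channel_counts:  # prefer RGB, then grayscale, then RGBA
        for dim in range(start, len(shape)):
            if shape[dim] == count:
                return dim
    return None


def ask_option(question, options, default=None, input_fn=input):
    """Ask the user to pick an option; an empty answer accepts `default` if set."""
    print(question)
    for i, option in enumerate(options):
        hint = " [Suggested value, Enter to accept]" if i == default else ""
        print(f"[{i}] | {option}{hint}")
    while True:
        answer = input_fn("Your selection: ").strip()
        if answer == "" and default is not None:
            return options[default]
        if answer.isdecimal() and int(answer) < len(options):
            return options[int(answer)]
        print("Invalid selection, please try again.")


if __name__ == "__main__":
    shape = (1, 3, 512, 512)
    default = suggest_channel_dim(shape)
    dims = list(range(len(shape)))
    # Simulate the user pressing Enter so the sketch runs non-interactively.
    chosen = ask_option("Which dimension corresponds the image channel?", dims,
                        default=default, input_fn=lambda prompt: "")
    print("Chosen dimension:", chosen)
```

When the heuristic returns None, no option is marked as suggested and the user has to answer explicitly.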

@Louis-Dupont
Contributor Author

As we've already asked the user about the number of classes when creating the adapter data loader, we should be able to pull this information from that instance:

yolonas_recipe = load_recipe(config_name="coco2017_yolo_nas_s", overrides=[f"arch_params.num_classes={train_loader.adapter_config.num_classes}", "training_hyperparams.max_epochs=1", "training_hyperparams.mixed_precision=False"])

Good point, I'll change that.

@Louis-Dupont
Contributor Author

Note: the build fails because we have not yet released the latest DG version.
We first need to release DG, and then update requirements.txt.

Collaborator

@BloodAxe BloodAxe left a comment

LGTM

Collaborator

@BloodAxe BloodAxe left a comment

LGTM

@BloodAxe BloodAxe merged commit 38b5a3f into master Oct 13, 2023
6 of 7 checks passed
@BloodAxe BloodAxe deleted the feature/SG-000-add_adapter_v2 branch October 13, 2023 11:29

2 participants