
[Feature] Support some downstream classification datasets. #1467

Merged: 33 commits, May 5, 2023

zzc98 (Contributor) commented Apr 7, 2023

Motivation

Add eight downstream classification datasets, listed below.

| Dataset | Paper | Classes | Size (train/test) |
| --- | --- | --- | --- |
| Oxford 102 Flowers | Automated flower classification over a large number of classes | 102 | 2,040/6,149 |
| Caltech-101 | Learning Generative Visual Models from Few Training Examples: An Incremental Bayesian Approach Tested on 101 Object Categories | 102 | 3,060/6,084 |
| Oxford-IIIT Pets | Cats and Dogs | 37 | 3,680/3,669 |
| Describable Textures (DTD) | Describing Textures in the Wild | 47 | 3,760/1,880 |
| FGVC Aircraft | Fine-Grained Visual Classification of Aircraft | 100 | 6,667/3,333 |
| Stanford Cars | 3D Object Representations for Fine-Grained Categorization | 196 | 8,144/8,041 |
| SUN397 | SUN Database: Large-scale Scene Recognition from Abbey to Zoo | 397 | 19,850/19,850 |
| Food-101 | Food-101 – Mining Discriminative Components with Random Forests | 101 | 75,750/25,250 |
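
The sizes in the table can double as a quick sanity check after downloading a dataset. Below is a minimal sketch covering a few rows; the `EXPECTED` mapping and the `size_mismatch` helper are hypothetical names used for illustration and are not part of this PR.

```python
# Sizes copied from a few rows of the table: name -> (classes, train, test).
# The keys reuse the dataset class names shown in the examples of this PR.
EXPECTED = {
    "Flowers102": (102, 2040, 6149),
    "DTD": (47, 3760, 1880),
    "FGVCAircraft": (100, 6667, 3333),
    "StanfordCars": (196, 8144, 8041),
    "Food101": (101, 75750, 25250),
}


def size_mismatch(name, split, num_samples):
    """Return a message if `num_samples` disagrees with the table, else None."""
    _, train, test = EXPECTED[name]
    # Any non-test split ('train' or 'trainval') is compared to the train column.
    expected = test if split == "test" else train
    if num_samples != expected:
        return f"{name}[{split}]: got {num_samples}, table says {expected}"
    return None
```

With a loaded dataset, a call such as `size_mismatch('DTD', 'test', len(test))` would return `None` when the sample count matches the table.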

Examples

Oxford 102 Flowers
>>> from mmpretrain.datasets import Flowers102
>>> train_cfg = dict(data_root='data/Flowers102', split='trainval')
>>> train = Flowers102(**train_cfg)
>>> train
Dataset Flowers102
    Number of samples:  2040
    Root of dataset:    data/Flowers102
>>> test_cfg = dict(data_root='data/Flowers102', split='test')
>>> test = Flowers102(**test_cfg)
>>> test
Dataset Flowers102
    Number of samples:  6149
    Root of dataset:    data/Flowers102
Caltech-101
>>> from mmpretrain.datasets import Caltech101
>>> train_cfg = dict(data_root='data/Caltech', split='train')
>>> train = Caltech101(**train_cfg)
>>> train
Dataset Caltech101
    Number of samples:  3060
    Number of categories:       102
    Root of dataset:    data/Caltech
>>> test_cfg = dict(data_root='data/Caltech', split='test')
>>> test = Caltech101(**test_cfg)
>>> test
Dataset Caltech101
    Number of samples:  6728
    Number of categories:       102
    Root of dataset:    data/Caltech
Oxford-IIIT Pets
>>> from mmpretrain.datasets import OxfordIIITPet
>>> train_cfg = dict(data_root='data/Oxford-IIIT_Pets', split='trainval')
>>> train = OxfordIIITPet(**train_cfg)
>>> train
Dataset OxfordIIITPet
    Number of samples:  3680
    Number of categories:       37
    Root of dataset:    data/Oxford-IIIT_Pets
>>> test_cfg = dict(data_root='data/Oxford-IIIT_Pets', split='test')
>>> test = OxfordIIITPet(**test_cfg)
>>> test
Dataset OxfordIIITPet
    Number of samples:  3669
    Number of categories:       37
    Root of dataset:    data/Oxford-IIIT_Pets
Describable Textures (DTD)
>>> from mmpretrain.datasets import DTD
>>> train_cfg = dict(data_root='data/dtd', split='trainval')
>>> train = DTD(**train_cfg)
>>> train
Dataset DTD
    Number of samples:  3760
    Number of categories:       47
    Root of dataset:    data/dtd
>>> test_cfg = dict(data_root='data/dtd', split='test')
>>> test = DTD(**test_cfg)
>>> test
Dataset DTD
    Number of samples:  1880
    Number of categories:       47
    Root of dataset:    data/dtd
FGVC Aircraft
>>> from mmpretrain.datasets import FGVCAircraft
>>> train_cfg = dict(data_root='data/fgvc-aircraft-2013b/data', split='trainval')
>>> train = FGVCAircraft(**train_cfg)
>>> train
Dataset FGVCAircraft
    Number of samples:  6667
    Number of categories:       100
    Root of dataset:    data/fgvc-aircraft-2013b/data
>>> test_cfg = dict(data_root='data/fgvc-aircraft-2013b/data', split='test')
>>> test = FGVCAircraft(**test_cfg)
>>> test
Dataset FGVCAircraft
    Number of samples:  3333
    Number of categories:       100
    Root of dataset:    data/fgvc-aircraft-2013b/data
Stanford Cars
>>> from mmpretrain.datasets import StanfordCars
>>> train_cfg = dict(data_root='data/Stanford_Cars', split='train')
>>> train = StanfordCars(**train_cfg)
>>> train
Dataset StanfordCars
    Number of samples:  8144
    Number of categories:       196
    Root of dataset:    data/Stanford_Cars
>>> test_cfg = dict(data_root='data/Stanford_Cars', split='test')
>>> test = StanfordCars(**test_cfg)
>>> test
Dataset StanfordCars
    Number of samples:  8041
    Number of categories:       196
    Root of dataset:    data/Stanford_Cars
SUN397
>>> from mmpretrain.datasets import SUN397
>>> train_cfg = dict(data_root='data/SUN397', split='train')
>>> train = SUN397(**train_cfg)
>>> train
Dataset SUN397
    Number of samples:  19824
    Number of categories:       397
    Root of dataset:    data/SUN397
>>> test_cfg = dict(data_root='data/SUN397', split='test')
>>> test = SUN397(**test_cfg)
>>> test
Dataset SUN397
    Number of samples:  19829
    Number of categories:       397
    Root of dataset:    data/SUN397
Food-101
>>> from mmpretrain.datasets import Food101
>>> train_cfg = dict(data_root='data/food-101', split='train')
>>> train = Food101(**train_cfg)
>>> train
Dataset Food101
    Number of samples:  75750
    Number of categories:       101
    Root of dataset:    data/food-101
>>> test_cfg = dict(data_root='data/food-101', split='test')
>>> test = Food101(**test_cfg)
>>> test
Dataset Food101
    Number of samples:  25250
    Number of categories:       101
    Root of dataset:    data/food-101
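
The classes above could be referenced from an MMPretrain-style config roughly as follows. This is only a sketch: the `type`, `data_root`, and `split` values come from the Food-101 example, while the pipeline transforms, batch size, and worker count are illustrative assumptions rather than settings from this PR.

```python
# Hypothetical config fragment. Only `type`, `data_root`, and `split`
# are taken from the examples above; everything else is an assumption.
train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='RandomResizedCrop', scale=224),
    dict(type='RandomFlip', prob=0.5, direction='horizontal'),
    dict(type='PackInputs'),
]

test_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='ResizeEdge', scale=256, edge='short'),
    dict(type='CenterCrop', crop_size=224),
    dict(type='PackInputs'),
]

train_dataloader = dict(
    batch_size=32,  # assumed value
    num_workers=4,  # assumed value
    dataset=dict(
        type='Food101',
        data_root='data/food-101',
        split='train',
        pipeline=train_pipeline),
    sampler=dict(type='DefaultSampler', shuffle=True),
)

test_dataloader = dict(
    batch_size=32,  # assumed value
    num_workers=4,  # assumed value
    dataset=dict(
        type='Food101',
        data_root='data/food-101',
        split='test',
        pipeline=test_pipeline),
    sampler=dict(type='DefaultSampler', shuffle=False),
)
```

Swapping in another dataset from this PR should only require changing `type`, `data_root`, and the split names to match its example.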

Checklist

Before PR:

  • Pre-commit or other linting tools are used to fix the potential lint issues.
  • Bug fixes are fully covered by unit tests; the case that causes the bug should be added to the unit tests.
  • The modification is covered by complete unit tests. If not, please add more unit tests to ensure correctness.
  • The documentation has been modified accordingly, like docstring or example tutorials.

After PR:

  • If the modification has potential influence on downstream or other related projects, this PR should be tested with those projects, like MMDet or MMSeg.
  • CLA has been signed and all committers have signed the CLA in this PR.

codecov bot commented Apr 7, 2023

Codecov Report

Patch coverage has no change and project coverage change: +0.83 🎉

Comparison is base (c9a0cb0) 84.37% compared to head (4c12612) 85.21%.

❗ Current head 4c12612 differs from pull request most recent head ed6c9fe. Consider uploading reports for the commit ed6c9fe to get more accurate results

Additional details and impacted files
@@            Coverage Diff             @@
##              dev    #1467      +/-   ##
==========================================
+ Coverage   84.37%   85.21%   +0.83%     
==========================================
  Files         142      238      +96     
  Lines        9925    17898    +7973     
  Branches     1621     2796    +1175     
==========================================
+ Hits         8374    15251    +6877     
- Misses       1277     2130     +853     
- Partials      274      517     +243     
| Flag | Coverage Δ |
| --- | --- |
| unittests | 85.21% <ø> (+0.83%) ⬆️ |
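
The headline percentages can be reproduced from the hit, miss, and partial counts in the diff. The sketch below assumes coverage is computed as hits divided by total lines (hits + misses + partials), which is consistent with the figures reported here; `coverage_percent` is a hypothetical helper name.

```python
def coverage_percent(hits, misses, partials):
    """Coverage as hits / total lines, in percent, rounded to two decimals."""
    lines = hits + misses + partials
    return round(100 * hits / lines, 2)


# Counts copied from the "Coverage Diff" block (base, then head).
base = coverage_percent(hits=8374, misses=1277, partials=274)
head = coverage_percent(hits=15251, misses=2130, partials=517)
print(base, head)  # prints "84.37 85.21"
```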


see 380 files with indirect coverage changes


fangyixiao18 mentioned this pull request Apr 14, 2023

wangbo-zhao (Contributor) commented

The number of categories in FGVC Aircraft should be 100.

zzc98 (Contributor, Author) commented Apr 17, 2023

> The number of categories in FGVC Aircraft should be 100

The number of categories has been corrected. Thanks.

@Ezra-Yu Ezra-Yu changed the base branch from main to dev April 23, 2023 06:42

Ezra-Yu (Collaborator) left a comment

Please remove the unneeded files.

mmpretrain/datasets/oxford102flowers.py (outdated, resolved)
mmpretrain/datasets/oxfordiiitpet.py (outdated, resolved)
mmpretrain/datasets/stanford_cars.py (outdated, resolved)
mmpretrain/datasets/sun397.py (outdated, resolved)
mmpretrain/datasets/dtd.py (outdated, resolved)
mmpretrain/datasets/categories.py (outdated, resolved)

Ezra-Yu (Collaborator) left a comment

Please correct this parameter doc.

test_mode (bool): ``test_mode=True`` means in test phase.
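
One way the relationship between `split` and `test_mode` could be documented and derived is sketched below. `ToySplitDataset` is a hypothetical stand-in, not a class from this PR, and deriving `test_mode` from `split` is an assumption made purely for illustration.

```python
class ToySplitDataset:
    """Hypothetical dataset stub illustrating ``split`` / ``test_mode`` docs.

    Args:
        split (str): The dataset split, one of ``'trainval'`` or ``'test'``.
            Defaults to ``'trainval'``.

    Attributes:
        test_mode (bool): ``True`` when the dataset is used in the test phase.
    """

    SPLITS = ('trainval', 'test')

    def __init__(self, split='trainval'):
        if split not in self.SPLITS:
            raise ValueError(
                f"split must be one of {self.SPLITS}, got {split!r}")
        self.split = split
        # Assumption: only the 'test' split runs in the test phase.
        self.test_mode = split == 'test'
```

Under this sketch, callers choose a split and never pass `test_mode` directly, so the docstring only needs to describe it as a derived attribute.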

docs/en/api/datasets.rst (outdated, resolved)
mmpretrain/datasets/caltech101.py (outdated, resolved)
mmpretrain/datasets/cifar.py (outdated, resolved)
mmpretrain/datasets/cifar.py (outdated, resolved)
mmpretrain/datasets/cifar.py (resolved)
mmpretrain/datasets/fgvcaircraft.py (outdated, resolved)
mmpretrain/datasets/fgvcaircraft.py (outdated, resolved)
mmpretrain/datasets/dtd.py (outdated, resolved)
mmpretrain/datasets/flowers102.py (outdated, resolved)
mmpretrain/datasets/oxfordiiitpet.py (outdated, resolved)
mmpretrain/datasets/stanfordcars.py (outdated, resolved)

Ezra-Yu (Collaborator) left a comment

LGTM.

@fangyixiao18 fangyixiao18 merged commit 496e098 into open-mmlab:dev May 5, 2023
@zzc98 zzc98 deleted the add-cls-datasets branch May 5, 2023 10:54