Add Custom Dataset Training Support #154

samet-akcay · 2022-03-22T07:20:33Z

Description

This PR adds custom dataset support.
Fixes #147, "Support training with custom MVTec like dataset but without masks (ground truths)".

Changes

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to not work as expected)
This change requires a documentation update

Checklist

My code follows the pre-commit style and check guidelines of this project.
I have performed a self-review of my code
I have commented my code, particularly in hard-to-understand areas
I have made corresponding changes to the documentation
My changes generate no new warnings
I have added tests that prove my fix is effective or that my feature works
New and existing tests pass locally with my changes


djdameln

Thanks, great addition! I haven't manually tested the custom dataset format yet, but I will, and I'll post here if I run into any issues.

djdameln · 2022-03-22T12:22:05Z

README.md

+  task: segmentation # classification or segmentation
+  mask: <path/to/mask/annotations> #optional
+  extensions: null
+  split_ratio: 0.2


Maybe add comments to the parameters here that may be hard to understand, e.g.:
split_ratio: 0.2 # ratio of the normal images that will be used to create a test split
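To illustrate the suggested comment, here is a minimal sketch of what a `split_ratio` of 0.2 would mean under that reading: a fraction of the normal images is held out as a test split. The helper name `split_normal_images`, the `seed` parameter, and the shuffling strategy are assumptions for illustration, not the code in this PR.

```python
import random
from typing import List, Tuple


def split_normal_images(
    normal_paths: List[str], split_ratio: float = 0.2, seed: int = 0
) -> Tuple[List[str], List[str]]:
    """Hold out a fraction of the normal images as a test split (hypothetical helper)."""
    if not 0.0 <= split_ratio <= 1.0:
        raise ValueError("split_ratio must be between 0 and 1")

    # Sort first so the result does not depend on the input order, then
    # shuffle with a fixed seed so the split is reproducible.
    shuffled = sorted(normal_paths)
    random.Random(seed).shuffle(shuffled)

    # The first `num_test` shuffled images become normal test images;
    # the rest stay in the training set.
    num_test = int(len(shuffled) * split_ratio)
    return shuffled[num_test:], shuffled[:num_test]


if __name__ == "__main__":
    paths = [f"normal/{index:03d}.png" for index in range(10)]
    train, test = split_normal_images(paths, split_ratio=0.2)
    print(len(train), len(test))
```

With ten normal images and `split_ratio: 0.2`, two images end up in the test split and eight remain for training.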

djdameln · 2022-03-22T12:29:49Z

README.md

+It is also possible to train on a custom dataset. To do so, `data` section in `config.yaml` is to be modified as follows:
+```yaml
+dataset:
+  name: custom


Maybe we should use format here instead of name. For MVTec we also have a format field in addition to name. The way I see it, format determines which dataset class is used under the hood, while name can be anything that identifies the specific dataset that is used.
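The distinction between `format` and `name` can be sketched as a small dispatch table: `format` picks the loader, while `name` is only a label passed through. The format values `"mvtec"` and `"folder"`, the loader functions, and `get_loader` are all hypothetical names chosen for illustration; they are not the library's API.

```python
from typing import Callable, Dict


def load_mvtec(dataset_config: dict) -> str:
    """Stand-in for a loader that expects the MVTec layout (hypothetical)."""
    return f"MVTec-style loader for '{dataset_config['name']}'"


def load_folder(dataset_config: dict) -> str:
    """Stand-in for a loader that expects a fixed folder layout (hypothetical)."""
    return f"folder-style loader for '{dataset_config['name']}'"


# `format` decides which loader is used; `name` never takes part in the lookup.
LOADERS: Dict[str, Callable[[dict], str]] = {
    "mvtec": load_mvtec,
    "folder": load_folder,
}


def get_loader(dataset_config: dict) -> str:
    dataset_format = dataset_config["format"].lower()
    if dataset_format not in LOADERS:
        raise ValueError(f"Unknown dataset format: {dataset_format}")
    return LOADERS[dataset_format](dataset_config)


if __name__ == "__main__":
    print(get_loader({"format": "folder", "name": "my_bottles"}))
```

Under this design a user can rename the dataset freely without changing which class handles it.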

djdameln · 2022-03-22T12:32:06Z

anomalib/config/config.py

@@ -177,7 +177,8 @@ def get_configurable_parameters(
    config = update_input_size_config(config)

    # Project Configs
-    project_path = Path(config.project.path) / config.model.name / config.dataset.name / config.dataset.category
+    category = config.dataset.category if "category" in config.dataset.keys() else ""


Maybe it would be clearer to check the dataset type here and only add the category to the path if the type is MVTec.
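A minimal sketch of that suggestion, using a plain nested dict in place of the real config object. The `format` key used to identify the dataset type and the function name `build_project_path` are assumptions; only the path components (`project.path`, `model.name`, `dataset.name`, `dataset.category`) come from the diff above.

```python
from pathlib import Path


def build_project_path(config: dict) -> Path:
    """Build the project output path, adding the category only for MVTec (sketch)."""
    dataset = config["dataset"]
    project_path = Path(config["project"]["path"]) / config["model"]["name"] / dataset["name"]

    # Assumption: the dataset type is stored under a `format` key.
    # Only MVTec-type datasets get a category sub-folder.
    if dataset.get("format", "").lower() == "mvtec":
        project_path = project_path / dataset["category"]
    return project_path


if __name__ == "__main__":
    mvtec_config = {
        "project": {"path": "results"},
        "model": {"name": "padim"},
        "dataset": {"name": "mvtec", "format": "mvtec", "category": "bottle"},
    }
    print(build_project_path(mvtec_config))
```

Compared with checking whether a `category` key exists, the explicit type check makes the intent visible at the point where the path is built.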

djdameln · 2022-03-22T12:36:10Z

anomalib/data/custom.py

+    return samples
+
+
+class CustomDataset(Dataset):


I'm not sure about the naming. Maybe FolderDataset would be more appropriate? Custom sounds a bit like users can choose their own 'custom' format. But this class represents a dataset that follows a fixed format based on the folder structure of the data.

djdameln · 2022-03-22T12:40:10Z

anomalib/data/custom.py

+            The dataset expects that mask annotation filenames must be same as the original filename.
+            To show an example, we therefore need to modify the mask filenames in MVTec dataset.
+
+            >>> # Rename MVTec mask annotations so that they are the same as image filenames


I'm afraid the example in the docstring might confuse users (why use the custom dataset class for MVTec when there is a dataset class specific to MVTec?). Maybe we could keep it simple: start the example with the assumption that the user has a folder of normal images and a folder of abnormal images, and state this explicitly at the beginning of the example.
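The simpler starting point described here could look roughly like the sketch below: one folder of normal images, one folder of abnormal images, and an optional folder of masks whose filenames match the abnormal image filenames. The function name `make_folder_samples`, the tuple layout, and the extension list are assumptions for illustration and are not taken from `anomalib/data/custom.py`.

```python
from pathlib import Path
from typing import List, Optional, Tuple, Union

# Assumed set of image extensions; the real class may accept others.
IMAGE_EXTENSIONS = (".png", ".jpg", ".jpeg", ".bmp")

Sample = Tuple[str, int, Optional[str]]


def make_folder_samples(
    normal_dir: Union[str, Path],
    abnormal_dir: Union[str, Path],
    mask_dir: Optional[Union[str, Path]] = None,
) -> List[Sample]:
    """Collect (image_path, label, mask_path) tuples from two folders (sketch)."""
    samples: List[Sample] = []
    # Label 0 marks normal images, label 1 marks abnormal images.
    for label, directory in ((0, Path(normal_dir)), (1, Path(abnormal_dir))):
        for image_path in sorted(directory.iterdir()):
            if image_path.suffix.lower() not in IMAGE_EXTENSIONS:
                continue
            mask_path = None
            if label == 1 and mask_dir is not None:
                # The mask is expected to share the image's filename.
                candidate = Path(mask_dir) / image_path.name
                if candidate.exists():
                    mask_path = str(candidate)
            samples.append((str(image_path), label, mask_path))
    return samples
```

Calling `make_folder_samples("data/good", "data/defect")` without a mask folder yields samples suitable for classification only, which matches the mask-less use case raised in the linked issue; the directory names here are placeholders.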


djdameln

Thanks!

samet-akcay added 22 commits February 24, 2022 04:24

renamed download-progress-bar as download

f175a24

added new download functions to init

f841f51

Added Btech data module

12cd8ee

Added btech tests

7bc453f

Move split functions into a util module

3a32443

Modified mvtec

132ceb1

added btech to get-datamodule

907281f

fix typo in btech docstring

16de223

update docstring

c2353db

cleanedup dataset download utils

287c974

Address mypy

df8b655

modify config files and update readme.md

966ad94

Fix dataset path

97d98fa

Merge branch 'development' into feature/data/btad

1e78a31

Resolved merge conflicts

f6cba9a

Merge branch 'feature/data/btad' of github.com:openvinotoolkit/anomal…

9513723

…ib into feature/data/btad

WiP: Created make_dataset function

b71f4d3

Renamed folder dataset into custom

28f7d3e

Added custom dataset tests

83c1384

updated config.yaml file to show custom dataset is available

09908b0

Added custom dataset to get_datamodule

215df46

Resolve merge conflicts

ee12a7a

samet-akcay requested a review from djdameln March 22, 2022 07:20

samet-akcay mentioned this pull request Mar 22, 2022

How to train in a custom dataset? #109

Closed

samet-akcay changed the title ~~Feature/data/custom dataset~~ Add Custom Dataset Training Support Mar 22, 2022

djdameln reviewed Mar 22, 2022


samet-akcay added 4 commits March 23, 2022 06:03

Address PR comments

cf22594

Merge branch 'development' of github.com:openvinotoolkit/anomalib int…

8b827d4

…o feature/data/custom-dataset

Merge branch 'development' of github.com:openvinotoolkit/anomalib int…

2d24d16

…o feature/data/custom-dataset

fix dataset path

6646c3b

samet-akcay added 4 commits March 24, 2022 01:13

Debugging the ci

b3cf100

Fixed folder dataset tests

00e8020

Added code quality checks back to the ci

8e47bd3

Added code coverage back to pre-merge tests

314b164

djdameln approved these changes Mar 24, 2022


samet-akcay merged commit b03fb32 into development Mar 24, 2022

samet-akcay deleted the feature/data/custom-dataset branch March 24, 2022 10:44
