Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add kolektor dataset #983

Merged
merged 26 commits into from
May 12, 2023
Merged

Conversation

Ravindu987
Copy link
Contributor

@Ravindu987 Ravindu987 commented Mar 26, 2023

Description

  • Work in progress

  • Added the dataset classes, dataloaders and other necessary functions to load Kolektor Dataset

  • Due to the nature of the dataset, used unconventional methods which needs to be discussed

  • Requires ImageIO and Numpy libraries

  • Fixes [Task]: Implement Kolektor dataset #964

Changes

  • Bug fix (non-breaking change which fixes an issue)
  • Refactor (non-breaking change which refactors the code base)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

Checklist

  • My code follows the pre-commit style and check guidelines of this project.
  • I have performed a self-review of my code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing tests pass locally with my changes
  • I have added a summary of my changes to the CHANGELOG (not for minor changes, docs and tests).

Copy link
Collaborator

@ashwinvaidya17 ashwinvaidya17 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the efforts! I understand this is WIP, I've still gone ahead and added a few comments. Also, can you also accress the issues raised by Codacy.

src/anomalib/data/kolektor.py Show resolved Hide resolved
transform=transform_eval,
split=Split.TEST,
root=root,
)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is missing prepare_data method. Have a look at

def prepare_data(self) -> None:
to see an example implementation

src/anomalib/data/kolektor.py Outdated Show resolved Hide resolved
src/anomalib/data/kolektor.py Show resolved Hide resolved
@Ravindu987
Copy link
Contributor Author

Hi @ashwinvaidya17
Thank you very much for your feedback.
In the latest commits I resolved three of the issues you raised.

Regarding the prepare_dataset there's an issue. The download link for the dataset doesn't provide an extension for the downloaded file. Instead it sends the extension via the response header. This works well with a browser but the urlretrieve method used in the download_and_extract method downloads the file without extension and therefore cannot extract.

We have two options to resolve this

  1. Instead of download_and_extract use a custom function to download and set the extension.
  2. Update the download_and_extract function to receive the response header and set the extension. If we choose this we'll have to update other make_dataset methods accordingly.

Please let me know your thoughts on this.

Also as I have informed @samet-akcay , I had to pause the work on this until I get a green light on the current logical implementation ( As you might have seen, I had to classify images on the runtime instead of loading from directories because of the nature of the dataset). Please let me know whether the logic is good to go so that I can finish up.

@github-actions github-actions bot added the Docs label Apr 16, 2023
@Ravindu987 Ravindu987 changed the title WIP: Add kolektor dataset Add kolektor dataset Apr 16, 2023
@Ravindu987
Copy link
Contributor Author

Hey guys
I've gone ahead and completed writing docstrings and implemented a custom function to download and extract the dataset.
Since I've tested it locally I removed the WIP tag. Please have a look.

Could you refer me a documentation on the test convention used here. Once I get an insight on this I can finish it as well.

Thanks

Copy link
Contributor

@samet-akcay samet-akcay left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @Ravindu987 for your great contribution! I have a comment regarding the use of download and extract function.

src/anomalib/data/kolektor.py Outdated Show resolved Hide resolved
Comment on lines 311 to 315
download_and_extract_kolektor(
"https://go.vicos.si/kolektorsdd",
self.root,
"2b094030343c1cd59df02203ac6c57a0",
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you define DOWNLOAD_INFO variable as given below

DOWNLOAD_INFO = DownloadInfo(
    name="kolektor",
    url="https://go.vicos.si/kolektorsdd",
    hash="2b094030343c1cd59df02203ac6c57a0",
)

Then it would be possible to use anomalib.data.download.download_and_extract function as:

download_and_extract(self.root, DOWNLOAD_INFO)

This would be inline with the rest of the datamodule implementations. For reference, here are the some of the implementations:
mvtec-ad, avenue, btech, mvtec-3d...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This links with the previous issue which makes the download_and_extract function not usable here.

@Ravindu987
Copy link
Contributor Author

Thank you very much @samet-akcay for the review. I resolved one requested change and answered your questions as well. Please let me know your thoughts and any further changes.

Copy link
Contributor

@samet-akcay samet-akcay left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm, thanks a lot for your contribution!

@github-actions github-actions bot removed the Docs label May 11, 2023
Copy link
Collaborator

@ashwinvaidya17 ashwinvaidya17 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks again for your efforts!

@Ravindu987
Copy link
Contributor Author

Thanks for the review guys.
The Codacy Static Code Analysis fails because the init function in the DataModule class has too many arguments. I see no way around this as other implementations have the same issue.
Not sure why the Tox checks are failing. When I ran all pre-merge checks locally they all passed. Any ide on this?

@samet-akcay
Copy link
Contributor

Thanks for the review guys. The Codacy Static Code Analysis fails because the init function in the DataModule class has too many arguments. I see no way around this as other implementations have the same issue. Not sure why the Tox checks are failing. When I ran all pre-merge checks locally they all passed. Any ide on this?

It's a problem with our CI that we need to fix. You don't need to do anything else at this point. Thank again for your contribution

@Ravindu987
Copy link
Contributor Author

Thanks for the review guys. The Codacy Static Code Analysis fails because the init function in the DataModule class has too many arguments. I see no way around this as other implementations have the same issue. Not sure why the Tox checks are failing. When I ran all pre-merge checks locally they all passed. Any ide on this?

It's a problem with our CI that we need to fix. You don't need to do anything else at this point. Thank again for your contribution

Noted with thanks. Hope to contribute more in the future.

@samet-akcay samet-akcay merged commit 854044d into openvinotoolkit:main May 12, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Task]: Implement Kolektor dataset
3 participants