autosplit: take image files with uppercase extensions into account #5269

Merged (4 commits) on Oct 20, 2021

@jdfr jdfr commented Oct 20, 2021

IMG_FORMATS contains lowercase versions of image file extensions.

Every other use of IMG_FORMATS handles image files regardless of the case of their extension. The autosplit() function does not: it silently skips files whose extension contains any uppercase letter. This is not a problem on Windows, where file name matching is case-insensitive, but it has bitten me on Linux. This commit fixes the problem.
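A minimal sketch of the case-insensitive filtering described above, for illustration only. The short IMG_FORMATS list and the find_images() helper are assumptions made for this example, not the repository's actual definitions; the idea is simply to lowercase each file's suffix before comparing it with the lowercase extension list.

```python
from pathlib import Path

# Assumed subset of the lowercase extension list; the real IMG_FORMATS is longer.
IMG_FORMATS = ["bmp", "jpg", "jpeg", "png"]


def find_images(root):
    """Return files under root whose extension matches IMG_FORMATS, ignoring case.

    Hypothetical helper: it walks the tree once and lowercases each suffix,
    so 'B.JPG' and 'c.Png' are kept alongside 'a.jpg'.
    """
    root = Path(root)
    # p.suffix includes the leading dot, so strip it before comparing.
    return sorted(p for p in root.rglob("*.*") if p.suffix[1:].lower() in IMG_FORMATS)
```

By contrast, globbing once per lowercase extension (for example `*.jpg`) relies on the platform's matching rules, which is why files such as `B.JPG` can be missed on a case-sensitive system.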

🛠️ PR Summary

Made with ❤️ by Ultralytics Actions

🌟 Summary

Refined image file filtering mechanism in the dataset loading process.

📊 Key Changes

  • Altered the file search pattern from '**/*.*' to '*.*' to adjust the scope of recursive search.
  • Synchronized the commented pathlib-based alternative with the IMG_FORMATS global.
  • Ensured the file list is always sorted, improving the consistency of data processing.

🎯 Purpose & Impact

  • 🎨 Purpose: To streamline the file discovery process, ensuring that only relevant image files are included in the dataset. The pathlib alternative is kept up-to-date for potential future use.
  • 🚀 Impact:
    • Users can expect more reliable dataset loading with a refined search that may improve load times and reduce errors.
    • Consistent sorting of files contributes to reproducibility, important for model training and evaluation tasks.
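To illustrate why a sorted file list matters for reproducibility, here is a rough sketch of a seeded split. The assign_splits() name, the default weights, and the seed are assumptions chosen for this example and are not taken from the PR; the point is only that a fixed seed gives the same assignment on every run when the input order is also fixed.

```python
import random


def assign_splits(files, weights=(0.9, 0.1, 0.0), seed=0):
    """Assign each file to 'train', 'val' or 'test' with a seeded RNG.

    Hypothetical helper: sorting first means the result does not depend
    on the order in which the filesystem happened to list the files.
    """
    names = ("train", "val", "test")
    files = sorted(files)
    rng = random.Random(seed)  # local RNG, leaves the global random state alone
    picks = rng.choices(range(len(names)), weights=weights, k=len(files))
    return {f: names[i] for f, i in zip(files, picks)}
```

Without the sort, two machines that enumerate the same directory in different orders could place the same image in different splits even with an identical seed.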

@github-actions github-actions bot left a comment

👋 Hello @jdfr, thank you for submitting a 🚀 PR! To allow your work to be integrated as seamlessly as possible, we advise you to:

  • ✅ Verify your PR is up-to-date with origin/master. If your PR is behind origin/master an automatic GitHub actions rebase may be attempted by including the /rebase command in a comment body, or by running the following code, replacing 'feature' with the name of your local branch:
git remote add upstream https://github.com/ultralytics/yolov5.git
git fetch upstream
git checkout feature  # <----- replace 'feature' with local branch name
git merge upstream/master
git push -u origin -f
  • ✅ Verify all Continuous Integration (CI) checks are passing.
  • ✅ Reduce changes to the absolute minimum required for your bug fix or feature addition. "It is not daily increase but daily decrease, hack away the unessential. The closer to the source, the less wastage there is." -Bruce Lee

@jdfr jdfr changed the title from "take image files with uppercase extensions into account in autosplit" to "autosplit: take image files with uppercase extensions into account" Oct 20, 2021
Removes additional variable (capital variable names are also only for global variables), and uses the same methodology as implemented earlier in datasets.py L409.
@glenn-jocher glenn-jocher merged commit db3bbdd into ultralytics:master Oct 20, 2021
@glenn-jocher
@jdfr PR is merged. Thank you for your contributions to YOLOv5 🚀 and Vision AI ⭐

BjarneKuehl pushed a commit to fhkiel-mlaip/yolov5 that referenced this pull request Aug 26, 2022
…ltralytics#5269)

* take image files with uppercase extensions into account in autosplit

* case fix

* Refactor implementation

Removes additional variable (capital variable names are also only for global variables), and uses the same methodology as implemented earlier in datasets.py L409.

* Remove redundant rglob characters

Co-authored-by: Glenn Jocher <glenn.jocher@ultralytics.com>