Celery logger stops working using torch.hub.load #6060

Closed
1 of 2 tasks
arjitkatare opened this issue Dec 22, 2021 · 9 comments · Fixed by #7296
Labels
bug Something isn't working Stale

Comments

@arjitkatare

Search before asking

  • I have searched the YOLOv5 issues and found no similar bug report.

YOLOv5 Component

PyTorch Hub

Bug

For some reason, when torch.hub.load is used in a Celery worker, the logger shuts down. Specifying v6.0 in repo_or_dir seems to resolve the issue.

Environment

No response

Minimal Reproducible Example

No response

Additional

No response

Are you willing to submit a PR?

  • Yes I'd like to help by submitting a PR!
@arjitkatare arjitkatare added the bug Something isn't working label Dec 22, 2021
@github-actions
Contributor

github-actions bot commented Dec 22, 2021

👋 Hello @arjitkatare, thank you for your interest in YOLOv5 🚀! Please visit our ⭐️ Tutorials to get started, where you can find quickstart guides for simple tasks like Custom Data Training all the way to advanced concepts like Hyperparameter Evolution.

If this is a 🐛 Bug Report, please provide screenshots and minimum viable code to reproduce your issue, otherwise we can not help you.

If this is a custom training ❓ Question, please provide as much information as possible, including dataset images, training logs, screenshots, and a public link to online W&B logging if available.

For business inquiries or professional support requests please visit https://ultralytics.com or email Glenn Jocher at glenn.jocher@ultralytics.com.

Requirements

Python>=3.6.0 with all requirements.txt installed including PyTorch>=1.7. To get started:

$ git clone https://github.com/ultralytics/yolov5
$ cd yolov5
$ pip install -r requirements.txt

Environments

YOLOv5 may be run in any of the following up-to-date verified environments (with all dependencies including CUDA/CUDNN, Python and PyTorch preinstalled):

Status

CI CPU testing

If this badge is green, all YOLOv5 GitHub Actions Continuous Integration (CI) tests are currently passing. CI tests verify correct operation of YOLOv5 training (train.py), validation (val.py), inference (detect.py) and export (export.py) on MacOS, Windows, and Ubuntu every 24 hours and on every commit.

@glenn-jocher
Member

glenn-jocher commented Dec 23, 2021

@arjitkatare 👋 hi, thanks for letting us know about this possible problem with YOLOv5 🚀. We've created a few short guidelines below to help users provide what we need in order to get started investigating a possible problem.

How to create a Minimal, Reproducible Example

When asking a question, people will be better able to provide help if you provide code that they can easily understand and use to reproduce the problem. This is referred to by community members as creating a minimum reproducible example. Your code that reproduces the problem should be:

  • Minimal – Use as little code as possible to produce the problem
  • Complete – Provide all parts someone else needs to reproduce the problem
  • Reproducible – Test the code you're about to provide to make sure it reproduces the problem

For Ultralytics to provide assistance your code should also be:

  • Current – Verify that your code is up-to-date with GitHub master, and if necessary git pull or git clone a new copy to ensure your problem has not already been solved in master.
  • Unmodified – Your problem must be reproducible using official YOLOv5 code without changes. Ultralytics does not provide support for custom code ⚠️.

If you believe your problem meets all the above criteria, please close this issue and raise a new one using the 🐛 Bug Report template with a minimum reproducible example to help us better understand and diagnose your problem.

Thank you! 😃

@github-actions
Contributor

github-actions bot commented Jan 23, 2022

👋 Hello, this issue has been automatically marked as stale because it has not had recent activity. Please note it will be closed if no further activity occurs.

Feel free to inform us of any other issues you discover or feature requests that come to mind in the future. Pull Requests (PRs) are also always welcomed!

Thank you for your contributions to YOLOv5 🚀 and Vision AI ⭐!

@JonathanSamelson
Contributor

JonathanSamelson commented Mar 11, 2022

I ran into the same issue with my own loggers after updating PyTorch to 1.10 (from 1.7.1).

The problem comes from this line:
self.net = torch.hub.load('ultralytics/yolov5', 'custom', path=self.model_path, verbose=False)

In this case, the output of my script is as follows:

2022-03-11 17:42:03 pluton aptitude-toolbox[22744] INFO Model type YOLO selected.
YOLOv5  2022-1-12 torch 1.10.2+cu113 CUDA:0 (GeForce GTX 1080 Ti, 11264MiB)

Fusing layers...
Model Summary: 213 layers, 7039792 parameters, 0 gradients
Adding AutoShape...
Detector init duration = 4.6338949000000005s
Model type SORT selected.
Tracker init duration = 0.019383399999999718s

Then, I don't have any output from my loggers anymore.

Whereas when adding a tag such as v6.0:
self.net = torch.hub.load('ultralytics/yolov5:v6.0', 'custom', path=self.model_path, verbose=False)

My loggers are working and don't stop:

2022-03-11 17:46:15 pluton aptitude-toolbox[12680] INFO Model type YOLO selected.
2022-03-11 17:46:19 pluton aptitude-toolbox[12680] INFO Detector init duration = 4.442983099999999s
2022-03-11 17:46:19 pluton aptitude-toolbox[12680] INFO Model type SORT selected.
2022-03-11 17:46:19 pluton aptitude-toolbox[12680] INFO Tracker init duration = 0.019554799999999872s
.... [After the process ends] ...
2022-03-11 17:46:21 pluton aptitude-toolbox[12680] INFO Average FPS: 12.473039525066566
2022-03-11 17:46:21 pluton aptitude-toolbox[12680] INFO Average FPS w/o read time: 13.190384737142026

As you can see, I still have the details provided by my logger in the latter case.

Also, using the latest tag (v6.1) gives me another error, which I think is unrelated:
Exception: path is on mount 'C:', start on mount 'E:'. Cache may be out of date, try force_reload=True or see https://docs.ultralytics.com/yolov5/tutorials/pytorch_hub_model_loading for help.
I tried using force_reload, the result is the same.

@glenn-jocher
Member

@JonathanSamelson regarding the loggers, not sure what the problem could be. The current logging code is here, with LOGGER imported in various places:

yolov5/utils/general.py

Lines 77 to 88 in b94b59e

def set_logging(name=None, verbose=VERBOSE):
    # Sets level and returns logger
    if is_kaggle():
        for h in logging.root.handlers:
            logging.root.removeHandler(h)  # remove all handlers associated with the root logger object
    rank = int(os.getenv('RANK', -1))  # rank in world for Multi-GPU trainings
    logging.basicConfig(format="%(message)s", level=logging.INFO if (verbose and rank in (-1, 0)) else logging.WARNING)
    return logging.getLogger(name)


LOGGER = set_logging('yolov5')  # define globally (used in train.py, val.py, detect.py, etc.)

Regarding the v6.1 PyTorch Hub usage, this works correctly for me and I'm not able to reproduce any errors:
Screenshot 2022-03-11 at 18 05 36

@JonathanSamelson
Contributor

@glenn-jocher I think the difference in the format is caused by this line:

logging.basicConfig(format="%(message)s", level=logging.INFO if (verbose and rank in (-1, 0)) else logging.WARNING)

I think it changes the format for all loggers instead of only the yolov5 logger.

And actually I was wrong in my previous message: the logger does not stop working, but its level changes to INFO, so the DEBUG messages are filtered out.
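The effect described above can be reproduced with the standard library alone. The sketch below is only an illustration, not code from YOLOv5: the logger name "my_app" and the in-memory stream are assumptions made for the demo. It relies on the fact that logging.basicConfig configures the root logger (and only takes effect when the root logger has no handlers yet, unless force=True is passed), so any logger that inherits from the root picks up the new level and format.

```python
import io
import logging

# Start from a clean root logger so the demonstration is repeatable.
root = logging.getLogger()
for h in list(root.handlers):
    root.removeHandler(h)
root.setLevel(logging.WARNING)

app_logger = logging.getLogger("my_app")  # hypothetical application logger

stream = io.StringIO()  # capture output in memory instead of stderr
# Same kind of call as in set_logging(): it configures the ROOT logger.
logging.basicConfig(format="%(message)s", level=logging.INFO, stream=stream)

app_logger.debug("debug message")  # filtered: effective level is now INFO
app_logger.info("info message")    # emitted with the bare "%(message)s" format

root_level = logging.getLevelName(root.level)
captured = stream.getvalue()
print(root_level, repr(captured))
```

The application logger never set its own level or handler, yet its DEBUG record is dropped and its INFO record loses the timestamp and host prefix, which is consistent with the two outputs shown earlier in this thread.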

@glenn-jocher
Member

@JonathanSamelson ok got it. If you have a fix in mind can you please submit a PR? Thanks!

@JonathanSamelson
Contributor

@glenn-jocher At the moment, I don't. I'm using more or less the same lines in my own project...
I understand it does not matter until the code is used as an underlying project by another one.

@glenn-jocher
Member

@arjitkatare @JonathanSamelson good news 😃! Your original issue may now be fixed ✅ in PR #7296 by @maxstrobel.

To receive this update:

  • Git – git pull from within your yolov5/ directory or git clone https://github.com/ultralytics/yolov5 again
  • PyTorch Hub – Force-reload with model = torch.hub.load('ultralytics/yolov5', 'yolov5s', force_reload=True)
  • Notebooks – View updated notebooks in Colab or Kaggle
  • Docker – sudo docker pull ultralytics/yolov5:latest to update your image

Thank you for spotting this issue and informing us of the problem. Please let us know if this update resolves the issue for you, and feel free to inform us of any other issues you discover or feature requests that come to mind. Happy trainings with YOLOv5 🚀!
