Error running autobatch --batch -1 on Apple Metal Performance Shaders (MPS) --device mps #11169

Closed
1 of 2 tasks
PetervanLunteren opened this issue Mar 14, 2023 · 3 comments
Labels
bug Something isn't working Stale

Comments

@PetervanLunteren
Contributor

PetervanLunteren commented Mar 14, 2023

Search before asking

  • I have searched the YOLOv5 issues and found no similar bug report.

YOLOv5 Component

Training

Bug

Running AutoBatch (--batch -1) on Apple MPS (--device mps) raises the error shown in the reproduction below. I've installed the requirements as documented here. MPS is built and available (see the check below).

Installation:

git clone https://github.com/ultralytics/yolov5
cd yolov5
pip install -r requirements.txt --extra-index-url https://download.pytorch.org/whl/nightly/cpu

Check mps:

python -c "import torch; print(torch.backends.mps.is_available()); print(torch.backends.mps.is_built())"
True
True
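
For readers interpreting the two flags printed above: per the PyTorch MPS notes, is_built() reports whether the installed PyTorch was compiled with MPS support, while is_available() also requires macOS 12.3+ and an MPS-capable device. A minimal sketch of that reading follows; the helper name describe_mps is hypothetical, and the flags are passed in as plain booleans so the snippet runs without PyTorch installed.

```python
def describe_mps(is_built: bool, is_available: bool) -> str:
    """Turn the two MPS flags into a human-readable status (hypothetical helper)."""
    if is_available:
        return "MPS is usable"
    if not is_built:
        return "this PyTorch build was compiled without MPS support"
    return "MPS is built in, but macOS is older than 12.3 or no MPS-capable device was found"


if __name__ == "__main__":
    # On a real machine the flags would come from torch.backends.mps.is_built()
    # and torch.backends.mps.is_available(); the values from this report are hard-coded.
    print(describe_mps(is_built=True, is_available=True))
```

With both flags True, as in this report, the MPS backend itself is not the problem, which points at the AutoBatch code path instead.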

Reproduce error:

python train.py --epochs 1 --data data/coco128.yaml --weights yolov5s.pt --batch -1 --device mps
AutoBatch: Computing optimal batch size for --imgsz 640
Traceback (most recent call last):
  File "/Users/peter/yolov5/train.py", line 640, in <module>
    main(opt)
  File "/Users/peter/yolov5/train.py", line 529, in main
    train(opt.hyp, opt, device, callbacks)
  File "/Users/peter/yolov5/train.py", line 150, in train
    batch_size = check_train_batch_size(model, imgsz, amp)
  File "/Users/peter/yolov5/utils/autobatch.py", line 18, in check_train_batch_size
    return autobatch(deepcopy(model).train(), imgsz)  # compute optimal batch size
  File "/Users/peter/yolov5/utils/autobatch.py", line 43, in autobatch
    properties = torch.cuda.get_device_properties(device)  # device properties
  File "/Users/peter/miniforge3/envs/yoloenv/lib/python3.10/site-packages/torch/cuda/__init__.py", line 404, in get_device_properties
    _lazy_init()  # will define _get_device_properties
  File "/Users/peter/miniforge3/envs/yoloenv/lib/python3.10/site-packages/torch/cuda/__init__.py", line 248, in _lazy_init
    raise AssertionError("Torch not compiled with CUDA enabled")
AssertionError: Torch not compiled with CUDA enabled

It seems to go wrong at the torch.cuda.get_device_properties(device) call in utils/autobatch.py (line 43 in the traceback above), where a CUDA device is expected. --batch -1 works fine without the --device mps argument.

Uninstalling torchvision and compiling it from the vision repository, as suggested here, resulted in the same behaviour.

Could this be fixed easily by adjusting the code in autobatch.py to also accept mps as a device?
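
One possible shape for such a change is a guard that only takes the CUDA-specific path when the device type is "cuda" and otherwise falls back to a fixed batch size. The sketch below is an illustration under stated assumptions, not the YOLOv5 implementation: choose_batch_size, estimate_on_cuda and DEFAULT_BATCH_SIZE are hypothetical names, and the fallback value of 16 is an assumption. The device type is passed as a string (what device.type would give in PyTorch) so the snippet runs without PyTorch.

```python
from typing import Callable

DEFAULT_BATCH_SIZE = 16  # assumed fallback value for this sketch


def choose_batch_size(device_type: str,
                      estimate_on_cuda: Callable[[], int],
                      default: int = DEFAULT_BATCH_SIZE) -> int:
    """Return an estimated batch size on CUDA, otherwise the default (hypothetical helper)."""
    if device_type != "cuda":
        # CPU, MPS and other backends cannot be queried through torch.cuda,
        # so the CUDA memory-based estimate is skipped entirely.
        return default
    return estimate_on_cuda()


if __name__ == "__main__":
    def cuda_only_estimate() -> int:
        # Stand-in for the CUDA-specific estimate; it fails the same way
        # as the traceback in this report if it is ever reached without CUDA.
        raise AssertionError("Torch not compiled with CUDA enabled")

    # On "mps" the CUDA-only estimate is never called, so no error is raised.
    print(choose_batch_size("mps", cuda_only_estimate))
```

The trade-off is that MPS users get a fixed batch size rather than a memory-based estimate; estimating memory on MPS would need a different mechanism than torch.cuda.get_device_properties.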

Environment

  • python: 3.10.0
  • condaforge: 22.11.1
  • pytorch: 2.1.0.dev20230314
  • MacBook Pro with M1 Pro chip: macOS-12.3-arm64-arm-64bit

Minimal Reproducible Example

Run on an M1 Mac:

python train.py --epochs 1 --data data/coco128.yaml --weights yolov5s.pt --batch -1 --device mps

Additional

No response

Are you willing to submit a PR?

  • Yes, I'd like to help by submitting a PR!
@PetervanLunteren PetervanLunteren added the bug Something isn't working label Mar 14, 2023
@github-actions
Contributor

github-actions bot commented Apr 14, 2023

👋 Hello, this issue has been automatically marked as stale because it has not had recent activity. Please note it will be closed if no further activity occurs.

Feel free to inform us of any other issues you discover or feature requests that come to mind in the future. Pull Requests (PRs) are also always welcomed!

Thank you for your contributions to YOLOv5 🚀 and Vision AI ⭐!

@github-actions github-actions bot added the Stale label Apr 14, 2023
@github-actions github-actions bot removed the Stale label Apr 22, 2023
@github-actions
Contributor

👋 Hello there! We wanted to give you a friendly reminder that this issue has not had any recent activity and may be closed soon, but don't worry - you can always reopen it if needed. If you still have any questions or concerns, please feel free to let us know how we can help.

Feel free to inform us of any other issues you discover or feature requests that come to mind in the future. Pull Requests (PRs) are also always welcomed!

Thank you for your contributions to YOLO 🚀 and Vision AI ⭐

@github-actions github-actions bot added the Stale label May 23, 2023
@github-actions github-actions bot closed this as not planned Jun 2, 2023
@glenn-jocher
Member

@PetervanLunteren hi there! Thanks for reporting this issue. It looks like the error occurs because AutoBatch expects a CUDA device, which is not available when the --device mps argument is used. We'll investigate and address this in an upcoming release. In the meantime, you can run with --batch -1 and omit the --device mps argument. We appreciate your patience and understanding. If you have any more questions, feel free to ask!
