CUDA error: device-side assert triggered #9

Open
starsky68 opened this issue Nov 7, 2020 · 1 comment

starsky68 commented Nov 7, 2020

Starting training for 100 epochs...

 Epoch   gpu_mem      GIoU       obj       cls      reid     total   targets  img_size

0%| | 0/1 [00:00<?, ?it/s]
/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:108: cunn_ClassNLLCriterion_updateOutput_kernel: block: [0,0,0], thread: [0,0,0] Assertion t >= 0 && t < n_classes failed.
/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:108: cunn_ClassNLLCriterion_updateOutput_kernel: block: [0,0,0], thread: [1,0,0] Assertion t >= 0 && t < n_classes failed.
/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:108: cunn_ClassNLLCriterion_updateOutput_kernel: block: [0,0,0], thread: [2,0,0] Assertion t >= 0 && t < n_classes failed.
/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:108: cunn_ClassNLLCriterion_updateOutput_kernel: block: [0,0,0], thread: [3,0,0] Assertion t >= 0 && t < n_classes failed.
/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:108: cunn_ClassNLLCriterion_updateOutput_kernel: block: [0,0,0], thread: [4,0,0] Assertion t >= 0 && t < n_classes failed.
/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:108: cunn_ClassNLLCriterion_updateOutput_kernel: block: [0,0,0], thread: [5,0,0] Assertion t >= 0 && t < n_classes failed.
Traceback (most recent call last):
  File "train.py", line 660, in <module>
    train()  # train normally
  File "train.py", line 452, in train
    loss, loss_items = compute_loss_no_upsample(pred, reid_feat_out, targets, track_ids, model)
  File "/home/11/YOLOV4_MCMOT-master/utils/utils.py", line 522, in compute_loss_no_upsample
    return loss, torch.cat((l_box, l_obj, l_cls, l_reid, loss)).detach()
  File "/home/11/.local/lib/python3.8/site-packages/apex-0.1-py3.8.egg/apex/amp/wrap.py", line 81, in wrapper
    return orig_fn(seq, *args, **kwargs)
RuntimeError: CUDA error: device-side assert triggered

The error above occurs during training. My label format is as follows:
0 1 0.01 0.02 0.03 0.04
1 1 0.014 0.015 0.03 0.016
3 1 0.05 0.06 0.07 0.08
4 1 0.017 0.018 0.019 0.020
2 1 0.09 0.1 0.12 0.13
2 2 0.021 0.022 0.023 0.024

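There are 5 classes in total, so how can an index go out of bounds? It seems to work only if the number of classes is reduced by 1. Is that because there is a background class?
@CaptainEven, could you please help? I would be very grateful.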
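
The assertion in the log, t >= 0 && t < n_classes, means that some integer target passed to the NLL loss was outside the range [0, n_classes). A minimal sketch for checking label lines against that range is below. It is not code from this repository: the helper name find_out_of_range is hypothetical, and treating column 0 as the class ID and column 1 as the track ID is an assumption based on the label lines above.

```python
def find_out_of_range(label_lines, n_classes, column=0):
    """Return (line_number, value) pairs whose integer in `column`
    falls outside the valid target range [0, n_classes)."""
    bad = []
    for line_no, line in enumerate(label_lines, start=1):
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        value = int(float(fields[column]))
        if not 0 <= value < n_classes:
            bad.append((line_no, value))
    return bad


# The label lines quoted in this issue.
labels = [
    "0 1 0.01 0.02 0.03 0.04",
    "1 1 0.014 0.015 0.03 0.016",
    "3 1 0.05 0.06 0.07 0.08",
    "4 1 0.017 0.018 0.019 0.020",
    "2 1 0.09 0.1 0.12 0.13",
    "2 2 0.021 0.022 0.023 0.024",
]

# Assumed column 0 = class ID: with 5 classes, IDs 0..4 are all valid.
print(find_out_of_range(labels, n_classes=5))            # []
# With only 4 classes configured, class ID 4 is out of range.
print(find_out_of_range(labels, n_classes=4))            # [(4, 4)]
# Assumed column 1 = track ID: the same bound check applies to whatever
# upper limit the model uses for that column (2 here is only illustrative).
print(find_out_of_range(labels, n_classes=2, column=1))  # [(6, 2)]
```

Because CUDA kernels run asynchronously, the Python line named in the traceback (the torch.cat call) may not be the operation that actually failed. Rerunning with the environment variable CUDA_LAUNCH_BLOCKING=1, or on the CPU, usually makes the traceback point at the real loss call.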

@kenrickfernandes

Hello, did you solve this issue?
