
PyTorch 1.9 and CUDA 11.1 Support #324

Open
edmuthiah opened this issue Aug 28, 2021 · 6 comments

edmuthiah commented Aug 28, 2021

Hello,

I'm looking to use this repository on PyTorch 1.9 and CUDA 11.1, as I'm trying to use an RTX 3090 for training.

I've tried the following combinations:

```shell
pip3 install torch==1.9.0+cu111 torchvision==0.10.0+cu111 torchaudio==0.9.0 -f https://download.pytorch.org/whl/torch_stable.html
pip3 install torch==1.8.2+cu111 torchvision==0.9.2+cu111 torchaudio==0.8.2 -f https://download.pytorch.org/whl/lts/1.8/torch_lts.html
```

But I get the following error when running train.py:
[screenshot: error traceback from train.py]

I have reviewed the following:
#243
#196
#193
#191
#271

All of them suggest downgrading, which is not possible for an RTX 3090.

I'm happy to contribute, but I don't really understand what's causing the incompatibility with the above torch versions. Using the PyTorch 1.9 Mish, I've tried changing all mentions of `self.act = Mish()` to `self.act = nn.Mish(inplace=False)` in models/common.py, but this still throws the same error.
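For reference, the swap described above would look roughly like this; the `Conv` class below is a simplified stand-in for the one in models/common.py, not the repo's exact code:

```python
import torch
import torch.nn as nn


class Conv(nn.Module):
    """Simplified stand-in for the Conv block in models/common.py."""

    def __init__(self, c1, c2, k=1):
        super().__init__()
        self.conv = nn.Conv2d(c1, c2, k, bias=False)
        self.bn = nn.BatchNorm2d(c2)
        # was: self.act = Mish()  (from mish_cuda)
        self.act = nn.Mish()  # built-in activation, available since PyTorch 1.9

    def forward(self, x):
        return self.act(self.bn(self.conv(x)))


# A 3x3 conv with no padding shrinks an 8x8 input to 6x6.
y = Conv(3, 16, 3)(torch.rand(1, 3, 8, 8))  # -> shape (1, 16, 6, 6)
```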

Thanks! @WongKinYiu @digantamisra98

@jackhu-bme

Well, this error is not hard to solve. I followed this link:
https://blog.csdn.net/Xunuo1995/article/details/115454076?ops_request_misc=%257B%2522request%255Fid%2522%253A%2522163024050016780357294915%2522%252C%2522scm%2522%253A%252220140713.130102334.pc%255Fall.%2522%257D&request_id=163024050016780357294915&biz_id=0&utm_medium=distribute.pc_search_result.none-task-blog-2~all~first_rank_ecpm_v1~rank_v29_ecpm-1-115454076.first_rank_v2_pc_rank_v29&utm_term=b%5B%3A%2C4%5D%2B%3Dmath.log%288%2F%28640%2Fs%29+a+view+of+a+leaf+variable+yolov5&spm=1018.2226.3001.4187
and it worked after adding `with torch.no_grad():`.
I successfully trained on an RTX 3090 weeks ago.
Another thing to mention: there is a second error you'll face if you need to use test.py. The `output` variable is a list of `torch.Tensor`, so this line in the script doesn't work:

```python
output.cpu().numpy()
```

I simply wrote this instead:

```python
output2 = []
for m in output:
    m = m.cpu()
    output2.append(m)
output = output2
```

And it worked successfully.
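The per-tensor loop above can also be written as a list comprehension; a self-contained sketch (the sample tensors are illustrative, not the repo's real output):

```python
import torch

# Simulated detector output: a list of per-image tensors,
# as returned on newer PyTorch versions.
output = [torch.rand(3, 6), torch.rand(2, 6)]

# Move every tensor to the CPU, then convert each to NumPy.
output = [o.cpu() for o in output]
arrays = [o.numpy() for o in output]
```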
Well, that's all you need to do when training on an RTX 3090 with a recent PyTorch version.
If you find this useful, please give it a thumbs-up, thanks.

@jackhu-bme

If you can't read the Chinese blog I mentioned before, just do this:

```python
with torch.no_grad():
    b[:, 4] += math.log(8 / (640 / s) ** 2)  # obj (8 objects per 640 image)
    b[:, 5:] += math.log(0.6 / (m.nc - 0.99)) if cf is None else torch.log(cf / cf.sum())  # cls
mi.bias = torch.nn.Parameter(b.view(-1), requires_grad=True)
```

Add the `with` line and indent the two statements under it.

@jackhu-bme

To clarify: the two tabs go before the two statements that follow the `with` line; the web page can't show them.

@jackhu-bme

@ed-muthiah


edmuthiah commented Aug 30, 2021

Thanks @Jack-Hu-2001

This works for me 🙂 I also want to note that this solution works too:
[screenshot: an alternative fix using `.data`]

The reason that `.data` works is explained in this Stack Overflow answer:

If y = x.data then y will be a Tensor that shares the same data with x, is unrelated with the computation history of x, and has requires_grad=False.
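A minimal sketch of that behavior (illustrative values):

```python
import torch

x = torch.ones(3, requires_grad=True)
y = x.data                   # shares storage with x, detached from the graph

print(y.requires_grad)       # False
y += 1                       # in-place change is visible through x
print(x.detach().tolist())   # [2.0, 2.0, 2.0]
```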

I think your solution is probably the more current way to do it. Are you using the supplied `from mish_cuda import MishCuda as Mish`, or are you using the new PyTorch 1.9 Mish (`m = nn.Mish()`)?

@jackhu-bme

That's a pretty good solution, thanks for sharing! I didn't use it because my environment is PyTorch 1.8.1, the highest version supported on the remote server, so I'm simply using mish-cuda. If there's any problem with `m = nn.Mish()`, you can share it here.
I'm sorry: the problem I mentioned is not in test.py, it's in utils/general.py, around line 1084, here:

```python
def output_to_target(output, width, height):
    # Convert model output to target format [batch_id, class_id, x, y, w, h, conf]
    if isinstance(output, torch.Tensor):
        output = output.cpu().numpy()
```
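A minimal sketch of a guard that handles both cases; `to_numpy` is a made-up helper name for illustration, not the repo's actual patch:

```python
import torch


def to_numpy(output):
    # Old behavior: output is a single tensor.
    if isinstance(output, torch.Tensor):
        return output.cpu().numpy()
    # Newer PyTorch: output is a list of per-image tensors.
    return [o.cpu().numpy() for o in output]
```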
And another possible solution:
#318
Sorry for my bad memory. The error actually appears when you run test.py on a newer PyTorch version.
@ed-muthiah
