
Metrics error due to inplace operation, "computation has been modified by an inplace operation". #2862

Closed
sykrn opened this issue Aug 7, 2020 · 5 comments · Fixed by #2878
Labels: bug (Something isn't working), help wanted (Open to be worked on)

sykrn commented Aug 7, 2020

Hey @williamFalcon, I got a new error after upgrading the library today. I'm using the accuracy metric, and it now raises an error.

Code sample:

# in the LightningModule
import torch.nn.functional as F
from pytorch_lightning.metrics.functional import accuracy

def training_step(self, batch, batch_idx):
    x, y = batch
    y_hat = self(x)
    loss = F.cross_entropy(y_hat, y)
    acc = accuracy(y_hat, y)  # from the functional classification metrics
    tensorboard_logs = {'train_loss': loss}
    return {'loss': loss, 'log': tensorboard_logs}

Error msg:

RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.LongTensor [32]] is at version 2; expected version 1 instead. Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True).

Workaround: it can be solved with the .clone() method. When I clone y before feeding it to the accuracy function, no error is raised:

acc = accuracy(y_hat, y.clone())

Presumably this works because the metric then mutates a copy, leaving the original y (which cross_entropy saved for its backward pass) untouched. But it's inconvenient if users have to do this manually, isn't it? The same code ran without clone() before I upgraded to the latest version, so the error is likely caused by a recent update/rebase.

The same error is raised for the f1_score metric.
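
As the hint in the traceback suggests, anomaly detection can pinpoint which forward operation produced the failing gradient. A minimal sketch of how one might enable it while debugging (it slows training, so remove it afterwards):

import torch

# Enable once, e.g. at the top of the training script. The backward error
# will then carry a traceback pointing at the offending forward op.
torch.autograd.set_detect_anomaly(True)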

sykrn added the bug (Something isn't working) and help wanted (Open to be worked on) labels Aug 7, 2020
github-actions bot (Contributor) commented Aug 7, 2020

Hi! Thanks for your contribution, great first issue!

williamFalcon (Contributor) commented

cc @justusschock

justusschock (Member) commented Aug 7, 2020

cc @Diuven, I think you introduced in-place ops for a speed-up, right?
Does the speedup come only from the in-place methods, or can we simply replace them with out-of-place ones?

Diuven added a commit to Diuven/pytorch-lightning that referenced this issue Aug 8, 2020
Diuven (Contributor) commented Aug 8, 2020

> cc @Diuven, I think you introduced in-place ops for a speed-up, right?
> Does the speedup come only from the in-place methods, or can we simply replace them with out-of-place ones?

Yeah, you're right. I think this is caused by the clamp_max_ I used in stat_scores_multiple_classes. This is the part.

Here is a quick PR fixing this issue. If you can, please check whether the code still raises the error.

Sorry for the inconvenience!
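
For context, a minimal sketch (not the library code) of the failure mode: an in-place op such as clamp_max_ bumps the tensor's version counter, and autograd refuses to run backward through a graph whose saved tensors were mutated afterwards. The out-of-place clamp_max returns a new tensor and avoids this:

import torch
import torch.nn.functional as F

logits = torch.randn(4, 3, requires_grad=True)
targets = torch.randint(0, 3, (4,))

loss = F.cross_entropy(logits, targets)  # saves `targets` for backward
targets.clamp_max_(2)                    # in-place: bumps targets' version counter
# loss.backward()                        # would raise the RuntimeError above

# The out-of-place variant leaves `targets` (and its version counter) intact:
loss = F.cross_entropy(logits, targets)
clamped = targets.clamp_max(2)           # returns a new tensor instead
loss.backward()                          # fine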

williamFalcon pushed a commit that referenced this issue Aug 8, 2020
* Faster classification stats

* Faster accuracy metric

* minor change on cls metric

* Add out-of-bound class clamping

* Add more tests and minor fixes

* Resolve code style warning

* Update for #2781

* hotfix

* Update pytorch_lightning/metrics/functional/classification.py

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* Update about conversation

* Add docstring on stat_scores_multiple_classes

* Fixing #2862

Co-authored-by: Younghun Roh <yhunroh@mindslab.ai>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
sykrn (Author) commented Aug 8, 2020

Great, it runs without any error now. Thanks for the hotfix. 👍
