
fix(diff/implicit): fix memory leak of OOP APIs #113

Merged
11 commits merged into metaopt:main on Jan 15, 2023

Conversation

XuehaiPan (Member)

No description provided.

@XuehaiPan added the labels bug (Something isn't working), pytorch (Something PyTorch related), functorch (Something functorch related) on Nov 11, 2022
codecov-commenter commented Nov 11, 2022

Codecov Report

Base: 70.34% // Head: 70.26% // Decreases project coverage by -0.08% ⚠️

Coverage data is based on head (d233cbe) compared to base (381f4d1).
Patch coverage: 83.00% of modified lines in pull request are covered.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #113      +/-   ##
==========================================
- Coverage   70.34%   70.26%   -0.08%     
==========================================
  Files          71       72       +1     
  Lines        2981     3000      +19     
==========================================
+ Hits         2097     2108      +11     
- Misses        884      892       +8     
Flag        Coverage Δ
unittests   70.26% <83.00%> (-0.08%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files                           Coverage Δ
torchopt/nn/module.py                    40.09% <27.27%> (+0.27%) ⬆️
torchopt/optim/meta/base.py              63.15% <75.00%> (ø)
torchopt/nn/stateless.py                 81.57% <81.57%> (ø)
torchopt/diff/implicit/nn/module.py      81.39% <96.87%> (-4.32%) ⬇️
torchopt/__init__.py                     100.00% <100.00%> (ø)
torchopt/diff/implicit/decorator.py      95.75% <100.00%> (+0.02%) ⬆️
torchopt/diff/zero_order/nn/module.py    91.42% <100.00%> (-1.08%) ⬇️
torchopt/nn/__init__.py                  100.00% <100.00%> (ø)
torchopt/typing.py                       95.65% <100.00%> (+0.19%) ⬆️
torchopt/utils.py                        56.85% <100.00%> (ø)
... and 1 more


☔ View full report at Codecov.

@XuehaiPan force-pushed the fix-implicit-oop branch 6 times, most recently from 94fcc09 to 5591aa9, on November 19, 2022 16:30
@Benjamin-eecs added this to the 0.7.0 milestone on Nov 21, 2022
@XuehaiPan force-pushed the fix-implicit-oop branch 3 times, most recently from 1f1b509 to 8f4beac, on January 14, 2023 11:23
@XuehaiPan marked this pull request as ready for review on January 14, 2023 12:04
@XuehaiPan force-pushed the fix-implicit-oop branch 4 times, most recently from 400c024 to 871b54c, on January 14, 2023 13:22
@XuehaiPan merged commit 997cf56 into metaopt:main on Jan 15, 2023
@XuehaiPan deleted the fix-implicit-oop branch on January 15, 2023 06:49
zaccharieramzi commented Jan 15, 2024

Hi @XuehaiPan @JieRen98 ,

I see that this was just merged, and I seem to be running into a memory leak of my own.
Could you explain what the problem was and how it was solved?

EDIT

Do you also know when the release will happen for this fix?

EDIT 2

Apparently my memory leak was not resolved by the latest fix, so my guess is that it's not the same issue.

XuehaiPan (Member, Author)

Hi @zaccharieramzi, you can refer to the changes in examples/iMAML/imaml_omniglot.py (5cb7b6).

We previously created the inner network inside the for-loop, so a new inner network was allocated on every iteration. That was the cause of the memory leak.

for task_id in range(num_tasks):
    inner_net = InnerNet(meta_params, ...)  # a fresh InnerNet is allocated every iteration
    inner_net.solve(data, ...)
    ...

    gc.collect()  # explicit collection did not reclaim the leaked instances

Now we create the InnerNet object outside of the for-loop and reuse the instances.

inner_nets = [InnerNet(meta_params, ...) for task_id in range(num_tasks)]  # allocate once

for task_id in range(num_tasks):
    inner_net = inner_nets[task_id]
    inner_net.reset_parameters(...)  # reset state instead of constructing a new instance
    inner_net.solve(data, ...)
    ...

If this solution does not resolve your issue, please open a new issue for it.

zaccharieramzi
I solved my issue by getting rid of torchopt.FuncOptimizer and instead computing the gradients and updates and applying them manually.

I think the problem might be that the step function of FuncOptimizer uses create_graph=True, but I am not too sure.
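For reference, the manual approach described above can be sketched in plain PyTorch. This is a minimal illustration on a toy quadratic loss, not the actual training code; the point is that torch.autograd.grad defaults to create_graph=False, so no higher-order graph is retained across iterations:

```python
import torch

# Start from a single leaf parameter tensor; `params` is rebuilt each step.
params = (torch.tensor([1.0, 2.0], requires_grad=True),)
lr = 0.1

for _ in range(100):
    # Toy quadratic loss: sum of squares over all parameters.
    loss = sum((p ** 2).sum() for p in params)
    # create_graph=False (the default): the backward graph is freed each step.
    grads = torch.autograd.grad(loss, params)
    with torch.no_grad():
        # Apply the SGD update manually and re-mark the results as leaves.
        params = tuple(
            (p - lr * g).requires_grad_(True) for p, g in zip(params, grads)
        )
```

With create_graph=True instead, every step's graph would stay reachable through the updated parameters, which is the kind of accumulation that can look like a memory leak.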
