Removal of metrics deepcopy before computing the metrics #3180

Merged: 4 commits, Apr 16, 2024

@gregjauvion gregjauvion commented Apr 5, 2024

What does this PR do?

This PR removes the deep copy of the metrics: Dict[str, Metric] object in the function _compute_and_log_metrics.

This function does the following:

  • Copies the metrics (using copy.deepcopy)
  • Computes the metric values on the copy
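For illustration, the two patterns (copy then compute, versus compute directly) can be sketched as below. This is a minimal sketch, not Composer's actual code: `ToyMean`, `compute_with_copy` and `compute_in_place` are hypothetical names, and `ToyMean` is only a stand-in for a metric exposing `update`, `compute` and `reset`.

```python
import copy
from typing import Dict


class ToyMean:
    """Hypothetical stand-in for a metric with update/compute/reset methods."""

    def __init__(self) -> None:
        self.total = 0.0
        self.count = 0

    def update(self, value: float) -> None:
        self.total += value
        self.count += 1

    def compute(self) -> float:
        # Reads the accumulated state without modifying it.
        return self.total / self.count if self.count else 0.0

    def reset(self) -> None:
        self.total = 0.0
        self.count = 0


def compute_with_copy(metrics: Dict[str, ToyMean]) -> Dict[str, float]:
    # Pattern described above: deep-copy the dict, then compute on the copy.
    copied = copy.deepcopy(metrics)
    return {name: metric.compute() for name, metric in copied.items()}


def compute_in_place(metrics: Dict[str, ToyMean]) -> Dict[str, float]:
    # Pattern without the copy: compute directly on the original objects.
    return {name: metric.compute() for name, metric in metrics.items()}


metrics = {"loss": ToyMean(), "accuracy": ToyMean()}
for loss, acc in [(0.9, 0.5), (0.7, 0.75)]:
    metrics["loss"].update(loss)
    metrics["accuracy"].update(acc)

with_copy = compute_with_copy(metrics)
in_place = compute_in_place(metrics)
```

With a `compute()` that only reads state, as assumed here, both functions return the same values, and the only difference is the extra allocation made by the copy.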

It is called 3 times in composer.trainer.Trainer.

I don't see the need to copy the metrics before calling metric.compute(). In particular, the metrics are reset with metric.reset() at the start of training on each batch and at the start of the evaluation loop, which makes the copy unnecessary. Is there a specific reason I am missing?
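The reset-then-compute lifecycle described above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the Trainer's real loop: `ToyCount` and `run_batches` are hypothetical names, and the metric is a stand-in that counts the samples seen since the last reset.

```python
from typing import Dict, List


class ToyCount:
    """Hypothetical stand-in metric: counts samples seen since the last reset."""

    def __init__(self) -> None:
        self.seen = 0

    def update(self, batch: List[int]) -> None:
        self.seen += len(batch)

    def compute(self) -> int:
        return self.seen

    def reset(self) -> None:
        self.seen = 0


def run_batches(batches: List[List[int]], metrics: Dict[str, ToyCount]) -> List[Dict[str, int]]:
    logged = []
    for batch in batches:
        # State is cleared at the start of each batch, as described above.
        for metric in metrics.values():
            metric.reset()
        for metric in metrics.values():
            metric.update(batch)
        # compute() is called on the live objects; no deepcopy is made.
        logged.append({name: metric.compute() for name, metric in metrics.items()})
    return logged


logged = run_batches([[1, 2, 3], [4, 5], [6]], {"samples": ToyCount()})
```

Because each batch starts from a reset state in this sketch, the logged values reflect only the current batch, and nothing computed earlier needs to be protected by a copy.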

What issue(s) does this change relate to?

Related to issue #3153, which describes how copying the metrics causes a memory leak in a specific use case I'm working on.

Before submitting

  • Have you read the contributor guidelines?
  • Is this change a documentation change or typo fix? If so, skip the rest of this checklist.
  • Was this change discussed/approved in a GitHub issue first? It is much more likely to be merged if so.
  • Did you update any related docs and document your change?
  • Did you update any related tests and add any new tests related to your change? (see testing)
  • Did you run the tests locally to make sure they pass?
  • Did you run pre-commit on your change? (see the pre-commit section of prerequisites)

@gregjauvion
Contributor Author

Hi @mvpatel2000 should I assign this MR to a maintainer of the repo? Please tell me who I should assign, thank you.

@mvpatel2000
Contributor

> Hi @mvpatel2000 should I assign this MR to a maintainer of the repo? Please tell me who I should assign, thank you.

@gregjauvion feel free to request a review from me. I can triage if necessary.

@mvpatel2000
Contributor

@gregjauvion thanks for the PR, I think this is correct. I will still run a series of manual tests to verify identical behavior.

@gregjauvion
Contributor Author

Thanks a lot for the quick response! Please tell me if I can help in any way.

@mvpatel2000
Contributor

Test failures seem related to a flaky test introduced in a previous PR; patching that elsewhere.

Contributor

@mvpatel2000 mvpatel2000 left a comment

LGTM!

@mvpatel2000 mvpatel2000 merged commit 6f84caa into mosaicml:dev Apr 16, 2024
14 checks passed
@gregjauvion gregjauvion deleted the metrics-remove-deepcopy branch April 16, 2024 21:01
DhruvDh pushed a commit to DhruvDh/composer that referenced this pull request Apr 21, 2024
* remove deepcopy of metrics before calling metric.compute()

* update documentation of _compute_and_log_metrics

---------

Co-authored-by: Grégoire Jauvion <gregoire.jauvion@helsing.ai>
Co-authored-by: Mihir Patel <mihir.v.patel7@gmail.com>
j316chuck pushed a commit that referenced this pull request May 16, 2024
2 participants