Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update NeptuneLogger #3165

Merged
merged 25 commits into from
Apr 15, 2024
Merged

Update NeptuneLogger #3165

merged 25 commits into from
Apr 15, 2024

Conversation

AleksanderWWW
Copy link
Contributor

@AleksanderWWW AleksanderWWW commented Apr 2, 2024

What does this PR do?

  • Update the parameter name upload_artifacts -> upload_checkpoints
  • Improve the way the OOMCallback handles files
  • Better handle normalized images logging
  • Improve file uploading
  • Improve docs
  • Add optional progress bar support for file downloading (if Neptune version permits)

What issue(s) does this change relate to?

Before submitting

  • Have you read the contributor guidelines?
  • Is this change a documentation change or typo fix? If so, skip the rest of this checklist.
  • Was this change discussed/approved in a GitHub issue first? It is much more likely to be merged if so.
  • Did you update any related docs and document your change?
  • Did you update any related tests and add any new tests related to your change? (see testing)
  • Did you run the tests locally to make sure they pass?
  • Did you run pre-commit on your change? (see the pre-commit section of prerequisites)

@AleksanderWWW
Copy link
Contributor Author

#3170

@SiddhantSadangi @normandy7

@AleksanderWWW AleksanderWWW marked this pull request as ready for review April 10, 2024 08:26
@AleksanderWWW AleksanderWWW requested a review from a team as a code owner April 10, 2024 08:26
Copy link
Contributor

@mvpatel2000 mvpatel2000 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, though would like Cheng to review too given other changes

composer/loggers/neptune_logger.py Outdated Show resolved Hide resolved
@mvpatel2000 mvpatel2000 requested a review from cli99 April 10, 2024 21:53
@mvpatel2000
Copy link
Contributor

@AleksanderWWW can you please provide a proper PR description?

Copy link
Contributor

@cli99 cli99 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@mvpatel2000
Copy link
Contributor

@AleksanderWWW can approve and merge once tests pass. Let me know if you have any trouble

@AleksanderWWW
Copy link
Contributor Author

Idk, tried so many things but no idea how to fix this doctest failure due to unexpected output generated by neptune @mvpatel2000
@SiddhantSadangi

Copy link
Contributor

@SiddhantSadangi SiddhantSadangi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's remove suppressing logs before we forget doing so :)

docs/source/trainer/file_uploading.rst Outdated Show resolved Hide resolved
@SiddhantSadangi
Copy link
Contributor

The error doesn't seem to related to Neptune outputs
ERROR: Error in "testcode" directive: invalid option block

The block is the same as our previous PR that cleared tests though 🤷

docs/source/trainer/logging.rst Outdated Show resolved Hide resolved
Co-authored-by: Siddhant Sadangi <siddhant.sadangi@neptune.ai>
@AleksanderWWW
Copy link
Contributor Author

AleksanderWWW commented Apr 15, 2024

Ok, idk how, but removing logging.getLogger and level setting + adding a whitespace here and there seems to resolve the issue. I will definitely wake up tomorrow morning a little wiser and a lot more confused than I used to be.

Copy link
Contributor

@mvpatel2000 mvpatel2000 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, a few minor nits I will apply

composer/callbacks/oom_observer.py Outdated Show resolved Hide resolved
tests/loggers/test_neptune_logger.py Outdated Show resolved Hide resolved
tests/loggers/test_neptune_logger.py Outdated Show resolved Hide resolved
tests/loggers/test_neptune_logger.py Outdated Show resolved Hide resolved
@mvpatel2000
Copy link
Contributor

@AleksanderWWW the following issue is causing test failures:

Expected nothing
Got:
Warning:     [neptune] [warning] Info (NVML): NVML Shared Library Not Found. GPU usage metrics may not be reported. For more information, see https://docs.neptune.ai/help/nvml_error/

is there a way to pass args to neptune to not emit warnings in this case? or does it have to be filtered

@mvpatel2000
Copy link
Contributor

Hm... it might be flakey... I reran and it's gone 🤔. I will rerun one more time. If it passes we can merge and if it flakes post-merge ill revert and reopen PR

@mvpatel2000 mvpatel2000 merged commit 10fd9b8 into mosaicml:dev Apr 15, 2024
15 checks passed
DhruvDh pushed a commit to DhruvDh/composer that referenced this pull request Apr 21, 2024
* Update NeptuneLogger

* better check symlinks

* Update composer/loggers/neptune_logger.py

Co-authored-by: Sabine <sabine.nyholm@neptune.ai>

* use progress bar if possible

* simplify imports

* update oom callback

* fix

* fix typing

* code review

* maybe a fix

* Apply suggestions from code review

Co-authored-by: Siddhant Sadangi <siddhant.sadangi@neptune.ai>

* format

* Update composer/callbacks/oom_observer.py

* Update tests/loggers/test_neptune_logger.py

* Update tests/loggers/test_neptune_logger.py

* Update tests/loggers/test_neptune_logger.py

---------

Co-authored-by: Sabine <sabine.nyholm@neptune.ai>
Co-authored-by: Mihir Patel <mihir.v.patel7@gmail.com>
Co-authored-by: Siddhant Sadangi <siddhant.sadangi@neptune.ai>
j316chuck pushed a commit that referenced this pull request May 16, 2024
* Update NeptuneLogger

* better check symlinks

* Update composer/loggers/neptune_logger.py

Co-authored-by: Sabine <sabine.nyholm@neptune.ai>

* use progress bar if possible

* simplify imports

* update oom callback

* fix

* fix typing

* code review

* maybe a fix

* Apply suggestions from code review

Co-authored-by: Siddhant Sadangi <siddhant.sadangi@neptune.ai>

* format

* Update composer/callbacks/oom_observer.py

* Update tests/loggers/test_neptune_logger.py

* Update tests/loggers/test_neptune_logger.py

* Update tests/loggers/test_neptune_logger.py

---------

Co-authored-by: Sabine <sabine.nyholm@neptune.ai>
Co-authored-by: Mihir Patel <mihir.v.patel7@gmail.com>
Co-authored-by: Siddhant Sadangi <siddhant.sadangi@neptune.ai>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants