[WB-4478] Feature/tb sync #1852
@@ -484,6 +491,8 @@ def sync(
    clean_old_hours=24,
    clean_force=None,
):
    # TODO: rather unfortunate, needed to avoid creating a `wandb` directory
    os.environ["WANDB_DIR"] = TMPDIR.name
@raubitsj this one really sucks. I guess we can address it once we get rid of `wandb.old` and `_get_cling_api`... Not sure it was worth digging deeper.
Other people took a pass at not creating `wandb` dirs, but I think it wasn't done cleanly, so it keeps coming back.
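For reference, one way to keep an override like this from leaking into the rest of the process is to scope it and restore the previous value afterwards. This is only a sketch, not what this PR does: `scoped_wandb_dir` is a hypothetical helper name, and it uses a fresh `tempfile.TemporaryDirectory` rather than the module-level `TMPDIR` from the diff.

```python
import contextlib
import os
import tempfile


@contextlib.contextmanager
def scoped_wandb_dir():
    """Point WANDB_DIR at a throwaway directory, then restore the old value.

    Hypothetical helper for illustration; not part of the wandb codebase.
    """
    previous = os.environ.get("WANDB_DIR")
    with tempfile.TemporaryDirectory() as tmp:
        os.environ["WANDB_DIR"] = tmp
        try:
            yield tmp
        finally:
            # Put the environment back exactly as we found it.
            if previous is None:
                os.environ.pop("WANDB_DIR", None)
            else:
                os.environ["WANDB_DIR"] = previous
```

Anything run inside `with scoped_wandb_dir():` sees the temporary directory; once the block exits, the variable is restored and the directory is removed.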
self._step, len(dropped_keys)
)
)
print("\t" + ("\n\t".join(dropped_keys)))
This was the error that Graphcore was seeing. At least now we tell the user exactly which keys we're dropping if a single "row" is more than 4 megabytes.
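To illustrate the idea, here is a rough sketch of checking a row's serialized size and reporting what gets dropped. Everything in it is an assumption for illustration: the helper name `split_oversized_row`, measuring the size as JSON, and the "drop the largest value first" policy are not taken from this PR, and the limit is just the 4 megabytes mentioned above.

```python
import json

# Assumed limit, taken from the "4 megabytes" mentioned in the comment above.
MAX_ROW_BYTES = 4 * 1024 * 1024


def split_oversized_row(row, max_bytes=MAX_ROW_BYTES):
    """Return (kept, dropped_keys) so that the kept row serializes within max_bytes.

    Hypothetical policy: drop the largest values first until the row fits.
    """
    kept = dict(row)
    dropped_keys = []
    # Serialized size of each value, used to pick what to drop first.
    sizes = {k: len(json.dumps(v).encode("utf-8")) for k, v in kept.items()}
    while kept and len(json.dumps(kept).encode("utf-8")) > max_bytes:
        biggest = max(kept, key=sizes.__getitem__)
        dropped_keys.append(biggest)
        del kept[biggest]
    return kept, dropped_keys
```

A caller could then print the dropped keys one per line, tab-indented, in the same way as the `print` call in the diff.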
First-pass look (I will do a full review tomorrow).
Do you want to put this in as well? I think it is related; I haven't had enough confidence in it without tests to put it in:
#1586
Codecov Report
@@ Coverage Diff @@
## master #1852 +/- ##
==========================================
+ Coverage 73.81% 74.90% +1.08%
==========================================
Files 233 233
Lines 28554 28754 +200
==========================================
+ Hits 21078 21538 +460
+ Misses 7476 7216 -260
great
https://wandb.atlassian.net/browse/WB-4478
https://wandb.atlassian.net/browse/CLI-708
Description
Syncs TensorBoard event files with `wandb sync` and adds tests.
Testing
How was this PR tested?