
MovingAverage: add dynamic decay and swap weights #1726

Merged: 2 commits into tensorflow:master on Jun 18, 2020

Conversation

tf-marissaw (Contributor)

Adds two new features to MovingAverage:

  1. Dynamic decay: This helps the early accuracy of MovingAverage by starting the decay at 0.1 and gradually increasing it to `average_decay` (see the sketch after this list).

  2. Swap weights: This makes it easier to temporarily swap the model weights and the average weights during eval.
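
For feature 1, here is a minimal sketch of a dynamic-decay schedule matching that description. The exact formula is an assumption on my part: the classic `(1 + step) / (10 + step)` ramp, which equals 0.1 at step 0 and rises toward 1, capped at `average_decay`; the PR's implementation may differ in detail.

```python
import tensorflow as tf


def dynamic_decay(step, average_decay):
    """Ramp the EMA decay from 0.1 at step 0 up toward `average_decay`."""
    step = tf.cast(step, tf.float32)
    # (1 + 0) / (10 + 0) = 0.1, and the ratio approaches 1 as step grows,
    # so the cap at `average_decay` eventually takes over.
    return tf.minimum(average_decay, (1.0 + step) / (10.0 + step))


# dynamic_decay(0, 0.99)     -> 0.1
# dynamic_decay(90, 0.99)    -> 0.91
# dynamic_decay(10000, 0.99) -> 0.99 (capped)
```

The point of the ramp is that early in training the average tracks the raw weights closely instead of staying anchored to the random initialization, which is what hurts early eval accuracy under a fixed high decay.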

def test_swap_weights(sequential_update):
    for sequential_update in [True, False]:

        strategy = tf.distribute.MirroredStrategy()
Member

We can't use distributed strategies in the test suite yet. It's a work in progress; a first step is here: #1713

@tf-marissaw (Contributor Author) Apr 27, 2020

swap_weights has to be called in a cross-replica context.

Can I use OneDeviceStrategy here?
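
For readers following along: in tf.distribute terms, a cross-replica context means code running outside `strategy.run`. A hedged sketch of the intended usage, where `build_model_and_optimizer` and `evaluate` are hypothetical helpers and `swap_weights` is the method this PR adds:

```python
import tensorflow as tf

strategy = tf.distribute.MirroredStrategy()
with strategy.scope():
    model, opt = build_model_and_optimizer()  # hypothetical helper

# ... training steps via strategy.run(...) ...

# Cross-replica context: we are outside strategy.run, so swap_weights is legal.
opt.swap_weights()  # model weights <-> averaged weights
evaluate(model)     # hypothetical eval helper, sees the averaged weights
opt.swap_weights()  # swap back before resuming training
```

Calling `swap_weights` from inside a function passed to `strategy.run` would execute in a per-replica context, which is exactly the situation the method has to rule out.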

@gabrieldemarmiesse (Member) May 2, 2020

We're going to have distributed capabilities in the tests very soon. Could you take a look at #1770 and tell me if this would work for you?

This will enable true multi-GPU (virtual, at least) strategy testing, even with multiple pytest workers.
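
As background on "virtual" multi-GPU testing: TensorFlow can split one physical device into several logical ones, which lets MirroredStrategy tests run even on CPU-only CI. A minimal sketch of that mechanism (independent of the #1770 plumbing):

```python
import tensorflow as tf

# Split the single physical CPU into two logical devices. This must run
# before TensorFlow initializes its devices.
cpus = tf.config.list_physical_devices("CPU")
tf.config.set_logical_device_configuration(
    cpus[0],
    [tf.config.LogicalDeviceConfiguration(),
     tf.config.LogicalDeviceConfiguration()],
)

# MirroredStrategy now has two devices to mirror across.
strategy = tf.distribute.MirroredStrategy(["/cpu:0", "/cpu:1"])
print(strategy.num_replicas_in_sync)  # 2
```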

tf-marissaw (Contributor Author)

That should work for this test case. Do you know when those patches will be merged?

Member

Two pull requests need to be reviewed to get there, so I'd expect a week or so. If it's not merged by next week, I'll ping the other maintainers about it.

tf-marissaw (Contributor Author)

Since the blocking pull requests are still under review, would it be possible to merge this with the test either commented out or using OneDeviceStrategy? I could immediately open a follow-up pull request to fix the test case once the blocking pull requests are merged.

tf-marissaw (Contributor Author)

Would either of these options work for you? We would like to get these changes in soon because they fix real problems with MovingAverage.

@seanpmorgan added and then removed the blocked (Pending something else's completion) label Apr 26, 2020
@tf-marissaw force-pushed the moving-average-01 branch 3 times, most recently from 775c98d to addaa4a on April 27, 2020
@gabrieldemarmiesse added the blocked (Pending something else's completion) label May 2, 2020
The two commits:

* Add dynamic decay to MovingAverage. Adding dynamic decay to MovingAverage improves early accuracy. When using dynamic decay, decay starts at 0.1 and gradually increases up to `average_decay`.

* Add ability to swap weights to MovingAverage. This patch makes it easier to swap the model weights and the MovingAverage weights before eval and swap them back after eval.
@tf-marissaw (Contributor Author)

Since #1770 has been merged, I have updated the patches to use `@pytest.mark.with_device([tf.distribute.MirroredStrategy])`. Please take a look and let me know if there is anything else I should do.
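
A rough sketch of what the updated test might look like. Only the marker line is quoted from the comment above; the `device` fixture and its `scope()`/`run()` interface are assumptions about the #1770 helpers, and the numbers in the assertion are illustrative:

```python
import numpy as np
import pytest
import tensorflow as tf
from tensorflow_addons.optimizers import MovingAverage


@pytest.mark.with_device([tf.distribute.MirroredStrategy])
def test_swap_weights(device):  # fixture name is an assumption
    with device.scope():
        var = tf.Variable([1.0, 2.0])
        opt = MovingAverage(
            tf.keras.optimizers.SGD(learning_rate=2.0), average_decay=0.5
        )

    @tf.function
    def apply_gradients():
        opt.apply_gradients([(tf.constant([0.1, 0.1]), var)])

    device.run(apply_gradients)  # assumed helper to step inside the strategy

    opt.swap_weights()  # cross-replica context: averages go into the model
    # SGD step: 1.0 - 2.0 * 0.1 = 0.8; average: 0.5 * 1.0 + 0.5 * 0.8 = 0.9
    np.testing.assert_allclose(var.read_value(), [0.9, 1.9])
    opt.swap_weights()  # restore the raw weights
```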

@gabrieldemarmiesse gabrieldemarmiesse removed the blocked Pending something elses completion label Jun 11, 2020
@tf-marissaw (Contributor Author)

@Squadrick Is there anything you need from me before you can review this pull request? We would like to submit it soon. Thanks!

@Squadrick (Member) left a comment

LGTM, thanks for the contribution!

@gabrieldemarmiesse Can you please merge this?

    a.assign_sub(b)
    return a

def swap(strategy, a, b):
Member

Love this trick for swapping.
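
For readers outside the diff view: the trick is the classic arithmetic in-place swap. Three assignments exchange two variables without allocating a temporary copy, which matters when the variables being swapped are every weight in a model. A standalone sketch (the function name and values are illustrative):

```python
import tensorflow as tf


def arithmetic_swap(a, b):
    """Exchange the contents of variables `a` and `b` with no temporary."""
    a.assign_add(b)  # a = a0 + b0
    b.assign(a - b)  # b = (a0 + b0) - b0 = a0
    a.assign_sub(b)  # a = (a0 + b0) - a0 = b0


a = tf.Variable([1.0, 2.0])
b = tf.Variable([3.0, 4.0])
arithmetic_swap(a, b)
print(a.numpy(), b.numpy())  # [3. 4.] [1. 2.]
```

The trade-off is that floating-point addition and subtraction can introduce tiny rounding differences; that is usually accepted in exchange for not doubling memory during the swap.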

@gabrieldemarmiesse (Member)

Thanks @tf-marissaw for the pull request and thanks @Squadrick for the review!

@gabrieldemarmiesse merged commit a0bfe3f into tensorflow:master Jun 18, 2020
ashutosh1919 pushed a commit to ashutosh1919/addons that referenced this pull request Jul 12, 2020
jrruijli pushed a commit to jrruijli/addons that referenced this pull request Dec 23, 2020