Point optimizer to tf.keras.optimizer.legacy.Optimizer to be compatib… #2706

Merged
merged 14 commits into tensorflow:master on Jun 19, 2022

Conversation

chenmoneygithub
Contributor

Description

Brief Description of the PR:
Keras has made a new version of the optimizer (check the link here). In a future TF/Keras release, tf.keras.optimizers.XXX will point to the new optimizer, and the old optimizers will continue to be supported under the legacy namespace until further notice. To avoid code breakage, this PR replaces tf.keras.optimizers.Optimizer with tf.keras.optimizers.legacy.Optimizer.
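
For illustration, a minimal sketch of the kind of change described above. ExampleOptimizer is a hypothetical class, not an actual TFA optimizer, and the snippet assumes a TF version where the legacy namespace exists:

import tensorflow as tf

# Before (sketch): addon optimizers subclass the current public base class.
# class ExampleOptimizer(tf.keras.optimizers.Optimizer):
#     ...

# After (sketch): subclass the legacy alias, so behavior is unchanged once
# tf.keras.optimizers.Optimizer is repointed to the new optimizer.
class ExampleOptimizer(tf.keras.optimizers.legacy.Optimizer):
    def __init__(self, learning_rate=0.01, name="ExampleOptimizer", **kwargs):
        super().__init__(name, **kwargs)
        self._set_hyper("learning_rate", learning_rate)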

Fixes # (issue)

Type of change

Checklist:

  • I've properly formatted my code according to the guidelines
    • By running Black + Flake8
    • By running pre-commit hooks
  • This PR addresses an already submitted issue for TensorFlow Addons
  • I have made corresponding changes to the documentation
  • I have added tests that prove my fix is effective or that my feature works
  • This PR contains modifications to C++ custom-ops

How Has This Been Tested?

No new test cases are required; we just need to make sure the existing tests are not broken by the change.

@bot-of-gabrieldemarmiesse

@rehanguha @lokhande-vishnu @szutenberg @juntang-zhuang @RaphaelMeudec @PhilJd @manzilz @junjiek @CyberZHG @pkan2 @hyang0129

You are owners of some files modified in this pull request.
Would you kindly review the changes whenever you have the time to?
Thank you very much.

@manzilz
Contributor

manzilz commented May 19, 2022

changes to yogi.py: LGTM

@juntang-zhuang
Contributor

changes to adabelief.py LGTM!

Contributor

@rehanguha rehanguha left a comment

tensorflow_addons/optimizers/cocob.py changes LGTM.

@bhack
Contributor

bhack commented May 19, 2022

@chenmoneygithub To understand this a little bit:

/cc @seanpmorgan

@chenmoneygithub
Contributor Author

@bhack Thanks for your questions! Re your questions:

  • Yes, after this PR addon optimizers will be based on tf.keras.optimizers.legacy.Optimizer, which is an alias to the current tf.keras.optimizers.Optimizer. We will later check how to make them compatible with the new optimizer, which would need some refactor work.
  • Do you mean AdamW? We are supporting AdamW in Keras now because it seems to be very popular.
  • V2 optimizer is what people use now. I apologize for the misleading name, but basically V2 optimizers were made at the same time as TF2, so V2 means it supports TF2. The experimental optimizer is made to replace all those V2 optimizers.

Hope this answers your questions!

@bhack
Contributor

bhack commented May 19, 2022

Yes, thanks for the confirmation.

We will later check how to make them compatible with the new optimizer, which would need some refactor work.

I have some doubts on this point, as it really depends on the Keras, Keras-CV, and Keras-NLP roadmaps and how these will overlap with the existing "legacy" components here.

If a refactor is needed, it is probably better to evaluate whether each optimizer should be absorbed, with the new inheritance, into one of these Keras-* repos.

@chenmoneygithub
Contributor Author

chenmoneygithub commented May 19, 2022

@bhack Yes, I have exactly the same thought - the new optimizer-based implementations should be done in the Keras repo rather than as in-place code changes to addons.optimizers.

@bhack
Contributor

bhack commented May 19, 2022

Ok.
For this PR you will need to handle all the version conditionals, as we are going to support and produce wheels for the last 3 TF releases.

@chenmoneygithub
Contributor Author

Got it, will reflect it in the code.

@seanpmorgan
Member

Hi @chenmoneygithub, thanks for the PR and for keeping us in the loop. As you can see from our CI, this appears to be a significant breaking change. TF Addons supports 3 TensorFlow/Keras versions during each release, so implementing this change would require some workarounds on our end. I'm wondering if this is implemented as intended on the Keras side?

Has legacy.optimizer been available / has the new optimizer been part of experimental for a few releases? Typically Keras is quite good at backwards compatibility.

cc @fchollet

@seanpmorgan
Member

I didn't see an RFC in keras governance or anything on this, so I suspect more downstream consumers (other than TFA) will be surprised.

@bhack
Contributor

bhack commented May 23, 2022

Also, it isn't explained what these legacy optimizers are in the official documentation:
https://www.tensorflow.org/api_docs/python/tf/keras/optimizers/legacy

@chenmoneygithub
Contributor Author

@seanpmorgan yes, your concern is very reasonable.

For some background context: in this release (TF/Keras 2.9), our goal is to push out the experimental optimizer, mainly for internal adoption. Making the experimental optimizer the default will happen in a later release (2.10 or 2.11), and deprecating the old optimizer would happen in a much later release (we have not decided which one yet!), but we made this legacy optimizer, which is a mirror of tf.keras.optimizers.XXX, to prepare for the future deprecation.

For the question about breakage: the actions failed due to a missing symbol, not functionality. I added a version check, which means the legacy namespace is only used for versions after 2.8. We don't want to create extra work for OSS contributors, so we prefer to handle the implementation on our side.

@bhack Yea, this is something we will fix. The page should be clearer.

@bhack
Contributor

bhack commented May 23, 2022

For some background context: in this release (TF/Keras 2.9), our goal is to push out the experimental optimizer, mainly for internal adoption. Making the experimental optimizer the default will happen in a later release (2.10 or 2.11), and deprecating the old optimizer would happen in a much later release (we have not decided which one yet!), but we made this legacy optimizer, which is a mirror of tf.keras.optimizers.XXX, to prepare for the future deprecation.

I suppose that @seanpmorgan meant that he didn't find any RFC in:
https://github.com/keras-team/governance
https://github.com/tensorflow/community/pulls

Do you mean it will land there before the official deprecation?

I added a version check, which means the legacy namespace is only used for versions after 2.8. We don't want to create extra work for OSS contributors, so we prefer to handle the implementation on our side.

If you have added this in Keras directly and you will not do a patch release, we need to wait for the next stable release to cherry-pick this PR, as we switch TF/Keras versions with every new TF stable release/rc.

@chenmoneygithub
Contributor Author

@bhack Thanks!

Yes, we will make a public notice before taking migration/deprecation actions; right now we are focusing on internal optimizer adoption.

Yea at this moment it's hard to edit the page.

@bhack
Contributor

bhack commented May 23, 2022

Yea at this moment it's hard to edit the page.

What do you mean? I was referring to our dependency on a stable TF release in TFA master instead of TF nightly.

@chenmoneygithub
Contributor Author

@bhack Sorry, I misread your previous comment; I thought you meant the documentation of legacy.Optimizer.

Re your previous comment - I don't think cherry-picking is required? I am saying there are two options here:

  1. addons contributors change the code to implement optimizers based on tf.keras.optimizers.experimental.XXX.

  2. We move addons optimizers into main Keras by reimplementing them based on tf.keras.optimizers.experimental.XXX. e.g., tf.keras.optimizers.lamb.

Option 2 is actually the better one, but it puts more workload on contributors.

@bhack
Contributor

bhack commented May 23, 2022

I added a version check.

I was talking about this. Can you point me to this version check?

@chenmoneygithub
Contributor Author

@bhack My previous push failed due to some merge conflict, I will ping you once I get the latest code uploaded.

@bhack
Contributor

bhack commented May 23, 2022

My previous push failed due to some merge conflict, I will ping you once I get the latest code uploaded.

Ok thanks.

More in general:

We move addons optimizers into main Keras by reimplementing them based on tf.keras.optimizers.experimental.XXX. e.g., tf.keras.optimizers.lamb.

It could be ok for us but you need to consider:

  1. A little bit of deprecation margin https://github.com/tensorflow/addons#periodic-evaluation-of-subpackages
  2. I suppose you will eventually fix the Keras.io repository yourselves for the TFA-dependent tutorials/notebooks in case of a breaking optimizers API change
  3. We are a small project but we have accumulated over time a little bit of downstream dependencies:
    [image: downstream dependents]

@bhack
Contributor

bhack commented May 23, 2022

Ok, now I see that we are back to the original idea of conditionally controlling the namespace downstream (in TFA directly).

@seanpmorgan
Member

Ok, now I see that we are back to the original idea of conditionally controlling the namespace downstream (in TFA directly).

Yes, this is what I was afraid of. This PR is sub-optimal because it makes downstream consumers inherit the debt of a quickly implemented public API change to Keras.

@chenmoneygithub @fchollet is there anywhere publicly facing from the Keras perspective where we can discuss this? TFA is not the only one impacted by this.

cc @joanafilipa

@chenmoneygithub
Contributor Author

@bhack I just updated the code with the version check part.

Yes, I am aware of the large user group of TF Addons, and I never want to break their workflows. On the Keras side it remains unclear to us when we can completely remove the code of the old optimizer; personally I feel it is a hard task. If you are curious about why we are making the new optimizer even though we are not removing the old code in the short term: briefly, we feel the old optimizer's logic and layout are bad, so we replaced things like slot variables and fused ops with more understandable code, and split out the distributed training part into a separate class.

@bhack
Contributor

bhack commented May 23, 2022

Yes, I am aware of the large user group of TF Addons, and I never want to break their workflows. On the Keras side it remains unclear to us when we can completely remove the code of the old optimizer; personally I feel it is a hard task. If you are curious about why we are making the new optimizer even though we are not removing the old code in the short term: briefly, we feel the old optimizer's logic and layout are bad, so we replaced things like slot variables and fused ops with more understandable code, and split out the distributed training part into a separate class.

Personally I agree with Sean's vision.
As these are public symbols, it seems to me a little bit too late to present an RFC in Keras/governance or in TF/community only when we effectively want to remove these symbols (since by RFC time they will already be legacy).

For a top namespace change like tf.keras.* I suppose that an early RFC is required to involve the community, SIGs, and, more generally, downstream projects.

This time it went like this, probably for good reasons (I don't know), but I would not feel like promoting this process in general for our community/ecosystem.

@chenmoneygithub
Contributor Author

We discussed an RFC when we kicked off the project, and did not go for it because the main changes happen in internal logic, while the public API mostly remains identical.

Our rollout journey would be hidden from most users: basically the tf.keras.optimizers namespace will point to tf.keras.optimizers.experimental in a future release (this info is available in the release notes of TF 2.9), and users will switch to the new optimizer silently. In the 2.9 window, we will mainly work with internal teams to fix unseen errors during testing, and later on we will publish a public notice for required actions. The addons optimizer is the most special case: it is written on top of the Keras optimizer and itself has a large user group, so I am prioritizing this action now.

@bhack
Contributor

bhack commented May 23, 2022

I could understand this for end users, eventually, but not for derived/ecosystem projects.
Do you mean that using the Keras public API to create derived objects in a third-party library, and in this case a TF ecosystem library, is no longer a supported use case?

I don't think that TFA is a special case for derived projects.

@chenmoneygithub
Contributor Author

@bhack Refactored the code, please take another look when you are free, thanks!

@@ -14,6 +14,7 @@
# ==============================================================================
"""Additional optimizers that conform to Keras API."""

from tensorflow_addons.optimizers.constants import BASE_OPTIMIZER_CLASS
Contributor

this breaks alphabetic order

Contributor Author

This order matters; if this line does not go before the other optimizer imports, it creates a cyclic import.

raise TypeError(
"optimizer is not an object of tf.keras.optimizers.Optimizer"
)
if tf.__version__[:3] <= "2.8":
Contributor

why do we need this condition? What if we wrote (tf.keras.optimizers.Optimizer, BASE_OPTIMIZER_CLASS)?

Contributor Author

I did this so the error message would be accurate, but rethinking it, I am now doing the (tf.keras.optimizers.Optimizer, BASE_OPTIMIZER_CLASS) check and implying that after 2.9 you should expect tf.keras.optimizers.legacy.Optimizer.
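
As a rough sketch of the tuple-based check being discussed (the helper name and the exact error message here are illustrative, not the code in this PR):

import tensorflow as tf

# Resolve the base class: use the legacy alias when it exists (TF >= 2.9 per
# this thread), otherwise fall back to the current Optimizer.
try:
    BASE_OPTIMIZER_CLASS = tf.keras.optimizers.legacy.Optimizer
except AttributeError:
    BASE_OPTIMIZER_CLASS = tf.keras.optimizers.Optimizer


def validate_optimizer(optimizer):
    """Raise a TypeError if `optimizer` is not a supported Keras optimizer."""
    if not isinstance(
        optimizer, (tf.keras.optimizers.Optimizer, BASE_OPTIMIZER_CLASS)
    ):
        raise TypeError(
            "optimizer is not an object of tf.keras.optimizers.Optimizer "
            "or tf.keras.optimizers.legacy.Optimizer"
        )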

@@ -0,0 +1,5 @@
import tensorflow as tf
Contributor

missing copyright header

Contributor Author

done

@@ -0,0 +1,5 @@
import tensorflow as tf

BASE_OPTIMIZER_CLASS = tf.keras.optimizers.legacy.Optimizer
Contributor

Why not if ... else? It will crash if tf.keras.optimizers.legacy doesn't exist (TF 2.8), right?

Are you sure it has to be CAPITAL_LETTER?

I'd rename it to KerasLegacyOptimizer (or KERAS_LEGACY_OPTIMIZER - I'm not an expert in conventions 😄 ) and rename the file to keras.py.

Contributor Author

Thanks! I changed the check condition.

For the naming, yea, we probably want camel case since it is a class itself. But I don't want to imply Legacy in the name, which is not yet 100% correct - we still use tf.keras.optimizers.Optimizer in many places. I am renaming it to BaseOptimizerClass, wdyt?

Contributor

@chenmoneygithub - I expect that after the refactor in Keras is done, we will be gradually switching back to tf.keras.optimizers.Optimizer. BaseOptimizerClass will be confusing then because it will be the name for the legacy class.

Contributor Author

sg, renamed to KerasLegacyOptimizer.
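
A sketch of what the renamed alias could end up looking like. The module path and the existence check are assumptions here (later review comments in this thread settle on checking for the legacy namespace rather than comparing version strings):

# Sketch of tensorflow_addons/optimizers/constants.py (assumed path/name).
import importlib.util

import tensorflow as tf

if importlib.util.find_spec("tensorflow.keras.optimizers.legacy") is not None:
    # Newer TF: point at the legacy alias so current behavior is preserved
    # when tf.keras.optimizers.Optimizer is repointed to the new optimizer.
    KerasLegacyOptimizer = tf.keras.optimizers.legacy.Optimizer
else:
    # Older TF: the legacy namespace does not exist yet.
    KerasLegacyOptimizer = tf.keras.optimizers.Optimizer

Addon optimizers would then subclass KerasLegacyOptimizer instead of referencing the TF symbol directly.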

import tensorflow as tf

BASE_OPTIMIZER_CLASS = tf.keras.optimizers.legacy.Optimizer
if tf.__version__[:3] <= "2.8":
Contributor

As I wrote previously:

>>> tf.__version__
'2.10.0-dev20220531'
>>> tf.__version__[:3]
'2.1'
>>> tf.__version__[:3] > "2.8"
False

Please fix this condition (also in other files) or just check if tf.keras.optimizers.legacy exists; maybe the code would be cleaner then...
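
For illustration, the failure mode and the two suggested fixes as a minimal sketch (which TF versions ship the legacy namespace is an assumption here):

from packaging.version import Version

import tensorflow as tf

# The string slice misorders two-digit minor versions:
assert "2.10.0"[:3] == "2.1"            # so "2.10" compares as below "2.8"
assert not ("2.10.0"[:3] > "2.8")

# Option A: a proper semantic version comparison.
if Version(tf.__version__) >= Version("2.9"):
    base_optimizer = tf.keras.optimizers.legacy.Optimizer
else:
    base_optimizer = tf.keras.optimizers.Optimizer

# Option B: simply check whether the legacy namespace exists at all.
if hasattr(tf.keras.optimizers, "legacy"):
    base_optimizer = tf.keras.optimizers.legacy.Optimizer
else:
    base_optimizer = tf.keras.optimizers.Optimizer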

Contributor Author

good catch! done

@@ -16,11 +16,12 @@
import tensorflow as tf
from tensorflow_addons.utils import types
Contributor

isn't it required to update types.Optimizer?

Contributor Author

good catch! done

@@ -17,12 +17,13 @@
import tensorflow as tf
from tensorflow_addons.utils.types import FloatTensorLike

from tensorflow_addons.optimizers import BASE_OPTIMIZER_CLASS
Contributor

this line should be added after line 17

Contributor Author

done

@@ -401,13 +401,17 @@ def test_var_list_with_exclude_list_sgdw(dtype):
)


if tf.__version__[:3] > "2.8":
Member

Use from packaging.version import Version

Contributor Author

thanks! I am switching to checking whether "optimizers.legacy" exists, to keep things consistent.

@@ -27,8 +27,14 @@
from typing import Union, Callable


if tf.__version__[:3] > "2.8":
Member

@fsx950223 fsx950223 Jun 3, 2022

Use from packaging.version import Version

@@ -261,10 +261,17 @@ def _do_use_weight_decay(self, var):
return var.ref() in self._decay_var_list


optimizer_class = Union[
Member

Why not BaseOptimizerClass?

Contributor Author

I am renaming this to keras_legacy_optimizer to keep it aligned with the changes suggested by Michal.

Optimizer = Union[tf.keras.optimizers.Optimizer, str]
if importlib.util.find_spec("tensorflow.keras.optimizers.legacy") is not None:
Optimizer = Union[
tf.keras.optimizers.Optimizer, tf.keras.optimizers.legacy.Optimizer, str
Member

Same as above

Contributor Author

Here I am keeping the name "Optimizer" so it stays unchanged.
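
For context, a hypothetical consumer of the widened alias might use it like this (the helper below is illustrative and not part of this PR):

import tensorflow as tf
from tensorflow_addons.utils import types


def resolve_optimizer(optimizer: types.Optimizer):
    """Accept a string identifier, a current Optimizer, or a legacy Optimizer."""
    # tf.keras.optimizers.get handles both string identifiers and instances.
    return tf.keras.optimizers.get(optimizer)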

@chenmoneygithub
Contributor Author

Hi there, sorry for not updating this for a while. I have been terribly sick for the last few days and will update this PR when I can work. Thanks!

@bhack bhack requested a review from fsx950223 June 10, 2022 19:23
@fsx950223
Member

Please check ci/cd.

@chenmoneygithub
Contributor Author

@fsx950223 I checked the failure log, and it seems to ask for type annotations in the source Keras optimizer code, not in the changes I made to addons. Can I ask what we should do here?

@chenmoneygithub
Contributor Author

Hi maintainers, the current error message is:

'keras.optimizers.optimizer_v2.optimizer_v2.Optimizer.__init__' has not complete type annotations in its signature (it's missing the type hint for 'name'). We would like this functions to be typed. If you are not familiar with adding type hints in functions, you can look at functions already typed in the codebase.

This does not look like a new issue introduced by this PR. Could anyone help check and give some suggestions on how we should handle it? https://github.com/tensorflow/addons/runs/6836413149?check_suite_focus=true

I would like to have this PR submitted soon to avoid potential breakage of addons.optimizers in the next TF release.

@bhack
Contributor

bhack commented Jun 18, 2022

@chenmoneygithub
Contributor Author

@bhack Thanks! But the error points to the main Keras repo rather than the classes I am changing in the addons repo. Is there any way to exempt the check?

@bhack
Contributor

bhack commented Jun 18, 2022

@boring-cyborg boring-cyborg bot added the test-cases Related to Addons tests label Jun 18, 2022
@fsx950223 fsx950223 merged commit 339159f into tensorflow:master Jun 19, 2022
Labels
optimizers, test-cases (Related to Addons tests)

9 participants