Static Learning Phase ignored on CNTK #9946

Closed

Conversation

@datumbox (Contributor) commented Apr 15, 2018

While working on a bigger PR about the behaviour of the Learning Phase in BatchNormalization, I noticed a bug in the CNTK backend that affects how it handles static learning phases. It seems that when the user sets a static learning_phase, it is ignored. I believe the bigger PR requires building a case and thorough discussion, so I decided to split the changes into two parts (I will soon document the problem with BN and submit another PR).

What's the problem on CNTK:

  • The current implementation of learning_phase on CNTK requires it to always be a tensor; it can never be an integer, even when it is set statically.
  • Unfortunately, multiple methods in training.py check isinstance(K.learning_phase(), int) to decide whether the learning phase is static (sketched below).
  • Since the learning_phase on CNTK is never an integer, calling K.set_learning_phase(1) is silently ignored.
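To make the failure mode concrete, here is a simplified sketch of the pattern (illustrative only; the branch structure and variable names are not the actual Keras code):

```python
from keras import backend as K

K.set_learning_phase(1)  # user statically requests training mode

# Simplified version of the check that several methods in
# keras/engine/training.py perform when building the train/test
# functions (illustrative, not the exact Keras code):
if isinstance(K.learning_phase(), int):
    # Static phase: the value is baked into the graph, nothing extra is fed.
    extra_inputs = []
else:
    # Dynamic phase: learning_phase is a tensor/placeholder that must
    # be fed alongside the data at call time.
    extra_inputs = [K.learning_phase()]

# On the TensorFlow and Theano backends, set_learning_phase(1) makes
# K.learning_phase() return the plain int 1, so the static branch runs.
# On the CNTK backend (before this fix), K.learning_phase() is always a
# tensor, so the code falls into the dynamic branch and the user's
# static setting is effectively ignored.
print(type(K.learning_phase()))
```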

Things to know about this PR:

  • In the first commit of this PR, I added a test which fails on the current master. The test shows that BatchNormalization operates in test mode even when the user explicitly sets learning_phase to 1 (a rough sketch of this kind of test follows the list). After applying the patch, all tests pass.
  • This fix addresses the problem, but it's not an elegant solution. If we don't like it, I'm happy to update or close the PR so someone else can provide a better solution.
  • I have never worked with the CNTK backend before, so I would really appreciate a thorough review from any of the people who wrote it (@taehoonlee, @fchollet, @souptc or @ozabluda).
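For reference, the failing behaviour can be demonstrated with a test along these lines (a rough sketch under my own simplifying assumptions, not the exact test added in the first commit):

```python
import numpy as np
from keras import backend as K
from keras.layers import BatchNormalization
from keras.models import Sequential

K.set_learning_phase(1)  # statically force training mode

model = Sequential([BatchNormalization(input_shape=(4,))])
model.compile(optimizer='sgd', loss='mse')

# Data with a large mean and spread, far away from the layer's initial
# moving statistics (moving mean 0, moving variance 1).
data = np.random.normal(loc=5.0, scale=10.0, size=(1000, 4))
out = model.predict(data)

# If the static learning phase is honoured, BatchNormalization uses the
# batch statistics and the output is roughly zero-mean / unit-variance.
# If the setting is silently ignored (the CNTK bug), the layer runs in
# test mode with its initial moving averages, so the output keeps the
# input's large mean and standard deviation and this assertion fails.
assert np.abs(out.mean()) < 1.0 and out.std() < 2.0
```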

@souptc (Contributor) commented Apr 15, 2018

Thanks @datumbox for bringing this up. It looks like when I implemented "learning_phase" in cntk_backend, my understanding was not accurate. Let me make sure I understand correctly now:

In Keras, "learning_phase" is by default a scalar placeholder that is an input to the graph. But if the user calls "set_learning_phase" with a given value, "learning_phase" turns into a static flag. When creating the Keras train/test functions, if "learning_phase" is not an integer (the default case), we append the "learning_phase" flag to the inputs.

Is my understanding correct? If yes, I think I can wrap this logic inside cntk_backend without impacting the Keras layers. Let me try some experiments and come back later.
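For illustration, one way to wrap this inside a backend is to store either a placeholder or a plain int in the backend's own state (a minimal sketch with made-up names, not the actual cntk_backend code):

```python
class _ScalarPlaceholder(object):
    """Stand-in for the backend's scalar placeholder tensor (illustrative only)."""
    def __init__(self, name):
        self.name = name


_LEARNING_PHASE = None  # a placeholder by default, a plain int once set statically


def learning_phase():
    """Return the current learning phase: a tensor by default, an int if static."""
    global _LEARNING_PHASE
    if _LEARNING_PHASE is None:
        # Default case: a scalar "placeholder" that the train/test
        # functions feed as an extra graph input.
        _LEARNING_PHASE = _ScalarPlaceholder(name='keras_learning_phase')
    return _LEARNING_PHASE


def set_learning_phase(value):
    """Statically fix the learning phase to 0 (test) or 1 (train)."""
    global _LEARNING_PHASE
    if value not in {0, 1}:
        raise ValueError('Expected learning phase to be 0 or 1.')
    # Storing a plain int means isinstance(K.learning_phase(), int)
    # checks in training.py succeed, so the flag stops being appended
    # to the function inputs and the static value takes effect.
    _LEARNING_PHASE = value
```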

One question: what happens if the user initially creates some Keras layers with the default "learning_phase" setting (the dynamic one), and then calls "set_learning_phase" to make it a static flag for other layers? Is this a valid case?

@datumbox (Contributor, Author) commented

Hey @souptc, thanks for having a look, much appreciated mate.

I believe your description is accurate, based on what I saw in the code of the other backends. I also see your point about switching from dynamic to static; effectively, changing it like that can mess up the graph. I typically avoid switching learning_phase around while I'm in the middle of operations.

@fchollet (Member) commented

> Is my understanding correct? If yes, I think I can wrap this logic inside cntk_backend without impacting the Keras layers.

Sounds good to me. This would be preferable.

@datumbox (Contributor, Author) commented

Agreed. :)

@souptc could you please let me know when your PR is submitted? I have another PR that depends on fixing this issue, and I'll close this one as soon as yours is merged.

@datumbox (Contributor, Author) commented

@fchollet quick question: what is the timeline for releasing the next Keras version? I'm working on a change to the behaviour of the BatchNormalization layer; the code is ready, but I need to document the "what and why". Any clue when the next release will happen?

To cut a long story short, my proposal is to make BN operate in test mode when trainable = False. I'll follow up with theoretical explanations of why we want this, examples, and a properly documented PR.

@souptc mentioned this pull request Apr 16, 2018
@souptc (Contributor) commented Apr 16, 2018

@datumbox, I created PR #9952 to address this. I also included your test case. Please help take a look.

@datumbox (Contributor, Author) commented Apr 17, 2018

Closing this PR, as @souptc's solution is more elegant and localized to the CNTK backend.

@datumbox closed this Apr 17, 2018