Added `data_format` to flatten layer #9696
Conversation
The CNTK/Theano backends seem unhappy with the implementation. You need to use `K.ndim`.
You are correct about `K.ndim`; I'm unsure whether that will resolve the dynamic axis issue with CNTK.
The issue persists with CNTK. @souptc, could you please advise on how to fix it?
My bad, it is a bug in the CNTK backend. In cntk_backend.py, method `permute_dimensions`, line 1115: here `current_layout` is a tuple but `pattern` could be a list, so the comparison may fail unexpectedly. Do you mind fixing this in this PR?
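As an illustration (mine, not from the PR), a minimal Python sketch of the pitfall being described: equality between a tuple and a list is always `False` in Python, even when the elements match, so the layout check can fail unexpectedly.

```python
# Sketch of the bug souptc describes: tuple == list is always False,
# so a layout comparison can fail even when the axes actually match.
current_layout = (0, 2, 3, 1)   # stored as a tuple
pattern = [0, 2, 3, 1]          # arrives as a list

print(current_layout == pattern)          # False: types differ
print(current_layout == tuple(pattern))   # True: normalize before comparing
```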
Thanks @souptc, that has fixed it.
keras/utils/test_utils.py (Outdated)
@@ -116,6 +116,9 @@ def layer_test(layer_cls, kwargs={}, input_shape=None, input_dtype=None,
    # test as first layer in Sequential API
    layer_config = layer.get_config()
    # deals with data_format in flatten
    if 'data_format' in kwargs:
        layer_config['data_format'] = kwargs['data_format']
Revert this change and add support for `data_format` in `get_config` for this layer.
Forgot about that func; I've updated to reflect your comments.
@fchollet Anything more to do?
keras/layers/core.py (Outdated)
@@ -465,6 +466,13 @@ def get_config(self):
class Flatten(Layer):
    """Flattens the input. Does not affect the batch size.

    Arguments:
Format: it's `# Arguments` (and no semicolon).
Done
LGTM, thanks!
…ack-embeddings-from-layer-outputs

* upstream/master: (68 commits)
  fit/evaluate_generator supporting native tensors (keras-team#9816)
  keras-team#9642 Add kwarg and documentation for dilation_rate to SeparableConvs (keras-team#9844)
  Document that "same" is inconsistent across backends with strides!=1 (keras-team#9629)
  Improve tests by designating dtype of sample data (keras-team#9834)
  Add documentation for 'subset' and 'interpolation' arguments (ImageDataGenerator) (keras-team#9817)
  Revert default theme to readthedocs
  Various docs fixes.
  Fix conflict
  Add support for class methods documentation (keras-team#9751)
  Add missing verbose opt for evaluate_generator (keras-team#9811)
  Added `data_format` to flatten layer. (keras-team#9696)
  Allow saving models directly to binary stream (keras-team#9789)
  Fix ctc_batch_cost() error when batch_size = 1 (keras-team#9775)
  Fix keras-team#9802 (keras-team#9803)
  Fix error in ImageDataGenerator documentation (keras-team#9798)
  fix typo (keras-team#9792)
  keras-team#9733: Extend RemoteMonitor to send data as application/json (keras-team#9734)
  Fixed inconsistencies regarding ReduceLROnPlateau (keras-team#9723)
  Fix doc issue.
  General stateful metrics fixes (keras-team#9446)
  ...
* Added data_format to flatten
* Added flatten tests
* Fixed Tests
* Added more dimension tests
* Reverted TF backend change
* Reverted
* Fixed CI Problems
* Altered to K.ndim for compatibility
* Updated CNTK backend
* Updated to match comments
* Updated Docs
I don't understand why we would always convert to NHWC.
In what way? This is to make it easier to productise when going from NCHW to NHWC. The dimension ordering of your 4D tensor matters before your dense layers; if you don't account for these two formats, you get lots of regression tests breaking when you do the conversion at inference time. Why is it suboptimal if it's a flatten layer, ignoring the one extra transpose added when using NCHW? I don't fully understand your final point.
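To make the ordering point concrete, a small sketch (mine, not from the PR): the same data flattened under NCHW versus NHWC yields different element orders, so Dense weights trained against one order will not line up with the other.

```python
import numpy as np

# Same underlying data in the two layouts.
x_nchw = np.arange(24).reshape(1, 2, 3, 4)    # (batch, channels, height, width)
x_nhwc = np.transpose(x_nchw, (0, 2, 3, 1))   # (batch, height, width, channels)

# Flattening each layout directly gives different element orderings.
print(x_nchw.reshape(1, -1)[0, :6])  # [0 1 2 3 4 5]
print(x_nhwc.reshape(1, -1)[0, :6])  # [ 0 12  1 13  2 14]
```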
It's suboptimal as it creates ops in the graph when training.

I think this happened: with good intent, you created this initializer:

+ def __init__(self, data_format='channels_last', **kwargs):

However, then this change happened in 243338d, where I think someone thought this was an error:

- def __init__(self, data_format='channels_last', **kwargs):
+ def __init__(self, data_format=None, **kwargs):

which suddenly is not backwards compatible anymore, and changes the behaviour of both training and inference. I suggest reverting 243338d.
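To spell out why the `None` default matters, a hedged sketch: assuming `None` falls back to the global Keras image data format, the layer's behaviour then depends on the environment's config rather than being fixed at `'channels_last'`. `FlattenSketch` is a hypothetical stand-in, not the real layer.

```python
from keras import backend as K

class FlattenSketch:
    """Hypothetical stand-in showing how a None default would be resolved."""
    def __init__(self, data_format=None):
        # None -> resolved from the global config (~/.keras/keras.json),
        # so behaviour now depends on where the model is constructed.
        self.data_format = data_format or K.image_data_format()

print(FlattenSketch().data_format)                  # environment-dependent
print(FlattenSketch('channels_last').data_format)   # always 'channels_last'
```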
Ah, now I see the confusion! Yes, I agree.
If you create an inference graph in NHWC but have trained in NCHW, you will realise that your inference predictions differ from train time. This is because the flatten layer does not respect channel orderings, so I have added a `data_format` keyword arg such that the input is always reshaped to NHWC before flattening; for the dense layers that follow, the ordering does not matter.
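A short usage sketch of the feature this PR adds, with illustrative shapes: `data_format='channels_first'` tells `Flatten` the input is NCHW, so the flattened features come out in a consistent order for the `Dense` layer.

```python
from keras.models import Sequential
from keras.layers import Flatten, Dense

model = Sequential([
    # NCHW input; data_format tells Flatten that the channel axis comes
    # first, so it can reorder before flattening.
    Flatten(data_format='channels_first', input_shape=(3, 32, 32)),
    Dense(10, activation='softmax'),
])
model.summary()  # Flatten output shape: (None, 3072)
```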