
want zoneout lstm supported #1867

Closed
breadbread1984 opened this issue May 21, 2020 · 12 comments
Comments

@breadbread1984

Describe the feature and the current behavior/state.

No zoneout LSTM is currently available in Addons.

Relevant information

Which API type would this fall under (layer, metric, optimizer, etc.)

tfa.layers

Who will benefit from this feature?

People implementing Tacotron 2 with the tf.keras API.

Any other info.

@failure-to-thrive
Contributor

I could handle it...

@breadbread1984
Author

Really appreciate it!

@failure-to-thrive
Contributor

So far so good! Any ideas on how to test it? Input values, random seed, expected output values?

@breadbread1984
Author

Speedy implementation! Given the same input and a zero initial state, it should output a tensor whose elements are either the same as the LSTM's or zero.

@failure-to-thrive
Contributor

Here it is! You can copy & paste the class into your code. An example of usage and tests are further down below. If everything works as intended, we could incorporate it into TFA.

import tensorflow as tf
from tensorflow.keras.layers import LSTMCell, RNN, LSTM

class ZoneoutLSTMCell(LSTMCell):
    def __init__(self, units, zoneout_h=0.0, zoneout_c=0.0, **kwargs):
        super().__init__(units, **kwargs)
        self.zoneout_h = zoneout_h  # zoneout rate for the hidden state
        self.zoneout_c = zoneout_c  # zoneout rate for the cell state

    def _zoneout(self, t, tm1, rate, training):
        # Bernoulli keep-mask: during training each unit keeps its previous
        # value tm1 with probability `rate`; at inference (training=False)
        # the threshold is 0, so the mask is all ones and the new value t
        # passes through unchanged.
        dt = tf.cast(tf.random.uniform(t.shape) >= rate * training, t.dtype)
        return dt * t + (1 - dt) * tm1

    def call(self, inputs, states, training=None):
        output, new_states = super().call(inputs, states, training)
        h = self._zoneout(new_states[0], states[0], self.zoneout_h, training)
        c = self._zoneout(new_states[1], states[1], self.zoneout_c, training)
        return h, [h, c]


x = tf.constant([[[1., 2, 3, 4, 5]]])
initial_state = [tf.constant([[11., 12, 13]]), tf.constant([[14., 15, 16]])]

tf.random.set_seed(0)
l0 = LSTM(3, return_state=True)
y0 = l0(x, initial_state=initial_state, training=True)
tf.print(y0)

tf.random.set_seed(0)
l = RNN(ZoneoutLSTMCell(3, zoneout_h=.3, zoneout_c=.5), return_state=True)
y = l(x, initial_state=initial_state, training=True)
tf.print(y)

@breadbread1984
Author

You need to save zoneout_h and zoneout_c in the layer's config dictionary.
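A minimal sketch of what that could look like: overriding get_config so the zoneout rates survive model serialization. The class is restated here so the snippet is self-contained; it assumes the ZoneoutLSTMCell from the code above.

```python
import tensorflow as tf
from tensorflow.keras.layers import LSTMCell


class ZoneoutLSTMCell(LSTMCell):
    def __init__(self, units, zoneout_h=0.0, zoneout_c=0.0, **kwargs):
        super().__init__(units, **kwargs)
        self.zoneout_h = zoneout_h
        self.zoneout_c = zoneout_c

    def get_config(self):
        # Merge the parent LSTMCell config with the zoneout-specific
        # fields so tf.keras can reconstruct the cell via from_config.
        config = super().get_config()
        config.update({"zoneout_h": self.zoneout_h, "zoneout_c": self.zoneout_c})
        return config
```

With this in place, `ZoneoutLSTMCell.from_config(cell.get_config())` round-trips the zoneout rates along with the standard LSTMCell settings.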

@breadbread1984
Author

I get unsupported operand type(s) for *: 'float' and 'NoneType' at the line "dt = tf.cast(tf.random.uniform(t.shape) >= rate * training, t.dtype)".

@failure-to-thrive
Contributor

> You need to save zoneout_h and zoneout_c in the layer's config dictionary.

Sure. The code above is just an algorithm implementation. If it is OK, we can move forward.

> I get unsupported operand type(s) for *: 'float' and 'NoneType' at the line "dt = tf.cast(tf.random.uniform(t.shape) >= rate * training, t.dtype)".

Could you describe the environment where it happened?

@breadbread1984
Author

Sorry, I was using the layer the wrong way. I can now run the code successfully.
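For anyone who hits the same error: it occurs when the cell is called without a `training` argument, so `rate * training` multiplies a float by `None`. One possible guard (my own suggestion, not part of the code above) is to normalize `training` before using it in the mask computation:

```python
import tensorflow as tf


def zoneout(t, tm1, rate, training=None):
    """Zoneout step that tolerates training=None by treating it as inference."""
    if training is None:
        training = False
    # At inference the keep-probability threshold becomes 0, so the mask is
    # all ones and the new state t passes through unchanged.
    dt = tf.cast(tf.random.uniform(tf.shape(t)) >= rate * float(training), t.dtype)
    return dt * t + (1.0 - dt) * tm1
```

Calling `zoneout(new_state, old_state, 0.5)` with no training flag then behaves like inference instead of raising a TypeError.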

@breadbread1984
Author

@failure-to-thrive I found a flaw in your implementation: the hidden state is computed from the LSTM's cell state, but in zoneout LSTM the hidden state should be computed from the zoned-out cell state.

@failure-to-thrive
Contributor

I've simply implemented the generalized form from the original academic paper. There are lots of variations, which are mentioned in the paper too, but I'm not sure whether all of them should be implemented within the scope of TFA. Any ideas?
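For context, my reading of the paper's generalized form: during training each unit keeps its previous value with probability given by the zoneout rate (a Bernoulli mask), and at test time the expected value of that mixture is used, analogous to how dropout rescales at inference. A minimal sketch of the two regimes (function names are mine, not from the code above):

```python
import tensorflow as tf


def zoneout_train(new, prev, rate, seed=None):
    # Training: each unit keeps its previous value with probability `rate`
    # (Bernoulli mask), otherwise takes the freshly computed value.
    mask = tf.cast(tf.random.uniform(tf.shape(new), seed=seed) < rate, new.dtype)
    return mask * prev + (1.0 - mask) * new


def zoneout_eval(new, prev, rate):
    # Inference: use the expected value of the training-time mixture.
    return rate * prev + (1.0 - rate) * new
```

Note that the implementation above simply passes the new state through at inference rather than taking this expectation, which is one of the variations the paper discusses.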

@seanpmorgan
Member

TensorFlow Addons is transitioning to a minimal maintenance and release mode. New features will not be added to this repository. For more information, please see our public messaging on this decision:
TensorFlow Addons Wind Down

Please consider sending feature requests / contributions to other repositories in the TF community with charters similar to TFA's:
Keras
Keras-CV
Keras-NLP
