
Regression by PhasedLSTM with a gradient explosion #21

Open

hnchang opened this issue Apr 17, 2020 · 1 comment

hnchang commented Apr 17, 2020

Hello,

When I used PhasedLSTM (PLSTM) to perform regression (to find the correlation between an input sequence and an output sequence), I got "nan" in the weights and in the loss at the beginning of the first epoch, even though I used gradient clipping.

The training data are generated as follows (slightly modified from https://fairyonice.github.io/Extract-weights-from-Keras's-LSTM-and-calcualte-hidden-and-cell-states.html):

[image: training_partial_samples]

The optimizer is as follows:
model.compile(loss="mean_squared_error", sample_weight_mode="temporal", optimizer = keras.optimizers.Adam(lr=0.01, beta_1=0.9, beta_2=0.999, epsilon=1e-08, decay=0.0))
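
For reference, a minimal sketch of what the clipped optimizer looks like (clipnorm=1.0 is only an example value, not one I tuned; `model` is the PLSTM model built in the script linked below):

```python
import keras

# Same Adam settings as above, with gradient clipping added.
# clipnorm caps the global L2 norm of the gradients; clipvalue would cap each element instead.
clipped_adam = keras.optimizers.Adam(lr=0.01, beta_1=0.9, beta_2=0.999,
                                     epsilon=1e-08, decay=0.0, clipnorm=1.0)

# `model` is the PLSTM model built in the linked script.
# model.compile(loss="mean_squared_error",
#               sample_weight_mode="temporal",
#               optimizer=clipped_adam)
```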

After checking the weights in the PLSTM layer, I found the values of the timegate kernel getting larger and larger, until the weights became "nan" (the first two rows).

[image: large_timegate_weights]

When I changed to a standard LSTM (with the other settings and the learning rate [still 0.01] unchanged), the loss converges. I therefore traced the source code of PLSTM, suspecting that the initialization of timegate_kernel matters, but I have been stuck for a long time with little progress.
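
For anyone reproducing this, a rough sketch of the kind of per-batch weight check I mean (the layer index 0 is only illustrative and depends on the model definition; `model`, `X`, `y` come from the script linked below):

```python
import numpy as np
from keras.callbacks import LambdaCallback

# Print the largest absolute value of each weight matrix in the PLSTM layer
# and flag NaNs after every batch. Layer index 0 is an assumption.
def report_plstm_weights(batch, logs):
    for w in model.layers[0].get_weights():
        print(batch, w.shape, np.abs(w).max(), "nan!" if np.isnan(w).any() else "")

weight_monitor = LambdaCallback(on_batch_end=report_plstm_weights)
# model.fit(X, y, callbacks=[weight_monitor])
```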

I am wondering if anyone has had a similar issue. Any suggestions for finding out why the gradient explodes are appreciated. The relevant code is at this link:

https://github.com/hnchang/Regression-with-PhasedLSTM/blob/master/reg_plstm.py

Many thanks,
James

ntlex commented May 13, 2020

Hey James,

I am having a similar issue here. Two things that have worked for me:

  1. Reduce the learning rate (on a schedule or manually)
  2. Clip the gradients to prevent them from exploding (see the sketch after this list): https://machinelearningmastery.com/how-to-avoid-exploding-gradients-in-neural-networks-with-gradient-clipping/
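
A rough sketch of what both could look like in Keras (the learning rate and clipvalue numbers are just examples, not tuned values):

```python
import keras

# 1. Reduce the learning rate on a schedule: start at 1e-3 and halve it every 10 epochs.
def schedule(epoch):
    return 1e-3 * (0.5 ** (epoch // 10))

lr_scheduler = keras.callbacks.LearningRateScheduler(schedule)

# 2. Clip the gradients inside the optimizer so one bad batch can't blow up the weights.
clipped_adam = keras.optimizers.Adam(lr=1e-3, clipvalue=0.5)

# Then pass both in (model, X, y come from your own script):
# model.compile(loss="mean_squared_error", optimizer=clipped_adam)
# model.fit(X, y, callbacks=[lr_scheduler])
```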

I hope this helps.
