Update example notebook
drasmuss committed Jun 23, 2021
1 parent f85d330 commit cec965a
Showing 2 changed files with 30 additions and 34 deletions.
Binary file modified docs/examples/psMNIST-weights.hdf5
Binary file not shown.
64 changes: 30 additions & 34 deletions docs/examples/psMNIST.ipynb
@@ -24,13 +24,13 @@
"been shown. The psMNIST task adds more complexity to the input by applying a fixed\n",
"permutation to all of the pixel sequences. This is done to ensure that the information\n",
"contained in the image is distributed evenly throughout the sequence, so that in order\n",
"to perform the task successfully, the network needs to process information across the\n",
"to perform the task successfully the network needs to process information across the\n",
"whole length of the input sequence.\n",
"\n",
"The following notebook uses a single KerasLMU layer inside a simple TensorFlow model to\n",
"showcase the accuracy and efficiency of performing the psMNIST task using these novel\n",
"memory cells. Using the LMU for this task currently produces state-of-the-art results\n",
"this task ([see\n",
"([see\n",
"paper](https://papers.nips.cc/paper/9689-legendre-memory-units-continuous-time-representation-in-recurrent-neural-networks.pdf))."
]
},
@@ -62,7 +62,7 @@
"metadata": {},
"source": [
"First we set a seed to ensure that the results in this example are reproducible. A\n",
"random number generator state (`rng`) is also created, and this will later be used to\n",
"random number generator (`rng`) is also created, and this will later be used to\n",
"generate the fixed permutation to be applied to the image data."
]
},
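Note: the seeding cell itself is outside this diff; it presumably looks something like the sketch below (the seed value and the use of `np.random.RandomState` are assumptions, not shown in this hunk).

    import numpy as np
    import tensorflow as tf

    seed = 0  # assumed value; any fixed seed makes the run reproducible
    tf.random.set_seed(seed)  # seeds TensorFlow's weight initialization
    rng = np.random.RandomState(seed)  # used later to generate the fixed permutation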
@@ -131,13 +131,12 @@
"method on the images. The first dimension of the reshaped output size represents the\n",
"number of samples our dataset has, which we keep the same. We want to transform each\n",
"sample into a column vector, and to do so we make the second and third dimensions -1 and\n",
"1, respectively, leveraging a standard NumPy trick specifically used for converting\n",
"multi-dimensional data into column vectors.\n",
"1, respectively.\n",
"\n",
"The image displayed below shows the result of this flattening process, and is an example\n",
"of the type of data that is used in the Sequential MNIST task. Note that even though the\n",
"image has been reshaped into an 98 x 8 image (so that it can fit on the screen), there\n",
"is still a fair amount of structure observable in the image."
"image has been flattened, there is still a fair amount of structure observable in the\n",
"image."
]
},
{
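Note: the reshaping cell described above is not shown in this hunk; a sketch, assuming the standard `tf.keras.datasets.mnist` loader and the variable names used later in the diff:

    import tensorflow as tf

    (train_images, train_labels), (test_images, test_labels) = (
        tf.keras.datasets.mnist.load_data()
    )

    # reshape each 28 x 28 image into a 784 x 1 sequence of pixels;
    # -1 infers the sequence length from the total pixel count
    train_images = train_images.reshape((train_images.shape[0], -1, 1))
    test_images = test_images.reshape((test_images.shape[0], -1, 1))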
@@ -162,14 +161,15 @@
"metadata": {},
"source": [
"Finally, we apply a fixed permutation on the images in both the training and testing\n",
"datasets. This essentially shuffles the pixels of the image sequences in a consistent\n",
"datasets. This shuffles the pixels of the image sequences in a consistent\n",
"way, allowing for images of the same digit to still be similar, but removing the\n",
"convenience of edges and contours that the network can use for easy digit inference.\n",
"\n",
"We can see, from the image below, that the fixed permutation applied to the image\n",
"creates an even distribute of pixels across the entire sequence. This makes the task\n",
"much more difficult as it makes it necessary for the network to process the entire input\n",
"sequence to accurately predict what the digit is. We now have our data for the Permuted\n",
"much more difficult, as it makes it necessary for the network to process the entire\n",
"input\n",
"sequence to accurately classify the digit. We now have our data for the Permuted\n",
"Sequential MNIST (psMNIST) task."
]
},
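Note: the permutation cell itself is not part of this diff; it might look like this sketch (reusing the `rng` created earlier):

    # one fixed permutation of the 784 pixel positions, applied to every image
    perm = rng.permutation(train_images.shape[1])
    train_images = train_images[:, perm]
    test_images = test_images[:, perm]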
@@ -205,11 +205,11 @@
"metadata": {},
"outputs": [],
"source": [
"X_train = train_images[0:50000]\n",
"X_train = train_images[:50000]\n",
"X_valid = train_images[50000:]\n",
"X_test = test_images\n",
"\n",
"Y_train = train_labels[0:50000]\n",
"Y_train = train_labels[:50000]\n",
"Y_valid = train_labels[50000:]\n",
"Y_test = test_labels\n",
"\n",
@@ -235,7 +235,8 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Our model uses a single LMU layer configured with 212 `units` and an `order` of 256\n",
"Our model uses a single LMU layer configured with 212 hidden `units` and an `order` of\n",
"256\n",
"dimensions for the memory, maintaining `units` + `order` = 468 variables in memory\n",
"between time-steps. These numbers were chosen primarily to have a comparable number of\n",
"internal variables to the models that were being compared against in the\n",
@@ -256,17 +257,15 @@
"source": [
"n_pixels = X_train.shape[1]\n",
"\n",
"lmu_layer = tf.keras.layers.RNN(\n",
" keras_lmu.LMUCell(\n",
" memory_d=1,\n",
" order=256,\n",
" theta=n_pixels,\n",
" hidden_cell=tf.keras.layers.SimpleRNNCell(212),\n",
" hidden_to_memory=False,\n",
" memory_to_memory=False,\n",
" input_to_hidden=True,\n",
" kernel_initializer=\"ones\",\n",
" )\n",
"lmu_layer = keras_lmu.LMU(\n",
" memory_d=1,\n",
" order=256,\n",
" theta=n_pixels,\n",
" hidden_cell=tf.keras.layers.SimpleRNNCell(212),\n",
" hidden_to_memory=False,\n",
" memory_to_memory=False,\n",
" input_to_hidden=True,\n",
" kernel_initializer=\"ones\",\n",
")\n",
"\n",
"# TensorFlow layer definition\n",
@@ -295,14 +294,14 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"To train our model, we use a `batch_size` of 100, and train for 10 `epochs`, which is a\n",
"To train our model we use a `batch_size` of 100 and train for 10 `epochs`, which is\n",
"far less than most other solutions to the psMNIST task. We could train for more epochs\n",
"if we wished to fine-tune performance, but that is not necessary for the purposes of\n",
"this example. We also create a `ModelCheckpoint` callback that saves the weights of the\n",
"model to a file after each epoch.\n",
"\n",
"The time required for this to run is tracked using the `time` library. Training may take\n",
"a long time to complete, and to save time, this notebook defaults to using pre-trained\n",
"Training may take\n",
"a long time to complete, and to save time this notebook defaults to using pre-trained\n",
"weights. To train the model from scratch, simply change the `do_training` variable to\n",
"`True` before running the cell below."
]
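Note: the compile/fit cell is outside this hunk; a minimal sketch of the training setup described above (the checkpoint path, monitored metric, and loss function are assumptions):

    do_training = False  # change to True to train from scratch

    batch_size = 100
    epochs = 10

    model.compile(
        loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
        optimizer="adam",
        metrics=["accuracy"],
    )

    if do_training:
        result = model.fit(
            X_train,
            Y_train,
            epochs=epochs,
            batch_size=batch_size,
            validation_data=(X_valid, Y_valid),
            callbacks=[
                tf.keras.callbacks.ModelCheckpoint(
                    filepath="./psMNIST-weights.hdf5",  # assumed, matching the repo file
                    monitor="val_accuracy",
                    save_best_only=True,  # consistent with "best weights are saved"
                    save_weights_only=True,
                )
            ],
        )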
@@ -339,11 +338,8 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"The progression of the training process is shown below. Here we plot the accuracy for\n",
"the training and validation for each epoch.\n",
"\n",
"Note that if this notebook has been configured to use trained weights, instead of using\n",
"live data, a saved image of a previous training run will be displayed."
"The progression of the training process is shown below, plotting the\n",
"training and validation accuracy."
]
},
{
@@ -359,7 +355,7 @@
" plt.legend()\n",
" plt.xlabel(\"Epoch\")\n",
" plt.ylabel(\"Accuracy\")\n",
" plt.title(\"Post-epoch Training Accuracies\")\n",
" plt.title(\"Post-epoch training accuracies\")\n",
" plt.xticks(np.arange(epochs), np.arange(1, epochs + 1))\n",
" plt.ylim((0.85, 1.0)) # Restrict range of y axis to (0.85, 1) for readability\n",
" plt.savefig(\"psMNIST-training.png\")\n",
@@ -386,7 +382,7 @@
"metadata": {},
"source": [
"With the training complete, let's use the trained weights to test the model. Since the\n",
"weights are saved to file after every epoch, we can simply load the saved weights, then\n",
"best weights are saved to file, we can simply load the saved weights, then\n",
"test it against the permuted sequences in the test set."
]
},
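Note: the test cell might look like this sketch (weights filename assumed, matching the binary file updated in this commit):

    model.load_weights("./psMNIST-weights.hdf5")
    _, test_acc = model.evaluate(X_test, Y_test, verbose=0)
    print(f"Test accuracy: {test_acc * 100:.2f}%")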