
[Progress Report] Implementation of final decision layer #14

Open · rightnknow opened this issue Feb 13, 2019 · 4 comments

Labels: investigation (research from various resources and make the optimized decision)

@rightnknow (Collaborator)
This is to document the implementation process of constructing the final decision layer using both the Spectrogram+CNN and audio+LSTM models.
The two currently possible flows are:

  1. Train the CNN and LSTM models with the same input data (in different forms) at the same time, and tune a dense layer on top of both outputs.
  2. Fully train the two models, freeze them, then train a dense layer on top of both outputs.

The current implementation will be option 1, sketched below.
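As a rough illustration of option 1, here is a minimal Keras sketch of a jointly trained two-branch model with a shared decision layer. The input shapes, layer sizes, and branch bodies are placeholders, not the actual architecture; only the 6-class softmax matches what is used later in this thread.

    from keras.layers import Input, Conv2D, GlobalAveragePooling2D, LSTM, Concatenate, Dense
    from keras.models import Model

    # Spectrogram branch (stand-in for the real CNN).
    spec_input = Input(shape=(128, 128, 1))
    x = Conv2D(32, (3, 3), activation='relu')(spec_input)
    x = GlobalAveragePooling2D()(x)

    # Raw-audio branch (stand-in for the real LSTM).
    audio_input = Input(shape=(100, 40))
    y = LSTM(64)(audio_input)

    # Final decision layer: both branches train at the same time (option 1).
    merged = Concatenate()([x, y])
    decision = Dense(6, activation='softmax')(merged)

    model = Model(inputs=[spec_input, audio_input], outputs=decision)
    model.compile(loss='categorical_crossentropy', optimizer='rmsprop', metrics=['accuracy'])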

rightnknow added the label "investigation (research from various resources and make the optimized decision)" on Feb 13, 2019
rightnknow self-assigned this on Feb 13, 2019
@rightnknow (Collaborator, Author) commented Feb 13, 2019

From keras-team/keras#7581:
The post mentions that only the weights of a model can be transferred between TensorFlow and Keras, so the CNN or LSTM model still has to be built entirely in either Keras or TensorFlow. For now I'll continue with Keras.
The new CNN model in Keras will aim for the same structure as before, which is Inception V3 + 2 dense layers.

PS:
TensorFlow saves a model as a .pb file and can restore variables and weights.
Keras saves a model as JSON or HDF5 (via h5py): JSON saves only the model structure, HDF5 saves the weights.
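For reference, a minimal sketch of the two Keras serialization paths described above, assuming `model` is an already-built Keras model (the file name is a placeholder):

    from keras.models import model_from_json

    # Structure only: JSON round-trip.
    json_string = model.to_json()
    rebuilt = model_from_json(json_string)

    # Weights only: HDF5 file (requires the h5py package).
    model.save_weights('cnn_weights.h5')
    rebuilt.load_weights('cnn_weights.h5')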

@rightnknow (Collaborator, Author) commented Feb 14, 2019

Keras provides the Inception V3 model and weights in its library, along with other network models such as VGG and ResNet.

There is currently a known issue with the Keras Inception V3 model:
http://blog.datumbox.com/the-batch-normalization-layer-of-keras-is-broken/
keras-team/keras#9214

The current structure is the following:
[model diagram: qq 20190214022743]
The structure is identical to Katherine's model in TensorFlow.
However, due to the implementation, the Keras Inception V3 takes in images with 3 channels, so we modify the input:

    from keras.layers import Input, Concatenate
    from keras.applications.inception_v3 import InceptionV3

    img_input = Input(shape=(image_size, image_size, 1))
    img_conc = Concatenate()([img_input, img_input, img_input])
    base_model = InceptionV3(weights='imagenet', include_top=True, input_tensor=img_conc)

This casts the grayscale image to three RGB channels and feeds it into the network. (I also tried reading the grayscale image as RGB directly; it doesn't change the results much.)

However, I am facing a strange bug where the network doesn't seem to learn anything from the Inception V3 features:

    from keras.layers import Input, Concatenate, Dropout, Dense
    from keras.models import Model
    from keras.applications.inception_v3 import InceptionV3
    from keras.utils import plot_model

    img_input = Input(shape=(image_size, image_size, 1))
    img_conc = Concatenate()([img_input, img_input, img_input])
    base_model = InceptionV3(weights='imagenet', include_top=True, input_tensor=img_conc)
    # for layer in base_model.layers:
    #     layer.trainable = False
    firstLayer = base_model.output
    secondLayer = Dropout(0.5)(firstLayer)
    outputLayer = Dense(6, activation='softmax')(secondLayer)
    model = Model(inputs=base_model.input, outputs=outputLayer)

    model.compile(loss='categorical_crossentropy',
                  optimizer='rmsprop',
                  metrics=['accuracy'])
    plot_model(model, to_file=r'C:\Users\zhanglichuan\Desktop\ECE496\lstm\model.png', show_shapes=True)
    history = model.fit(X_train_image, Y_train, epochs=30, validation_split=0.25)

[loss and accuracy plots: qq 20190214024211, qq 20190214024024]

From the graphs we see that although the training loss is decreasing, the validation loss is not. Validation accuracy is around 20%, which is very close to random guessing (with 6 classes, chance is about 16.7%).

I've tried various changes, such as resizing the input and changing the number of layers, but the results are similar or even worse.

Here's the result after adding another dense layer of size 200:
[loss and accuracy plots: qq 20190214030441, qq 20190214030524]
The result is worse compared to the previous model.

Inception V3 is not the only model that can classify images; I also tried the VGG16 model and got similar results, as sketched below.
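Swapping the base model for VGG16 is essentially a one-line change with keras.applications; a sketch reusing the img_conc input from above (note that VGG16 with include_top=True and ImageNet weights expects 224×224 inputs, so image_size would have to match):

    from keras.applications.vgg16 import VGG16

    # Same head-building pattern as before, with VGG16 as the base.
    base_model = VGG16(weights='imagenet', include_top=True, input_tensor=img_conc)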

Possible sources of error:

  1. Resizing of the input spectrogram
  2. Data augmentation
  3. Bugs in the Keras applications themselves

For the potential bug in the Batch Normalization layer of the Inception V3 model, I applied a custom patch from
https://github.com/datumbox/keras/tree/fork/keras2.2.4
A detailed explanation can be found here:
http://blog.datumbox.com/the-batch-normalization-layer-of-keras-is-broken/
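The issue described in that post is that frozen BatchNormalization layers still use mini-batch statistics during training, so a frozen base sees different activations at train and test time. A workaround often suggested alongside the patched fork is to build the pretrained base with the learning phase forced to inference mode; this is only a sketch of that idea, not a fix verified on this project:

    from keras import backend as K
    from keras.layers import Input, Concatenate
    from keras.applications.inception_v3 import InceptionV3

    # Build the pretrained base in inference mode so its
    # BatchNormalization layers use their moving statistics.
    K.set_learning_phase(0)
    img_input = Input(shape=(image_size, image_size, 1))
    img_conc = Concatenate()([img_input, img_input, img_input])
    base_model = InceptionV3(weights='imagenet', include_top=True, input_tensor=img_conc)

    # Switch back to training mode before adding the trainable head.
    K.set_learning_phase(1)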

Here's the result:
[loss and accuracy plots: qq 20190214033851, qq 20190214033858]

231/231 [==============================] - 0s 2ms/step
test accuracy is 0.26839826852728277

I added another dense layer of size 1024 and a dropout of 0.5 (sketched below). The result shows over-fitting.
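A sketch of that enlarged head, assuming the extra layers sit between the base model output and the 6-class softmax (the relu activation is my assumption; the text does not specify one):

    # Enlarged head: extra 1024-unit dense layer plus dropout
    # between the base model output and the softmax.
    x = base_model.output
    x = Dense(1024, activation='relu')(x)  # activation assumed, not stated
    x = Dropout(0.5)(x)
    outputLayer = Dense(6, activation='softmax')(x)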

[loss and accuracy plots: qq 20190214051034, qq 20190214051049]

231/231 [==============================] - 0s 2ms/step
test accuracy is 0.2727272729207943

@rightnknow (Collaborator, Author) commented Mar 12, 2019

After some debugging, the current graphs look better now:
[loss and accuracy plots: QQ截图20190312074335, QQ截图20190312074348]

231/231 [==============================] - 13s 55ms/step
test accuracy is 0.4363203467073895

After further modification of the program, the graphs became:
[loss and accuracy plots: QQ截图20190312083559, QQ截图20190312083607]

On the test set:
231/231 [==============================] - 1s 4ms/step
test accuracy is 0.48051948103553804

Here's the current CNN model structure:
[model diagram: QQ截图20190313070332]

I'll proceed with the final decision layer.

@rightnknow (Collaborator, Author) commented Mar 13, 2019

Summary of the Overall Implementation

The implementation of the network is complete. Here's the graph of the overall structure:
[model diagram: QQ截图20190314062032]

Observations during implementation
The combined model has relatively high accuracy compared to the individual CNN or LSTM models: when the LSTM has a test accuracy around 50% and the CNN around 45%, the combined model usually achieves about 55-60% accuracy.

Key observations:

  0. The combined model does do better, as expected.
  1. Lack of data remains a limitation.
  2. The parameters are not tuned well.

Problems during the implementation
The result from the combined model is unstable, especially when I split my training set into training and validation sets; this causes a decrease in accuracy.
Random initialization of the model also affects its learning and accuracy (I get a different test accuracy each time I train the model). See the seeding sketch below.
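One common way to make runs more repeatable, sketched here under the assumption of a TensorFlow 1.x backend (GPU ops may still be nondeterministic), is to fix the random seeds before building the model:

    import random
    import numpy as np
    import tensorflow as tf

    # Fix the seeds that drive weight initialization and data
    # shuffling so repeated runs start from the same state.
    random.seed(42)
    np.random.seed(42)
    tf.set_random_seed(42)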

Here's the loss and accuracy graph:
[loss and accuracy plots: QQ截图20190314061917, QQ截图20190313071231]

Apparently there's something wrong with the validation set; I don't have an explanation for it currently.

On the test set:
231/231 [==============================] - 0s 2ms/step
test accuracy is 0.5541125542415685

Here's a picture of the test accuracy reaching 60%:
[photo: IMG_3434]

The project constraint is met! However, I don't suggest relying on this number, since we don't have enough data: the resulting accuracy fluctuates between 55% and 60%.
