Error when training #1

PotatoMa0119 · 2022-03-16T08:19:05Z

Hi,
Thank you for your impressive work.
I've followed all the instructions, including importing the fer2013 and ck+ datasets, changing the path in ./util/info.json, and executing preprocess.sh.
After that, when I run trainmodel.py, the first epoch seems to run smoothly, but at the end it throws the following error:

897/897 [==============================] - 720s 802ms/step - loss: 3.5670 - accuracy: 0.2134 - val_loss: 2.4353 - val_accuracy: 0.1638

Epoch 00001: val_loss improved from inf to 2.43534, saving model to data/model/Model-01-0.1638.hdf5
Traceback (most recent call last):
  File "trainmodel.py", line 140, in <module>
    model.fit_generator(
  File "/Users//PycharmProjects/engagementRecog/venv/lib/python3.8/site-packages/keras/engine/training.py", line 1975, in fit_generator
    return self.fit(
  File "/Users//PycharmProjects/engagementRecog/venv/lib/python3.8/site-packages/keras/engine/training.py", line 1230, in fit
    callbacks.on_epoch_end(epoch, epoch_logs)
  File "/Users//PycharmProjects/engagementRecog/venv/lib/python3.8/site-packages/keras/callbacks.py", line 413, in on_epoch_end
    callback.on_epoch_end(epoch, logs)
  File "trainmodel.py", line 130, in on_epoch_end
    self.training_data = shuffle_dataset(*self.training_data)
  File "/Users//PycharmProjects/engagementRecog/engagement-detection-master/data/dataset_ops.py", line 35, in inner
    return func(*args, **kwargs)
  File "/Users//PycharmProjects/engagementRecog/engagement-detection-master/data/dataset_ops.py", line 82, in shuffle_dataset
    data_shape = np.array(args[0]).shape[0]
IndexError: tuple index out of range

Thank you for your help, I appreciate any instruction.

amogh7joshi · 2022-03-16T12:55:21Z

Thanks for raising this issue!

This is a weird one. From the traceback, I'm assuming the error comes from np.array(args[0]).shape[0], and specifically from the args[0] part. That would mean no positional arguments are reaching the shuffle_dataset function, so args is an empty tuple.
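A minimal sketch of how that failure mode can arise, assuming the callback unpacks its stored data with `*` as the traceback line `shuffle_dataset(*self.training_data)` suggests. The `shuffle_dataset_stub` and `describe_call` names are hypothetical stand-ins, not the repository's code, and only the `args[0]` lookup is mimicked:

```python
def shuffle_dataset_stub(*args):
    # Hypothetical stand-in for shuffle_dataset in data/dataset_ops.py.
    # Only the first-argument lookup from the traceback is reproduced.
    return len(args[0])


def describe_call(data):
    """Unpack `data` into the stub and report what happened."""
    try:
        return f"ok: first array has {shuffle_dataset_stub(*data)} items"
    except IndexError as exc:
        return f"IndexError: {exc}"


# Two arrays (e.g. images and labels): args[0] exists.
print(describe_call([[1, 2, 3], [0, 1, 0]]))  # ok: first array has 3 items

# Empty container: unpacking passes nothing, so args == () and args[0] fails.
print(describe_call([]))  # IndexError: tuple index out of range
```

Unpacking an empty list passes zero positional arguments, which is why the message mentions a tuple even though the stored data may be a list.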

I actually haven't run this code in a while (and the original is on a hard drive somewhere), so for debugging it might be useful to add a few extra statements.

In the DatasetShuffle callback that I've made, could you add assert statements, like assert self.training_data is not None and assert self.validation_data is not None, as well as a print statement to inspect their contents (right above line 130, after the method header)?

This would help me see where the error is actually occurring.

PotatoMa0119 · 2022-03-17T13:20:05Z

Thanks for your reply!
I've added some assert and print statements as you described:

And here is the output:
Epoch 00001: val_loss improved from inf to 2.21813, saving model to data/model/Model-01-0.2343.hdf5
[]
[]
Traceback (most recent call last):
  File "trainmodel.py", line 144, in <module>
    model.fit_generator(
  File "/Users//PycharmProjects/engagementRecog/venv/lib/python3.8/site-packages/keras/engine/training.py", line 1975, in fit_generator
    return self.fit(
  File "/Users//PycharmProjects/engagementRecog/venv/lib/python3.8/site-packages/keras/engine/training.py", line 1230, in fit
    callbacks.on_epoch_end(epoch, epoch_logs)
  File "/Users//PycharmProjects/engagementRecog/venv/lib/python3.8/site-packages/keras/callbacks.py", line 413, in on_epoch_end
    callback.on_epoch_end(epoch, logs)
  File "trainmodel.py", line 134, in on_epoch_end
    self.training_data = shuffle_dataset(*self.training_data)
  File "/Users//PycharmProjects/engagementRecog/engagement-detection-master/data/dataset_ops.py", line 35, in inner
    return func(*args, **kwargs)
  File "/Users//PycharmProjects/engagementRecog/engagement-detection-master/data/dataset_ops.py", line 82, in shuffle_dataset
    data_shape = np.array(args[0]).shape[0]
IndexError: tuple index out of range

It seems that self.training_data and self.validation_data are both empty, but no AssertionError is raised.
I wonder whether this has anything to do with the dependencies, because I'm running the project in an environment different from yours. Do you remember which versions of TensorFlow and Keras (and the other packages in requirements.txt) you used?

amogh7joshi · 2022-03-17T17:56:05Z

I'm not sure if it's anything to do with TF/Keras, but if I recall correctly I had used either 2.3 or 2.4. Maybe try moving the print statements up to the instantiation of the callback object, or you could also potentially just remove that callback altogether. It doesn't really serve much purpose in the grand scheme of things.
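As a rough illustration of checking at instantiation rather than at the end of an epoch, here is a sketch that validates the data when the callback object is created. `DatasetShuffleSketch` and its constructor arguments are hypothetical; it is a plain class with no Keras dependency, not the repository's DatasetShuffle callback:

```python
class DatasetShuffleSketch:
    """Hypothetical stand-in for the DatasetShuffle callback (no Keras needed)."""

    def __init__(self, training_data, validation_data):
        # Fail at construction time instead of after the first epoch finishes.
        for name, data in (("training_data", training_data),
                           ("validation_data", validation_data)):
            if data is None or len(data) == 0:
                raise ValueError(f"{name} is empty; nothing to shuffle")
        self.training_data = training_data
        self.validation_data = validation_data


try:
    DatasetShuffleSketch(training_data=[], validation_data=[])
except ValueError as exc:
    print(exc)  # training_data is empty; nothing to shuffle
```

With a check like this, an empty dataset would surface before training starts rather than after a 12-minute epoch.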

Also, just to note, you would get an assertion error with assert self.training_data, as this effectively runs assert bool(self.training_data), and bool() of an empty list is False. However, assert self.training_data is not None only checks that self.training_data is not the None object, and this still passes even if the list is empty.
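The difference between the two checks can be seen in a short sketch. `assertion_passes` is a hypothetical helper used only for illustration:

```python
def assertion_passes(condition):
    """Return True if `assert condition` passes, False if it raises."""
    try:
        assert condition
    except AssertionError:
        return False
    return True


empty = []

# Identity check: an empty list is still not the None object, so this passes.
print(assertion_passes(empty is not None))  # True

# Truthiness check: bool([]) is False, so this raises AssertionError.
print(assertion_passes(empty))  # False

# An explicit length check also catches the empty case.
print(assertion_passes(len(empty) > 0))  # False
```

For NumPy arrays the explicit length check is the safer choice, since taking the truth value of an array with more than one element raises a ValueError.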

PotatoMa0119 · 2022-03-18T13:38:56Z

Removing shuffle_dataset worked, and I've finished the training.
Thank you so much for your guidance on the issue and on assert!
Good luck.😉

PotatoMa0119 changed the title ~~Error when training data~~ Error when training Mar 16, 2022
