
Allow saving weights of a very deep model into a HDF5 file. #7508

Closed
wants to merge 3 commits into from

Conversation

@tunystom commented Aug 3, 2017

I encountered a problem while saving the weights of very deep Keras models into an HDF5 file. The culprit is the HDF5 file format's inability to store object headers larger than 64 kB (see https://support.hdfgroup.org/HDF5/faq/limits.html).
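For context, here is a minimal, hypothetical repro of the failure mode (the file path and layer count are made up for illustration; the exact error wording depends on the HDF5 version):

import h5py
import numpy as np

# Hypothetical repro: a single attribute holding ~7000 fixed-width
# 64-byte names is ~438 kB, well past HDF5's 64 kB header limit.
with h5py.File('/tmp/demo.h5', 'w') as f:
    names = np.array(['layer_%04d' % i for i in range(7000)], dtype='S64')
    f.attrs['layer_names'] = names
    # -> RuntimeError: Unable to create attribute
    #    (object header message is too large)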

This pull request should solve the related issues, without the need to implement any custom saving/loading methods as proposed in some of the comments.

I also answered a related question on StackOverflow.

The fix is really simple. If a data array being saved as an HDF5 group attribute (layer_names and weight_names, specifically) is too large, it is chunked into several pieces (until each piece fits under the header limit), and the pieces are saved individually under the original attribute name with a chunk number appended to it. Loading from an HDF5 file is implemented correspondingly.
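A minimal sketch of that chunking scheme (the function names and the 64 512-byte threshold are illustrative assumptions, not necessarily the PR's exact code):

import numpy as np

HDF5_OBJECT_HEADER_LIMIT = 64512  # bytes; assumed safety margin under 64 kB

def save_attributes_to_hdf5_group(group, name, data):
    # Save a list of byte strings as one attribute, or as several
    # chunked attributes (name0, name1, ...) if the array is too large.
    bad = [x for x in data if len(x) > HDF5_OBJECT_HEADER_LIMIT]
    if bad:
        # A single item larger than the limit cannot be chunked at all.
        raise RuntimeError('Attribute item too large to save: %s' % bad)
    data_npy = np.asarray(data)
    num_chunks = 1
    chunked_data = np.array_split(data_npy, num_chunks)
    # Keep splitting until every chunk fits under the header limit.
    while any(chunk.nbytes > HDF5_OBJECT_HEADER_LIMIT
              for chunk in chunked_data):
        num_chunks += 1
        chunked_data = np.array_split(data_npy, num_chunks)
    if num_chunks > 1:
        for chunk_id, chunk in enumerate(chunked_data):
            group.attrs['%s%d' % (name, chunk_id)] = chunk
    else:
        group.attrs[name] = data_npy

def load_attributes_from_hdf5_group(group, name):
    # Load an attribute that may have been saved in chunks.
    if name in group.attrs:
        return list(group.attrs[name])
    data = []
    chunk_id = 0
    while '%s%d' % (name, chunk_id) in group.attrs:
        data.extend(group.attrs['%s%d' % (name, chunk_id)])
        chunk_id += 1
    return data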

…_group() methods that can deal with the problem of saving models with a lot of layers (or with nested deep models) into HDF5 files
@fchollet (Member) left a comment

Please add a unit test that covers the case where header chunking occurs.

What's the number of layers past which you need this?

@tunystom (Author) commented

@fchollet FYI, I added the unit tests you asked for. I am not sure whether it is necessary to check the internals of the created HDF5 file, but the current tests do so.

The model I could not save had over 1K layers. But the issue was not with the layer_names attribute inside the HDF5 file (which suffers from the same size constraint and is fixed similarly), but with a different attribute array, weight_names. This happened as a result of the way I designed the whole model. I actually have 2 models: one is used as a nested model (a feature extractor), and this is the huge one; the other serves as a backend built on top of it. If you take a look at the test method test_saving_model_with_long_weights_names you will see what I mean.

So, when you try to save a model containing a nested model, the names of the weights of the nested model are all saved within a single weight_names attribute in the HDF5 file. Since h5py converts the list of names to a numpy array, in which every string occupies as much space as the longest string, the attribute requires a lot of memory even for models that are not very deep.
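To illustrate the padding effect (the weight names below are made-up examples):

import numpy as np

# numpy string arrays use a fixed-width dtype, so every element is
# padded to the length of the longest string in the array.
names = [b'dense_1/kernel:0',
         b'a_very/deeply/nested/model/dense_999/kernel:0']
arr = np.asarray(names)
print(arr.dtype)   # |S45: every element occupies 45 bytes
print(arr.nbytes)  # 90 bytes for just 2 names; thousands of long,
                   # nested names quickly exceed the 64 kB limit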

stale bot commented Nov 16, 2017

This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 30 days if no further activity occurs, but feel free to re-open a closed issue if needed.

@stale stale bot added the stale label Nov 16, 2017
@stale stale bot removed the stale label Nov 20, 2017
@fchollet (Member) commented

Sorry for not reviewing this earlier. It got forgotten in the PR queue. But it seems like it would still be a useful fix.

This PR has merge conflicts and will need a review (at a glance there will be at least small style issues to fix). Are you still interested in working on it?

@qieaaa commented Feb 15, 2018

Thanks to @tunystom!
I hit this h5py error when running NASNet-Large.
I replaced topology.py in my keras folder with your version, but there was still an error in training.py under the engine folder:
AttributeError: 'Model' object has no attribute '_internal_output_shapes'
I added one line at line 608 in training.py:
self._internal_output_shapes = self.internal_output_shapes
and it works well.

@qieaaa left a comment

I tried this and it works well 👍

@floydium commented

I was able to rebase this PR against the current master. This required resolving a minor conflict in a unit test file in the tests folder (which, frankly, I resolved somewhat arbitrarily; it was only a few lines of code). I then copied the newly rebased topology.py into my keras library folder, and it worked without any issues: I was able to save the NASNetMobile weights.

ahundt added a commit to ahundt/keras that referenced this pull request Feb 16, 2018
Allow saving weights of a very deep model into a HDF5 file.

# Conflicts:
#	tests/test_model_saving.py
@ahundt (Contributor) commented Feb 16, 2018

Merged and created an updated PR in #9398.

@ahundt (Contributor) commented Feb 16, 2018

@floydium @qieaaa could you review the new PR #9398? Is an additional change needed for self._internal_output_shapes = self.internal_output_shapes?

@qieaaa commented Feb 16, 2018

@ahundt Thank you so much! I just tried it, and it works.
I replaced topology.py with your new version and deleted my additional self._internal_output_shapes = self.internal_output_shapes line. I ran it and everything works well, thanks! I can review the new PR.

@fchollet (Member) commented

Merged updated PR.

@fchollet fchollet closed this Feb 17, 2018
@Anurag27031994 commented
# Assuming this is your model architecture. You may use whatever
# architecture you want (big or small).
import pickle

from keras.models import Sequential
from keras.layers import Conv2D, Activation, Flatten, Dense

def mymodel():
    inputShape = (28, 28, 3)
    model = Sequential()
    model.add(Conv2D(20, 5, padding="same", input_shape=inputShape))
    model.add(Activation('relu'))
    model.add(Flatten())
    model.add(Dense(500))
    model.add(Activation('relu'))
    model.add(Dense(2, activation="softmax"))
    return model

model = mymodel()
model.fit(....)  # parameters to start training your model

################################################################################
################################################################################
# Once your model has been trained, save it to disk.
# Use get_weights() to get the model weights as a list of numpy arrays.
weigh = model.get_weights()

# Now use pickle to save the model weights instead of .h5;
# for very deep model architectures, the .h5 file is unsupported.
pklfile = "D:/modelweights.pkl"
with open(pklfile, 'wb') as fpkl:  # binary mode works on Python 2 and 3
    pickle.dump(weigh, fpkl, protocol=pickle.HIGHEST_PROTOCOL)

################################################################################
################################################################################
# In the future, you may want to load your model back.
# Use pickle to load the model weights.
pklfile = "D:/modelweights.pkl"
with open(pklfile, 'rb') as f:
    weigh = pickle.load(f)

# Rebuild the architecture and use set_weights() to load the
# saved weights into it.
restoredmodel = mymodel()
restoredmodel.set_weights(weigh)

################################################################################
################################################################################
# Now you can do your testing and evaluation/predictions.
y_pred = restoredmodel.predict(X)
