Error(s) in loading state_dict for DataParallel: #27

cosmolu · 2018-10-15T07:24:56Z

When I download the pre_trained model and resume it. there is an error.
model.load_state_dict(checkpoint['state_dict'])
It seems that the name are not matched.(e.g. "module.features.0.weight" v.s. "features.module.0.weight")
How could I solve it if I wish to use the pre_trained model on Cifar10?
Thank you !

Traceback (most recent call last):
File "test_0.py", line 130, in
model = load_model()
File "test_0.py", line 104, in load_model
model.load_state_dict(checkpoint['state_dict'])
File "/home/cosmo/anaconda3/envs/tf8/lib/python3.6/site-packages/torch/nn/modules/module.py", line 719, in load_state_dict
self.class.name, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for DataParallel:
Missing key(s) in state_dict: "module.features.0.weight", "module.features.0.bias", "module.features.3.weight", "module.features.3.bias", "module.features.6.weight", "module.features.6.bias", "module.features.8.weight", "module.features.8.bias", "module.features.10.weight", "module.features.10.bias", "module.classifier.weight", "module.classifier.bias".
Unexpected key(s) in state_dict: "features.module.0.weight", "features.module.0.bias", "features.module.3.weight", "features.module.3.bias", "features.module.6.weight", "features.module.6.bias", "features.module.8.weight", "features.module.8.bias", "features.module.10.weight", "features.module.10.bias", "classifier.weight", "classifier.bias".

imirzadeh · 2018-11-27T04:43:11Z

Hi,

The problem is the module is load with dataparallel activated and you are trying to load it without data parallel. That's why there's an extra module at the beginning of each key!

Refer to this link for more information:
https://discuss.pytorch.org/t/missing-keys-unexpected-keys-in-state-dict-when-loading-self-trained-model/22379

goncalomordido · 2019-03-20T15:03:19Z

You can also manually updated the dic. Like this:

        state_dict =checkpoint['state_dict']
        from collections import OrderedDict
        new_state_dict = OrderedDict()

        for k, v in state_dict.items():
            if 'module' not in k:
                k = 'module.'+k
            else:
                k = k.replace('features.module.', 'module.features.')
            new_state_dict[k]=v

        model.load_state_dict(new_state_dict)

xjock · 2019-04-17T02:19:18Z

The mutiple GPUs usage in pytorch is a little difficult.
In TF, you just set the os.environ["CUDA_VISIBLE_DEVICES"]='0,1,2,3'

raghav1810 · 2019-07-25T07:28:29Z

Am getting an error similar to this one
This is what I am running:
python cifar.py -a preresnet --depth 110 --epochs 3 --schedule 81 122 --gamma 0.1 --wd 1e-4 --checkpoint checkpoints/cifar10/preresnet-110 --resume 'checkpoint.pth.tar'
('checkpoint.pth.tar' is from the onedrive folder)

RuntimeError: Error(s) in loading state_dict for DataParallel:
Missing key(s) in state_dict: "module.bn.weight", "module.bn.bias", "module.bn.running_mean", "module.bn.running_var".
Unexpected key(s) in state_dict: "module.bn1.weight", "module.bn1.bias", "module.bn1.running_mean", "module.bn1.running_var", "module.layer1.0.conv3.weight", "module.layer1.0.bn3.weight", "module.layer1.0.bn3.bias", "module.layer1.0.bn3.running_mean", "module.layer1.0.bn3.running_var", "module.layer1.0.downsample.0.weight", "module.layer1.0.downsample.1.weight", "module.layer1.0.downsample.1.bias", "module.layer1.0.downsample.1.running_mean", "module.layer1.0.downsample.1.running_var", "module.layer1.1.conv3.weight", "module.layer1.1.bn3.weight", "module.layer1.1.bn3.bias", "module.layer1.1.bn3.running_mean", "module.layer1.1.bn3.running_var", "module.layer1.2.conv3.weight", "module.layer1.2.bn3.weight", "module.layer1.2.bn3.bias", "module.layer1.2.bn3.running_mean", "module.layer1.2.bn3.running_var", "module.layer1.3.conv3.weight", "module.layer1.3.bn3.weight", "module.layer1.3.bn3.bias", "module.layer1.3.bn3.running_mean", "module.layer1.3.bn3.running_var", "module.layer1.4.conv3.weight", "module.layer1.4.bn3.weight", "module.layer1.4.bn3.bias", "module.layer1.4.bn3.running_mean", "module.layer1.4.bn3.running_var", "module.layer1.5.conv3.weight", "module.layer1.5.bn3.weight", "module.layer1.5.bn3.bias", "module.layer1.5.bn3.running_mean", "module.layer1.5.bn3.running_var", "module.layer1.6.conv3.weight", "module.layer1.6.bn3.weight", "module.layer1.6.bn3.bias", "module.layer1.6.bn3.running_mean", "module.layer1.6.bn3.running_var", "module.layer1.7.conv3.weight", "module.layer1.7.bn3.weight", "module.layer1.7.bn3.bias", "module.layer1.7.bn3.running_mean", "module.layer1.7.bn3.running_var", "module.layer1.8.conv3.weight", "module.layer1.8.bn3.weight", "module.layer1.8.bn3.bias", "module.layer1.8.bn3.running_mean", "module.layer1.8.bn3.running_var", "module.layer1.9.conv3.weight", "module.layer1.9.bn3.weight", "module.layer1.9.bn3.bias", "module.layer1.9.bn3.running_mean", "module.layer1.9.bn3.running_var", "module.layer1.10.conv3.weight", "module.layer1.10.bn3.weight", "module.layer1.10.bn3.bias", "module.layer1.10.bn3.running_mean", "module.layer1.10.bn3.running_var", "module.layer1.11.conv3.weight", "module.layer1.11.bn3.weight", "module.layer1.11.bn3.bias", "module.layer1.11.bn3.running_mean", "module.layer1.11.bn3.running_var", "module.layer1.12.conv3.weight", "module.layer1.12.bn3.weight", "module.layer1.12.bn3.bias", "module.layer1.12.bn3.running_mean", "module.layer1.12.bn3.running_var", "module.layer1.13.conv3.weight", "module.layer1.13.bn3.weight", "module.layer1.13.bn3.bias", "module.layer1.13.bn3.running_mean", "module.layer1.13.bn3.running_var", "module.layer1.14.conv3.weight", "module.layer1.14.bn3.weight", "module.layer1.14.bn3.bias", "module.layer1.14.bn3.running_mean", "module.layer1.14.bn3.running_var", "module.layer1.15.conv3.weight", "module.layer1.15.bn3.weight", "module.layer1.15.bn3.bias", "module.layer1.15.bn3.running_mean", "module.layer1.15.bn3.running_var", "module.layer1.16.conv3.weight", "module.layer1.16.bn3.weight", "module.layer1.16.bn3.bias", "module.layer1.16.bn3.running_mean", "module.layer1.16.bn3.running_var", "module.layer1.17.conv3.weight", "module.layer1.17.bn3.weight", "module.layer1.17.bn3.bias", "module.layer1.17.bn3.running_mean", "module.layer1.17.bn3.running_var", "module.layer2.0.conv3.weight", "module.layer2.0.bn3.weight", "module.layer2.0.bn3.bias", "module.layer2.0.bn3.running_mean", "module.layer2.0.bn3.running_var", "module.layer2.0.downsample.1.weight", "module.layer2.0.downsample.1.bias", "module.layer2.0.downsample.1.running_mean", "module.layer2.0.downsample.1.running_var", "module.layer2.1.conv3.weight", "module.layer2.1.bn3.weight", "module.layer2.1.bn3.bias", "module.layer2.1.bn3.running_mean", "module.layer2.1.bn3.running_var", "module.layer2.2.conv3.weight", "module.layer2.2.bn3.weight", "module.layer2.2.bn3.bias", "module.layer2.2.bn3.running_mean", "module.layer2.2.bn3.running_var", "module.layer2.3.conv3.weight", "module.layer2.3.bn3.weight", "module.layer2.3.bn3.bias", "module.layer2.3.bn3.running_mean", "module.layer2.3.bn3.running_var", "module.layer2.4.conv3.weight", "module.layer2.4.bn3.weight", "module.layer2.4.bn3.bias", "module.layer2.4.bn3.running_mean", "module.layer2.4.bn3.running_var", "module.layer2.5.conv3.weight", "module.layer2.5.bn3.weight", "module.layer2.5.bn3.bias", "module.layer2.5.bn3.running_mean", "module.layer2.5.bn3.running_var", "module.layer2.6.conv3.weight", "module.layer2.6.bn3.weight", "module.layer2.6.bn3.bias", "module.layer2.6.bn3.running_mean", "module.layer2.6.bn3.running_var", "module.layer2.7.conv3.weight", "module.layer2.7.bn3.weight", "module.layer2.7.bn3.bias", "module.layer2.7.bn3.running_mean", "module.layer2.7.bn3.running_var", "module.layer2.8.conv3.weight", "module.layer2.8.bn3.weight", "module.layer2.8.bn3.bias", "module.layer2.8.bn3.running_mean", "module.layer2.8.bn3.running_var", "module.layer2.9.conv3.weight", "module.layer2.9.bn3.weight", "module.layer2.9.bn3.bias", "module.layer2.9.bn3.running_mean", "module.layer2.9.bn3.running_var", "module.layer2.10.conv3.weight", "module.layer2.10.bn3.weight", "module.layer2.10.bn3.bias", "module.layer2.10.bn3.running_mean", "module.layer2.10.bn3.running_var", "module.layer2.11.conv3.weight", "module.layer2.11.bn3.weight", "module.layer2.11.bn3.bias", "module.layer2.11.bn3.running_mean", "module.layer2.11.bn3.running_var", "module.layer2.12.conv3.weight", "module.layer2.12.bn3.weight", "module.layer2.12.bn3.bias", "module.layer2.12.bn3.running_mean", "module.layer2.12.bn3.running_var", "module.layer2.13.conv3.weight", "module.layer2.13.bn3.weight", "module.layer2.13.bn3.bias", "module.layer2.13.bn3.running_mean", "module.layer2.13.bn3.running_var", "module.layer2.14.conv3.weight", "module.layer2.14.bn3.weight", "module.layer2.14.bn3.bias", "module.layer2.14.bn3.running_mean", "module.layer2.14.bn3.running_var", "module.layer2.15.conv3.weight", "module.layer2.15.bn3.weight", "module.layer2.15.bn3.bias", "module.layer2.15.bn3.running_mean", "module.layer2.15.bn3.running_var", "module.layer2.16.conv3.weight", "module.layer2.16.bn3.weight", "module.layer2.16.bn3.bias", "module.layer2.16.bn3.running_mean", "module.layer2.16.bn3.running_var", "module.layer2.17.conv3.weight", "module.layer2.17.bn3.weight", "module.layer2.17.bn3.bias", "module.layer2.17.bn3.running_mean", "module.layer2.17.bn3.running_var", "module.layer3.0.conv3.weight", "module.layer3.0.bn3.weight", "module.layer3.0.bn3.bias", "module.layer3.0.bn3.running_mean", "module.layer3.0.bn3.running_var", "module.layer3.0.downsample.1.weight", "module.layer3.0.downsample.1.bias", "module.layer3.0.downsample.1.running_mean", "module.layer3.0.downsample.1.running_var", "module.layer3.1.conv3.weight", "module.layer3.1.bn3.weight", "module.layer3.1.bn3.bias", "module.layer3.1.bn3.running_mean", "module.layer3.1.bn3.running_var", "module.layer3.2.conv3.weight", "module.layer3.2.bn3.weight", "module.layer3.2.bn3.bias", "module.layer3.2.bn3.running_mean", "module.layer3.2.bn3.running_var", "module.layer3.3.conv3.weight", "module.layer3.3.bn3.weight", "module.layer3.3.bn3.bias", "module.layer3.3.bn3.running_mean", "module.layer3.3.bn3.running_var", "module.layer3.4.conv3.weight", "module.layer3.4.bn3.weight", "module.layer3.4.bn3.bias", "module.layer3.4.bn3.running_mean", "module.layer3.4.bn3.running_var", "module.layer3.5.conv3.weight", "module.layer3.5.bn3.weight", "module.layer3.5.bn3.bias", "module.layer3.5.bn3.running_mean", "module.layer3.5.bn3.running_var", "module.layer3.6.conv3.weight", "module.layer3.6.bn3.weight", "module.layer3.6.bn3.bias", "module.layer3.6.bn3.running_mean", "module.layer3.6.bn3.running_var", "module.layer3.7.conv3.weight", "module.layer3.7.bn3.weight", "module.layer3.7.bn3.bias", "module.layer3.7.bn3.running_mean", "module.layer3.7.bn3.running_var", "module.layer3.8.conv3.weight", "module.layer3.8.bn3.weight", "module.layer3.8.bn3.bias", "module.layer3.8.bn3.running_mean", "module.layer3.8.bn3.running_var", "module.layer3.9.conv3.weight", "module.layer3.9.bn3.weight", "module.layer3.9.bn3.bias", "module.layer3.9.bn3.running_mean", "module.layer3.9.bn3.running_var", "module.layer3.10.conv3.weight", "module.layer3.10.bn3.weight", "module.layer3.10.bn3.bias", "module.layer3.10.bn3.running_mean", "module.layer3.10.bn3.running_var", "module.layer3.11.conv3.weight", "module.layer3.11.bn3.weight", "module.layer3.11.bn3.bias", "module.layer3.11.bn3.running_mean", "module.layer3.11.bn3.running_var", "module.layer3.12.conv3.weight", "module.layer3.12.bn3.weight", "module.layer3.12.bn3.bias", "module.layer3.12.bn3.running_mean", "module.layer3.12.bn3.running_var", "module.layer3.13.conv3.weight", "module.layer3.13.bn3.weight", "module.layer3.13.bn3.bias", "module.layer3.13.bn3.running_mean", "module.layer3.13.bn3.running_var", "module.layer3.14.conv3.weight", "module.layer3.14.bn3.weight", "module.layer3.14.bn3.bias", "module.layer3.14.bn3.running_mean", "module.layer3.14.bn3.running_var", "module.layer3.15.conv3.weight", "module.layer3.15.bn3.weight", "module.layer3.15.bn3.bias", "module.layer3.15.bn3.running_mean", "module.layer3.15.bn3.running_var", "module.layer3.16.conv3.weight", "module.layer3.16.bn3.weight", "module.layer3.16.bn3.bias", "module.layer3.16.bn3.running_mean", "module.layer3.16.bn3.running_var", "module.layer3.17.conv3.weight", "module.layer3.17.bn3.weight", "module.layer3.17.bn3.bias", "module.layer3.17.bn3.running_mean", "module.layer3.17.bn3.running_var".
size mismatch for module.layer1.0.conv1.weight: copying a param with shape torch.Size([16, 16, 1, 1]) from checkpoint, the shape in current model is torch.Size([16, 16, 3, 3]).
size mismatch for module.layer1.1.conv1.weight: copying a param with shape torch.Size([16, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([16, 16, 3, 3]).
size mismatch for module.layer1.2.conv1.weight: copying a param with shape torch.Size([16, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([16, 16, 3, 3]).
size mismatch for module.layer1.3.conv1.weight: copying a param with shape torch.Size([16, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([16, 16, 3, 3]).
size mismatch for module.layer1.4.conv1.weight: copying a param with shape torch.Size([16, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([16, 16, 3, 3]).
size mismatch for module.layer1.5.conv1.weight: copying a param with shape torch.Size([16, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([16, 16, 3, 3]).
size mismatch for module.layer1.6.conv1.weight: copying a param with shape torch.Size([16, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([16, 16, 3, 3]).
size mismatch for module.layer1.7.conv1.weight: copying a param with shape torch.Size([16, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([16, 16, 3, 3]).
size mismatch for module.layer1.8.conv1.weight: copying a param with shape torch.Size([16, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([16, 16, 3, 3]).
size mismatch for module.layer1.9.conv1.weight: copying a param with shape torch.Size([16, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([16, 16, 3, 3]).
size mismatch for module.layer1.10.conv1.weight: copying a param with shape torch.Size([16, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([16, 16, 3, 3]).
size mismatch for module.layer1.11.conv1.weight: copying a param with shape torch.Size([16, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([16, 16, 3, 3]).
size mismatch for module.layer1.12.conv1.weight: copying a param with shape torch.Size([16, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([16, 16, 3, 3]).
size mismatch for module.layer1.13.conv1.weight: copying a param with shape torch.Size([16, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([16, 16, 3, 3]).
size mismatch for module.layer1.14.conv1.weight: copying a param with shape torch.Size([16, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([16, 16, 3, 3]).
size mismatch for module.layer1.15.conv1.weight: copying a param with shape torch.Size([16, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([16, 16, 3, 3]).
size mismatch for module.layer1.16.conv1.weight: copying a param with shape torch.Size([16, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([16, 16, 3, 3]).
size mismatch for module.layer1.17.conv1.weight: copying a param with shape torch.Size([16, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([16, 16, 3, 3]).
size mismatch for module.layer2.0.bn1.weight: copying a param with shape torch.Size([32]) from checkpoint, the shape in current model is torch.Size([16]).
size mismatch for module.layer2.0.bn1.bias: copying a param with shape torch.Size([32]) from checkpoint, the shape in current model is torch.Size([16]).
size mismatch for module.layer2.0.bn1.running_mean: copying a param with shape torch.Size([32]) from checkpoint, the shape in current model is torch.Size([16]).
size mismatch for module.layer2.0.bn1.running_var: copying a param with shape torch.Size([32]) from checkpoint, the shape in current model is torch.Size([16]).
size mismatch for module.layer2.0.conv1.weight: copying a param with shape torch.Size([32, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 16, 3, 3]).
size mismatch for module.layer2.0.downsample.0.weight: copying a param with shape torch.Size([128, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 16, 1, 1]).
size mismatch for module.layer2.1.conv1.weight: copying a param with shape torch.Size([32, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 32, 3, 3]).
size mismatch for module.layer2.2.conv1.weight: copying a param with shape torch.Size([32, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 32, 3, 3]).
size mismatch for module.layer2.3.conv1.weight: copying a param with shape torch.Size([32, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 32, 3, 3]).
size mismatch for module.layer2.4.conv1.weight: copying a param with shape torch.Size([32, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 32, 3, 3]).
size mismatch for module.layer2.5.conv1.weight: copying a param with shape torch.Size([32, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 32, 3, 3]).
size mismatch for module.layer2.6.conv1.weight: copying a param with shape torch.Size([32, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 32, 3, 3]).
size mismatch for module.layer2.7.conv1.weight: copying a param with shape torch.Size([32, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 32, 3, 3]).
size mismatch for module.layer2.8.conv1.weight: copying a param with shape torch.Size([32, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 32, 3, 3]).
size mismatch for module.layer2.9.conv1.weight: copying a param with shape torch.Size([32, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 32, 3, 3]).
size mismatch for module.layer2.10.conv1.weight: copying a param with shape torch.Size([32, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 32, 3, 3]).
size mismatch for module.layer2.11.conv1.weight: copying a param with shape torch.Size([32, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 32, 3, 3]).
size mismatch for module.layer2.12.conv1.weight: copying a param with shape torch.Size([32, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 32, 3, 3]).
size mismatch for module.layer2.13.conv1.weight: copying a param with shape torch.Size([32, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 32, 3, 3]).
size mismatch for module.layer2.14.conv1.weight: copying a param with shape torch.Size([32, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 32, 3, 3]).
size mismatch for module.layer2.15.conv1.weight: copying a param with shape torch.Size([32, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 32, 3, 3]).
size mismatch for module.layer2.16.conv1.weight: copying a param with shape torch.Size([32, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 32, 3, 3]).
size mismatch for module.layer2.17.conv1.weight: copying a param with shape torch.Size([32, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 32, 3, 3]).
size mismatch for module.layer3.0.bn1.weight: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for module.layer3.0.bn1.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for module.layer3.0.bn1.running_mean: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for module.layer3.0.bn1.running_var: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for module.layer3.0.conv1.weight: copying a param with shape torch.Size([64, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 32, 3, 3]).
size mismatch for module.layer3.0.downsample.0.weight: copying a param with shape torch.Size([256, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 32, 1, 1]).
size mismatch for module.layer3.1.conv1.weight: copying a param with shape torch.Size([64, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]).
size mismatch for module.layer3.2.conv1.weight: copying a param with shape torch.Size([64, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]).
size mismatch for module.layer3.3.conv1.weight: copying a param with shape torch.Size([64, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]).
size mismatch for module.layer3.4.conv1.weight: copying a param with shape torch.Size([64, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]).
size mismatch for module.layer3.5.conv1.weight: copying a param with shape torch.Size([64, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]).
size mismatch for module.layer3.6.conv1.weight: copying a param with shape torch.Size([64, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]).
size mismatch for module.layer3.7.conv1.weight: copying a param with shape torch.Size([64, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]).
size mismatch for module.layer3.8.conv1.weight: copying a param with shape torch.Size([64, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]).
size mismatch for module.layer3.9.conv1.weight: copying a param with shape torch.Size([64, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]).
size mismatch for module.layer3.10.conv1.weight: copying a param with shape torch.Size([64, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]).
size mismatch for module.layer3.11.conv1.weight: copying a param with shape torch.Size([64, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]).
size mismatch for module.layer3.12.conv1.weight: copying a param with shape torch.Size([64, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]).
size mismatch for module.layer3.13.conv1.weight: copying a param with shape torch.Size([64, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]).
size mismatch for module.layer3.14.conv1.weight: copying a param with shape torch.Size([64, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]).
size mismatch for module.layer3.15.conv1.weight: copying a param with shape torch.Size([64, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]).
size mismatch for module.layer3.16.conv1.weight: copying a param with shape torch.Size([64, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]).
size mismatch for module.layer3.17.conv1.weight: copying a param with shape torch.Size([64, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]).
size mismatch for module.fc.weight: copying a param with shape torch.Size([100, 256]) from checkpoint, the shape in current model is torch.Size([10, 64]).
size mismatch for module.fc.bias: copying a param with shape torch.Size([100]) from checkpoint, the shape in current model is torch.Size([10]).

jiteshm17 · 2020-03-17T11:59:11Z

You can also manually updated the dic. Like this:

        state_dict =checkpoint['state_dict']
        from collections import OrderedDict
        new_state_dict = OrderedDict()

        for k, v in state_dict.items():
            if 'module' not in k:
                k = 'module.'+k
            else:
                k = k.replace('features.module.', 'module.features.')
            new_state_dict[k]=v

        model.load_state_dict(new_state_dict)

I suppose this issue can be closed as the referenced post mentions the cause of the error and offers a solution

yustiks · 2021-01-28T22:48:15Z

You can also manually updated the dic. Like this:

        state_dict =checkpoint['state_dict']
        from collections import OrderedDict
        new_state_dict = OrderedDict()

        for k, v in state_dict.items():
            if 'module' not in k:
                k = 'module.'+k
            else:
                k = k.replace('features.module.', 'module.features.')
            new_state_dict[k]=v

        model.load_state_dict(new_state_dict)

life saver!

ersamo · 2021-09-27T23:46:48Z

Am getting an error similar to this one
This is what I am running:
python cifar.py -a preresnet --depth 110 --epochs 3 --schedule 81 122 --gamma 0.1 --wd 1e-4 --checkpoint checkpoints/cifar10/preresnet-110 --resume 'checkpoint.pth.tar'
('checkpoint.pth.tar' is from the onedrive folder)

RuntimeError: Error(s) in loading state_dict for DataParallel:
Missing key(s) in state_dict: "module.bn.weight", "module.bn.bias", "module.bn.running_mean", "module.bn.running_var".
Unexpected key(s) in state_dict: "module.bn1.weight", "module.bn1.bias", "module.bn1.running_mean", "module.bn1.running_var", "module.layer1.0.conv3.weight", "module.layer1.0.bn3.weight", "module.layer1.0.bn3.bias", "module.layer1.0.bn3.running_mean", "module.layer1.0.bn3.running_var", "module.layer1.0.downsample.0.weight", "module.layer1.0.downsample.1.weight", "module.layer1.0.downsample.1.bias", "module.layer1.0.downsample.1.running_mean", "module.layer1.0.downsample.1.running_var", "module.layer1.1.conv3.weight", "module.layer1.1.bn3.weight", "module.layer1.1.bn3.bias", "module.layer1.1.bn3.running_mean", "module.layer1.1.bn3.running_var", "module.layer1.2.conv3.weight", "module.layer1.2.bn3.weight", "module.layer1.2.bn3.bias", "module.layer1.2.bn3.running_mean", "module.layer1.2.bn3.running_var", "module.layer1.3.conv3.weight", "module.layer1.3.bn3.weight", "module.layer1.3.bn3.bias", "module.layer1.3.bn3.running_mean", "module.layer1.3.bn3.running_var", "module.layer1.4.conv3.weight", "module.layer1.4.bn3.weight", "module.layer1.4.bn3.bias", "module.layer1.4.bn3.running_mean", "module.layer1.4.bn3.running_var", "module.layer1.5.conv3.weight", "module.layer1.5.bn3.weight", "module.layer1.5.bn3.bias", "module.layer1.5.bn3.running_mean", "module.layer1.5.bn3.running_var", "module.layer1.6.conv3.weight", "module.layer1.6.bn3.weight", "module.layer1.6.bn3.bias", "module.layer1.6.bn3.running_mean", "module.layer1.6.bn3.running_var", "module.layer1.7.conv3.weight", "module.layer1.7.bn3.weight", "module.layer1.7.bn3.bias", "module.layer1.7.bn3.running_mean", "module.layer1.7.bn3.running_var", "module.layer1.8.conv3.weight", "module.layer1.8.bn3.weight", "module.layer1.8.bn3.bias", "module.layer1.8.bn3.running_mean", "module.layer1.8.bn3.running_var", "module.layer1.9.conv3.weight", "module.layer1.9.bn3.weight", "module.layer1.9.bn3.bias", "module.layer1.9.bn3.running_mean", "module.layer1.9.bn3.running_var", "module.layer1.10.conv3.weight", "module.layer1.10.bn3.weight", "module.layer1.10.bn3.bias", "module.layer1.10.bn3.running_mean", "module.layer1.10.bn3.running_var", "module.layer1.11.conv3.weight", "module.layer1.11.bn3.weight", "module.layer1.11.bn3.bias", "module.layer1.11.bn3.running_mean", "module.layer1.11.bn3.running_var", "module.layer1.12.conv3.weight", "module.layer1.12.bn3.weight", "module.layer1.12.bn3.bias", "module.layer1.12.bn3.running_mean", "module.layer1.12.bn3.running_var", "module.layer1.13.conv3.weight", "module.layer1.13.bn3.weight", "module.layer1.13.bn3.bias", "module.layer1.13.bn3.running_mean", "module.layer1.13.bn3.running_var", "module.layer1.14.conv3.weight", "module.layer1.14.bn3.weight", "module.layer1.14.bn3.bias", "module.layer1.14.bn3.running_mean", "module.layer1.14.bn3.running_var", "module.layer1.15.conv3.weight", "module.layer1.15.bn3.weight", "module.layer1.15.bn3.bias", "module.layer1.15.bn3.running_mean", "module.layer1.15.bn3.running_var", "module.layer1.16.conv3.weight", "module.layer1.16.bn3.weight", "module.layer1.16.bn3.bias", "module.layer1.16.bn3.running_mean", "module.layer1.16.bn3.running_var", "module.layer1.17.conv3.weight", "module.layer1.17.bn3.weight", "module.layer1.17.bn3.bias", "module.layer1.17.bn3.running_mean", "module.layer1.17.bn3.running_var", "module.layer2.0.conv3.weight", "module.layer2.0.bn3.weight", "module.layer2.0.bn3.bias", "module.layer2.0.bn3.running_mean", "module.layer2.0.bn3.running_var", "module.layer2.0.downsample.1.weight", "module.layer2.0.downsample.1.bias", "module.layer2.0.downsample.1.running_mean", "module.layer2.0.downsample.1.running_var", "module.layer2.1.conv3.weight", "module.layer2.1.bn3.weight", "module.layer2.1.bn3.bias", "module.layer2.1.bn3.running_mean", "module.layer2.1.bn3.running_var", "module.layer2.2.conv3.weight", "module.layer2.2.bn3.weight", "module.layer2.2.bn3.bias", "module.layer2.2.bn3.running_mean", "module.layer2.2.bn3.running_var", "module.layer2.3.conv3.weight", "module.layer2.3.bn3.weight", "module.layer2.3.bn3.bias", "module.layer2.3.bn3.running_mean", "module.layer2.3.bn3.running_var", "module.layer2.4.conv3.weight", "module.layer2.4.bn3.weight", "module.layer2.4.bn3.bias", "module.layer2.4.bn3.running_mean", "module.layer2.4.bn3.running_var", "module.layer2.5.conv3.weight", "module.layer2.5.bn3.weight", "module.layer2.5.bn3.bias", "module.layer2.5.bn3.running_mean", "module.layer2.5.bn3.running_var", "module.layer2.6.conv3.weight", "module.layer2.6.bn3.weight", "module.layer2.6.bn3.bias", "module.layer2.6.bn3.running_mean", "module.layer2.6.bn3.running_var", "module.layer2.7.conv3.weight", "module.layer2.7.bn3.weight", "module.layer2.7.bn3.bias", "module.layer2.7.bn3.running_mean", "module.layer2.7.bn3.running_var", "module.layer2.8.conv3.weight", "module.layer2.8.bn3.weight", "module.layer2.8.bn3.bias", "module.layer2.8.bn3.running_mean", "module.layer2.8.bn3.running_var", "module.layer2.9.conv3.weight", "module.layer2.9.bn3.weight", "module.layer2.9.bn3.bias", "module.layer2.9.bn3.running_mean", "module.layer2.9.bn3.running_var", "module.layer2.10.conv3.weight", "module.layer2.10.bn3.weight", "module.layer2.10.bn3.bias", "module.layer2.10.bn3.running_mean", "module.layer2.10.bn3.running_var", "module.layer2.11.conv3.weight", "module.layer2.11.bn3.weight", "module.layer2.11.bn3.bias", "module.layer2.11.bn3.running_mean", "module.layer2.11.bn3.running_var", "module.layer2.12.conv3.weight", "module.layer2.12.bn3.weight", "module.layer2.12.bn3.bias", "module.layer2.12.bn3.running_mean", "module.layer2.12.bn3.running_var", "module.layer2.13.conv3.weight", "module.layer2.13.bn3.weight", "module.layer2.13.bn3.bias", "module.layer2.13.bn3.running_mean", "module.layer2.13.bn3.running_var", "module.layer2.14.conv3.weight", "module.layer2.14.bn3.weight", "module.layer2.14.bn3.bias", "module.layer2.14.bn3.running_mean", "module.layer2.14.bn3.running_var", "module.layer2.15.conv3.weight", "module.layer2.15.bn3.weight", "module.layer2.15.bn3.bias", "module.layer2.15.bn3.running_mean", "module.layer2.15.bn3.running_var", "module.layer2.16.conv3.weight", "module.layer2.16.bn3.weight", "module.layer2.16.bn3.bias", "module.layer2.16.bn3.running_mean", "module.layer2.16.bn3.running_var", "module.layer2.17.conv3.weight", "module.layer2.17.bn3.weight", "module.layer2.17.bn3.bias", "module.layer2.17.bn3.running_mean", "module.layer2.17.bn3.running_var", "module.layer3.0.conv3.weight", "module.layer3.0.bn3.weight", "module.layer3.0.bn3.bias", "module.layer3.0.bn3.running_mean", "module.layer3.0.bn3.running_var", "module.layer3.0.downsample.1.weight", "module.layer3.0.downsample.1.bias", "module.layer3.0.downsample.1.running_mean", "module.layer3.0.downsample.1.running_var", "module.layer3.1.conv3.weight", "module.layer3.1.bn3.weight", "module.layer3.1.bn3.bias", "module.layer3.1.bn3.running_mean", "module.layer3.1.bn3.running_var", "module.layer3.2.conv3.weight", "module.layer3.2.bn3.weight", "module.layer3.2.bn3.bias", "module.layer3.2.bn3.running_mean", "module.layer3.2.bn3.running_var", "module.layer3.3.conv3.weight", "module.layer3.3.bn3.weight", "module.layer3.3.bn3.bias", "module.layer3.3.bn3.running_mean", "module.layer3.3.bn3.running_var", "module.layer3.4.conv3.weight", "module.layer3.4.bn3.weight", "module.layer3.4.bn3.bias", "module.layer3.4.bn3.running_mean", "module.layer3.4.bn3.running_var", "module.layer3.5.conv3.weight", "module.layer3.5.bn3.weight", "module.layer3.5.bn3.bias", "module.layer3.5.bn3.running_mean", "module.layer3.5.bn3.running_var", "module.layer3.6.conv3.weight", "module.layer3.6.bn3.weight", "module.layer3.6.bn3.bias", "module.layer3.6.bn3.running_mean", "module.layer3.6.bn3.running_var", "module.layer3.7.conv3.weight", "module.layer3.7.bn3.weight", "module.layer3.7.bn3.bias", "module.layer3.7.bn3.running_mean", "module.layer3.7.bn3.running_var", "module.layer3.8.conv3.weight", "module.layer3.8.bn3.weight", "module.layer3.8.bn3.bias", "module.layer3.8.bn3.running_mean", "module.layer3.8.bn3.running_var", "module.layer3.9.conv3.weight", "module.layer3.9.bn3.weight", "module.layer3.9.bn3.bias", "module.layer3.9.bn3.running_mean", "module.layer3.9.bn3.running_var", "module.layer3.10.conv3.weight", "module.layer3.10.bn3.weight", "module.layer3.10.bn3.bias", "module.layer3.10.bn3.running_mean", "module.layer3.10.bn3.running_var", "module.layer3.11.conv3.weight", "module.layer3.11.bn3.weight", "module.layer3.11.bn3.bias", "module.layer3.11.bn3.running_mean", "module.layer3.11.bn3.running_var", "module.layer3.12.conv3.weight", "module.layer3.12.bn3.weight", "module.layer3.12.bn3.bias", "module.layer3.12.bn3.running_mean", "module.layer3.12.bn3.running_var", "module.layer3.13.conv3.weight", "module.layer3.13.bn3.weight", "module.layer3.13.bn3.bias", "module.layer3.13.bn3.running_mean", "module.layer3.13.bn3.running_var", "module.layer3.14.conv3.weight", "module.layer3.14.bn3.weight", "module.layer3.14.bn3.bias", "module.layer3.14.bn3.running_mean", "module.layer3.14.bn3.running_var", "module.layer3.15.conv3.weight", "module.layer3.15.bn3.weight", "module.layer3.15.bn3.bias", "module.layer3.15.bn3.running_mean", "module.layer3.15.bn3.running_var", "module.layer3.16.conv3.weight", "module.layer3.16.bn3.weight", "module.layer3.16.bn3.bias", "module.layer3.16.bn3.running_mean", "module.layer3.16.bn3.running_var", "module.layer3.17.conv3.weight", "module.layer3.17.bn3.weight", "module.layer3.17.bn3.bias", "module.layer3.17.bn3.running_mean", "module.layer3.17.bn3.running_var".
size mismatch for module.layer1.0.conv1.weight: copying a param with shape torch.Size([16, 16, 1, 1]) from checkpoint, the shape in current model is torch.Size([16, 16, 3, 3]).
size mismatch for module.layer1.1.conv1.weight: copying a param with shape torch.Size([16, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([16, 16, 3, 3]).
size mismatch for module.layer1.2.conv1.weight: copying a param with shape torch.Size([16, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([16, 16, 3, 3]).
size mismatch for module.layer1.3.conv1.weight: copying a param with shape torch.Size([16, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([16, 16, 3, 3]).
size mismatch for module.layer1.4.conv1.weight: copying a param with shape torch.Size([16, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([16, 16, 3, 3]).
size mismatch for module.layer1.5.conv1.weight: copying a param with shape torch.Size([16, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([16, 16, 3, 3]).
size mismatch for module.layer1.6.conv1.weight: copying a param with shape torch.Size([16, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([16, 16, 3, 3]).
size mismatch for module.layer1.7.conv1.weight: copying a param with shape torch.Size([16, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([16, 16, 3, 3]).
size mismatch for module.layer1.8.conv1.weight: copying a param with shape torch.Size([16, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([16, 16, 3, 3]).
size mismatch for module.layer1.9.conv1.weight: copying a param with shape torch.Size([16, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([16, 16, 3, 3]).
size mismatch for module.layer1.10.conv1.weight: copying a param with shape torch.Size([16, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([16, 16, 3, 3]).
size mismatch for module.layer1.11.conv1.weight: copying a param with shape torch.Size([16, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([16, 16, 3, 3]).
size mismatch for module.layer1.12.conv1.weight: copying a param with shape torch.Size([16, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([16, 16, 3, 3]).
size mismatch for module.layer1.13.conv1.weight: copying a param with shape torch.Size([16, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([16, 16, 3, 3]).
size mismatch for module.layer1.14.conv1.weight: copying a param with shape torch.Size([16, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([16, 16, 3, 3]).
size mismatch for module.layer1.15.conv1.weight: copying a param with shape torch.Size([16, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([16, 16, 3, 3]).
size mismatch for module.layer1.16.conv1.weight: copying a param with shape torch.Size([16, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([16, 16, 3, 3]).
size mismatch for module.layer1.17.conv1.weight: copying a param with shape torch.Size([16, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([16, 16, 3, 3]).
size mismatch for module.layer2.0.bn1.weight: copying a param with shape torch.Size([32]) from checkpoint, the shape in current model is torch.Size([16]).
size mismatch for module.layer2.0.bn1.bias: copying a param with shape torch.Size([32]) from checkpoint, the shape in current model is torch.Size([16]).
size mismatch for module.layer2.0.bn1.running_mean: copying a param with shape torch.Size([32]) from checkpoint, the shape in current model is torch.Size([16]).
size mismatch for module.layer2.0.bn1.running_var: copying a param with shape torch.Size([32]) from checkpoint, the shape in current model is torch.Size([16]).
size mismatch for module.layer2.0.conv1.weight: copying a param with shape torch.Size([32, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 16, 3, 3]).
size mismatch for module.layer2.0.downsample.0.weight: copying a param with shape torch.Size([128, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 16, 1, 1]).
size mismatch for module.layer2.1.conv1.weight: copying a param with shape torch.Size([32, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 32, 3, 3]).
size mismatch for module.layer2.2.conv1.weight: copying a param with shape torch.Size([32, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 32, 3, 3]).
size mismatch for module.layer2.3.conv1.weight: copying a param with shape torch.Size([32, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 32, 3, 3]).
size mismatch for module.layer2.4.conv1.weight: copying a param with shape torch.Size([32, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 32, 3, 3]).
size mismatch for module.layer2.5.conv1.weight: copying a param with shape torch.Size([32, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 32, 3, 3]).
size mismatch for module.layer2.6.conv1.weight: copying a param with shape torch.Size([32, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 32, 3, 3]).
size mismatch for module.layer2.7.conv1.weight: copying a param with shape torch.Size([32, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 32, 3, 3]).
size mismatch for module.layer2.8.conv1.weight: copying a param with shape torch.Size([32, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 32, 3, 3]).
size mismatch for module.layer2.9.conv1.weight: copying a param with shape torch.Size([32, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 32, 3, 3]).
size mismatch for module.layer2.10.conv1.weight: copying a param with shape torch.Size([32, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 32, 3, 3]).
size mismatch for module.layer2.11.conv1.weight: copying a param with shape torch.Size([32, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 32, 3, 3]).
size mismatch for module.layer2.12.conv1.weight: copying a param with shape torch.Size([32, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 32, 3, 3]).
size mismatch for module.layer2.13.conv1.weight: copying a param with shape torch.Size([32, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 32, 3, 3]).
size mismatch for module.layer2.14.conv1.weight: copying a param with shape torch.Size([32, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 32, 3, 3]).
size mismatch for module.layer2.15.conv1.weight: copying a param with shape torch.Size([32, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 32, 3, 3]).
size mismatch for module.layer2.16.conv1.weight: copying a param with shape torch.Size([32, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 32, 3, 3]).
size mismatch for module.layer2.17.conv1.weight: copying a param with shape torch.Size([32, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 32, 3, 3]).
size mismatch for module.layer3.0.bn1.weight: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for module.layer3.0.bn1.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for module.layer3.0.bn1.running_mean: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for module.layer3.0.bn1.running_var: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for module.layer3.0.conv1.weight: copying a param with shape torch.Size([64, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 32, 3, 3]).
size mismatch for module.layer3.0.downsample.0.weight: copying a param with shape torch.Size([256, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 32, 1, 1]).
size mismatch for module.layer3.1.conv1.weight: copying a param with shape torch.Size([64, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]).
size mismatch for module.layer3.2.conv1.weight: copying a param with shape torch.Size([64, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]).
size mismatch for module.layer3.3.conv1.weight: copying a param with shape torch.Size([64, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]).
size mismatch for module.layer3.4.conv1.weight: copying a param with shape torch.Size([64, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]).
size mismatch for module.layer3.5.conv1.weight: copying a param with shape torch.Size([64, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]).
size mismatch for module.layer3.6.conv1.weight: copying a param with shape torch.Size([64, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]).
size mismatch for module.layer3.7.conv1.weight: copying a param with shape torch.Size([64, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]).
size mismatch for module.layer3.8.conv1.weight: copying a param with shape torch.Size([64, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]).
size mismatch for module.layer3.9.conv1.weight: copying a param with shape torch.Size([64, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]).
size mismatch for module.layer3.10.conv1.weight: copying a param with shape torch.Size([64, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]).
size mismatch for module.layer3.11.conv1.weight: copying a param with shape torch.Size([64, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]).
size mismatch for module.layer3.12.conv1.weight: copying a param with shape torch.Size([64, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]).
size mismatch for module.layer3.13.conv1.weight: copying a param with shape torch.Size([64, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]).
size mismatch for module.layer3.14.conv1.weight: copying a param with shape torch.Size([64, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]).
size mismatch for module.layer3.15.conv1.weight: copying a param with shape torch.Size([64, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]).
size mismatch for module.layer3.16.conv1.weight: copying a param with shape torch.Size([64, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]).
size mismatch for module.layer3.17.conv1.weight: copying a param with shape torch.Size([64, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]).
size mismatch for module.fc.weight: copying a param with shape torch.Size([100, 256]) from checkpoint, the shape in current model is torch.Size([10, 64]).
size mismatch for module.fc.bias: copying a param with shape torch.Size([100]) from checkpoint, the shape in current model is torch.Size([10]).

did you solve it please ?

Kin-Zhang · 2022-02-11T02:54:57Z

        state_dict =checkpoint['state_dict']
        from collections import OrderedDict
        new_state_dict = OrderedDict()

        for k, v in state_dict.items():
            if 'module' not in k:
                k = 'module.'+k
            else:
                k = k.replace('features.module.', 'module.features.')
            new_state_dict[k]=v

        model.load_state_dict(new_state_dict)

This is the solution!!!! Thanks!!!!!

bilalghanem · 2022-04-09T04:27:31Z

change:
model.load_state_dict(torch.load(path + '/pytorch_model.pt'))
to
model.load_state_dict(torch.load(path + '/pytorch_model.pt'), strict=False)

mgrachten · 2022-11-17T08:06:19Z

change: model.load_state_dict(torch.load(path + '/pytorch_model.pt')) to model.load_state_dict(torch.load(path + '/pytorch_model.pt'), strict=False)

Although it will make the RuntimeError go away, don't do this unless you know what you are doing. It will leave any parameters it can't find in the checkpoint with random values. That's not what you want if the issue is caused by a mix-up of parameter names, as was the case for the issue reporter.

Chenny0808 · 2023-12-27T09:39:49Z

Use model.module.state_dict() instead of model.state_dict() in DP mode

dusty-nv mentioned this issue Dec 16, 2020

onnx_export.py error dusty-nv/jetson-inference#847

Closed

giangdip2410 mentioned this issue Apr 2, 2021

RuntimeError: Error(s) in loading state_dict for DataParallel: zaiweizhang/H3DNet#21

Closed

TiankangXie mentioned this issue Nov 29, 2022

Debug why mobilenet performs worse when using batching cosanlab/py-feat#151

Closed

cs-mshah mentioned this issue Jan 13, 2023

Loading a saved model nayeemrizve/OpenLDN#5

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Error(s) in loading state_dict for DataParallel: #27

Error(s) in loading state_dict for DataParallel: #27

cosmolu commented Oct 15, 2018

imirzadeh commented Nov 27, 2018

goncalomordido commented Mar 20, 2019

xjock commented Apr 17, 2019

raghav1810 commented Jul 25, 2019

jiteshm17 commented Mar 17, 2020

yustiks commented Jan 28, 2021

ersamo commented Sep 27, 2021

Kin-Zhang commented Feb 11, 2022 •

edited

Loading

bilalghanem commented Apr 9, 2022

mgrachten commented Nov 17, 2022

Chenny0808 commented Dec 27, 2023

Error(s) in loading state_dict for DataParallel: #27

Error(s) in loading state_dict for DataParallel: #27

Comments

cosmolu commented Oct 15, 2018

imirzadeh commented Nov 27, 2018

goncalomordido commented Mar 20, 2019

xjock commented Apr 17, 2019

raghav1810 commented Jul 25, 2019

jiteshm17 commented Mar 17, 2020

yustiks commented Jan 28, 2021

ersamo commented Sep 27, 2021

Kin-Zhang commented Feb 11, 2022 • edited Loading

bilalghanem commented Apr 9, 2022

mgrachten commented Nov 17, 2022

Chenny0808 commented Dec 27, 2023

Kin-Zhang commented Feb 11, 2022 •

edited

Loading