ViTDet huge checkpoint reports incompatible shapes & missing layers in the backbone #4566

Open
kretes opened this issue Sep 23, 2022 · 3 comments


kretes commented Sep 23, 2022

Instructions To Reproduce the Issue:

  1. Full runnable code or full changes you made:
     no changes
  2. What exact command you run:
     tools/lazyconfig_train_net.py --config-file projects/ViTDet/configs/COCO/mask_rcnn_vitdet_h_75ep.py "dataloader.train.total_batch_size=1"
  3. Full logs or other relevant observations:
[09/23 17:55:36 fvcore.common.checkpoint]: [Checkpointer] Loading from detectron2://ImageNetPretrained/MAE/mae_pretrain_vit_huge_p14to16.pth ...
WARNING [09/23 17:55:37 d2.checkpoint.c2_model_loading]: Shape of norm.bias in checkpoint is torch.Size([1280]), while shape of backbone.simfp_2.4.norm.bias in model is torch.Size([256]).
WARNING [09/23 17:55:37 d2.checkpoint.c2_model_loading]: norm.bias will not be loaded. Please double check and see if this is desired.
WARNING [09/23 17:55:37 d2.checkpoint.c2_model_loading]: Shape of norm.weight in checkpoint is torch.Size([1280]), while shape of backbone.simfp_2.4.norm.weight in model is torch.Size([256]).
WARNING [09/23 17:55:37 d2.checkpoint.c2_model_loading]: norm.weight will not be loaded. Please double check and see if this is desired.
WARNING [09/23 17:55:37 d2.checkpoint.c2_model_loading]: Shape of norm.bias in checkpoint is torch.Size([1280]), while shape of backbone.simfp_2.5.norm.bias in model is torch.Size([256]).
WARNING [09/23 17:55:37 d2.checkpoint.c2_model_loading]: norm.bias will not be loaded. Please double check and see if this is desired.
WARNING [09/23 17:55:37 d2.checkpoint.c2_model_loading]: Shape of norm.weight in checkpoint is torch.Size([1280]), while shape of backbone.simfp_2.5.norm.weight in model is torch.Size([256]).
WARNING [09/23 17:55:37 d2.checkpoint.c2_model_loading]: norm.weight will not be loaded. Please double check and see if this is desired.
WARNING [09/23 17:55:37 d2.checkpoint.c2_model_loading]: Shape of norm.bias in checkpoint is torch.Size([1280]), while shape of backbone.simfp_3.1.norm.bias in model is torch.Size([256]).
WARNING [09/23 17:55:37 d2.checkpoint.c2_model_loading]: norm.bias will not be loaded. Please double check and see if this is desired.
WARNING [09/23 17:55:37 d2.checkpoint.c2_model_loading]: Shape of norm.weight in checkpoint is torch.Size([1280]), while shape of backbone.simfp_3.1.norm.weight in model is torch.Size([256]).
WARNING [09/23 17:55:37 d2.checkpoint.c2_model_loading]: norm.weight will not be loaded. Please double check and see if this is desired.
WARNING [09/23 17:55:37 d2.checkpoint.c2_model_loading]: Shape of norm.bias in checkpoint is torch.Size([1280]), while shape of backbone.simfp_3.2.norm.bias in model is torch.Size([256]).
WARNING [09/23 17:55:37 d2.checkpoint.c2_model_loading]: norm.bias will not be loaded. Please double check and see if this is desired.
WARNING [09/23 17:55:37 d2.checkpoint.c2_model_loading]: Shape of norm.weight in checkpoint is torch.Size([1280]), while shape of backbone.simfp_3.2.norm.weight in model is torch.Size([256]).
WARNING [09/23 17:55:37 d2.checkpoint.c2_model_loading]: norm.weight will not be loaded. Please double check and see if this is desired.
WARNING [09/23 17:55:37 d2.checkpoint.c2_model_loading]: Shape of norm.bias in checkpoint is torch.Size([1280]), while shape of backbone.simfp_4.0.norm.bias in model is torch.Size([256]).
WARNING [09/23 17:55:37 d2.checkpoint.c2_model_loading]: norm.bias will not be loaded. Please double check and see if this is desired.
WARNING [09/23 17:55:37 d2.checkpoint.c2_model_loading]: Shape of norm.weight in checkpoint is torch.Size([1280]), while shape of backbone.simfp_4.0.norm.weight in model is torch.Size([256]).
WARNING [09/23 17:55:37 d2.checkpoint.c2_model_loading]: norm.weight will not be loaded. Please double check and see if this is desired.
WARNING [09/23 17:55:37 d2.checkpoint.c2_model_loading]: Shape of norm.bias in checkpoint is torch.Size([1280]), while shape of backbone.simfp_4.1.norm.bias in model is torch.Size([256]).
WARNING [09/23 17:55:37 d2.checkpoint.c2_model_loading]: norm.bias will not be loaded. Please double check and see if this is desired.
WARNING [09/23 17:55:37 d2.checkpoint.c2_model_loading]: Shape of norm.weight in checkpoint is torch.Size([1280]), while shape of backbone.simfp_4.1.norm.weight in model is torch.Size([256]).
WARNING [09/23 17:55:37 d2.checkpoint.c2_model_loading]: norm.weight will not be loaded. Please double check and see if this is desired.
WARNING [09/23 17:55:37 d2.checkpoint.c2_model_loading]: Shape of norm.bias in checkpoint is torch.Size([1280]), while shape of backbone.simfp_5.1.norm.bias in model is torch.Size([256]).
WARNING [09/23 17:55:37 d2.checkpoint.c2_model_loading]: norm.bias will not be loaded. Please double check and see if this is desired.
WARNING [09/23 17:55:37 d2.checkpoint.c2_model_loading]: Shape of norm.weight in checkpoint is torch.Size([1280]), while shape of backbone.simfp_5.1.norm.weight in model is torch.Size([256]).
WARNING [09/23 17:55:37 d2.checkpoint.c2_model_loading]: norm.weight will not be loaded. Please double check and see if this is desired.
WARNING [09/23 17:55:37 d2.checkpoint.c2_model_loading]: Shape of norm.bias in checkpoint is torch.Size([1280]), while shape of backbone.simfp_5.2.norm.bias in model is torch.Size([256]).
WARNING [09/23 17:55:37 d2.checkpoint.c2_model_loading]: norm.bias will not be loaded. Please double check and see if this is desired.
WARNING [09/23 17:55:37 d2.checkpoint.c2_model_loading]: Shape of norm.weight in checkpoint is torch.Size([1280]), while shape of backbone.simfp_5.2.norm.weight in model is torch.Size([256]).
WARNING [09/23 17:55:37 d2.checkpoint.c2_model_loading]: norm.weight will not be loaded. Please double check and see if this is desired.
WARNING [09/23 17:55:37 d2.checkpoint.c2_model_loading]: Shape of norm.bias in checkpoint is torch.Size([1280]), while shape of roi_heads.box_head.conv1.norm.bias in model is torch.Size([256]).
WARNING [09/23 17:55:37 d2.checkpoint.c2_model_loading]: norm.bias will not be loaded. Please double check and see if this is desired.
WARNING [09/23 17:55:37 d2.checkpoint.c2_model_loading]: Shape of norm.weight in checkpoint is torch.Size([1280]), while shape of roi_heads.box_head.conv1.norm.weight in model is torch.Size([256]).
WARNING [09/23 17:55:37 d2.checkpoint.c2_model_loading]: norm.weight will not be loaded. Please double check and see if this is desired.
WARNING [09/23 17:55:37 d2.checkpoint.c2_model_loading]: Shape of norm.bias in checkpoint is torch.Size([1280]), while shape of roi_heads.box_head.conv2.norm.bias in model is torch.Size([256]).
WARNING [09/23 17:55:37 d2.checkpoint.c2_model_loading]: norm.bias will not be loaded. Please double check and see if this is desired.
WARNING [09/23 17:55:37 d2.checkpoint.c2_model_loading]: Shape of norm.weight in checkpoint is torch.Size([1280]), while shape of roi_heads.box_head.conv2.norm.weight in model is torch.Size([256]).
WARNING [09/23 17:55:37 d2.checkpoint.c2_model_loading]: norm.weight will not be loaded. Please double check and see if this is desired.
WARNING [09/23 17:55:37 d2.checkpoint.c2_model_loading]: Shape of norm.bias in checkpoint is torch.Size([1280]), while shape of roi_heads.box_head.conv3.norm.bias in model is torch.Size([256]).
WARNING [09/23 17:55:37 d2.checkpoint.c2_model_loading]: norm.bias will not be loaded. Please double check and see if this is desired.
WARNING [09/23 17:55:37 d2.checkpoint.c2_model_loading]: Shape of norm.weight in checkpoint is torch.Size([1280]), while shape of roi_heads.box_head.conv3.norm.weight in model is torch.Size([256]).
WARNING [09/23 17:55:37 d2.checkpoint.c2_model_loading]: norm.weight will not be loaded. Please double check and see if this is desired.
WARNING [09/23 17:55:37 d2.checkpoint.c2_model_loading]: Shape of norm.bias in checkpoint is torch.Size([1280]), while shape of roi_heads.box_head.conv4.norm.bias in model is torch.Size([256]).
WARNING [09/23 17:55:37 d2.checkpoint.c2_model_loading]: norm.bias will not be loaded. Please double check and see if this is desired.
WARNING [09/23 17:55:37 d2.checkpoint.c2_model_loading]: Shape of norm.weight in checkpoint is torch.Size([1280]), while shape of roi_heads.box_head.conv4.norm.weight in model is torch.Size([256]).
WARNING [09/23 17:55:37 d2.checkpoint.c2_model_loading]: norm.weight will not be loaded. Please double check and see if this is desired.
WARNING [09/23 17:55:37 d2.checkpoint.c2_model_loading]: Shape of norm.bias in checkpoint is torch.Size([1280]), while shape of roi_heads.mask_head.mask_fcn1.norm.bias in model is torch.Size([256]).
WARNING [09/23 17:55:37 d2.checkpoint.c2_model_loading]: norm.bias will not be loaded. Please double check and see if this is desired.
WARNING [09/23 17:55:37 d2.checkpoint.c2_model_loading]: Shape of norm.weight in checkpoint is torch.Size([1280]), while shape of roi_heads.mask_head.mask_fcn1.norm.weight in model is torch.Size([256]).
WARNING [09/23 17:55:37 d2.checkpoint.c2_model_loading]: norm.weight will not be loaded. Please double check and see if this is desired.
WARNING [09/23 17:55:37 d2.checkpoint.c2_model_loading]: Shape of norm.bias in checkpoint is torch.Size([1280]), while shape of roi_heads.mask_head.mask_fcn2.norm.bias in model is torch.Size([256]).
WARNING [09/23 17:55:37 d2.checkpoint.c2_model_loading]: norm.bias will not be loaded. Please double check and see if this is desired.
WARNING [09/23 17:55:37 d2.checkpoint.c2_model_loading]: Shape of norm.weight in checkpoint is torch.Size([1280]), while shape of roi_heads.mask_head.mask_fcn2.norm.weight in model is torch.Size([256]).
WARNING [09/23 17:55:37 d2.checkpoint.c2_model_loading]: norm.weight will not be loaded. Please double check and see if this is desired.
WARNING [09/23 17:55:37 d2.checkpoint.c2_model_loading]: Shape of norm.bias in checkpoint is torch.Size([1280]), while shape of roi_heads.mask_head.mask_fcn3.norm.bias in model is torch.Size([256]).
WARNING [09/23 17:55:37 d2.checkpoint.c2_model_loading]: norm.bias will not be loaded. Please double check and see if this is desired.
WARNING [09/23 17:55:37 d2.checkpoint.c2_model_loading]: Shape of norm.weight in checkpoint is torch.Size([1280]), while shape of roi_heads.mask_head.mask_fcn3.norm.weight in model is torch.Size([256]).
WARNING [09/23 17:55:37 d2.checkpoint.c2_model_loading]: norm.weight will not be loaded. Please double check and see if this is desired.
WARNING [09/23 17:55:37 d2.checkpoint.c2_model_loading]: Shape of norm.bias in checkpoint is torch.Size([1280]), while shape of roi_heads.mask_head.mask_fcn4.norm.bias in model is torch.Size([256]).
WARNING [09/23 17:55:37 d2.checkpoint.c2_model_loading]: norm.bias will not be loaded. Please double check and see if this is desired.
WARNING [09/23 17:55:37 d2.checkpoint.c2_model_loading]: Shape of norm.weight in checkpoint is torch.Size([1280]), while shape of roi_heads.mask_head.mask_fcn4.norm.weight in model is torch.Size([256]).
WARNING [09/23 17:55:37 d2.checkpoint.c2_model_loading]: norm.weight will not be loaded. Please double check and see if this is desired.
[09/23 17:55:37 d2.checkpoint.c2_model_loading]: Following weights matched with submodule backbone.net:
| Names in Model        | Names in Checkpoint               | Shapes                 |
|:----------------------|:----------------------------------|:-----------------------|
| blocks.0.attn.proj.*  | blocks.0.attn.proj.{bias,weight}  | (1280,) (1280,1280)    |
| blocks.0.attn.qkv.*   | blocks.0.attn.qkv.{bias,weight}   | (3840,) (3840,1280)    |
| blocks.0.mlp.fc1.*    | blocks.0.mlp.fc1.{bias,weight}    | (5120,) (5120,1280)    |
| blocks.0.mlp.fc2.*    | blocks.0.mlp.fc2.{bias,weight}    | (1280,) (1280,5120)    |
| blocks.0.norm1.*      | blocks.0.norm1.{bias,weight}      | (1280,) (1280,)        |
| blocks.0.norm2.*      | blocks.0.norm2.{bias,weight}      | (1280,) (1280,)        |
| blocks.1.attn.proj.*  | blocks.1.attn.proj.{bias,weight}  | (1280,) (1280,1280)    |
| blocks.1.attn.qkv.*   | blocks.1.attn.qkv.{bias,weight}   | (3840,) (3840,1280)    |
| blocks.1.mlp.fc1.*    | blocks.1.mlp.fc1.{bias,weight}    | (5120,) (5120,1280)    |
| blocks.1.mlp.fc2.*    | blocks.1.mlp.fc2.{bias,weight}    | (1280,) (1280,5120)    |
| blocks.1.norm1.*      | blocks.1.norm1.{bias,weight}      | (1280,) (1280,)        |
| blocks.1.norm2.*      | blocks.1.norm2.{bias,weight}      | (1280,) (1280,)        |
| blocks.10.attn.proj.* | blocks.10.attn.proj.{bias,weight} | (1280,) (1280,1280)    |
| blocks.10.attn.qkv.*  | blocks.10.attn.qkv.{bias,weight}  | (3840,) (3840,1280)    |
| blocks.10.mlp.fc1.*   | blocks.10.mlp.fc1.{bias,weight}   | (5120,) (5120,1280)    |
| blocks.10.mlp.fc2.*   | blocks.10.mlp.fc2.{bias,weight}   | (1280,) (1280,5120)    |
| blocks.10.norm1.*     | blocks.10.norm1.{bias,weight}     | (1280,) (1280,)        |
| blocks.10.norm2.*     | blocks.10.norm2.{bias,weight}     | (1280,) (1280,)        |
| blocks.11.attn.proj.* | blocks.11.attn.proj.{bias,weight} | (1280,) (1280,1280)    |
| blocks.11.attn.qkv.*  | blocks.11.attn.qkv.{bias,weight}  | (3840,) (3840,1280)    |
| blocks.11.mlp.fc1.*   | blocks.11.mlp.fc1.{bias,weight}   | (5120,) (5120,1280)    |
| blocks.11.mlp.fc2.*   | blocks.11.mlp.fc2.{bias,weight}   | (1280,) (1280,5120)    |
| blocks.11.norm1.*     | blocks.11.norm1.{bias,weight}     | (1280,) (1280,)        |
| blocks.11.norm2.*     | blocks.11.norm2.{bias,weight}     | (1280,) (1280,)        |
| blocks.12.attn.proj.* | blocks.12.attn.proj.{bias,weight} | (1280,) (1280,1280)    |
| blocks.12.attn.qkv.*  | blocks.12.attn.qkv.{bias,weight}  | (3840,) (3840,1280)    |
| blocks.12.mlp.fc1.*   | blocks.12.mlp.fc1.{bias,weight}   | (5120,) (5120,1280)    |
| blocks.12.mlp.fc2.*   | blocks.12.mlp.fc2.{bias,weight}   | (1280,) (1280,5120)    |
| blocks.12.norm1.*     | blocks.12.norm1.{bias,weight}     | (1280,) (1280,)        |
| blocks.12.norm2.*     | blocks.12.norm2.{bias,weight}     | (1280,) (1280,)        |
| blocks.13.attn.proj.* | blocks.13.attn.proj.{bias,weight} | (1280,) (1280,1280)    |
| blocks.13.attn.qkv.*  | blocks.13.attn.qkv.{bias,weight}  | (3840,) (3840,1280)    |
| blocks.13.mlp.fc1.*   | blocks.13.mlp.fc1.{bias,weight}   | (5120,) (5120,1280)    |
| blocks.13.mlp.fc2.*   | blocks.13.mlp.fc2.{bias,weight}   | (1280,) (1280,5120)    |
| blocks.13.norm1.*     | blocks.13.norm1.{bias,weight}     | (1280,) (1280,)        |
| blocks.13.norm2.*     | blocks.13.norm2.{bias,weight}     | (1280,) (1280,)        |
| blocks.14.attn.proj.* | blocks.14.attn.proj.{bias,weight} | (1280,) (1280,1280)    |
| blocks.14.attn.qkv.*  | blocks.14.attn.qkv.{bias,weight}  | (3840,) (3840,1280)    |
| blocks.14.mlp.fc1.*   | blocks.14.mlp.fc1.{bias,weight}   | (5120,) (5120,1280)    |
| blocks.14.mlp.fc2.*   | blocks.14.mlp.fc2.{bias,weight}   | (1280,) (1280,5120)    |
| blocks.14.norm1.*     | blocks.14.norm1.{bias,weight}     | (1280,) (1280,)        |
| blocks.14.norm2.*     | blocks.14.norm2.{bias,weight}     | (1280,) (1280,)        |
| blocks.15.attn.proj.* | blocks.15.attn.proj.{bias,weight} | (1280,) (1280,1280)    |
| blocks.15.attn.qkv.*  | blocks.15.attn.qkv.{bias,weight}  | (3840,) (3840,1280)    |
| blocks.15.mlp.fc1.*   | blocks.15.mlp.fc1.{bias,weight}   | (5120,) (5120,1280)    |
| blocks.15.mlp.fc2.*   | blocks.15.mlp.fc2.{bias,weight}   | (1280,) (1280,5120)    |
| blocks.15.norm1.*     | blocks.15.norm1.{bias,weight}     | (1280,) (1280,)        |
| blocks.15.norm2.*     | blocks.15.norm2.{bias,weight}     | (1280,) (1280,)        |
| blocks.16.attn.proj.* | blocks.16.attn.proj.{bias,weight} | (1280,) (1280,1280)    |
| blocks.16.attn.qkv.*  | blocks.16.attn.qkv.{bias,weight}  | (3840,) (3840,1280)    |
| blocks.16.mlp.fc1.*   | blocks.16.mlp.fc1.{bias,weight}   | (5120,) (5120,1280)    |
| blocks.16.mlp.fc2.*   | blocks.16.mlp.fc2.{bias,weight}   | (1280,) (1280,5120)    |
| blocks.16.norm1.*     | blocks.16.norm1.{bias,weight}     | (1280,) (1280,)        |
| blocks.16.norm2.*     | blocks.16.norm2.{bias,weight}     | (1280,) (1280,)        |
| blocks.17.attn.proj.* | blocks.17.attn.proj.{bias,weight} | (1280,) (1280,1280)    |
| blocks.17.attn.qkv.*  | blocks.17.attn.qkv.{bias,weight}  | (3840,) (3840,1280)    |
| blocks.17.mlp.fc1.*   | blocks.17.mlp.fc1.{bias,weight}   | (5120,) (5120,1280)    |
| blocks.17.mlp.fc2.*   | blocks.17.mlp.fc2.{bias,weight}   | (1280,) (1280,5120)    |
| blocks.17.norm1.*     | blocks.17.norm1.{bias,weight}     | (1280,) (1280,)        |
| blocks.17.norm2.*     | blocks.17.norm2.{bias,weight}     | (1280,) (1280,)        |
| blocks.18.attn.proj.* | blocks.18.attn.proj.{bias,weight} | (1280,) (1280,1280)    |
| blocks.18.attn.qkv.*  | blocks.18.attn.qkv.{bias,weight}  | (3840,) (3840,1280)    |
| blocks.18.mlp.fc1.*   | blocks.18.mlp.fc1.{bias,weight}   | (5120,) (5120,1280)    |
| blocks.18.mlp.fc2.*   | blocks.18.mlp.fc2.{bias,weight}   | (1280,) (1280,5120)    |
| blocks.18.norm1.*     | blocks.18.norm1.{bias,weight}     | (1280,) (1280,)        |
| blocks.18.norm2.*     | blocks.18.norm2.{bias,weight}     | (1280,) (1280,)        |
| blocks.19.attn.proj.* | blocks.19.attn.proj.{bias,weight} | (1280,) (1280,1280)    |
| blocks.19.attn.qkv.*  | blocks.19.attn.qkv.{bias,weight}  | (3840,) (3840,1280)    |
| blocks.19.mlp.fc1.*   | blocks.19.mlp.fc1.{bias,weight}   | (5120,) (5120,1280)    |
| blocks.19.mlp.fc2.*   | blocks.19.mlp.fc2.{bias,weight}   | (1280,) (1280,5120)    |
| blocks.19.norm1.*     | blocks.19.norm1.{bias,weight}     | (1280,) (1280,)        |
| blocks.19.norm2.*     | blocks.19.norm2.{bias,weight}     | (1280,) (1280,)        |
| blocks.2.attn.proj.*  | blocks.2.attn.proj.{bias,weight}  | (1280,) (1280,1280)    |
| blocks.2.attn.qkv.*   | blocks.2.attn.qkv.{bias,weight}   | (3840,) (3840,1280)    |
| blocks.2.mlp.fc1.*    | blocks.2.mlp.fc1.{bias,weight}    | (5120,) (5120,1280)    |
| blocks.2.mlp.fc2.*    | blocks.2.mlp.fc2.{bias,weight}    | (1280,) (1280,5120)    |
| blocks.2.norm1.*      | blocks.2.norm1.{bias,weight}      | (1280,) (1280,)        |
| blocks.2.norm2.*      | blocks.2.norm2.{bias,weight}      | (1280,) (1280,)        |
| blocks.20.attn.proj.* | blocks.20.attn.proj.{bias,weight} | (1280,) (1280,1280)    |
| blocks.20.attn.qkv.*  | blocks.20.attn.qkv.{bias,weight}  | (3840,) (3840,1280)    |
| blocks.20.mlp.fc1.*   | blocks.20.mlp.fc1.{bias,weight}   | (5120,) (5120,1280)    |
| blocks.20.mlp.fc2.*   | blocks.20.mlp.fc2.{bias,weight}   | (1280,) (1280,5120)    |
| blocks.20.norm1.*     | blocks.20.norm1.{bias,weight}     | (1280,) (1280,)        |
| blocks.20.norm2.*     | blocks.20.norm2.{bias,weight}     | (1280,) (1280,)        |
| blocks.21.attn.proj.* | blocks.21.attn.proj.{bias,weight} | (1280,) (1280,1280)    |
| blocks.21.attn.qkv.*  | blocks.21.attn.qkv.{bias,weight}  | (3840,) (3840,1280)    |
| blocks.21.mlp.fc1.*   | blocks.21.mlp.fc1.{bias,weight}   | (5120,) (5120,1280)    |
| blocks.21.mlp.fc2.*   | blocks.21.mlp.fc2.{bias,weight}   | (1280,) (1280,5120)    |
| blocks.21.norm1.*     | blocks.21.norm1.{bias,weight}     | (1280,) (1280,)        |
| blocks.21.norm2.*     | blocks.21.norm2.{bias,weight}     | (1280,) (1280,)        |
| blocks.22.attn.proj.* | blocks.22.attn.proj.{bias,weight} | (1280,) (1280,1280)    |
| blocks.22.attn.qkv.*  | blocks.22.attn.qkv.{bias,weight}  | (3840,) (3840,1280)    |
| blocks.22.mlp.fc1.*   | blocks.22.mlp.fc1.{bias,weight}   | (5120,) (5120,1280)    |
| blocks.22.mlp.fc2.*   | blocks.22.mlp.fc2.{bias,weight}   | (1280,) (1280,5120)    |
| blocks.22.norm1.*     | blocks.22.norm1.{bias,weight}     | (1280,) (1280,)        |
| blocks.22.norm2.*     | blocks.22.norm2.{bias,weight}     | (1280,) (1280,)        |
| blocks.23.attn.proj.* | blocks.23.attn.proj.{bias,weight} | (1280,) (1280,1280)    |
| blocks.23.attn.qkv.*  | blocks.23.attn.qkv.{bias,weight}  | (3840,) (3840,1280)    |
| blocks.23.mlp.fc1.*   | blocks.23.mlp.fc1.{bias,weight}   | (5120,) (5120,1280)    |
| blocks.23.mlp.fc2.*   | blocks.23.mlp.fc2.{bias,weight}   | (1280,) (1280,5120)    |
| blocks.23.norm1.*     | blocks.23.norm1.{bias,weight}     | (1280,) (1280,)        |
| blocks.23.norm2.*     | blocks.23.norm2.{bias,weight}     | (1280,) (1280,)        |
| blocks.24.attn.proj.* | blocks.24.attn.proj.{bias,weight} | (1280,) (1280,1280)    |
| blocks.24.attn.qkv.*  | blocks.24.attn.qkv.{bias,weight}  | (3840,) (3840,1280)    |
| blocks.24.mlp.fc1.*   | blocks.24.mlp.fc1.{bias,weight}   | (5120,) (5120,1280)    |
| blocks.24.mlp.fc2.*   | blocks.24.mlp.fc2.{bias,weight}   | (1280,) (1280,5120)    |
| blocks.24.norm1.*     | blocks.24.norm1.{bias,weight}     | (1280,) (1280,)        |
| blocks.24.norm2.*     | blocks.24.norm2.{bias,weight}     | (1280,) (1280,)        |
| blocks.25.attn.proj.* | blocks.25.attn.proj.{bias,weight} | (1280,) (1280,1280)    |
| blocks.25.attn.qkv.*  | blocks.25.attn.qkv.{bias,weight}  | (3840,) (3840,1280)    |
| blocks.25.mlp.fc1.*   | blocks.25.mlp.fc1.{bias,weight}   | (5120,) (5120,1280)    |
| blocks.25.mlp.fc2.*   | blocks.25.mlp.fc2.{bias,weight}   | (1280,) (1280,5120)    |
| blocks.25.norm1.*     | blocks.25.norm1.{bias,weight}     | (1280,) (1280,)        |
| blocks.25.norm2.*     | blocks.25.norm2.{bias,weight}     | (1280,) (1280,)        |
| blocks.26.attn.proj.* | blocks.26.attn.proj.{bias,weight} | (1280,) (1280,1280)    |
| blocks.26.attn.qkv.*  | blocks.26.attn.qkv.{bias,weight}  | (3840,) (3840,1280)    |
| blocks.26.mlp.fc1.*   | blocks.26.mlp.fc1.{bias,weight}   | (5120,) (5120,1280)    |
| blocks.26.mlp.fc2.*   | blocks.26.mlp.fc2.{bias,weight}   | (1280,) (1280,5120)    |
| blocks.26.norm1.*     | blocks.26.norm1.{bias,weight}     | (1280,) (1280,)        |
| blocks.26.norm2.*     | blocks.26.norm2.{bias,weight}     | (1280,) (1280,)        |
| blocks.27.attn.proj.* | blocks.27.attn.proj.{bias,weight} | (1280,) (1280,1280)    |
| blocks.27.attn.qkv.*  | blocks.27.attn.qkv.{bias,weight}  | (3840,) (3840,1280)    |
| blocks.27.mlp.fc1.*   | blocks.27.mlp.fc1.{bias,weight}   | (5120,) (5120,1280)    |
| blocks.27.mlp.fc2.*   | blocks.27.mlp.fc2.{bias,weight}   | (1280,) (1280,5120)    |
| blocks.27.norm1.*     | blocks.27.norm1.{bias,weight}     | (1280,) (1280,)        |
| blocks.27.norm2.*     | blocks.27.norm2.{bias,weight}     | (1280,) (1280,)        |
| blocks.28.attn.proj.* | blocks.28.attn.proj.{bias,weight} | (1280,) (1280,1280)    |
| blocks.28.attn.qkv.*  | blocks.28.attn.qkv.{bias,weight}  | (3840,) (3840,1280)    |
| blocks.28.mlp.fc1.*   | blocks.28.mlp.fc1.{bias,weight}   | (5120,) (5120,1280)    |
| blocks.28.mlp.fc2.*   | blocks.28.mlp.fc2.{bias,weight}   | (1280,) (1280,5120)    |
| blocks.28.norm1.*     | blocks.28.norm1.{bias,weight}     | (1280,) (1280,)        |
| blocks.28.norm2.*     | blocks.28.norm2.{bias,weight}     | (1280,) (1280,)        |
| blocks.29.attn.proj.* | blocks.29.attn.proj.{bias,weight} | (1280,) (1280,1280)    |
| blocks.29.attn.qkv.*  | blocks.29.attn.qkv.{bias,weight}  | (3840,) (3840,1280)    |
| blocks.29.mlp.fc1.*   | blocks.29.mlp.fc1.{bias,weight}   | (5120,) (5120,1280)    |
| blocks.29.mlp.fc2.*   | blocks.29.mlp.fc2.{bias,weight}   | (1280,) (1280,5120)    |
| blocks.29.norm1.*     | blocks.29.norm1.{bias,weight}     | (1280,) (1280,)        |
| blocks.29.norm2.*     | blocks.29.norm2.{bias,weight}     | (1280,) (1280,)        |
| blocks.3.attn.proj.*  | blocks.3.attn.proj.{bias,weight}  | (1280,) (1280,1280)    |
| blocks.3.attn.qkv.*   | blocks.3.attn.qkv.{bias,weight}   | (3840,) (3840,1280)    |
| blocks.3.mlp.fc1.*    | blocks.3.mlp.fc1.{bias,weight}    | (5120,) (5120,1280)    |
| blocks.3.mlp.fc2.*    | blocks.3.mlp.fc2.{bias,weight}    | (1280,) (1280,5120)    |
| blocks.3.norm1.*      | blocks.3.norm1.{bias,weight}      | (1280,) (1280,)        |
| blocks.3.norm2.*      | blocks.3.norm2.{bias,weight}      | (1280,) (1280,)        |
| blocks.30.attn.proj.* | blocks.30.attn.proj.{bias,weight} | (1280,) (1280,1280)    |
| blocks.30.attn.qkv.*  | blocks.30.attn.qkv.{bias,weight}  | (3840,) (3840,1280)    |
| blocks.30.mlp.fc1.*   | blocks.30.mlp.fc1.{bias,weight}   | (5120,) (5120,1280)    |
| blocks.30.mlp.fc2.*   | blocks.30.mlp.fc2.{bias,weight}   | (1280,) (1280,5120)    |
| blocks.30.norm1.*     | blocks.30.norm1.{bias,weight}     | (1280,) (1280,)        |
| blocks.30.norm2.*     | blocks.30.norm2.{bias,weight}     | (1280,) (1280,)        |
| blocks.31.attn.proj.* | blocks.31.attn.proj.{bias,weight} | (1280,) (1280,1280)    |
| blocks.31.attn.qkv.*  | blocks.31.attn.qkv.{bias,weight}  | (3840,) (3840,1280)    |
| blocks.31.mlp.fc1.*   | blocks.31.mlp.fc1.{bias,weight}   | (5120,) (5120,1280)    |
| blocks.31.mlp.fc2.*   | blocks.31.mlp.fc2.{bias,weight}   | (1280,) (1280,5120)    |
| blocks.31.norm1.*     | blocks.31.norm1.{bias,weight}     | (1280,) (1280,)        |
| blocks.31.norm2.*     | blocks.31.norm2.{bias,weight}     | (1280,) (1280,)        |
| blocks.4.attn.proj.*  | blocks.4.attn.proj.{bias,weight}  | (1280,) (1280,1280)    |
| blocks.4.attn.qkv.*   | blocks.4.attn.qkv.{bias,weight}   | (3840,) (3840,1280)    |
| blocks.4.mlp.fc1.*    | blocks.4.mlp.fc1.{bias,weight}    | (5120,) (5120,1280)    |
| blocks.4.mlp.fc2.*    | blocks.4.mlp.fc2.{bias,weight}    | (1280,) (1280,5120)    |
| blocks.4.norm1.*      | blocks.4.norm1.{bias,weight}      | (1280,) (1280,)        |
| blocks.4.norm2.*      | blocks.4.norm2.{bias,weight}      | (1280,) (1280,)        |
| blocks.5.attn.proj.*  | blocks.5.attn.proj.{bias,weight}  | (1280,) (1280,1280)    |
| blocks.5.attn.qkv.*   | blocks.5.attn.qkv.{bias,weight}   | (3840,) (3840,1280)    |
| blocks.5.mlp.fc1.*    | blocks.5.mlp.fc1.{bias,weight}    | (5120,) (5120,1280)    |
| blocks.5.mlp.fc2.*    | blocks.5.mlp.fc2.{bias,weight}    | (1280,) (1280,5120)    |
| blocks.5.norm1.*      | blocks.5.norm1.{bias,weight}      | (1280,) (1280,)        |
| blocks.5.norm2.*      | blocks.5.norm2.{bias,weight}      | (1280,) (1280,)        |
| blocks.6.attn.proj.*  | blocks.6.attn.proj.{bias,weight}  | (1280,) (1280,1280)    |
| blocks.6.attn.qkv.*   | blocks.6.attn.qkv.{bias,weight}   | (3840,) (3840,1280)    |
| blocks.6.mlp.fc1.*    | blocks.6.mlp.fc1.{bias,weight}    | (5120,) (5120,1280)    |
| blocks.6.mlp.fc2.*    | blocks.6.mlp.fc2.{bias,weight}    | (1280,) (1280,5120)    |
| blocks.6.norm1.*      | blocks.6.norm1.{bias,weight}      | (1280,) (1280,)        |
| blocks.6.norm2.*      | blocks.6.norm2.{bias,weight}      | (1280,) (1280,)        |
| blocks.7.attn.proj.*  | blocks.7.attn.proj.{bias,weight}  | (1280,) (1280,1280)    |
| blocks.7.attn.qkv.*   | blocks.7.attn.qkv.{bias,weight}   | (3840,) (3840,1280)    |
| blocks.7.mlp.fc1.*    | blocks.7.mlp.fc1.{bias,weight}    | (5120,) (5120,1280)    |
| blocks.7.mlp.fc2.*    | blocks.7.mlp.fc2.{bias,weight}    | (1280,) (1280,5120)    |
| blocks.7.norm1.*      | blocks.7.norm1.{bias,weight}      | (1280,) (1280,)        |
| blocks.7.norm2.*      | blocks.7.norm2.{bias,weight}      | (1280,) (1280,)        |
| blocks.8.attn.proj.*  | blocks.8.attn.proj.{bias,weight}  | (1280,) (1280,1280)    |
| blocks.8.attn.qkv.*   | blocks.8.attn.qkv.{bias,weight}   | (3840,) (3840,1280)    |
| blocks.8.mlp.fc1.*    | blocks.8.mlp.fc1.{bias,weight}    | (5120,) (5120,1280)    |
| blocks.8.mlp.fc2.*    | blocks.8.mlp.fc2.{bias,weight}    | (1280,) (1280,5120)    |
| blocks.8.norm1.*      | blocks.8.norm1.{bias,weight}      | (1280,) (1280,)        |
| blocks.8.norm2.*      | blocks.8.norm2.{bias,weight}      | (1280,) (1280,)        |
| blocks.9.attn.proj.*  | blocks.9.attn.proj.{bias,weight}  | (1280,) (1280,1280)    |
| blocks.9.attn.qkv.*   | blocks.9.attn.qkv.{bias,weight}   | (3840,) (3840,1280)    |
| blocks.9.mlp.fc1.*    | blocks.9.mlp.fc1.{bias,weight}    | (5120,) (5120,1280)    |
| blocks.9.mlp.fc2.*    | blocks.9.mlp.fc2.{bias,weight}    | (1280,) (1280,5120)    |
| blocks.9.norm1.*      | blocks.9.norm1.{bias,weight}      | (1280,) (1280,)        |
| blocks.9.norm2.*      | blocks.9.norm2.{bias,weight}      | (1280,) (1280,)        |
| patch_embed.proj.*    | patch_embed.proj.{bias,weight}    | (1280,) (1280,3,16,16) |
| pos_embed             | pos_embed                         | (1, 197, 1280)         |
WARNING [09/23 17:55:37 fvcore.common.checkpoint]: Some model parameters or buffers are not found in the checkpoint:
backbone.net.blocks.0.attn.{rel_pos_h, rel_pos_w}
backbone.net.blocks.1.attn.{rel_pos_h, rel_pos_w}
backbone.net.blocks.10.attn.{rel_pos_h, rel_pos_w}
backbone.net.blocks.11.attn.{rel_pos_h, rel_pos_w}
backbone.net.blocks.12.attn.{rel_pos_h, rel_pos_w}
backbone.net.blocks.13.attn.{rel_pos_h, rel_pos_w}
backbone.net.blocks.14.attn.{rel_pos_h, rel_pos_w}
backbone.net.blocks.15.attn.{rel_pos_h, rel_pos_w}
backbone.net.blocks.16.attn.{rel_pos_h, rel_pos_w}
backbone.net.blocks.17.attn.{rel_pos_h, rel_pos_w}
backbone.net.blocks.18.attn.{rel_pos_h, rel_pos_w}
backbone.net.blocks.19.attn.{rel_pos_h, rel_pos_w}
backbone.net.blocks.2.attn.{rel_pos_h, rel_pos_w}
backbone.net.blocks.20.attn.{rel_pos_h, rel_pos_w}
backbone.net.blocks.21.attn.{rel_pos_h, rel_pos_w}
backbone.net.blocks.22.attn.{rel_pos_h, rel_pos_w}
backbone.net.blocks.23.attn.{rel_pos_h, rel_pos_w}
backbone.net.blocks.24.attn.{rel_pos_h, rel_pos_w}
backbone.net.blocks.25.attn.{rel_pos_h, rel_pos_w}
backbone.net.blocks.26.attn.{rel_pos_h, rel_pos_w}
backbone.net.blocks.27.attn.{rel_pos_h, rel_pos_w}
backbone.net.blocks.28.attn.{rel_pos_h, rel_pos_w}
backbone.net.blocks.29.attn.{rel_pos_h, rel_pos_w}
backbone.net.blocks.3.attn.{rel_pos_h, rel_pos_w}
backbone.net.blocks.30.attn.{rel_pos_h, rel_pos_w}
backbone.net.blocks.31.attn.{rel_pos_h, rel_pos_w}
backbone.net.blocks.4.attn.{rel_pos_h, rel_pos_w}
backbone.net.blocks.5.attn.{rel_pos_h, rel_pos_w}
backbone.net.blocks.6.attn.{rel_pos_h, rel_pos_w}
backbone.net.blocks.7.attn.{rel_pos_h, rel_pos_w}
backbone.net.blocks.8.attn.{rel_pos_h, rel_pos_w}
backbone.net.blocks.9.attn.{rel_pos_h, rel_pos_w}
backbone.simfp_2.0.{bias, weight}
backbone.simfp_2.1.{bias, weight}
backbone.simfp_2.3.{bias, weight}
backbone.simfp_2.4.norm.{bias, weight}
backbone.simfp_2.4.weight
backbone.simfp_2.5.norm.{bias, weight}
backbone.simfp_2.5.weight
backbone.simfp_3.0.{bias, weight}
backbone.simfp_3.1.norm.{bias, weight}
backbone.simfp_3.1.weight
backbone.simfp_3.2.norm.{bias, weight}
backbone.simfp_3.2.weight
backbone.simfp_4.0.norm.{bias, weight}
backbone.simfp_4.0.weight
backbone.simfp_4.1.norm.{bias, weight}
backbone.simfp_4.1.weight
backbone.simfp_5.1.norm.{bias, weight}
backbone.simfp_5.1.weight
backbone.simfp_5.2.norm.{bias, weight}
backbone.simfp_5.2.weight
proposal_generator.rpn_head.anchor_deltas.{bias, weight}
proposal_generator.rpn_head.conv.conv0.{bias, weight}
proposal_generator.rpn_head.conv.conv1.{bias, weight}
proposal_generator.rpn_head.objectness_logits.{bias, weight}
roi_heads.box_head.conv1.norm.{bias, weight}
roi_heads.box_head.conv1.weight
roi_heads.box_head.conv2.norm.{bias, weight}
roi_heads.box_head.conv2.weight
roi_heads.box_head.conv3.norm.{bias, weight}
roi_heads.box_head.conv3.weight
roi_heads.box_head.conv4.norm.{bias, weight}
roi_heads.box_head.conv4.weight
roi_heads.box_head.fc1.{bias, weight}
roi_heads.box_predictor.bbox_pred.{bias, weight}
roi_heads.box_predictor.cls_score.{bias, weight}
roi_heads.mask_head.deconv.{bias, weight}
roi_heads.mask_head.mask_fcn1.norm.{bias, weight}
roi_heads.mask_head.mask_fcn1.weight
roi_heads.mask_head.mask_fcn2.norm.{bias, weight}
roi_heads.mask_head.mask_fcn2.weight
roi_heads.mask_head.mask_fcn3.norm.{bias, weight}
roi_heads.mask_head.mask_fcn3.weight
roi_heads.mask_head.mask_fcn4.norm.{bias, weight}
roi_heads.mask_head.mask_fcn4.weight
roi_heads.mask_head.predictor.{bias, weight}
WARNING [09/23 17:55:37 fvcore.common.checkpoint]: The checkpoint state_dict contains keys that are not used by the model:
  cls_token
  norm.{bias, weight}
checkpointer resumed

Expected behavior:

The logs above come from loading the MAE ImageNet-pretrained checkpoint into ViTDet. The messages about incompatible shapes & missing weights in the backbone are unexpected and lead me to believe this is the wrong checkpoint for this model.
I think it boils down to missing weights matching these patterns:

- backbone.net.blocks.*.attn.rel_pos*
- backbone.simfp_*.*

If those were ignored on purpose when the checkpoint was exported, I think it would be best to document the expected missing weights (since e.g. the rpn and roi_heads are naturally not expected to be in this checkpoint). If not, it might be a good idea to add a print before loading the checkpoint describing the expected outcome, or a comment in the configuration file in places like https://github.com/facebookresearch/detectron2/blob/main/projects/ViTDet/configs/COCO/mask_rcnn_vitdet_h_75ep.py#L12
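
To double-check which of the reported-missing patterns the MAE pretrain contains at all, here is a rough sketch (the nesting of the weights under a `model` key is an assumption; adjust if the file is a flat state dict):

```python
import torch
from detectron2.utils.file_io import PathManager

# Resolve the detectron2:// path to a local copy of the MAE ViT-H checkpoint.
path = PathManager.get_local_path(
    "detectron2://ImageNetPretrained/MAE/mae_pretrain_vit_huge_p14to16.pth"
)
ckpt = torch.load(path, map_location="cpu")
state_dict = ckpt.get("model", ckpt)  # assumed "model" nesting, else flat dict

# Count checkpoint keys matching the patterns the checkpointer reports as missing.
rel_pos_keys = [k for k in state_dict if "rel_pos" in k]
simfp_keys = [k for k in state_dict if "simfp" in k]
print("rel_pos_* keys in checkpoint:", len(rel_pos_keys))
print("simfp_* keys in checkpoint:  ", len(simfp_keys))
```

Both counts should come out as 0, consistent with the warnings above, i.e. the pretrain simply does not contain these parameters.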

Environment:

Paste the output of the following command:

wget -nc -nv https://github.com/facebookresearch/detectron2/raw/main/detectron2/utils/collect_env.py && python collect_env.py
----------------------  --------------------------------------------------------------------------
sys.platform            linux
Python                  3.8.10 (default, Jun 22 2022, 20:18:18) [GCC 9.4.0]
numpy                   1.23.3
detectron2              0.6 @/home/appuser/focal-detectron2/detectron2
Compiler                GCC 9.4
CUDA compiler           CUDA 11.1
detectron2 arch flags   3.5, 3.7, 5.0, 5.2, 5.3, 6.0, 6.1, 7.0, 7.5
DETECTRON2_ENV_MODULE   <not set>
PyTorch                 1.10.0+cu111 @/home/appuser/.local/lib/python3.8/site-packages/torch
PyTorch debug build     False
GPU available           Yes
GPU 0                   NVIDIA A100-SXM4-40GB (arch=8.0)
Driver version          510.47.03
CUDA_HOME               /usr/local/cuda
TORCH_CUDA_ARCH_LIST    Kepler;Kepler+Tesla;Maxwell;Maxwell+Tegra;Pascal;Volta;Turing
Pillow                  8.1.0
torchvision             0.11.1+cu111 @/home/appuser/.local/lib/python3.8/site-packages/torchvision
torchvision arch flags  3.5, 5.0, 6.0, 7.0, 7.5, 8.0, 8.6
fvcore                  0.1.5
iopath                  0.1.9
cv2                     4.6.0
----------------------  --------------------------------------------------------------------------
PyTorch built with:
  - GCC 7.3
  - C++ Version: 201402
  - Intel(R) Math Kernel Library Version 2020.0.0 Product Build 20191122 for Intel(R) 64 architecture applications
  - Intel(R) MKL-DNN v2.2.3 (Git Hash 7336ca9f055cf1bfa13efb658fe15dc9b41f0740)
  - OpenMP 201511 (a.k.a. OpenMP 4.5)
  - LAPACK is enabled (usually provided by MKL)
  - NNPACK is enabled
  - CPU capability usage: AVX512
  - CUDA Runtime 11.1
  - NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86
  - CuDNN 8.0.5
  - Magma 2.5.2
  - Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.1, CUDNN_VERSION=8.0.5, CXX_COMPILER=/opt/rh/devtoolset-7/root/usr/bin/c++, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -DEDGE_PROFILER_USE_KINETO -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.10.0, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON,

@FedericoVasile1

Hi, any news on this? Are the weights ignored on purpose or not?

@gugibugy

+1, I have the same question!

@amalshehan

Since the weights are missing in the checkpoint but expected by the model, I am assuming they will be randomly initialized and trained on the downstream task. Any other insights or workarounds?
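
A rough way to check that assumption is to build the model from the same lazy config and list the parameters matching those patterns (a sketch, assuming it is run from a detectron2 checkout):

```python
from detectron2.config import LazyConfig, instantiate

# Sketch: instantiate the ViTDet Mask R-CNN from the lazy config used above
# (path assumed relative to a detectron2 checkout).
cfg = LazyConfig.load("projects/ViTDet/configs/COCO/mask_rcnn_vitdet_h_75ep.py")
model = instantiate(cfg.model)

# Parameters matching the "missing in checkpoint" patterns keep their
# random/default initialization and are ordinary trainable parameters.
uncovered = [name for name, p in model.named_parameters()
             if ("rel_pos" in name or "simfp" in name) and p.requires_grad]
print(len(uncovered), "backbone parameters will start from random init, e.g.:")
print(uncovered[:5])
```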
