implemented resnet18 and resnet34 #16363

zaccharieramzi · 2022-04-05T08:49:58Z

This should resolve keras-team/keras-applications#151.

Which has duplicates here:

I don't know how to test this, which is why I am opening it as a draft PR.
To keep the review easy, I haven't implemented the V2 variants, and I haven't trained the networks to produce weights.

Note: this is a reopening of #16358, whose commits I messed up by using the wrong emails.

See https://arxiv.org/abs/1512.03385

zaccharieramzi · 2022-04-06T09:03:54Z

Adding the model summaries here for info:

ResNet18:

Model: "resnet18"
__________________________________________________________________________________________________
 Layer (type)                   Output Shape         Param #     Connected to                     
==================================================================================================
 input_1 (InputLayer)           [(None, 224, 224, 3  0           []                               
                                )]                                                                
                                                                                                  
 conv1_pad (ZeroPadding2D)      (None, 230, 230, 3)  0           ['input_1[0][0]']                
                                                                                                  
 conv1_conv (Conv2D)            (None, 112, 112, 64  9472        ['conv1_pad[0][0]']              
                                )                                                                 
                                                                                                  
 conv1_bn (BatchNormalization)  (None, 112, 112, 64  256         ['conv1_conv[0][0]']             
                                )                                                                 
                                                                                                  
 conv1_relu (Activation)        (None, 112, 112, 64  0           ['conv1_bn[0][0]']               
                                )                                                                 
                                                                                                  
 pool1_pad (ZeroPadding2D)      (None, 114, 114, 64  0           ['conv1_relu[0][0]']             
                                )                                                                 
                                                                                                  
 pool1_pool (MaxPooling2D)      (None, 56, 56, 64)   0           ['pool1_pad[0][0]']              
                                                                                                  
 conv2_block1_1_conv (Conv2D)   (None, 56, 56, 64)   36928       ['pool1_pool[0][0]']             
                                                                                                  
 conv2_block1_1_bn (BatchNormal  (None, 56, 56, 64)  256         ['conv2_block1_1_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 conv2_block1_1_relu (Activatio  (None, 56, 56, 64)  0           ['conv2_block1_1_bn[0][0]']      
 n)                                                                                               
                                                                                                  
 conv2_block1_0_conv (Conv2D)   (None, 56, 56, 64)   4160        ['pool1_pool[0][0]']             
                                                                                                  
 conv2_block1_2_conv (Conv2D)   (None, 56, 56, 64)   36928       ['conv2_block1_1_relu[0][0]']    
                                                                                                  
 conv2_block1_0_bn (BatchNormal  (None, 56, 56, 64)  256         ['conv2_block1_0_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 conv2_block1_2_bn (BatchNormal  (None, 56, 56, 64)  256         ['conv2_block1_2_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 conv2_block1_add (Add)         (None, 56, 56, 64)   0           ['conv2_block1_0_bn[0][0]',      
                                                                  'conv2_block1_2_bn[0][0]']      
                                                                                                  
 conv2_block1_out (Activation)  (None, 56, 56, 64)   0           ['conv2_block1_add[0][0]']       
                                                                                                  
 conv2_block2_1_conv (Conv2D)   (None, 56, 56, 64)   36928       ['conv2_block1_out[0][0]']       
                                                                                                  
 conv2_block2_1_bn (BatchNormal  (None, 56, 56, 64)  256         ['conv2_block2_1_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 conv2_block2_1_relu (Activatio  (None, 56, 56, 64)  0           ['conv2_block2_1_bn[0][0]']      
 n)                                                                                               
                                                                                                  
 conv2_block2_2_conv (Conv2D)   (None, 56, 56, 64)   36928       ['conv2_block2_1_relu[0][0]']    
                                                                                                  
 conv2_block2_2_bn (BatchNormal  (None, 56, 56, 64)  256         ['conv2_block2_2_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 conv2_block2_add (Add)         (None, 56, 56, 64)   0           ['conv2_block1_out[0][0]',       
                                                                  'conv2_block2_2_bn[0][0]']      
                                                                                                  
 conv2_block2_out (Activation)  (None, 56, 56, 64)   0           ['conv2_block2_add[0][0]']       
                                                                                                  
 conv3_block1_1_conv (Conv2D)   (None, 28, 28, 128)  73856       ['conv2_block2_out[0][0]']       
                                                                                                  
 conv3_block1_1_bn (BatchNormal  (None, 28, 28, 128)  512        ['conv3_block1_1_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 conv3_block1_1_relu (Activatio  (None, 28, 28, 128)  0          ['conv3_block1_1_bn[0][0]']      
 n)                                                                                               
                                                                                                  
 conv3_block1_0_conv (Conv2D)   (None, 28, 28, 128)  8320        ['conv2_block2_out[0][0]']       
                                                                                                  
 conv3_block1_2_conv (Conv2D)   (None, 28, 28, 128)  147584      ['conv3_block1_1_relu[0][0]']    
                                                                                                  
 conv3_block1_0_bn (BatchNormal  (None, 28, 28, 128)  512        ['conv3_block1_0_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 conv3_block1_2_bn (BatchNormal  (None, 28, 28, 128)  512        ['conv3_block1_2_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 conv3_block1_add (Add)         (None, 28, 28, 128)  0           ['conv3_block1_0_bn[0][0]',      
                                                                  'conv3_block1_2_bn[0][0]']      
                                                                                                  
 conv3_block1_out (Activation)  (None, 28, 28, 128)  0           ['conv3_block1_add[0][0]']       
                                                                                                  
 conv3_block2_1_conv (Conv2D)   (None, 28, 28, 128)  147584      ['conv3_block1_out[0][0]']       
                                                                                                  
 conv3_block2_1_bn (BatchNormal  (None, 28, 28, 128)  512        ['conv3_block2_1_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 conv3_block2_1_relu (Activatio  (None, 28, 28, 128)  0          ['conv3_block2_1_bn[0][0]']      
 n)                                                                                               
                                                                                                  
 conv3_block2_2_conv (Conv2D)   (None, 28, 28, 128)  147584      ['conv3_block2_1_relu[0][0]']    
                                                                                                  
 conv3_block2_2_bn (BatchNormal  (None, 28, 28, 128)  512        ['conv3_block2_2_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 conv3_block2_add (Add)         (None, 28, 28, 128)  0           ['conv3_block1_out[0][0]',       
                                                                  'conv3_block2_2_bn[0][0]']      
                                                                                                  
 conv3_block2_out (Activation)  (None, 28, 28, 128)  0           ['conv3_block2_add[0][0]']       
                                                                                                  
 conv4_block1_1_conv (Conv2D)   (None, 14, 14, 256)  295168      ['conv3_block2_out[0][0]']       
                                                                                                  
 conv4_block1_1_bn (BatchNormal  (None, 14, 14, 256)  1024       ['conv4_block1_1_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 conv4_block1_1_relu (Activatio  (None, 14, 14, 256)  0          ['conv4_block1_1_bn[0][0]']      
 n)                                                                                               
                                                                                                  
 conv4_block1_0_conv (Conv2D)   (None, 14, 14, 256)  33024       ['conv3_block2_out[0][0]']       
                                                                                                  
 conv4_block1_2_conv (Conv2D)   (None, 14, 14, 256)  590080      ['conv4_block1_1_relu[0][0]']    
                                                                                                  
 conv4_block1_0_bn (BatchNormal  (None, 14, 14, 256)  1024       ['conv4_block1_0_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 conv4_block1_2_bn (BatchNormal  (None, 14, 14, 256)  1024       ['conv4_block1_2_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 conv4_block1_add (Add)         (None, 14, 14, 256)  0           ['conv4_block1_0_bn[0][0]',      
                                                                  'conv4_block1_2_bn[0][0]']      
                                                                                                  
 conv4_block1_out (Activation)  (None, 14, 14, 256)  0           ['conv4_block1_add[0][0]']       
                                                                                                  
 conv4_block2_1_conv (Conv2D)   (None, 14, 14, 256)  590080      ['conv4_block1_out[0][0]']       
                                                                                                  
 conv4_block2_1_bn (BatchNormal  (None, 14, 14, 256)  1024       ['conv4_block2_1_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 conv4_block2_1_relu (Activatio  (None, 14, 14, 256)  0          ['conv4_block2_1_bn[0][0]']      
 n)                                                                                               
                                                                                                  
 conv4_block2_2_conv (Conv2D)   (None, 14, 14, 256)  590080      ['conv4_block2_1_relu[0][0]']    
                                                                                                  
 conv4_block2_2_bn (BatchNormal  (None, 14, 14, 256)  1024       ['conv4_block2_2_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 conv4_block2_add (Add)         (None, 14, 14, 256)  0           ['conv4_block1_out[0][0]',       
                                                                  'conv4_block2_2_bn[0][0]']      
                                                                                                  
 conv4_block2_out (Activation)  (None, 14, 14, 256)  0           ['conv4_block2_add[0][0]']       
                                                                                                  
 conv5_block1_1_conv (Conv2D)   (None, 7, 7, 512)    1180160     ['conv4_block2_out[0][0]']       
                                                                                                  
 conv5_block1_1_bn (BatchNormal  (None, 7, 7, 512)   2048        ['conv5_block1_1_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 conv5_block1_1_relu (Activatio  (None, 7, 7, 512)   0           ['conv5_block1_1_bn[0][0]']      
 n)                                                                                               
                                                                                                  
 conv5_block1_0_conv (Conv2D)   (None, 7, 7, 512)    131584      ['conv4_block2_out[0][0]']       
                                                                                                  
 conv5_block1_2_conv (Conv2D)   (None, 7, 7, 512)    2359808     ['conv5_block1_1_relu[0][0]']    
                                                                                                  
 conv5_block1_0_bn (BatchNormal  (None, 7, 7, 512)   2048        ['conv5_block1_0_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 conv5_block1_2_bn (BatchNormal  (None, 7, 7, 512)   2048        ['conv5_block1_2_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 conv5_block1_add (Add)         (None, 7, 7, 512)    0           ['conv5_block1_0_bn[0][0]',      
                                                                  'conv5_block1_2_bn[0][0]']      
                                                                                                  
 conv5_block1_out (Activation)  (None, 7, 7, 512)    0           ['conv5_block1_add[0][0]']       
                                                                                                  
 conv5_block2_1_conv (Conv2D)   (None, 7, 7, 512)    2359808     ['conv5_block1_out[0][0]']       
                                                                                                  
 conv5_block2_1_bn (BatchNormal  (None, 7, 7, 512)   2048        ['conv5_block2_1_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 conv5_block2_1_relu (Activatio  (None, 7, 7, 512)   0           ['conv5_block2_1_bn[0][0]']      
 n)                                                                                               
                                                                                                  
 conv5_block2_2_conv (Conv2D)   (None, 7, 7, 512)    2359808     ['conv5_block2_1_relu[0][0]']    
                                                                                                  
 conv5_block2_2_bn (BatchNormal  (None, 7, 7, 512)   2048        ['conv5_block2_2_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 conv5_block2_add (Add)         (None, 7, 7, 512)    0           ['conv5_block1_out[0][0]',       
                                                                  'conv5_block2_2_bn[0][0]']      
                                                                                                  
 conv5_block2_out (Activation)  (None, 7, 7, 512)    0           ['conv5_block2_add[0][0]']       
                                                                                                  
 avg_pool (GlobalAveragePooling  (None, 512)         0           ['conv5_block2_out[0][0]']       
 2D)                                                                                              
                                                                                                  
 predictions (Dense)            (None, 1000)         513000      ['avg_pool[0][0]']               
                                                                                                  
==================================================================================================
Total params: 11,708,328
Trainable params: 11,698,600
Non-trainable params: 9,728
__________________________________________________________________________________________________
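
As a sanity check on the ResNet18 summary above, the parameter counts can be reproduced by hand. This is a minimal sketch in plain Python, not code from the PR: the helper names (`conv_params`, `bn_params`, `basic_block_params`, `resnet_params`) are hypothetical. It assumes what the summary shows: every Conv2D has a bias, every BatchNormalization holds 4 values per channel (2 of them non-trainable), and the first block of each stage uses a 1x1 convolution shortcut.

```python
def conv_params(kernel, c_in, c_out):
    """Conv2D with bias: kernel*kernel*c_in weights per filter, plus one bias each."""
    return kernel * kernel * c_in * c_out + c_out


def bn_params(channels):
    """BatchNormalization: gamma and beta (trainable), moving mean and variance (not)."""
    return 4 * channels


def basic_block_params(c_in, c_out, conv_shortcut):
    """Return (param count, BN channel count) for one two-conv residual block."""
    total = conv_params(3, c_in, c_out) + bn_params(c_out)
    total += conv_params(3, c_out, c_out) + bn_params(c_out)
    bn_channels = 2 * c_out
    if conv_shortcut:
        # 1x1 projection on the shortcut branch, followed by its own BN.
        total += conv_params(1, c_in, c_out) + bn_params(c_out)
        bn_channels += c_out
    return total, bn_channels


def resnet_params(blocks_per_stage, classes=1000):
    """Return (total, trainable, non_trainable) for a basic-block ResNet."""
    # Stem: 7x7 conv on 3 input channels, then BN.
    total = conv_params(7, 3, 64) + bn_params(64)
    bn_channels = 64
    c_in = 64
    for stage, n_blocks in enumerate(blocks_per_stage):
        c_out = 64 * 2 ** stage
        for block in range(n_blocks):
            block_total, block_bn = basic_block_params(
                c_in, c_out, conv_shortcut=(block == 0)
            )
            total += block_total
            bn_channels += block_bn
            c_in = c_out
    # Dense classifier on top of global average pooling.
    total += c_in * classes + classes
    non_trainable = 2 * bn_channels
    return total, total - non_trainable, non_trainable


print(resnet_params([2, 2, 2, 2]))  # (11708328, 11698600, 9728)
```

The three numbers match the `Total params`, `Trainable params`, and `Non-trainable params` lines of the ResNet18 summary. The same function can be called with `[3, 4, 6, 3]`, the ResNet34 block counts from the paper, to compare against the second summary.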

ResNet34:

Model: "resnet34"
__________________________________________________________________________________________________
 Layer (type)                   Output Shape         Param #     Connected to                     
==================================================================================================
 input_1 (InputLayer)           [(None, 224, 224, 3  0           []                               
                                )]                                                                
                                                                                                  
 conv1_pad (ZeroPadding2D)      (None, 230, 230, 3)  0           ['input_1[0][0]']                
                                                                                                  
 conv1_conv (Conv2D)            (None, 112, 112, 64  9472        ['conv1_pad[0][0]']              
                                )                                                                 
                                                                                                  
 conv1_bn (BatchNormalization)  (None, 112, 112, 64  256         ['conv1_conv[0][0]']             
                                )                                                                 
                                                                                                  
 conv1_relu (Activation)        (None, 112, 112, 64  0           ['conv1_bn[0][0]']               
                                )                                                                 
                                                                                                  
 pool1_pad (ZeroPadding2D)      (None, 114, 114, 64  0           ['conv1_relu[0][0]']             
                                )                                                                 
                                                                                                  
 pool1_pool (MaxPooling2D)      (None, 56, 56, 64)   0           ['pool1_pad[0][0]']              
                                                                                                  
 conv2_block1_1_conv (Conv2D)   (None, 56, 56, 64)   36928       ['pool1_pool[0][0]']             
                                                                                                  
 conv2_block1_1_bn (BatchNormal  (None, 56, 56, 64)  256         ['conv2_block1_1_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 conv2_block1_1_relu (Activatio  (None, 56, 56, 64)  0           ['conv2_block1_1_bn[0][0]']      
 n)                                                                                               
                                                                                                  
 conv2_block1_0_conv (Conv2D)   (None, 56, 56, 64)   4160        ['pool1_pool[0][0]']             
                                                                                                  
 conv2_block1_2_conv (Conv2D)   (None, 56, 56, 64)   36928       ['conv2_block1_1_relu[0][0]']    
                                                                                                  
 conv2_block1_0_bn (BatchNormal  (None, 56, 56, 64)  256         ['conv2_block1_0_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 conv2_block1_2_bn (BatchNormal  (None, 56, 56, 64)  256         ['conv2_block1_2_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 conv2_block1_add (Add)         (None, 56, 56, 64)   0           ['conv2_block1_0_bn[0][0]',      
                                                                  'conv2_block1_2_bn[0][0]']      
                                                                                                  
 conv2_block1_out (Activation)  (None, 56, 56, 64)   0           ['conv2_block1_add[0][0]']       
                                                                                                  
 conv2_block2_1_conv (Conv2D)   (None, 56, 56, 64)   36928       ['conv2_block1_out[0][0]']       
                                                                                                  
 conv2_block2_1_bn (BatchNormal  (None, 56, 56, 64)  256         ['conv2_block2_1_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 conv2_block2_1_relu (Activatio  (None, 56, 56, 64)  0           ['conv2_block2_1_bn[0][0]']      
 n)                                                                                               
                                                                                                  
 conv2_block2_2_conv (Conv2D)   (None, 56, 56, 64)   36928       ['conv2_block2_1_relu[0][0]']    
                                                                                                  
 conv2_block2_2_bn (BatchNormal  (None, 56, 56, 64)  256         ['conv2_block2_2_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 conv2_block2_add (Add)         (None, 56, 56, 64)   0           ['conv2_block1_out[0][0]',       
                                                                  'conv2_block2_2_bn[0][0]']      
                                                                                                  
 conv2_block2_out (Activation)  (None, 56, 56, 64)   0           ['conv2_block2_add[0][0]']       
                                                                                                  
 conv2_block3_1_conv (Conv2D)   (None, 56, 56, 64)   36928       ['conv2_block2_out[0][0]']       
                                                                                                  
 conv2_block3_1_bn (BatchNormal  (None, 56, 56, 64)  256         ['conv2_block3_1_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 conv2_block3_1_relu (Activatio  (None, 56, 56, 64)  0           ['conv2_block3_1_bn[0][0]']      
 n)                                                                                               
                                                                                                  
 conv2_block3_2_conv (Conv2D)   (None, 56, 56, 64)   36928       ['conv2_block3_1_relu[0][0]']    
                                                                                                  
 conv2_block3_2_bn (BatchNormal  (None, 56, 56, 64)  256         ['conv2_block3_2_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 conv2_block3_add (Add)         (None, 56, 56, 64)   0           ['conv2_block2_out[0][0]',       
                                                                  'conv2_block3_2_bn[0][0]']      
                                                                                                  
 conv2_block3_out (Activation)  (None, 56, 56, 64)   0           ['conv2_block3_add[0][0]']       
                                                                                                  
 conv3_block1_1_conv (Conv2D)   (None, 28, 28, 128)  73856       ['conv2_block3_out[0][0]']       
                                                                                                  
 conv3_block1_1_bn (BatchNormal  (None, 28, 28, 128)  512        ['conv3_block1_1_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 conv3_block1_1_relu (Activatio  (None, 28, 28, 128)  0          ['conv3_block1_1_bn[0][0]']      
 n)                                                                                               
                                                                                                  
 conv3_block1_0_conv (Conv2D)   (None, 28, 28, 128)  8320        ['conv2_block3_out[0][0]']       
                                                                                                  
 conv3_block1_2_conv (Conv2D)   (None, 28, 28, 128)  147584      ['conv3_block1_1_relu[0][0]']    
                                                                                                  
 conv3_block1_0_bn (BatchNormal  (None, 28, 28, 128)  512        ['conv3_block1_0_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 conv3_block1_2_bn (BatchNormal  (None, 28, 28, 128)  512        ['conv3_block1_2_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 conv3_block1_add (Add)         (None, 28, 28, 128)  0           ['conv3_block1_0_bn[0][0]',      
                                                                  'conv3_block1_2_bn[0][0]']      
                                                                                                  
 conv3_block1_out (Activation)  (None, 28, 28, 128)  0           ['conv3_block1_add[0][0]']       
                                                                                                  
 conv3_block2_1_conv (Conv2D)   (None, 28, 28, 128)  147584      ['conv3_block1_out[0][0]']       
                                                                                                  
 conv3_block2_1_bn (BatchNormal  (None, 28, 28, 128)  512        ['conv3_block2_1_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 conv3_block2_1_relu (Activatio  (None, 28, 28, 128)  0          ['conv3_block2_1_bn[0][0]']      
 n)                                                                                               
                                                                                                  
 conv3_block2_2_conv (Conv2D)   (None, 28, 28, 128)  147584      ['conv3_block2_1_relu[0][0]']    
                                                                                                  
 conv3_block2_2_bn (BatchNormal  (None, 28, 28, 128)  512        ['conv3_block2_2_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 conv3_block2_add (Add)         (None, 28, 28, 128)  0           ['conv3_block1_out[0][0]',       
                                                                  'conv3_block2_2_bn[0][0]']      
                                                                                                  
 conv3_block2_out (Activation)  (None, 28, 28, 128)  0           ['conv3_block2_add[0][0]']       
                                                                                                  
 conv3_block3_1_conv (Conv2D)   (None, 28, 28, 128)  147584      ['conv3_block2_out[0][0]']       
                                                                                                  
 conv3_block3_1_bn (BatchNormal  (None, 28, 28, 128)  512        ['conv3_block3_1_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 conv3_block3_1_relu (Activatio  (None, 28, 28, 128)  0          ['conv3_block3_1_bn[0][0]']      
 n)                                                                                               
                                                                                                  
 conv3_block3_2_conv (Conv2D)   (None, 28, 28, 128)  147584      ['conv3_block3_1_relu[0][0]']    
                                                                                                  
 conv3_block3_2_bn (BatchNormal  (None, 28, 28, 128)  512        ['conv3_block3_2_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 conv3_block3_add (Add)         (None, 28, 28, 128)  0           ['conv3_block2_out[0][0]',       
                                                                  'conv3_block3_2_bn[0][0]']      
                                                                                                  
 conv3_block3_out (Activation)  (None, 28, 28, 128)  0           ['conv3_block3_add[0][0]']       
                                                                                                  
 conv3_block4_1_conv (Conv2D)   (None, 28, 28, 128)  147584      ['conv3_block3_out[0][0]']       
                                                                                                  
 conv3_block4_1_bn (BatchNormal  (None, 28, 28, 128)  512        ['conv3_block4_1_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 conv3_block4_1_relu (Activatio  (None, 28, 28, 128)  0          ['conv3_block4_1_bn[0][0]']      
 n)                                                                                               
                                                                                                  
 conv3_block4_2_conv (Conv2D)   (None, 28, 28, 128)  147584      ['conv3_block4_1_relu[0][0]']    
                                                                                                  
 conv3_block4_2_bn (BatchNormal  (None, 28, 28, 128)  512        ['conv3_block4_2_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 conv3_block4_add (Add)         (None, 28, 28, 128)  0           ['conv3_block3_out[0][0]',       
                                                                  'conv3_block4_2_bn[0][0]']      
                                                                                                  
 conv3_block4_out (Activation)  (None, 28, 28, 128)  0           ['conv3_block4_add[0][0]']       
                                                                                                  
 conv4_block1_1_conv (Conv2D)   (None, 14, 14, 256)  295168      ['conv3_block4_out[0][0]']       
                                                                                                  
 conv4_block1_1_bn (BatchNormal  (None, 14, 14, 256)  1024       ['conv4_block1_1_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 conv4_block1_1_relu (Activatio  (None, 14, 14, 256)  0          ['conv4_block1_1_bn[0][0]']      
 n)                                                                                               
                                                                                                  
 conv4_block1_0_conv (Conv2D)   (None, 14, 14, 256)  33024       ['conv3_block4_out[0][0]']       
                                                                                                  
 conv4_block1_2_conv (Conv2D)   (None, 14, 14, 256)  590080      ['conv4_block1_1_relu[0][0]']    
                                                                                                  
 conv4_block1_0_bn (BatchNormal  (None, 14, 14, 256)  1024       ['conv4_block1_0_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 conv4_block1_2_bn (BatchNormal  (None, 14, 14, 256)  1024       ['conv4_block1_2_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 conv4_block1_add (Add)         (None, 14, 14, 256)  0           ['conv4_block1_0_bn[0][0]',      
                                                                  'conv4_block1_2_bn[0][0]']      
                                                                                                  
 conv4_block1_out (Activation)  (None, 14, 14, 256)  0           ['conv4_block1_add[0][0]']       
                                                                                                  
 conv4_block2_1_conv (Conv2D)   (None, 14, 14, 256)  590080      ['conv4_block1_out[0][0]']       
                                                                                                  
 conv4_block2_1_bn (BatchNormal  (None, 14, 14, 256)  1024       ['conv4_block2_1_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 conv4_block2_1_relu (Activatio  (None, 14, 14, 256)  0          ['conv4_block2_1_bn[0][0]']      
 n)                                                                                               
                                                                                                  
 conv4_block2_2_conv (Conv2D)   (None, 14, 14, 256)  590080      ['conv4_block2_1_relu[0][0]']    
                                                                                                  
 conv4_block2_2_bn (BatchNormal  (None, 14, 14, 256)  1024       ['conv4_block2_2_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 conv4_block2_add (Add)         (None, 14, 14, 256)  0           ['conv4_block1_out[0][0]',       
                                                                  'conv4_block2_2_bn[0][0]']      
                                                                                                  
 conv4_block2_out (Activation)  (None, 14, 14, 256)  0           ['conv4_block2_add[0][0]']       
                                                                                                  
 conv4_block3_1_conv (Conv2D)   (None, 14, 14, 256)  590080      ['conv4_block2_out[0][0]']       
                                                                                                  
 conv4_block3_1_bn (BatchNormal  (None, 14, 14, 256)  1024       ['conv4_block3_1_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 conv4_block3_1_relu (Activatio  (None, 14, 14, 256)  0          ['conv4_block3_1_bn[0][0]']      
 n)                                                                                               
                                                                                                  
 conv4_block3_2_conv (Conv2D)   (None, 14, 14, 256)  590080      ['conv4_block3_1_relu[0][0]']    
                                                                                                  
 conv4_block3_2_bn (BatchNormal  (None, 14, 14, 256)  1024       ['conv4_block3_2_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 conv4_block3_add (Add)         (None, 14, 14, 256)  0           ['conv4_block2_out[0][0]',       
                                                                  'conv4_block3_2_bn[0][0]']      
                                                                                                  
 conv4_block3_out (Activation)  (None, 14, 14, 256)  0           ['conv4_block3_add[0][0]']       
                                                                                                  
 conv4_block4_1_conv (Conv2D)   (None, 14, 14, 256)  590080      ['conv4_block3_out[0][0]']       
                                                                                                  
 conv4_block4_1_bn (BatchNormal  (None, 14, 14, 256)  1024       ['conv4_block4_1_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 conv4_block4_1_relu (Activatio  (None, 14, 14, 256)  0          ['conv4_block4_1_bn[0][0]']      
 n)                                                                                               
                                                                                                  
 conv4_block4_2_conv (Conv2D)   (None, 14, 14, 256)  590080      ['conv4_block4_1_relu[0][0]']    
                                                                                                  
 conv4_block4_2_bn (BatchNormal  (None, 14, 14, 256)  1024       ['conv4_block4_2_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 conv4_block4_add (Add)         (None, 14, 14, 256)  0           ['conv4_block3_out[0][0]',       
                                                                  'conv4_block4_2_bn[0][0]']      
                                                                                                  
 conv4_block4_out (Activation)  (None, 14, 14, 256)  0           ['conv4_block4_add[0][0]']       
                                                                                                  
 conv4_block5_1_conv (Conv2D)   (None, 14, 14, 256)  590080      ['conv4_block4_out[0][0]']       
                                                                                                  
 conv4_block5_1_bn (BatchNormal  (None, 14, 14, 256)  1024       ['conv4_block5_1_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 conv4_block5_1_relu (Activatio  (None, 14, 14, 256)  0          ['conv4_block5_1_bn[0][0]']      
 n)                                                                                               
                                                                                                  
 conv4_block5_2_conv (Conv2D)   (None, 14, 14, 256)  590080      ['conv4_block5_1_relu[0][0]']    
                                                                                                  
 conv4_block5_2_bn (BatchNormal  (None, 14, 14, 256)  1024       ['conv4_block5_2_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 conv4_block5_add (Add)         (None, 14, 14, 256)  0           ['conv4_block4_out[0][0]',       
                                                                  'conv4_block5_2_bn[0][0]']      
                                                                                                  
 conv4_block5_out (Activation)  (None, 14, 14, 256)  0           ['conv4_block5_add[0][0]']       
                                                                                                  
 conv4_block6_1_conv (Conv2D)   (None, 14, 14, 256)  590080      ['conv4_block5_out[0][0]']       
                                                                                                  
 conv4_block6_1_bn (BatchNormal  (None, 14, 14, 256)  1024       ['conv4_block6_1_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 conv4_block6_1_relu (Activatio  (None, 14, 14, 256)  0          ['conv4_block6_1_bn[0][0]']      
 n)                                                                                               
                                                                                                  
 conv4_block6_2_conv (Conv2D)   (None, 14, 14, 256)  590080      ['conv4_block6_1_relu[0][0]']    
                                                                                                  
 conv4_block6_2_bn (BatchNormal  (None, 14, 14, 256)  1024       ['conv4_block6_2_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 conv4_block6_add (Add)         (None, 14, 14, 256)  0           ['conv4_block5_out[0][0]',       
                                                                  'conv4_block6_2_bn[0][0]']      
                                                                                                  
 conv4_block6_out (Activation)  (None, 14, 14, 256)  0           ['conv4_block6_add[0][0]']       
                                                                                                  
 conv5_block1_1_conv (Conv2D)   (None, 7, 7, 512)    1180160     ['conv4_block6_out[0][0]']       
                                                                                                  
 conv5_block1_1_bn (BatchNormal  (None, 7, 7, 512)   2048        ['conv5_block1_1_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 conv5_block1_1_relu (Activatio  (None, 7, 7, 512)   0           ['conv5_block1_1_bn[0][0]']      
 n)                                                                                               
                                                                                                  
 conv5_block1_0_conv (Conv2D)   (None, 7, 7, 512)    131584      ['conv4_block6_out[0][0]']       
                                                                                                  
 conv5_block1_2_conv (Conv2D)   (None, 7, 7, 512)    2359808     ['conv5_block1_1_relu[0][0]']    
                                                                                                  
 conv5_block1_0_bn (BatchNormal  (None, 7, 7, 512)   2048        ['conv5_block1_0_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 conv5_block1_2_bn (BatchNormal  (None, 7, 7, 512)   2048        ['conv5_block1_2_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 conv5_block1_add (Add)         (None, 7, 7, 512)    0           ['conv5_block1_0_bn[0][0]',      
                                                                  'conv5_block1_2_bn[0][0]']      
                                                                                                  
 conv5_block1_out (Activation)  (None, 7, 7, 512)    0           ['conv5_block1_add[0][0]']       
                                                                                                  
 conv5_block2_1_conv (Conv2D)   (None, 7, 7, 512)    2359808     ['conv5_block1_out[0][0]']       
                                                                                                  
 conv5_block2_1_bn (BatchNormal  (None, 7, 7, 512)   2048        ['conv5_block2_1_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 conv5_block2_1_relu (Activatio  (None, 7, 7, 512)   0           ['conv5_block2_1_bn[0][0]']      
 n)                                                                                               
                                                                                                  
 conv5_block2_2_conv (Conv2D)   (None, 7, 7, 512)    2359808     ['conv5_block2_1_relu[0][0]']    
                                                                                                  
 conv5_block2_2_bn (BatchNormal  (None, 7, 7, 512)   2048        ['conv5_block2_2_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 conv5_block2_add (Add)         (None, 7, 7, 512)    0           ['conv5_block1_out[0][0]',       
                                                                  'conv5_block2_2_bn[0][0]']      
                                                                                                  
 conv5_block2_out (Activation)  (None, 7, 7, 512)    0           ['conv5_block2_add[0][0]']       
                                                                                                  
 conv5_block3_1_conv (Conv2D)   (None, 7, 7, 512)    2359808     ['conv5_block2_out[0][0]']       
                                                                                                  
 conv5_block3_1_bn (BatchNormal  (None, 7, 7, 512)   2048        ['conv5_block3_1_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 conv5_block3_1_relu (Activatio  (None, 7, 7, 512)   0           ['conv5_block3_1_bn[0][0]']      
 n)                                                                                               
                                                                                                  
 conv5_block3_2_conv (Conv2D)   (None, 7, 7, 512)    2359808     ['conv5_block3_1_relu[0][0]']    
                                                                                                  
 conv5_block3_2_bn (BatchNormal  (None, 7, 7, 512)   2048        ['conv5_block3_2_conv[0][0]']    
 ization)                                                                                         
                                                                                                  
 conv5_block3_add (Add)         (None, 7, 7, 512)    0           ['conv5_block2_out[0][0]',       
                                                                  'conv5_block3_2_bn[0][0]']      
                                                                                                  
 conv5_block3_out (Activation)  (None, 7, 7, 512)    0           ['conv5_block3_add[0][0]']       
                                                                                                  
 avg_pool (GlobalAveragePooling  (None, 512)         0           ['conv5_block3_out[0][0]']       
 2D)                                                                                              
                                                                                                  
 predictions (Dense)            (None, 1000)         513000      ['avg_pool[0][0]']               
                                                                                                  
==================================================================================================
Total params: 21,827,624
Trainable params: 21,810,472
Non-trainable params: 17,152
__________________________________________________________________________________________________

It turns out that the parameter counts do not match PyTorch's numbers, which is something I do not understand.
For info, the same happens for ResNet50 (already implemented), and you can see that in the following colab: https://colab.research.google.com/drive/1RCmWkpwuKFapzzPacbqodxz0mqt9Igft?usp=sharing

This appears to be because TF's convolutions have biases while PyTorch's do not, and also because of how PyTorch counts BN's parameters.

However, the last dimension before the dense layer matches, and the size (WH) of the feature maps matches as well.

zaccharieramzi · 2022-04-06T16:00:39Z

So, 2 things w.r.t. the comparison with PyTorch:

- Indeed, the only difference in the trainable parameter count is the use of bias in Keras. Imo, there shouldn't be any bias in the convolutions, given that we have an affine BatchNorm just afterwards. Maybe an option to enable or disable it would be nice; I am going to implement it.
- The batch norm in PyTorch indeed doesn't count the running stats as parameters but as buffers.
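
To make the two points above concrete, here is a minimal sketch of the parameter arithmetic. The helper names `conv2d_params` and `batchnorm_params` are hypothetical, introduced only for illustration; the numbers are checked against the `conv3_block2_1_conv` and `conv3_block2_1_bn` rows of the summary above.

```python
def conv2d_params(kernel_size, in_channels, out_channels, use_bias=True):
    """Parameter count of a square 2D convolution (hypothetical helper)."""
    weights = kernel_size * kernel_size * in_channels * out_channels
    # One bias per output channel, only when the layer uses a bias.
    return weights + (out_channels if use_bias else 0)


def batchnorm_params(channels):
    """Return (trainable, non_trainable) counts for a batch norm layer.

    Trainable: gamma and beta. Non-trainable: moving mean and moving variance,
    which Keras reports as parameters and PyTorch stores as buffers.
    """
    return 2 * channels, 2 * channels


# 3x3 convolution, 128 -> 128 channels, as in conv3_block2_1_conv.
with_bias = conv2d_params(3, 128, 128, use_bias=True)
without_bias = conv2d_params(3, 128, 128, use_bias=False)

# Batch norm over 128 channels, as in conv3_block2_1_bn.
bn_trainable, bn_non_trainable = batchnorm_params(128)

print(with_bias, without_bias, bn_trainable + bn_non_trainable)
```

The difference between the two convolution counts is exactly the 128 bias terms, and half of the batch norm count is the running statistics that appear under "Non-trainable params" in the Keras summary.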

Side note: the default momentum values for the batch norm in Keras and PyTorch are not the same: 0.9 for PyTorch (expressed in the Keras convention; PyTorch's own `momentum` argument defaults to 0.1 and weights the new batch statistic) and 0.99 in Keras. This, coupled with the use of bias in TF, means that training will differ between the 2 frameworks.

I think it would be nice to implement the possibility to change the batch norm momentum to fit PyTorch's one, I am going to open a new issue and a new PR about this.
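
As a rough illustration of the momentum conventions, the sketch below writes both running-statistic updates as plain Python functions. The function names are hypothetical; the update rules follow the documented formulas of the two batch norm layers, where Keras weights the old statistic by `momentum` and PyTorch weights the new batch statistic by `momentum`.

```python
def keras_style_update(moving, batch_stat, momentum=0.99):
    """Keras convention: `momentum` is the weight of the old moving statistic."""
    return moving * momentum + batch_stat * (1.0 - momentum)


def pytorch_style_update(running, batch_stat, momentum=0.1):
    """PyTorch convention: `momentum` is the weight of the new batch statistic."""
    return (1.0 - momentum) * running + momentum * batch_stat


# A Keras momentum of 0.9 reproduces PyTorch's default of 0.1.
matched = keras_style_update(1.0, 2.0, momentum=0.9)
torch_default = pytorch_style_update(1.0, 2.0)

# The Keras default of 0.99 moves much more slowly towards the batch statistic.
keras_default = keras_style_update(1.0, 2.0)

print(matched, torch_default, keras_default)
```

So matching PyTorch's behaviour from Keras would mean passing `momentum=0.9` to the batch norm layers, assuming nothing else differs between the two setups.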

qlzh727 · 2022-04-06T20:40:19Z

Thanks for the PR. Could you make sure the weights for ImageNet are also available? Also, please make sure to run the evaluation on the ImageNet eval set, and report the accuracy number in the PR.

zaccharieramzi · 2022-04-07T07:10:00Z

@qlzh727 should I train the models also for the no bias case?

Also, could you point me to the scripts that were used to train the bigger models? I couldn't find them, but maybe I didn't look well enough.

zaccharieramzi · 2022-04-07T11:47:31Z

@qlzh727 I was looking for an official script to train a classification model on imagenet, and stumbled upon this: https://github.com/tensorflow/models

There is a typical example for training classification models, but I also noticed that there is already an implementation of ResNet without the bias and with the basic blocks here. I don't think the weights are available, but now my question is more: should we re-implement it here given it's already present in this other repo?

Basically, is there a difference in concern between keras applications and tensorflow models?

zaccharieramzi · 2022-04-07T12:59:45Z

I just noticed that one additional difference with the PyTorch implementation (in both keras applications and tensorflow models) is the initialization strategy for the convolution weights.

Framework	Init strategy
PyTorch	He normal, `nn.init.kaiming_normal_(m.weight, mode="fan_out", nonlinearity="relu")`
Keras	Glorot uniform, default of `Conv2D`
TensorFlow	Variance Scaling, at least by default
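
To give a sense of how far apart the first two rows of this table are, here is a small sketch that computes the resulting weight scale for a 3x3 convolution with 64 input and 64 output channels. The helper names are hypothetical. The formulas are the standard ones: He normal with `mode="fan_out"` and a ReLU gain uses a standard deviation of sqrt(2 / fan_out), and Glorot uniform samples from [-limit, limit] with limit = sqrt(6 / (fan_in + fan_out)).

```python
import math


def conv_fans(kernel_size, in_channels, out_channels):
    """Return (fan_in, fan_out) for a square 2D convolution kernel."""
    receptive_field = kernel_size * kernel_size
    return in_channels * receptive_field, out_channels * receptive_field


def he_normal_fan_out_std(kernel_size, in_channels, out_channels):
    """Std of He normal init with fan_out mode and ReLU gain."""
    _, fan_out = conv_fans(kernel_size, in_channels, out_channels)
    return math.sqrt(2.0 / fan_out)


def glorot_uniform_std(kernel_size, in_channels, out_channels):
    """Std of Glorot uniform init (a uniform on [-limit, limit] has std limit / sqrt(3))."""
    fan_in, fan_out = conv_fans(kernel_size, in_channels, out_channels)
    limit = math.sqrt(6.0 / (fan_in + fan_out))
    return limit / math.sqrt(3.0)


he_std = he_normal_fan_out_std(3, 64, 64)
glorot_std = glorot_uniform_std(3, 64, 64)

print(he_std, glorot_std, he_std / glorot_std)
```

When fan_in equals fan_out, the He normal weights have a standard deviation larger by a factor of sqrt(2). In Keras, an initializer along the lines of `VarianceScaling(scale=2.0, mode="fan_out", distribution="untruncated_normal")` should approximate the PyTorch choice, though that mapping is an assumption and not something checked in this thread.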

qlzh727 · 2022-04-07T16:41:53Z

> @qlzh727 should I train the models also for the no bias case?
>
> Also, could you point me to the scripts that were used to train the bigger models? I couldn't find them, but maybe I didn't look well enough.

We currently don't have any script for retraining the models. Keras applications were used for fine-tuning, and we usually reuse weights/checkpoints from the original paper (if they were published).

qlzh727 · 2022-04-07T16:44:09Z

> @qlzh727 I was looking for an official script to train a classification model on imagenet, and stumbled upon this: https://github.com/tensorflow/models
>
> There is a typical example for training classification models, but I also noticed that there is already an implementation of ResNet without the bias and with the basic blocks here. I don't think the weights are available, but now my question is more: should we re-implement it here given it's already present in this other repo?
>
> Basically, is there a difference in concern between keras applications and tensorflow models?

tensorflow-models is more focused on end-to-end solutions, and if that's already available in tf-models, we can probably skip it here in keras.applications (given that you can't get any existing weights).

zaccharieramzi · 2022-04-07T17:33:00Z

Well, the original paper did train both resnet 18 and 34, but I'm not sure in which framework, or even whether the weights are available.
Do you know where you obtained the resnet 50 weights?

Another solution would be to translate the ones from PyTorch, potentially forcing the bias to 0 for the original implementations with bias. Wdyt?

EDIT

One last thing is that if we do not include the resnet 18 and 34 here, it might still be nice to have a pointer to tensorflow/models, in order for people looking for an implementation to find it easily (this is not the case rn, see keras-team/keras-applications#151)

zaccharieramzi · 2022-05-11T08:32:17Z

@qlzh727 Indeed, since I am porting from PyTorch, I needed to use their preprocessing.
I was not able to find the weights of the resnet34 in caffe, and the resnet18 weights appear to be only available here.

Here are my tentative answers:

- Since we wanted to retrain the models anyway (cf. this comment), it's only going to be a temporary issue. We can simply document it well, in particular in the model and preprocessing docs. There could, by the way, be a tf.keras.applications.resnet18.preprocess_input, similar to what exists for resnet50.
- In the current state, we could do the correction of the preprocessing in the model, before retraining.
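
For readers unfamiliar with the PyTorch-style preprocessing mentioned here, the following is a minimal sketch for a single RGB pixel. It assumes the usual torchvision ImageNet statistics; the function name is hypothetical, and a real implementation would operate on whole image tensors. Keras ships a comparable behaviour through the `mode="torch"` option of `imagenet_utils.preprocess_input`, whereas the existing ResNet50 preprocessing uses the caffe style (BGR channel order, mean subtraction, no scaling).

```python
# Per-channel ImageNet statistics commonly used with torchvision models (RGB order).
IMAGENET_MEAN = (0.485, 0.456, 0.406)
IMAGENET_STD = (0.229, 0.224, 0.225)


def torch_style_preprocess_pixel(rgb):
    """Scale an RGB pixel from [0, 255] to [0, 1], then normalize each channel."""
    return tuple(
        (value / 255.0 - mean) / std
        for value, mean, std in zip(rgb, IMAGENET_MEAN, IMAGENET_STD)
    )


black = torch_style_preprocess_pixel((0, 0, 0))
white = torch_style_preprocess_pixel((255, 255, 255))

print(black)
print(white)
```

Feeding caffe-style inputs to weights ported from PyTorch would shift and rescale every channel, which is why the preprocessing has to be documented or built into the model.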

If however, you have at your disposal the caffe weights for both models (and by any chance the script to port them), I can definitely do the porting, and checks.

zaccharieramzi · 2022-05-14T15:59:53Z

I just found out something about the way torch applies batch norm at eval time that might explain the difference in accuracy I noticed here.

You can read about it here.

KaleabTessera · 2022-06-30T19:31:43Z

Any progress on this? This would be really great to have!

gbaned · 2022-07-06T13:32:00Z

@zaccharieramzi Can you please resolve conflicts? Thank you!

zaccharieramzi · 2022-07-06T14:15:32Z

@gbaned should be done

qlzh727 · 2022-08-08T19:59:35Z

Sorry for the long wait. Since end users could easily miss the preprocessing API with the PyTorch format, how about we include the preprocessing as part of the model and control it via an `include_preprocessing` flag on the model? We have taken this approach for several other models in the applications.

LukeWood · 2022-08-25T17:19:41Z

> Sorry for the long wait. Since end users could easily miss the preprocessing API with the PyTorch format, how about we include the preprocessing as part of the model and control it via an `include_preprocessing` flag on the model? We have taken this approach for several other models in the applications.

Because ResNet18/34 require a different input preprocessing than the other ResNets, we would probably need to retrain these weights. Let's migrate this to a PR on keras-cv. Please send a pull request to KerasCV, and place the model in the models package:

https://github.com/keras-team/keras-cv/tree/master/keras_cv/models

From there, we can retrain the models.

zaccharieramzi · 2022-09-16T10:57:58Z

@LukeWood sure, opening this PR keras-team/keras-cv#805

fchollet · 2022-09-22T17:13:04Z

> @LukeWood sure, opening this PR keras-team/keras-cv#805

Thank you. Let's move the discussion to the KerasCV PR.

zaccharieramzi · 2022-09-22T17:22:49Z

Just mentioning for those following the conversation that the corresponding PR in keras-cv has been merged.

implemented resnet18 and resnet34

7b1c7fc

google-ml-butler bot added the size:M label Apr 5, 2022

google-ml-butler bot assigned gbaned Apr 5, 2022

tomMoral added 3 commits April 5, 2022 10:51

FIX doc header

a4d0ab6

FIX architecture following Table.1 of the original paper

c9970cf

See https://arxiv.org/abs/1512.03385

CLN factorize and rename

83e7e6c

gbaned requested a review from qlzh727 April 5, 2022 14:35

google-ml-butler bot added the keras-team-review-pending Pending review by a Keras team member. label Apr 5, 2022

gbaned removed the keras-team-review-pending Pending review by a Keras team member. label Apr 5, 2022

added resnet18 and rest34 to applications test

97da126

chunduriv mentioned this pull request Apr 6, 2022

Feature request: ResNet34 tensorflow/tensorflow#44099

Closed

added the possibility to not use bias in resnet v1

324f77e

zaccharieramzi marked this pull request as ready for review April 6, 2022 16:18

zaccharieramzi added 2 commits April 7, 2022 12:15

added possibility to skip the first shortcut in small resnets

d82fd80

corrected linting

91b7f8b

gbaned requested review from qlzh727 and removed request for qlzh727 April 7, 2022 11:09

google-ml-butler bot added the keras-team-review-pending Pending review by a Keras team member. label Apr 7, 2022

zaccharieramzi mentioned this pull request Apr 7, 2022

Small ResNets have an extra convolution tensorflow/models#10583

Open

1 task

divyashreepathihalli removed the keras-team-review-pending Pending review by a Keras team member. label Apr 7, 2022

qlzh727 requested a review from fchollet May 10, 2022 20:30

google-ml-butler bot added the keras-team-review-pending Pending review by a Keras team member. label May 10, 2022

qlzh727 removed the keras-team-review-pending Pending review by a Keras team member. label May 12, 2022

gowthamkpr mentioned this pull request Jun 5, 2022

Addition of Resnet18 Model in Application #15494

Closed

gbaned added the stat:awaiting response from contributor label Jul 6, 2022

zaccharieramzi added 8 commits July 6, 2022 16:02

solved merge conflict by rejecting changes in applications resnet

c590c31

blacked keras resnet

50de862

blacked keras application test

a4c4ca2

blacked with correct line length

1db1b94

corrected isort in applications

16916a2

readded convnext in applications test

b3e9f8c

removed pylint instruction

43d8454

corrected import order in resnet

109fba4

google-ml-butler bot removed the stat:awaiting response from contributor label Jul 6, 2022

gbaned requested review from qlzh727 and removed request for qlzh727 August 5, 2022 07:39

google-ml-butler bot added the keras-team-review-pending Pending review by a Keras team member. label Aug 5, 2022

zaccharieramzi mentioned this pull request Sep 16, 2022

added the implementations of resnet 18 and 34 keras-team/keras-cv#805

Merged

5 tasks

fchollet closed this Sep 22, 2022
