
The drop connect rate (aka survival rate) is incorrect #200

xhluca opened this issue Dec 20, 2020 · 9 comments

@xhluca

xhluca commented Dec 20, 2020

I originally posted this as an issue here: qubvel/efficientnet#135

However, I noticed that the two implementations are the same and that the error exists here as well, so I decided to post it here.


I just verified against the reference tf.keras implementation, and here are the results. Below is the output for B5.

This implementation's drop connect rate

(index, name, rate)

0 block1b_drop 0.9875
1 block1c_drop 0.975
2 block2b_drop 0.95
3 block2c_drop 0.9375
4 block2d_drop 0.925
5 block2e_drop 0.9125
6 block3b_drop 0.8875
7 block3c_drop 0.875
8 block3d_drop 0.8625
9 block3e_drop 0.85
10 block4b_drop 0.825
11 block4c_drop 0.8125
12 block4d_drop 0.8
13 block4e_drop 0.7875
14 block4f_drop 0.775
15 block4g_drop 0.7625
16 block5b_drop 0.7375
17 block5c_drop 0.725
18 block5d_drop 0.7124999999999999
19 block5e_drop 0.7
20 block5f_drop 0.6875
21 block5g_drop 0.675
22 block6b_drop 0.6499999999999999
23 block6c_drop 0.6375
24 block6d_drop 0.625
25 block6e_drop 0.6125
26 block6f_drop 0.6
27 block6g_drop 0.5874999999999999
28 block6h_drop 0.575
29 block6i_drop 0.5625
30 block7b_drop 0.5375
31 block7c_drop 0.5249999999999999
32 top_dropout 0.6

TensorFlow's drop connect rate

0 block1b_drop 0.9948717948717949
1 block1c_drop 0.9897435897435898
2 block2b_drop 0.9794871794871794
3 block2c_drop 0.9743589743589743
4 block2d_drop 0.9692307692307692
5 block2e_drop 0.9641025641025641
6 block3b_drop 0.9538461538461538
7 block3c_drop 0.9487179487179487
8 block3d_drop 0.9435897435897436
9 block3e_drop 0.9384615384615385
10 block4b_drop 0.9282051282051282
11 block4c_drop 0.9230769230769231
12 block4d_drop 0.9179487179487179
13 block4e_drop 0.9128205128205128
14 block4f_drop 0.9076923076923077
15 block4g_drop 0.9025641025641026
16 block5b_drop 0.8923076923076922
17 block5c_drop 0.8871794871794871
18 block5d_drop 0.882051282051282
19 block5e_drop 0.8769230769230769
20 block5f_drop 0.8717948717948718
21 block5g_drop 0.8666666666666667
22 block6b_drop 0.8564102564102564
23 block6c_drop 0.8512820512820513
24 block6d_drop 0.8461538461538461
25 block6e_drop 0.841025641025641
26 block6f_drop 0.8358974358974359
27 block6g_drop 0.8307692307692307
28 block6h_drop 0.8256410256410256
29 block6i_drop 0.8205128205128205
30 block7b_drop 0.8102564102564103
31 block7c_drop 0.8051282051282052
32 top_dropout 0.6

At index 18 (block5d_drop) it's off by a significant amount: 0.7125 vs. 0.8821.
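
For reference, here is a minimal sketch of how such a listing can be produced (an assumed reproduction script, not the exact one used; swap the constructor for whichever implementation is being checked):

    import tensorflow as tf

    # Assumed reproduction: build B5 without weights and list every
    # Dropout layer's survival rate (survival = 1 - dropout rate).
    model = tf.keras.applications.EfficientNetB5(weights=None)

    drop_layers = [l for l in model.layers
                   if isinstance(l, tf.keras.layers.Dropout)]
    for i, layer in enumerate(drop_layers):
        print(i, layer.name, 1 - layer.rate)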

@darcula1993

I checked the drop rate per block and it looks fine:

block1a_ 1.0
block1b_ 0.9875
block1c_ 0.975
block2a_ 0.9625
block2b_ 0.95
block2c_ 0.9375
block2d_ 0.925
block2e_ 0.9125
block3a_ 0.9
block3b_ 0.8875
block3c_ 0.875
block3d_ 0.8625
block3e_ 0.85
block4a_ 0.8375
block4b_ 0.825
block4c_ 0.8125
block4d_ 0.8
block4e_ 0.7875
block4f_ 0.775
block4g_ 0.7625
block5a_ 0.75
block5b_ 0.7375
block5c_ 0.725
block5d_ 0.7124999999999999
block5e_ 0.7
block5f_ 0.6875
block5g_ 0.675
block6a_ 0.6625
block6b_ 0.6499999999999999
block6c_ 0.6375
block6d_ 0.625
block6e_ 0.6125
block6f_ 0.6
block6g_ 0.5874999999999999
block6h_ 0.575
block6i_ 0.5625
block7a_ 0.55
block7b_ 0.5375
block7c_ 0.5249999999999999

@xhluca

xhluca commented Dec 22, 2020

@darcula1993 I'm confused. Shouldn't the block rate be at ~0.8 for the final block since the drop_connect_rate is 0.2 by default?

@xhluca

xhluca commented Dec 22, 2020

It turns out I pasted different values. However, the problem remains as indicated.

@darcula1993

        for j in range(round_repeats(args.pop('repeats'))):
            # The first block needs to take care of stride and filter size increase.
            if j > 0:
                args['strides'] = 1
                args['filters_in'] = args['filters_out']
            x = block(x, activation_fn, drop_connect_rate * b / blocks,
                      name='block{}{}_'.format(i + 1, chr(j + 97)), **args)
            b += 1

I checked the code, and it seems that b can be greater than the number of blocks; I'm not sure why.
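
A plausible explanation: the denominator blocks is computed from the unscaled repeats in blocks_args, while the loop above increments b once per depth-scaled block, so b / blocks overshoots whenever depth_coefficient > 1. Here is a sketch of the corresponding fix, mirroring what tf.keras does (an illustration, not the repository's actual patch):

    import math

    def round_repeats(repeats, depth_coefficient):
        # Scale a block's repeat count by the model's depth coefficient.
        return int(math.ceil(depth_coefficient * repeats))

    # Buggy: counts only the unscaled repeats (16 for B5).
    # blocks = float(sum(args['repeats'] for args in blocks_args))

    # Fixed: count the depth-scaled blocks, matching how often b
    # is incremented in the build loop.
    def total_blocks(blocks_args, depth_coefficient):
        return float(sum(round_repeats(args['repeats'], depth_coefficient)
                         for args in blocks_args))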

@xhluca

xhluca commented Dec 23, 2020

I've observed the same thing.

@fmbahrt

fmbahrt commented Jan 2, 2021

Qubvel's implementation does not calculate the total number of blocks correctly for configurations larger than B0.
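
The B5 tables above bear this out. With depth_coefficient = 2.2, the base repeats [1, 2, 2, 3, 3, 4, 1] sum to 16, but rounding each one up after scaling gives 39 blocks (block1a through block7c), and 1 - 0.2 * b / blocks then reproduces both tables for the last block:

    import math

    depth_coefficient = 2.2            # B5
    base_repeats = [1, 2, 2, 3, 3, 4, 1]

    unscaled = sum(base_repeats)       # 16, the buggy denominator
    scaled = sum(math.ceil(depth_coefficient * r) for r in base_repeats)  # 39

    b = 38  # block7c, the last of the 39 depth-scaled blocks
    print(1 - 0.2 * b / unscaled)      # 0.525      -> buggy table
    print(1 - 0.2 * b / scaled)        # 0.80512... -> tf.keras table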

@innat

innat commented Jan 3, 2021

In practice it does perform better than the official one. -_-

@xhluca

xhluca commented Jan 4, 2021

In practice it does perform better than the official one. -_-

I've only observed better performance in one case, so I'm not sure it generalizes. In that case, the improved performance does suggest that extremely low survival rates (<0.3) might be a good regularization approach.

@innat

innat commented Jan 5, 2021

Well, I'm not sure; maybe I need to look at it again properly. In fact, I spent almost a week assuming there was probably some problem with my data loader while using the official EfficientNet. But when I used the non-official implementation, it was just fine.
