Performance Gap (significantly lower than reported) #13

Open
convnets opened this issue Apr 6, 2020 · 2 comments

Comments

convnets commented Apr 6, 2020

Hi,

I have tried to reproduce the reported results, but my numbers are lower than those claimed in the paper. The evaluation output is shown below:

Evaluating on val_seen env ...
Epoch: [275][1/16]      Time 1.875 (1.875)      Loss inf (inf)
Epoch: [275][2/16]      Time 1.779 (1.827)      Loss inf (inf)
Epoch: [275][3/16]      Time 1.974 (1.876)      Loss inf (inf)
Epoch: [275][4/16]      Time 1.946 (1.894)      Loss inf (inf)
Epoch: [275][5/16]      Time 1.816 (1.878)      Loss inf (inf)
Epoch: [275][6/16]      Time 1.790 (1.863)      Loss inf (inf)
Epoch: [275][7/16]      Time 1.829 (1.858)      Loss inf (inf)
Epoch: [275][8/16]      Time 1.910 (1.865)      Loss inf (inf)
Epoch: [275][9/16]      Time 1.683 (1.845)      Loss inf (inf)
Epoch: [275][10/16]     Time 1.947 (1.855)      Loss inf (inf)
Epoch: [275][11/16]     Time 1.788 (1.849)      Loss inf (inf)
Epoch: [275][12/16]     Time 1.887 (1.852)      Loss inf (inf)
Epoch: [275][13/16]     Time 1.575 (1.831)      Loss inf (inf)
Epoch: [275][14/16]     Time 1.704 (1.822)      Loss inf (inf)
Epoch: [275][15/16]     Time 1.492 (1.800)      Loss inf (inf)
Epoch: [275][16/16]     Time 1.724 (1.795)      Loss inf (inf)
============================
success rate: 0.6317335945151812
rollback rate: 0.20372184133202742
rollback success rate: 0.07639569049951028
oscillating rate: 0.0
oscillating success rate: 0.0
============================
| nav_error: 3.6218101292139466 | oracle_error: 2.1551905617521285 | steps: 7.056862745098039 | lengths: 12.267354546942984 | spl: 0.5606026181302899 | success_rate: 0.6284313725490196 | oracle_rate: 0.7441176470588236
Evaluating on val_unseen env ...
Epoch: [275][1/37]      Time 1.413 (1.413)      Loss inf (inf)
Epoch: [275][2/37]      Time 1.424 (1.419)      Loss inf (inf)
Epoch: [275][3/37]      Time 1.279 (1.372)      Loss inf (inf)
Epoch: [275][4/37]      Time 1.418 (1.384)      Loss inf (inf)
Epoch: [275][5/37]      Time 1.312 (1.369)      Loss inf (inf)
Epoch: [275][6/37]      Time 1.146 (1.332)      Loss inf (inf)
Epoch: [275][7/37]      Time 1.152 (1.306)      Loss inf (inf)
Epoch: [275][8/37]      Time 1.043 (1.273)      Loss inf (inf)
Epoch: [275][9/37]      Time 1.016 (1.245)      Loss inf (inf)
Epoch: [275][10/37]     Time 1.085 (1.229)      Loss inf (inf)
Epoch: [275][11/37]     Time 1.080 (1.215)      Loss inf (inf)
Epoch: [275][12/37]     Time 0.976 (1.195)      Loss inf (inf)
Epoch: [275][13/37]     Time 0.997 (1.180)      Loss inf (inf)
Epoch: [275][14/37]     Time 1.005 (1.167)      Loss inf (inf)
Epoch: [275][15/37]     Time 1.037 (1.159)      Loss inf (inf)
Epoch: [275][16/37]     Time 0.944 (1.145)      Loss inf (inf)
Epoch: [275][17/37]     Time 0.899 (1.131)      Loss inf (inf)
Epoch: [275][18/37]     Time 0.955 (1.121)      Loss inf (inf)
Epoch: [275][19/37]     Time 0.891 (1.109)      Loss inf (inf)
Epoch: [275][20/37]     Time 0.906 (1.099)      Loss inf (inf)
Epoch: [275][21/37]     Time 0.889 (1.089)      Loss inf (inf)
Epoch: [275][22/37]     Time 0.888 (1.080)      Loss inf (inf)
Epoch: [275][23/37]     Time 0.855 (1.070)      Loss inf (inf)
Epoch: [275][24/37]     Time 0.890 (1.062)      Loss inf (inf)
Epoch: [275][25/37]     Time 0.840 (1.054)      Loss inf (inf)
Epoch: [275][26/37]     Time 0.883 (1.047)      Loss inf (inf)
Epoch: [275][27/37]     Time 0.857 (1.040)      Loss inf (inf)
Epoch: [275][28/37]     Time 0.831 (1.033)      Loss inf (inf)
Epoch: [275][29/37]     Time 0.843 (1.026)      Loss inf (inf)
Epoch: [275][30/37]     Time 0.820 (1.019)      Loss inf (inf)
Epoch: [275][31/37]     Time 0.850 (1.014)      Loss inf (inf)
Epoch: [275][32/37]     Time 0.901 (1.010)      Loss inf (inf)
Epoch: [275][33/37]     Time 0.832 (1.005)      Loss inf (inf)
Epoch: [275][34/37]     Time 0.839 (1.000)      Loss inf (inf)
Epoch: [275][35/37]     Time 0.918 (0.998)      Loss inf (inf)
Epoch: [275][36/37]     Time 0.826 (0.993)      Loss inf (inf)
Epoch: [275][37/37]     Time 0.839 (0.989)      Loss inf (inf)
============================
success rate: 0.44146445295870584
rollback rate: 0.5525755640698169
rollback success rate: 0.16773094934014474
oscillating rate: 0.0
oscillating success rate: 0.0
============================
| nav_error: 5.872141008455078 | oracle_error: 3.632631547700617 | steps: 8.510004257130694 | lengths: 15.75594853073863 | spl: 0.32215776203613483 | success_rate: 0.4384844614729672 | oracle_rate: 0.5696040868454662

In the paper, Table 1 (without data augmentation) reports the following expected results:
val_seen (NE | SR | OSR | SPL): 3.69 | 0.65 | 0.72 | 0.59
val_unseen (NE | SR | OSR | SPL): 5.36 | 0.48 | 0.61 | 0.37
However, I obtained a val_seen SPL of 0.56 (3 points lower) and a val_unseen SPL of 0.32 (5 points lower).
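To make the comparison explicit, here is a minimal sketch that parses the summary line printed by the evaluation and subtracts it from the Table 1 numbers quoted above. The helper names `parse_summary` and `gap_to_paper` are my own, not part of the repository, and I am assuming that NE, SR and OSR in the paper correspond to `nav_error`, `success_rate` and `oracle_rate` in the log. Gaps are in absolute points (0.03 = 3 points), not relative percentages.

```python
# Hypothetical helpers (not from the repository) for comparing an evaluation
# summary line against the Table 1 numbers quoted in this issue.


def parse_summary(line):
    """Parse '| key: value | key: value ...' into a dict of floats."""
    metrics = {}
    for field in line.split("|"):
        field = field.strip()
        if not field:
            continue  # skip the empty piece before the leading '|'
        key, value = field.split(":")
        metrics[key.strip()] = float(value)
    return metrics


def gap_to_paper(metrics, paper):
    """Return (paper value - reproduced value) for each metric in both dicts."""
    return {k: paper[k] - metrics[k] for k in paper if k in metrics}


# Summary line copied from the val_seen log above.
val_seen_line = (
    "| nav_error: 3.6218101292139466 | oracle_error: 2.1551905617521285 "
    "| steps: 7.056862745098039 | lengths: 12.267354546942984 "
    "| spl: 0.5606026181302899 | success_rate: 0.6284313725490196 "
    "| oracle_rate: 0.7441176470588236"
)

# Table 1 (without data augmentation), val_seen.
# Assumed mapping: NE -> nav_error, SR -> success_rate, OSR -> oracle_rate.
paper_val_seen = {
    "nav_error": 3.69,
    "success_rate": 0.65,
    "oracle_rate": 0.72,
    "spl": 0.59,
}

metrics = parse_summary(val_seen_line)
gaps = gap_to_paper(metrics, paper_val_seen)

# For nav_error lower is better, so a positive gap there favours the
# reproduction; for the other three a positive gap means it falls short.
for name, gap in gaps.items():
    print(f"{name}: reproduced {metrics[name]:.3f}, "
          f"paper {paper_val_seen[name]:.2f}, gap {gap:+.3f}")
```

The same two functions can be pointed at the val_unseen summary line by swapping in that line and the val_unseen row of Table 1.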

My environment is as follows:

# Name                    Version                   Build  Channel
python                    3.8.2                hcf32534_0
pytorch                   1.4.0           py3.8_cuda10.1.243_cudnn7.6.3_0    pytorch
numpy                     1.18.1           py38h4f9e942_0
networkx                  2.4                      pypi_0    pypi
torchvision               0.5.0                py38_cu101    pytorch

Can you help?

@convnets convnets changed the title Performance Gap (lower than reported) Performance Gap (significantly lower than reported) Apr 6, 2020

convnets commented Apr 11, 2020

Even with pytorch 0.4.1, the performance gap still exists.

R2RBatch loaded with 2349 instructions, using splits: val_unseen
Evaluating on val_seen env ...
Epoch: [90][1/16]       Time 1.582 (1.582)      Loss inf (inf)
Epoch: [90][2/16]       Time 1.595 (1.588)      Loss inf (inf)
Epoch: [90][3/16]       Time 1.803 (1.660)      Loss inf (inf)
Epoch: [90][4/16]       Time 1.772 (1.688)      Loss inf (inf)
Epoch: [90][5/16]       Time 1.628 (1.676)      Loss inf (inf)
Epoch: [90][6/16]       Time 1.615 (1.666)      Loss inf (inf)
Epoch: [90][7/16]       Time 1.651 (1.664)      Loss inf (inf)
Epoch: [90][8/16]       Time 1.746 (1.674)      Loss inf (inf)
Epoch: [90][9/16]       Time 1.500 (1.655)      Loss inf (inf)
Epoch: [90][10/16]      Time 1.787 (1.668)      Loss inf (inf)
Epoch: [90][11/16]      Time 1.587 (1.661)      Loss inf (inf)
Epoch: [90][12/16]      Time 1.690 (1.663)      Loss inf (inf)
Epoch: [90][13/16]      Time 1.364 (1.640)      Loss inf (inf)
Epoch: [90][14/16]      Time 1.530 (1.632)      Loss inf (inf)
Epoch: [90][15/16]      Time 1.277 (1.608)      Loss inf (inf)
Epoch: [90][16/16]      Time 1.512 (1.602)      Loss inf (inf)
============================
success rate: 0.614103819784525
rollback rate: 0.15572967678746327
rollback success rate: 0.05484818805093046
oscillating rate: 0.0
oscillating success rate: 0.0
============================
| nav_error: 3.85835455597601 | oracle_error: 2.362730659573285 | steps: 6.845098039215686 | lengths: 11.64444015803238 | spl: 0.5553807439701335 | success_rate: 0.6107843137254902 | oracle_rate: 0.6990196078431372
Evaluating on val_unseen env ...
Epoch: [90][1/37]       Time 1.260 (1.260)      Loss inf (inf)
Epoch: [90][2/37]       Time 1.225 (1.242)      Loss inf (inf)
Epoch: [90][3/37]       Time 1.038 (1.174)      Loss inf (inf)
Epoch: [90][4/37]       Time 1.194 (1.179)      Loss inf (inf)
Epoch: [90][5/37]       Time 1.071 (1.157)      Loss inf (inf)
Epoch: [90][6/37]       Time 0.867 (1.109)      Loss inf (inf)
Epoch: [90][7/37]       Time 0.891 (1.078)      Loss inf (inf)
Epoch: [90][8/37]       Time 0.760 (1.038)      Loss inf (inf)
Epoch: [90][9/37]       Time 0.728 (1.004)      Loss inf (inf)
Epoch: [90][10/37]      Time 0.806 (0.984)      Loss inf (inf)
Epoch: [90][11/37]      Time 0.804 (0.968)      Loss inf (inf)
Epoch: [90][12/37]      Time 0.690 (0.944)      Loss inf (inf)
Epoch: [90][13/37]      Time 0.710 (0.926)      Loss inf (inf)
Epoch: [90][14/37]      Time 0.709 (0.911)      Loss inf (inf)
Epoch: [90][15/37]      Time 0.751 (0.900)      Loss inf (inf)
Epoch: [90][16/37]      Time 0.642 (0.884)      Loss inf (inf)
Epoch: [90][17/37]      Time 0.603 (0.868)      Loss inf (inf)
Epoch: [90][18/37]      Time 0.652 (0.856)      Loss inf (inf)
Epoch: [90][19/37]      Time 0.596 (0.842)      Loss inf (inf)
Epoch: [90][20/37]      Time 0.610 (0.830)      Loss inf (inf)
Epoch: [90][21/37]      Time 0.589 (0.819)      Loss inf (inf)
Epoch: [90][22/37]      Time 0.591 (0.808)      Loss inf (inf)
Epoch: [90][23/37]      Time 0.567 (0.798)      Loss inf (inf)
Epoch: [90][24/37]      Time 0.599 (0.790)      Loss inf (inf)
Epoch: [90][25/37]      Time 0.557 (0.780)      Loss inf (inf)
Epoch: [90][26/37]      Time 0.586 (0.773)      Loss inf (inf)
Epoch: [90][27/37]      Time 0.575 (0.766)      Loss inf (inf)
Epoch: [90][28/37]      Time 0.555 (0.758)      Loss inf (inf)
Epoch: [90][29/37]      Time 0.559 (0.751)      Loss inf (inf)
Epoch: [90][30/37]      Time 0.547 (0.744)      Loss inf (inf)
Epoch: [90][31/37]      Time 0.568 (0.739)      Loss inf (inf)
Epoch: [90][32/37]      Time 0.604 (0.734)      Loss inf (inf)
Epoch: [90][33/37]      Time 0.551 (0.729)      Loss inf (inf)
Epoch: [90][34/37]      Time 0.560 (0.724)      Loss inf (inf)
Epoch: [90][35/37]      Time 0.619 (0.721)      Loss inf (inf)
Epoch: [90][36/37]      Time 0.552 (0.716)      Loss inf (inf)
Epoch: [90][37/37]      Time 0.553 (0.712)      Loss inf (inf)
============================
success rate: 0.45977011494252873
rollback rate: 0.42358450404427417
rollback success rate: 0.13452532992762878
oscillating rate: 0.0
oscillating success rate: 0.0
============================
| nav_error: 5.879764965324538 | oracle_error: 3.5521622260626846 | steps: 7.909748829289059 | lengths: 14.349212914918965 | spl: 0.3555266413501602 | success_rate: 0.4559386973180077 | oracle_rate: 0.5751383567475522
# Name                    Version                   Build  Channel
python                    3.7.7           hcf32534_0_cpython
pytorch                   0.4.1           py37_cuda9.2.148_cudnn7.1.4_1  [cuda92]  pytorch
numpy                     1.15.4           py37h7e9f1db_0
networkx                  2.4                      pypi_0    pypi
torchvision               0.2.1                    py37_0

Here are the training and testing scripts:

#!/bin/sh
# Training script
CUDA_VISIBLE_DEVICES=1 python tasks/R2R-pano/main.py \
    --exp_name 'regretful-agent-data|real' \
    --batch_size 64 \
    --img_fc_dim 1024 \
    --rnn_hidden_size 512 \
    --eval_every_epochs 5 \
    --arch 'regretful' \
    --progress_marker 1
#!/bin/sh
# Testing script: evaluation only, resuming from the 'best' checkpoint
CUDA_VISIBLE_DEVICES=1 python tasks/R2R-pano/main.py \
    --exp_name 'regretful-agent-data|real' \
    --batch_size 64 \
    --img_fc_dim 1024 \
    --rnn_hidden_size 512 \
    --eval_every_epochs 5 \
    --arch 'regretful' \
    --progress_marker 1 \
    --eval_only 1 \
    --resume 'best'

With PyTorch 0.4.1, val_seen SPL is 3.4 points below the paper (0.555 vs. 0.59) and val_unseen SPL is 1.4 points below (0.356 vs. 0.37).
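Since the gap is measured in SPL, here is a minimal sketch of that metric as it is usually defined for R2R: success weighted by the ratio of the shortest-path length to the longer of the agent's path and the shortest path. I am assuming the repository's evaluation follows this standard definition, which I have not verified; the function name `spl` and the toy episodes are hypothetical.

```python
def spl(successes, shortest_lengths, path_lengths):
    """Success weighted by Path Length, averaged over episodes.

    successes:        1 if the episode counted as a success, else 0.
    shortest_lengths: shortest-path distance from start to goal (assumed > 0).
    path_lengths:     distance actually travelled by the agent.
    """
    total = 0.0
    for s, l, p in zip(successes, shortest_lengths, path_lengths):
        # A successful episode scores l / max(p, l): 1.0 for an optimal path,
        # less for a detour. A failed episode scores 0.
        total += s * l / max(p, l)
    return total / len(successes)


# Toy episodes (made up for illustration, not taken from the logs above):
# a direct success, a success that travels twice the shortest path, a failure.
print(spl([1, 1, 0], [10.0, 10.0, 10.0], [10.0, 20.0, 12.0]))  # 0.5
```

Under this definition, two runs with the same success rate can still differ in SPL if one of them takes longer paths on its successful episodes.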
I have no clue what is happening here.

@Hannah-hh

Excuse me, I have a question: what is the difference between 'success rate' and 'success_rate' in the results? I hope to hear back from you.
