This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

[MXNET-770] Remove fixed seed in flaky test #11958

Merged 3 commits into apache:master from test/fix_flaky_11706 on Aug 1, 2018

Conversation

@apeforest (Contributor) commented Jul 31, 2018

Description

Remove the fixed seed for test_module.test_monitor, because the reported flakiness could not be reproduced.
Issue reported: #11706

Checklist

Essentials

Please feel free to remove inapplicable items for your PR.

  • Changes are complete (i.e. I finished coding on this PR)
  • All changes have test coverage:
  • Unit tests are added for small changes to verify correctness (e.g. adding a new operator)
  • Nightly tests are added for complicated/long-running ones (e.g. changing distributed kvstore)
  • Build tests will be added for build configuration changes (e.g. adding a new build option with NCCL)
  • Code is well-documented:
  • For user-facing API changes, API doc string has been updated.
  • For new C++ functions in header files, their functionalities and arguments are documented.
  • For new examples, README.md is added to explain what the example does, the source of the dataset, expected performance on test set and reference to the original paper if applicable
  • Check the API doc at http://mxnet-ci-doc.s3-accelerate.dualstack.amazonaws.com/PR-$PR_ID/$BUILD_ID/index.html
  • To the best of my knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change

Changes

  • Get rid of the fixed seed for test_module.test_monitor
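
To illustrate what removing a fixed seed means for a test, here is a minimal sketch of a seeding decorator built only on Python's standard `random` module. It is not MXNet's test helper: the name `with_seed_sketch` and the precedence it applies are assumptions made for illustration, and only the `MXNET_TEST_SEED` variable name is taken from the log output in this PR.

```python
import functools
import logging
import os
import random


def with_seed_sketch(seed=None):
    """Hypothetical stand-in for a test seeding decorator (not MXNet's own code).

    Assumed precedence: an explicit `seed` argument wins, then the
    MXNET_TEST_SEED environment variable, then a freshly drawn random seed.
    """
    def decorator(test_fn):
        @functools.wraps(test_fn)
        def wrapper(*args, **kwargs):
            env_seed = os.environ.get("MXNET_TEST_SEED")
            if seed is not None:
                this_seed = seed              # fixed seed: same data on every run
            elif env_seed is not None:
                this_seed = int(env_seed)     # reproduce one specific run
            else:
                # No fixed seed: draw a new one so each run sees different data.
                this_seed = random.SystemRandom().randint(0, 2**31 - 1)
            logging.debug(
                "Setting test seed, use MXNET_TEST_SEED=%d to reproduce.", this_seed
            )
            random.seed(this_seed)
            return test_fn(*args, **kwargs)
        return wrapper
    return decorator


@with_seed_sketch(seed=42)   # before: the test is pinned to one seed
def pinned_draw():
    return random.random()


@with_seed_sketch()          # after: the seed varies and is logged for reproduction
def unpinned_draw():
    return random.random()
```

With the seed pinned, a test only ever exercises one input, which can hide a real failure or mask whether a past failure is still present. Without it, every run covers a new input, and the logged seed is what lets a failing run be replayed.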

Comments

The test passes 10,000 consecutive runs on both CPU and GPU:
CPU on m4.4xlarge

test_module.test_monitor ... [DEBUG] 1 of 10000: Setting test np/mx/python random seeds, use MXNET_TEST_SEED=341502476 to reproduce.
[DEBUG] 2 of 10000: Setting test np/mx/python random seeds, use MXNET_TEST_SEED=1911419844 to reproduce.
[DEBUG] 3 of 10000: Setting test np/mx/python random seeds, use MXNET_TEST_SEED=2081311389 to reproduce.
[DEBUG] 4 of 10000: Setting test np/mx/python random seeds, use MXNET_TEST_SEED=1300376807 to reproduce.
[DEBUG] 5 of 10000: Setting test np/mx/python random seeds, use MXNET_TEST_SEED=747761998 to reproduce.
...
[DEBUG] 9993 of 10000: Setting test np/mx/python random seeds, use MXNET_TEST_SEED=1960288796 to reproduce.
[DEBUG] 9994 of 10000: Setting test np/mx/python random seeds, use MXNET_TEST_SEED=126697537 to reproduce.
[DEBUG] 9995 of 10000: Setting test np/mx/python random seeds, use MXNET_TEST_SEED=508893563 to reproduce.
[DEBUG] 9996 of 10000: Setting test np/mx/python random seeds, use MXNET_TEST_SEED=800422860 to reproduce.
[DEBUG] 9997 of 10000: Setting test np/mx/python random seeds, use MXNET_TEST_SEED=1152882755 to reproduce.
[DEBUG] 9998 of 10000: Setting test np/mx/python random seeds, use MXNET_TEST_SEED=2102456845 to reproduce.
[DEBUG] 9999 of 10000: Setting test np/mx/python random seeds, use MXNET_TEST_SEED=343132162 to reproduce.
[DEBUG] 10000 of 10000: Setting test np/mx/python random seeds, use MXNET_TEST_SEED=1430358686 to reproduce.
ok

----------------------------------------------------------------------
Ran 1 test in 84.045s

GPU on p2.8xlarge

[INFO] Setting module np/mx/python random seeds, use MXNET_MODULE_SEED=25908830 to reproduce.
test_module.test_monitor ... [DEBUG] 1 of 10000: Setting test np/mx/python random seeds, use MXNET_TEST_SEED=956191307 to reproduce.
[DEBUG] 2 of 10000: Setting test np/mx/python random seeds, use MXNET_TEST_SEED=213068082 to reproduce.
[DEBUG] 3 of 10000: Setting test np/mx/python random seeds, use MXNET_TEST_SEED=1699665312 to reproduce.
[DEBUG] 4 of 10000: Setting test np/mx/python random seeds, use MXNET_TEST_SEED=393344086 to reproduce.
[DEBUG] 5 of 10000: Setting test np/mx/python random seeds, use MXNET_TEST_SEED=254104386 to reproduce.
...
[DEBUG] 9992 of 10000: Setting test np/mx/python random seeds, use MXNET_TEST_SEED=38503839 to reproduce.
[DEBUG] 9993 of 10000: Setting test np/mx/python random seeds, use MXNET_TEST_SEED=1410251565 to reproduce.
[DEBUG] 9994 of 10000: Setting test np/mx/python random seeds, use MXNET_TEST_SEED=1003687319 to reproduce.
[DEBUG] 9995 of 10000: Setting test np/mx/python random seeds, use MXNET_TEST_SEED=10094790 to reproduce.
[DEBUG] 9996 of 10000: Setting test np/mx/python random seeds, use MXNET_TEST_SEED=1177029780 to reproduce.
[DEBUG] 9997 of 10000: Setting test np/mx/python random seeds, use MXNET_TEST_SEED=1426750291 to reproduce.
[DEBUG] 9998 of 10000: Setting test np/mx/python random seeds, use MXNET_TEST_SEED=1707703760 to reproduce.
[DEBUG] 9999 of 10000: Setting test np/mx/python random seeds, use MXNET_TEST_SEED=1759791823 to reproduce.
[DEBUG] 10000 of 10000: Setting test np/mx/python random seeds, use MXNET_TEST_SEED=1920483801 to reproduce.
ok

----------------------------------------------------------------------
Ran 1 test in 81.948s
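
The repeated-run check shown in the logs above can be sketched as a small loop that gives each run a fresh seed and records any seed that fails. This is a sketch under stated assumptions, not the harness used for this PR: `run_repeatedly`, `always_passes`, and `sometimes_fails` are hypothetical names, and only Python's `random` module is seeded here rather than the np/mx/python generators named in the logs.

```python
import random


def run_repeatedly(test_fn, count, master_seed=None):
    """Run `test_fn` `count` times with a fresh seed each time.

    Returns the list of seeds for which the test raised AssertionError,
    so that each failure can be replayed individually.
    """
    seed_source = random.Random(master_seed)  # separate generator for picking seeds
    failing_seeds = []
    for _ in range(count):
        seed = seed_source.randint(0, 2**31 - 1)
        random.seed(seed)                     # seed the generator the test uses
        try:
            test_fn()
        except AssertionError:
            failing_seeds.append(seed)
    return failing_seeds


def always_passes():
    # Holds for every seed, so no run should fail.
    assert 0.0 <= random.random() < 1.0


def sometimes_fails():
    # Deliberately flaky: fails for roughly one seed in ten.
    assert random.random() >= 0.1


if __name__ == "__main__":
    print("stable test, failing seeds:", len(run_repeatedly(always_passes, 10000)))
    print("flaky test, failing seeds:", len(run_repeatedly(sometimes_fails, 10000)))
```

An empty result after many runs is evidence that the test is stable across seeds, though it does not prove that no failing seed exists.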

@marcoabreu (Contributor)

What about GPU?

@apeforest changed the title from "Remove fixed seed in flaky test" to "[MXNET-770] Remove fixed seed in flaky test" on Aug 1, 2018

@apeforest (Contributor, Author)

@marcoabreu I also verified on GPU. Please see the description section.

@marcoabreu merged commit fc912f3 into apache:master on Aug 1, 2018
aaronmarkham pushed a commit to aaronmarkham/incubator-mxnet that referenced this pull request Aug 6, 2018
* Remove fixed seed in flaky test

* Remove fixed seed in flaky test
aaronmarkham added a commit to aaronmarkham/incubator-mxnet that referenced this pull request Aug 7, 2018
[MXNET-750] fix nested call on CachedOp. (apache#11951)

* fix nested call on cachedop.

* fix.

extend reshape op to allow reverse shape inference (apache#11956)

Improve sparse embedding index out of bound error message; (apache#11940)

[MXNET-770] Remove fixed seed in flaky test (apache#11958)

* Remove fixed seed in flaky test

* Remove fixed seed in flaky test

Update ONNX docs with the latest supported ONNX version (apache#11936)

Reduced test to 3 epochs and made gpu only (apache#11863)

* Reduced test to 3 epochs and made GPU only

* Moved logger variable so that it's accessible

Fix flaky tests for test_laop_4 (apache#11972)

Updating R client docs (apache#11954)

* Updating R client docs

* Forcing build

Fix install instructions for MXNET-R (apache#11976)

* fix install instructions for MXNET-R

* fix install instructions for MXNET-R

* fix default cuda version for MXNet-R

[MXNET-751] fix ce_loss flaky (apache#11971)

* add xavier initializer

* remove comment line

[MXNET-769] set MXNET_HOME as base for downloaded models through base.data_dir() (apache#11636)

* set MXNET_DATA_DIR as base for downloaded models through base.data_dir()
push joblib to save containers so is not required when running

* MXNET_DATA_DIR -> MXNET_HOME

[MXNET-748] linker fixed on Scala issues (apache#11989)

* put force load back as a temporary solution

* use project.basedir as relative path for OSX linker

[MXNET-772] Re-enable test_module.py:test_module_set_params (apache#11979)

[MXNET-771] Fix Flaky Test test_executor.py:test_dot (apache#11978)

* use assert_almost_equal, increase rtol, reduce matrix size

* remove seed in test_bind

* add seed 0 to test_bind, it is still flaky

* add comments for tracking

remove mod from arity 2 version of load-checkpoint in clojure-package (apache#11808)

* remove mod from arity 2 version of load-checkpoint

* load-checkpoint arity 2 test

Add unit test stage for mxnet cpu in debug mode (apache#11974)

Website broken link fixes (apache#12014)

* fix broken link

* fix broken link

* switch to .md links

* fix broken link

removed seed from flaky test (apache#11975)

Disable ccache log print due to threadunsafety (apache#11997)

Added default tolerance levels for regression checks for MBCC (apache#12006)

* Added tolerance level for assert_almost_equal for MBCC

* Nudge to CI

Disable flaky mkldnn test_requantize_int32_to_int8 (apache#11748)

[MXNET-769] Usability improvements to windows builds (apache#11947)

* Windows scripted build
Adjust Jenkins builds to use ci/build_windows.py

Issues:

    apache#8714
    apache#11100
    apache#10166
    apache#10049

* Fix bug

* Fix non-portable ut

* add xunit

Fix import statement (apache#12005)

array and multiply are undefined. Importing them from
ndarray

Disable flaky test test_random.test_gamma_generator (apache#12022)

[MXNET-770] Fix flaky test: test_factorization_machine_module (apache#12023)

* Remove fixed seed in flaky test

* Remove fixed seed in flaky test

* Update random seed to reproduce the issue

* Fix Flaky unit test and add a training test

* Remove fixed seed in flaky test

* Update random seed to reproduce the issue

* Fix Flaky unit test and add a training test

* Increase accuracy check

disable opencv threading for forked process (apache#12025)

Bug fixes in control flow operators (apache#11942)

Fix data narrowing warning on graph_executor.cc (apache#11969)

Fix flaky tests for test_squared_hinge_loss (apache#12017)

Fix flaky tests for test_hinge_loss (apache#12020)

remove fixed seed for test_sparse_ndarray/test_operator_gpu.test_sparse_nd_pickle (apache#12012)

Removed fixed seed from , test_loss:test_ctc_loss_train (apache#11985)

Removed fixed seed from , test_loss:test_sample_weight_loss (apache#11986)

Fix reduce_kernel_M1 (apache#12026)

* Fix reduce_kernel_M1

* Improve test_norm

Update test_loss.py to remove fixed seed (apache#11995)

[MXNET-23] Adding support to profile kvstore server during distributed training  (apache#11215)

* server profiling

merge with master

cleanup old code

added a check and better info message

add functions for C compatibility

fix doc

lint fixes

fix compile issues

lint fix

build error

update function signatures to preserve compatibility

fix comments

lint

* add part1 of test

* add integration test

Re-enabling test_ndarray/test_cached (apache#11950)

Test passes on CPU and GPU (10000 runs)

make gluon rnn layers hybrid blocks (apache#11482)

* make Gluon RNN layer hybrid block

* separate gluon gpu tests

* remove excess assert_raises_cudnn_disabled usage

* add comments and refactor

* add bidirectional test

* temporarily remove hybridize in test_gluon_rnn.test_layer_fill_shape

[MXNET-751] fix bce_loss flaky (apache#11955)

* add fix to bce_loss

* add comments

* remove unecessary comments

Doc fix for a few optimizers (apache#12034)

* Update optimizer.py

* Update optimizer.py
XinYao1994 pushed a commit to XinYao1994/incubator-mxnet that referenced this pull request Aug 29, 2018
* Remove fixed seed in flaky test

* Remove fixed seed in flaky test
@apeforest deleted the test/fix_flaky_11706 branch on August 23, 2019 at 17:09