Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update xDeepFM to work with random seed #748

Merged
merged 4 commits into from
Apr 23, 2019
Merged

Conversation

loomlike
Copy link
Collaborator

Description

Changes include:

  • Refactor xDeepFM to fully use the seed
  • Update xdeepfm smoke test
  • Add xdeepfm unit and integration tests
  • Change xDeepFM quickstart notebook name to xdeepfm_criteo
  • Update the notebook contents

Related Issues

#736

Checklist:

  • My code follows the code style of this project, as detailed in our contribution guidelines.
  • I have added tests.
  • I have updated the documentation accordingly.

Refactor xDeepFM to fully use the seed
Update xdeepfm smoke test
Add xdeepfm unit and integration tests
Change xDeepFM quickstart notebook name to xdeepfm_criteo
Update the notebook contents
@review-notebook-app
Copy link

Check out this pull request on ReviewNB: https://app.reviewnb.com/Microsoft/Recommenders/pull/748

Visit www.reviewnb.com to know how we simplify your Jupyter Notebook workflows.

@@ -14,7 +12,7 @@


class BaseModel(object):
def __init__(self, hparams, iterator_creator, graph=None, seed=42):
def __init__(self, hparams, iterator_creator, graph=None, seed=None):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why None here? in the rest of the notebooks we have 42

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The default seed should be the same as the functions' that we use in the module, which is None. See, e.g., tensorflow.

This follows the rational that if you don't pass the seed explicitly, the behavior should be 'random' (produce different results every time you run it).
e.g. Python's random behavior:

random.seed(a=None, version=2): If a is omitted or None, the current system time is used.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if this is the defualt behavior we want to follow, then we should have the same idea in all the other algos, which should be changed. Do we want to follow this idea instead of fixing the seed? @anargyri @yueguoguo @gramhagen

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is a good practice to use seed=None for algo modules, and set seed explicitly from example notebooks.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This sounds reasonable, i guess right now you would have to set the seed to None to actually get random values, which may be unexpected. I think we should adopt this as it matches approaches from other libraries and create a separate ticket to change the notebooks / code / tests as needed.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in numpy the default is None: https://docs.scipy.org/doc/numpy/reference/generated/numpy.random.seed.html#numpy.random.seed, so I think it makes sense to change all to None

@loomlike do you want to homogenize None in all the utilities and then set 42 explicitly in the notebooks? Not sure if on this PR or in a different want

Copy link
Collaborator Author

@loomlike loomlike Apr 23, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@miguelgfierro Created a separate issue to handle them (#753)

@loomlike loomlike requested a review from gramhagen April 19, 2019 14:51
@pytest.mark.gpu
def test_ncf_deep_dive(notebooks):
def test_ncf_deep_dive_smoke(notebooks):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why change the all test names? they're already in a separate folder and marked as smoke ?

also, here we want to keep @pytest.make.notebooks as well right? hmm, odd we don't have notebooks marked on the rest of the tests ?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also, can we reduce the tolerance levels now?

Copy link
Collaborator Author

@loomlike loomlike Apr 19, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed the name to be the same as the rest of the tests in the module.
Don't know who started put "_smoke" there first. haha
I kinda like it, because when it fails, it is easy to read from the pytest log what's failing, unit vs smoke vs integration.

Regarding the tolerance levels, I believe NNs models have been using their own numbers (hard-coded), correct me if I'm wrong, while the others use TOL and ABS_TOL constants. So, we can leave the numbers as it is. But, actually, I want to use TOL only, not ABS_TOL, since different metrics' range could be different for each other and thus using absolute tolerance doesn't make sense (TOL is relative).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we agree to not use ABS_TOL, we can create a separate issue to take care of that.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, makes sense as long as they're consistently named

yeah, whatever tolerance is appropriate is fine, but it would be good to be small enough to detect an potential error, vs random noise now that that is fixed. fine with creating a separate issue for that

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, I was wondering about notebooks mark too. Integration and smoke tests don't have that mark. Maybe because we don't use notebooks mark when we run smoke and integration tests?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, that's true, for nightly smoke and integration tests we run 3 versions: spark, gpu, base, there's no differentiation for notebooks so I think it's fine to remove it.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we have to be carefull with ABS_TOL, this was added because rel_tol was not enough when testing very small metric results (in the order of 0.001 or so) I wouldn't remove it unless we are sure that these problems are gone

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, yeah. Make sense.

Since we don't use that mark.
@loomlike loomlike merged commit 8feba89 into staging Apr 23, 2019
@loomlike loomlike deleted the jumin/xdeepfm-random-seed branch April 23, 2019 19:04
yueguoguo pushed a commit that referenced this pull request Sep 9, 2019
* Update xDeepFM to work with random seed
    - Refactor xDeepFM to fully use the seed
    - Update xdeepfm smoke test
    - Add xdeepfm unit and integration tests
    - Remove pytest mark 'deeprec' 
* Change xDeepFM quickstart notebook name to xdeepfm_criteo
    - Update README xDeepFM notebook path
    - Update README xDeepFM path and description
* Update the notebook contents
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants