DNN random seed (NCF, xDeepFM, W&D) #785

Merged
merged 12 commits into from
May 22, 2019

loomlike
Collaborator

Description

Refactor NCF (dataset and model) and W&D (input_fns) to accept a random seed.
Update tests accordingly.

Related Issues

#736

Checklist:

  • My code follows the code style of this project, as detailed in our contribution guidelines.
  • I have added tests.
  • I have updated the documentation accordingly.

@review-notebook-app

Check out this pull request on ReviewNB: https://app.reviewnb.com/microsoft/recommenders/pull/785

Collaborator

@miguelgfierro miguelgfierro left a comment

this is really good @loomlike, see my comments

reco_utils/common/gpu_utils.py
import tensorflow as tf
from tensorflow.python.estimator.inputs.queues import feeding_functions
from tensorflow.python.estimator.inputs.numpy_io import (
_get_unique_target_key,
Collaborator

This hurts a little bit lol, but I guess it's not our fault. I guess there are no public functions that we can use, right?

Collaborator Author

Ah... yeah. Not sure why they did not expose the seed argument in the numpy input function.
If we really don't want to do this, an alternative is to shuffle the data ourselves, like:

def numpy_input_fn(x, ...):
    ...
    if shuffle and seed is not None:
        # set the random seed and shuffle x ourselves
        ...
        # then call TF's numpy_input_fn with the already shuffled x
        return tf.estimator.inputs.numpy_input_fn(x)

I actually like this solution. Much cleaner.

Collaborator Author

Ah... the thing is, with this we cannot reshuffle for every epoch... hmmm
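
The limitation comes from shuffling the data only once before handing it to the input function. Below is a minimal sketch of one way to get a different but still reproducible order per epoch, assuming we control the epoch loop ourselves. `seeded_epochs` is a hypothetical helper, not part of reco_utils or TensorFlow, and it uses only the Python standard library to keep the idea self-contained.

```python
import random


def seeded_epochs(rows, num_epochs, seed):
    """Yield `rows` in a freshly shuffled order for each epoch.

    One RNG is seeded once and then advanced, so each epoch sees a
    different order while the whole sequence of orders is reproducible
    for a given seed. (Hypothetical helper for illustration only.)
    """
    rng = random.Random(seed)  # private RNG; global random state is untouched
    indices = list(range(len(rows)))
    for _ in range(num_epochs):
        rng.shuffle(indices)  # reshuffle in place for this epoch
        yield [rows[i] for i in indices]


if __name__ == "__main__":
    data = ["a", "b", "c", "d", "e"]
    run1 = list(seeded_epochs(data, num_epochs=3, seed=42))
    run2 = list(seeded_epochs(data, num_epochs=3, seed=42))
    # Two runs with the same seed produce the same sequence of epoch orders.
    print(run1 == run2)
```

The same pattern should carry over to NumPy arrays by drawing a seeded index permutation per epoch, though that variant is not shown here.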

Collaborator Author

I think 'tf.set_random_seed' will work for this case. Let me try it, and I'll update the code if it works.

Collaborator Author

Refactored to use tf.data.Dataset (the newer API) within our pandas_input_fn, and it is now much cleaner.

reco_utils/common/tf_utils.py
tests/integration/test_notebooks_gpu.py
@loomlike, hope you don't mind me committing directly to your branch. I think it was faster to just remove the line than to tell you to do it :-)
Collaborator

@miguelgfierro miguelgfierro left a comment

LGTM, probably we want to merge this before the merge to master

@miguelgfierro miguelgfierro merged commit a8517e9 into staging May 22, 2019
@miguelgfierro miguelgfierro deleted the jumin/dnn-random-seed branch May 22, 2019 15:33
yueguoguo pushed a commit that referenced this pull request Sep 9, 2019
DNN random seed (NCF, xDeepFM, W&D)