
Sequentialrecsys #1010

Merged — 5 commits merged into staging on Dec 18, 2019

Conversation

@Leavingseason (Collaborator)

Description

In this update we add four sequential models to deeprec:

  • ASVD (a non-sequential baseline for comparison)
  • GRU4Rec (an RNN-based sequential model)
  • Caser (a CNN-based sequential model)
  • SLi-Rec (a time-aware RNN-based sequential model, published at IJCAI'19 by MSRA)

We provide a Jupyter notebook in quick_start. For demonstration we use a public dataset, the Amazon review dataset; the quick start notebook downloads it automatically, so there is no need for us to host the dataset. We also provide unit tests and smoke tests for the sequential models.

Sequential recommenders are a class of recommender models of increasing importance. This update aims to equip the repo with sequential recommender models.
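To make concrete what "sequential" means here, a minimal illustrative sketch (all names and data are made up; this is not the reco_utils API): a first-order Markov next-item baseline that exploits the order of a user's interactions, which a non-sequential model such as ASVD ignores.

```python
from collections import Counter, defaultdict

def fit_transitions(histories):
    """Count item -> next-item transitions across user interaction sequences."""
    trans = defaultdict(Counter)
    for seq in histories:
        for prev, nxt in zip(seq, seq[1:]):
            trans[prev][nxt] += 1
    return trans

def predict_next(trans, last_item):
    """Recommend the most frequent follower of the user's last item."""
    followers = trans.get(last_item)
    return followers.most_common(1)[0][0] if followers else None

# Toy interaction histories (item IDs are arbitrary strings).
histories = [["a", "b", "c"], ["a", "b", "d"], ["x", "b", "c"]]
trans = fit_transitions(histories)
print(predict_next(trans, "b"))  # "c" follows "b" twice, "d" once -> "c"
```

The models in this PR (GRU4Rec, Caser, SLi-Rec) learn far richer sequence representations, but the underlying task is the same: predict the next item from the ordered history.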

Related Issues

Checklist:

  • I have followed the contribution guidelines and code style for this project.
  • I have added tests covering my contributions.
  • I have updated the documentation accordingly.


@miguelgfierro (Collaborator)

There is an error in the GPU unit test:

tests/unit/test_notebooks_gpu.py ....F.                                  [100%]

=================================== FAILURES ===================================
_________________________________ test_xdeepfm _________________________________

notebooks = {'als_deep_dive': '/data/home/recocat/agent/_work/5/s/notebooks/02_model/als_deep_dive.ipynb', 'als_pyspark': '/data/h...pynb', 'cornac_bpr_deep_dive': '/data/home/recocat/agent/_work/5/s/notebooks/02_model/cornac_bpr_deep_dive.ipynb', ...}

    @pytest.mark.notebooks
    @pytest.mark.gpu
    def test_xdeepfm(notebooks):
        notebook_path = notebooks["xdeepfm_quickstart"]
        pm.execute_notebook(
            notebook_path,
            OUTPUT_NOTEBOOK,
            kernel_name=KERNEL_NAME,
            parameters=dict(
                EPOCHS_FOR_SYNTHETIC_RUN=1,
                EPOCHS_FOR_CRITEO_RUN=1,
                BATCH_SIZE_SYNTHETIC=128,
>               BATCH_SIZE_CRITEO=512,
            ),
        )

tests/unit/test_notebooks_gpu.py:69: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
/data/anaconda/envs/reco_gpu/lib/python3.6/site-packages/papermill/execute.py:100: in execute_notebook
    raise_for_execution_errors(nb, output_path)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

nb = {'cells': [{'cell_type': 'code', 'metadata': {'inputHidden': True, 'hide_input': True}, 'execution_count': None, 'sour...nd_time': '2019-12-13T03:56:00.431270', 'duration': 12.523554, 'exception': True}}, 'nbformat': 4, 'nbformat_minor': 2}
output_path = 'output.ipynb'

    def raise_for_execution_errors(nb, output_path):
        """Assigned parameters into the appropriate place in the input notebook
    
        Parameters
        ----------
        nb : NotebookNode
           Executable notebook object
        output_path : str
           Path to write executed notebook
        """
        error = None
        for cell in nb.cells:
            if cell.get("outputs") is None:
                continue
    
            for output in cell.outputs:
                if output.output_type == "error":
                    error = PapermillExecutionError(
                        exec_count=cell.execution_count,
                        source=cell.source,
                        ename=output.ename,
                        evalue=output.evalue,
                        traceback=output.traceback,
                    )
                    break
    
        if error:
            # Write notebook back out with the Error Message at the top of the Notebook.
            error_msg = ERROR_MESSAGE_TEMPLATE % str(error.exec_count)
            error_msg_cell = nbformat.v4.new_code_cell(
                source="%%html\n" + error_msg,
                outputs=[
                    nbformat.v4.new_output(output_type="display_data", data={"text/html": error_msg})
                ],
                metadata={"inputHidden": True, "hide_input": True},
            )
            nb.cells = [error_msg_cell] + nb.cells
            write_ipynb(nb, output_path)
>           raise error
E           papermill.exceptions.PapermillExecutionError: 
E           ---------------------------------------------------------------------------
E           Exception encountered at "In [9]":
E           ---------------------------------------------------------------------------
E           ValueError                                Traceback (most recent call last)
E           <ipython-input-9-1a477a30c87f> in <module>
E           ----> 1 model.fit(train_file, valid_file)
E           
E           /data/home/recocat/agent/_work/5/s/reco_utils/recommender/deeprec/models/base_model.py in fit(self, train_file, valid_file, test_file)
E               420             for batch_data_input in self.iterator.load_data_from_file(train_file):
E               421                 step_result = self.train(train_sess, batch_data_input)
E           --> 422                 (_, step_loss, step_data_loss, summary) = step_result
E               423                 if self.hparams.write_tfevents:
E               424                     self.writer.add_summary(summary, step)
E           
E           ValueError: too many values to unpack (expected 4)

@Leavingseason, one question, can you access this https://dev.azure.com/best-practices/recommenders/_build/results?buildId=18584 and see all the logs, etc?
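For context on the ValueError in the log: fit() unpacks the result of train() into exactly four variables, so the error suggests train() is returning more elements than fit() expects. The failure mode can be reproduced in isolation; a minimal sketch with hypothetical stand-ins (not the actual reco_utils code):

```python
def train(sess, batch_data_input):
    # Hypothetical stand-in for BaseModel.train: suppose it was changed to
    # return a fifth element alongside the original four.
    return ("update_op", 0.52, 0.48, "summary_proto", "extra_metric")

step_result = train(None, None)
try:
    # This is the 4-way unpacking done in fit(); a 5-tuple makes it fail.
    (_, step_loss, step_data_loss, summary) = step_result
except ValueError as err:
    print(err)  # too many values to unpack (expected 4)
```

So the fix is to keep the return arity of train() and the unpacking in fit() in sync (or unpack with a trailing `*rest` if extra values are intentional).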

@miguelgfierro (Collaborator) left a review comment:

this is absolutely awesome

@miguelgfierro (Collaborator)

miguelgfierro commented Dec 17, 2019

@Leavingseason, feel free to merge when you think it is finished. After this is merged, I'll start working on the 4 deep dives #1013

@Leavingseason (Collaborator, Author)

> @Leavingseason, feel free to merge when you think it is finished. After this is merged, I'll start working on the 4 deep dives #1013

Unfortunately I am not authorized to merge this pull request... @yueguoguo @anargyri @gramhagen, do you have any comments?

@miguelgfierro miguelgfierro merged commit d64ebb2 into staging Dec 18, 2019
@miguelgfierro miguelgfierro deleted the sequentialrecsys branch December 18, 2019 10:57