Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Papermill scalffold example #681

Merged

Conversation

jingyanwangms
Copy link
Collaborator

@jingyanwangms jingyanwangms commented Mar 26, 2019

Description

This noteboook provides a scaffold to directly submit an existing notebook to AzureML compute targets. Now a user doesn't have to rewrite the training script, instead, just by replacing file name, now user can submit notebook directly.

Notes:

  • I added line run_config_aml.environment.python.conda_dependencies.remove_conda_package("pip>=19.0.3") as workaround for what I think is a bug in azureml.core
  • All AzureML notebook cannot be smoke tested yet. Will track aml test infrastructure separately, see Issue 547

Related Issues

Issue 695 [BUG] Remove contrib from azureml

Checklist:

  • My code follows the code style of this project, as detailed in our contribution guidelines.
  • I have added tests.
  • I have updated the documentation accordingly.

@review-notebook-app
Copy link

Check out this pull request on ReviewNB: https://app.reviewnb.com/Microsoft/Recommenders/pull/681

Visit www.reviewnb.com to know how we simplify your Jupyter Notebook workflows.

Copy link
Collaborator

@miguelgfierro miguelgfierro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks really good @jingyanwangms. I have some questions and suggestions

myenv2.yaml Outdated Show resolved Hide resolved
notebooks/00_quick_start/run_notebook_on_aml.ipynb Outdated Show resolved Hide resolved
notebooks/00_quick_start/run_notebook_on_aml.ipynb Outdated Show resolved Hide resolved
notebooks/00_quick_start/run_notebook_on_aml.ipynb Outdated Show resolved Hide resolved
notebooks/00_quick_start/run_notebook_on_aml.ipynb Outdated Show resolved Hide resolved
notebooks/00_quick_start/run_notebook_on_aml.ipynb Outdated Show resolved Hide resolved
notebooks/00_quick_start/run_notebook_on_aml.ipynb Outdated Show resolved Hide resolved
notebooks/00_quick_start/run_notebook_on_aml.ipynb Outdated Show resolved Hide resolved
notebooks/00_quick_start/run_notebook_on_aml.ipynb Outdated Show resolved Hide resolved
notebooks/00_quick_start/run_notebook_on_aml.ipynb Outdated Show resolved Hide resolved
@miguelgfierro
Copy link
Collaborator

hey @jingyanwangms, this is really good, it is in fact one of the works that we need the most.

Question, what is the relation of this PR with this other one #575?

notebooks/00_quick_start/run_notebook_on_aml.ipynb Outdated Show resolved Hide resolved
notebooks/00_quick_start/run_notebook_on_aml.ipynb Outdated Show resolved Hide resolved
notebooks/00_quick_start/run_notebook_on_aml.ipynb Outdated Show resolved Hide resolved
notebooks/00_quick_start/run_notebook_on_aml.ipynb Outdated Show resolved Hide resolved
notebooks/00_quick_start/run_notebook_on_aml.ipynb Outdated Show resolved Hide resolved
Copy link
Collaborator

@yueguoguo yueguoguo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jingyanwangms just my two cents, it would be good to have some information in the notebook put at a place where all information about setting up Azure, benefits of Azure, etc. is found. I personally think the expectation of this notebook is about how to run a recommender quickstart/deepdive/hyper-tuning notebook on AzureML Service.

What do you think?

notebooks/00_quick_start/run_notebook_on_aml.ipynb Outdated Show resolved Hide resolved
notebooks/00_quick_start/run_notebook_on_aml.ipynb Outdated Show resolved Hide resolved
@@ -0,0 +1,290 @@
{
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think we should merge #728 first and then use the get_or_create_workspace function there to minimize use of scripts post-setup

Reply via ReviewNB

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm hoping this PR can make it into master with #720 . If #728 can get in with, then I will rebase to use get_or_create_workspace

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#728 is merged to staging

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rebased to latest staging

notebooks/00_quick_start/run_notebook_on_azureml.ipynb Outdated Show resolved Hide resolved
notebooks/00_quick_start/run_notebook_on_azureml.ipynb Outdated Show resolved Hide resolved
@gramhagen
Copy link
Collaborator

How are we differentiating this and the run_sar_on_aml notebook? there's some repetition between those two, so it's a little confusing if they are basically the same, or if one way is better than another? Personally I like having a separate notebook that will just execute the notebook as is, instead of having to integrate azureml into the notebook, but clarifying the two approaches (pros/cons) might help.

@heatherbshapiro
Copy link
Contributor

@gramhagen the benefit of this notebook is to show that you can submit any notebook as is to azureML without changing any code. the other notebook is an example of how you COULD change the code to be able to utilize items like data store, estimators, etc that are found in AzureML. I think this would be better suited in the notebooks folder rather than quickstart

@miguelgfierro
Copy link
Collaborator

awesome work @jingyanwangms!!!

@miguelgfierro
Copy link
Collaborator

mmm @gramhagen it looks there is a conflict between the changes you made notebooks/00_quick_start/sar_movielens_with_azureml.ipynb and this PR

Copy link
Contributor

@heatherbshapiro heatherbshapiro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still think that run based creation for the compute will take too long. If a user is submitting multiple notebooks to test out, it will take them over 10 minutes every time to create, prepare, and then finally train. I think we should either show both ways to do it, or only show the persistent compute so that if a user does it once, they can easily switch to a new notebook and it should be much faster.

@gramhagen
Copy link
Collaborator

yeah, one of us will need to go through and resolve those changes. I can take a stab at it tomorrow unless you'd rather do it @jingyanwangms ?

"![Experiment submit output](https://recodatasets.blob.core.windows.net/images/aml_papermill_cell_output.jpg)\n",
"After clicking \"Link to Azure Portal\", experiment run details tab looks like this\n",
"![Azure Portal Experiment](https://recodatasets.blob.core.windows.net/images/aml_papermill_portal.jpg)\n",
"Experiment output is out.ipynb. You can download and view this file locally.\n",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not the experiment output. The metrics are logged to the run, and the out.ipynb is the notebook that was initially submitted. I don't think we need the image

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just tried submitting a notebook without output and out.ipynb comes with output. I think out.ipynb is the notebook after the run

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just meant that the text is incorrect. It is not output from the experiment. They can see the notebook that was submitted to the experiment logged as out.ipynb

@jingyanwangms
Copy link
Collaborator Author

@heatherbshapiro I changed the notebook to use persistent compute

@jingyanwangms
Copy link
Collaborator Author

@gramhagen Rebased this pr to latest staging

@heatherbshapiro
Copy link
Contributor

Can we please move to the notebook folder rather than the quickstart?

@jingyanwangms
Copy link
Collaborator Author

@heatherbshapiro moved to notebooks/ and changed text as You can find the executed notebook out.ipynb under outputs tab.

@miguelgfierro miguelgfierro merged commit 427fefd into recommenders-team:staging Apr 15, 2019
@miguelgfierro miguelgfierro added this to the Papermill in AzureML milestone Apr 25, 2019
yueguoguo pushed a commit that referenced this pull request Sep 9, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants