fixing access to mml spark repo access from databricks_install in mml… #714
Conversation
…spark_lightgbm_criteo notebook
```diff
@@ -80,7 +80,7 @@
     " # get the maven coordinates for MML Spark from databricks_install script\n",
     " from scripts.databricks_install import MMLSPARK_INFO\n",
     " packages = [MMLSPARK_INFO[\"maven\"][\"coordinates\"]]\n",
-    " repo = MMLSPARK_INFO[\"maven\"][\"repo\"]\n",
+    " repo = MMLSPARK_INFO[\"maven\"].get(\"repositories\", None)\n",
```
Good catch. However, we have to use `.get("repo", None)`; if not, the Spark function will break if we use something similar to what we had originally, where there were only coordinates.
I wanted to force an error in that case; should we carry forward if they get out of sync?
Also, fwiw, if you use `dict.get()`, `None` is automatically provided as the default. reference
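A quick sketch of that point (not from the PR itself): omitting the second argument to `dict.get()` is equivalent to passing `None` explicitly.

```python
# A dictionary shaped like the old MMLSPARK_INFO["maven"] entry (only coordinates)
d = {"coordinates": "Azure:mmlspark:0.16"}

print(d.get("repositories"))        # None (implicit default for a missing key)
print(d.get("repositories", None))  # None (explicit default, equivalent)
```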
> if you use dict.get(), None is automatically provided as the default.

Yeah, so the idea is that if we are using the old dictionary, which doesn't have repo, the spark loader doesn't fail:

```python
MMLSPARK_INFO = {"maven": {"coordinates": "Azure:mmlspark:0.16"}}
```

I guess that the dictionary we have now is temporary:

```python
MMLSPARK_INFO = {"maven": {"coordinates": "com.microsoft.ml.spark:mmlspark_2.11:0.16.dev8+2.g6a5318b",
                           "repositories": "https://mmlspark.azureedge.net/maven"}
                 }
```
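To illustrate the difference between the two dictionary shapes (a minimal sketch; `get_repo` is a hypothetical helper name, and the dictionary values are quoted from this thread):

```python
# Old shape: only maven coordinates, no repository entry
old_info = {"maven": {"coordinates": "Azure:mmlspark:0.16"}}

# New (temporary) shape: coordinates plus an explicit repository
new_info = {
    "maven": {
        "coordinates": "com.microsoft.ml.spark:mmlspark_2.11:0.16.dev8+2.g6a5318b",
        "repositories": "https://mmlspark.azureedge.net/maven",
    }
}

def get_repo(info):
    # .get() returns None when "repositories" is absent, instead of raising KeyError
    return info["maven"].get("repositories", None)

print(get_repo(old_info))  # None
print(get_repo(new_info))  # https://mmlspark.azureedge.net/maven

# Direct indexing, by contrast, fails on the old shape:
try:
    old_info["maven"]["repositories"]
except KeyError:
    print("KeyError on the old dictionary shape")
```

This is why the `.get()` form keeps the notebook working even if the repository entry is dropped from the dictionary later.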
> i wanted to force an error in that case

I don't quite understand what you are thinking; can you elaborate?
I see, so you're just saying we shouldn't bother updating this notebook if we drop the repo piece in the future (which we're planning to do). I'm fine with that.
It happened again that the tests haven't triggered; this is weird.
I pushed several PRs in close succession. I wonder if there's a limit on the queue of pending tests, or some kind of time limit before the test status is no longer accessible?
change to get repo to limit changes when maven coordinates change
Description
Fixing an issue pulling in the maven repository link in the mmlspark lightgbm notebook.
I'm still seeing issues locally on Windows, but we should try this fix on a Windows DSVM, because my environment is probably not very clean.
Related Issues
#709
Checklist: