ENH: Hyperband implementation #221
No major comments from me yet from a dask/computing perspective. I'm curious to see how dealing with blocks eventually gets folded in.
A few minor comments. Going through the dask / distributed stuff now.
Overall, it's neat how clean this ended up being.
I've got it working for both {sklearn, dask-ml} models and {dask, numpy} arrays, and it's tested for all possible combos. However, the tests are non-deterministic and fail about 5% of the time. There are two good tracebacks on the py27 CircleCI tests if you're interested. I also have a bunch of complicated logic to avoid communicating trained models. I'd like to avoid it by defining a …
Is this already implemented? I didn't see anything crazy complicated.
I recommend that for simplicity's sake we not include any dask code within any function that we call from a task. So I recommend not doing this if possible. Can you explain more about where you're calling …?
I'm taking a look at this now; it seems like we're still calling Dask from within Dask. My path so far has been the following: …
Anyway, I'd like to take the lock on this PR for a moment. I'll try to push something up soonish.
What is the right way to handle testing data? It looks like it can be a dask array. Should we pass the entire thing to every score call, or should we be iterating over it in chunks like with the training data?
OK, I've pushed a fix. Things are flattened down now so that no task generates other tasks. I made a few possibly-wrong decisions in this work, so applying a careful eye to those changes would be welcome. I tried to add a comment in places where I wasn't sure.
Ah – I'm starting to understand. We don't want to launch tasks from tasks, and calling dask code within a task would do that (and of course …). Correct me if I'm wrong: we want to avoid launching tasks within tasks because it's complicated. At the least, it's hard to debug and produces messy tracebacks. What other downsides are there?
Can I have some clarification on "iterating over it in chunks"? I can take this two ways. This algorithm is adaptive, and needs some intermediate feedback. We can base every adaptive decision on …

Looking at your implementation, it looks like both are easy to implement, so I don't think it matters. I am okay with 1 but cautious about 2, …
Can you expand upon this?
I only call …
This is what I'm doing. I was unsure if I could define a …
Thanks, and I'll give it a careful look over on Monday.
Yup, that's mostly it. I anticipate that it will be much more expensive to write and maintain. I also wouldn't be surprised if it had negative effects when other people use it in novel and surprising ways. Users will always add their own crazy schemes to whatever we do, so the fewer layers of complexity we add the better for everyone, especially when they come asking questions.
Given the rest of your explanation here it seems like you have a fair understanding of my question. It seems like you frequently score each model with some testing data. I didn't know if that should be the same testing data every time or if we had some different testing data for each epoch.
When writing code I find it very useful to be able to pause execution and look at the state of any function that's running. Typically I do this by adding the following line wherever in the code I'm interested, and then running a simple test: `import pdb; pdb.set_trace()`. When the test sets up separate processes this doesn't work (you can't use pdb across processes), but if the client, scheduler, and workers are all running in the same thread then it does. This is one of the major advantages of using async-style programming in tests. If you want to try this, add …
I see both approaches as appropriate. I don't see why dask code would need to be present in either case. In both cases we have scikit-learn models and we have numpy arrays, and we want to call various scikit-learn style functions on them, like `partial_fit` or `score`. I don't think that Dask needs to be involved for any of this work.
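Concretely, the functions we run inside tasks can be plain scikit-learn calls. A minimal sketch (the names here are illustrative, not from this PR):

```python
def _partial_fit(model, X, y, classes):
    # plain scikit-learn call: no dask code inside, no tasks launched
    model.partial_fit(X, y, classes=classes)
    return model

def _score(model, X_test, y_test):
    # likewise plain; blocks of a dask array arrive here as numpy arrays
    return model.score(X_test, y_test)
```

The client submits these with `client.submit` and keeps every adaptive decision on the driver side.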
Yeah, no rush. |
Sorry, I should have been more clear. From the docs, I now see the tests need to be async so they don't block – but how do you define async functions while still supporting Python 2.7? I'm a huge fan of debuggers and a frequent user of ipdb. And thank you for the tips on running with pytest – it makes complete sense that the debugger is launched, but I (naively) assumed pdb didn't work with pytest.
Normally we use Tornado coroutines, which are Python 2 compatible. In this case I used Python 3 style `async def` syntax.
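For example, a coroutine-style test might look like the following. This is a sketch using distributed's test utilities; the test body is illustrative:

```python
from distributed.utils_test import gen_cluster

@gen_cluster(client=True)
def test_example(c, s, a, b):
    # client, scheduler, and two workers all share one thread and event
    # loop, so `import pdb; pdb.set_trace()` works anywhere under test
    future = c.submit(lambda x: x + 1, 1)
    result = yield future
    assert result == 2
```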
Ah, I see you've already seen the writing tests doc. My apologies for the duplication. You may also want to take a look at http://distributed.readthedocs.io/en/latest/asynchronous.html (though I would ignore the asyncio sections and just focus on tornado).
I'd like to give this a shot on something slightly less trivial than the example in the test. @stsievert can you recommend something? Do you have a notebook or other example that you're testing on locally that isn't present here?
Takeaways from my review are in #221 (review) (GitHub's UI doesn't make this easy to flag). I'll update this comment with any updates.

Potential concerns: …

API changes: …

Test changes: …
* Reorganize tests to test IncrementalSearchCV for history_
* Reword HyperbandSearchCV warning on patience if too low
* Add comments to tests
* … and require specification of n_initial_iter
I think this is ready for merge now. The CI failures are unrelated (most are tracked in #517, and the Travis failures are on some unrelated documentation). I think I've resolved most of my concerns, and I'd like a review of my changes. A summary of my changes is in #221 (comment), which includes some API changes. A diff of the recent changes is at https://github.com/stsievert/dask-ml/compare/d17bdb3..070784d

Also, when this is merged I'd appreciate a squash and merge. My commit history is far from clean.
Sorry I missed your last comment @stsievert. I gave a quick glance through the recent diff, and things look nice. I'll give it another look tonight and merge if everything looks fine. And I think I'll do a release shortly after merging, so that we can use this in e.g. dask-examples. Sound good? If we find issues I can easily do another release.
+1 to putting this in dask-examples
Gave another run through, and things look quite nice @stsievert. Nice work!
All of that sounds good. And I think inclusion in dask-examples is a good idea. It was also suggested by a reviewer in scipy-conference/scipy_proceedings#464 (comment).
What does this implement?
This implements a state-of-the-art model selection algorithm, Hyperband. I give a concise summary in dask/dask-searchcv#72 (comment). Briefly, this algorithm only requires a computational budget as input; Hyperband will find close to the best possible parameters with the given computational budget.* It performs the search via an adaptive scheme that "kills" off poorly performing models. The intuition for this is simple: it makes no sense to waste computation on poorly performing models if the goal is to find the best performing model with as little computation as possible. More detail can be found in the paper (https://arxiv.org/abs/1603.06560); a rough sketch of the schedule is below.
This will only work with models that implement `partial_fit`. I do not see this as blocking because optimization is iterative, and I believe education and outreach will help. I aim to have this work for all classes that implement `score`, `partial_fit` and `{get, set}_params`.

* "will find" with high probability; "close" is within a log factor of the lower bound; "best" is the highest expected cross-validation score.
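For intuition, here is a rough sketch of the Hyperband schedule itself, not this PR's implementation; `sample_params` and `train_and_score` are illustrative stand-ins:

```python
import math
import random

def sample_params():
    # stand-in: draw one random hyperparameter configuration
    return {"alpha": 10 ** random.uniform(-4, -1)}

def train_and_score(params, n_iter):
    # stand-in: call partial_fit n_iter times, then return a validation score
    return random.random()

def hyperband(max_iter, eta=3):
    # max_iter: the most partial_fit iterations any one model may receive
    s_max = int(math.log(max_iter, eta))
    best = (-1.0, None)  # (score, params)
    for s in reversed(range(s_max + 1)):
        # one successive-halving bracket: start with n briefly trained models,
        # then repeatedly keep only the best 1/eta of them ("kill" the rest)
        n = int(math.ceil((s_max + 1) * eta ** s / (s + 1)))
        candidates = [sample_params() for _ in range(n)]
        for i in range(s + 1):
            n_iter = max(1, int(max_iter * eta ** (i - s)))
            scored = [(train_and_score(p, n_iter), p) for p in candidates]
            scored.sort(key=lambda t: t[0], reverse=True)
            best = max(best, scored[0], key=lambda t: t[0])
            candidates = [p for _, p in scored[: max(1, int(n * eta ** -(i + 1)))]]
    return best[1]
```

Hyperband runs several such brackets because it cannot know in advance whether many briefly trained models or a few thoroughly trained ones will win.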
Reference Issues/PRs
This addresses #161, and is an item on our beta roadmap (which has an issue: #210).
This is a refactor of dask/dask-searchcv#72 and depends on dask.distributed. The dependence on dask.distributed allows use of `as_completed`, which greatly simplifies the code and easily allows for an extension that makes this algorithm better suited for parallel architectures.
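To illustrate why `as_completed` helps, here is a minimal sketch of the pattern, assuming a local client; the stopping rule and budget are illustrative:

```python
import numpy as np
from distributed import Client, as_completed
from sklearn.linear_model import SGDClassifier

def _partial_fit(model, X, y):
    # plain function: no dask code inside, so tasks never launch tasks
    model.partial_fit(X, y, classes=[0, 1])
    return model

X = np.random.rand(100, 5)
y = np.random.randint(0, 2, size=100)
models = [SGDClassifier(alpha=10 ** -k) for k in range(1, 4)]

client = Client(processes=False)
futures = [client.submit(_partial_fit, m, X, y) for m in models]
budget = 10  # extra partial_fit calls we are willing to spend
seq = as_completed(futures)
for future in seq:
    model = future.result()
    # every adaptive decision happens on the client, never inside a task
    if budget > 0 and model.score(X, y) < 0.95:
        budget -= 1
        seq.add(client.submit(_partial_fit, model, X, y))
```

Because `as_completed` accepts new futures mid-iteration, models can be continued or dropped one at a time as results arrive, which is what makes the scheme parallel-friendly.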
TODO

* implement and test support for scikit-learn models (done in Incremental model selection #288)
* implement logging (Base class for estimators using _incremental.fit #356)
* `DaskBaseSearchCV` / `AdaptiveSearchCV` inheritance
* one example (previously two)
* feed random chunks to every partial_fit call (done in Incremental model selection #288)
* `SuccessiveHalving` brackets

Future work
Related work
* `partial_fit` support in sklearn's `cross_validate`. This will yield more complete cross-validation (e.g., number of splits).
* `partial_fit` support in `DaskBaseSearchCV`. All Hyperband is really doing is some fancy early stopping with `RandomizedSearchCV`. Providing this support will allow for caching results that are part of a pipeline.
* `partial_fit` support for sklearn pipelines. My example will rely on a simpler version of this PR.
* … `np.logspace(-4, -1)`, which will make specifying continuous parameters in log-space easier (e.g., step size or regularization parameters); see the sketch below.
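For example, log-spaced parameter values might be specified like this (a hypothetical usage, not part of this PR):

```python
import numpy as np

# 50 candidate values spaced evenly in log space between 1e-4 and 1e-1
param_space = {"alpha": np.logspace(-4, -1, num=50)}
```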