-
Notifications
You must be signed in to change notification settings - Fork 53
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ENH: Quantile Forest Support #384
ENH: Quantile Forest Support #384
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks so much for providing this fix. I haven't done a proper review yet. Would you mind adding a test for quantile forest estimators? It could be similar to the tests here. If you need help with that, let us know.
-Adds base unit tests -Patches constructor visualization error
Co-authored-by: Benjamin Bossan <BenjaminBossan@users.noreply.github.com>
skops/io/_visualize.py
Outdated
@@ -232,6 +232,9 @@ def walk_tree( | |||
"here: https://github.com/skops-dev/skops/issues" | |||
) | |||
|
|||
if node_name == "constructor": |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggested resolution for #385.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would include that in a separate PR.
@BenjaminBossan: Sure, I took a pass at adding a test for the quantile forest estimators. This also uncovered a bug in the visualization of sklearn DecisionTreeRegressors reported as #385, with a suggested resolution included in these changes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the nice PR. Other than the small nits, I think this looks good.
Might be time for us to start working on a better way to handle C extension types.
skops/io/_quantile_forest.py
Outdated
self.trusted = self._get_trusted( | ||
trusted, | ||
[get_module(QuantileForest) + ".QuantileForest"], | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd be happier if we don't trust them by default.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated.
skops/io/_visualize.py
Outdated
@@ -232,6 +232,9 @@ def walk_tree( | |||
"here: https://github.com/skops-dev/skops/issues" | |||
) | |||
|
|||
if node_name == "constructor": |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would include that in a separate PR.
skops/io/tests/test_external.py
Outdated
tree_methods = [ | ||
quantile_forest.RandomForestQuantileRegressor, | ||
quantile_forest.ExtraTreesQuantileRegressor, | ||
] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
move to @pytest.mark.parametrize
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated.
In favor of separate PR.
Uses pytest.mark.parametrize
@reidjohnson tests failing |
Failing due to #385. |
skops/io/_quantile_forest.py
Outdated
from __future__ import annotations | ||
|
||
from typing import Any, Sequence | ||
|
||
from ._protocol import PROTOCOL | ||
from ._sklearn import ReduceNode, reduce_get_state | ||
from ._utils import LoadContext, SaveContext | ||
|
||
|
||
def _lazy_import(): | ||
from quantile_forest._quantile_forest_fast import QuantileForest | ||
|
||
def quantile_forest_get_state( | ||
obj: Any, | ||
save_context: SaveContext, | ||
) -> dict[str, Any]: | ||
state = reduce_get_state(obj, save_context) | ||
state["__loader__"] = "QuantileForestNode" | ||
return state | ||
|
||
class QuantileForestNode(ReduceNode): | ||
def __init__( | ||
self, | ||
state: dict[str, Any], | ||
load_context: LoadContext, | ||
trusted: bool | Sequence[str] = False, | ||
) -> None: | ||
super().__init__( | ||
state, | ||
load_context, | ||
constructor=QuantileForest, | ||
trusted=trusted, | ||
) | ||
self.trusted = self._get_trusted(trusted, []) | ||
|
||
return QuantileForest, QuantileForestNode, quantile_forest_get_state | ||
|
||
|
||
try: | ||
QuantileForest, QuantileForestNode, quantile_forest_get_state = _lazy_import() | ||
|
||
# tuples of type and function that gets the state of that type | ||
GET_STATE_DISPATCH_FUNCTIONS = [(QuantileForest, quantile_forest_get_state)] | ||
|
||
# tuples of type and function that creates the instance of that type | ||
NODE_TYPE_MAPPING = {("QuantileForestNode", PROTOCOL): QuantileForestNode} | ||
except ModuleNotFoundError: | ||
GET_STATE_DISPATCH_FUNCTIONS = [] | ||
NODE_TYPE_MAPPING = {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you think of this instead? (slightly different)
from __future__ import annotations
from typing import Any, Sequence
from ._protocol import PROTOCOL
from ._sklearn import ReduceNode, reduce_get_state
from ._utils import LoadContext, SaveContext
try:
from quantile_forest._quantile_forest_fast import QuantileForest
except ImportError:
QuantileForest = None
def quantile_forest_get_state(
obj: Any,
save_context: SaveContext,
) -> dict[str, Any]:
state = reduce_get_state(obj, save_context)
state["__loader__"] = "QuantileForestNode"
return state
class QuantileForestNode(ReduceNode):
def __init__(
self,
state: dict[str, Any],
load_context: LoadContext,
trusted: bool | Sequence[str] = False,
) -> None:
if QuantileForest is None:
raise ImportError(
"`quantile_forest` is missing and needs to be installed in order to"
" load this model."
)
super().__init__(
state,
load_context,
constructor=QuantileForest,
trusted=trusted,
)
self.trusted = self._get_trusted(trusted, [])
# tuples of type and function that gets the state of that type
if QuantileForest is not None:
GET_STATE_DISPATCH_FUNCTIONS = [(QuantileForest, quantile_forest_get_state)]
# tuples of type and function that creates the instance of that type
NODE_TYPE_MAPPING = {("QuantileForestNode", PROTOCOL): QuantileForestNode}
|
||
|
||
class TestQuantileForest: | ||
"""Tests for RandomForestQuantileRegressor and ExtraTreesQuantileRegressor""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this not missing QuantileForest?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are the user-facing classes which are explicitly tested, QuantileForest
is the underlying Cython object shared by the user-facing classes and isn't explicitly tested here (but is serialized by the loading/dumping).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks a lot for adding the support. I have an extremely minor comment, but this is good to be merged from my point of view.
I just noticed this is missing a changelog entry. |
Updated, thanks for flagging. |
awesome work everyone, y'all are amazing! I really appreciate it! |
I finally got back to this effort and it seems like, even with the changes, I'm still seeing the following error when attempting to load a quantile forest model:
|
more specifically: |
@brentonmallen1 Just checking, but are you using the development build? I don't believe there's yet been a public release since these changes were merged. |
I was using the main release for 0.8.0 because the changelog had mentioned QuantileForest support, but I could have misread. I can try using the dev release and give it another go and report back. thank you |
well, I don't see the quantile changes in my site-packages so it does seem like I haven't installed what I was expecting. I also don't see a dev branch/build, how would I go about installing that directly? |
@brentonmallen1 I'll defer to the maintainers, but a viable route is to follow the dev setup described in the contributing doc. |
We just released 0.9.0, which includes this change. |
y'all are wonderful! thank you! |
Fixes #383.
The quantile-forest package extends sklearn random forest regressors to implement quantile regression forests. Without these changes, the Cython-based model object is unable to be loaded by skops due to its use of the
__reduce__
method.I understand adding support for another package may be outside the current scope of this project. As the primary maintainer of the quantile-forest package, I'd also be interested in better understanding if there's a recommended alternative to the use of
__reduce__
that would enable the model to be compatible with this project's existing secure persistence feature.