@parrt @tlapusan Looks like there is a bug with extract_params_from_pipeline(). Try running dtreeviz_sklearn_pipeline_visualisations.ipynb in the dev branch.
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
Cell In[10], line 1
----> 1 tree_classifier, X_train, features_model = extract_params_from_pipeline(
2 pipeline=model,
3 X_train=dataset[features],
4 feature_names=features)
File ~/dtreeviz/dtreeviz/utils.py:192, in extract_params_from_pipeline(pipeline, X_train, feature_names)
186 tree_model = pipeline.steps[-1][1]
188 feature_names = _extract_final_feature_names(
189 pipeline=pipeline,
190 features=feature_names
191 )
--> 192 X_train = pd.DataFrame(
193 data=pipeline[:-1].transform(X_train),
194 columns=feature_names
195 )
196 return tree_model, X_train, feature_names
File ~/.venvs/dtreeviz/lib64/python3.11/site-packages/pandas/core/frame.py:721, in DataFrame.__init__(self, data, index, columns, dtype, copy)
711 mgr = dict_to_mgr(
712 # error: Item "ndarray" of "Union[ndarray, Series, Index]" has no
713 # attribute "name"
(...)
718 typ=manager,
719 )
720 else:
--> 721 mgr = ndarray_to_mgr(
722 data,
723 index,
724 columns,
725 dtype=dtype,
726 copy=copy,
727 typ=manager,
728 )
730 # For data is list-like, or Iterable (will consume into list)
731 elif is_list_like(data):
File ~/.venvs/dtreeviz/lib64/python3.11/site-packages/pandas/core/internals/construction.py:349, in ndarray_to_mgr(values, index, columns, dtype, copy, typ)
344 # _prep_ndarraylike ensures that values.ndim == 2 at this point
345 index, columns = _get_axes(
346 values.shape[0], values.shape[1], index=index, columns=columns
347 )
--> 349 _check_values_indices_shape_match(values, index, columns)
351 if typ == "array":
353 if issubclass(values.dtype.type, str):
File ~/.venvs/dtreeviz/lib64/python3.11/site-packages/pandas/core/internals/construction.py:420, in _check_values_indices_shape_match(values, index, columns)
418 passed = values.shape
419 implied = (len(index), len(columns))
--> 420 raise ValueError(f"Shape of passed values is {passed}, indices imply {implied}")
ValueError: Shape of passed values is (891, 16), indices imply (891, 5)
@mepland I ran the notebook on both master and dev and it worked, but I was running sklearn version 1.1.3, and I assume you have the latest version, 1.2.0.
In version 1.1.3 the method get_feature_names was deprecated but still available. It is no longer supported in 1.2.0, and this causes the problem.
As a first debugging step, I replaced
    hasattr(component, 'get_feature_names'):
with
    hasattr(component, 'get_feature_names_out'):
but we still got an error. This is because now (version 1.2.0) component[0] also has the 'get_feature_names_out' attribute. I fixed it by using an 'elif':
    for component in pipeline[:-1]:
        if hasattr(component, 'get_support'):
            features = [f for f, s in zip(features, component.get_support()) if s]
        elif hasattr(component, 'get_feature_names_out'):
            features = component.get_feature_names_out(features)
It works, but @windisch, you know pipelines in more depth and can better judge whether this is a good fix and whether it will apply to other pipelines. Thanks.
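To illustrate why the 'elif' matters, here is a minimal, dependency-free sketch. FakeSelector, FakeEncoder, and final_feature_names are hypothetical stand-ins written for this example, not dtreeviz or scikit-learn code. It assumes that a selector-like step exposes both get_support and get_feature_names_out, as described above, and that the two checks would otherwise run as independent 'if' statements.

```python
class FakeSelector:
    """Hypothetical stand-in for a selector step that exposes BOTH methods."""

    def __init__(self, mask):
        self.mask = mask

    def get_support(self):
        return self.mask

    def get_feature_names_out(self, input_features):
        # Mimics a length check against the number of features seen at fit time.
        if len(input_features) != len(self.mask):
            raise ValueError("input_features has the wrong length")
        return [f for f, s in zip(input_features, self.mask) if s]


class FakeEncoder:
    """Hypothetical stand-in for an encoder that expands each column."""

    def __init__(self, categories):
        self.categories = categories

    def get_feature_names_out(self, input_features):
        return [f"{f}_{c}" for f in input_features for c in self.categories]


def final_feature_names(components, features):
    """Track feature names through each step, using at most one branch per step."""
    for component in components:
        if hasattr(component, "get_support"):
            # Selector-like step: keep only the supported columns.
            features = [f for f, s in zip(features, component.get_support()) if s]
        elif hasattr(component, "get_feature_names_out"):
            # Any other transformer: ask it for its output names.
            features = list(component.get_feature_names_out(features))
    return features


steps = [FakeSelector([True, False, True]), FakeEncoder(["a", "b"])]
names = final_feature_names(steps, ["age", "fare", "sex"])
print(names)  # ['age_a', 'age_b', 'sex_a', 'sex_b']
```

With two independent 'if' checks, the selector step in this sketch would be filtered through get_support and then asked for get_feature_names_out with the already shortened list, which raises the ValueError in the stand-in class. Whether real scikit-learn selectors fail in exactly this way is not confirmed here.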