ENH permutation importance #142

Merged
merged 84 commits into from
Jan 30, 2023
Changes from 10 commits
Commits
18ef31b
permutation importance
merveenoyan Sep 16, 2022
2ff7a5c
Update examples/plot_model_card.py
merveenoyan Sep 29, 2022
5b8d4c9
Update examples/plot_model_card.py
merveenoyan Nov 7, 2022
ae99b89
Merge branch 'main' into feature_importance
merveenoyan Nov 7, 2022
e80359d
added test and got rid of pandas
merveenoyan Nov 7, 2022
4ef3549
change import
merveenoyan Nov 7, 2022
133cc2e
fixes
merveenoyan Nov 7, 2022
1c448bc
fixes
merveenoyan Nov 7, 2022
95ad03b
updated docs & more
merveenoyan Nov 7, 2022
2bc714f
docs
merveenoyan Nov 7, 2022
b457fb9
added another test, updated docs, will add to model card rst
merveenoyan Nov 8, 2022
7228e83
removed unnecessary files
merveenoyan Nov 8, 2022
9471c76
added importance to model card guide
merveenoyan Nov 8, 2022
48c656d
moved filepaths to tempfile
merveenoyan Nov 8, 2022
a514060
moved filepaths to tempfile
merveenoyan Nov 8, 2022
ac699c5
test windows fix
merveenoyan Nov 8, 2022
76e5b0e
added types
merveenoyan Nov 8, 2022
baccd1b
Update skops/card/_model_card.py
merveenoyan Nov 9, 2022
d3d0c1c
Update skops/card/_model_card.py
merveenoyan Nov 9, 2022
2489693
added matplotlib mock and mock test
merveenoyan Nov 10, 2022
a7ba718
fixed test
merveenoyan Nov 17, 2022
1735a01
forgot to commit this lol
merveenoyan Nov 17, 2022
8dd9692
change type
merveenoyan Nov 17, 2022
1462ac2
Merge branch 'main' into feature_importance
merveenoyan Nov 22, 2022
510d41b
added error and tests
merveenoyan Nov 22, 2022
583c16d
Merge branch 'feature_importance' of github.com:merveenoyan/skops int…
merveenoyan Nov 22, 2022
2828ae2
Merge branch 'main' into feature_importance
merveenoyan Nov 22, 2022
97ebde9
fix for windows tests
merveenoyan Nov 22, 2022
ce75f12
merger
merveenoyan Nov 22, 2022
dd6e7aa
fix for windows tests
merveenoyan Nov 22, 2022
fcc16dd
fix for windows tests
merveenoyan Nov 22, 2022
7fbaf89
fix for windows tests
merveenoyan Nov 22, 2022
a3435b8
fix for windows tests
merveenoyan Nov 23, 2022
abba95b
Merge branch 'main' into feature_importance
merveenoyan Nov 24, 2022
d0adddf
swapped with path
merveenoyan Nov 24, 2022
003544f
added import_or_raise
merveenoyan Nov 29, 2022
1da4747
changed pre-commit config
merveenoyan Nov 29, 2022
280c602
changed pre-commit config
merveenoyan Nov 29, 2022
1521444
black
merveenoyan Nov 29, 2022
61d1784
fixed test
merveenoyan Nov 29, 2022
35b5196
Merge branch 'main' into feature_importance
merveenoyan Nov 29, 2022
5e5d6b3
trigger CI
merveenoyan Nov 29, 2022
2bf6c81
minor fix after merge conflict
merveenoyan Nov 29, 2022
0465163
changed test
merveenoyan Nov 29, 2022
8bf590e
latest version
merveenoyan Dec 14, 2022
64996fb
fix, but no idea why
adrinjalali Dec 14, 2022
298f6fb
Merge branch 'main' into feature_importance
merveenoyan Dec 15, 2022
e03d30f
minor try
merveenoyan Jan 6, 2023
c9b8813
trigger ci
merveenoyan Jan 6, 2023
a71cdac
revert
merveenoyan Jan 6, 2023
9e63a42
mypy fix
merveenoyan Jan 16, 2023
ec82e47
merge main
merveenoyan Jan 16, 2023
e33e254
fixed test and fixture
merveenoyan Jan 17, 2023
da185f5
Merge branch 'main' into feature_importance
merveenoyan Jan 17, 2023
d66be43
fix
merveenoyan Jan 17, 2023
b183d9c
Merge branch 'feature_importance' of github.com:merveenoyan/skops int…
merveenoyan Jan 17, 2023
ff9c7bf
removed redundant fixture
merveenoyan Jan 17, 2023
3a93880
Merge branch 'main' into feature_importance
merveenoyan Jan 20, 2023
bdeabc4
trigger black
merveenoyan Jan 20, 2023
e999a6f
trigger black
merveenoyan Jan 20, 2023
c9219d5
trigger black
merveenoyan Jan 20, 2023
bb864a6
trigger isort
merveenoyan Jan 20, 2023
9113aad
Update skops/card/_model_card.py
merveenoyan Jan 23, 2023
5e80732
Update skops/card/_model_card.py
merveenoyan Jan 23, 2023
322ed4d
Update skops/card/_model_card.py
merveenoyan Jan 23, 2023
e5afe26
Update skops/card/_model_card.py
merveenoyan Jan 23, 2023
95189d8
Update skops/card/_model_card.py
merveenoyan Jan 23, 2023
55b968d
Update skops/card/_model_card.py
merveenoyan Jan 23, 2023
6e3fd2b
removed test, nits and more
merveenoyan Jan 23, 2023
9d98003
Update skops/card/_model_card.py
merveenoyan Jan 23, 2023
17e0253
Update skops/card/_model_card.py
merveenoyan Jan 23, 2023
2b4df99
Merge branch 'main' into feature_importance
merveenoyan Jan 24, 2023
cf18588
iterated
merveenoyan Jan 25, 2023
7a02cd4
added print to debug on ubuntu
merveenoyan Jan 29, 2023
602b8d2
more debugging
merveenoyan Jan 29, 2023
c23f63f
more debugging
merveenoyan Jan 29, 2023
5407351
Merge branch 'skops-dev:main' into feature_importance
merveenoyan Jan 29, 2023
e046f88
removed debug
merveenoyan Jan 29, 2023
a16d83f
removed debugging line from github workflow
merveenoyan Jan 29, 2023
fce9b14
removed mypy ignores
merveenoyan Jan 30, 2023
1281274
Merge branch 'main' into feature_importance
merveenoyan Jan 30, 2023
01a5d78
removed mypy ignores
merveenoyan Jan 30, 2023
77f9f8c
removed mypy ignores
merveenoyan Jan 30, 2023
7c70656
merge local
merveenoyan Jan 30, 2023
6 changes: 6 additions & 0 deletions examples/plot_model_card.py
@@ -20,6 +20,7 @@
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import HistGradientBoostingClassifier
from sklearn.experimental import enable_halving_search_cv  # noqa
from sklearn.inspection import permutation_importance
from sklearn.metrics import (
    ConfusionMatrixDisplay,
    accuracy_score,
@@ -148,6 +149,11 @@
disp.figure_.savefig(Path(local_repo) / "confusion_matrix.png")
model_card.add_plot(**{"Confusion matrix": "confusion_matrix.png"})

importances = permutation_importance(model, X_test, y_test, n_repeats=10)
model_card.add_permutation_importances(
    importances, X_test.columns, "importance.png", "Permutation Importance"
)

cv_results = model.cv_results_
clf_report = classification_report(
    y_test, y_pred, output_dict=True, target_names=["malignant", "benign"]
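For reviewers who want a feel for what the `permutation_importance(model, X_test, y_test, n_repeats=10)` call in the example produces, here is a minimal sketch in plain Python. It is not scikit-learn's implementation: the helper names (`accuracy`, `permutation_importance_sketch`), the toy `predict` function, and the toy data are all assumptions made for illustration. The idea it shows is the usual one: shuffle one feature at a time and record how much the score drops.

```python
import random


def accuracy(predict, X, y):
    """Fraction of rows where the (hypothetical) model predicts the target."""
    correct = sum(1 for row, target in zip(X, y) if predict(row) == target)
    return correct / len(y)


def permutation_importance_sketch(predict, X, y, n_repeats=10, seed=0):
    """Return one list of score drops per feature (illustrative only)."""
    rng = random.Random(seed)
    baseline = accuracy(predict, X, y)
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            # Shuffle column j only, leaving every other feature intact.
            column = [row[j] for row in X]
            rng.shuffle(column)
            X_perm = [
                row[:j] + [value] + row[j + 1:] for row, value in zip(X, column)
            ]
            drops.append(baseline - accuracy(predict, X_perm, y))
        importances.append(drops)
    return importances


def predict(row):
    # Toy model that only looks at feature 0.
    return int(row[0] > 0.5)


X = [[i / 20, (i * 7 % 20) / 20] for i in range(20)]
y = [predict(row) for row in X]
result = permutation_importance_sketch(predict, X, y, n_repeats=5, seed=0)
```

Because the toy model ignores feature 1, shuffling that column never changes the score, which is the kind of contrast the box plot added in this PR is meant to make visible.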
42 changes: 42 additions & 0 deletions skops/card/_model_card.py
@@ -10,6 +10,7 @@
from reprlib import Repr
from typing import Any, Optional, Union

import matplotlib.pyplot as plt
from huggingface_hub import CardData, ModelCard
from sklearn.utils import estimator_html_repr
from tabulate import tabulate # type: ignore
@@ -373,6 +374,47 @@ def add_metrics(self, **kwargs: str) -> "Card":
            self._eval_results[metric] = value
        return self

    def add_permutation_importances(
        self, feature_importances, columns, plot_file, plot_name
    ) -> "Card":
        """Plots permutation importance and saves it to model card.

        Parameters
        ----------
        feature_importances : sklearn.utils.Bunch
            Output of sklearn.inspection.permutation_importance()

        columns :
            Column names of the data used to generate importances.

        plot_file :
            Filename for the plot.

        plot_name :
            Name of the plot.

        Returns
        -------
        self : object
            Card object.
        """
        sorted_importances_idx = feature_importances.importances_mean.argsort()
        fig, ax = plt.subplots()
        ax.boxplot(
            x=feature_importances.importances[sorted_importances_idx].T,
            labels=columns[sorted_importances_idx],
            vert=False,
        )
        ax.set_title(plot_name)
        ax.set_xlabel("Decrease in Score")
        if plot_name is not None and plot_file is not None:
            plt.savefig(plot_file)
            self.add_plot(**{plot_name: plot_file})
        else:
            plt.savefig("feature_importances.png")
            self.add_plot(**{"Feature Importances": "feature_importances.png"})
        return self

    def _generate_card(self) -> ModelCard:
        """Generate the ModelCard object

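The ordering step in `add_permutation_importances` (an argsort on the mean importances, then indexing both the per-repeat rows and the labels with it) can be sketched without NumPy as follows. The function name `sort_for_boxplot` and the sample numbers are hypothetical and exist only to illustrate the idea; the method in the diff operates on the `Bunch` returned by scikit-learn instead.

```python
from statistics import mean


def sort_for_boxplot(importances, columns):
    """Order features by ascending mean importance.

    importances: one list of per-repeat score drops per feature.
    columns: feature names, in the same order as importances.
    """
    means = [mean(row) for row in importances]
    # Plain-Python equivalent of an argsort over the means.
    order = sorted(range(len(means)), key=means.__getitem__)
    return [columns[i] for i in order], [importances[i] for i in order]


labels, rows = sort_for_boxplot(
    [[0.30, 0.34], [0.01, 0.03], [0.10, 0.12]],
    ["a", "b", "c"],
)
```

With ascending order, the feature with the largest mean comes last in both lists, so labels and box data stay aligned when they are handed to a plotting call.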
14 changes: 14 additions & 0 deletions skops/card/tests/test_card.py
@@ -10,6 +10,7 @@
import sklearn
from huggingface_hub import CardData, metadata_load
from sklearn.datasets import load_iris
from sklearn.inspection import permutation_importance
from sklearn.linear_model import LinearRegression, LogisticRegression

import skops
@@ -169,6 +170,19 @@ def test_add_plot(destination_path, model_card):
    assert "![fig1](fig1.png)" in model_card


def test_permutation_importances(
    iris_estimator, iris_data, model_card, destination_path
):
    X, y = iris_data
    result = permutation_importance(
        iris_estimator, X, y, n_repeats=10, random_state=42, n_jobs=2
    )
    model_card.add_permutation_importances(
        result, X.columns, "importance.png", "Permutation Importance"
    )
    assert "![Permutation Importance](importance.png)" in model_card.render()


def test_temporary_plot(destination_path, model_card):
    # test if the additions are made to a temporary template file
    # and not to default template or template provided