solved LIME coefficients #621
Conversation
@WillemSpek Just a very quick comment before diving into the code. You also need to write a unit test for the get_explanation_values function; otherwise the coverage check will fail.
I don't have much time today and tomorrow. If nobody leaves a review, I can take a look on Thursday.
Thanks a lot for noticing this, and for contributing to the code 😃.
In addition to Yang's comments: the results look so much better!! Great job!
Thanks for all the input, though a big culprit of the poor results is apparently the segmentation algorithm:
Sorry for all the redundant commits; I was struggling a bit with the testing and the random nature of things. All should be good now :) I have added a simple unit test.
Good job @WillemSpek! Sorry for the delay. I have tested your implementation and it works well on my machine. It is good to give the user the option to grab the importance scores from LIME.
Btw, I won't put a "bug" label on this PR since it makes sense to only show the user the masks. For instance, most of the example notebooks from LIME also work with the segmentation masks only, rather than including importance scores (e.g. https://github.com/marcotcr/lime/blob/master/doc/notebooks/Tutorial%20-%20images%20-%20Pytorch.ipynb).
There is some logic behind it. LIME uses a linear surrogate model to delegate the local importance of each data point/pixel to the predicted classes. To train these surrogate models, we need to generate many samples around each data point. It is very tricky to define the neighborhood and perturb your data to get good samples. Here are explanations of how to define a meaningful neighborhood (https://christophm.github.io/interpretable-ml-book/lime.html) and how to make a good explainer (https://slds-lmu.github.io/iml_methods_limitations/lime.html).
To conclude, the importance scores, which are basically the weights of your surrogate models, are very sensitive to the perturbations. Therefore, the scores are not robust, and simply displaying them can sometimes be very misleading, especially to those who have little knowledge of LIME. That is why even the documentation/tutorials of the original LIME implementation always show segmentation masks rather than scores.
But I agree with you that for some advanced users it is necessary to have these scores. Thanks a lot for the contribution. Please address all my comments (simply adding commits should be fine) and test it again before merging it into the code base. I will give you a go afterwards.
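The sensitivity described above can be illustrated with a toy experiment, not tied to the actual dianna or LIME code: fit the same least-squares surrogate on two different random perturbation sets of a made-up "black box" and compare the resulting weights. All names, sample counts, and coefficients below are invented for illustration.

```python
import numpy as np

def fit_surrogate(seed, n_samples=50, n_segments=8):
    """Fit a linear surrogate on random on/off segment perturbations.

    Toy setup: the hypothetical black box responds linearly to
    segments 0 and 1, plus noise.
    """
    rng = np.random.default_rng(seed)
    # Binary masks: which segments are kept in each perturbed sample.
    masks = rng.integers(0, 2, size=(n_samples, n_segments)).astype(float)
    # Toy black-box predictions with noise.
    preds = 2.0 * masks[:, 0] - 1.0 * masks[:, 1] + rng.normal(0, 0.5, n_samples)
    # Least-squares linear surrogate; its weights are the importance scores.
    weights, *_ = np.linalg.lstsq(masks, preds, rcond=None)
    return weights

w_a = fit_surrogate(seed=0)
w_b = fit_surrogate(seed=1)
# The two perturbation sets yield noticeably different score vectors,
# even though the underlying "model" is identical.
print(np.round(w_a, 2))
print(np.round(w_b, 2))
```

The overall sign pattern (segment 0 positive, segment 1 negative) is stable across runs, but the exact values shift with the sampling, which is why showing raw scores to users unfamiliar with LIME can mislead.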
dianna/methods/lime.py (Outdated)
from lime.lime_image import LimeImageExplainer
from lime.lime_text import LimeTextExplainer
from numpy.typing import NDArray
from numpy.typing import NDArray
Not necessary. The returned type can just be np.array, see below.
Ah yes, I used this from the np.typing module because it supports hinting towards dimensions and dtypes, but I never ended up doing that. np.array is a higher-level function that creates an np.ndarray object, which is what all arrays in numpy are. So shouldn't we use np.ndarray?
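The distinction under discussion can be shown in a few lines: np.array is a factory function, while np.ndarray is the class it returns, so only the latter is valid in a type annotation. The stub function below is hypothetical, named only to mirror the PR's get_explanation_values question.

```python
import numpy as np

# np.array is a *function* that builds arrays; np.ndarray is the *class*
# of the objects it returns, so annotations should use np.ndarray.
def get_explanation_values_stub(scores: list) -> np.ndarray:
    """Hypothetical stand-in illustrating the return-type annotation."""
    return np.asarray(scores, dtype=float)

values = get_explanation_values_stub([0.2, -0.1, 0.7])
print(isinstance(values, np.ndarray))  # every numpy array is an ndarray instance
```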
Yes, I think np.ndarray is much better. Thanks for the suggestion.
I guess this is consistent with the rest of the codebase, so let's leave it as np.array for now, but maybe this could be listed as a more general issue.
dianna/methods/lime.py (Outdated)
Leverages the `ImageExplanation` class from LIME to generate salience maps
based on the segmentation that was produced on the images, also stored in
`ImageExplanation`.
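The mapping this docstring describes can be sketched in pure numpy, independent of the actual dianna implementation. The segmentation array and per-segment weights below are invented for illustration; a lookup table indexed by segment id paints each pixel with its segment's surrogate weight in one vectorized step.

```python
import numpy as np

# Hypothetical 4x4 segmentation with three segments (ids 0, 1, 2), in the
# same spirit as the one LIME's ImageExplanation stores, plus made-up
# per-segment surrogate weights.
segments = np.array([[0, 0, 1, 1],
                     [0, 0, 1, 1],
                     [2, 2, 1, 1],
                     [2, 2, 2, 2]])
segment_weights = {0: 0.8, 1: -0.3, 2: 0.0}

# Build a table indexed by segment id, then fancy-index with the
# segmentation so each pixel receives its segment's score.
table = np.zeros(segments.max() + 1)
for seg_id, weight in segment_weights.items():
    table[seg_id] = weight
salience = table[segments]

print(salience)
```

The fancy-indexing step replaces any per-pixel loop over the segmentation, which is the kind of vectorization the PR discussion mentions.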
Please also consider adding more explanation about the scores, e.g. "The importance values are the weights of the surrogate models (e.g. the weights of a ridge model), which LIME uses to evaluate the contribution of each data point/pixel/segment to the predicted classes."
Done :)
Btw, I will remove
Thanks so much for the review @geek-yang! Your comment on the coefficient stability is also very insightful for me, and I agree that from a user perspective, only returning the masks makes more sense. My use case (metrics) does require coefficient scores, though, as discretizing the output leads to some loss of granularity, which affects the metrics. Thanks for all the review comments, I will dive into them! Based upon your comment, I have also changed the
Co-authored-by: Yang <y.liu@esciencecenter.nl>
Nice work! Thanks for addressing my comments. Looks good now!
See issue #620: the current LIME implementation does not make use of the coefficients that LIME actually yields, and instead returns the explanation as a mask with values -1, 0, and 1 designating negative, neutral, and positive contributions. The implementation of this masking also seems a bit slow and could be more vectorized. I propose a change that grabs the original coefficients from the surrogate model in LIME, which also yields much more granular explanations, as I will show below.
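For context, the -1/0/1 masking described here can be written as a single vectorized pass. The helper below is a hypothetical sketch, not the actual dianna code, and the threshold value is an arbitrary choice for illustration.

```python
import numpy as np

def coefficients_to_mask(salience, threshold=0.1):
    """Discretize a salience map into -1/0/+1 in one vectorized pass.

    Hypothetical helper: values within +/- threshold become 0 (neutral);
    the rest keep only their sign, discarding the granularity of the
    raw surrogate coefficients.
    """
    mask = np.sign(salience).astype(int)
    mask[np.abs(salience) < threshold] = 0
    return mask

scores = np.array([0.55, -0.02, 0.08, -0.4])
print(coefficients_to_mask(scores))  # [ 1  0  0 -1]
```

Comparing the input scores with the mask makes the granularity loss concrete: 0.55 and -0.4 both collapse to a bare sign, which is exactly why metrics-style use cases need the raw coefficients.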
`quickshift`: with masks: (image)
`skimage.segmentation.slic`: with masks: (image)
With LIME surrogate model coefficients: (image)