646 Implement LIME for tabular data #661

geek-yang · 2023-11-14T14:16:33Z

Add LIME tabular to the methods and provide example notebooks for both regression and classification tasks.

Also fixes the inconsistent input variable names in init (https://github.com/dianna-ai/dianna/blob/add_lime_tabular/dianna/__init__.py).

Note that the method returns an array of importance scores for all input features (which means num_features is disabled). For more information about this decision, see issue #665.
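To illustrate what "an array of importance scores for all input features" means in practice, here is a minimal sketch. It assumes the underlying explanation is available as (feature_index, weight) pairs, which is the form LIME's `as_map()` reports per label. The helper name `to_dense_scores` and the example pairs are hypothetical and are not taken from this PR.

```python
import numpy as np


def to_dense_scores(pairs, num_features):
    """Turn (feature_index, weight) pairs into a dense array ordered by feature index."""
    scores = np.zeros(num_features)
    for index, weight in pairs:
        scores[index] = weight
    return scores


# Hypothetical pairs, listed by decreasing absolute weight rather than by feature index.
pairs = [(2, -0.4), (0, 0.25), (1, 0.05)]
scores = to_dense_scores(pairs, num_features=3)
```

With a dense array like this, position i always refers to input feature i, so no top-k selection such as num_features is involved.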

geek-yang · 2023-11-21T17:14:46Z

We need a visualizer for the results. I created issue #662 for it.

cwmeijer

Looks great, Yang! I still have some requests, such as additional tests, and one point we may want to discuss: the necessity of passing the training data to the method.
Great work, also on the notebooks!

cwmeijer · 2023-11-29T14:08:36Z

dianna/__init__.py

+    )
+    return explainer.explain(
+        model_or_function=model_or_function,
+        input_tabular=input_tabular,


haha input_tabular! I was already guessing what this new variable's name would be after the changes you made above

cwmeijer · 2023-11-29T14:22:52Z

tests/methods/test_lime_tabular.py

+class LIMEOnTabular(TestCase):
+    """Suite of LIME tests for the tabular case."""
+
+    def test_lime_tabular_classification_correct_output_shape(self):
+        """Test the output of explainer."""
+        training_data = np.random.random((10, 2))
+        input_data = np.random.random(2)
+        feature_names = ["feature_1", "feature_2"]
+        explainer = LIMETabular(training_data,
+                                mode ='classification',
+                                feature_names=feature_names,
+                                class_names = ["class_1", "class_2"])
+        exp = explainer.explain(
+            run_model,
+            input_data,
+        )
+        assert len(exp[0]) == len(feature_names)
+
+    def test_lime_tabular_regression_correct_output_shape(self):
+        """Test the output of explainer."""
+        training_data = np.random.random((10, 2))
+        input_data = np.random.random(2)
+        feature_names = ["feature_1", "feature_2"]
+        exp = dianna.explain_tabular(run_model, input_tabular=input_data, method='lime',
+                             mode ='regression', training_data = training_data,
+                             feature_names=feature_names, class_names=['class_1'])
+
+        assert len(exp) == len(feature_names)


Now you're only testing for shapes. It would be nice to add a test with synthetic data+model with some stupidly simple pattern to check if the method can detect it. Just like we have for timeseries RISE.

I like the idea of adding a "naive test" for it, and I did think about it. But we added such a test for timeseries mainly because we implemented those methods ourselves (referring to papers and methods such as LIME segmentation, but implementing them mostly from scratch).

LIME tabular comes entirely from the LIME package; we simply put a wrapper around it. It is already tested to some extent in the original implementation of LIME:

https://github.com/marcotcr/lime/blob/master/lime/tests/test_lime_tabular.py

But I do agree that it would be nice to add a test for it. Maybe we can create an issue and design a simple test there? This PR is already getting too big, so I would prefer to do it in a separate PR.

Issue #669 created! 😄
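As a rough idea of what the "naive test" discussed above could look like, the sketch below builds a synthetic model in which only the first feature matters and checks that a local linear surrogate picks that up. To stay self-contained it uses a simplified LIME-style procedure written with NumPy only (perturb, weight by proximity, fit a weighted linear model). It is not the lime package and not dianna's wrapper. The function `local_linear_scores`, the kernel width, and the threshold are all assumptions made for illustration.

```python
import numpy as np


def run_model(data):
    """Synthetic regression model: only the first feature matters."""
    return 3.0 * data[:, 0]


def local_linear_scores(model, instance, training_data, num_samples=1000, seed=0):
    """Simplified LIME-style surrogate: perturb, weight by proximity, fit a linear model."""
    rng = np.random.default_rng(seed)
    scale = training_data.std(axis=0)
    # Perturb the instance with Gaussian noise scaled by the training data spread.
    samples = instance + rng.normal(size=(num_samples, instance.size)) * scale
    # Exponential kernel on the scaled distance (kernel width 1 is an arbitrary choice).
    distances = np.linalg.norm((samples - instance) / scale, axis=1)
    weights = np.exp(-distances ** 2)
    # Weighted least squares, done by rescaling rows with sqrt(weight).
    design = np.column_stack([samples, np.ones(num_samples)])
    root = np.sqrt(weights)
    coefficients, *_ = np.linalg.lstsq(
        design * root[:, None], model(samples) * root, rcond=None)
    return coefficients[:-1]  # drop the intercept


def test_detects_only_relevant_feature():
    training_data = np.random.default_rng(1).random((50, 2))
    scores = local_linear_scores(run_model, training_data[0], training_data)
    assert abs(scores[0]) > 10 * abs(scores[1])
```

A test against the real wrapper would presumably swap `local_linear_scores` for the tabular explainer and keep the same assertion on the relative size of the scores.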

cwmeijer · 2023-11-29T15:48:18Z

dianna/methods/lime_tabular.py

+        self.top_labels = len(class_names)
+
+        self.explainer = LimeTabularExplainer(
+            training_data,


Would it be possible to call LIME without training_data?

See my comments above.

cwmeijer · 2023-11-29T15:49:28Z

dianna/methods/lime_tabular.py

+    def explain(
+        self,
+        model_or_function,
+        input_tabular,
+        labels=(1, ),
+        num_samples=5000,
+        **kwargs,


Could you add Typing to all public methods? Also for the return types.

Oh yes, well spotted! Thanks Chris! I just missed those. I will add them.
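For reference, a typed version of the signature quoted above might look roughly like the sketch below. The concrete types are assumptions, since the merged annotations are not shown in this thread. In particular, `Union[str, Callable]` assumes the argument is either a path to a model file or a function that runs the model. The class `TypedExplainerSketch` is a hypothetical stand-in with a placeholder body, not dianna's `LIMETabular`.

```python
from typing import Callable, Iterable, Union

import numpy as np


class TypedExplainerSketch:
    """Hypothetical stand-in showing annotations only; not dianna's LIMETabular."""

    def explain(
        self,
        model_or_function: Union[str, Callable[[np.ndarray], np.ndarray]],
        input_tabular: np.ndarray,
        labels: Iterable[int] = (1, ),
        num_samples: int = 5000,
        **kwargs,
    ) -> np.ndarray:
        """Return one importance score per input feature (placeholder body)."""
        return np.zeros(len(input_tabular))
```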

cwmeijer · 2023-11-29T16:00:21Z

tutorials/lime_tabular_penguin.ipynb

The data_instance loading is done in an unexpected section. It would be better to do this in the data loading section above.

A bit lower, you're using the variable name 'exp' which is short but ambiguous. It would be better to call it 'explanation'.

Thanks for the suggestions! Done.

cwmeijer · 2023-11-29T16:00:45Z

tutorials/lime_tabular_weather.ipynb

Same as my comments on the penguin notebook.

cwmeijer

🏅 🧸 🏆 🦾 🎉

add lime tabular to methods

ce26759

geek-yang linked an issue Nov 14, 2023 that may be closed by this pull request

Implement tabular version of LIME (wrap existing) #646

Closed

Yang added 2 commits November 14, 2023 17:14

support more kwargs

6206665

add docstrings

afa3f89

geek-yang mentioned this pull request Nov 20, 2023

Sunshine hours prediction regression model dianna-ai/dianna-exploration#192

Merged

Yang added 4 commits November 21, 2023 15:39

add lime tabular classification tutorial

e611668

lime tabular with dianna api

191a862

consistent input names in init function

f4bbb77

fix test

ccced97

geek-yang mentioned this pull request Nov 21, 2023

Add visualization module for dianna tabular data methods #662

Closed

lime tabular tutorial for regression

d440b42

cwmeijer added this to In progress in SS Sprint 3 - improve code quality, adding tabular funcitonality Nov 22, 2023

geek-yang mentioned this pull request Nov 22, 2023

num_features keyword not working in LIME timeseries #665

Open

Yang added 4 commits November 23, 2023 12:28

return numpy as output and update notebooks

2cfdf20

add tests for lime tabular

25bea49

bring back single quotation mark

b3a7a9e

fix tests

85f152a

geek-yang marked this pull request as ready for review November 23, 2023 15:38

geek-yang requested a review from cwmeijer November 23, 2023 15:38

cwmeijer requested changes Nov 29, 2023

geek-yang mentioned this pull request Nov 30, 2023

Add a synthetic data+model test to tabular methods #669

Open

Yang added 3 commits December 7, 2023 10:25

add typing to arguments

7614265

update notebooks

09291f8

fix typing

0dcbd0c

geek-yang requested a review from cwmeijer December 7, 2023 12:29

cwmeijer approved these changes Dec 7, 2023

geek-yang merged commit 53188c3 into main Dec 7, 2023
15 of 18 checks passed

SS Sprint 3 - improve code quality, adding tabular funcitonality automation moved this from In progress to Done Dec 7, 2023

geek-yang deleted the add_lime_tabular branch December 7, 2023 15:26
