
560 improve results of lime timeseries notebooks #589

Merged 13 commits into main on May 23, 2023

Conversation

geek-yang
Member

We fine-tuned the parameters of LIME timeseries to improve the results in the notebooks (lime_timeseries_coffee.ipynb and lime_timeseries_weather.ipynb).

Note that because of the absence of strategic segmentation and multi-channel masking, the results and visualizations are not perfect. But they are much improved compared to before (at least good enough to show during the SURF event).

@review-notebook-app
Check out this pull request on ReviewNB to see visual diffs and provide feedback on the Jupyter notebooks.

@geek-yang geek-yang linked an issue May 11, 2023 that may be closed by this pull request
@geek-yang geek-yang marked this pull request as ready for review May 15, 2023 07:10
@geek-yang geek-yang requested a review from stefsmeets May 15, 2023 07:10
@stefsmeets (Contributor) left a comment

Nice work! I noticed that you use explain_timeseries in the notebooks, but calling it with method='lime' versus method='rise' gives different output types. The interface is not consistent. Is this something that can be fixed in this PR, or should it be a new issue?

Lime gives back an explanation instance:

>>> lime_exp = dianna.explain_timeseries(
>>>     run_expert_model, 
>>>     timeseries_data=data_extreme,
>>>     method='lime', 
>>>     labels=[0,1], 
>>>     class_names=["summer", "winter"],
>>>     num_features=len(data_extreme),
>>>     num_samples=10000,
>>>     num_slices=len(data_extreme), 
>>>     distance_method='euclidean',
>>>     mask_type=input_train_mean)
>>> lime_exp
<lime.explanation.Explanation at 0x7f8dbf2dfe20>

Rise gives back a numpy array:

>>> rise_exp = dianna.explain_timeseries(
>>>     run_expert_model, 
>>>     timeseries_data=data_extreme,
>>>     method='rise', 
>>>     labels=[0,1], 
>>>     p_keep=0.1,
>>>     n_masks=10000, 
>>>     mask_type=input_train_mean)
>>> rise_exp
array([[[7.700e-02],
        [3.000e-03],
        ...
        [4.000e-03]],

       [[9.650e-01],
        [9.820e-01],
        ...
        [1.053e+00]]])
>>> rise_exp.shape
(2, 28, 1)
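Until the interface is unified, a caller could paper over the difference with a small helper. This is a hypothetical sketch, not part of dianna; it assumes the LIME Explanation object exposes its scores in local_exp as a dict mapping each label to (segment_index, weight) pairs, which is LIME's standard layout:

```python
import numpy as np

def to_scores_array(explanation, labels, num_features):
    """Return a (len(labels), num_features) numpy array, whether
    explain_timeseries gave back a RISE-style array or a LIME
    Explanation object. Hypothetical helper for illustration."""
    if isinstance(explanation, np.ndarray):
        # RISE path: flatten a trailing channel axis if present.
        return explanation.reshape(len(labels), -1)
    # LIME path: local_exp maps label -> [(segment_index, weight), ...]
    scores = np.zeros((len(labels), num_features))
    for row, label in enumerate(labels):
        for idx, weight in explanation.local_exp[label]:
            scores[row, idx] = weight
    return scores
```

With this, downstream plotting code can treat both explainers identically.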

@geek-yang
Member Author

Nice work! I noticed that you use explain_timeseries in the notebooks, but calling it with method='lime' versus method='rise' gives different output types. The interface is not consistent. Is this something that can be fixed in this PR, or should it be a new issue?

Hi @stefsmeets, thanks a lot for your review and quick feedback 😄. Let me take a quick look at the interface; that's a bit unexpected. I will try to fix it in this PR, if possible.


This is because we use the lime_base function from the original implementation of LIME to compute the scores, and it returns an explainer object that stores the LIME scores in explainer.local_exp. It is good that you raised this point. I think we can simply return only explainer.local_exp, converted to a numpy array, to be consistent with RISE. Let me fix this then.
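The proposed fix could look roughly like the following. This is a sketch under the assumption that explainer.local_exp is LIME's usual dict mapping each label to a list of (feature_index, weight) pairs; the helper name and the example weights are made up for illustration:

```python
import numpy as np

def local_exp_to_array(local_exp, labels, num_features):
    """Flatten LIME's explainer.local_exp into a dense numpy array of
    shape (len(labels), num_features), mirroring RISE's array output."""
    scores = np.zeros((len(labels), num_features))
    for row, label in enumerate(labels):
        for feature_idx, weight in local_exp[label]:
            scores[row, feature_idx] = weight
    return scores

# Made-up weights, just to show the shape of the result:
local_exp = {0: [(2, 0.31), (0, -0.12)], 1: [(1, -0.40), (0, 0.09)]}
print(local_exp_to_array(local_exp, labels=[0, 1], num_features=3).shape)
# -> (2, 3)
```

Features that LIME did not assign a weight stay at zero, so the array always has a fixed, label-by-feature shape like RISE's.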

@geek-yang geek-yang requested a review from stefsmeets May 17, 2023 12:46
@geek-yang
Member Author

geek-yang commented May 17, 2023

@stefsmeets I just updated the return value of the LIME timeseries explainer: it is now an array, consistent with RISE.

I also updated the notebook (lime_timeseries_weather.ipynb) to use the explain_timeseries interface. The results from LIME timeseries differ from those of RISE, which is expected (not because of the interface difference, but for several other reasons, e.g. the algorithms are different, and a segmentation strategy is still absent, see #546).

Just take another look and let me know if you have more comments, thanks @stefsmeets !

@stefsmeets (Contributor) left a comment

Thanks, looks good to me! 🚀

@stefsmeets stefsmeets merged commit 0155a5a into main May 23, 2023
22 checks passed
Project board automation (Sprint 30 - DIANNA 1.0.0 release including timeseries tutorials) moved this from Ready for review to Done May 23, 2023
@stefsmeets stefsmeets deleted the 560-improve-results-notebook-LIME-timeseries branch May 23, 2023 11:29
Successfully merging this pull request may close these issues.

Improve results in tutorial notebooks for LIME timeseries
2 participants