Conditional Probability Estimation in SAGE #16

alizia · 2023-03-08T05:40:39Z

I just read your paper 'Understanding Global Feature Contributions With Additive Importance Measures' and am looking forward to using SAGE in my own work.

There is one bit I am confused about that I hope you can clarify for me. Your colleagues published a paper titled 'True to the Model or True to the Data?' soon after your own paper, in which they discuss the trade-off between using the 'observational conditional expectation' and the 'interventional conditional expectation'. I can't quite put my finger on which method SAGE uses. When I read your paper I get the sense that the sampling method you use is interventional, since it can produce samples off-manifold. Yet in Section 3.1 of your paper, under the properties of SAGE, you say:

Due to the symmetry property, pairs of features ... e.g. perfect correlation) always have equal importance.

You also say that:
features may receive non-zero importance even if they are not used by f.

As I understand from reading Chen and Janizek's paper, those are qualities that describe the observational approach.

Can you please help me understand where SAGE falls? Are SAGE scores true to the model or true to the data?

Thank you so much!

iancovert (Maintainer) · 2023-03-09T22:24:33Z

Hi Ali, thanks for reaching out about this. You're right, the paper was mostly written assuming we could follow the "true to the data" approach (the observational conditional expectation). This means handling held-out features by calculating f(x_s) := E[f(x) | x_s], and it's why we described the two properties you mentioned - correlated features having equal importance, and features potentially having non-zero importance even if they aren't used by f.

However, this is challenging to calculate, so SAGE was first implemented using the same approximation used in the SHAP paper (their eq. 11), or f(x_s) := E_{x_{\bar s}}[f(x_s, x_{\bar s})]. Our intent at the time wasn't to follow the "true to the model" approach (although you could view it that way); we used this because it was the best approximation we were aware of.
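To make the difference between the two expectations concrete, here is a minimal NumPy sketch on a toy problem. It is not code from the SAGE package: `marginal_prediction`, the toy model `f`, and the `background` array are hypothetical names chosen for illustration. The toy data has x1 as an exact copy of x0, and the model reads only x0.

```python
import numpy as np

def marginal_prediction(f, x, subset, background):
    """Monte Carlo estimate of E_{x_sbar}[f(x_s, x_sbar)].

    Held-out features are drawn from their marginal distribution by reusing
    rows of `background`; features in `subset` are fixed to their values in `x`.
    """
    samples = np.array(background, dtype=float)  # np.array copies by default
    samples[:, subset] = x[subset]
    return float(np.mean(f(samples)))

# Toy data (an assumption for illustration): x1 is a perfect copy of x0.
rng = np.random.default_rng(0)
x0 = rng.normal(size=10_000)
background = np.column_stack([x0, x0])

def f(X):
    # The model only reads x0 and ignores x1.
    return X[:, 0]

x = np.array([2.0, 2.0])

# Knowing x0 pins down the prediction exactly.
known_x0 = marginal_prediction(f, x, [0], background)

# Knowing only x1: x0 is resampled from its marginal, so the estimate stays
# near the mean prediction and x1 gets no credit.
known_x1 = marginal_prediction(f, x, [1], background)

# Under the conditional expectation, x1 = 2 implies x0 = 2 in this toy data,
# so E[f(x) | x1 = 2] = 2 and x1 would look as informative as x0.
conditional_x1 = x[1]
```

In this toy case the marginal version gives x1 no influence because the model never reads it, while the conditional version treats x0 and x1 symmetrically. That is the gap between the properties described in the paper and what the approximation delivers.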

Since then, some better approximations have been developed. This recent review paper covers some of them (see Section 5.1.3), but long story short the main options are:

1. Train your model with random input masking (or input dropout). When you evaluate this model with subsets of features, it should theoretically handle the missing features by marginalizing them out with their conditional distribution. (Note that this model should probably be a neural network.)
2. Train a surrogate model to match your original model's predictions given subsets of features. This is pretty similar to the approach above, and once again should probably involve a neural network.
3. Train a conditional generative model to sample missing features. This approach is theoretically sound but probably more difficult to train properly than the above techniques.
4. Make parametric assumptions about the data distribution, e.g., multivariate Gaussian. Of course this won't work well if the assumptions are incorrect.
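As a rough illustration of the last option in the list (a parametric assumption), the sketch below uses the standard conditioning formulas for a multivariate Gaussian to sample held-out features given the observed ones, then averages the model output. The function names, the toy `mu` and `cov`, and the model `f` are all hypothetical; this is not how the SAGE package implements anything, and it assumes at least one feature is observed.

```python
import numpy as np

def gaussian_conditional(mu, cov, known_idx, known_vals):
    """Mean and covariance of the missing features given the known ones,
    assuming the data is jointly Gaussian with mean `mu` and covariance `cov`."""
    mu = np.asarray(mu, dtype=float)
    cov = np.asarray(cov, dtype=float)
    known_idx = np.asarray(known_idx, dtype=int)
    missing_idx = np.setdiff1d(np.arange(len(mu)), known_idx)
    S_kk = cov[np.ix_(known_idx, known_idx)]
    S_mk = cov[np.ix_(missing_idx, known_idx)]
    S_mm = cov[np.ix_(missing_idx, missing_idx)]
    # gain = S_mk @ inv(S_kk), computed with a solve (S_kk is symmetric).
    gain = np.linalg.solve(S_kk, S_mk.T).T
    cond_mean = mu[missing_idx] + gain @ (np.asarray(known_vals, dtype=float) - mu[known_idx])
    cond_cov = S_mm - gain @ S_mk.T
    return missing_idx, cond_mean, cond_cov

def conditional_prediction(f, x, known_idx, mu, cov, n_samples=2000, seed=0):
    """Monte Carlo estimate of E[f(x) | x_s] under the Gaussian assumption."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    missing_idx, cond_mean, cond_cov = gaussian_conditional(mu, cov, known_idx, x[known_idx])
    samples = np.tile(x, (n_samples, 1))
    samples[:, missing_idx] = rng.multivariate_normal(cond_mean, cond_cov, size=n_samples)
    return float(np.mean(f(samples)))

# Toy setup (an assumption for illustration): two features with correlation 0.9,
# and a model that only reads the first feature.
mu = np.zeros(2)
cov = np.array([[1.0, 0.9], [0.9, 1.0]])

def f(X):
    return X[:, 0]

x = np.array([2.0, 2.0])
_, cond_mean, cond_cov = gaussian_conditional(mu, cov, [1], x[[1]])
estimate = conditional_prediction(f, x, [1], mu, cov)
```

Here the conditional mean of the first feature given the second is 0.9 * 2 = 1.8, so the estimate lands near 1.8 even though the model never reads the second feature. If the data is far from Gaussian, the conditional samples (and therefore the importance scores) can be misleading.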

These days I often find that 1. or 2. are good options when I'm working with deep learning, particularly with image models. I've tested them a bit less with other data types.
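For tabular inputs, the masking step shared by the first two options might look roughly like the sketch below. The details are assumptions rather than anything prescribed in this thread: `mask_features` is a hypothetical helper, hidden entries are replaced with a fixed `mask_value`, the keep probability is drawn per row so subset sizes vary, and a binary indicator is appended so the network can tell a hidden feature from a genuine zero.

```python
import numpy as np

def mask_features(X, rng, mask_value=0.0):
    """Hide a random subset of features in each row.

    Returns the masked inputs concatenated with a binary indicator
    (1 = observed, 0 = hidden), so the output has twice as many columns as X.
    """
    X = np.asarray(X, dtype=float)
    # Per-row keep probability, so small and large subsets are both seen.
    keep_prob = rng.uniform(size=(X.shape[0], 1))
    observed = rng.uniform(size=X.shape) < keep_prob
    X_masked = np.where(observed, X, mask_value)
    return np.concatenate([X_masked, observed.astype(float)], axis=1)

# Usage: apply a fresh mask to every training batch.
rng = np.random.default_rng(0)
X_batch = rng.normal(size=(8, 5))
masked_batch = mask_features(X_batch, rng)  # shape (8, 10)
```

For option 1 the masked batch would be paired with the true labels; for option 2 a surrogate network would be trained on the masked batch to match the original model's predictions on the unmasked inputs.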

Hopefully that's helpful. If none of these options are possible in your case, using the "true to the model" approach isn't a bad option - it will still yield useful importance scores. It's just that some of the properties we outlined will no longer hold.

Ian

alizia Mar 30, 2023
Author

Thank you so much @iancovert! Very helpful.
