Hi Ali, thanks for reaching out about this. You're right, the paper was mostly written assuming we could follow the "true to the data" approach (the observational conditional expectation). This means handling held-out features by calculating $\mathbb{E}[f(X) \mid X_S = x_S]$. However, this is challenging to calculate, so SAGE was first implemented using the same approximation used in the SHAP paper (their eq. 11), or $\mathbb{E}[f(x_S, X_{\bar S})]$, which draws the held-out features from their marginal distribution. Since then, some better approximations have been developed. This recent review paper covers some of them (see Section 5.1.3), but long story short the main options are:
These days I often find that 1. or 2. are good options when I'm working with deep learning models, particularly image models. I've tested them a bit less with other data types. Hopefully that's helpful. If none of these options is possible in your case, using the "true to the model" approach isn't a bad option: it will still yield useful importance scores. It's just that some of the properties we outlined in the paper will no longer hold.
Ian
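To make the difference between the two expectations concrete, here is a minimal sketch on a toy problem. It is not code from the SAGE package: the function names (`marginal_expectation`, `conditional_expectation_gaussian`), the two-feature Gaussian data, and the model `f` that reads only the first feature are all assumptions made for illustration. The conditional estimate is exact only because the toy data is jointly Gaussian; real data would need one of the approximations discussed above.

```python
import numpy as np


def marginal_expectation(f, x, S, background):
    # "True to the model": fix the observed features S to their values in x
    # and keep the held-out features as drawn from the background data,
    # i.e. from their marginal distribution.
    samples = background.copy()
    samples[:, S] = x[S]
    return f(samples).mean()


def conditional_expectation_gaussian(f, x, S, mean, cov, n_samples, rng):
    # "True to the data": draw the held-out features from their distribution
    # conditioned on the observed ones. Assumes X is jointly Gaussian, so the
    # conditional mean and covariance have a closed form.
    S = np.asarray(S)
    R = np.setdiff1d(np.arange(len(mean)), S)  # held-out feature indices
    cov_SS = cov[np.ix_(S, S)]
    cov_RS = cov[np.ix_(R, S)]
    cov_RR = cov[np.ix_(R, R)]
    cond_mean = mean[R] + cov_RS @ np.linalg.solve(cov_SS, x[S] - mean[S])
    cond_cov = cov_RR - cov_RS @ np.linalg.solve(cov_SS, cov_RS.T)
    samples = np.empty((n_samples, len(mean)))
    samples[:, S] = x[S]
    samples[:, R] = rng.multivariate_normal(cond_mean, cond_cov, size=n_samples)
    return f(samples).mean()


rng = np.random.default_rng(0)
rho = 0.9  # correlation between the two toy features
mean = np.zeros(2)
cov = np.array([[1.0, rho], [rho, 1.0]])
background = rng.multivariate_normal(mean, cov, size=20000)


def f(X):
    # Toy model that uses only feature 0 and ignores feature 1.
    return X[:, 0]


x = np.array([0.0, 2.0])  # only x[1] matters below, since feature 0 is held out
S = [1]                   # observe feature 1, hold out feature 0

marg = marginal_expectation(f, x, S, background)
cond = conditional_expectation_gaussian(f, x, S, mean, cov, 20000, rng)
print(f"marginal estimate:    {marg:.3f}")
print(f"conditional estimate: {cond:.3f}")
```

In this toy setup the marginal estimate does not depend on the observed value of feature 1, because `f` never reads it, while the conditional estimate shifts toward `rho * x[1]`. That is the mechanism by which a feature the model does not use can still receive non-zero importance under the observational approach.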
Hey @iancovert,
I just read your paper 'Understanding Global Feature Contributions With Additive Importance Measures' and am looking forward to using SAGE in my own work.
There is one bit I am confused about that I hope you can clarify for me. Soon after your paper, your colleagues published a paper titled 'True to the Model or True to the Data?' in which they discuss the trade-off between using the 'observational conditional expectation' and the 'interventional conditional expectation'. I can't quite put my finger on which method SAGE uses. When I read your paper I get the sense that the sampling method you use is interventional, since it can produce off-manifold samples. Yet in Section 3.1 of your paper, in the properties of SAGE, you say:
Due to the symmetry property, pairs of features ... e.g. perfect correlation) always have equal importance.
You also say that:
features may receive non-zero importance even if they are not used by f.
As I understand from reading Chen and Janizek's paper, those are qualities that describe the observational approach.
Can you please help me understand where SAGE falls? Are SAGE scores true to the model or true to the data?
Thank you so much!