Calculation of SHAP values for continuous and categorical variables #93

Open
lzxcvn opened this issue Mar 17, 2024 · 1 comment
Labels
question ❔ Further information is requested

Comments

lzxcvn commented Mar 17, 2024

Hello. While using model_survshap() for global interpretation with SHAP values, I found that the function does not seem to distinguish between continuous and categorical variables, which concerns me. Even though I passed a list of categorical variables, are the SHAP values calculated the same way for both kinds of variable? This seems to change the order of variable importance.
image
My code runs, but the results seem unreliable. Variables that were previously unimportant, such as the categorical variable sex, now appear more important.

hbaniecki (Member) commented Jun 13, 2024

Hi @lzxcvn, where did you find the categorical_variables parameter?

In most cases, SHAP does not distinguish between continuous and categorical variables. The distinction can matter when conditional imputation is used for feature marginalization (instead of the default marginal feature distribution). For details, refer to the shapr R package (https://github.com/NorskRegnesentral/shapr) and related research, e.g. https://doi.org/10.1007/s10618-024-01016-z.
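To illustrate why the variable type usually does not matter, here is a minimal Python sketch of exact Shapley values under marginal imputation. It is not survex code: the toy `model`, the `background` rows, and the helper names `value` and `shapley_values` are all assumptions made up for this example. The point is only that the 0/1 "sex" column and the continuous "age" column go through exactly the same computation.

```python
from itertools import combinations
from math import factorial


def model(x):
    # Hypothetical toy model: x[0] is continuous (age), x[1] is a 0/1 code (sex).
    return 0.05 * x[0] + 1.0 * x[1]


def value(subset, x, background):
    # Marginal imputation: features outside `subset` are replaced by the values
    # found in each background row, and the predictions are averaged.
    total = 0.0
    for row in background:
        z = [x[j] if j in subset else row[j] for j in range(len(x))]
        total += model(z)
    return total / len(background)


def shapley_values(x, background):
    # Exact Shapley values by enumerating every coalition of the other features.
    p = len(x)
    phi = []
    for i in range(p):
        others = [j for j in range(p) if j != i]
        contrib = 0.0
        for size in range(p):
            weight = factorial(size) * factorial(p - size - 1) / factorial(p)
            for s in combinations(others, size):
                with_i = value(set(s) | {i}, x, background)
                without_i = value(set(s), x, background)
                contrib += weight * (with_i - without_i)
        phi.append(contrib)
    return phi


# Made-up background data: columns are (age, sex).
background = [[40.0, 0], [50.0, 1], [60.0, 0], [70.0, 1]]
x = [70.0, 1]
phi = shapley_values(x, background)
```

For this linear toy model the attributions reduce to the coefficient times the distance from the background mean, so the scale and coding of a column affect its attribution, but nothing in the algorithm branches on whether the column is categorical. A conditional approach would replace `value` with an estimate that respects dependence between features, which is where the variable type can start to matter.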

Moreover, KernelSHAP is an approximation algorithm that includes randomness, so the order of variable importance can change from one run to the next.
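The effect of that randomness can be sketched with a simpler stand-in. The Python example below is not KernelSHAP and not survex code; it uses a permutation-sampling Monte Carlo estimator of Shapley values, with a toy `model`, made-up `background` rows, and a hypothetical helper `sampled_shapley`. It shows the same general behaviour: the estimate depends on the random draws, so two seeds can give different attributions and, when two variables are close in importance, a different ranking.

```python
import random


def model(x):
    # Hypothetical toy model with an interaction between age (x[0]) and sex (x[1]).
    return 0.05 * x[0] + 1.0 * x[1] + 0.02 * x[0] * x[1]


def sampled_shapley(x, background, n_draws, seed):
    # Monte Carlo estimate: for each draw, pick a random feature order and a
    # random background row, then switch features to the values in x one by one
    # and credit each feature with the change in prediction it causes.
    rng = random.Random(seed)
    p = len(x)
    phi = [0.0] * p
    for _ in range(n_draws):
        order = list(range(p))
        rng.shuffle(order)
        z = list(rng.choice(background))
        prev = model(z)
        for j in order:
            z[j] = x[j]
            cur = model(z)
            phi[j] += cur - prev
            prev = cur
    return [v / n_draws for v in phi]


# Made-up background data: columns are (age, sex).
background = [[40.0, 0], [50.0, 1], [60.0, 0], [70.0, 1]]
x = [70.0, 1]

est_a = sampled_shapley(x, background, n_draws=20, seed=1)
est_b = sampled_shapley(x, background, n_draws=20, seed=2)
print(est_a, est_b)
```

Fixing the seed makes a run reproducible, and increasing the number of draws shrinks the spread between runs. Comparing rankings across several seeds is one way to check whether a change in variable order is real or just sampling noise.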

@hbaniecki hbaniecki added the question ❔ Further information is requested label Jun 13, 2024