Implementation of DeepExplain for multi-label classifier in Keras #44

Open
amjass12 opened this issue Aug 3, 2019 · 2 comments

@amjass12

amjass12 commented Aug 3, 2019

Hi all,

I was wondering if someone could explain the scores DeepExplain returns for an occlusion analysis I am attempting to carry out. I am unsure whether a positive score means the feature contributes positively to the model's output or whether the scores mean something entirely different.

Adding to the confusion, this is a multi-label classification, so one sample may have more than one label. The code is as follows:

import pandas as pd
from keras import backend as K
from keras import Model
from deepexplain.tensorflow import DeepExplain

# `model` is the already-trained Keras classifier.
with DeepExplain(session=K.get_session()) as de:
    # Rebuild the model inside the DeepExplain context.
    input_tensor = model.layers[0].input
    # The final layer has 24 outputs, one per class.
    fModel = Model(inputs=input_tensor, outputs=model.layers[-1].output)
    target_tensor = fModel(input_tensor)

    xs = X_train[0:24]  # ... this is confusing as X_train contains many samples, however xs must match ys?
    ys = y_train[0:24]  # ... y_train, number of classes

    attributions = de.explain('occlusion', target_tensor, input_tensor, xs, ys=ys)
    print("Attributions:\n", attributions)

    attributions = attributions.transpose()
    attributions = pd.DataFrame(attributions, index=X_train1.index)

The scores I get range from negative to positive values. Would somebody be kind enough to correct my code? I am fairly sure it is wrong, especially as I am confused about xs and ys. Could you also explain the output values: what negative (-ve) values mean, what positive (+ve) values mean, and what range I should expect them to be in?

Many thanks!

@marcoancona
Owner

marcoancona commented Aug 6, 2019

Hi,
Positive and negative scores indicate positive and negative contributions to the target output, respectively.
Your code seems generally correct to me, but I would need to know what your xs and ys are to be more precise. You also mention -ve and +ve, but what are these?
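
To make the sign convention concrete, here is a minimal NumPy sketch of the general occlusion idea on a toy multi-label model. It is not DeepExplain's implementation: the toy model, its weights, the zero baseline and the `occlusion_attributions` helper are all assumptions made up for illustration. Each feature's score is the masked output with the feature present minus the masked output with the feature replaced by the baseline.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def toy_model(x, weights, bias):
    """Hypothetical multi-label model: one independent sigmoid per class."""
    return sigmoid(x @ weights + bias)


def occlusion_attributions(x, y_mask, weights, bias, baseline=0.0):
    """Score each feature as (masked output with the feature present)
    minus (masked output with the feature replaced by `baseline`)."""
    full = np.sum(toy_model(x, weights, bias) * y_mask)
    scores = np.zeros_like(x, dtype=float)
    for i in range(x.shape[0]):
        occluded = x.copy()
        occluded[i] = baseline
        scores[i] = full - np.sum(toy_model(occluded, weights, bias) * y_mask)
    return scores


# Made-up numbers: 3 features, 2 classes.
weights = np.array([[2.0, -1.0],
                    [-3.0, 0.5],
                    [0.0, 0.0]])
bias = np.zeros(2)
x = np.array([1.0, 1.0, 1.0])
y_mask = np.array([1.0, 0.0])  # explain class 0 only

scores = occlusion_attributions(x, y_mask, weights, bias)
print(scores)
```

In this toy setup, feature 0 pushes the class-0 output up, so removing it lowers the output and its score is positive; feature 1 pushes the output down, so its score is negative; feature 2 has zero weight and scores zero. When the mask selects a single sigmoid output, each score is a difference of two values in [0, 1], so it lies in [-1, 1]. If the mask selects several labels at once, the scores for those labels are summed and the range widens accordingly.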

@amjass12
Author

amjass12 commented Aug 6, 2019

Hi @marcoancona

Thank you for your response!

Yes, that makes sense! That was my assumption too. The confusing thing for me is that, for the target output of a given class, the positive samples do not necessarily reflect this: when I look at the features, they are highly variable and do not seem representative of the class, which is why I am unsure why they are given positive scores. If I compute SHAP importance values, I get features that show a unique importance for some classes, and the raw values of those features for a given class reflect this.

This is why I thought my xs and ys were wrong. xs is my X_train (training samples) array, with shape (number of samples, number of features). I therefore think my subset of [0:24] is wrong, as 24 is the number of labels in my data. ys is my one-hot encoded label array: it has 24 columns (24 labels) and specifies, for each sample, which classes it belongs to (0, 1, 0, etc.).

-ve and +ve was just a question about the output of DeepExplain: what do negative and positive values mean? However, you answered this! Thanks!
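
The shapes described above can be sketched with random stand-in data. All sizes and arrays here are hypothetical, and the sketch assumes (as the question suggests) that xs and ys need the same number of rows. It shows that slicing along the first axis selects samples, not classes, and one possible way to build a mask that picks out a single class.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_features, n_classes = 100, 10, 24  # made-up sizes

# Stand-ins for the real training data.
X_train = rng.normal(size=(n_samples, n_features))
# Multi-hot labels: each sample may have several classes switched on.
y_train = (rng.random(size=(n_samples, n_classes)) < 0.2).astype(np.float32)

# Slicing along axis 0 picks samples; the class axis is untouched.
n_explain = 5
xs = X_train[:n_explain]  # shape (5, 10): 5 samples, 10 features
ys = y_train[:n_explain]  # shape (5, 24): same 5 samples, 24 labels

# The number of rows must agree; it has nothing to do with n_classes.
assert xs.shape[0] == ys.shape[0]
assert ys.shape[1] == n_classes

# To look at one class at a time, use a mask with a single active column.
class_idx = 3  # hypothetical class of interest
ys_single = np.zeros_like(ys)
ys_single[:, class_idx] = 1.0

print(xs.shape, ys.shape, ys_single.shape)
```

Under these assumptions, `X_train[0:24]` simply selects the first 24 samples; it only looks related to the label count because both numbers happen to be 24. Explaining one class at a time with a mask like `ys_single` may also make the scores easier to compare with per-class SHAP values than a multi-hot mask that mixes several labels.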
