Issues about interpreting a CNN-LSTM model using DeepExplain #72

Open
yilululu opened this issue May 24, 2023 · 1 comment
@yilululu

I want to build a CNN+LSTM hybrid model on gene data with input shape (5516, 1) to predict a symptom score, which is a continuous variable. The architecture is shown below:

# assumed imports (not shown in the original post)
import tensorflow as tf
import tensorflow_addons as tfa
from tensorflow import keras
from tensorflow.keras import layers

# define the CNN+LSTM model
def build_model(input_shape=(5516, 1), kernel=3, drop=0.2):
    inputs = keras.Input(shape=input_shape, name='input')
    x = layers.Conv1D(16, kernel,
                      activation='relu',
                      padding='valid',
                      strides=2,
                      use_bias=True,
                      kernel_initializer=keras.initializers.glorot_normal,
                      bias_initializer=keras.initializers.glorot_normal,
                      )(inputs)
    x = layers.MaxPool1D(pool_size=2)(x)
    x = layers.Dropout(drop)(x)
    x = layers.BatchNormalization()(x)

    lstm = layers.LSTM(32,
                       use_bias=True,
                       kernel_initializer=keras.initializers.glorot_normal,
                       bias_initializer=keras.initializers.glorot_normal)(x)
    x = layers.Dropout(drop)(lstm)
    x = layers.BatchNormalization()(x)
    x = layers.Dense(1,
                     activation=keras.activations.linear,
                     use_bias=True,
                     kernel_initializer=keras.initializers.glorot_normal,
                     bias_initializer=keras.initializers.glorot_normal)(x)

    model = keras.Model(inputs, x)
    return model

# build and compile the model
model = build_model(input_shape=(5516, 1), drop=0.2)
model.summary()
model.compile(optimizer=keras.optimizers.Adam(learning_rate=0.005),
              loss=keras.losses.mse,
              metrics=[tfa.metrics.RSquare()])

Using the "saliency" and "e-LRP" methods, I successfully calculated attribution maps for the 5516 inputs.

# assumed imports (not shown in the original post)
from deepexplain.tensorflow import DeepExplain
from tensorflow.keras import backend as K

with DeepExplain(session=K.get_session()) as de:
    input_tensor = model.layers[0].input
    print(input_tensor)
    # Target the model output, i.e. the last dense layer (linear activation).
    # To do so, create a new model sharing the same layers up to that output.
    fModel = keras.Model(inputs=input_tensor, outputs=model.output)
    target_tensor = fModel(input_tensor)
    print(target_tensor, model.output)
    attributions = de.explain('saliency', target_tensor, input_tensor,
                              train_gene, ys=train_label)
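For completeness, the e-LRP map mentioned above is obtained inside the same DeepExplain context by switching the method key ('elrp' is DeepExplain's identifier for epsilon-LRP); a sketch:

    # Inside the same `with DeepExplain(...)` block as above:
    attributions_elrp = de.explain('elrp', target_tensor, input_tensor,
                                   train_gene, ys=train_label)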

However, with both methods I observed a concentration of high attribution scores in the last chromosome (i.e., the last 200 inputs). This is a potential problem, because input features should ideally contribute to the trained model in a genome-wide manner, with the top attribution scores distributed more evenly.

In addition, we constructed a CNN-based model and interpreted it, which yielded more favorable results, with top attribution scores spread across a wider range of input features.

I am seeking suggestions on this issue: my aim is to identify the most relevant input features (i.e., genetic loci), but the current results are difficult to interpret.
Would it be advisable to split the attribution map by chromosome and identify the most significant features within each range, roughly as sketched below?
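
As a possible starting point, the per-chromosome idea could look roughly like this (chrom_bounds is a hypothetical list of (start, end) index ranges mapping the 5516 inputs to chromosomes, and attributions is the array returned by de.explain above):

import numpy as np

def top_features_per_chromosome(attributions, chrom_bounds, k=10):
    # attributions: (N, 5516, 1) array returned by de.explain
    # chrom_bounds: hypothetical list of (start, end) index ranges, one per chromosome
    mean_attr = np.abs(attributions).mean(axis=0).squeeze()  # average |attribution| per input
    top = {}
    for chrom, (start, end) in enumerate(chrom_bounds, start=1):
        scores = mean_attr[start:end]
        order = np.argsort(scores)[::-1][:k]                 # top-k within this chromosome
        top[chrom] = [(start + i, scores[i]) for i in order]
    return top

# e.g. top = top_features_per_chromosome(attributions, chrom_bounds, k=20)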

Thanks for any help.

Yilu Zhao

@Zhangyang823

Zhangyang823 commented May 24, 2023 via email
