AI-based workflow for the Identification of pivotal biomarkers that distinguish MGUS and MM

Introduction

This repository contains the code for classifying MGUS (monoclonal gammopathy of undetermined significance) and MM (multiple myeloma), and for identifying the pivotal biomarkers that help distinguish MGUS from MM, using an AI-based workflow.

Project Structure

|----- BDL_SP_Model_results
          |----- Supplementory_File_1_Significant_Genes.xlsx
          |----- Supplementory_File_2_significant_pathways_MM_MGUS.xlsx
          |----- Supplementory_File_3_SHAP_Analysis_Beeswarm_plot.xlsx
          |----- Supplementory_File_4_combinedSHAPRanking.xlsx
          |----- Supplementory_File_5_Sanky_Diagrams.docx
          |----- Supplementory_File_6_Graph_Convolutional_Network.docx
          |----- Supplementory_File_7_Pseudo_codes_best_shap_score_estimation.docx

|----- LICENSE
|----- figures
          |----- bdl-sp-architecture_v6.jpg
|----- src
          |----- bdl-sp-top-feature-extraction.py
          |----- Notebooks
                   |----- BDL_SP_SHAP_Analysis.ipynb
                   |----- samplewise_shap_analysis.ipynb
                   |----- shap_individual_feature_plot.ipynb
|----- README.md
|----- requirements.txt

System Requirements

The ML code has so far been tested only on Linux.

LINUX Operating System:

• 64-bit, 8.00 GB RAM

• OS version used for this pipeline: Ubuntu 18.04.

Prerequisites

All prerequisites are listed in requirements.txt.
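
Before running the pipeline it can help to confirm that the listed packages are actually installed. The following is a minimal sketch, not part of the repository: the helper names `requirement_names` and `missing_packages` are hypothetical, the parsing only handles simple `name==version` style lines, and `importlib.metadata` assumes Python 3.8 or newer.

```python
import re
from importlib import metadata


def requirement_names(lines):
    """Extract bare package names from simple requirements-style lines."""
    names = []
    for line in lines:
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or line.startswith("-"):  # skip blanks and pip options
            continue
        # Cut the line at the first version specifier, extra, or marker.
        names.append(re.split(r"[<>=!~\[;\s]", line, maxsplit=1)[0])
    return names


def missing_packages(names):
    """Return the names that are not installed in the current environment."""
    missing = []
    for name in names:
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing
```

Passing the open requirements.txt file handle to `requirement_names` and the result to `missing_packages` would list any package that still needs to be installed.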

Data Preparation for model training

Start with BAM files from WES data.
Generate VCF files with four variant callers: MuSE, Mutect2, Somatic-Sniper and Varscan2.
Annotate these VCF files with ANNOVAR.
Identify significantly mutated genes (SNVs) from the annotated VCF files using dndscv (.csv output).
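
As an illustration of the last step, the sketch below filters a dndscv-style CSV for significant genes. It is an assumption-laden example rather than the repository's code: the column names `gene_name` and `qglobal_cv` follow the usual dndscv output table, the 0.1 q-value cutoff is a placeholder, and the function name `significant_genes` and the toy rows are hypothetical.

```python
import csv
import io


def significant_genes(csv_text, gene_col="gene_name", q_col="qglobal_cv", q_max=0.1):
    """Return gene names with q-value <= q_max, ordered by ascending q-value."""
    reader = csv.DictReader(io.StringIO(csv_text))
    hits = []
    for row in reader:
        try:
            q_value = float(row[q_col])
        except (KeyError, TypeError, ValueError):
            continue  # skip rows with a missing or non-numeric q-value
        if q_value <= q_max:
            hits.append((q_value, row[gene_col]))
    return [gene for _, gene in sorted(hits)]


# Toy rows with made-up values, only to show the expected input shape.
demo = "gene_name,qglobal_cv\nGENE_A,0.0001\nGENE_B,0.8\nGENE_C,0.002\nGENE_D,NA\n"
print(significant_genes(demo))
```

The cutoff and column names should be adjusted to match the actual dndscv output used for MGUS and MM.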

Model Training

To train the model, follow these steps in order:

• Obtain the annotated VCF files and the significantly mutated genes for MGUS and MM.

• Run bdl-sp-top-feature-extraction.py to train the cost-sensitive BDL-SP model with 5-fold cross-validation.

• Once the BDL-SP model is trained, open BDL_SP_SHAP_Analysis.ipynb, samplewise_shap_analysis.ipynb and shap_individual_feature_plot.ipynb for group-level and sample-level post-hoc model explainability using the SHAP algorithm.
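
The ideas behind these steps can be sketched in plain Python. This is not the BDL-SP implementation: the README does not specify how folds are drawn, how misclassification costs are set, or how SHAP values are aggregated, so the stratified round-robin split, the inverse-frequency ("balanced") class weights, and the mean absolute SHAP ranking below are common choices assumed for illustration, and all function names are hypothetical.

```python
import random
from collections import Counter, defaultdict


def stratified_folds(labels, k=5, seed=0):
    """Split sample indices into k folds with similar class proportions."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)  # deal each class round-robin
    return folds


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {label: n_samples / (n_classes * count) for label, count in counts.items()}


def rank_features_by_mean_abs_shap(shap_rows, feature_names):
    """Order features by mean |SHAP value| across samples, largest first."""
    n_samples = len(shap_rows)
    means = [
        sum(abs(row[j]) for row in shap_rows) / n_samples
        for j in range(len(feature_names))
    ]
    return sorted(zip(feature_names, means), key=lambda pair: pair[1], reverse=True)


# Toy labels: an imbalanced cohort, so the minority class gets a larger weight.
labels = ["MM"] * 15 + ["MGUS"] * 5
for fold_no, test_idx in enumerate(stratified_folds(labels), start=1):
    held_out = set(test_idx)
    train_labels = [lab for i, lab in enumerate(labels) if i not in held_out]
    print(fold_no, sorted(test_idx), balanced_class_weights(train_labels))
```

In a cost-sensitive setup of this kind, the per-class weights would typically scale each sample's contribution to the training loss, so that errors on the rarer class cost more. The ranking helper expects one row of SHAP values per sample, as produced by a SHAP explainer for a single output class.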

Citation

If you use BDL-SP for your research, please cite the following paper:

Ruhela, V., Jena, L., Kaur, G., Gupta, R. and Gupta, A., 2023. BDL-SP: A Bio-inspired DL model for the identification of altered Signaling Pathways in Multiple Myeloma using WES data. American Journal of Cancer Research, 13(4), p.1155.

License

See the LICENSE file for license rights and limitations (Apache 2.0).

Acknowledgements

The authors gratefully acknowledge the grants from the Department of Biotechnology, Govt. of India [Grant: BT/MED/30/SP11006/2015] and the Department of Science and Technology, Govt. of India [Grant: DST/ICPS/CPS-Individual/2018/279(G)].
The authors gratefully acknowledge the support of SBILab, Dept. of ECE & Centre of Excellence in Healthcare, Indraprastha Institute of Information Technology-Delhi (IIIT-D), India, for providing guidance on tool methodology and development.
The authors gratefully acknowledge the support of the Computational Biology Dept., Indraprastha Institute of Information Technology-Delhi (IIIT-D), India, for providing resources for tool development.
