This script reads a list of SMILES (Simplified Molecular Input Line Entry System) strings from a file, converts them into chemical structures using RDKit, and displays the structures as images directly within a Jupyter notebook. Each structure is labeled with its entry number.

quantaosun/Smiles_to_Structure

Smiles_to_Structure

Open In Colab

Tutorial:

  1. Generate a csv file. For example, for a specific protein target, you can download a csv file from the ChEMBL database.

  2. Open this csv file in Excel or Apple Numbers and export it to csv again, since the downloaded csv may have minor formatting issues that this code does not recognise.

  3. Upload the new csv file to the notebook via the Colab badge link. The notebook is then ready to process the SMILES in the csv you provided.

  4. Read the file, changing the file name to match yours: `df = pd.read_csv('11.csv')`, then `df` to preview it.

  5. Change the column number to match where the SMILES are stored in your csv file. awk fields are numbered from 1, so if the SMILES are in the 8th column, the command is `!awk -F "\"*,\"*" '{print $8}' 11.csv > smile.smi`

  6. (Optional) Delete rows that are not SMILES, and export the remaining SMILES strings to a new csv file.
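
The column-extraction step above can also be sketched in plain Python instead of `awk`. This is only an illustration under stated assumptions: the file is comma-separated with a header row, and the function name `extract_smiles` and its zero-based `column` parameter are hypothetical, not part of the notebook.

```python
import csv
import io


def extract_smiles(csv_text, column, has_header=True):
    """Return the non-empty values found in one column of CSV text.

    `column` is zero-based, so the 8th column (awk's $8) is column=7.
    """
    reader = csv.reader(io.StringIO(csv_text))
    if has_header:
        next(reader, None)  # skip the header row, if there is one
    smiles = []
    for row in reader:
        # Skip short or blank rows instead of raising IndexError.
        if len(row) > column and row[column].strip():
            smiles.append(row[column].strip())
    return smiles


# Tiny in-memory example (made-up data); a real file would be read with
# open("11.csv", newline="").read()
example = "id,name,Smiles\n1,ethanol,CCO\n2,benzene,c1ccccc1\n3,missing,\n"
smiles_list = extract_smiles(example, column=2)
print(smiles_list)
```

Unlike a raw field split, `csv.reader` handles quoted fields that contain commas, which may avoid some of the formatting problems mentioned above. Writing `"\n".join(smiles_list)` to `smile.smi` would give one SMILES per line.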

Display the molecules as images inside Jupyter Notebook
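
The notebook's own drawing code is not reproduced here. As a minimal sketch of how numbered structures could be drawn, assuming RDKit is installed, the example below uses `Chem.MolFromSmiles` and `Draw.MolsToGridImage`. The helper names `number_entries` and `draw_grid` are hypothetical.

```python
def number_entries(smiles_list):
    """Pair each SMILES with its 1-based entry number (as legend text)."""
    return [(str(i), smi) for i, smi in enumerate(smiles_list, start=1)]


def draw_grid(smiles_list, mols_per_row=4):
    """Return an RDKit grid image with each structure labeled by entry number.

    Strings RDKit cannot parse are skipped, but the remaining molecules
    keep their original entry numbers.
    """
    # Imported here so number_entries can be used without RDKit installed.
    from rdkit import Chem
    from rdkit.Chem import Draw

    mols, legends = [], []
    for label, smi in number_entries(smiles_list):
        mol = Chem.MolFromSmiles(smi)  # returns None for an invalid SMILES
        if mol is not None:
            mols.append(mol)
            legends.append(label)
    return Draw.MolsToGridImage(mols, molsPerRow=mols_per_row,
                                subImgSize=(200, 200), legends=legends)


print(number_entries(["CCO", "c1ccccc1"]))
```

In a notebook cell, ending the cell with `draw_grid(smiles_list)` should display the grid inline. Skipping unparsable strings also covers the optional clean-up step for rows that are not SMILES.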
