Smiles_to_Structure

This script reads a list of SMILES (Simplified Molecular Input Line Entry System) strings from a file, converts them into chemical structures using RDKit, and displays the structures as images directly within a Jupyter notebook. Each structure is labeled with its entry number.

Open In Colab

Tutorial:

  1. Generate a CSV file. For example, for a specific protein target, you can download a CSV file from the ChEMBL database.

  2. Open this CSV file in Excel or Numbers (Mac) and export it to CSV again, since the downloaded file may have minor formatting issues that this code does not recognise.

  3. Upload the new CSV file through the Colab badge link. The code can then process the SMILES in the file you provided.

  4. Read the file, changing the file name to match yours: df = pd.read_csv('11.csv'), then evaluate df to preview the table.

  5. Change the column number to match where the SMILES are in your CSV file. awk numbers fields from 1, so if they are stored in the 8th column, the command should be !awk -F "\"*,\"*" '{print $8}' 11.csv > smile.smi

  6. (Delete rows that are not SMILES, and export the remaining SMILES strings to a new CSV file.)
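The column-extraction step can also be sketched in plain Python. This is not the notebook's own code: `extract_column` and its parameters are hypothetical names chosen for illustration. It uses the standard `csv` module, so quoted fields that contain commas are kept together, and it skips the header row and empty cells, which covers part of the clean-up in the last step.

```python
import csv


def extract_column(csv_path, column_index, out_path, skip_header=True):
    """Write one CSV column to out_path, one value per line.

    column_index is 1-based, mirroring awk's $1, $2, ... numbering.
    Hypothetical helper for illustration; not part of the original notebook.
    """
    values = []
    with open(csv_path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        if skip_header:
            next(reader, None)  # drop the header row, if any
        for row in reader:
            if len(row) >= column_index:
                value = row[column_index - 1].strip()
                if value:  # ignore empty cells
                    values.append(value)

    with open(out_path, "w", encoding="utf-8") as out:
        for value in values:
            out.write(value + "\n")
    return values
```

For the example in the steps above, the call would be extract_column("11.csv", 8, "smile.smi"), assuming the SMILES sit in the 8th column of a comma-separated file.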

Display the molecules as images inside Jupyter Notebook
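A minimal sketch of the display step, assuming RDKit is installed and the SMILES are stored one per line in smile.smi. The function names `read_smiles` and `draw_numbered_grid` are hypothetical, and the notebook itself may do this differently. Lines that RDKit cannot parse (for example a leftover header) are skipped, and each structure is labeled with its entry number in the file.

```python
def read_smiles(path):
    """Return the non-empty lines of a .smi file, stripped of whitespace."""
    with open(path, encoding="utf-8") as handle:
        return [line.strip() for line in handle if line.strip()]


def draw_numbered_grid(smiles_list, mols_per_row=4):
    """Draw the parsable SMILES as a grid image, labeled by entry number."""
    # Imported here so read_smiles can be used without RDKit installed.
    from rdkit import Chem
    from rdkit.Chem import Draw

    mols, legends = [], []
    for entry, smi in enumerate(smiles_list, start=1):
        mol = Chem.MolFromSmiles(smi)  # returns None if parsing fails
        if mol is not None:
            mols.append(mol)
            legends.append(str(entry))

    return Draw.MolsToGridImage(
        mols,
        molsPerRow=mols_per_row,
        subImgSize=(200, 200),
        legends=legends,
    )
```

In a notebook cell, ending the cell with draw_numbered_grid(read_smiles("smile.smi")) should render the grid inline.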
