Introduction to Chemoinformatics

Introduction to Chemoinformatics Irene Kouskoumvekaki Associate Professor December 12th, 2012 Biological Sequence Analysis course


Introduction to Chemoinformatics

Irene Kouskoumvekaki, Associate Professor

December 12th, 2012, Biological Sequence Analysis course

2 CBS, Department of Systems Biology

Drug Discovery Process


The drug candidate

... is a (ligand) compound that binds to a biological target (protein, enzyme, receptor, ...) and in this way either initiates a process (agonist) or inhibits it (antagonist/inhibitor)

The structure/conformation of the ligand is complementary to the space defined by the protein’s active site

The binding is caused by favorable interactions between the ligand and the side chains of the amino acids in the active site (electrostatic interactions, hydrogen bonds, hydrophobic contacts, ...).

Wet-lab drug discovery process

Screening collection (10^6 compounds) → HTS → Actives (10^3 actives)

High rate of false actives!

High throughput is not enough to get high output...

Actives → Follow-up (chemical structure, purity, mechanism, activity value) → Hits (1-10 hits)

Hits → Hit-to-lead → Lead series (0-3 lead series)

Lead series → Lead-to-drug → Drug candidate (0-1)

From hits onward: analogues synthesis and testing, ADMET properties

Failures


We need more... to find less...


Drug Discovery Process

Chemoinformatics


Wet-lab + Dry-lab drug discovery

in vitro: Diverse set of molecules tested in the lab

in silico + in vitro: Computational methods to select subsets (to be tested in the lab) based on prediction of drug-likeness, solubility, binding, pharmacokinetics, toxicity, side effects, ...


The Lipinski ‘rule of five’ for drug-likeness prediction

• Molecular weight ≤ 500
• Number of hydrogen bond acceptors (HBA) ≤ 10
• Number of hydrogen bond donors (HBD) ≤ 5
• Octanol-water partition coefficient (logP) ≤ 5 (MlogP ≤ 4.15)

If two or more of these rules are violated, the compound might have problems with oral bioavailability. (Lipinski et al., Adv. Drug Delivery Rev., 23, 1997, 3.)
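The thresholds above translate directly into a small check. What follows is a minimal sketch, not the method used by any particular tool: it assumes the four descriptor values have already been computed elsewhere, and the `Descriptors` class, the function names, and the example values are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Descriptors:
    """Pre-computed properties of one compound (hypothetical container)."""
    mol_weight: float  # molecular weight
    hba: int           # number of hydrogen bond acceptors
    hbd: int           # number of hydrogen bond donors
    logp: float        # octanol-water partition coefficient


def rule_of_five_violations(d: Descriptors) -> List[str]:
    """Return the names of the rule-of-five criteria that the compound violates."""
    checks = {
        "molecular weight <= 500": d.mol_weight <= 500,
        "HBA <= 10": d.hba <= 10,
        "HBD <= 5": d.hbd <= 5,
        "logP <= 5": d.logp <= 5,
    }
    return [name for name, passed in checks.items() if not passed]


def may_have_bioavailability_problems(d: Descriptors) -> bool:
    """Flag a compound when two or more criteria are violated, as stated on the slide."""
    return len(rule_of_five_violations(d)) >= 2


# Illustrative values only; these are not measurements of real compounds.
small_compound = Descriptors(mol_weight=300.0, hba=4, hbd=2, logp=2.5)
large_compound = Descriptors(mol_weight=720.0, hba=12, hbd=3, logp=4.0)

print(rule_of_five_violations(small_compound))
print(rule_of_five_violations(large_compound))
print(may_have_bioavailability_problems(large_compound))
```

A compound with a single violation is not flagged by this sketch, which mirrors the "two or more" wording of the rule.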


Exercise: Prediction of drug-likeness

• Go to the following webpage: www.molsoft.com/mprop

• Draw proguanil and decide if it is a drug-like compound


Proguanil antimalarial tablets


Chemoinformatics

Gathering and systematic use of chemical information, and application of this information to predict the behavior of unknown compounds in silico.

data → prediction


Major Aspects of Chemoinformatics

• Databases: Development of databases for storage and retrieval of small molecule structures and their properties.

• Machine learning: Training of Decision Trees, Neural Networks, Self Organizing Maps, etc. on molecular data.

• Predictions: Molecular properties relevant to drugs, virtual screening of chemical libraries, systems chemical biology networks...

Representing a chemical structure

• How much information do you want to include?
– atoms present (e.g. the molecular formula C8H9NO3)
– connections between atoms
– bond types (aromatic ring identification)
– stereochemical configuration
– charges (e.g. NH3+)
– isotopes (e.g. 14C)
– 3D-coordinates for atoms
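The coarsest level in the list above, "atoms present", corresponds to a molecular formula such as C8H9NO3. The sketch below parses such a formula into atom counts and sums a molecular weight. It is a simplified illustration: the function names are hypothetical, the atomic weights are rounded standard values for a handful of elements only, and parentheses or charges in the formula are not handled.

```python
import re

# Rounded standard atomic weights for the few elements used in this sketch.
ATOMIC_WEIGHT = {"H": 1.008, "C": 12.011, "N": 14.007, "O": 15.999, "Cl": 35.45}


def parse_formula(formula: str) -> dict:
    """Split a simple molecular formula into {element symbol: count}."""
    counts = {}
    # Each match is an element symbol followed by an optional count.
    for symbol, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[symbol] = counts.get(symbol, 0) + (int(number) if number else 1)
    return counts


def molecular_weight(formula: str) -> float:
    """Sum the atomic weights of all atoms in the formula."""
    return sum(ATOMIC_WEIGHT[symbol] * n for symbol, n in parse_formula(formula).items())


print(parse_formula("C8H9NO3"))
print(round(molecular_weight("C8H9NO3"), 3))
```

A formula says nothing about how the atoms are connected, so many different structures (isomers) share the same formula. That is why the further levels of the list (connections, bond types, stereochemistry, and so on) are needed to identify a compound.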


From chemists to representations


Structural representation of molecules

• Line notations

• Connection tables


SMILES (Simplified Molecular Input Line Entry System)

Canonical SMILES: unique for each structure

Isomeric SMILES: describe isotopism, configuration around double bonds and tetrahedral centers, chirality
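One molecule can be written as several different valid SMILES strings, which is why a canonical form is useful for database lookups. The sketch below only counts the atoms written in a SMILES string, to show that different spellings describe the same composition. It is neither a canonicalizer nor a validator, implicit hydrogens are not counted, and the function name and regular expression are assumptions made for this illustration. The proguanil string is one hand-written tautomer, so a database's canonical SMILES may look different.

```python
import re
from collections import Counter

# Bracket atoms such as [NH3+] or [13CH4], then organic-subset atoms, then aromatic atoms.
_ATOM = re.compile(
    r"\[(\d*)([A-Z][a-z]?|[a-z]{1,2})[^\]]*\]"  # [isotope? symbol ...]
    r"|(Cl|Br|[BCNOPSFI])"                       # aliphatic organic subset
    r"|([bcnops])"                               # aromatic organic subset
)


def count_atoms(smiles: str) -> dict:
    """Count the atoms written explicitly in a SMILES string, by element."""
    counts = Counter()
    for match in _ATOM.finditer(smiles):
        symbol = match.group(2) or match.group(3) or match.group(4)
        counts[symbol.capitalize()] += 1  # aromatic 'c' is counted as 'C'
    return dict(counts)


# Three spellings of ethanol give the same atom counts.
for smiles in ("CCO", "OCC", "C(O)C"):
    print(smiles, count_atoms(smiles))

# Proguanil, written by hand as one tautomer.
print(count_atoms("CC(C)NC(=N)NC(=N)Nc1ccc(Cl)cc1"))
```

Bonds, branches, and ring-closure digits are skipped by the pattern, so the result reflects composition only. Telling structures apart requires a cheminformatics toolkit that reads the full connectivity.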


InChI (IUPAC International Chemical Identifier)


MOLfile format (.sdf)
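A MOLfile is a connection table: a three-line header, a counts line, one line per atom (coordinates and element), and one line per bond (two atom indices and a bond type). An SDF file holds one or more such records, each optionally followed by data fields and terminated by a line containing `$$$$`. The sketch below reads a tiny hand-written V2000 record. The record, its coordinates, and the `parse_molfile` function are made up for illustration, and only the fields shown are read.

```python
# A hand-written V2000 record for methanol, heavy atoms only (illustrative).
MOLFILE = "\n".join([
    "methanol",                                  # header line 1: name
    "  hand-written example",                    # header line 2: program info
    "",                                          # header line 3: comment
    "  2  1  0  0  0  0  0  0  0  0999 V2000",   # counts line: 2 atoms, 1 bond
    "    0.0000    0.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0",
    "    1.4300    0.0000    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0",
    "  1  2  1  0  0  0  0",                     # bond: atom 1 - atom 2, single
    "M  END",
])


def parse_molfile(text: str):
    """Return (atoms, bonds) from a V2000 MOLfile record."""
    lines = text.splitlines()
    counts_line = lines[3]
    n_atoms = int(counts_line[0:3])  # fixed-width fields, 3 characters each
    n_bonds = int(counts_line[3:6])

    atoms = []
    for line in lines[4:4 + n_atoms]:
        x, y, z, symbol = line.split()[:4]
        atoms.append((symbol, float(x), float(y), float(z)))

    bonds = []
    for line in lines[4 + n_atoms:4 + n_atoms + n_bonds]:
        # first atom index, second atom index, bond type (1 = single, 2 = double, ...)
        bonds.append((int(line[0:3]), int(line[3:6]), int(line[6:9])))
    return atoms, bonds


atoms, bonds = parse_molfile(MOLFILE)
print(atoms)
print(bonds)
```

In a 2D record the z coordinates are all zero, while a 3D record carries a full set of x, y, z coordinates for each atom.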


Small molecule databases

Try it yourself!

• Go to PubChem: pubchem.ncbi.nlm.nih.gov/

• Type proguanil and press Go

• Click on the first result on the list

• Scroll down and find the SMILES and InChI

• Click on SDF (top right icon)

• Select: 2D SDF: Display

• Go back and click again on SDF

• Select: 3D SDF: Display
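The same records can also be requested programmatically through PubChem's PUG REST interface. The sketch below only builds request URLs and does not contact the server. The helper names are hypothetical, and the URL layout and property names are assumptions that should be checked against the current PUG REST documentation, in particular the property names for SMILES, which are not included here.

```python
from urllib.parse import quote

PUG_REST = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"


def property_url(name: str, properties=("MolecularFormula", "InChI")) -> str:
    """URL for selected computed properties of a compound looked up by name, as JSON."""
    return f"{PUG_REST}/compound/name/{quote(name)}/property/{','.join(properties)}/JSON"


def sdf_url(name: str, three_d: bool = False) -> str:
    """URL for the SDF record of a compound; 2D by default, 3D on request."""
    url = f"{PUG_REST}/compound/name/{quote(name)}/SDF"
    return url + "?record_type=3d" if three_d else url


print(property_url("proguanil"))
print(sdf_url("proguanil"))
print(sdf_url("proguanil", three_d=True))
```

Opening one of the printed URLs in a browser, or fetching it with `urllib.request.urlopen`, should return the corresponding record if the assumptions above hold.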


Questions?