Introduction to Chemoinformatics Irene Kouskoumvekaki Associate Professor December 12th, 2012 Biological Sequence Analysis course

Introduction to Chemoinformatics

Irene Kouskoumvekaki, Associate Professor

December 12th, 2012

Biological Sequence Analysis course

CBS, Department of Systems Biology

Drug Discovery Process

The drug candidate

... is a (ligand) compound that binds to a biological target (protein, enzyme, receptor, ...) and in this way either initiates a process (agonist) or inhibits it (antagonist/inhibitor)

The structure/conformation of the ligand is complementary to the space defined by the protein’s active site

The binding is caused by favorable interactions between the ligand and the side chains of the amino acids in the active site (electrostatic interactions, hydrogen bonds, hydrophobic contacts, ...).

Wet-lab drug discovery process

Screening collection (10^6 compounds) → HTS → Actives (10^3 actives)

Wet-lab drug discovery process

Screening collection (10^6 compounds) → HTS → Actives (10^3 actives)

High rate of false actives!

High throughput is not enough to get high output...

Wet-lab drug discovery process

Screening collection (10^6 compounds) → HTS → Actives (10^3 actives)

Follow-up: chemical structure, purity, mechanism, activity value

Wet-lab drug discovery process

Screening collection (10^6 compounds) → HTS → Actives (10^3 actives) → Follow-up → Hits (1-10 hits)

Analogues synthesis and testing

ADMET properties

Wet-lab drug discovery process

Screening collection (10^6 compounds) → HTS → Actives (10^3 actives) → Follow-up → Hits (1-10 hits) → Hit-to-lead → Lead series (0-3 lead series)

Analogues synthesis and testing

ADMET properties

Wet-lab drug discovery process

Screening collection (10^6 compounds) → HTS → Actives (10^3 actives) → Follow-up → Hits (1-10 hits) → Hit-to-lead → Lead series (0-3 lead series) → Lead-to-drug → Drug candidate (0-1)

Analogues synthesis and testing

ADMET properties
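
To make the attrition in this funnel concrete, the short Python sketch below converts the stage counts on the slide (10^6 compounds screened, 10^3 actives, 1-10 hits, 0-3 lead series, 0-1 drug candidate) into fractions of the screening collection. The names `FUNNEL` and `fraction_of_collection` are illustrative and not part of the lecture.

```python
# Stage counts as (low, high) ranges, read off the wet-lab funnel slide.
FUNNEL = {
    "screening collection": (10**6, 10**6),
    "actives": (10**3, 10**3),
    "hits": (1, 10),
    "lead series": (0, 3),
    "drug candidate": (0, 1),
}


def fraction_of_collection(stage):
    """Return the (low, high) fraction of screened compounds reaching a stage."""
    total = FUNNEL["screening collection"][0]
    low, high = FUNNEL[stage]
    return low / total, high / total


if __name__ == "__main__":
    for stage in FUNNEL:
        low, high = fraction_of_collection(stage)
        print(f"{stage}: {low:.0e} to {high:.0e} of the collection")
```

Read this way, about one screened compound in a thousand is flagged as active, and at most about one in a hundred thousand survives follow-up as a hit.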

Failures

We need more... to find less...

Drug Discovery Process

Chemoinformatics

Wet-lab + Dry-lab drug discovery

in vitro: Diverse set of molecules tested in the lab

in silico + in vitro: Computational methods to select subsets (to be tested in the lab) based on prediction of drug-likeness, solubility, binding, pharmacokinetics, toxicity, side effects, ...

The Lipinski ‘rule of five’ for drug-likeness prediction

•Molecular weight ≤ 500
•Number of hydrogen bond acceptors (HBA) ≤ 10
•Number of hydrogen bond donors (HBD) ≤ 5
•Octanol-water partition coefficient (logP) ≤ 5 (MlogP ≤ 4.15)

If two or more of these rules are violated, the compound might have problems with oral bioavailability. (Lipinski et al., Adv. Drug Delivery Rev., 23, 1997, 3.)
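
The rule of five is easy to express as a filter once the four descriptors are known. The sketch below is a minimal illustration in Python that takes already computed descriptor values as plain numbers. The function names and the two example compounds are hypothetical, and the thresholds are the ones listed on this slide (logP ≤ 5 variant).

```python
def rule_of_five_violations(mol_weight, hba, hbd, logp):
    """Return the list of Lipinski rules that the given descriptors violate."""
    violations = []
    if mol_weight > 500:
        violations.append("molecular weight > 500")
    if hba > 10:
        violations.append("hydrogen bond acceptors > 10")
    if hbd > 5:
        violations.append("hydrogen bond donors > 5")
    if logp > 5:
        violations.append("logP > 5")
    return violations


def may_have_bioavailability_problems(mol_weight, hba, hbd, logp):
    """Flag a compound when two or more of the rules are violated."""
    return len(rule_of_five_violations(mol_weight, hba, hbd, logp)) >= 2


if __name__ == "__main__":
    # Hypothetical descriptor values, chosen only to exercise the filter.
    small = dict(mol_weight=250.0, hba=3, hbd=2, logp=2.1)
    large = dict(mol_weight=720.0, hba=12, hbd=4, logp=6.3)
    for name, desc in (("small", small), ("large", large)):
        print(name, rule_of_five_violations(**desc),
              may_have_bioavailability_problems(**desc))
```

In practice the descriptor values would come from a property calculator such as the web tool used in the exercise, and different logP estimators can give different answers for borderline compounds.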

Exercise: Prediction of drug-likeness

•Go to the following webpage: www.molsoft.com/mprop

•Draw proguanil and decide if it is a drug-like compound

Proguanil antimalarial tablets

Chemoinformatics

Gathering and systematic use of chemical information, and application of this information to predict the behavior of unknown compounds in silico.

data → prediction

Major Aspects of Chemoinformatics

•Databases: Development of databases for storage and retrieval of small molecule structures and their properties.

•Machine learning: Training of Decision Trees, Neural Networks, Self Organizing Maps, etc. on molecular data.

•Predictions: Molecular properties relevant to drugs, virtual screening of chemical libraries, system chemical biology networks…

Representing a chemical structure

• How much information do you want to include?
– atoms present
– connections between atoms
    • bond types
– stereochemical configuration
– charges
– isotopes
– 3D-coordinates for atoms

C8H9NO3
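
The levels of information listed above map naturally onto a small data structure. The following Python sketch is one possible layout, not a standard library: the classes `Atom`, `Bond` and `Molecule` are hypothetical names, stereochemical configuration is left out for brevity, and ethanol with explicit hydrogens is used as an arbitrary example. A molecular formula only needs the "atoms present" level, while bonds, charges, isotopes and coordinates add progressively more detail.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Atom:
    element: str
    charge: int = 0                      # formal charge
    isotope: Optional[int] = None        # mass number, e.g. 14 for carbon-14
    coords: Optional[Tuple[float, float, float]] = None  # 3D position


@dataclass
class Bond:
    begin: int          # index of the first atom
    end: int            # index of the second atom
    order: float = 1    # 1, 2, 3, or 1.5 for an aromatic bond


@dataclass
class Molecule:
    atoms: List[Atom] = field(default_factory=list)
    bonds: List[Bond] = field(default_factory=list)

    def formula(self):
        """Molecular formula in Hill order (C, H, then alphabetical)."""
        counts = Counter(atom.element for atom in self.atoms)
        if "C" in counts:
            rest = sorted(e for e in counts if e not in ("C", "H"))
            order = ["C"] + (["H"] if "H" in counts else []) + rest
        else:
            order = sorted(counts)
        return "".join(e + (str(counts[e]) if counts[e] > 1 else "")
                       for e in order)


def ethanol():
    """Ethanol, CH3-CH2-OH, with explicit hydrogens."""
    atoms = [Atom("C"), Atom("C"), Atom("O")] + [Atom("H") for _ in range(6)]
    pairs = [(0, 1), (1, 2), (0, 3), (0, 4), (0, 5), (1, 6), (1, 7), (2, 8)]
    return Molecule(atoms, [Bond(i, j) for i, j in pairs])


if __name__ == "__main__":
    print(ethanol().formula())
```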

Representing a chemical structure

[Figure: structural formula of the example molecule, with atom labels OH, CH2, CH, NH2, OH, O]

Representing a chemical structure

• bond types (aromatic ring identification)

[Figure: structural formula of the example molecule, with atom labels OH, CH2, CH, NH2, OH, O]

Representing a chemical structure

[Figure: structural formula of the example molecule, with atom labels OH, CH2, CH, NH2, OH, O]

Representing a chemical structure

[Figure: structural formula of the example molecule with a charged group, with atom labels OH, CH2, CH, NH3+, O, O]

Representing a chemical structure

[Figure: structural formula of the example molecule with an isotope label, with atom labels OH, CH2, C14, H, NH2, OH, O]

From chemists to representations

Structural representation of molecules

Line notations

Connection tables

SMILES (Simplified Molecular Input Line Entry System)

Canonical SMILES: unique for each structure

Isomeric SMILES: describe isotopism, configuration around double bonds and tetrahedral centers, chirality
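
A SMILES string is a sequence of atom, bond, branch and ring-closure symbols, which the small Python sketch below splits into tokens. It is only a simplified tokenizer for common SMILES symbols, not a parser or a canonicalizer, and `tokenize` and `atom_tokens` are hypothetical helper names. It also shows why canonical SMILES are needed: `CCO` and `OCC` are different strings for the same molecule (ethanol).

```python
import re

SMILES_TOKEN = re.compile(
    r"(\[[^\]]+\]"      # bracket atom, e.g. [NH3+], [14C], [C@@H]
    r"|Br|Cl"           # two-letter atoms of the organic subset
    r"|[BCNOPSFI]"      # one-letter aliphatic atoms
    r"|[bcnops]"        # aromatic atoms
    r"|[-=#:/\\]"       # bond symbols, including / and \ for double-bond geometry
    r"|[()]"            # branch open and close
    r"|%\d{2}|\d"       # ring-closure labels
    r"|\.)"             # disconnected fragments
)


def tokenize(smiles):
    """Split a SMILES string into tokens; fail on symbols not covered here."""
    tokens = SMILES_TOKEN.findall(smiles)
    if "".join(tokens) != smiles:
        raise ValueError(f"unrecognized symbol in {smiles!r}")
    return tokens


def atom_tokens(smiles):
    """Return only the atom tokens (bracket atoms and element symbols)."""
    return [t for t in tokenize(smiles) if t[0] == "[" or t[0].isalpha()]


if __name__ == "__main__":
    print(tokenize("c1ccccc1"))            # benzene, aromatic ring closure
    print(tokenize("N[C@@H](C)C(=O)O"))    # L-alanine, isomeric SMILES
    print(sorted(atom_tokens("CCO")) == sorted(atom_tokens("OCC")))
```

Producing a canonical SMILES requires a graph canonicalization algorithm, which cheminformatics toolkits provide. It is deliberately not attempted here.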

InChI (IUPAC International Chemical Identifier)
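
An InChI string is built from layers separated by slashes: a version, the molecular formula, and then layers such as connectivity (`c`) and hydrogens (`h`). As a rough illustration, the Python sketch below splits the standard InChI of ethanol into these parts. `split_inchi` is a hypothetical helper that only separates the layers and does not validate or interpret them.

```python
def split_inchi(inchi):
    """Split an InChI string into (version, formula, [(layer prefix, content), ...])."""
    prefix = "InChI="
    if not inchi.startswith(prefix):
        raise ValueError("not an InChI string")
    version, formula, *layers = inchi[len(prefix):].split("/")
    # Each remaining layer starts with a one-letter prefix such as 'c' or 'h'.
    return version, formula, [(layer[0], layer[1:]) for layer in layers]


if __name__ == "__main__":
    version, formula, layers = split_inchi("InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3")
    print("version:", version)
    print("formula:", formula)
    for name, content in layers:
        print(f"layer {name}: {content}")
```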

MOLfile format (.sdf)
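
A MOL block is a connection table: a three-line header, a counts line, one line per atom (coordinates and element symbol) and one line per bond (two atom indices and a bond order). An SDF file bundles one or more such blocks. The Python sketch below writes and reads back a minimal V2000-style block using fixed column positions. The helpers `make_molblock` and `parse_molblock` are hypothetical, only the fields named above are handled, and the coordinates are arbitrary illustration values rather than a real geometry.

```python
def make_molblock(name, atoms, bonds):
    """Write a minimal V2000-style MOL block.

    atoms: list of (symbol, x, y, z); bonds: list of (atom1, atom2, order),
    with 1-based atom indices as in the file format.
    """
    lines = [name, "  sketch", ""]                      # three header lines
    lines.append(f"{len(atoms):3d}{len(bonds):3d}" + "  0" * 8 + "999 V2000")
    for symbol, x, y, z in atoms:
        lines.append(f"{x:10.4f}{y:10.4f}{z:10.4f} {symbol:<3}"
                     + " 0" + "  0" * 11)
    for atom1, atom2, order in bonds:
        lines.append(f"{atom1:3d}{atom2:3d}{order:3d}  0")
    lines.append("M  END")
    return "\n".join(lines)


def parse_molblock(block):
    """Read atoms and bonds back from a V2000-style MOL block."""
    lines = block.split("\n")
    counts = lines[3]
    n_atoms, n_bonds = int(counts[0:3]), int(counts[3:6])
    atoms = []
    for line in lines[4:4 + n_atoms]:
        x, y, z = float(line[0:10]), float(line[10:20]), float(line[20:30])
        atoms.append((line[31:34].strip(), x, y, z))
    bonds = []
    for line in lines[4 + n_atoms:4 + n_atoms + n_bonds]:
        bonds.append((int(line[0:3]), int(line[3:6]), int(line[6:9])))
    return atoms, bonds


if __name__ == "__main__":
    # Heavy atoms of ethanol (C-C-O) with made-up 2D coordinates.
    atoms = [("C", 0.0, 0.0, 0.0), ("C", 1.5, 0.0, 0.0), ("O", 2.25, 1.3, 0.0)]
    bonds = [(1, 2, 1), (2, 3, 1)]
    block = make_molblock("ethanol", atoms, bonds)
    print(block)
    print(parse_molblock(block))
```

The 2D and 3D SDF records shown by PubChem follow the same layout and differ in the coordinates stored for each atom.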

Small molecule databases

Try it yourself!

• Go to PubChem: pubchem.ncbi.nlm.nih.gov/

• Type proguanil and press Go

• Click on the first result on the list

Try it yourself!

• Scroll down and find the SMILES and InChI

Try it yourself!

• Click on SDF (top right icon)

• Select: 2D SDF: Display

Try it yourself!

• Go back and click again on SDF

• Select: 3D SDF: Display

Questions?