Chemoinformatics in Drug Design Biological Sequence Analysis, May 6, 2011 Irene Kouskoumvekaki, Associate Professor, Computational Chemical Biology, CBS, DTU-Systems Biology

Chemoinformatics in Drug Design

Biological Sequence Analysis, May 6, 2011

Irene Kouskoumvekaki, Associate Professor, Computational Chemical Biology, CBS, DTU-Systems Biology

CBS, Department of Systems Biology

Computational Chemical Biology group

Irene Kouskoumvekaki, Associate Professor

Olivier Taboureau, Associate Professor

Sonny Kim Nielsen, PhD student

Kasper Jensen, PhD student

Tudor Oprea, Guest Professor

Ulrik Plesner, master student

Definition: Chemoinformatics

Gathering and systematic use of chemical information, and application of this information to predict the behavior of unknown compounds in silico.

data → prediction

Definition: A drug candidate…

... is a (ligand) compound that binds to a biological target (protein, enzyme, receptor, ...) and in this way either initiates a process (agonist) or inhibits it (antagonist)

The structure/conformation of the ligand is complementary to the space defined by the protein’s active site

The binding is caused by favorable interactions between the ligand and the side chains of the amino acids in the active site (electrostatic interactions, hydrogen bonds, hydrophobic contacts, ...).

Drug Discovery

In vitro / in silico studies → Animal studies → Clinical studies

The Drug Discovery Process

Chemoinformatics

The Drug Discovery Process

MKTAALAPLFFLPSALATTVYLA

GDSTMAKNGGGSGTNGWGEYL

ASYLSATVVNDAVAGRSAR…(etc)

We know the structure of the biological target

We identify/predict the binding pocket

Challenge:

To design an organic molecule that binds strongly enough to the biological target and modulates its activity.

New drug candidate

Example: Alzheimer’s disease

What is it? Alzheimer's is a disease that causes failure of brain functions and dementia. It starts with poor memory and an inability to function in common everyday activities.

How do you get it? Alzheimer's disease is the result of malfunctioning neurons in different parts of the brain. This, in turn, is due to an imbalance in the concentration of neurotransmitters.

Example: Alzheimer’s disease

How can we treat it?

Acetylcholine (neurotransmitter)

Drug against Alzheimer’s

Old School Drug discovery process

Screening collection (10^6 cmp.) → HTS → Actives (10^3 actives) → Follow-up → Hits (1-10 hits) → Hit-to-lead → Lead series (0-3 lead series) → Lead-to-drug → Drug candidate (0-1) → Clinical trials

High rate of false positives !!!

Failures

Drug discovery in the 21st Century

in vitro: Diverse set of molecules tested in the lab

in silico + in vitro: Computational methods to select subsets (to be tested in the lab) based on prediction of drug-likeness, solubility, binding, pharmacokinetics, toxicity, side effects, ...

The Lipinski ‘rule of five’ for drug-likeness prediction

• Octanol-water partition coefficient (logP) ≤ 5

• Molecular weight ≤ 500

• Number of hydrogen bond acceptors (HBA) ≤ 10

• Number of hydrogen bond donors (HBD) ≤ 5

If two or more of these rules are violated, the compound might have problems with oral bioavailability. (Lipinski et al., Adv. Drug Delivery Rev., 23, 1997, 3.)
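The four limits translate directly into a small filter. The following is a minimal sketch, assuming the descriptors (logP, molecular weight, HBA, HBD) have already been computed elsewhere; the `Descriptors` container, the function names and the example values are hypothetical and are not part of the lecture.

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    """Precomputed properties of one compound (hypothetical container)."""
    logp: float        # octanol-water partition coefficient
    mol_weight: float  # molecular weight in g/mol
    hba: int           # number of hydrogen bond acceptors
    hbd: int           # number of hydrogen bond donors


def lipinski_violations(d: Descriptors) -> int:
    """Count how many of the four rule-of-five limits are exceeded."""
    checks = [
        d.logp > 5,
        d.mol_weight > 500,
        d.hba > 10,
        d.hbd > 5,
    ]
    return sum(checks)


def may_have_bioavailability_problems(d: Descriptors) -> bool:
    """Flag a compound when two or more of the rules are violated."""
    return lipinski_violations(d) >= 2


# Illustrative, approximate values for a small aspirin-like molecule.
small = Descriptors(logp=1.2, mol_weight=180.2, hba=4, hbd=1)
# Made-up values for a large, lipophilic compound.
large = Descriptors(logp=6.3, mol_weight=720.0, hba=12, hbd=3)

for name, compound in (("small", small), ("large", large)):
    print(name, lipinski_violations(compound),
          may_have_bioavailability_problems(compound))
```

In practice the descriptor values would come from a chemoinformatics toolkit; the filter itself only needs the four numbers.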

Major Aspects of Chemoinformatics

•Information Acquisition and Management: Methods for collecting data (mainly experimental). Development of databases for storage and retrieval of information.

•Information Use: Data analysis, correlation and model building.

•Information Application: Prediction of molecular properties relevant to chemical and biochemical sciences.

Information Acquisition and Management

Small molecule databases

(Figure: Growth In PubChem Substances & Compounds. Number of Substance and Compound records from May-05 to Sep-07, on a scale of 0 to 20,000,000.)

Recent count: Substance: 72,156,631; Compound: 28,807,320; Rule of 5: 20,692,980

Searching in PubChem

Structural representation of molecules

Information Use

Beyond the Lipinski Rule of 5...

• Chemometrics: The application of mathematical or statistical methods to chemical data (simple, linear methods), e.g. Principal Component Analysis

• Machine Learning: The design and development of algorithms and techniques that allow computers to learn (complex, non-linear algorithms), e.g. Artificial Neural Networks, K-means clustering

Information Application

Prediction of Solubility, ADME & Toxicity

Solid drug → [Dissolution: Solubility] → Drug in solution → [Membrane transfer: Absorption] → Absorbed drug → [Liver extraction: Metabolism] → Systemic circulation

Prediction of biological activity/selectivity

Prediction models at CBS

Virtual screening

Virtual Screening Flavors

1D: 1D filters, e.g. Lipinski's Rule of Five

LIGAND-BASED

TARGET-BASED

Molecular similarity in Chemical Space

• Similar Property Principle – Molecules having similar structures and properties are expected to exhibit similar biological activity. (Not always true!)

• Thus, molecules that are located close together in chemical space are often considered to be functionally related.

Ligand-based VS: Fingerprints

– widely used similarity search tool

– consists of descriptors encoded as bit strings

– bit strings of query and database are compared using a similarity metric such as the Tanimoto coefficient

MACCS fingerprints: 166 structural keys that answer questions of the type:

• Is there a ring of size 4?

• Is at least one F, Br, Cl, or I present?

where the answer is either TRUE (1) or FALSE (0)
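The idea of structural keys can be illustrated with a toy fingerprint. This is only a sketch under strong assumptions: the three keys below are invented for illustration and are not the actual MACCS definitions, and each molecule is represented by a hypothetical dictionary listing its elements and ring sizes rather than by a real structure.

```python
# Toy structural keys. Each key asks one yes/no question about a molecule.
# These are illustrative only and are NOT the real MACCS key definitions.

def has_ring_of_size_4(mol):
    return 4 in mol["ring_sizes"]


def has_halogen(mol):
    # TRUE if at least one F, Cl, Br or I is present.
    return bool(mol["elements"] & {"F", "Cl", "Br", "I"})


def has_nitrogen(mol):
    return "N" in mol["elements"]


KEYS = [has_ring_of_size_4, has_halogen, has_nitrogen]


def fingerprint(mol):
    """Encode the answers to all keys as a list of bits (1 = TRUE, 0 = FALSE)."""
    return [1 if key(mol) else 0 for key in KEYS]


# Hypothetical, hand-written molecule descriptions.
chlorobenzene = {"elements": {"C", "H", "Cl"}, "ring_sizes": {6}}
cyclobutylamine = {"elements": {"C", "H", "N"}, "ring_sizes": {4}}

print(fingerprint(chlorobenzene))    # bits for: ring of size 4, halogen, nitrogen
print(fingerprint(cyclobutylamine))
```

A real implementation would derive the answers from the molecular graph with a chemoinformatics toolkit; the bit-string encoding is the part that carries over.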

Tanimoto Similarity

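$$T_c = \frac{c}{a + b - c} = \frac{9}{10 + 9 - 9} = 0.9$$

or 90% similarity, where a and b are the numbers of bits set in the two fingerprints and c is the number of bits set in both.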
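The same calculation can be written as a short function. This is a minimal sketch, assuming fingerprints are plain lists of 0/1 bits of equal length; the function name and the two example fingerprints are made up so that 10 bits are set in the first, 9 in the second, and 9 are shared.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient c / (a + b - c) for two equal-length bit lists."""
    if len(fp_a) != len(fp_b):
        raise ValueError("fingerprints must have the same length")
    a = sum(fp_a)                                # bits set in the first fingerprint
    b = sum(fp_b)                                # bits set in the second fingerprint
    c = sum(x & y for x, y in zip(fp_a, fp_b))   # bits set in both
    if a + b - c == 0:
        # Neither fingerprint has any bit set; returning 0.0 is a choice
        # made for this sketch, not a universal convention.
        return 0.0
    return c / (a + b - c)


# Made-up fingerprints: 10 bits set, 9 bits set, 9 bits in common.
query = [1] * 10 + [0] * 2
candidate = [1] * 9 + [0] * 3

print(round(tanimoto(query, candidate), 3))
```

In a similarity search, the query fingerprint would be compared against every database fingerprint in this way and the database sorted by decreasing coefficient.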

Ligand-based VS: Pharmacophore

Structure-based Virtual Screening: Docking

Given a protein and a database of ligands, docking scores determine which ligands are most likely to bind.

(Figure: binding pocket of target; library of small compounds)
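Once every ligand in the library has a docking score, selecting candidates reduces to ranking. The sketch below assumes that more negative scores indicate stronger predicted binding, which holds for energy-like scores but not for every scoring function; the ligand names and score values are invented for illustration.

```python
def rank_by_docking_score(scores, top_n):
    """Return the top_n (ligand, score) pairs, most negative score first."""
    ranked = sorted(scores.items(), key=lambda item: item[1])
    return ranked[:top_n]


# Hypothetical docking scores in kcal/mol for a tiny ligand library.
docking_scores = {
    "ligand_A": -9.8,
    "ligand_B": -6.1,
    "ligand_C": -11.2,
    "ligand_D": -3.4,
}

for ligand, score in rank_by_docking_score(docking_scores, top_n=2):
    print(ligand, score)
```

Only the top-ranked subset would then be passed on to experimental testing.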

Energy of binding

(Figure: binding pocket of target and library of small compounds, with example binding energies of -10 kcal/mol, -1 kcal/mol, +1 kcal/mol and +10 kcal/mol)

ΔG = ΔH - TΔS

Contributions: vdW, H-bond, desolvation energy, electrostatic energy, torsional free energy
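To give a feeling for what these energies mean, a binding free energy can be converted into a dissociation constant. The sketch below assumes the standard thermodynamic relation ΔG = RT ln(Kd) with a 1 M standard state and a temperature of 298.15 K; this relation and the function name are additions for illustration and do not appear on the slides.

```python
import math

GAS_CONSTANT = 0.0019872  # kcal / (mol * K)


def dissociation_constant(delta_g, temperature=298.15):
    """Kd in mol/L from a binding free energy in kcal/mol (assumes dG = RT ln Kd)."""
    return math.exp(delta_g / (GAS_CONSTANT * temperature))


# The four example energies shown with the figure.
for delta_g in (-10.0, -1.0, 1.0, 10.0):
    print(f"dG = {delta_g:+5.1f} kcal/mol  ->  Kd = {dissociation_constant(delta_g):.2e} M")
```

Under these assumptions, -10 kcal/mol corresponds to a Kd in the tens-of-nanomolar range, while positive values correspond to a Kd above 1 M, that is, essentially no binding.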

“Docking” and “Scoring”

•Docking involves the prediction of the binding mode of individual molecules

– Goal: new ligand orientation closest in geometry to the observed X-ray structure (Conformations of ligands in complexes often have very similar geometries to minimum-energy conformations of the isolated ligand)

•Scoring ranks the ligands using some function related to the free energy of association of the two partners, looking at attractive and repulsive regions and taking into account steric and hydrogen bonding interactions

– Goal: new ligand score closest in value to the docking score of the X-ray structure

Docking algorithms

• Most exhaustive algorithms

– Accurate prediction of a binding pose

• Most efficient algorithms

– Docking of small ligand databases in reasonable time

• Rapid algorithms

– Virtual high-throughput screening of millions of compounds

Scoring functions

•Molecular mechanics force field-based

Score is estimated by summing the strength of intermolecular van der Waals and electrostatic interactions between all atoms of the ligand-target complex

-CHARMM, AMBER

•Empirical-based

Based on summing various types of interactions between the two binding partners (hydrogen bonds, hydrophobic, …)

- ChemScore, GlideScore, AutoDock

•Knowledge-based

Based on statistical observations of intermolecular close contacts from large 3D databases, which are used to derive potentials or mean forces

-PMF, DrugScore
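The force field-based category can be sketched as a sum over ligand-target atom pairs of a van der Waals term and an electrostatic term. The example below is a simplified illustration, not the CHARMM or AMBER functional form: it assumes a 12-6 Lennard-Jones potential with one shared, made-up epsilon and sigma for all atom pairs, a plain Coulomb term without distance cutoff or dielectric screening, and hypothetical coordinates and charges.

```python
import math

# Converts (charge in e)^2 / (distance in angstrom) to kcal/mol.
COULOMB_CONSTANT = 332.06


def pair_energy(r, epsilon, sigma, q1, q2):
    """Lennard-Jones 12-6 plus Coulomb energy for one atom pair at distance r."""
    lennard_jones = 4.0 * epsilon * ((sigma / r) ** 12 - (sigma / r) ** 6)
    coulomb = COULOMB_CONSTANT * q1 * q2 / r
    return lennard_jones + coulomb


def force_field_score(ligand_atoms, target_atoms, epsilon=0.1, sigma=3.4):
    """Sum pair energies over all ligand-target atom pairs.

    Each atom is a (position, charge) tuple; epsilon and sigma are
    illustrative values shared by all pairs (real force fields use
    parameters that depend on the atom types).
    """
    total = 0.0
    for ligand_pos, ligand_charge in ligand_atoms:
        for target_pos, target_charge in target_atoms:
            r = math.dist(ligand_pos, target_pos)
            total += pair_energy(r, epsilon, sigma, ligand_charge, target_charge)
    return total


# Hypothetical two-atom ligand and two-atom pocket (coordinates in angstrom).
ligand = [((0.0, 0.0, 0.0), -0.3), ((1.5, 0.0, 0.0), 0.1)]
pocket = [((0.0, 4.0, 0.0), 0.4), ((1.5, 4.0, 0.0), -0.2)]

print(round(force_field_score(ligand, pocket), 3))
```

Lower (more negative) totals indicate more favorable predicted interactions; empirical and knowledge-based functions replace these physics-derived terms with fitted or statistically derived ones.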

Ligand-based VS

✓ good enrichment of candidate molecules from the screening of large databases with less computational effort

× too coarse to pick up subtle differences induced by small structural variations in the ligands

✓ many options for model refinement

Structure-based VS

✓ better fit for analyzing smaller sets of compounds, especially in retrospective analysis

✓ includes all possible interactions, thus allowing the detection of unexpected binding modes

× changing parameters for docking algorithms and scores is demanding

Mutants are being developed:

• pharmacophore methods with information about the target’s binding site

• docking programs that incorporate pharmacophore constraints

Combination of pharmacophore, docking and molecular dynamics (MD) screens

http://www.vcclab.org/lab/edragon/

Public Web Chemoinformatics Tools

http://pasilla.health.unm.edu/

ChemSpider

www.chemspider.com

Open Babel

http://openbabel.org/wiki/Main_page

D. Vidal et al., Ligand-based Approaches to In Silico Pharmacology, Chemoinformatics and Computational Chemical Biology, Ed. J. Bajorath, Springer, 2011

Questions?