Pharmacophore, QSAR et al. (fch.upol.cz/wp-content/uploads/2015/07/UK_DD_LBDD1_Berka_vz4.pdf)

Post on 19-Jun-2020


Ligand-based Methods: Chemical Libraries, Similarity, Pharmacophores

RNDr. Karel Berka, PhD

RNDr. Jindřich Fanfrlík, PhD

RNDr. Martin Lepšík, PhD

Dpt. Physical Chemistry, RCPTM, Faculty of Science, Palacky University in Olomouc

Drug Design

Outline

• Structure-based drug design (SBDD)

– Docking

– Virtual screening

– de novo design

– Pharmacophore search

• Ligand-based drug design (LBDD)

– Similarity matching

– Pharmacophore search

– QSAR

– ADMET


Possibilities of Drug Design

Known target structure, known ligand: Structure-based drug design (SBDD)

• Docking

Known target structure, unknown ligand:

• De novo design

Unknown target structure, known ligand: Ligand-based drug design (LBDD)

• 1 or more ligands: similarity search

• Several ligands: pharmacophore

• Large number of ligands (20+): Quantitative Structure-Activity Relationships (QSAR)

Unknown target structure, unknown ligand:

• CADD not possible; some experimental data needed

ADMET filtering

CHEMICAL LIBRARIES

Chemical Libraries

• Large diversity

– Lead search

• Specific

– For combinatorial chemistry

• Typical motifs:

David C. Young - Computational Drug Design: A guide for computational and medicinal chemists. Wiley-Blackwell, New York, 2009, ISBN 978-0470126851

• Types:

– 1D, 2D, 3D

• What for:

– Similarity search: 2D and 3D similarity, motifs

– Predictions: pKa, logP/logD, charge distribution, logS, …

How to Store a Chemical Compound

Ethanol: CCO

Chemical Information System

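Operation: Storage

– Classical information system: Name = ‘KSICHT’ (stores text, numbers, pictures, …)

– Chemical information system: stores chemical structures and information about them

Operation: Search

– Classical information system: Search $Name. Returns: ‘KSICHT’

– Chemical information system: Search CC(=O)C4CC3C2CC(C)C1=C(C)C(=O)CC(O)C1C2CCC3(C)C4. Returns: [structure drawing]

Operation: Advanced searches

– Classical information system: Find queries containing ‘chemist’. Returns: ‘chemist’ ‘taky chemist’ ‘bum’

– Chemical information system: Search molecules containing: [substructure drawing]. Returns: [structure drawings]

Operation: Questions

– Classical information system: How to become a chemist? Returns: ‘Solve KSICHT’

– Chemical information system: Calculate logP(o/w) of: [structure drawing]. Returns: logP(o/w) = 2.62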

1D Structure Representation

Stores a molecule as a text string

• CAS number – registered molecules only

• SMILES – simple format

• InChI – IUPAC format, more comprehensive

SMILES

Simplified Molecular-Input Line-Entry System

• Chemical graph written as a string

Atoms: organic-subset atoms with implicit H (B, C, N, O, P, S, X = halogen),

other elements, isotopes, and charges in square brackets: [Au], [2H], [Ti+4] or [Ti++++],

aromatic atoms in lowercase (c1ccccc1 – benzene)

Bonds: single – no symbol (CC – ethane, O – water),

double (O=O), triple (N#N – nitrogen), quadruple ([Ga-]$[As+]),

ring closures – matching digits, two-digit labels written with % (%10)

(C1CCCCC1 – cyclohexane, c1cccc2c1cccc2 – naphthalene)

Branching: parentheses (C(Cl)(Cl)Cl – chloroform)

Stereochemistry: “/” and “\” around double bonds

(F/C=C/F – trans-1,2-difluoroethene),

chiral atoms: @ (counterclockwise) or @@ (clockwise)

(N[C@@H](C)C(=O)O – L-alanine)
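The SMILES rules on this slide can be illustrated with a small tokenizer. This is only a sketch using the Python standard library: the names `SMILES_TOKEN`, `tokenize`, `atom_count`, and `unclosed_ring_labels` are made up for illustration, the pattern covers just the subset of SMILES listed above, and nothing here validates valence or aromaticity. A cheminformatics toolkit would normally be used instead.

```python
import re

# Token pattern for a practical subset of SMILES (a sketch, not a full parser).
SMILES_TOKEN = re.compile(
    r"\[[^\]]+\]"      # bracket atom: [Au], [2H], [C@@H]
    r"|Br|Cl"          # two-letter organic-subset atoms
    r"|[BCNOPSFI]"     # one-letter aliphatic atoms
    r"|[bcnops]"       # aromatic atoms (lowercase)
    r"|%\d{2}|\d"      # ring-closure labels
    r"|[-=#$:/\\().]"  # bond symbols, branch parentheses, dot
)


def tokenize(smiles):
    """Split a SMILES string into tokens; reject unknown characters."""
    tokens = SMILES_TOKEN.findall(smiles)
    if "".join(tokens) != smiles:
        raise ValueError(f"unrecognized characters in {smiles!r}")
    return tokens


def is_atom(token):
    """Atoms are bracket expressions or element symbols."""
    return token[0] == "[" or token[0].isalpha()


def atom_count(smiles):
    """Number of explicitly written atoms (implicit hydrogens are not counted)."""
    return sum(1 for token in tokenize(smiles) if is_atom(token))


def unclosed_ring_labels(smiles):
    """Ring-closure labels that were opened but never closed."""
    open_labels = set()
    for token in tokenize(smiles):
        if token[0].isdigit() or token[0] == "%":
            open_labels ^= {token.lstrip("%")}  # toggle: open, then close
    return open_labels


print(tokenize("C(Cl)(Cl)Cl"))
print(atom_count("c1cccc2c1cccc2"))
print(unclosed_ring_labels("C1CCCCC"))
```

The ring-label check shows why the digits matter: `C1CCCCC1` closes its ring, while `C1CCCCC` leaves label 1 open and is therefore not a valid SMILES string.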


SMARTS

• SMiles ARbitrary Target Specification

– A regular-expression-like language for selecting atoms

• Atoms – symbol or atomic number: [C], [#6], [C,c]

– aromatic atoms in lowercase: [c]

– Wildcards: * (any atom), A (aliphatic), a (aromatic)

• Bonds – '-' (single), '=' (double), '#' (triple), ':' (aromatic), '~' (any bond)

• Connectivity – X (total connections, including implicit hydrogens) and D (explicit connections) descriptors: [CX4] carbon with 4 connections in total, [CD4] quaternary carbon

• Cyclicity – R descriptor: [CR] (aliphatic carbon atom in a ring)

• Logical operators – and (; or &), or (,), not (!)

[N;H3;+][C;X4] (protonated primary amine)


InChI

• IUPAC International Chemical Identifier

• Layers: InChI=1 (1S for standard InChI)/chemical formula/c (atom connections)/h (hydrogens)/q (charges)/p (protons)/b (double-bond stereo)/t and possibly m (tetrahedral stereo)/s (stereo type)/i (isotopes)/f (fixed hydrogens)/r (reconnected metals)

• InChIKey – for quicker searches: a 27-character string with a 14-character hash of the connectivity, an 8-character hash of the other layers followed by a standard flag and a version character, and a final protonation character, separated by hyphens

Ethanol (CH3CH2OH)

InChI=1/C2H6O/c1-2-3/h3H,2H2,1H3

InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3 (standard InChI)

InChIKey: LFQSCWFLJHTTHZ-UHFFFAOYSA-N

L-ascorbic acid

InChI=1/C6H8O6/c7-1-2(8)5-3(9)4(10)6(11)12-5/h2,5,7-10H,1H2/t2-,5+/m0/s1

InChI=1S/C6H8O6/c7-1-2(8)5-3(9)4(10)6(11)12-5/h2,5,7-8,10-11H,1H2/t2-,5+/m0/s1 (standard InChI)

InChIKey: CIWBSHSKHKDKBQ-JLAZNSOCSA-N
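Because InChI is a layered string, its parts can be pulled apart with plain string operations. The sketch below assumes a simple identifier in which each layer prefix occurs once (identifiers with repeated prefixes, for example after a fixed-hydrogen layer, would need more care). The function names `split_inchi` and `split_inchikey` are illustrative only.

```python
def split_inchi(inchi):
    """Split an InChI string into version, formula, and prefixed layers."""
    prefix, _, body = inchi.partition("=")
    if prefix != "InChI":
        raise ValueError("not an InChI string")
    parts = body.split("/")
    version, formula = parts[0], parts[1]
    # Every further layer starts with a one-letter prefix (c, h, q, p, ...).
    layers = {part[0]: part[1:] for part in parts[2:]}
    return {
        "version": version,
        "standard": version.endswith("S"),
        "formula": formula,
        "layers": layers,
    }


def split_inchikey(key):
    """Split a 27-character InChIKey into its hyphen-separated blocks."""
    first, second, third = key.split("-")
    if (len(first), len(second), len(third)) != (14, 10, 1):
        raise ValueError("unexpected InChIKey block lengths")
    return {
        "connectivity_hash": first,
        "other_layers_hash": second[:8],
        "standard": second[8] == "S",
        "version_char": second[9],
        "protonation": third,
    }


ethanol = split_inchi("InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3")
print(ethanol["formula"], ethanol["layers"])
print(split_inchikey("LFQSCWFLJHTTHZ-UHFFFAOYSA-N"))
```

For ethanol the connection layer `c1-2-3` says atom 1 is bonded to atom 2 and atom 2 to atom 3, and the hydrogen layer `h3H,2H2,1H3` assigns one, two, and three hydrogens to atoms 3, 2, and 1.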

2D Structure Representation

• CHM – ChemDraw

• CDX – ChemDraw exchange file


3D Structure Representation

• MOL, SDF

• XYZ

• PDB


MOL/SDF

Row     Section            Description

1-3     Header

1                          Name of molecule („benzene“)

2                          Additional information

3                          Comment

4-17    Connection table

4                          Counts line: 6 atoms, 6 bonds, ..., V2000 standard

5-10                       Atoms (1 row per atom): x, y, z, element, etc.

11-16                      Bonds (1 row per bond): 1st atom, 2nd atom, type, etc.

17                         Properties

18      $$$$               End of molecule

• SDF can hold a whole database

benzene

ACD/Labs0812062058

6 6 0 0 0 0 0 0 0 0 1 V2000

1.9050 -0.7932 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0

1.9050 -2.1232 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0

0.7531 -0.1282 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0

0.7531 -2.7882 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0

-0.3987 -0.7932 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0

-0.3987 -2.1232 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0

2 1 1 0 0 0 0

3 1 2 0 0 0 0

4 2 2 0 0 0 0

5 3 1 0 0 0 0

6 4 1 0 0 0 0

6 5 2 0 0 0 0

<Molecular Weight>499.61

M END

$$$$

• MDL molfile, structure-data file
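The row layout described above can be read back with a few lines of Python. This sketch uses the benzene record from the slide and, as a simplifying assumption, splits each row on whitespace. Real V2000 files use fixed-width columns, so whitespace splitting breaks once adjacent fields run together (for example with 100 or more atoms). `MOL_LINES` and `parse_molfile` are illustrative names.

```python
# Benzene connection table as shown on the slide (whitespace collapsed).
MOL_LINES = [
    "benzene",                      # row 1: name
    "ACD/Labs0812062058",           # row 2: additional information
    "",                             # row 3: comment
    "6 6 0 0 0 0 0 0 0 0 1 V2000",  # row 4: counts line
    "1.9050 -0.7932 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0",
    "1.9050 -2.1232 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0",
    "0.7531 -0.1282 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0",
    "0.7531 -2.7882 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0",
    "-0.3987 -0.7932 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0",
    "-0.3987 -2.1232 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0",
    "2 1 1 0 0 0 0",
    "3 1 2 0 0 0 0",
    "4 2 2 0 0 0 0",
    "5 3 1 0 0 0 0",
    "6 4 1 0 0 0 0",
    "6 5 2 0 0 0 0",
    "M END",
]


def parse_molfile(lines):
    """Read name, atoms, and bonds from a whitespace-separated molfile."""
    name = lines[0]
    counts = lines[3].split()
    n_atoms, n_bonds = int(counts[0]), int(counts[1])
    atoms = []
    for line in lines[4:4 + n_atoms]:
        x, y, z, element = line.split()[:4]
        atoms.append((element, float(x), float(y), float(z)))
    bonds = []
    for line in lines[4 + n_atoms:4 + n_atoms + n_bonds]:
        first, second, order = (int(value) for value in line.split()[:3])
        bonds.append((first, second, order))
    return name, atoms, bonds


name, atoms, bonds = parse_molfile(MOL_LINES)
double_bonds = [bond for bond in bonds if bond[2] == 2]
print(name, len(atoms), len(bonds), double_bonds)
```

The counts line drives everything else: it tells the reader how many atom rows and bond rows follow, which is why it sits directly after the three header rows.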

XYZ

Row     Section            Description

1-2     Header

1                          Number of atoms

2                          Comment

3-X     Block of atoms     (1 row per atom): element, x, y, z

More structures are stored as multiple entries

5

methane molecule (in ångströms)

C 0.000000 0.000000 0.000000

H 0.000000 0.000000 1.089000

H 1.026719 0.000000 -0.363000

H -0.513360 -0.889165 -0.363000

H -0.513360 0.889165 -0.363000

• Free format

• Easy storage
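Because the XYZ format is just a count, a comment, and one row per atom, a reader fits in a few lines. The sketch below parses the methane example from the slide and checks the C–H bond lengths. `parse_xyz` is an illustrative name, and only a single-structure file is handled.

```python
import math

XYZ_TEXT = """5
methane molecule (in angstroms)
C 0.000000 0.000000 0.000000
H 0.000000 0.000000 1.089000
H 1.026719 0.000000 -0.363000
H -0.513360 -0.889165 -0.363000
H -0.513360 0.889165 -0.363000
"""


def parse_xyz(text):
    """Return the comment line and a list of (element, x, y, z) tuples."""
    lines = text.strip().splitlines()
    n_atoms = int(lines[0])              # row 1: number of atoms
    comment = lines[1]                   # row 2: free-text comment
    atoms = []
    for line in lines[2:2 + n_atoms]:    # rows 3..: one atom per row
        element, x, y, z = line.split()[:4]
        atoms.append((element, float(x), float(y), float(z)))
    if len(atoms) != n_atoms:
        raise ValueError("atom count does not match the header")
    return comment, atoms


comment, atoms = parse_xyz(XYZ_TEXT)
carbon = atoms[0][1:]
for element, *position in atoms[1:]:
    # Each C-H distance should come out close to 1.089 angstroms.
    print(element, round(math.dist(carbon, position), 3))
```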

PDB - Protein DataBank file


HEADER EXTRACELLULAR MATRIX 22-JAN-98 1A3I

TITLE X-RAY CRYSTALLOGRAPHIC DETERMINATION OF A COLLAGEN-LIKE

TITLE 2 PEPTIDE WITH THE REPEATING SEQUENCE (PRO-PRO-GLY)

...

EXPDTA X-RAY DIFFRACTION

AUTHOR R.Z.KRAMER,L.VITAGLIANO,J.BELLA,R.BERISIO

AUTHOR 2 B.BRODSKY,A.ZAGARI,H.M.BERMAN

...

REMARK 350 BIOMOLECULE: 1

REMARK 350 APPLY THE FOLLOWING TO CHAINS: A, B, C

REMARK 350 BIOMT1 1 1.000000 0.000000 0.000000 0.00000

REMARK 350 BIOMT2 1 0.000000 1.000000 0.000000 0.00000

...

SEQRES 1 A 9 PRO PRO GLY PRO PRO GLY PRO PRO GLY

SEQRES 1 B 6 PRO PRO GLY PRO PRO GLY

SEQRES 1 C 6 PRO PRO GLY PRO PRO GLY

...

ATOM 1 N PRO A 1 8.316 21.206 21.530 1.00 17.44 N

ATOM 2 CA PRO A 1 7.608 20.729 20.336 1.00 17.44 C

ATOM 3 C PRO A 1 8.487 20.707 19.092 1.00 17.44 C

ATOM 4 O PRO A 1 9.466 21.457 19.005 1.00 17.44 O

...

HETATM 130 C ACY 401 3.682 22.541 11.236 1.00 21.19 C

HETATM 131 O ACY 401 2.807 23.097 10.553 1.00 21.19 O

HETATM 132 OXT ACY 401 4.306 23.101 12.291 1.00 21.19 O

...
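PDB coordinate records are fixed-column, not whitespace-delimited: the HETATM rows above have no chain identifier, so splitting on spaces would shift every later field. The sketch below slices by column position instead. Since the transcript has lost the original spacing, `format_atom_record` is a hypothetical helper that lays two of the records out again in the standard ATOM/HETATM columns before `parse_atom_record` reads them back.

```python
def format_atom_record(record, serial, name, res_name, chain, res_seq,
                       x, y, z, occupancy, b_factor, element):
    """Hypothetical helper: write the fields into the fixed PDB columns."""
    return (
        f"{record:<6}{serial:>5} {name:<4} {res_name:>3} {chain:1}"
        f"{res_seq:>4}    {x:8.3f}{y:8.3f}{z:8.3f}"
        f"{occupancy:6.2f}{b_factor:6.2f}{'':10}{element:>2}"
    )


def parse_atom_record(line):
    """Slice one ATOM/HETATM record by column position (0-based slices)."""
    return {
        "record": line[0:6].strip(),
        "serial": int(line[6:11]),
        "name": line[12:16].strip(),
        "res_name": line[17:20].strip(),
        "chain": line[21].strip(),
        "res_seq": int(line[22:26]),
        "xyz": (float(line[30:38]), float(line[38:46]), float(line[46:54])),
        "occupancy": float(line[54:60]),
        "b_factor": float(line[60:66]),
        "element": line[76:78].strip(),
    }


atom_line = format_atom_record(
    "ATOM", 1, " N", "PRO", "A", 1, 8.316, 21.206, 21.530, 1.00, 17.44, "N")
hetatm_line = format_atom_record(
    "HETATM", 130, " C", "ACY", "", 401, 3.682, 22.541, 11.236, 1.00, 21.19, "C")

print(parse_atom_record(atom_line))
print(parse_atom_record(hetatm_line))
# Whitespace splitting yields a different number of fields for the two rows.
print(len(atom_line.split()), len(hetatm_line.split()))
```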

SIMILARITY SEARCHES


Similarity Search

Search for structures similar to the lead:

• 2D Sub-Structures

• 3D Sub-Structures

• 3D Conformational flexibility


Similarity to Natural Ligand

[Figure: structures of serotonin and sumatriptan]

• 5-Hydroxytryptamine (5-HT), serotonin: a natural neurotransmitter synthesized in certain neurons in the CNS

• Sumatriptan (Imitrex): used to treat migraine headaches; known to be a 5-HT1 agonist

2D Sub-Structure

• Functional groups

• Connectivity

Example: a halogen on an aromatic ring together with a carboxylic group

[Figure: query substructure with [F,Cl,Br,I] on an aromatic ring plus a carboxylic group, and example molecules containing it]

3D Sub-Structure

• Distances in 3D play a more important role

• Bioisostericity – similar groups in 3D

• Usually only the lowest-energy structure is stored

[Figure: 3D query with atoms C(u), O(s1), O(s1), A, A, [O,S], O and distance ranges 3.6 - 4.6 Å, 3.3 - 4.3 Å, 6.8 - 7.8 Å]

[Plot: steric energy (kcal/mol) versus dihedral angle (0 to 360)]

Bioisostericity

Young, D.C. Computational Drug Design. Wiley, 2009.

Bioisostericity II

Young, D.C. Computational Drug Design. Wiley, 2009.

How to calculate molecular similarity

Quantitative assessment of the similarity or dissimilarity of structures needs a numerically tractable form: molecular descriptors, fingerprints, structural keys.

These are sequences or vectors of bits, or numeric values, that can be compared by distance functions or similarity metrics.

T = Tanimoto index (similarity of bit vectors), where B(·) counts the bits set to 1:

$$T(x,y) = \frac{B(x \,\&\, y)}{B(x) + B(y) - B(x \,\&\, y)}$$

E = Euclidean distance (distance in XYZ):

$$E(x,y) = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2}$$
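The Tanimoto index and the Euclidean distance named on this slide can be computed in a few lines. This is a minimal sketch: fingerprints are represented as Python integers used as bit vectors, the two example fingerprints are invented for illustration, and returning 0.0 for two empty fingerprints is a convention chosen here, not something the slide specifies.

```python
import math


def bit_count(value):
    """B(x): number of bits set to 1."""
    return bin(value).count("1")


def tanimoto(x, y):
    """T(x, y) = B(x & y) / (B(x) + B(y) - B(x & y)) for bit vectors."""
    common = bit_count(x & y)
    total = bit_count(x) + bit_count(y) - common
    if total == 0:
        return 0.0  # both fingerprints empty: convention chosen for this sketch
    return common / total


def euclidean(x, y):
    """E(x, y) = sqrt(sum_i (x_i - y_i)^2) for equal-length numeric vectors."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


# Two made-up 8-bit fingerprints: 5 and 4 bits set, 3 of them shared.
fp_a = 0b11010110
fp_b = 0b10010011
print(tanimoto(fp_a, fp_b))                 # 3 / (5 + 4 - 3)
print(euclidean((0.0, 0.0, 0.0), (1.0, 2.0, 2.0)))
```

Tanimoto is a similarity (1.0 means identical bit vectors), while the Euclidean value is a distance (0.0 means identical points), so the two rank candidates in opposite directions.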

Paradox of Similarity

Aminogenistein (against cystic fibrosis)

7-Hydroxy-2-(4-nitro-phenyl)-chromen-4-one

Pargyline (against hypertension)

N-benzyl-N,1-dimethyl-2-propynylamine

It is not that simple

PHARMACOPHORE

Pharmacophore

• Structural motif (geometrical restrictions on functional groups) important for biological, mostly pharmacological, activity

• Analogous to chromophore

Bojarski, Curr. Top. Med. Chem. 2006, 6, 2005.

Pharmacophore-based Drug Design

[Cycle diagram with the steps:]

• Experimental activity

• Search for active compounds in chemical library

• Preparation of pharmacophore

• Activity testing

• Preparation of hits

See also John Van Drie’s http://pharmacophore.org

Search for pharmacophore

• Prerequisite:

– All active compounds in training set bind to common active site

• Pharmacophore query preparation

– Identification of characteristic functional groups (hydrogen-bond acceptors and donors, lipophilic groups, charge distribution)

• Pharmacophore search

– Search for pharmacophore query in all molecules in DB

– Scaffold hopping possible

– No need for receptor structure (can be useful for query generation)

Pharmacophore for HIV protease

Geometric set of functional groups necessary for HIV protease activity based on active site

[Figure: HIV protease active site (Asp25, Gly27, Ile50) with the pharmacophore features Donor, Donor, Acceptor, Hydrophobic and inter-feature distances 6.0 Å, 6.9 Å, 5.2 Å, 6.3 Å, 10.4 Å]

Pharmacophore Query Identification

1) active site analysis

[Figure: active-site residues Asp25, Gly27, Ile50 with the site features Donor, Acceptor or Anion, Hydrophobic, Acceptor and distances 9.6 Å, 6.9 Å, 8.8 Å, 6.3 Å, 12.2 Å]

Pharmacophore Query Identification

2) functional group definition

3) distance setup

[Figure: active-site features (Donor, Acceptor or Anion, Hydrophobic, Acceptor; distances 9.6 Å, 6.9 Å, 8.8 Å, 6.3 Å, 12.2 Å) shown with the resulting ligand features (Donor, Donor, Acceptor, Hydrophobic; distances 6.0 Å, 6.9 Å, 5.2 Å, 6.3 Å, 10.4 Å)]

Final Pharmacophore Query

[Figure: final query with features Donor, Donor, Acceptor, Hydrophobic and distances 6.0 Å, 6.9 Å, 5.2 Å, 6.3 Å, 10.4 Å]

Last step:

Search a database of small molecules for conformations that match the query
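The last step, matching a conformation against a distance-based query, can be sketched as follows. Everything specific here is an assumption made for illustration: the transcript does not preserve which distance belongs to which feature pair, so the three-feature query below pairs 6.0, 5.2, and 6.3 Å arbitrarily; the 0.5 Å tolerance and the candidate coordinates are invented; and `find_match` is a hypothetical name. Real pharmacophore tools also handle feature perception and conformer generation, which are skipped here.

```python
from itertools import permutations
from math import dist

# Hypothetical three-feature query; the pairing of distances is assumed.
QUERY_TYPES = ["donor", "acceptor", "hydrophobic"]
QUERY_DISTANCES = {(0, 1): 6.0, (0, 2): 5.2, (1, 2): 6.3}  # in angstroms


def find_match(query_types, query_distances, features, tolerance=0.5):
    """Return indices of features that satisfy the query, or None.

    `features` is a list of (type, (x, y, z)) tuples for one conformation.
    """
    for combo in permutations(range(len(features)), len(query_types)):
        # The assigned features must have the required types ...
        if any(features[c][0] != t for c, t in zip(combo, query_types)):
            continue
        # ... and every constrained pair must lie within the tolerance.
        if all(
            abs(dist(features[combo[i]][1], features[combo[j]][1]) - d) <= tolerance
            for (i, j), d in query_distances.items()
        ):
            return combo
    return None


# Invented feature points for two conformations of a candidate molecule.
conformer_a = [
    ("hydrophobic", (1.946, 4.822, 0.0)),
    ("donor", (0.0, 0.0, 0.0)),
    ("acceptor", (6.0, 0.0, 0.0)),
    ("donor", (9.0, 9.0, 9.0)),
]
conformer_b = [
    ("hydrophobic", (1.946, 4.822, 0.0)),
    ("donor", (0.0, 0.0, 0.0)),
    ("acceptor", (9.0, 0.0, 0.0)),
]

print(find_match(QUERY_TYPES, QUERY_DISTANCES, conformer_a))
print(find_match(QUERY_TYPES, QUERY_DISTANCES, conformer_b))
```

Because only feature types and distances are compared, two molecules with different skeletons can satisfy the same query, which is what makes scaffold hopping possible. The brute-force search over permutations is fine for a handful of features but would need pruning for larger queries.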