Dispensing Processes Profoundly Impact Biological Assays and Computational and Statistical Analyses

Dispensing Processes Profoundly Impact Biological, Computational and Statistical Analyses

Sean Ekins (1), Joe Olechno (2) and Antony J. Williams (3)

(1) Collaborations in Chemistry, Fuquay Varina, NC. (2) Labcyte Inc, Sunnyvale, CA. (3) Royal Society of Chemistry, Wake Forest, NC.

Disclaimer: SE and AJW have no affiliation with Labcyte and have not been engaged as consultants

description

Dispensing processes profoundly influence estimates of the biological activity of compounds. In this study, using published inhibitor data for the tyrosine kinase EphB4, we show that IC50 values obtained via disposable tip-based serial dilution and dispensing differ from those obtained via acoustic dispensing by orders of magnitude, with no correlation or consistent ranking between the datasets. Importantly, the computed EphB4 pharmacophores derived from these data differ for each dataset. The acoustic dispensing data correctly highlight multiple hydrophobic features in the pharmacophore and correlate with calculated LogP values. Significantly, the acoustic dispensing-derived pharmacophore correctly identified active compounds in a test set. Subsequent analysis of crystal structures for other published EphB4 inhibitors, with automated development of pharmacophores, indicated that these were comparable to those developed with acoustic dispensing data. In short, dispensing processes are another important source of error in high-throughput screening that impacts computational and statistical analyses. These findings have far-reaching implications in biological research and in drug discovery.

Transcript of Dispensing Processes Profoundly Impact Biological Assays and Computational and Statistical Analyses

Page 1: Dispensing Processes Profoundly Impact Biological Assays and Computational and Statistical Analyses

Dispensing Processes Profoundly Impact Biological, Computational and Statistical Analyses

Sean Ekins (1), Joe Olechno (2) and Antony J. Williams (3)

(1) Collaborations in Chemistry, Fuquay Varina, NC.

(2) Labcyte Inc, Sunnyvale, CA.

(3) Royal Society of Chemistry, Wake Forest, NC.

Disclaimer: SE and AJW have no affiliation with Labcyte and have not been engaged as consultants

Page 2: Dispensing Processes Profoundly Impact Biological Assays and Computational and Statistical Analyses

“If I have seen further than others, it is by standing upon the shoulders of giants.”

Isaac Newton

Where do scientists get chemistry/biology data?

Databases

Patents

Papers

Your own lab

Collaborators

Some or all of the above?

What is common to all? – quality issues

Page 3: Dispensing Processes Profoundly Impact Biological Assays and Computational and Statistical Analyses

Data can be found, but drug structure quality is important

More groups doing in silico repositioning

Target-based or ligand-based

Network and systems biology

Integrating or using sets of FDA drugs: if the structures are incorrect, the predictions will be too

Need a definitive set of FDA approved drugs with correct structures

Also linkage between in vitro data & clinical data

Page 4: Dispensing Processes Profoundly Impact Biological Assays and Computational and Statistical Analyses

Structure Quality Issues

NPC Browser http://tripod.nih.gov/npc/

Database released, and within days hundreds of errors were found in the structures

DDT, 16: 747-750 (2011)

Science Translational Medicine 2011

DDT 17: 685-701 (2012)

Page 5: Dispensing Processes Profoundly Impact Biological Assays and Computational and Statistical Analyses

DDT editorial Dec 2011

http://goo.gl/dIqhU

This editorial led to the current work

It’s not just structure quality we need to worry about

Page 6: Dispensing Processes Profoundly Impact Biological Assays and Computational and Statistical Analyses

Southan et al., DDT, 18: 58-70 (2013)

Finding structures of Pharma molecules is hard

NCATS and MRC made molecule identifiers from pharmaceutical companies available, but with no structures

Page 7: Dispensing Processes Profoundly Impact Biological Assays and Computational and Statistical Analyses

How do you move a liquid?

Images courtesy of Bing, Tecan

McDonald et al., Science 2008, 322, 917.

Belaiche et al., Clin Chem 2009, 55, 1883-1884

Plastic leaching

Page 8: Dispensing Processes Profoundly Impact Biological Assays and Computational and Statistical Analyses

Extremely precise

Extremely accurate

Rapid

Auto-calibrating

Completely touchless

No cross-contamination

No leachates

No binding

Moving Liquids with sound: Acoustic Droplet Ejection (ADE)

Acoustic energy expels droplets without physical contact

[Figure: %CV (0 to 15.0) versus dispensed volume (nL, 0.1 to 10000, logarithmic axis)]

Comley J, Nanolitre Dispensing, Drug Discovery World, Summer 2004, 43-54

Images courtesy of Labcyte Inc. http://goo.gl/K0Fjz

Page 9: Dispensing Processes Profoundly Impact Biological Assays and Computational and Statistical Analyses

Using literature data from different dispensing methods to generate computational models

Few molecule structures and corresponding datasets are public

Using data from 2 AstraZeneca patents:

Tyrosine kinase EphB4 pharmacophores (Accelrys Discovery Studio) were developed using data for 14 compounds

IC50 determined using different dispensing methods

Analyzed correlation with simple descriptors (SAS JMP)

Calculated LogP correlation with log IC50 data for acoustic dispensing (r2 = 0.34, p < 0.05, N = 14)

Barlaam, B. C.; Ducray, R., WO 2009/010794 A1, 2009

Barlaam, B. C.; Ducray, R.; Kettle, J. G., US 7,718,653 B2, 2010
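As a rough plausibility check on the LogP correlation quoted above (r2 = 0.34, p < 0.05, N = 14), the sketch below converts r2 and N into a t statistic and compares it with the two-tailed 5% critical value. It assumes the reported value is an ordinary Pearson correlation, which the slide does not state; the function name and the hard-coded critical value (a standard table entry for 12 degrees of freedom) are choices made for this illustration, not part of the original analysis in SAS JMP.

```python
import math


def t_statistic(r_squared: float, n: int) -> float:
    """t statistic for testing a Pearson correlation against zero.

    Uses t = |r| * sqrt((n - 2) / (1 - r^2)) with n - 2 degrees of freedom.
    """
    r = math.sqrt(r_squared)
    return r * math.sqrt((n - 2) / (1.0 - r_squared))


# Values quoted on the slide: r2 = 0.34, N = 14.
t_value = t_statistic(0.34, 14)

# Two-tailed 5% critical value of Student's t for 12 degrees of freedom
# (standard table value, hard-coded to avoid a dependency on SciPy).
T_CRIT_DF12 = 2.179

print(f"t = {t_value:.2f}, critical t (df = 12) = {T_CRIT_DF12}")
print("significant at p < 0.05:", t_value > T_CRIT_DF12)
```

Under that assumption the t statistic comes out at about 2.49, above the critical value, which is consistent with the p < 0.05 stated on the slide.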

Page 10: Dispensing Processes Profoundly Impact Biological Assays and Computational and Statistical Analyses

Compound #   IC50 Acoustic (µM)   IC50 Tips (µM)   Ratio IC50Tip/IC50ADE
5            0.002                0.553            276.5
4            0.003                0.146            48.7
7            0.003                0.778            259.3
W7b          0.004                0.152            42.5
8            0.004                0.445            111.3
W5           0.006                0.087            13.7
6            0.007                0.973            139.0
W3           0.012                0.049            4.2
W1           0.014                0.112            8.2
9            0.052                0.170            3.3
10           0.064                0.817            12.8
W12          0.158                0.250            1.6
W11          0.207                14.400           69.6
11           0.486                3.030            6.2

14 compounds with structures and IC50 data.

Barlaam, B. C.; Ducray, R., WO 2009/010794 A1, 2009

Barlaam, B. C.; Ducray, R.; Kettle, J. G., US 7,718,653 B2, 2010

Page 11: Dispensing Processes Profoundly Impact Biological Assays and Computational and Statistical Analyses

A graph of the log IC50 values for tip-based serial dilution and dispensing versus acoustic dispensing with direct dilution shows a poor correlation between techniques (R2 = 0.246).

The acoustic technique always gave a more potent IC50 value.
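The comparison described on this slide can be sketched in a few lines of Python. The IC50 values below are transcribed from the 14-compound table on the preceding slide; the Pearson helper and variable names are illustrative assumptions, not the authors' workflow (the original analysis used SAS JMP). Because the table values are rounded, the computed R2 need not match the quoted 0.246 exactly.

```python
import math

# (compound, IC50 acoustic in µM, IC50 tips in µM), as listed in the table.
DATA = [
    ("5", 0.002, 0.553), ("4", 0.003, 0.146), ("7", 0.003, 0.778),
    ("W7b", 0.004, 0.152), ("8", 0.004, 0.445), ("W5", 0.006, 0.087),
    ("6", 0.007, 0.973), ("W3", 0.012, 0.049), ("W1", 0.014, 0.112),
    ("9", 0.052, 0.170), ("10", 0.064, 0.817), ("W12", 0.158, 0.250),
    ("W11", 0.207, 14.400), ("11", 0.486, 3.030),
]


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Correlate the two methods on a log10 scale, as on the slide.
log_acoustic = [math.log10(acoustic) for _, acoustic, _ in DATA]
log_tips = [math.log10(tips) for _, _, tips in DATA]
r_squared = pearson_r(log_acoustic, log_tips) ** 2

# Fold difference per compound (tip IC50 divided by acoustic IC50).
fold = {name: tips / acoustic for name, acoustic, tips in DATA}

# True only if the acoustic IC50 is lower (more potent) for every compound.
acoustic_always_more_potent = all(tips > acoustic for _, acoustic, tips in DATA)

print(f"R2 of log IC50 (tips vs. acoustic): {r_squared:.3f}")
print(f"largest fold difference: {max(fold.values()):.1f}")
print("acoustic always more potent:", acoustic_always_more_potent)
```

Working on the log scale keeps the few very weak tip-based values (for example 14.4 µM for W11) from dominating the correlation.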

Page 12: Dispensing Processes Profoundly Impact Biological Assays and Computational and Statistical Analyses

Experimental Process

Initial data set of 14 structures with data (WO2009/010794, US 7,718,653): generate pharmacophore models for the EphB4 receptor (acoustic model and tip-based model)

Independent data set of 12 (WO2008/132505): test the acoustic and tip-based models against new data

Independent crystallography data (Bioorg Med Chem Lett 18:2776; 18:5717; 20:6242; 21:2207): test the acoustic and tip-based models against X-ray crystal structure pharmacophores

Results

Page 13: Dispensing Processes Profoundly Impact Biological Assays and Computational and Statistical Analyses

Process                     Hydrophobic features (HPF)   Hydrogen bond acceptor (HBA)   Hydrogen bond donor (HBD)   Observed vs. predicted IC50 (r)
Acoustic mediated process   2                            1                              1                           0.92
Tip-based process           0                            2                              1                           0.80

• Ekins et al., PLOSONE, In press

Acoustic Tip based

Tyrosine kinase EphB4 Pharmacophores

Generated with Discovery Studio (Accelrys)

Cyan = hydrophobic

Green = hydrogen bond acceptor

Purple = hydrogen bond donor

Each model shows most potent molecule mapping

Page 14: Dispensing Processes Profoundly Impact Biological Assays and Computational and Statistical Analyses

• An additional 12 compounds from AstraZeneca Barlaam, B. C.; Ducray, R., WO 2008/132505 A1, 2008

• 10 of these compounds had data for tip-based dispensing and 2 for acoustic dispensing

• Calculated LogP and logD showed low but statistically significant correlations with tip-based dispensing data (r2 = 0.39, p < 0.05 and r2 = 0.24, p < 0.05, respectively; N = 36)

• Used as a test set for pharmacophores

• The two compounds analyzed with acoustic liquid handling were predicted in the top 3 using the ‘acoustic’ pharmacophore

• The ‘Tip-based’ pharmacophore failed to rank the retrieved compounds correctly

Test set evaluation of pharmacophores

Page 15: Dispensing Processes Profoundly Impact Biological Assays and Computational and Statistical Analyses

Automated receptor-ligand pharmacophore generation method

Pharmacophores for the tyrosine kinase EphB4 generated from crystal structures in the Protein Data Bank (PDB) using Discovery Studio version 3.5.5

Cyan = hydrophobic

Green = hydrogen bond acceptor

Purple = hydrogen bond donor

Grey = excluded volumes

Each model shows most potent molecule mapping

Bioorg Med Chem Lett 2010, 20, 6242-6245.

Bioorg Med Chem Lett 2008, 18, 5717-5721.

Bioorg Med Chem Lett 2008, 18, 2776-2780.

Bioorg Med Chem Lett 2011, 21, 2207-2211.

Page 16: Dispensing Processes Profoundly Impact Biological Assays and Computational and Statistical Analyses

• In the absence of structural data, pharmacophores and other computational and statistical models are used to guide medicinal chemistry in early drug discovery.

• Our findings suggest acoustic dispensing methods could improve HTS results and avoid the development of misleading computational models and statistical relationships.

• Automated pharmacophores are closer to the pharmacophore generated with acoustic data: all have hydrophobic features, which are missing from the tip-based pharmacophore model.

• The importance of hydrophobicity is seen in the logP correlation and in the crystal structure interactions.

• Public databases should annotate this metadata alongside biological data points, to create larger datasets for comparing different computational methods.

Summary

Page 17: Dispensing Processes Profoundly Impact Biological Assays and Computational and Statistical Analyses

Acoustic vs. Tip-based Transfers

[Figure panels: acoustic % inhibition versus aqueous % inhibition; serial dilution IC50 (μM) versus acoustic IC50 (μM), on linear and logarithmic axes; log IC50 tips versus log IC50 acoustic]

Adapted from Spicer et al., Presentation at Drug Discovery Technology, Boston, MA, August 2005

Adapted from Wingfield, Presentation at ELRIG2012, Manchester, UK (note different orientation)

Adapted from Wingfield et al., Amer. Drug Disco. 2007, 3(3):24

Data in this presentation

No previous analysis of molecule properties

Page 18: Dispensing Processes Profoundly Impact Biological Assays and Computational and Statistical Analyses

Strengths and Weaknesses

• Small dataset size – focused on one compound series

• No previous publication describing how data quality can be impacted by dispensing and how this in turn affects computational models and downstream decision making.

• No comparison of pharmacophores generated from acoustic dispensing and tip-based dispensing.

• No previous comparison of pharmacophores generated from in vitro data with pharmacophores automatically generated from X-ray crystal conformations of inhibitors.

• Severely limited by number of structures in public domain with data in both systems

• Reluctance of many to accept that this could be an issue

• Ekins et al., PLOSONE, In press

Page 19: Dispensing Processes Profoundly Impact Biological Assays and Computational and Statistical Analyses

The stuff of nightmares?

How much of the data in databases is generated by tip-based serial dilution methods? We don’t know…the meta data doesn’t tell us!

How much is erroneous?

Do we have to start again?

How does it affect all subsequent science – data mining etc?

Does it impact pharma's productivity?

Page 20: Dispensing Processes Profoundly Impact Biological Assays and Computational and Statistical Analyses

Simple Rules for licensing “open” data

Williams, Wilbanks and Ekins. PLoS Comput Biol 8(9): e1002706, 2012

1: NIH and other international scientific funding bodies should mandate …open accessibility for all data generated by publicly funded research immediately

Could data ‘open accessibility’ equal ‘Disruption’?

Ekins, Waller, Bradley, Clark and Williams. DDT, 18:265-71, 2013

As we see a future of increased database integration the licensing of the data may be a hurdle that hampers progress and usability.

Page 21: Dispensing Processes Profoundly Impact Biological Assays and Computational and Statistical Analyses

You can find me @... CDD Booth 205

PAPER ID: 13433. PAPER TITLE: “Dispensing processes profoundly impact biological assays and computational and statistical analyses”. April 8th 8.35am Room 349

PAPER ID: 14750. PAPER TITLE: “Enhancing High Throughput Screening For Mycobacterium tuberculosis Drug Discovery Using Bayesian Models”. April 9th 1.30pm Room 353

PAPER ID: 21524. PAPER TITLE: “Navigating between patents, papers, abstracts and databases using public sources and tools”. April 9th 3.50pm Room 350

PAPER ID: 13358. PAPER TITLE: “TB Mobile: Appifying Data on Anti-tuberculosis Molecule Targets”. April 10th 8.30am Room 357

PAPER ID: 13382. PAPER TITLE: “Challenges and recommendations for obtaining chemical structures of industry-provided repurposing candidates”. April 10th 10.20am Room 350

PAPER ID: 13438. PAPER TITLE: “Dual-event machine learning models to accelerate drug discovery”. April 10th 3.05pm Room 350