
Box S1 | Dataset compilation and statistical analysis

Data on development compounds were compiled independently by the four companies according to a pre-agreed framework. This included broad agreement on the definitions of the categorical classifications, such as the reasons for failure. It should be emphasized, though, that some of these classifications still rely on subjective judgment. The guideline definitions for the reasons for failure are shown in the table below.

Clinical safety: Termination because clinical studies demonstrate that the number and/or clinical significance of observed adverse events is sufficient to preclude further development. Termination due to insufficient differentiation from competitor products, with respect to safety, is also included.

Commercial: Reasons may include (but are not limited to): budget or resource constraints; merger-related divestments.

Efficacy: Termination because in vitro, in vivo or clinical studies fail to provide evidence of sufficient therapeutic effect or diagnostic value, as measured by objective or subjective parameters. Termination due to insufficient differentiation from competitor products, with respect to efficacy, is also included.

Formulation: Termination because a physical formulation cannot be made which meets the requirements for administration in clinical or non-clinical studies (such as stability, solubility), or the requirements for final commercial formulation (such as composition and market image).

Non-clinical toxicology: Termination due to the identification of potential hazards in non-clinical studies (in vitro and/or in vivo), conducted at any time during discovery or development, which, when extrapolated to the potential risk to humans, preclude further development.

Patent issue: Termination because the desired patent cover cannot be secured in one or more territories.

Pharmacokinetics / bioavailability: Termination because in vitro, in vivo or clinical studies assessing absorption, distribution, metabolism and elimination (ADME) and/or bioavailability fail to meet the prerequisite level for successful development within the company.

Rationalisation of company portfolio: Termination because there is no longer a fit with the company's vision, strategic focus, objectives or business needs, excluding rationalisation as a direct consequence of a merger.

Regulatory: The project is terminated because regulatory authority (FDA, PMDA, EMEA etc.) decisions, such as a restriction on labelling claims or a change in the regulatory requirement(s) for the project, mean further development would not be feasible.

Scientific: Problems associated with the biological target, excluding efficacy, non-clinical toxicology and safety.

Technical: Technical problems with development of the compound other than formulation issues.

Molecular descriptors were generated using ChemAxon version 5.9 (www.chemaxon.com) in order to ensure consistency of data. The actual structures were then removed. The datasets were provided to a third party broker (Thomson Reuters), who combined the data and encrypted sensitive fields such as disease area in accordance with a prior agreement. All analyses were carried out using JMP® Statistical Discovery software, version 11.0.0 from SAS Institute Inc. (www.jmp.com). The dataset for launched drugs used in the comparisons consists of those launched between 2000 and 2012 as sourced from ChEMBL (https://www.ebi.ac.uk/chembl/).

SUPPLEMENTARY INFORMATION | In format provided by Waring et al. (JULY 2015)

NATURE REVIEWS | DRUG DISCOVERY www.nature.com/reviews/drugdisc

Oneway plots

Comparisons of distributions throughout the manuscript are plotted in the same manner, using the oneway plot, as in FIG. 4a. This depicts the full range of the data and the extent of the distributions, which are hidden when only mean values are presented. The green diamonds on the plot indicate the mean values: the central line of each diamond represents the group mean and the tips of the diamond represent its 95% confidence interval. The outer blue marks indicate the standard deviation about the mean; the inner blue marks show the standard error of the mean. The gold boxes, termed outlier box plots, depict the interquartile range. The central line within the box represents the median value and the ends of the box indicate the 25th and 75th percentiles. The whiskers extend 1.5 times the interquartile range below the 25th percentile and above the 75th percentile. Where shown, the points on the plot represent the individual data points, although they overlap considerably due to the large sample sizes. In all cases, a statistical test for significant differences between the mean values of each set was carried out using the Tukey–Kramer method (see below). Where significant differences were observed, these are illustrated on the right-hand side of the plots with comparison circles. The circles indicate the test for statistically significant differences in the mean values: where there is no overlap between the circles, or where they overlap only slightly, such that the angle of intersection between the circles is less than 90°, the differences may be considered statistically significant.
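As an illustration of the quantities a oneway plot draws for a single group, the following minimal Python sketch computes them from raw values. The function name `oneway_summary`, the toy data and the choice of quantile interpolation are assumptions made for this example; JMP may use a different quantile definition, and the multiplier 1.96 is the large-sample normal approximation to the 95% interval rather than the exact value used by the software.

```python
import math
import statistics


def oneway_summary(values, ci_multiplier=1.96):
    """Summarise one group in the way a oneway plot depicts it.

    ci_multiplier is an assumption: 1.96 is the large-sample normal value
    for a 95% interval; a t quantile would be used for small groups.
    """
    xs = sorted(values)
    n = len(xs)
    mean = statistics.fmean(xs)
    sd = statistics.stdev(xs)          # sample standard deviation
    sem = sd / math.sqrt(n)            # standard error of the mean
    # Quartiles; the "inclusive" interpolation rule is an assumption.
    q1, median, q3 = statistics.quantiles(xs, n=4, method="inclusive")
    iqr = q3 - q1
    return {
        "mean": mean,
        "ci95": (mean - ci_multiplier * sem, mean + ci_multiplier * sem),
        "sd": sd,
        "sem": sem,
        "q1": q1,
        "median": median,
        "q3": q3,
        # Limits 1.5 x IQR below the 25th and above the 75th percentile.
        "whisker_limits": (q1 - 1.5 * iqr, q3 + 1.5 * iqr),
    }


# Toy data, not taken from the dataset.
summary = oneway_summary([1, 2, 3, 4, 5, 6, 7, 8, 9])
print(summary)
```

The dictionary keys map onto the plot elements: `mean` and `ci95` give the diamond, `sd` and `sem` the blue marks, and `q1`, `median`, `q3` and `whisker_limits` the outlier box plot.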

[Figure (oneway plot, as in FIG. 4a): molecular mass (Da), y-axis from 0 to 800, for patent data, drug candidates and launched drugs, with comparison circles for all pairs, Tukey–Kramer, 0.05.]

Group             Mean (s.d.)   25th/50th/75th percentiles
Patent data       450 (92)      388/448/506
Drug candidates   443 (110)     383/442/499
Launched drugs    396 (140)     299/384/471
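The all-pairs Tukey–Kramer comparison can be sketched as follows. This is a minimal illustration under stated assumptions, not the JMP implementation: the function name `tukey_kramer_q` and the three toy groups are hypothetical, and the code stops at the studentized range statistic q for each pair. A pair would be declared significantly different when its q exceeds the critical value of the studentized range distribution for the chosen level (0.05 here), the number of groups and the error degrees of freedom, which would be read from published tables.

```python
import math
from itertools import combinations
from statistics import fmean


def tukey_kramer_q(groups):
    """Return ({(a, b): q}, error_df) for all pairs of groups.

    groups maps a group name to a list of observations. The Kramer
    extension handles unequal group sizes through the 1/n_a + 1/n_b term.
    """
    k = len(groups)
    n_total = sum(len(v) for v in groups.values())
    means = {name: fmean(v) for name, v in groups.items()}
    # Pooled within-group variance (mean squared error).
    sse = sum((x - means[name]) ** 2
              for name, v in groups.items() for x in v)
    error_df = n_total - k
    mse = sse / error_df
    q_stats = {}
    for a, b in combinations(groups, 2):
        se = math.sqrt(mse / 2.0 * (1.0 / len(groups[a]) + 1.0 / len(groups[b])))
        q_stats[(a, b)] = abs(means[a] - means[b]) / se
    return q_stats, error_df


# Hypothetical toy groups with unequal sizes, not taken from the dataset.
toy = {"A": [1, 2, 3], "B": [2, 3, 4], "C": [6, 7, 8, 9]}
q_stats, error_df = tukey_kramer_q(toy)
for pair, q in q_stats.items():
    print(pair, round(q, 3))
```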



Mosaic plots

The mosaic plot is a graphical representation of a two-way frequency table. The plot is divided into rectangles such that the area of each rectangle is proportional to the proportion of the y variable in each category of the x variable. For further details see: Hartigan, J. A. & Kleiner, B. Mosaics for contingency tables, in Eddy, W. F., Ed., Computer Science and Statistics: Proceedings of the 13th Symposium on the Interface, pp. 268–273. Springer-Verlag, New York (1981). An example is given below from FIG. 6c.

In this example, the proportions on the x-axis (width of each category) represent the number of observations for each ionisation class; the proportions on the right-hand y-axis represent the overall proportions of compounds progressing to the clinic or failing for preclinical tox; each ionisation class is divided between the two outcomes such that the position of the division on the y-axis indicates the proportion of each outcome (“progressed to clinic” or “preclinical tox”) within that ionisation class, and thus the area of each box is proportional to the number of compounds in that group; the scale of the left-hand y-axis shows the response probability, with the whole axis being a probability of one.

Tests for statistically significant differences

Where the manuscript refers to significant differences in the mean values, this is tested, in all cases, using the Tukey–Kramer method. For further details, see: Tukey, J. A problem of multiple comparisons, Dittoed manuscript of 396 pages, Princeton University (1953); Kramer, C.Y. Extension of multiple range tests to group means with unequal numbers of replications, Biometrics, 12, 309–310 (1956); Hayter, A.J. A proof of the conjecture that the Tukey–Kramer multiple comparisons procedure is conservative, Ann. Stat., 12, 61–75 (1984).

For categorical values, significance was defined as a probability of less than 0.1 of exceeding χ² (with probabilities quoted), where χ² is given as twice the difference between two negative log-likelihoods, one computed with whole-population response probabilities and one with each-population response rates. For further details, see: Wilks, S. S. The large-sample distribution of the likelihood ratio for testing composite hypotheses, Ann. Math. Stat. 9, 60–62 (1938).
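The likelihood-ratio test for categorical values can be illustrated with a short Python sketch. It is a minimal example under stated assumptions rather than the JMP procedure: the function names and the counts in `toy_table` are hypothetical, the statistic is taken in the usual likelihood-ratio form (twice the difference between the pooled and per-group negative log-likelihoods), and the upper-tail probability is computed only for integer degrees of freedom.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability (prob > chi-square) for integer df >= 1."""
    z = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-z)
    else:
        a, q = 0.5, math.erfc(math.sqrt(z))
    # Recurrence: Q(a + 1, z) = Q(a, z) + z**a * exp(-z) / Gamma(a + 1)
    while a < df / 2.0 - 1e-9:
        q += z ** a * math.exp(-z) / math.gamma(a + 1.0)
        a += 1.0
    return q


def likelihood_ratio_chi2(table):
    """table: one row of response counts per x category.

    Returns (chi-square, degrees of freedom, prob > chi-square).
    """
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(col_totals)
    # Negative log-likelihood with whole-population response probabilities.
    nll_pooled = -sum(c * math.log(c / n) for c in col_totals if c > 0)
    # Negative log-likelihood with each group's own response rates.
    nll_groups = 0.0
    for row in table:
        r = sum(row)
        nll_groups -= sum(c * math.log(c / r) for c in row if c > 0)
    g2 = max(0.0, 2.0 * (nll_pooled - nll_groups))
    df = (len(table) - 1) * (len(col_totals) - 1)
    return g2, df, chi2_sf(g2, df)


# Hypothetical counts: rows are categories, columns are the two outcomes.
toy_table = [[30, 10], [20, 20]]
g2, df, p = likelihood_ratio_chi2(toy_table)
print(round(g2, 3), df, round(p, 4))
```

When every category has the same response rates the two log-likelihoods coincide, the statistic is zero and the probability is one.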

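[Figure (mosaic plot, from FIG. 6c): x-axis, ionisation class (Acid, Base, Neutral, Zwitterion); left-hand y-axis, response probability from 0 to 1.00.]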
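The geometry of a mosaic plot follows directly from the two-way frequency table, as the sketch below shows. It is an illustration only: the function name `mosaic_rectangles` and the counts are hypothetical and are not taken from FIG. 6c. Column widths are the share of observations in each x category, and heights are the outcome proportions within that category, so each rectangle's area equals that cell's share of all observations.

```python
def mosaic_rectangles(table):
    """Lay out a mosaic plot in the unit square.

    table maps x category -> {y category: count}; every x category is
    assumed to contain at least one observation.
    Returns a list of (x_cat, y_cat, x0, y0, width, height).
    """
    total = sum(sum(row.values()) for row in table.values())
    rects = []
    x0 = 0.0
    for x_cat, row in table.items():
        row_total = sum(row.values())
        width = row_total / total            # share of all observations
        y0 = 0.0
        for y_cat, count in row.items():
            height = count / row_total       # proportion within category
            rects.append((x_cat, y_cat, x0, y0, width, height))
            y0 += height
        x0 += width
    return rects


# Hypothetical counts for two ionisation classes.
counts = {
    "Acid": {"progressed to clinic": 30, "preclinical tox": 10},
    "Base": {"progressed to clinic": 40, "preclinical tox": 20},
}
rects = mosaic_rectangles(counts)
for r in rects:
    print(r)
```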



Descriptor distributions

Lipophilicity (calc logP, calc logD7.4)

Descriptor       Minimum   25% quartile   Median   75% quartile   Maximum   Mean     Std Dev
calc logP        -7.8      2.07           3.10     4.38           9.01      3.16     1.99

Ionisation (apKa, bpKa)

Descriptor       Minimum   25% quartile   Median   75% quartile   Maximum   Mean     Std Dev
(not labelled)   0.44      4.63           10.5     13.5           20.0      9.77     4.49

Size (MWt / Da, refractivity, HeavyAtomCount)

Descriptor       Minimum   25% quartile   Median   75% quartile   Maximum   Mean     Std Dev
MWt / Da         137.07    382.55         441.81   498.63         929       443.32   110.12
refractivity     27.68     103.63         121.64   136.25         246.7     120.86   30.61
HeavyAtomCount   8         27             31       35             66        31.24    7.86

Polarity / H bonding (tPSA, acceptorCount, donorCount)

Descriptor       Minimum   25% quartile   Median   75% quartile   Maximum   Mean     Std Dev
tPSA             0         64.15          84.40    103.3          432.6     86.38    33.01

Cross correlations of descriptors

Rows and columns follow the same order; the column numbers refer to the numbered rows.

                                 1     2     3     4     5     6     7     8     9    10    11    12    13    14    15    16    17    18    19    20    21    22    23    24
 1 calc logP                  1.00  0.83  0.01  0.08 -0.27 -0.16  0.41  0.44  0.42 -0.26 -0.19 -0.24 -0.01 -0.19 -0.02 -0.20  0.55  0.52  0.61  0.00 -0.15  0.00 -0.41  0.27
 2 calc logD7.4               0.83  1.00  0.26  0.26 -0.25 -0.14  0.39  0.41  0.40 -0.22 -0.16 -0.29 -0.06 -0.17 -0.05 -0.16  0.58  0.56  0.52  0.13 -0.16  0.02 -0.42  0.14
 3 apKa1                      0.01  0.26  1.00  0.55  0.22  0.25  0.00  0.03  0.03 -0.16 -0.10 -0.20 -0.04  0.17 -0.01  0.22  0.08  0.10 -0.08  0.20 -0.04  0.05 -0.03 -0.16
 4 apKa2                      0.08  0.26  0.55  1.00  0.17  0.15 -0.13 -0.05 -0.09 -0.33 -0.17 -0.44 -0.14  0.03 -0.08  0.11  0.04  0.05 -0.07  0.14 -0.17  0.01  0.05 -0.09
 5 bpKa1                     -0.27 -0.25  0.22  0.17  1.00  0.59 -0.06  0.01 -0.04 -0.15  0.06  0.05  0.10  0.29  0.02  0.32 -0.17 -0.13 -0.19  0.02  0.09 -0.01  0.36 -0.02
 6 bpKa2                     -0.16 -0.14  0.25  0.15  0.59  1.00  0.15  0.24  0.18  0.05  0.30  0.04  0.12  0.20 -0.12  0.32  0.10  0.13 -0.20  0.35  0.01  0.01  0.20  0.11
 7 MWt / Da                   0.41  0.39  0.00 -0.13 -0.06  0.15  1.00  0.94  0.98  0.52  0.56  0.22  0.70  0.24  0.02  0.26  0.51  0.46  0.45  0.11  0.23 -0.02 -0.10  0.67
 8 refractivity               0.44  0.41  0.03 -0.05  0.01  0.24  0.94  1.00  0.96  0.45  0.52  0.17  0.58  0.22  0.02  0.24  0.60  0.57  0.49  0.20  0.16 -0.02 -0.12  0.65
 9 HeavyAtomCount             0.42  0.40  0.03 -0.09 -0.04  0.18  0.98  0.96  1.00  0.51  0.56  0.20  0.67  0.24  0.03  0.26  0.56  0.52  0.48  0.15  0.22 -0.02 -0.11  0.67
10 PSA                       -0.26 -0.22 -0.16 -0.33 -0.15  0.05  0.52  0.45  0.51  1.00  0.81  0.66  0.45  0.00  0.03 -0.02  0.17  0.15 -0.03  0.22  0.28 -0.01 -0.01  0.46
11 acceptorCount             -0.19 -0.16 -0.10 -0.17  0.06  0.30  0.56  0.52  0.56  0.81  1.00  0.41  0.47  0.14 -0.09  0.22  0.21  0.20 -0.09  0.33  0.22 -0.05  0.07  0.47
12 donorCount                -0.24 -0.29 -0.20 -0.44  0.05  0.04  0.22  0.17  0.20  0.66  0.41  1.00  0.30 -0.08  0.10 -0.16 -0.06 -0.08  0.03 -0.10  0.27  0.05  0.06  0.36
13 aliphaticAtomCount        -0.01 -0.06 -0.04 -0.14  0.10  0.12  0.70  0.58  0.67  0.45  0.47  0.30  1.00  0.60  0.28  0.50 -0.24 -0.26 -0.01 -0.30  0.57 -0.01  0.52  0.56
14 aliphaticRingCount        -0.19 -0.17  0.17  0.03  0.29  0.20  0.24  0.22  0.24  0.00  0.14 -0.08  0.60  1.00  0.51  0.79 -0.35 -0.33 -0.22 -0.19  0.47 -0.01  0.53 -0.11
15 carboaliphaticRingCount   -0.02 -0.05 -0.01 -0.08  0.02 -0.12  0.02  0.02  0.03  0.03 -0.09  0.10  0.28  0.51  1.00 -0.12 -0.27 -0.26 -0.16 -0.15  0.43 -0.07  0.34 -0.03
16 heteroaliphaticRingCount  -0.20 -0.16  0.22  0.11  0.32  0.32  0.26  0.24  0.26 -0.02  0.22 -0.16  0.50  0.79 -0.12  1.00 -0.21 -0.19 -0.13 -0.11  0.24  0.04  0.37 -0.10
17 aromaticAtomCount          0.55  0.58  0.08  0.04 -0.17  0.10  0.51  0.60  0.56  0.17  0.21 -0.06 -0.24 -0.35 -0.27 -0.21  1.00  0.97  0.63  0.53 -0.34 -0.01 -0.72  0.26
18 aromaticRingCount          0.52  0.56  0.10  0.05 -0.13  0.13  0.46  0.57  0.52  0.15  0.20 -0.08 -0.26 -0.33 -0.26 -0.19  0.97  1.00  0.57  0.58 -0.33  0.00 -0.69  0.20
19 carboaromaticRingCount     0.61  0.52 -0.08 -0.07 -0.19 -0.20  0.45  0.49  0.48 -0.03 -0.09  0.03 -0.01 -0.22 -0.16 -0.13  0.63  0.57  1.00 -0.30 -0.21  0.01 -0.53  0.33
20 heteroaromaticRingCount    0.00  0.13  0.20  0.14  0.02  0.35  0.11  0.20  0.15  0.22  0.33 -0.10 -0.30 -0.19 -0.15 -0.11  0.53  0.58 -0.30  1.00 -0.19 -0.02 -0.29 -0.07
21 chiralCenterCount         -0.15 -0.16 -0.04 -0.17  0.09  0.01  0.23  0.16  0.22  0.28  0.22  0.27  0.57  0.47  0.43  0.24 -0.34 -0.33 -0.21 -0.19  1.00 -0.03  0.44  0.09
22 stereoDoubleBondCount      0.00  0.02  0.05  0.01 -0.01  0.01 -0.02 -0.02 -0.02 -0.01 -0.05  0.05 -0.01 -0.01 -0.07  0.04 -0.01  0.00  0.01 -0.02 -0.03  1.00 -0.11 -0.03
23 Fsp3                      -0.41 -0.42 -0.03  0.05  0.36  0.20 -0.10 -0.12 -0.11 -0.01  0.07  0.06  0.52  0.53  0.34  0.37 -0.72 -0.69 -0.53 -0.29  0.44 -0.11  1.00  0.09
24 rotatableBondCount         0.27  0.14 -0.16 -0.09 -0.02  0.11  0.67  0.65  0.67  0.46  0.47  0.36  0.56 -0.11 -0.03 -0.10  0.26  0.20  0.33 -0.07  0.09 -0.03  0.09  1.00

Distributions of selected calculated properties by target class (mean values)

Target class   MWt   calc logD7.4   calc logP
Enzyme         452   1.97           2.85
Epigenetics    387   1.44           2.92
GPCR           450   2.19           3.15
Ion Channel    333   1.21           2.09
Kinase         449   2.59           3.14
NHR            472   3.11           4.83
Other          441   2.21           3.49
Protease       464   1.23           1.83
Unknown        375   1.73           2.54
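A cross-correlation table of this kind can be produced from a set of descriptor columns as sketched below. The type of correlation coefficient is not stated in the text; this example assumes the Pearson product-moment coefficient. The function names and the three toy columns are hypothetical and do not reproduce any values from the dataset.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def correlation_matrix(columns):
    """columns maps descriptor name -> list of values, one per compound."""
    names = list(columns)
    return {a: {b: pearson_r(columns[a], columns[b]) for b in names}
            for a in names}


# Hypothetical toy descriptors for four compounds.
toy_columns = {
    "d1": [1.0, 2.0, 3.0, 4.0],
    "d2": [2.0, 4.0, 6.0, 8.0],   # perfectly correlated with d1
    "d3": [4.0, 3.0, 2.0, 1.0],   # perfectly anti-correlated with d1
}
matrix = correlation_matrix(toy_columns)
for name, row in matrix.items():
    print(name, [round(v, 2) for v in row.values()])
```

The result is symmetric with ones on the diagonal, as in the table above.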



Effect of carboaliphatic ring count on toxicology failure

The structural descriptors generated for these compounds consist chiefly of counts of various substructural motifs, such as the number of aromatic rings or chiral centres. Given the quantised nature of these descriptors, they were modelled as ordinal as well as continuous variables. The oneway analysis revealed a statistically significant difference in the means between the two categories (Tukey–Kramer HSD method) for only one structural descriptor, the carboaliphatic ring count. However, the small difference between the means (0.19 for toxicity compared with 0.29 for progressing) makes it hard to attach any meaningful value to this observation. Treating the descriptor as an ordinal variable showed that it just failed to achieve significance (prob > χ² = 0.057), and logistic regression showed that the probability of a compound being toxic changes little over the range tested (0.37, 0.30 and 0.23 for 0, 1 and 2 rings, respectively). Moreover, such observations should be treated with caution: given the number of descriptors considered in this analysis, some may appear statistically significant by chance alone.

[Figure: carboaliphatic ring count for compounds failing for toxicology and compounds progressing. Mean (tox) = 0.19; mean (progressing) = 0.29; χ² = 9.2, prob > χ² = 0.057.]
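The caution about chance findings can be made concrete with a short calculation. The sketch below assumes, purely for illustration, that 24 descriptors were each tested independently at the 0.05 level; the number of tests actually performed is not stated here, and real descriptors are correlated, so the result is a rough guide only. The function names are hypothetical.

```python
def familywise_error(alpha, n_tests):
    """Chance of at least one false positive among independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


def sidak_threshold(alpha, n_tests):
    """Per-test level that keeps the familywise error at alpha."""
    return 1.0 - (1.0 - alpha) ** (1.0 / n_tests)


# Assumption: 24 independent tests at the 0.05 level.
fwe = familywise_error(0.05, 24)
per_test = sidak_threshold(0.05, 24)
print(round(fwe, 3), round(per_test, 5))
```

Under these assumptions there is roughly a 71% chance that at least one descriptor would appear significant even if none had any real effect.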

Logistic regression examining the effect of lipophilicity measures on clinical safety outcomes

The logistic plot shows how the probability of observing a given categorical outcome, in this case a compound failing for clinical safety or reaching phase II, changes with a continuous variable, in this case calc logP or calc logD7.4, plotted on the x-axis. The change in probability is given by the blue line and should be compared with the overall probability across the whole range of x, indicated by the division on the right-hand y-axis. The probability is given on the left-hand y-axis. The position of the markers in the x-direction indicates their value of x (calc logP or calc logD7.4). The compounds reaching phase II are coloured green and positioned above the blue probability line; those failing for clinical safety are red and positioned below it. Other than that, the y-coordinate is randomly assigned.

[Figure: logistic plots for calc logP and calc logD7.4.]

Probability             calc logP   calc logD7.4
Overall                 22%         22%
At descriptor = 2.5     19%         20%
At descriptor = 3.5     22%         25%
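To show how the two quoted probabilities relate to a logistic curve, the sketch below recovers an intercept and slope from the calc logD7.4 values (20% at 2.5 and 25% at 3.5). It assumes a simple two-parameter logistic model and works from the rounded percentages, so the coefficients are approximate and are not the fitted values from the original analysis. The function names are hypothetical.

```python
import math


def logit(p):
    """Log-odds of a probability."""
    return math.log(p / (1.0 - p))


def logistic(x, intercept, slope):
    """Probability predicted by a two-parameter logistic model."""
    return 1.0 / (1.0 + math.exp(-(intercept + slope * x)))


def fit_two_points(x1, p1, x2, p2):
    """Intercept and slope of the logistic curve through two points."""
    slope = (logit(p2) - logit(p1)) / (x2 - x1)
    intercept = logit(p1) - slope * x1
    return intercept, slope


# Rounded probabilities quoted for calc logD7.4.
intercept, slope = fit_two_points(2.5, 0.20, 3.5, 0.25)
odds_ratio = math.exp(slope)   # change in odds per unit of calc logD7.4
print(round(intercept, 3), round(slope, 3), round(odds_ratio, 3))
```

On these figures the odds change by a factor of about 1.33 per unit of calc logD7.4, which illustrates how shallow the blue probability line is over this range.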