Predictions of Intrinsic Aqueous Solubility of Crystalline...

40
Predictions of Intrinsic Aqueous Solubility of Crystalline Drug-like Molecules James McDonagh Supervisors Dr John Mitchell and Dr Tanja van Mourik 1

Transcript of Predictions of Intrinsic Aqueous Solubility of Crystalline...

Page 1: Predictions of Intrinsic Aqueous Solubility of Crystalline ...chemistry.st-andrews.ac.uk/staff/jbom/group/James... · experimental sublimation and hydration free energies. (in red).

Predictions of Intrinsic Aqueous Solubility of Crystalline Drug-like

Molecules

James McDonagh

Supervisors Dr John Mitchell and

Dr Tanja van Mourik

1

Page 2: Predictions of Intrinsic Aqueous Solubility of Crystalline ...chemistry.st-andrews.ac.uk/staff/jbom/group/James... · experimental sublimation and hydration free energies. (in red).

Overview • Introduction.

• Solution model – Our solubility prediction model.

• Results – The performance of the solubility predictions.

• Further models – Work following on from the initial model.

• Current/future work – Where we are currently and where we are going.

2

Introduction Solution model Results Further models Current/future work

Page 3: Predictions of Intrinsic Aqueous Solubility of Crystalline ...chemistry.st-andrews.ac.uk/staff/jbom/group/James... · experimental sublimation and hydration free energies. (in red).

Why is solubility prediction important?

• Crucial factor to control bioavailability of drug candidates.

• Critical component in determining the environmental impact of pesticides.

• Accurate in silico predictions of solubility can save time and money.

3

Introduction Solution model Results Further models Current/future work

Page 4: Predictions of Intrinsic Aqueous Solubility of Crystalline ...chemistry.st-andrews.ac.uk/staff/jbom/group/James... · experimental sublimation and hydration free energies. (in red).

• New drug candidates are often more insoluble than their predecessors.

• Formulation of these drugs often involve less pleasant administration methods.

4

Introduction Solution model Results Further models Current/future work

Why should we care about solubility ?

• Certain pesticides can cause extensive damage and potentially enter the water cycle.

Page 5: Predictions of Intrinsic Aqueous Solubility of Crystalline ...chemistry.st-andrews.ac.uk/staff/jbom/group/James... · experimental sublimation and hydration free energies. (in red).

5

Existing Theoretical Approaches

• So far, although theoretical methods have shown promise, they have not matched the accuracy of QSPR. Theoretical methods do have the advantage of being physically tractable.

• Industry also requires high through put methods. QSPR models are generally much faster than computational chemistry models.

• There are many theoretical models to make solubility predictions.

Introduction Solution model Results Further models Current/future work

Page 6: Predictions of Intrinsic Aqueous Solubility of Crystalline ...chemistry.st-andrews.ac.uk/staff/jbom/group/James... · experimental sublimation and hydration free energies. (in red).

Our Methodologies

• We have decomposed the solution (solu) free energy prediction in to two distinct steps.

– Sublimation (sub)

– Hydration (hyd)

• We have applied a range of methodologies to each step.

• Methodologies include simulation, QM calculation and machine learning.

6

Introduction Solution model Results Further models Current/future work

Page 7: Predictions of Intrinsic Aqueous Solubility of Crystalline ...chemistry.st-andrews.ac.uk/staff/jbom/group/James... · experimental sublimation and hydration free energies. (in red).

Thermodynamic cycle

7

ΔG*hyd

ΔGsolu

Crystalline

Gaseous

Solution

ΔG*sub

1 atm

+1.89 kcal/mol

ΔGᵒsub ΔGᵒhyd

1 mol/L Sub = sublimation Hyd = hydration Solu = solution

Introduction Solution model Results Further models Current/future work

Page 8: Predictions of Intrinsic Aqueous Solubility of Crystalline ...chemistry.st-andrews.ac.uk/staff/jbom/group/James... · experimental sublimation and hydration free energies. (in red).

Sublimation : Predictions by DMACRYS • DMACRYS a periodic lattice

simulation program.

• Electrostatics, from distributed multipoles.

• Buckingham potential to account for repulsion and dispersion.

• Calculates lattice energy and crystal entropy from phonon modes.

• Gas phase contributions calculated in Gaussian 09.

8

ΔG sub

Solid

Gaseous

Introduction Solution model Results Further models Current/future work

Page 9: Predictions of Intrinsic Aqueous Solubility of Crystalline ...chemistry.st-andrews.ac.uk/staff/jbom/group/James... · experimental sublimation and hydration free energies. (in red).

Hydration: Solvation models

• Continuum solvent, solvation model based on density (SMD).

• An integral equation theory (IET) of molecular liquids methodology the Reference Interaction Site Model (RISM).

9

ΔG hyd

Dilute solution

Gaseous

Introduction Solution model Results Further models Current/future work

Page 10: Predictions of Intrinsic Aqueous Solubility of Crystalline ...chemistry.st-andrews.ac.uk/staff/jbom/group/James... · experimental sublimation and hydration free energies. (in red).

Hydration: RISM

10

• Combines features of explicit and implicit solvent models.

• Solvent density is modelled, but no explicit molecular coordinates or dynamics.

~45 CPU mins per compound

Introduction Solution model Results Further models Current/future work

Page 11: Predictions of Intrinsic Aqueous Solubility of Crystalline ...chemistry.st-andrews.ac.uk/staff/jbom/group/James... · experimental sublimation and hydration free energies. (in red).

Hydration: RISM • We use the 3D RISM variation.

• We employ the Kovalenko-Hirata (KH) closure and Gaussian fluctuation free energy functional.

• We also employ the universal correction (UC).

∆𝐺ℎ𝑦𝑑3𝐷𝑅𝐼𝑆𝑀−𝐾𝐻/𝑈𝐶

= ∆𝐺ℎ𝑦𝑑3𝐷𝑅𝐼𝑆𝑀 + 𝑎 𝜌𝑉 + 𝑏

• The methodology we used will be referred to as 3DRISM-KH/UC.

11 David, S. Palmer., Andrey I Frolov, Ekaterina L Ratkova and Maxim V Fedorov,. Journal of Physics: Condensed Matter 2010, 22 (49), 492101.

Introduction Solution model Results Further models Current/future work

Page 12: Predictions of Intrinsic Aqueous Solubility of Crystalline ...chemistry.st-andrews.ac.uk/staff/jbom/group/James... · experimental sublimation and hydration free energies. (in red).

Solution Free Energy • The sum of our predictions

of ΔG sub and ΔG hyd produce a ΔG solu prediction.

• These methods were carried out for 25 chemically diverse drug-like molecules.

• Chemical accuracy ~4 kJ/mol or ~ 1 LogS unit.

• Useful predictions are within the standard deviation (SD) of the experimental values.

12

ΔG solu

ΔG hyd ΔG sub

Introduction Solution model Results Further models Current/future work

Page 13: Predictions of Intrinsic Aqueous Solubility of Crystalline ...chemistry.st-andrews.ac.uk/staff/jbom/group/James... · experimental sublimation and hydration free energies. (in red).

13

Introduction Solution model Results Further models Current/future work

Data set selection

• The thermodynamically most stable polymorph was selected

• 10 had available experimental sublimation and hydration free energies. (in red).

Allopurinol ALOPUR

4-Aminobenzoic acid AMBNAC04

Trimethoprim AMXBPM10

Benzoic acid BENZAC02

Benzamide BZAMID02

Cocaine COCAIN10

Naproxen COYRUD11 Danthron

DAHNQU06

Primidone EPHPMO

Estrone ESTRON14

Acetaminophen HXACAN04

Ibuprofen IBPRAC01

Fluconazole IVUQOF

Isoproturon JODTUR01

Nitrofurantoin LABJON01

1-Naphthol NAPHOL01 Clozapine

NDNHCL01

Nicotinic acid NICOAC02

Niflumic acid NIFLUM10

Pindolol PINDOL

Pteridine PTERID11

Pyrene PYRENE07

Salicylic acid SALIAC

Diclofenac SIKLIH01

Mefenamic acid XYANAC

Page 14: Predictions of Intrinsic Aqueous Solubility of Crystalline ...chemistry.st-andrews.ac.uk/staff/jbom/group/James... · experimental sublimation and hydration free energies. (in red).

Sublimation free energy predictions

• Validation of the sublimation free energy prediction.

• B3LYP/6-31G(d,p) multipole.

• FIT repulsion dispersion potential.

• Correlation coefficient (R) 0.87.

• RMSE 5.66kJ/mol.

14

Introduction Solution model Results Further models Current/future work

Page 15: Predictions of Intrinsic Aqueous Solubility of Crystalline ...chemistry.st-andrews.ac.uk/staff/jbom/group/James... · experimental sublimation and hydration free energies. (in red).

Hydration Free Energy Predictions

• Validation of the hydration free energy predictions.

• SMD HF/6-31G(d,p).

• Both have strong R values 0.93 RISM, 0.97 SMD.

• RISM has a significantly higher RMSE 4.85kJ/mol RISM, 2.91kJ/mol SMD.

15

3DRISM-KH/UC

SMD (HF)

Introduction Solution model Results Further models Current/future work

Page 16: Predictions of Intrinsic Aqueous Solubility of Crystalline ...chemistry.st-andrews.ac.uk/staff/jbom/group/James... · experimental sublimation and hydration free energies. (in red).

Solution Free Energy Predictions

16

• The full 25 molecule set is compared to experiment.

• Chemical accuracy ~ 1logS unit. Experimental SD 1.79 LogS.

• Reasonable correlation R 0.85 RISM, R 0.84 SMD.

• RISM method provides best RMSE, RMSE of 1.45LogS RISM, RMSE of 2.03LogS SMD.

• SMD outliers Niflumic Acid and Pteridine.

Palmer, D. S.; McDonagh, J. L.; Mitchell, J. B. O.; van Mourik, T.; Fedorov, M. V., Journal of Chemical Theory and Computation 2012.

Introduction Solution model Results Further models Current/future work

DMACRYS + 3DRISM-KH/UC

DMACRYS+ SMD (HF)

Page 17: Predictions of Intrinsic Aqueous Solubility of Crystalline ...chemistry.st-andrews.ac.uk/staff/jbom/group/James... · experimental sublimation and hydration free energies. (in red).

To sum up

• From these results we concluded it was possible to make predictions of a reasonable accuracy.

• In our methodology a larger portion of error could be attributed to the sublimation free energy prediction.

• Larger datasets were required to fully validate the methodology.

17

Introduction Solution model Results Further models Current/future work

Page 18: Predictions of Intrinsic Aqueous Solubility of Crystalline ...chemistry.st-andrews.ac.uk/staff/jbom/group/James... · experimental sublimation and hydration free energies. (in red).

Further models

• We took two approaches to follow up this work:

– Parameterisation and machine learning approaches to predict ΔG solution.

– Systematic theoretical improvements in Sublimation free energy predictions. (work currently ongoing).

18

Introduction Solution model Results Further models Current/future work

Page 19: Predictions of Intrinsic Aqueous Solubility of Crystalline ...chemistry.st-andrews.ac.uk/staff/jbom/group/James... · experimental sublimation and hydration free energies. (in red).

Informatics and machine learning

• We selected a dataset of 100 molecules.

• Calculated descriptors using the chemistry development kit (CDK).

• We followed our previously laid out theoretical methodology for the 100 molecules.

• We combined descriptors and theoretically calculated energies.

19

Introduction Solution model Results Further models Current/future work

Page 20: Predictions of Intrinsic Aqueous Solubility of Crystalline ...chemistry.st-andrews.ac.uk/staff/jbom/group/James... · experimental sublimation and hydration free energies. (in red).

Descriptors

• SMILES are input into CDK.

• Structural and some predicted properties are output for use as descriptors.

• These cheminformatics descriptors were used as part of the input for the machine learning methods.

CC(=O)Oc1ccccc1C(=O)O

20

Introduction Solution model Results Further models Current/future work

Descriptor Value

Molecular Weight 280

Molecular Formula C9H8O4 XLogP 1.24

Freely rotatable bonds 3 H bond acceptor 4

….. …..

…… ……

Page 21: Predictions of Intrinsic Aqueous Solubility of Crystalline ...chemistry.st-andrews.ac.uk/staff/jbom/group/James... · experimental sublimation and hydration free energies. (in red).

Computational Chemistry Calculations • DMACRYS –B3LYP/

6-31G(d,p), FIT potential.

• SMD with HF/6-31G(d,p)

• The same level of theory was used in the gas phase as the solution phase.

• SMD selected over RISM as it provided a better correlation in the previous work.

21

Introduction Solution model Results Further models Current/future work

Page 22: Predictions of Intrinsic Aqueous Solubility of Crystalline ...chemistry.st-andrews.ac.uk/staff/jbom/group/James... · experimental sublimation and hydration free energies. (in red).

Machine Learning Models

• Random Forest (RF) – A forest of decision trees.

22

Introduction Solution model Results Further models Current/future work

Page 23: Predictions of Intrinsic Aqueous Solubility of Crystalline ...chemistry.st-andrews.ac.uk/staff/jbom/group/James... · experimental sublimation and hydration free energies. (in red).

• Support Vector Machines (SVM) – Classification by projection into a higher space and separation by a hyperplane.

23

Introduction Solution model Results Further models Current/future work

Page 24: Predictions of Intrinsic Aqueous Solubility of Crystalline ...chemistry.st-andrews.ac.uk/staff/jbom/group/James... · experimental sublimation and hydration free energies. (in red).

• Partial least squares Regression (PLS) – can be considered as classification by deflation.

24

Introduction Solution model Results Further models Current/future work

𝑦 = 𝑋𝑏 + 𝜖𝑡

1 0.5 51 0.7 71 0.9 9

50702

Page 25: Predictions of Intrinsic Aqueous Solubility of Crystalline ...chemistry.st-andrews.ac.uk/staff/jbom/group/James... · experimental sublimation and hydration free energies. (in red).

Work flow/Experimental design

RF

SVM

PLS

25

Introduction Solution model Results Further models Current/future work

Prepare and input

Descriptors and Theoretically

calculated energies

Internal cross validation

Internal cross validation

Internal cross validation

External 10 fold cross validation

Page 26: Predictions of Intrinsic Aqueous Solubility of Crystalline ...chemistry.st-andrews.ac.uk/staff/jbom/group/James... · experimental sublimation and hydration free energies. (in red).

Results

• Results of the purely theoretical prediction.

• Our results are correlative.

• Standard linear regression is a poor fitting model.

• Chemical accuracy ~1logS unit.

• Experimental SD 1.71 LogS units.

26

Introduction Solution model Results Further models Current/future work

R = 0.57 R² = 0.32

RMSE = 2.95 -12.00

-10.00

-8.00

-6.00

-4.00

-2.00

0.00

2.00

4.00

6.00

-10.00 -8.00 -6.00 -4.00 -2.00 0.00 2.00 4.00

Pre

dic

ted

Lo

gS

Experimental LogS

LogS experiment Vs DMACRYS-SMD HF/6-31G(d,p) predictions

Page 27: Predictions of Intrinsic Aqueous Solubility of Crystalline ...chemistry.st-andrews.ac.uk/staff/jbom/group/James... · experimental sublimation and hydration free energies. (in red).

Descriptors only

27

Introduction Solution model Results Further models Current/future work

• Results of prediction exclusively using the CDK descriptors.

• All machine learning methods perform better than theory alone.

• Red bars show the SD and mean result.

• Boxes represent 75% of the predictions. Dark blue line shows the median.

PLS RF SVM Theory Mean RMSE 1.174(±0.08) 1.134(±0.03) 1.132(±0.03) 2.95

Mean R2 0.56(±0.03) 0.56(±0.03) 0.56(±0.03) 0.32

Descriptors only RMSE

Descriptors only R2

RM

SE

R2

PLS RF SVM

PLS RF SVM

Predictive model

Predictive model

Page 28: Predictions of Intrinsic Aqueous Solubility of Crystalline ...chemistry.st-andrews.ac.uk/staff/jbom/group/James... · experimental sublimation and hydration free energies. (in red).

Combined model

28

Introduction Solution model Results Further models Current/future work

• The model contains HF/6-31G(d,p) energies and descriptors.

• All show improvement over pure theory.

• Result are similar to those of the descriptors alone.

• Experimental SD 1.71 LogS units.

PLS RF SVM Theory Average RMSE 1.110(± 0.04) 1.107(±0.03) 1.111(±0.04) 2.95

Average R2 0.594(±0.04) 0.583(±0.04) 0.576(±0.04) 0.32 Combined model R2

Combined model RMSE

RM

SE

R2

PLS RF SVM

PLS RF SVM

Predictive model

Predictive model

Page 29: Predictions of Intrinsic Aqueous Solubility of Crystalline ...chemistry.st-andrews.ac.uk/staff/jbom/group/James... · experimental sublimation and hydration free energies. (in red).

Summary

• The information from theoretical calculations at this level has a minor impact but does improve accuracy and correlation of the results.

• The descriptors already hold much of the information.

• Further exploration of models of this type could allow us to find information not held in the descriptors that is accessible by chemical calculation.

29

Introduction Solution model Results Further models Current/future work

Page 30: Predictions of Intrinsic Aqueous Solubility of Crystalline ...chemistry.st-andrews.ac.uk/staff/jbom/group/James... · experimental sublimation and hydration free energies. (in red).

Exploration of sublimation free energy prediction

• The largest source of error in the initial method was the sublimation free energy prediction.

• We have a dataset of 60 molecules.

• We made sublimation free energy predictions using this dataset with our previously outlined method.

• We look to DFT methods to provide improved predictions.

30

Introduction Solution model Results Further models Current/future work

Page 31: Predictions of Intrinsic Aqueous Solubility of Crystalline ...chemistry.st-andrews.ac.uk/staff/jbom/group/James... · experimental sublimation and hydration free energies. (in red).

Periodic DFT

• We are using Periodic DFT to make sublimation free energy predictions.

• We are exploring dispersion corrections.

• We will look at the accuracy of prediction of the components of the free energy.

31

Introduction Solution model Results Further models Current/future work

Page 32: Predictions of Intrinsic Aqueous Solubility of Crystalline ...chemistry.st-andrews.ac.uk/staff/jbom/group/James... · experimental sublimation and hydration free energies. (in red).

Free Energy of Sublimation

• From our initial methodology.

• A poor correlation.

• Significant RMSE.

• Outliers hold significant leverage.

• All outliers contain NO2 groups (in red) which are known to be difficult to represent accurately in force fields.

32

Introduction Solution model Results Further models Current/future work

R² = 0.39 RMSE 16.6

0.00

20.00

40.00

60.00

80.00

100.00

120.00

0.00 20.00 40.00 60.00 80.00 100.00 120.00

ΔG

su

b P

red

icte

d

ΔG sub Experimental

Experimental Vs Predicted ΔG sub

Page 33: Predictions of Intrinsic Aqueous Solubility of Crystalline ...chemistry.st-andrews.ac.uk/staff/jbom/group/James... · experimental sublimation and hydration free energies. (in red).

Summary

• We have explored a purely theoretical methodology for predictions of solvation free energy.

• We have expanded from this to produce a combined computational chemistry - cheminformatics methodology.

• We have begun exploration of sublimation free energy, due to the large error it contributed in our original methodology.

33

Introduction Solution model Results Further models Current/future work

Page 34: Predictions of Intrinsic Aqueous Solubility of Crystalline ...chemistry.st-andrews.ac.uk/staff/jbom/group/James... · experimental sublimation and hydration free energies. (in red).

Thanks for your attention Many thanks to collaborators, co-workers and funders.

• Dr David Palmer (RISM)

• Professor Maxim Fedorov (RISM)

• Ms Neetika Nath (Machine Learning)

• Dr Luna De Ferrari

• Dr Herbert Früchtl

• Professor Michael Bühl

34

Introduction Solution model Results Further models Current/future work

Thanks to the members of room 150 School of Chemistry University of St Andrews:

Dr Lazaros Mavridis

Rosanna Alderson

Neetika Nath

Dr Luna De Ferrari

Luke Crawford

Leo Holroyd

Many thanks to my Supervisors

Dr John Mitchell Dr Tanja van Mourik

Dr Ludovic Castro

Ava Sih-Yu Chen

Page 35: Predictions of Intrinsic Aqueous Solubility of Crystalline ...chemistry.st-andrews.ac.uk/staff/jbom/group/James... · experimental sublimation and hydration free energies. (in red).

35

Page 36: Predictions of Intrinsic Aqueous Solubility of Crystalline ...chemistry.st-andrews.ac.uk/staff/jbom/group/James... · experimental sublimation and hydration free energies. (in red).

36

Page 37: Predictions of Intrinsic Aqueous Solubility of Crystalline ...chemistry.st-andrews.ac.uk/staff/jbom/group/James... · experimental sublimation and hydration free energies. (in red).

Current Preliminary Results

Enthalpy of Sublimation

• Using DMACRYS – B3LYP multipoles FIT potential and Gaussian 09.

• 48 molecules.

• A fair correlation but significant RMSE.

37

R² = 0.5613 RMSE 15.20

80.00

90.00

100.00

110.00

120.00

130.00

140.00

150.00

160.00

170.00

80.00 90.00 100.00 110.00 120.00 130.00 140.00 150.00 160.00

Pre

dic

ted

ΔH

su

b k

J m

ol-

1

Experimental ΔH sub kJ mol-1

Experimental Vs Predicted ΔH sub

Page 38: Predictions of Intrinsic Aqueous Solubility of Crystalline ...chemistry.st-andrews.ac.uk/staff/jbom/group/James... · experimental sublimation and hydration free energies. (in red).

Entropy of Sublimation

• Crystal entropy calculated in DMACRYS, gas phase in Gaussian 09.

• No meaningful correlation.

• Significant RMSE.

• 48 molecules

38

R² = 0.0412 RMSE 13.55

52.00

54.00

56.00

58.00

60.00

62.00

64.00

66.00

0 20 40 60 80 100

TΔS

pre

dic

ted

kJ

mo

l-1

TΔS experimental kJ mol-1

Experimental Vs Predicted TΔS sub

Page 39: Predictions of Intrinsic Aqueous Solubility of Crystalline ...chemistry.st-andrews.ac.uk/staff/jbom/group/James... · experimental sublimation and hydration free energies. (in red).

Additional notes

• Buckingham potential • Universal correction ∆𝐺ℎ𝑦𝑑

3𝐷𝑅𝐼𝑆𝑀−𝑈𝐶 =

∆𝐺ℎ𝑦𝑑3𝐷𝑅𝐼𝑆𝑀 + 𝑎 𝜌𝑉 + 𝑏

• LogS ΔGsol−𝑅𝑇

ln (10)=log(S)

ΔGsolu = −𝑅𝑇𝑙𝑛𝑆 • 1logS unit = 5.71 in terms of

ΔG

• Thermodynamics • ΔHsub = -Ulatt +2RT • ΔSsub = (Sr+St) – Scrys

• ΔGsub = ΔHsub - TΔSsub

• ΔGhyd = Esol – Egas

• ΔGsolu = ΔGsub + ΔGhyd • ∆𝐺𝐺𝐹 =

𝐾𝐵𝑇 𝜌𝛼 𝑐𝛼 𝑟 −1

2𝑐𝛼 𝑟 ℎ𝛼 𝑟 𝑑𝑟

𝑅3𝑁𝑠𝑜𝑙𝑣𝑒𝑛𝑡𝛼=1

39

𝐸𝑙𝑘 = 𝐵𝑒𝑥𝑝 −𝐶𝑟𝑙𝑘 − 𝐴𝑟𝑙𝑘−6

∆𝐺𝑠𝑜𝑙𝑜 = ∆𝐺𝑠𝑢𝑏

𝑜 + ∆𝐺ℎ𝑦𝑑𝑜 = −𝑅𝑇𝑙𝑛(𝑆0𝑣𝑚)

Molar volume of the crystal Vm Intrinsic solubility So

Page 40: Predictions of Intrinsic Aqueous Solubility of Crystalline ...chemistry.st-andrews.ac.uk/staff/jbom/group/James... · experimental sublimation and hydration free energies. (in red).

40