modeling melting points

Post on 10-May-2015

Presentation to ORU College of Science and Engineering - November 27, 2012

COLLECTING, CURATING, AND MODELING MELTING POINTS

Andrew Lang

Professor of Mathematics

Oral Roberts University

Open Drug Discovery for Neglected Diseases

Malaria, Schistosomiasis, Gram positive bacteria, Breast Cancer

Drugs for neglected diseases

need to be…

cheap and…

easy to make.

docking

combinatorial library

synthesis

solvent selection

recrystallization

biological assay

solubility models

solubility data

melting point models

melting point data

The big picture

Oral Roberts University undergraduate research

Cameron Neylon, Biophysicist, RAL

David Bulger, MD/PhD Student, Tennessee

Solubility Measurements and Ugi Product Synthesis at ORU, Drexel, and RAL

Submeta ONS Award Winner, BOE Award Winner

Supervisors: Robert Stewart, Lois Ablin, Bill Collier, Joel Gaikwad, Jean-Claude Bradley, and Cameron Neylon

Lizzie Clark, Nursing Major

Lacey Condron, Chemistry Major

Samantha Gaines, Lizzie Clark, and Lacey Condron

Solubility Measurements and Solubility Modeling at ORU

Supervisors: Ken Weed, Lois Ablin

Daryl Charron, Alejandro Hernandez, Maria Hernandez, Jesse Patsolic, Matthew Wilson

Cluster Computer Construction and In-Silico Docking at ORU

Supervisor: Ken Preston

Let’s focus

Early models, before 2005, were…

…specialized

1979 Martin – disubstituted benzenes
1987 Hanson – normal alkanes
1988 Needham – normal and branched alkanes
1990 Abramowitz – non-hydrogen bonded benzenes
1991 Dearden – anilines
1993 Katritzky – aldehydes, amines, and ketones
1994 Simamora – rigid aromatic
1996 Charlton – alkanes
1996 Katritzky – pyridines
1999 Zhao – aliphatic
2001 Chickos – homologous series
2003 Bergstrom – druglike (N = 277, r2 = 0.54)

In 2005…

…everything changed

MDPI - cheminformatics.org

Karthikeyan 2005 N = 4173, r2 = 0.65

PHYSPROP

Clark 2005 N = 6257, r2 = 0.61

Recent melting point models use these datasets…

…never reproducing r2 = 0.65 (reported values range from 0.47 to 0.56)

Even though [a] melting point can be measured accurately, its prediction has been a notoriously difficult problem.

We began measuring, collecting, and curating melting points in the Fall of 2010

Jean-Claude Bradley’s Chemical Information Retrieval course at Drexel

567 curated and referenced measurements from Fall 2010 Chemical Information Retrieval course

Most popular data sources…

…chemical vendors

Alfa Aesar donates ~13,000 melting points to the public domain

ONS melting point workflow: collection, curation, modeling, validation, measurement

Collection: Open Data

Source             Data points   Curated values   Source year   Data type
Bell                      2483             1631          1995   donated-CC0
Bergstrom                  277              277          2003   open
MDPI-Karthikeyan          4450             4084          2005   open
Hughes                     287              262          2008   open
Oxford-MSDS               3217             1481          2010   open
Drugbank                   875              875          2011   open
Griffiths                 3757              278          2011   donated-CC0
Alfa Aesar               12986             8739          2011   donated-CC0
PHYSPROP                 11645             9694          2011   donated-CC0
ONS                        471              471          2012   open

27792 curated measurements for 19410 compounds
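
As a quick consistency check, the curated-values column of the collection table can be summed and compared with the quoted total. The sketch below only re-adds the figures from the slide; the dictionary name is illustrative.

```python
# Curated values per source, copied from the collection table.
curated_values = {
    "Bell": 1631,
    "Bergstrom": 277,
    "MDPI-Karthikeyan": 4084,
    "Hughes": 262,
    "Oxford-MSDS": 1481,
    "Drugbank": 875,
    "Griffiths": 278,
    "Alfa Aesar": 8739,
    "PHYSPROP": 9694,
    "ONS": 471,
}

total = sum(curated_values.values())
print(total)  # 27792, matching the quoted number of curated measurements
```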

Curation is…

…lots of hard, tedious work (Jean-Claude Bradley and Antony Williams)

Antony Williams – RSC ChemSpider

Inconsistencies and SMILES problems within the “high trust level” MDPI dataset

PHYSPROP Structure Errors (Incorrect Valence): 2315 out of 43543 contained pentavalent nitrogens

PHYSPROP Errors: Structure displayed is for the neutral compound dopamine, but the associated CAS Number and chemical name in the file are for the hydrobromide salt.

Common errors

unit errors: Kelvin/Celsius, Fahrenheit/Celsius

bad SMILES (non-rendering, hypervalency)

salts associated with SMILES for free base

using boiling point for melting point
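
Unit errors of this kind leave a recognizable numerical signature when a suspect value is compared with a trusted one for the same compound. The following is a minimal sketch of such a check, not the curation procedure actually used; the function name, the 2 °C tolerance, and the example values are all assumptions made for illustration. It does not address the other error types (bad SMILES, salt/free-base mismatches, boiling points).

```python
def suspect_unit_error(reported_c, reference_c, tol=2.0):
    """Compare a reported melting point (nominally in Celsius) with a
    trusted reference value in Celsius and name a likely unit mix-up.

    The tolerance `tol` (degrees) is an arbitrary illustrative choice.
    """
    if abs(reported_c - reference_c) <= tol:
        return "consistent"

    # What the reported number would look like under each mix-up.
    candidates = {
        "Kelvin recorded as Celsius": reference_c + 273.15,
        "Fahrenheit recorded as Celsius": reference_c * 9.0 / 5.0 + 32.0,
        "Celsius converted as if Kelvin": reference_c - 273.15,
        "Celsius converted as if Fahrenheit": (reference_c - 32.0) * 5.0 / 9.0,
    }
    for label, expected in candidates.items():
        if abs(reported_c - expected) <= tol:
            return label
    return "unexplained"


# Hypothetical values, for illustration only.
print(suspect_unit_error(395.55, 122.4))
print(suspect_unit_error(252.3, 122.4))
print(suspect_unit_error(80.0, 122.4))
```

A value flagged as "unexplained" would still need manual review against the literature or a fresh measurement.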

Some melting points can’t be resolved from the literature alone: 4-benzyltoluene

Open lab notebook page measuring the melting point of 4-benzyltoluene

Modeling – All Data

Melting Point Model

CDK descriptor calculator

R statistical computing

melting point data

MP Model N = 19515, r2 = 0.80
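
For readers unfamiliar with the r2 figures quoted for each model, here is a minimal sketch of one common definition, the coefficient of determination (1 minus residual sum of squares over total sum of squares). The slides do not say which definition was used, so this is an assumption; the numbers below are hypothetical and are not taken from the dataset.

```python
def r_squared(measured, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(measured) != len(predicted) or not measured:
        raise ValueError("inputs must be non-empty and of equal length")
    mean = sum(measured) / len(measured)
    ss_res = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    ss_tot = sum((m - mean) ** 2 for m in measured)
    return 1.0 - ss_res / ss_tot


# Hypothetical melting points in Celsius, for illustration only.
measured = [10.0, 20.0, 30.0]
predicted = [12.0, 18.0, 33.0]
print(r_squared(measured, predicted))
```

Under this definition a model that always predicts the mean scores 0 and a perfect model scores 1.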

use this model

Modeling – Highly Curated Subset

compounds: doubleplusgood (double+ validated), single (single source)

CDK descriptor calculator

R statistical computing

data

Melting Point Model

MP Model N = 2704, r2 = 0.83

Straight chain carboxylic acids from 1 to 10 carbons

Straight chain alcohols from 1 to 10 carbons

Comparison of model with double+ validated measurements

Cyclic primary amines from 3 to 6 carbons: cyclobutylamine flagged for measurement (only a single source available)

Publication of double+ validated melting point dataset

…as a preprint

…as a book

Data and model deployed…

…on the web

web service

…in Google spreadsheets

…as an app

Use case: recrystallizing dibenzalacetone

Can the solvents used to recrystallize compounds in organic teaching labs be improved?

Trans-dibenzalacetone

Aldol condensation between two molecules of benzaldehyde and one molecule of acetone

[Matthew McBride: Undergraduate Research Assistant - Drexel]

Dibenzalacetone

First recrystallized in ethyl acetate in 1906: Straus and Ecker, Ber. 39, 2988 (1906)

Recrystallized in ethyl acetate in Organic Syntheses

Organic Teaching Labs

Recommended recrystallization solvent: ethyl acetate.

(http://classes.kvcc.edu/chm230/mixed%20aldol%20condensation.pdf)

(http://www.xula.edu/chemistry/documents/orgleclab/Aldol_notes.pdf)

Recrystallization App

Enter compound identification and desired parameters

How does it work?

1. Look up the solvent boiling point

2. Look up the room temperature solubility or predict it via measured or predicted Abraham descriptors

3. Look up the solute melting point or predict it via a model

4. Use the melting point and the solubility at room temperature to predict the solubility at boiling

5. Calculate the predicted recrystallization yield
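
Steps 4 and 5 can be sketched as follows. The slides do not give the formulas, so this is one plausible approach and not necessarily what the app does: it assumes ln(x) is linear in 1/T (a van 't Hoff-style line) passing through the room-temperature mole-fraction solubility and reaching x = 1 at the melting point, and it assumes the yield is the fraction of solute that drops out when a solution saturated at the boiling point is cooled to room temperature. Function names and all input values are hypothetical.

```python
import math


def solubility_at(temp_k, room_k, x_room, melt_k):
    """Mole-fraction solubility at temp_k, assuming ln(x) is linear in 1/T
    through (room_k, x_room) and (melt_k, 1.0)."""
    if not 0.0 < x_room < 1.0:
        raise ValueError("x_room must be a mole fraction between 0 and 1")
    if not room_k <= temp_k < melt_k:
        raise ValueError("need room_k <= temp_k < melt_k")
    # Fraction of the way from the melting point (0) to room temperature (1)
    # on the 1/T axis.
    frac = (1.0 / temp_k - 1.0 / melt_k) / (1.0 / room_k - 1.0 / melt_k)
    return math.exp(math.log(x_room) * frac)


def recrystallization_yield(x_cold, x_hot):
    """Fraction of dissolved solute recovered on cooling, for a fixed amount
    of solvent. x/(1-x) is moles of solute per mole of solvent."""
    if not 0.0 < x_cold <= x_hot < 1.0:
        raise ValueError("need 0 < x_cold <= x_hot < 1")
    cold_ratio = x_cold / (1.0 - x_cold)
    hot_ratio = x_hot / (1.0 - x_hot)
    return 1.0 - cold_ratio / hot_ratio


# Hypothetical inputs: room temperature 298.15 K, solvent boiling point
# 351.5 K, solute melting point 385.0 K, room-temperature solubility 0.005.
x_hot = solubility_at(351.5, 298.15, 0.005, 385.0)
print(x_hot, recrystallization_yield(0.005, x_hot))
```

A solvent that boils above the solute's melting point is rejected by the range check, since the assumed line would predict a mole fraction above 1 there.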

Results

Lists solvents and their predicted recrystallization yields.

Predictions are generated from the temperature-dependent solubility curves.

Comparison: ethyl acetate (predicted yield of 72%) vs ethanol (predicted yield of 93%)

[Figure: ethyl acetate and ethanol panels; labeled solubility values 0.09M, 1.1M, 0.62M, 2.06M]

Dibenzalacetone derivatives docking against tubulin (paclitaxel site)

Example

Derivatives of dibenzalacetone may be synthesized by altering the aldehyde used.

From a library of derivatives, the following compound was the top hit for the docking site of Taxol.

Uses phenanthrene-9-carboxaldehyde

Search Literature

Perform a Reaxys search to determine availability of synthesis procedures

No results

[Matthew McBride: Undergraduate Research Assistant - Drexel]

Synthesis and recrystallization solvents chosen using ONS models

Used methanol and benzene

Melting Point: 264-265°C

(http://usefulchem.wikispaces.com/EXP286)

[Matthew McBride: Undergraduate Research Assistant - Drexel]

Acknowledgements

ORU Biology and Chemistry Faculty
Jean-Claude Bradley (Drexel)
Cameron Neylon (RAL)
Antony Williams (RSC ChemSpider)
Evan Curtin (Drexel)
Matthew McBride (Drexel)

ORU research assistants: David Bulger, Daryl Charron, Lizzie Clark, Lacey Condron, Samantha Gaines, Alejandro Hernandez, Maria Hernandez, Jesse Patsolic, and Matthew Wilson