Many Patterns & Many Methods
Gordon Barr, Chris Gilmore & Gordon Cunningham
WestCHEM, Chemistry Department, University of Glasgow
New methods for visualising & utilising multiple analysis techniques in polymorph and salt screening systems
www.chem.gla.ac.uk/snap
The Problem
High throughput screening experiments can generate hundreds of PXRD patterns a day
Problems with:
Data quality.
Sample quality.
Data quantity.
Need for automation, and speed.
How do you deal with hundreds of samples from a single technique (e.g. XRPD), let alone more than one at once?
How to cluster powder patterns?
Compare pairs of patterns using full-profile parametric and non-parametric statistics
Match every data point – not just peak maxima!
Use correlation coefficients:
Pearson correlation coefficient (parametric).
Spearman correlation coefficient (non-parametric).
Figure panels: correlation coefficient +1.0; correlation coefficient -1.0
How to cluster powder patterns?
Match two patterns:
-> Get a correlation coefficient
Pattern A matches Pattern B with a correlation of: 0.314
Match n patterns:
-> Get a correlation between every pair of patterns
-> Can build an n x n correlation matrix
Correlations and Distances
Have a correlation matrix
Convert correlations to distances:
Correlation = 1.0 -> distance = 0.0
Correlation = -1.0 -> distance = 1.0
Correlation = 0.0 -> distance = 0.5
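The three values quoted above are consistent with a linear mapping d = 0.5 x (1 - r). The slide does not state the formula, so the sketch below treats that mapping as an assumption; the function name is hypothetical.

```python
import numpy as np

def correlation_to_distance(corr):
    """Map correlations in [-1, 1] to distances in [0, 1] (assumed linear mapping)."""
    corr = np.clip(np.asarray(corr, dtype=float), -1.0, 1.0)
    return 0.5 * (1.0 - corr)

# The three reference points quoted on the slide.
distances = correlation_to_distance([1.0, -1.0, 0.0])
```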
Take the distance matrix and perform:
Cluster analysis, Principal components analysis, Metric multidimensional scaling, Fuzzy clustering, Minimum spanning trees etc.
To find ‘interesting’ patterns and to visualize the data.
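Of the methods listed, metric multidimensional scaling is compact enough to sketch. The version below is the textbook classical (Torgerson) construction in NumPy, offered as an illustration under that assumption rather than as the PolySNAP implementation; the function name and the three test points are hypothetical.

```python
import numpy as np

def classical_mds(dist, n_dims=3):
    """Embed n items in n_dims dimensions from an n x n distance matrix."""
    d = np.asarray(dist, dtype=float)
    n = d.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    # Double-centre the squared distances to get an inner-product matrix.
    b = -0.5 * centering @ (d ** 2) @ centering
    eigvals, eigvecs = np.linalg.eigh(b)
    top = np.argsort(eigvals)[::-1][:n_dims]
    # Negative eigenvalues (non-Euclidean distances) are clipped to zero.
    scale = np.sqrt(np.clip(eigvals[top], 0.0, None))
    return eigvecs[:, top] * scale

# Hypothetical check: three points in a plane should be recovered up to rotation.
points = np.array([[0.0, 0.0], [3.0, 0.0], [0.0, 4.0]])
dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=2)
coords = classical_mds(dist, n_dims=2)
recovered = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
```

Plotting the first three columns of `coords` gives the kind of 3D scatter in which similar patterns sit close together.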
Methodology
Flowchart:
n XRPD patterns -> optional pre-processing -> full profile matching (all patterns against all patterns) -> n x n correlation matrix -> n x n distance matrix
From the distance matrix: PCA (principal components analysis), MMDS (metric multi-dimensional scaling), clustering via dendrograms
Estimate number of clusters
Identify possible mixtures
Identify most representative patterns for each cluster
Cluster visualisation tools
Colour-coded cell display
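The clustering steps in this flowchart can be sketched with SciPy's hierarchical clustering. Several choices here are assumptions not taken from the slides: average linkage, a fixed cut level, and picking the "most representative" pattern as the cluster member with the smallest mean distance to the rest of its cluster. Function names and the toy distances are hypothetical.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform

def cluster_at_cut(dist, cut_level, method="average"):
    """Cut the dendrogram built from an n x n distance matrix at cut_level."""
    condensed = squareform(np.asarray(dist, dtype=float), checks=False)
    tree = linkage(condensed, method=method)
    return fcluster(tree, t=cut_level, criterion="distance")

def most_representative(dist, labels):
    """For each cluster, return the index of the member closest to the others."""
    dist = np.asarray(dist, dtype=float)
    reps = {}
    for label in np.unique(labels):
        members = np.flatnonzero(labels == label)
        within = dist[np.ix_(members, members)].mean(axis=1)
        reps[int(label)] = int(members[np.argmin(within)])
    return reps

# Hypothetical distances: items 0-2 form one group, items 3-4 another.
positions = np.array([0.0, 0.1, 0.2, 5.0, 5.1])
dist = np.abs(positions[:, None] - positions[None, :])
labels = cluster_at_cut(dist, cut_level=1.0)
reps = most_representative(dist, labels)
```

Moving `cut_level` up or down merges or splits clusters, which is why the number of clusters has to be estimated rather than read off directly.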
Example: Doxazosin
Doxazosin (also indexed as: Cardura XL®, Cardura®) is a member of the alpha blocker family of drugs used to lower blood pressure in people with hypertension.
Doxazosin is also used to treat symptoms of benign prostatic hyperplasia (BPH).
Study performed using 21 patterns of 5 polymorphic forms of Doxazosin
Figure label: Cut Level
Metric multidimensional scaling (MMDS)
Example: Carbamazepine
Figure panels: no processing; light background subtraction; full background subtraction
2000 Pattern Dendrogram
Raman data works too!
… I'd cast my eye over the spectra and have done a spectral comparison of the data by eye.
I INDEPENDENTLY came up with five different spectral groups. … So bottom line is PolySNAP using background subtraction routines gave EXACTLY the same result as me doing a spectral comparison by eye.
… thought you all should know that IMHO this is a significant step forward.
Don Clark, Pfizer Global R&D
Raman Data Differences
Different background types
Much smaller differences between patterns
Cosmic spike problems
Figure: Form A and Form B compared by XRPD and by Raman
Raman Example – 3 form pharma
Different Data Types
Doesn’t have to be PXRD or Raman data:
IR
DSC
XRF
Other profile data
Numeric data
Multiple datasets
Combined XRPD + Raman instruments now available
Applying multiple techniques to the same samples gives additional info to work with
How would we actually combine results from two (or more) such different techniques?
Methodology
Flowchart:
n XRPD patterns -> full profile matching (all patterns against all patterns) -> n x n correlation matrix -> n x n distance matrix (XRD results)
n Raman patterns -> full profile matching (all patterns against all patterns) -> n x n correlation matrix -> n x n distance matrix (Raman results)
Combine the XRD and Raman distance matrices -> n x n distance matrix (combined results)
Combining Datasets
Manual weighting:
Give a single weight to each dataset as a whole
Combine datasets on that basis, e.g. Powder 0.8, Raman 0.2
Dynamic weighting:
Automatically calculate optimal weighting for each entry in each dataset
Unbiased solution that scales the differences between individual distance matrices
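For the manual case, one plausible reading is a weighted average of the per-technique distance matrices. The slides do not say exactly how the weights are applied, so the sketch below is an assumption; the function name and the 2 x 2 example matrices are hypothetical.

```python
import numpy as np

def combine_distances(matrices, weights):
    """Weighted average of several n x n distance matrices (one weight per dataset)."""
    weights = np.asarray(weights, dtype=float)
    if weights.min() < 0 or weights.sum() <= 0:
        raise ValueError("weights must be non-negative and not all zero")
    weights = weights / weights.sum()  # normalise so distances stay in [0, 1]
    stacked = np.stack([np.asarray(m, dtype=float) for m in matrices])
    return np.tensordot(weights, stacked, axes=1)

# Hypothetical two-sample example using the weights quoted on the slide.
powder = np.array([[0.0, 0.2], [0.2, 0.0]])
raman = np.array([[0.0, 0.6], [0.6, 0.0]])
combined = combine_distances([powder, raman], weights=[0.8, 0.2])
```

The combined matrix can then be passed to the same clustering and scaling steps as a single-technique distance matrix.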
Dynamic Weighting
Dynamic weighting using INDSCAL (Individual Differences Scaling): Carroll & Chang (1970), Psychometrika 35, 283-319
Each dataset has a 2-D distance matrix
D_k is the squared (n x n) distance matrix for dataset k
e.g. we have Raman and XRPD data on 20 samples, so there are 2 datasets (k = 1, 2) and n = 20.
We want a Group Average Matrix G to optimally describe our data
Specify diagonal weight matrices W_k, which can vary over the k datasets
Dynamic Weighting
Matrices are matched to a weighted form of G by minimising (1)
Where
(a double-centering operation on D), and
Solve (1) to get best values for G and W
The resulting G matrix is then used as before
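The equations on this slide did not survive in the transcript. For orientation, the textbook form of the INDSCAL objective is given below; it is not recovered from the slide itself. The symbols S, B_k, H, K and p are assumptions introduced here, with G taken as an n x p matrix and each W_k as a p x p diagonal matrix.

```latex
S = \sum_{k=1}^{K} \left\lVert B_k - G\, W_k^{2}\, G^{\mathsf{T}} \right\rVert^{2}
\qquad \text{(1)}

B_k = -\tfrac{1}{2}\, H D_k H,
\qquad
H = I_n - \tfrac{1}{n}\, \mathbf{1}\mathbf{1}^{\mathsf{T}}
```

Here B_k is the double-centred form of the squared distance matrix D_k, and H is the n x n centering matrix.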
Example: Combining Four Techniques
Dataset of Sulphathiazole, Carbamazepine + Mixtures
16 samples each had data from:
1. PXRD (collected on a Bruker C2 GADDS)
2. DSC (collected on a TA Instruments Q100)
3. IR (collected on a JASCO FT/IR 4100)
4. Raman (collected on a Renishaw inVia Reflex)
Combinations:
PXRD+Raman
PXRD+Raman+DSC
PXRD+Raman+DSC+IR … etc. [up to 15 sets of results!]
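The figure of 15 follows from counting every non-empty subset of the four techniques (2^4 - 1), single techniques included. A short enumeration, with a hypothetical helper name:

```python
from itertools import combinations

techniques = ["PXRD", "DSC", "IR", "Raman"]

def technique_subsets(names):
    """List every non-empty combination of techniques, smallest first."""
    return [
        combo
        for size in range(1, len(names) + 1)
        for combo in combinations(names, size)
    ]

subsets = technique_subsets(techniques)
```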
Side by side: Dendrograms
Side by side: 3D MMDS
Side by side: 3D MMDS
Combined Data: All Four
Live Demo – Multiple Datasets
Combined Conclusions
Full Profile Matching + Cluster analysis methods do very well in distinguishing forms automatically using either Raman or PXRD data individually
Combined results using Dynamic Weighting seem to do better than either PXRD or Raman individually
Use of combined data helps highlight any inconsistencies in separate analyses
Such inconsistencies would not be obvious with only one data source
Outliers can then be examined manually in detail
Seeing similar clustering from multiple original data sources increases confidence in the overall results
Pre-screening large datasets
Full analysis as shown is limited to 2,000 patterns per data set.
What if you’ve got more?
Is this new sample something seen before, or new?
Pre-screening allows a single sample pattern to be compared to large in-house database of existing patterns.
Compare e.g. >66,000 samples to new unknown in ~20 mins
Return the best 50 matches, then visualise using dendrograms, 3D Plots etc as before
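A minimal sketch of the pre-screening idea: correlate one new pattern against every row of a database and keep the best matches. It assumes all patterns share one grid and none is constant, uses Pearson correlation only, and makes no claim about how PolySNAP achieves the quoted timing. The function name and random test database are hypothetical.

```python
import numpy as np

def best_matches(query, database, top_n=50):
    """Return indices and correlations of the top_n database patterns."""
    db = np.asarray(database, dtype=float)
    q = np.asarray(query, dtype=float)
    db_centred = db - db.mean(axis=1, keepdims=True)
    q_centred = q - q.mean()
    # Pearson correlation of the query with each row (assumes non-constant patterns).
    norms = np.linalg.norm(db_centred, axis=1) * np.linalg.norm(q_centred)
    corr = (db_centred @ q_centred) / norms
    order = np.argsort(corr)[::-1][:top_n]
    return order, corr[order]

# Hypothetical database of 200 random patterns; the query is a rescaled copy of row 7.
rng = np.random.default_rng(0)
database = rng.random((200, 64))
query = 3.0 * database[7] + 1.0
indices, scores = best_matches(query, database, top_n=50)
```

The returned subset is small enough to feed into the full n x n analysis and the usual dendrogram and 3D plots.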
Salt Screening Mode
Salt screening: not interested in samples consisting of:
One of our starting materials
A mixture of multiple starting materials
Given a library of starting materials to compare the new samples to:
Just highlight what’s new and interesting
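One way to picture this filter: fit each sample as a linear combination of the starting-material patterns and flag it as "known" when the fit reproduces the profile well. This is an illustrative assumption, not the PolySNAP algorithm. The function name, the 0.9 threshold and the synthetic peaks are all hypothetical, and plain least squares is used, so coefficients are not constrained to be non-negative.

```python
import numpy as np

def explained_by_library(sample, library, threshold=0.9):
    """True if the sample is well fitted by a mix of library patterns plus an offset."""
    lib = np.asarray(library, dtype=float)   # shape: (materials, points)
    y = np.asarray(sample, dtype=float)
    design = np.column_stack([lib.T, np.ones(y.size)])
    coeffs, *_ = np.linalg.lstsq(design, y, rcond=None)
    fitted = design @ coeffs
    # Correlation between fitted and observed profile (hypothetical acceptance test).
    return bool(np.corrcoef(fitted, y)[0, 1] >= threshold)

x = np.linspace(0.0, 1.0, 300)

def peak(centre):
    """Synthetic single-peak pattern used only for this illustration."""
    return np.exp(-((x - centre) / 0.02) ** 2)

library = np.vstack([peak(0.3), peak(0.6)])
mixture = 0.7 * library[0] + 0.3 * library[1]
new_form = peak(0.8)
flags = [explained_by_library(s, library) for s in (library[0], mixture, new_form)]
```

Samples flagged `False` are the ones worth a closer look, since neither a single starting material nor a mixture accounts for them.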
How do I do this?
PolySNAP
Matlab or other stats packages
dSNAP: cluster & visualise 3D fragment geometry similarities from the Cambridge Structural Database
Acknowledgements
Many thanks to:
Arnt Kern & Karsten Knorr, Bruker AXS
Chris Frampton & Susie Buttar, Pharmorphix
For more information, please contact us:
Email: [email protected]
Web: www.chem.gla.ac.uk/snap