Dumas IR Statistical Analysisftp.esrf.eu/.../Dumas_IR_Statistical_Analysis.pdfWorkshop «HDF5 as...
Transcript of Dumas IR Statistical Analysisftp.esrf.eu/.../Dumas_IR_Statistical_Analysis.pdfWorkshop «HDF5 as...
Workshop « HDF5 as hyperspectral data analysis format » ESRF January 11-13, 2010 P. Dumas
Paul DUMASSOLEIL Synchrotron, France
Multivariate Analysis in Infrared Spectroscopy
Workshop « HDF5 as hyperspectral data analysis format » ESRF January 11-13, 2010 P. Dumas
Synchrotron IR data and «« basic »analysis
1000 1500 2000 2500 3000 3500 Wavenumbers (cm-1)
Point spectroscopy
Su
gar
regio
n
Workshop « HDF5 as hyperspectral data analysis format » ESRF January 11-13, 2010 P. Dumas
Synchrotron IR data and «« basic »analysis
ChemicalChemical
imagingimaging
Workshop « HDF5 as hyperspectral data analysis format » ESRF January 11-13, 2010 P. Dumas
Microscopes are commercial and softwares are proprietary
OPUS ( Bruker)OMNIC( Thermo Fischer)
Essentially used at SR-IR beamlines =
- Facing difficulties for providing the softwares to users
�Since data analysis is time consuming, usersrequest the software for post analysis whenback to their lab or institutes.
- Facing difficulties for exchanging format
�Users request beamtimes at differentbeamlines, and ask for merging data on equivalent samples, from different softwares
Workshop « HDF5 as hyperspectral data analysis format » ESRF January 11-13, 2010 P. Dumas
Some developers have madepossible IR data analysis
PyMCA ( A. Solé)
Constraint in file name format, not as friendly-user as the proprietary ones
Workshop « HDF5 as hyperspectral data analysis format » ESRF January 11-13, 2010 P. Dumas
Some developers have madepossible IR data analysis
Axis2000 ( A. Hitchcock, C. Jacobsen)
Not as friendly-user as the proprietary ones
Workshop « HDF5 as hyperspectral data analysis format » ESRF January 11-13, 2010 P. Dumas
Statistical analysis on spectraand images
data
0.06
0.08
0.10
0.12
0.14
0.16
0.18
0.20
0.22
0.24
0.26
Nic
1000 1500 2000 2500 3000 3500
cm-1
Hundreds, even thousands of spectraSubtle difference expected
Workshop « HDF5 as hyperspectral data analysis format » ESRF January 11-13, 2010 P. Dumas
Statistical analysis on spectraand images
PCA
HCA LDA Fuzzy-C means
ANN….. K-meansclustering
data
Specific preprocessing required
Workshop « HDF5 as hyperspectral data analysis format » ESRF January 11-13, 2010 P. Dumas
Typical example
1000 1500 2000 2500 3000 3500
Wavenumbers (cm-1)
MIE scattering+
Dispersive effect
Workshop « HDF5 as hyperspectral data analysis format » ESRF January 11-13, 2010 P. Dumas
However, single cell spectraare often affected by physical phenomena
Workshop « HDF5 as hyperspectral data analysis format » ESRF January 11-13, 2010 P. Dumas
P.Bassan, H.J. Byrne, F.Bonnier, J.Lee, P.Dumas and Peter Gardner ( The Analyst, 2009)
Workshop « HDF5 as hyperspectral data analysis format » ESRF January 11-13, 2010 P. Dumas
There are data treatment published( not yet satisfactory) to account for these effects
Estimating and Correcting Mie Scattering in Synchrotron-Based MicroscopicFourier Transform Infrared Spectra by Extended Multiplicative Signal Correction KOHLER A. ; SULE-SUSO J. ; SOCKALINGUM G. D. ; TOBIN M. ; BAHRAMI F. ; YANG Y. ; PUANKA J. ; DUMAS P. ; COTTE M. ; VAN PITTIUS D. G. ; PARKES G. ; MARTENS H.Applied spectroscopy 2008, vol. 62, no3, pp. 259-266
Resonant Mie scattering in infrared spectroscopy of biological materials -understanding the 'dispersion artefact' BASSAN Paul ; BYRNE Hugh J. ; BONNIER Franck ; LEE Joe ; DUMAS Paul ; GARDNER Peter Analyst 2009, vol. 134, no8, pp. 1586-1593 [
Workshop « HDF5 as hyperspectral data analysis format » ESRF January 11-13, 2010 P. Dumas
IR data preprocessing
1- Derivative spectra
3500 3000 2500 2000 1500 1000Wavenumbers ( cm-1)
Raw spectrum
First derivative
Second derivative
Workshop « HDF5 as hyperspectral data analysis format » ESRF January 11-13, 2010 P. Dumas
IR data preprocessing
2- Smoothing
0.8
0.6
0.4
0.2
0.0
-0.2
-0.4
1800 1700 1600 1500 1400 1300 1200 1100 1000
SG 11 points
SG 9 points
SG 7 points
SG 5 points
SG 3 points
Workshop « HDF5 as hyperspectral data analysis format » ESRF January 11-13, 2010 P. Dumas
IR data preprocessing
3- Normalization
-0.4
-0.2
0.0
0.2
0.4
1700 1650 1600 1550 1500Wavenumbers ( cm-1)
Workshop « HDF5 as hyperspectral data analysis format » ESRF January 11-13, 2010 P. Dumas
Statistical softwares mostly usedStatistical softwares mostlyused ( all proprietary and
expensives)
� Unscrambler ( www.camo.com)Individual spectra analysis
�Cytospec ( www.cytospec.com)Image analysis
Workshop « HDF5 as hyperspectral data analysis format » ESRF January 11-13, 2010 P. Dumas
Importing data in Unscrambler
Workshop « HDF5 as hyperspectral data analysis format » ESRF January 11-13, 2010 P. Dumas
Scores Loadings
Influence Explained variance
PCA Output
Workshop « HDF5 as hyperspectral data analysis format » ESRF January 11-13, 2010 P. Dumas
PCA additional features
Selecting spectrainside clusters
Workshop « HDF5 as hyperspectral data analysis format » ESRF January 11-13, 2010 P. Dumas
PCA additional features
Average spectra ofselected points
Workshop « HDF5 as hyperspectral data analysis format » ESRF January 11-13, 2010 P. Dumas
What’s missing in Unscrambler?
�Extracting all spectra from a 2D recording
�PCA of selected frequency ( energy region) region andof two non-adjacent frequency ( energy) regions
�Projecting the scores into the image of re-assembledspectra
Weak points:
Expensive ( between 15k€ to 35k€)
�Beta version of integrated imaging in June 2010
Workshop « HDF5 as hyperspectral data analysis format » ESRF January 11-13, 2010 P. Dumas
Hyperspectral imagingwith Cytospec
( Mathlab based-)
Workshop « HDF5 as hyperspectral data analysis format » ESRF January 11-13, 2010 P. Dumas
Importing data in Cytospec
Workshop « HDF5 as hyperspectral data analysis format » ESRF January 11-13, 2010 P. Dumas
Univariate imaging
Workshop « HDF5 as hyperspectral data analysis format » ESRF January 11-13, 2010 P. Dumas
Pre-processing options
Workshop « HDF5 as hyperspectral data analysis format » ESRF January 11-13, 2010 P. Dumas
Multivariate imaging options
Workshop « HDF5 as hyperspectral data analysis format » ESRF January 11-13, 2010 P. Dumas
HCA imaging options
Workshop « HDF5 as hyperspectral data analysis format » ESRF January 11-13, 2010 P. Dumas
HCA : cluster imaging
Two clusters
Workshop « HDF5 as hyperspectral data analysis format » ESRF January 11-13, 2010 P. Dumas
HCA : cluster imaging
Three clusters
Workshop « HDF5 as hyperspectral data analysis format » ESRF January 11-13, 2010 P. Dumas
Four clusters
HCA : cluster imaging
Workshop « HDF5 as hyperspectral data analysis format » ESRF January 11-13, 2010 P. Dumas
HCA : cluster imagingAverage spectra of each cluster
Workshop « HDF5 as hyperspectral data analysis format » ESRF January 11-13, 2010 P. Dumas
Where are the needs?
Routine all exists, needs to reassemble them
Mathematical treatment package
Experiment type ?
IR STXM Diffraction UV fluorescence Others…
« Specific» Menu
X-Fluo
�Spectra treatment and imaging from non-proprietary software
�Statistical analysis of spectra and images , with a common approach for all micro-spectroscopicanalytical tools
Workshop « HDF5 as hyperspectral data analysis format » ESRF January 11-13, 2010 P. Dumas
Where are the needs?
�Merging several spectral range on a same spectra
�Combining with other approaches on the same sample� resolution� probing depth� image alignment
+ +
Workshop « HDF5 as hyperspectral data analysis format » ESRF January 11-13, 2010 P. Dumas
In summary
Data acquisition is usually achieved using a proprietary software:
* they are all very good but…* difficulties in providing a copy to users* exchange data format difficult,
Statistical approaches do exist, most of the timeproprietary.But niether one is fully satisfactory.. Mathlab routines, IDL… PyMCA are available…
Users demand for software practice after theirexperiment is high, and there is no easy way to fulfil their request.