Introduction to Sensory Data Analysis
Marion Cuny, CAMO Software AS
Sensory Data Analysis: Course outline
1. Why sensory data analysis?
2. Data collection and experimental design
3. Inspection and preparation of the data
   a. Theory
   b. Demo in Quali‐Sense
4. Principal Component Analysis: PCA
   a. Theory
   b. Demo in The Unscrambler
5. Partial Least Squares Regression: PLS
   a. Theory
   b. Demo in The Unscrambler
1. Why sensory data analysis?
What can sensory data analysis provide us?
• Describing product characteristics / quality control
  – Sensory panel of experts: sensory profile
  – Chemical / industrial process measurements
  – Method: multivariate regression analysis
  – Outcome: cheaper quality control
• Understanding the behaviour and liking of the consumers
  – Consumer studies: preference mapping
  – Method: PCA / multivariate regression analysis
  – Outcome: relating product characteristics to the needs of the consumers; prediction of market response to a new product
• Investigation of competitive products / new recipes
  – Method: PCA
  – Outcome: positioning
Can we trust the sensory panel?
Assessors consistently give variable results, due to differences in motivation, sensitivity, and psychological response behaviors.
In a sensory lab:
• assessors come and go,
• time for training is short,
so measuring and tracking each assessor's performance is essential.
Focus of today
• Check the performance of the panel
  – Seeking the attributes that are the most reliable
  – Finding the panelists that need more training
• Modeling
  – Behaviour of the attributes, grouping of samples (PCA)
  – Regression over the preference (PLS)
Sensory Analysis workflow
[Figure: workflow diagram. Boxes: DoE; Selection of the products; Analysis of the products; Selection of the judges; CHEMICAL data on the products; DATA (Judge, Product); Check the data (Statistics, ANOVA); MVA (Sensory profile, Preference mapping, Relation between chemical and sensory data); Check the model & results.]
2. Data collection and experimental design
Data collection and experimental design in sensory analysis
Depending on objectives:
• Positioning: samples from the market
• Products (new recipe, QC) / reference: experimental design
• Maximum acceptance: experimental design
Requirements for input data
• Representative: samples must be representative with respect to:
  – Average values
  – Variability
  – Levels
• Accurate / reproducible: the grades must be the same for the same product, independently of the panelist and time
[Figure labels: Population, Variability, Sampling 1, Sampling 2]
Garbage in gives garbage out: no software program should find information where none exists.
3. Inspection and preparation of the data / a) Theory
Inspecting the data
• Data that are different from the others (typing errors, missing values...)
• Distribution of the samples for different attributes:
  – uniform,
  – groupings...
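As a rough illustration of the first check, the sketch below flags missing and out-of-range grades before any modeling. The record layout, the function name `find_suspect_scores` and the sample values are assumptions made for this example; only the 0 to 10 scale comes from the tomato data set used in this course.

```python
def find_suspect_scores(rows, low=0.0, high=10.0):
    """Return (row_index, attribute, reason) for missing or out-of-range scores."""
    problems = []
    for i, row in enumerate(rows):
        for attribute, value in row["scores"].items():
            if value is None:
                problems.append((i, attribute, "missing"))
            elif not (low <= value <= high):
                problems.append((i, attribute, "out of range"))
    return problems


# Hypothetical records: one row per (product, panelist) evaluation.
rows = [
    {"product": "A", "panelist": "P1", "scores": {"Sweetness": 6.5, "Acidity": 3.0}},
    # 65.0 looks like a typing error for 6.5; Acidity was not entered.
    {"product": "A", "panelist": "P2", "scores": {"Sweetness": 65.0, "Acidity": None}},
]
problems = find_suspect_scores(rows)
for index, attribute, reason in problems:
    print(f"row {index}: {attribute} is {reason}")
```

Flagged values still need a human decision (correct, re-measure, or exclude); the sketch only locates them.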
Judging panel performance
1. Assessor
  – Sensitivity
  – Reproducibility
2. Panel
  – Agreement
  – Checking for crossover and ranking (eggshell correlation)
Example data set: Tomatoes
17 tomato varieties (products)
11 descriptive evaluations (attributes), graded from 0 to 10
14 trained assessors (panelists)
Importing the data in Quali‐Sense
• Select the columns corresponding to the products and panelists
• Exclude the columns that are not attributes
• Adjust the score range
Preview of the data
Spider plot: overview, by product, of the judges' grades on the different attributes
• Branch = attribute
• Line = panelist
Sensitivity
• Measures the ability of a single assessor to identify product differences.
• A low p‐value shows a significant difference between products, and is thus good.
[Figure annotations: "Panelist needing training"; "Attribute not discriminative"]
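One common way to obtain such a p‐value is a one-way ANOVA per assessor and attribute, with the product as the factor and the replicate scores as observations. The slides do not say which test Quali‐Sense uses, so the sketch below is an assumption; the function name and the scores are made up. It computes only the F statistic; the p‐value is the upper tail of the F distribution with the returned degrees of freedom (for instance `scipy.stats.f.sf(f_stat, df_between, df_within)` if SciPy is available).

```python
def one_way_anova_f(groups):
    """F statistic of a one-way ANOVA; each group holds one product's replicate scores."""
    all_scores = [score for group in groups for score in group]
    grand_mean = sum(all_scores) / len(all_scores)
    group_means = [sum(group) / len(group) for group in groups]
    # Variation between product means (what a sensitive assessor should show).
    ss_between = sum(
        len(group) * (mean - grand_mean) ** 2
        for group, mean in zip(groups, group_means)
    )
    # Variation between replicates of the same product (assessor noise).
    ss_within = sum(
        (score - mean) ** 2
        for group, mean in zip(groups, group_means)
        for score in group
    )
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical replicate scores of ONE assessor on ONE attribute, three products.
scores_by_product = [[2.0, 3.0], [5.0, 6.0], [8.0, 9.0]]
f_stat, df_between, df_within = one_way_anova_f(scores_by_product)
print(f_stat, df_between, df_within)
```

A large F (small p‐value) means the assessor separates the products well compared with his or her own replicate noise.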
Reproducibility
Monitors the ability of a single assessor to reproduce his or her results with respect to the rest of the panel.
In the reproducibility plot:
• Size of the spot = mean difference in repeated scores for all products
• Color of the spot = frequency of bad replication
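The two quantities behind the spot size and color can be approximated as follows. This is a sketch under assumptions: two replicate sessions per product, and a "bad replication" defined as a replicate difference above a chosen tolerance. The tolerance of 2 points, the function names and the scores are hypothetical; the exact definitions used by Quali‐Sense are not given in the slides.

```python
def mean_replicate_difference(replicates):
    """Mean absolute difference between two replicate scores, over all products."""
    differences = [abs(first - second) for first, second in replicates.values()]
    return sum(differences) / len(differences)


def bad_replication_frequency(replicates, tolerance=2.0):
    """Share of products whose replicate difference exceeds the tolerance (assumed rule)."""
    bad = sum(1 for first, second in replicates.values() if abs(first - second) > tolerance)
    return bad / len(replicates)


# Hypothetical data for one assessor and one attribute:
# product -> (score in session 1, score in session 2)
assessor_scores = {"A": (6.0, 7.0), "B": (3.0, 3.0), "C": (8.0, 5.0)}
mean_difference = mean_replicate_difference(assessor_scores)
bad_frequency = bad_replication_frequency(assessor_scores)
print(mean_difference, bad_frequency)
```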
Assessor agreement
The agreement test measures each individual assessor's agreement compared to the rest of the panel.
Cross‐over
Crossover effects occur when an assessor scores products opposite in intensity to the rest of the panel.
Bad agreement and a high cross‐over probability indicate misuse of the scale.
Test 5: Rank Correlation
• Rank correlation is also a form of agreement test.
• Here, the rankings instead of the score values are used and compared between assessors.
• Rank correlation measures the correlation between an assessor and the panel consensus ranking of products.
• Rank correlation values can be used to form so-called "eggshell" plots.
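A minimal sketch of such a rank correlation, assuming Spearman's coefficient (Pearson correlation of the ranks, with tied values sharing their average rank). The slides do not name the statistic used by Quali‐Sense, and the scores below are invented for illustration.

```python
def ranks(values):
    """Ranks starting at 1; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equally long sequences."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    spread_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return covariance / (spread_x * spread_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation computed on the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical scores on four products for one attribute.
panel_consensus = [2.1, 4.0, 5.5, 7.9]    # panel mean per product
agreeing_assessor = [1.0, 3.0, 6.0, 9.0]  # same ordering as the panel
crossing_assessor = [9.0, 6.0, 3.0, 1.0]  # reversed ordering (cross-over)
rho_agree = spearman(agreeing_assessor, panel_consensus)
rho_cross = spearman(crossing_assessor, panel_consensus)
print(rho_agree, rho_cross)
```

An assessor who ranks the products like the panel gets a value near +1, while a cross-over pattern pushes the value towards -1.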
Rank correlation table
In this test, the assessor differences are found using the assessors' cumulative product ranks instead of the assessor scores directly.
Select the trusted data
Exclusion of panelists, samples, attributes
Make an average of the trusted data for multivariate analysis
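The exclusion and averaging step can be sketched as below: untrusted panelists and attributes are dropped, and the remaining grades are averaged per product and attribute to give the product-by-attribute table used in PCA and PLS. The record layout, the function name and the values are assumptions for this example.

```python
from collections import defaultdict


def average_trusted(records, excluded_panelists=(), excluded_attributes=()):
    """Average scores per (product, attribute) over the panelists that are kept."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for product, panelist, attribute, score in records:
        if panelist in excluded_panelists or attribute in excluded_attributes:
            continue
        sums[(product, attribute)] += score
        counts[(product, attribute)] += 1
    return {key: sums[key] / counts[key] for key in sums}


# Hypothetical long-format records: (product, panelist, attribute, score).
records = [
    ("A", "P1", "Sweetness", 6.0),
    ("A", "P2", "Sweetness", 8.0),
    ("A", "P3", "Sweetness", 1.0),  # P3 is treated as unreliable in this made-up example
    ("A", "P1", "Acidity", 3.0),
]
table = average_trusted(
    records, excluded_panelists={"P3"}, excluded_attributes={"Acidity"}
)
print(table)
```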
3. Inspection and preparation of the data / b) Demo in Quali-Sense
Quali-Sense
4. Principal Component Analysis: PCA / a) Theory
Principal Component Analysis (PCA)
• Exploratory data analysis
• Extract information
• Noise removal
• Dimensionality reduction
Data structure in PCA:
• Each row represents an observation
• Each column represents a variable
[Table layout: rows Object 1 to Object 4, columns Variable 1 to Variable 3]
X = Model + Error
Data = Structure + Noise
Principal Component Analysis (PCA)
New latent variables that are linear combinations of the original variables.
PC1 = a1 V1 + a2 V2 + a3 V3
X = Mean + b1 PC1 + b2 PC2 + Error
Constraints:
• Maximise the dispersion of samples along the latent variables (the variance)
• Orthogonality
PCA = a change of variable space
Principal Component Analysis (PCA)
[Figure: data cloud with the directions PC1 and PC2; one original axis is labelled "Adhesive". Annotations: "Average = most typical example"; "Principal Component 1 (PC 1)".]
Varimax Rotation
• The aim is to enhance interpretation.
• Rotation works on the structured part of the data only (depends on the selected number of PCs).
• Total explained variance is not changed (but more evenly distributed among PCs).
[Figure: scores and loadings plots, PC1 vs PC2]
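For readers who want to see what the rotation does numerically, here is a sketch of the classical iterative varimax algorithm applied to a loading matrix, using NumPy. It is not taken from The Unscrambler, whose implementation may differ; the function name and the random loadings are made up. Because the rotation matrix is orthogonal, the sum of squared loadings of each variable, and therefore the total, is unchanged, which mirrors the statement that total explained variance is not changed.

```python
import numpy as np


def varimax(loadings, max_iter=100, tol=1e-8):
    """Rotate a (variables x components) loading matrix with the varimax criterion."""
    n_variables, n_components = loadings.shape
    rotation = np.eye(n_components)
    criterion = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        # Gradient of the varimax criterion with respect to the rotation.
        column_ss = np.diag((rotated ** 2).sum(axis=0))
        gradient = loadings.T @ (rotated ** 3 - rotated @ column_ss / n_variables)
        u, singular_values, vt = np.linalg.svd(gradient)
        rotation = u @ vt  # closest orthogonal matrix to the gradient
        new_criterion = singular_values.sum()
        if new_criterion < criterion * (1 + tol):
            break  # no meaningful improvement any more
        criterion = new_criterion
    return loadings @ rotation, rotation


# Hypothetical loadings: 6 attributes on 2 principal components.
rng = np.random.default_rng(0)
loadings = rng.normal(size=(6, 2))
rotated_loadings, rotation = varimax(loadings)
print(np.round(rotated_loadings, 3))
```

The number of columns passed in is the number of PCs selected for the model, which is why the result depends on that choice.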
Data preprocessing before PCA
• In practice, there is often a need to slightly modify the shape of the data to better suit an analysis.
• Such a modification is called preprocessing or pretreatment (centering, scaling, derivatives...).
• But when we use a trained panel, it is not necessary.
Example data set: Tomatoes
17 tomato varieties (products)
11 descriptive evaluations (attributes), graded from 0 to 10
14 trained assessors (panelists)
PCA vocabulary
• Principal components: main data variations, also known as "latent variables", "factors" and "eigenvectors".
• Scores, T: map of samples. Projected locations of objects onto the principal components.
• Loadings, P: map of variables. Correlation between variables (regression of X on T).
• Residuals, E: error. The data can be divided into structure and residual: X = Xstruct + E.
• Variance:
  – Residual variance: variance remaining in E
  – Explained variance: the % variance explained by Xstruct
• Model equation: X = T Pᵀ + E (structure + residual)
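The vocabulary above maps directly onto a few lines of linear algebra. The sketch below computes scores T, loadings P, residuals E and explained variance through a singular value decomposition of the mean-centered data, using NumPy. It is a generic textbook formulation, not the algorithm of The Unscrambler, and the random 17 x 11 matrix only mimics the shape of the averaged tomato table.

```python
import numpy as np


def pca(X, n_components):
    """PCA by SVD of the mean-centered data: X = mean + T P^T + E."""
    mean = X.mean(axis=0)
    centered = X - mean
    U, singular_values, Vt = np.linalg.svd(centered, full_matrices=False)
    T = U[:, :n_components] * singular_values[:n_components]  # scores (map of samples)
    P = Vt[:n_components].T                                   # loadings (map of variables)
    E = centered - T @ P.T                                    # residuals
    # Fraction of the total variance carried by each retained component.
    explained = singular_values[:n_components] ** 2 / (singular_values ** 2).sum()
    return mean, T, P, E, explained


# Hypothetical data with the shape of the averaged table: 17 products x 11 attributes.
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(17, 11))
mean, T, P, E, explained = pca(X, n_components=2)
print(np.round(explained * 100, 1))  # % explained variance for PC1 and PC2
```

Plotting the columns of T against each other gives the score plot, and plotting the columns of P gives the loading plot.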
Example data set: Tomatoes
• External color: the scale has been used with good variation; 3 groups appeared.
• Acidity: the scale has been used with a small variation range; almost uniform distribution.
Data overview
Check the range of values... No outlier.
Do a PCA
Number of components to take into account
Explained variance in cross‐validation
Based on the explained variance in validation, we decide to take 5 PCs into account.
Map of samples
• Tomatoes displayed as a score plot.
• The purpose is to describe products according to their sensory characteristics.
• The relative positions of products reflect their similarities or differences.
[Figure annotation: "Average sample"]
Map of variables
• Loadings can be visualized to map which variables have contributed to the score plot.
• Variables far away from the center are well described and important.
• Variables near the center are less important.
Annotations on the loading plot:
• High contribution on PC 2: Firmness and Firmness inside are correlated, and anti‐correlated with Meltiness.
• High contribution on PC 1: Tomato odor/flavor, Juiciness, Sweetness and External color are anti‐correlated with Mealiness.
• Variables near the center: not contributing to PC1 & 2.
Bi‐Plot
Bi‐Plot with Varimax rotation
Conclusions on PCA
• Some variables are correlated:
  – "Firm" and "Firm inside" (anti‐correlated with "Meltiness")
  – "Tomato odor", "Tomato flavor", "Juiciness", "Sweetness", "External color" (anti‐correlated with "Mealiness")
  Some of those variables can be selected if we want to save time on sensory analyses.
• Some variables are not discriminative: "Acidity" and "Skin width". They don't have to be tested in the future.
• Some tomato varieties present similar characteristics:
  – F and K are "Firm"
  – G and H are "Firm inside"
  – A, O and C are "Juicy"
  – Q and D are "Melty"
  Some can be dropped in a consumer study.
4. Principal Component Analysis: PCA / b) Demo in The Unscrambler
5. PLS regression / a) Theory
Regression methods
Find a linear relationship between Y (the variables to predict) and the X‐variables (the variables explaining the data):
Y = B0 + B1 X1 + B2 X2 + … + BN XN + F
With PLS, the new variables are called "latent variables" (linear combinations of the former variables):
Y = B0 + B1 LV1 + B2 LV2 + … + BN LVN + F
LVi = a1 X1 + a2 X2 + … + ap Xp
[Figure: Y against X, showing an observation and its fitted value]
PLS terminology
• Scores (X‐scores: T, Y‐scores: T (or U)): map of samples. Projected locations of objects onto the model components.
• Loadings (X‐loadings: P, Y‐loadings: Q): map of variables. Describe relationships between either X or Y variables.
• Loading weights (X‐loading weights: W): describe relationships between X and Y variables.
• Residuals (X‐residuals: E, Y‐residuals: F): error.
• Variance: mean squares of residuals / degrees of freedom = residual variance.
• Model equations: X = T Pᵀ + E and Y = T Qᵀ + F
• Regression coefficients: Y = B0 + X1*B1 + X2*B2 + ... + XN*BN
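To make the terminology concrete, the sketch below fits a PLS1 model (a single Y variable) with a NIPALS-style loop in NumPy, producing loading weights W, loadings P, Y‐loadings q and the regression coefficients B0 and B. It is a generic textbook formulation offered as an assumption about how such a model can be computed, not the implementation of The Unscrambler; the function name and the synthetic data are invented.

```python
import numpy as np


def pls1(X, y, n_components):
    """PLS1 regression (single y) with a NIPALS-style deflation loop."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    E = X - x_mean          # X-residuals, deflated at each component
    f = y - y_mean          # y-residuals, deflated at each component
    W, P, q = [], [], []
    for _ in range(n_components):
        w = E.T @ f
        w = w / np.linalg.norm(w)       # loading weights
        t = E @ w                       # scores
        p = E.T @ t / (t @ t)           # X-loadings
        q_a = (f @ t) / (t @ t)         # y-loading
        E = E - np.outer(t, p)
        f = f - q_a * t
        W.append(w)
        P.append(p)
        q.append(q_a)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    B = W @ np.linalg.solve(P.T @ W, q)  # regression coefficients for X
    b0 = y_mean - x_mean @ B             # intercept
    return b0, B


# Hypothetical noise-free data: 17 samples, 3 X-variables, known coefficients.
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(17, 3))
true_coefficients = np.array([0.5, -1.0, 2.0])
y = 2.0 + X @ true_coefficients
b0, B = pls1(X, y, n_components=3)
print(round(b0, 3), np.round(B, 3))
```

With as many components as X‐variables the PLS solution coincides with ordinary least squares, which is what the example checks; in practice far fewer components are kept, and their number is chosen by validation.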
Example data set: Tomatoes
17 tomato varieties (products)
11 descriptive evaluations (attributes), graded from 0 to 10
14 trained assessors (panelists)
Distribution of Y = Preference
Do a PLS1
Selecting the number of latent variables
Model with 1 latent variable
Sample mapping
[Figure annotation: "Preference"]
Attributes explaining the preference
Preference is strongly correlated with "External color", "Sweetness", "Tomato flavor" and "Juiciness", and strongly anti‐correlated with "Mealiness".
[Figure annotations: "Important variables"; "Not important"]
Prediction quality
• Good R2: good correlation between prediction and measurement
• Good validation error: small error (0.3) when predicting the preference, which ranges from 3 to 10
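The two quality measures can be computed from measured and predicted preferences as sketched below. R2 is taken here as the coefficient of determination and the error as the root mean square error; whether the slide's figures use exactly these definitions is an assumption, and the numbers are invented. For a validation error, the predictions should come from cross‐validation or a test set rather than from the calibration fit.

```python
def r_squared(measured, predicted):
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    mean = sum(measured) / len(measured)
    ss_residual = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    ss_total = sum((m - mean) ** 2 for m in measured)
    return 1 - ss_residual / ss_total


def rmse(measured, predicted):
    """Root mean square error, in the same units as the preference scale."""
    squared_errors = [(m - p) ** 2 for m, p in zip(measured, predicted)]
    return (sum(squared_errors) / len(squared_errors)) ** 0.5


# Hypothetical preference scores and their predictions.
measured = [3.0, 5.0, 7.0, 9.0]
predicted = [3.5, 4.5, 7.5, 8.5]
r2 = r_squared(measured, predicted)
error = rmse(measured, predicted)
print(r2, error)
```

Comparing the error with the range of the measured values, as the slide does, tells whether it is small in practical terms.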
What did I earn...?
• A new tomato variety could be tested by a sensory panel on a restricted number of attributes:
  – Mealiness
  – External color
  – Tomato flavor
  – Juiciness
  – Sweetness
• To predict the consumer liking
Gain of time and money
5. PLS regression / b) Demo in The Unscrambler
6. Summary
Summary
1. Managing data from panelists - Evaluation of panel performance
2. Univariate statistics
3. Principal component analysis (PCA)
4. Varimax rotation
5. Regression analysis (PCR, PLS, MLR)
6. Preference mapping
7. L-PLS regression
8. Cluster analysis
9. Classification (SIMCA, PLS-DA)
10. 3-way PLS regression
11. Design of experiments
CAMO Products
The Unscrambler
A complete multivariate analysis and experimental design software.
On‐line applications:
• The Unscrambler on‐line
• OLUC
• OLUP
A plug 'n' play product designed to make effective on‐line predictions and classifications, to monitor processes and ensure quality control with spectroscopic measurements.
Product Optimizer
A powerful product formulation and process optimization tool.
Quali‐Sense
The best companion for the panel leader: detects the personal strengths and weaknesses of each assessor in your panel.
Training and Consultancy
Designed courses and support to help you get the best of your experiments and data.
Thank you for your attention
Marion Cuny for technical questions: [email protected]
Maria Falcão for sales: [email protected]