Post on 05-Jan-2016
Validation of Predictive Models: Acceptable Prediction Zone Method
Thomas P. Oscar, Ph.D.
USDA, Agricultural Research Service
Microbial Food Safety Research Unit
University of Maryland Eastern Shore
Princess Anne, MD
Background Information
Terminology
• Performance evaluation
– Process of comparing observed and predicted values.
• Validation
– A potential outcome of performance evaluation.
– Requires establishment of criteria.
Criteria
• Test Data
– Interpolation
– Extrapolation
• Performance
– Bias
– Accuracy
– Systematic Bias
Predictive Modeling
[Figure: flow diagram in three stages (Stage 1, Stage 2, Stage 3). Observed N(t) → Primary Model → Observed No, λ, μmax, Nmax → Secondary Models (No Model, λ Model, μmax Model, Nmax Model) → Predicted No, λ, μmax, Nmax → Primary Model → Predicted N(t). The Tertiary Model also outputs Predicted N(t).]
Performance Evaluation
• Goodness-of-fit: Primary/Secondary Models
• Verification: Tertiary Models
• Interpolation: All Models
• Extrapolation: All Models
Test Data Criteria: Interpolation
• Independent data.
• Within the response surface.
– Uniform coverage.
• Collected with same methods.
Incomplete and biased evaluation: model data (10 to 40 °C) versus test data (25 to 40 °C)
Test Data Criteria: Extrapolation
• Independent data.
• Outside the response surface.
– Only one variable differs.
• Collected with same methods.
Confounded comparison: Strain A in broth versus Strain B in food
Acceptable Prediction Zone Method: Description
Relative Error (RE)
RE for λ = (predicted - observed)/predicted
RE for N(t), No, μmax and Nmax = (observed - predicted)/predicted
RE < 0 are “fail-safe”
RE > 0 are “fail-dangerous”
[Figure: acceptable prediction zone plot of Relative error (-1.2 to 1.6) versus Predicted N(t) (log CFU/g) (4 to 11), with zones labeled "Acceptable", "Overly Fail-safe" and "Overly Fail-dangerous".]
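The two sign conventions for relative error can be sketched in a few lines of Python. This is only an illustration: the function name, the `is_lag_time` flag, and the numeric inputs are hypothetical, and the lag time is assumed to be the quantity (written λ here) that uses the reversed numerator.

```python
def relative_error(observed, predicted, is_lag_time=False):
    """Relative error with the sign convention described on the slide.

    Lag time uses (predicted - observed); N(t), No, mu_max and Nmax use
    (observed - predicted). With both conventions, RE < 0 is fail-safe
    and RE > 0 is fail-dangerous.
    """
    if predicted == 0:
        raise ValueError("predicted must be non-zero")
    if is_lag_time:
        return (predicted - observed) / predicted
    return (observed - predicted) / predicted


# Hypothetical values, for illustration only (not data from the slides).
# More growth observed than predicted: positive RE, fail-dangerous.
re_density = relative_error(observed=6.0, predicted=5.0)
# Observed lag longer than predicted: negative RE, fail-safe.
re_lag = relative_error(observed=12.0, predicted=10.0, is_lag_time=True)
```

Reversing the numerator for lag time keeps the interpretation of the sign the same for every parameter: a predicted lag that is shorter than the observed lag errs on the safe side, just as a predicted density that is higher than the observed density does.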
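Performance Factor: %RE = (RE_IN/RE_TOTAL) × 100, where RE_IN is the number of RE inside the acceptable prediction zone and RE_TOTAL is the total number of RE.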
Performance Criteria
• Acceptable Predictions
-0.30 < RE < 0.15 for μmax
-0.60 < RE < 0.30 for λ
-0.80 < RE < 0.40 for N(t), No, Nmax
• Acceptable Performance
%RE ≥ 70
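The criteria above can be applied mechanically. The following is a minimal sketch, assuming the zone bounds and the 70% threshold listed on the slide; the dictionary keys, function names, and the example list of relative errors are hypothetical and are not data from the study.

```python
# Acceptable prediction zone bounds (lower, upper) taken from the criteria.
APZ_BOUNDS = {
    "mu_max": (-0.30, 0.15),
    "lag": (-0.60, 0.30),
    "N": (-0.80, 0.40),  # N(t), No and Nmax share this zone
}


def classify_re(re, kind):
    """Label one relative error using the strict inequalities of the zone."""
    lower, upper = APZ_BOUNDS[kind]
    if re <= lower:
        return "overly fail-safe"
    if re >= upper:
        return "overly fail-dangerous"
    return "acceptable"


def percent_re(res, kind):
    """Percentage of relative errors that fall inside the acceptable zone."""
    inside = sum(1 for re in res if classify_re(re, kind) == "acceptable")
    return 100.0 * inside / len(res)


def acceptable_performance(res, kind, threshold=70.0):
    """True when %RE meets the performance criterion (%RE >= 70)."""
    return percent_re(res, kind) >= threshold


# Hypothetical relative errors for N(t), for illustration only.
example_res = [-0.9, -0.5, -0.1, 0.0, 0.2, 0.39, 0.45, 0.1, -0.3, 0.05]
example_pct = percent_re(example_res, "N")          # 8 of 10 inside the zone
example_ok = acceptable_performance(example_res, "N")
```

The zone is deliberately asymmetric: the fail-safe side is twice as wide as the fail-dangerous side, so a model is given more tolerance for erring in the safe direction.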
Acceptable Prediction Zone Method: Demonstration
Model Development Design
• Salmonella Typhimurium
– No = 4.8 log CFU/g
• Sterile cooked chicken
– 10, 12, 14, 16, 20, 24, 28, 32, 36, 38, 40 °C
• Viable counts
– BHI agar
– 12 per growth curve
Performance Evaluation Design: Secondary Models (Interpolation)
• Salmonella Typhimurium
– No = 4.8 log CFU/g
• Sterile cooked chicken
– 11, 13, 15, 18, 22, 26, 30, 34, 37, 39 °C
• Viable counts
– BHI agar
– 12 per growth curve
Primary Model: Logistic with Delay
N = No if t ≤ λ
N = Nmax/(1 + [(Nmax/No) - 1] exp[-μmax (t - λ)]) if t > λ
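The logistic-with-delay model can be written as a small function. This is a sketch under the assumption that the symbols dropped from the slide text are the lag time (λ) and the growth rate (μmax); the function and argument names are hypothetical, and the parameter values in the usage lines are illustrative rather than fitted values from the study.

```python
import math


def logistic_with_delay(t, n0, n_max, mu_max, lag):
    """Predicted N at time t (same units as n0 and n_max, e.g. log CFU/g).

    N stays at n0 until the lag time has elapsed, then follows a logistic
    curve that rises from n0 toward n_max at rate mu_max.
    """
    if t <= lag:
        return n0
    return n_max / (1.0 + (n_max / n0 - 1.0) * math.exp(-mu_max * (t - lag)))


# Illustrative parameters only: No = 4.8 log CFU/g is the inoculum stated in
# the design; n_max, mu_max and lag below are hypothetical.
n_during_lag = logistic_with_delay(t=2.0, n0=4.8, n_max=10.0, mu_max=0.3, lag=5.0)
n_late = logistic_with_delay(t=1.0e4, n0=4.8, n_max=10.0, mu_max=0.3, lag=5.0)
```

The curve is continuous at t = λ, because the logistic term equals No when t - λ = 0, so the two branches join without a jump.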
[Figure: growth curve at 32 °C, Dependent (goodness-of-fit) data; N (log CFU/g), 4 to 11, versus Time (h), 0 to 40.]
Primary Model Performance: Goodness-of-fit
%RE = 93.8
[Figure: Relative error (-1.2 to 1.6) versus Predicted N(t) (log CFU/g) (4 to 11).]
Secondary Model for No
No = mean No
[Figure: No (log CFU/g), 4 to 11, versus Temperature (°C), 5 to 45; Dependent (goodness-of-fit) and Independent (interpolation) data.]
No Model Performance
Type of Evaluation             %RE
Dependent (goodness-of-fit)    100
Independent (interpolation)    100
[Figure: Relative error (-1.0 to 1.0) versus Predicted No (log CFU/g) (4.70 to 4.90).]
Secondary Model for λ
Hyperbola with Shape Factor
λ = [41.47/(T - 7.325)]^1.44
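As a quick illustration, the hyperbola model can be evaluated over the temperature range of the study. This sketch assumes that the modeled quantity is the lag time in hours and that 1.44 is the shape-factor exponent on the bracketed term; the function name and the guard against temperatures at or below 7.325 °C are additions for illustration.

```python
def lag_time_h(temp_c):
    """Lag time (h) from the hyperbola model with shape factor 1.44."""
    if temp_c <= 7.325:
        # The hyperbola is undefined at 7.325 and not meaningful below it.
        raise ValueError("temperature must be above 7.325 C")
    return (41.47 / (temp_c - 7.325)) ** 1.44


# Evaluate at the lowest and highest model-development temperatures.
lag_at_10 = lag_time_h(10.0)
lag_at_40 = lag_time_h(40.0)
```

Under these assumptions the predicted lag time is roughly 52 h at 10 °C and about 1.4 h at 40 °C, which is why the lag-versus-temperature plot uses a logarithmic axis.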
[Figure: λ (h), log scale from 1 to 100, versus Temperature (°C), 5 to 45; Dependent (goodness-of-fit) and Independent (interpolation) data.]
λ Model Performance
Type of Evaluation             %RE
Dependent (goodness-of-fit)    100
Independent (interpolation)    100
[Figure: Relative error (-1.0 to 1.0) versus Predicted λ (h) (0 to 60).]
Secondary Model for μmax
Modified Square Root
μmax = 0.01885 if T ≤ 11.43
μmax = 0.01885 + [0.004325(T - 11.43)]^1.306 if T > 11.43
[Figure: μmax (h⁻¹), 0.0 to 0.5, versus Temperature (°C), 5 to 45; Dependent (goodness-of-fit) and Independent (interpolation) data.]
μmax Model Performance
Type of Evaluation             %RE
Dependent (goodness-of-fit)    100
Independent (interpolation)    100
[Figure: Relative error (-1.0 to 1.0) versus Predicted μmax (h⁻¹) (0.0 to 0.4).]
Secondary Model for Nmax
Asymptote Model
Nmax = exp(2.348[((T – 9.64)(T – 40.74))/((T – 9.606)(T – 40.76))])
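The asymptote model can be evaluated directly from the equation above. This is a minimal sketch; the function name is hypothetical, and the guard against the two temperatures where the denominator is zero is an addition for illustration.

```python
import math


def n_max_log_cfu(temp_c):
    """Nmax (log CFU/g) from the asymptote model as written on the slide."""
    denominator = (temp_c - 9.606) * (temp_c - 40.76)
    if denominator == 0:
        raise ValueError("model is undefined at 9.606 C and 40.76 C")
    ratio = ((temp_c - 9.64) * (temp_c - 40.74)) / denominator
    return math.exp(2.348 * ratio)


# Evaluate at the ends and the middle of the model-development range.
n_max_at_10 = n_max_log_cfu(10.0)
n_max_at_25 = n_max_log_cfu(25.0)
n_max_at_40 = n_max_log_cfu(40.0)
```

Away from the two temperature limits the ratio is close to 1, so Nmax sits just under the asymptote exp(2.348), about 10.5 log CFU/g; it falls off only as the temperature approaches either limit.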
[Figure: Nmax (log CFU/g), 5 to 11, versus Temperature (°C), 5 to 45; Dependent (goodness-of-fit) and Independent (interpolation) data.]
Nmax Model Performance
Type of Evaluation             %RE
Dependent (goodness-of-fit)    100
Independent (interpolation)    100
[Figure: Relative error (-1.0 to 1.0) versus Predicted Nmax (log CFU/g) (8 to 11).]
Tertiary Model Performance: Verification
%RE = 90.7
[Figure: Relative error (-1.2 to 1.6) versus Predicted N(t) (log CFU/g) (4 to 11).]
Comparison of Models
Model      RE_IN   RE_OUT   RE_TOTAL
Primary    121     8        129
Tertiary   117     12       129
Total      238     20       258
Fisher’s exact test; P = 0.48, not significant.
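The counts in the table above are enough to recompute both %RE values and to rerun the comparison. The sketch below uses only the Python standard library; the function names are hypothetical, and the two-sided rule (summing every table with the same margins that is no more probable than the observed one) is an assumption, since the slide does not say which convention or software produced P = 0.48. The result should land close to that value.

```python
from math import comb

# (RE_IN, RE_OUT) counts from the comparison table.
COUNTS = {"Primary": (121, 8), "Tertiary": (117, 12)}


def percent_re(re_in, re_out):
    """%RE = 100 * RE_IN / RE_TOTAL."""
    return 100.0 * re_in / (re_in + re_out)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact P for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c

    def weight(x):
        # Unnormalised hypergeometric weight when the top-left cell is x.
        return comb(row1, x) * comb(row2, col1 - x)

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    observed = weight(a)
    # Integer weights make the "no more probable than observed" test exact.
    tail = sum(w for w in map(weight, range(lowest, highest + 1)) if w <= observed)
    return tail / comb(row1 + row2, col1)


pct_primary = round(percent_re(*COUNTS["Primary"]), 1)
pct_tertiary = round(percent_re(*COUNTS["Tertiary"]), 1)
p_value = fisher_exact_two_sided(121, 8, 117, 12)
```

A P value this large means the drop from 93.8 to 90.7 %RE between the primary and tertiary models is within what chance alone would produce with 129 predictions per model.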
Performance Evaluation Design: Tertiary Model (Interpolation)
• Salmonella Typhimurium
– No = 4.8 log CFU/g
• Sterile cooked chicken
– 11, 13, 15, 18, 22, 26, 30, 34, 37, 39 °C
• Viable counts
– BHI agar
– 4 per growth curve
Tertiary Model Performance: Interpolation
%RE = 97.5
[Figure: N (log CFU/g), 4 to 11, versus Time (h), 0 to 25.]
[Figure: Relative error (-1.0 to 1.0) versus Predicted N(t) (log CFU/g) (4 to 11).]
Should the validated tertiary model be used to predict chicken safety?
• Evaluation for extrapolation to:
– other initial densities (No)
– other strains
– other chicken products
Performance Evaluation Design: Tertiary Model (Extrapolation)
• Salmonella Typhimurium
– No = 0.8 log CFU/g
• Sterile cooked chicken
– 10, 12, 14, 16, 20, 24, 28, 32, 36, 40 °C
• Viable counts
– BHI agar
– 4 per growth curve
Tertiary Model: Extrapolation to low No
[Figure: N (log CFU/g), 0 to 11, versus Time (h), 0 to 40.]
Tertiary Model Performance: Extrapolation to low No
%RE = 2.5
[Figure: Relative error (-1 to 10) versus Predicted N (log CFU/g) (4 to 11); 24 RE > 10.]
Conclusions
• Criteria are important for evaluating performance of models.
• Consensus on validation would improve the quality and use of predictive models in the food industry.