GADataMining CNA
Genetic Algorithms for
Data Mining
Sid Bhattacharyya
Overview
• Genetic Algorithms: a gentle introduction
  – What are GAs
  – How do they work, and why?
  – Critical issues
• Using genetic algorithms (effectively)
• Use in data mining
Natural Genetics to AI
• Computational models inspired by biological evolution
  – survival of the fittest
  – reproduction through cross-breeding
Genetic Algorithms
• Population-based search (parallel)
  – simultaneous search from multiple points in the search space
  – population members: potential solutions
• Fitness function (search objective)
  – numerical “figure of merit”/utility measure of an individual
  – drives selection
• “Mating” and reproduction of individuals
  – crossover, mutation
• Evolution from one generation to the next
  – iterative search, convergence
Advantage GAs
• General-purpose, robust search technique
  – applicable to varied problem types
• Data mining
  – fitness function: flexible expression of modeling criteria,
    tradeoffs among multiple objectives
  – models optimized to specific business objectives
  – diverse model representations: linear and non-linear interaction
    terms, rules, sequences, etc.
GA Application Examples
• Function optimizers
  – difficult, discontinuous, multi-modal, noisy functions
• Combinatorial optimization
  – layout of VLSI circuits, factory scheduling, traveling salesman problem
• Design and control
  – bridge structures, neural networks, communication network design;
    control of chemical plants, pipelines
• Machine learning
  – classification rules, economic modeling, scheduling strategies

Portfolio design, optimized trading models, direct-marketing models,
sequencing of TV advertisements, adaptive agents, data mining, etc.
GAs: Basic Principles
• Representation of individuals
  – string of parameters (genes): the chromosome
    e.g. F(p,q,r,s,t): p q r s t
  – bit-string representation:
    1 0 0 1 1 0 1 0 1 1 0 1 1 0 0
  – genotype and phenotype
GAs: Basic Principles
• Survival of the fittest (fitness function)
  – numerical “figure of merit”/utility measure of an individual
  – tradeoff among multiple evaluation criteria
  – efficient evaluation
GAs: Basic Principles
• Reproduction to create offspring
  – Selection
  – Crossover
  – Mutation
GAs: Basic Principles
• Convergence
  – progression towards uniformity in the population
  – premature convergence? (local optima)
GA: Basic Operation

Generation t       Selection        Recombination            Generation t+1
                                    (Crossover, Mutation)
Solution1 (f1)     Solution1                                 Offspring1(1,4)
Solution2 (f2)     Solution2                                 Offspring2(1,4)
Solution3 (f3)     Solution2                                 Offspring3(2,7)
Solution4 (f4)     Solution4                                 Offspring4(2,7)
...                ...                                       ...
SolutionN (fN)     SolutionX                                 OffspringN(x,y)
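The generation-to-generation loop sketched above can be written out as a minimal GA. This is an illustrative sketch only (a toy OneMax fitness, tournament selection, single-point crossover, bit-flip mutation; all names and parameter values here are mine, not from the slides):

```python
import random

def one_max(bits):
    # Toy fitness: count of 1-genes in the chromosome (OneMax).
    return sum(bits)

def tournament(pop, fits, k=2):
    # Return the fitter of k randomly sampled individuals.
    best = random.randrange(len(pop))
    for _ in range(k - 1):
        j = random.randrange(len(pop))
        if fits[j] > fits[best]:
            best = j
    return pop[best]

def evolve(n=30, length=20, generations=40, pc=0.9, pm=0.02, seed=0):
    random.seed(seed)
    pop = [[random.randint(0, 1) for _ in range(length)] for _ in range(n)]
    for _ in range(generations):
        fits = [one_max(ind) for ind in pop]
        nxt = []
        while len(nxt) < n:
            p1, p2 = tournament(pop, fits), tournament(pop, fits)
            c1, c2 = p1[:], p2[:]
            if random.random() < pc:          # single-point crossover
                cut = random.randrange(1, length)
                c1, c2 = p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]
            for child in (c1, c2):            # bit-flip mutation
                for i in range(length):
                    if random.random() < pm:
                        child[i] = 1 - child[i]
            nxt += [c1, c2]
        pop = nxt[:n]
    return max(pop, key=one_max)
```

With these settings the population typically converges close to the all-ones optimum within a few dozen generations.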
GAs: Parallel Search

[Figure: fitness landscape. A hill climber follows a single trajectory from
one starting point x, while a GA searches from multiple points in parallel.]
Typical GA Run

[Figure: best and average population fitness rising over generations]
Operators: Selection
• Fitness-proportionate selection: individual i receives an expected
  fi / f̄ reproductive trials (f̄ = average population fitness)
Selection
• Roulette-wheel selection (stochastic sampling with replacement)
  – wheel slots sized in proportion to fitness values
  – N (pop. size) spins of the wheel
Selection
• Stochastic universal sampling
  – N equally spaced pins on the wheel
  – single spin of the wheel
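Stochastic universal sampling fits in a few lines; `sus` below is my illustrative implementation (one spin, N equally spaced pointers), which guarantees each individual between the floor and ceiling of its expected number of trials:

```python
import random

def sus(fitnesses, n):
    # Stochastic universal sampling: one spin, n equally spaced pointers.
    # Returns the indices of the n selected individuals (in wheel order).
    total = sum(fitnesses)          # assumes all fitnesses are positive
    step = total / n
    start = random.uniform(0, step)
    picks, cum, i = [], fitnesses[0], 0
    for k in range(n):
        pointer = start + k * step
        while cum < pointer:        # advance to the slot under the pointer
            i += 1
            cum += fitnesses[i]
        picks.append(i)
    return picks
```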
Selection
• Premature convergence
• Fitness scaling: f' = f - (2·avg. - max.)
• Ranked fitness
• Elitism
• Steady-state selection
• Demetic grouping
Operators: Crossover

Parent 1:    11010 | 101100101
Parent 2:    xxyxx | yxyyxxyxy
                   ↑ crossover site
Offspring 1: 11010 | yxyyxxyxy
Offspring 2: xxyxx | 101100101

(Single-point crossover)
• combines good building blocks
Crossover

Parent 1:    axpsqvqbtpihd
Parent 2:    qzxxaycgbtphw
             (multiple crossover sites)
Offspring 1: azpsavcbtpphd
Offspring 2: qxxxqyqgbtihw

(Uniform crossover)
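Both operators are easy to sketch on list chromosomes (illustrative helper names, not from the slides):

```python
import random

def single_point(p1, p2):
    # One crossover site; offspring swap the parents' tails.
    cut = random.randrange(1, len(p1))
    return p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]

def uniform(p1, p2, p_swap=0.5):
    # Each gene position is independently exchanged between the parents.
    c1, c2 = list(p1), list(p2)
    for i in range(len(c1)):
        if random.random() < p_swap:
            c1[i], c2[i] = c2[i], c1[i]
    return c1, c2
```

Note that both operators only reshuffle genes between the two parents; no new gene values are introduced (that is mutation's job).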
Crossover

[Figure: fitness landscape with parents and offspring marked. Offspring fall
in the region of the search space defined by the parents.]
Operators: Mutation
• alters each gene with a small probability

  x 1 y x 0 y 0 y y 0 x y x y
  x 1 y x 0 y 1 y y 0 x x x y
Recombination operators
• Mutation & premature convergence
• Mutation vs. crossover
  – operator probabilities
  – which is more important?
• Optimal parameter settings (!)
Non-Binary Representations
• Integer, real-number, order-based, rules, ...
• Binary or real-valued?
  – real representations give faster, more consistent, more accurate results
• High-level representation
  – intuitive; can use specialized crossover and mutation
  – effective search over complex spaces
  – design of representation and operators: forma theory
Real-valued representation

Parent 1:    3.45  0.56  6.78  0.976  2.5
Parent 2:    0.98  1.06  4.20  0.34   1.8

Offspring 1: 3.22  0.56  6.78  0.65   2.12
Offspring 2: 1.43  1.06  4.20  0.41   1.93

(Arithmetic crossover)
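Arithmetic crossover blends each gene pair with a random convex combination, so offspring genes stay between the corresponding parent genes. A sketch (per-gene random weights are one common variant; the function name is mine):

```python
import random

def arithmetic_crossover(p1, p2):
    # Offspring genes are convex combinations a*x + (1-a)*y of the
    # parents' genes, so each child gene lies between the parent genes.
    c1, c2 = [], []
    for x, y in zip(p1, p2):
        a = random.random()
        c1.append(a * x + (1 - a) * y)
        c2.append(a * y + (1 - a) * x)
    return c1, c2
```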
High-level representation

Parent 1: {(1.2 ≤ x1 ≤ 3.4) ∧ (5.8 ≤ x2 ≤ 6.0) ∧ (0.2 ≤ x7 ≤ 0.61)}
Parent 2: {(2.3 ≤ x1 ≤ 4.1) ∧ (3.6 ≤ x2 ≤ 5.1) ∧ (5.1 ≤ x4 ≤ 5.6)
           ∧ (0.3 ≤ x3 ≤ 1.1) ∧ (2.2 ≤ x9 ≤ 2.7)}

Offspring 1: {(1.2 ≤ x1 ≤ 3.4) ∧ (2.2 ≤ x9 ≤ 2.7) ∧ (5.1 ≤ x4 ≤ 5.6)}
Offspring 2: {(2.3 ≤ x1 ≤ 4.1) ∧ [(3.6 ≤ x2 ≤ 5.1) ∨ (5.8 ≤ x2 ≤ 6.0)]
              ∧ (0.3 ≤ x3 ≤ 1.1) ∧ (0.2 ≤ x7 ≤ 0.61)}
High-level representation
• Generalize / Specialize

  {(0.3 ≤ x3 ≤ 1.1) ∧ (2.2 ≤ x9 ≤ 2.7)}
  → {(0.3 ≤ x3 ≤ 1.1) ∧ (2.2 ≤ x9 ≤ 2.7) ∧ (5.1 ≤ x4 ≤ 6.2)}

  {(0.3 ≤ x3 ≤ 1.1) ∧ (2.2 ≤ x9 ≤ 2.7)}
  → {(0.45 ≤ x3 ≤ 0.9) ∧ (1.9 ≤ x9 ≤ 2.9)}
Tree-structured representation (GP)

Tree 1:  (/ (* x (log y)) 5)
         → (x · log(y)) / 5

Tree 2:  (if (AND (< y 7) (> x 2)) 0 (+ (* x 2) y))
         → If (y < 7) and (x > 2) then 0, else 2x + y
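A GP system needs an interpreter for such trees. A minimal sketch, using tuples as nodes (the protected log and protected division are common GP conventions, not stated on the slides):

```python
import math

def eval_tree(node, env):
    # A node is a number, a variable name, or a tuple (op, children...).
    if isinstance(node, (int, float)):
        return node
    if isinstance(node, str):
        return env[node]
    op = node[0]
    if op == 'log':                                   # protected log
        v = eval_tree(node[1], env)
        return math.log(v) if v > 0 else 0.0
    if op == 'if':                                    # (if cond then else)
        branch = node[2] if eval_tree(node[1], env) else node[3]
        return eval_tree(branch, env)
    a, b = eval_tree(node[1], env), eval_tree(node[2], env)
    return {'+': a + b, '-': a - b, '*': a * b,
            '/': a / b if b else 1.0,                 # protected division
            '<': a < b, '>': a > b, 'and': a and b}[op]
```

The two slide trees become `('/', ('*', 'x', ('log', 'y')), 5)` and `('if', ('and', ('<', 'y', 7), ('>', 'x', 2)), 0, ('+', ('*', 2, 'x'), 'y'))`.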
Genetic search: Issues
• Coding scheme and fitness function are critical
  – the general mechanism is so robust that, within reasonable margins,
    parameter settings are not critical
  – exploiting problem-specific knowledge
  – the “art” in GA design!
Genetic search: Issues
• Stochastic search
  – multiple runs with different random streams
• Exploration vs. exploitation of the search
• Does not guarantee optimality! But ...
• Structured population models
• Parallelizable for large data
GAs and Optimization
• Search space: representation
• Global search without gradient information
  – functions with multiple local optima
  – non-differentiable functions
• Robust, assumption-free, and very general
• Hybrid approaches: GAs with conventional optimization techniques
Using GAs?
• When to use a GA?
• GAs and traditional techniques
• How long does it take?
• Will it perform better?

Using GAs
• population size
• mutation, crossover rates
• how many generations?
• multiple runs

Is it a “black box”?
• Data characteristics
• Fitness function
• GA parameters
GA Application Examples
• Function optimizers
  – difficult, discontinuous, multimodal, noisy functions
• Combinatorial optimization
  – layout of VLSI circuits, factory scheduling
• Design and control
  – bridge structures, neural networks, communication network design;
    control of chemical plants, pipelines
• Machine learning
  – classification rules, economic modeling, scheduling strategies

Portfolio design, optimized trading models, direct-marketing models,
sequencing of TV advertisements, adaptive agents, data mining, etc.
GAs and Data Mining
• Discovery
• Prediction
• Hypothesis testing and refinement
Data Mining
• Pattern templates
  ([attribute in {v1,v2}] and [attribute=value]) or
  ([attribute in {v1,v2,v3}] and [attribute>value]) or ...
• when S, if C then P
    when region = ne
    if inc > 41K and child > 2
    then x-sales > 100
• when S, C and P are positively correlated
• the mean of A when S and C is significantly different
  from the mean of A when S

[Diagram: condition set C and pattern set P inside data segment S]
Data mining
• How good are the patterns?
  – accuracy = (# cases in C and P) / (# cases in C)
  – coverage = (# cases in C and P) / (# cases in P)
  – support  = (# cases in C) / (# cases in S)
• Understandability
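These three ratios can be computed directly from the case counts. A sketch (the predicate-based interface and function name are my choices):

```python
def rule_metrics(cases, in_S, in_C, in_P):
    # in_S / in_C / in_P are predicates over a case record.
    s = [r for r in cases if in_S(r)]
    c = [r for r in s if in_C(r)]
    cp = [r for r in c if in_P(r)]
    p = [r for r in s if in_P(r)]
    accuracy = len(cp) / len(c) if c else 0.0   # |C and P| / |C|
    coverage = len(cp) / len(p) if p else 0.0   # |C and P| / |P|
    support = len(c) / len(s) if s else 0.0     # |C| / |S|
    return accuracy, coverage, support
```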
GA for Data Mining
• Fitness evaluation

• Chi-square
  – from the 2x2 contingency table of C against P, with cell counts n_ij,
    row totals r_i, column totals c_j, and total n:
      expected counts:  e_ij = r_i · c_j / n
      χ² = Σ_i Σ_j (n_ij − e_ij)² / e_ij
  – higher values imply C and P are related
  – Cramér's V = √(χ² / n)  (for a 2x2 table)

• Correlation
  – linear correlation: product-moment correlation coefficient
  – monotonically correlated: Spearman's rank correlation coefficient
  – correlation coefficient × support

• Interesting rule
  – I = s(C∧P | S) − s(C | S) · s(P | S)
    (observed joint support under S minus its expected value if C and P
    were independent)
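For a 2x2 contingency table, the chi-square statistic and Cramér's V reduce to a few lines of plain Python (a sketch; the function name is mine):

```python
def chi_square_2x2(n11, n12, n21, n22):
    # Expected counts e_ij = r_i * c_j / n, then
    # chi2 = sum over cells of (n_ij - e_ij)^2 / e_ij.
    n = n11 + n12 + n21 + n22
    rows = (n11 + n12, n21 + n22)
    cols = (n11 + n21, n12 + n22)
    obs = ((n11, n12), (n21, n22))
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            e = rows[i] * cols[j] / n
            chi2 += (obs[i][j] - e) ** 2 / e
    v = (chi2 / n) ** 0.5     # Cramer's V for a 2x2 table
    return chi2, v
```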
DM application
• Symbolic models of consumer choice
  – assumption-free
  – behavioral insights for targeting promotions
  – advantages over decision-tree algorithms?
    • DTs are stepwise optimal, but not globally so
    • high noise-sensitivity of DTs
  – advantages over neural networks

  {(35K ≤ inc ≤ 40K) ∧ (age < 43)} or {(inc > 63K) ∧ (age > 55)} then Buy
Performance evaluation
• Accuracy / error rate
  – will higher accuracy give better performance for the target task?

“The use of error rate often suggests insufficiently careful thought about
the real objectives of the research”
  – David J. Hand, Construction and Assessment of Classification Rules

               Predicted
               P          N
  Actual  P    True P     False N
          N    False P    True N

• sensitivity, specificity
• misclassification costs
• Of course, with a 99:1 split in the data, a default dummy model gives
  99% accuracy.
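Sensitivity and specificity follow directly from the confusion matrix, and the 99:1 remark is exactly why they are more informative than raw accuracy. A sketch (function name is mine):

```python
def confusion_rates(tp, fn, fp, tn):
    # sensitivity: fraction of actual positives predicted positive
    # specificity: fraction of actual negatives predicted negative
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    accuracy = (tp + tn) / (tp + fn + fp + tn)
    return sensitivity, specificity, accuracy
```

A dummy model that predicts N for everything on a 99:1 split scores 99% accuracy but zero sensitivity.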
Model Representation
• Non-linear tree-structured models (GP)
  – non-linear interaction terms
  – function set (internal nodes): {+, -, *, /, log}
  – terminal set (leaf nodes): {constants, variables}

  Example tree: (/ (* x1 (log x3)) 5)  →  (x1 · log(x3)) / 5
DM Performance: Decile Analysis

Decile   Number of   Number of   Response   Cumulative   Cum. Response   Cum. Response
         Customers   Responses   Rate (%)   Responses    Rate (%)        Lift
top      2,500       2,179       87.2       2,179        87.2            447
2        2,500       1,753       70.1       3,932        78.6            403
3        2,500         396       15.8       4,328        57.7            296
4        2,500         111        4.4       4,439        44.4            228
5        2,500         110        4.4       4,549        36.4            187
6        2,500          85        3.4       4,634        30.9            158
7        2,500          67        2.7       4,701        26.9            138
8        2,500          69        2.8       4,770        23.9            122
9        2,500          49        2.0       4,819        21.4            110
bottom   2,500          55        2.2       4,874        19.5            100
Total   25,000       4,874       19.5

Cumulative Lift (decile) = (cum. avg. performance through decile /
                            overall avg. performance) × 100
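The cumulative-lift formula can be checked against the table above (a sketch, rounding to integers as the slide does):

```python
def cumulative_lifts(responses, customers):
    # Cumulative lift at decile d:
    #   (cum. response rate through d / overall response rate) * 100
    overall = sum(responses) / sum(customers)
    lifts, cum_r, cum_n = [], 0, 0
    for r, n in zip(responses, customers):
        cum_r += r
        cum_n += n
        lifts.append(round(100 * (cum_r / cum_n) / overall))
    return lifts

# Decile-analysis table from the slide: 4,874 responses among 25,000 customers
lifts = cumulative_lifts(
    [2179, 1753, 396, 111, 110, 85, 67, 69, 49, 55], [2500] * 10)
```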
Decile Maximization (DMAX)
• Objective
    Find model f(x) (predictor variables x) such that performance in the
    upper deciles (a specified depth-of-file) is maximized
• Explicitly manages a resource constraint
  – mailings to particular depths-of-file
• Performance at different mailing depths
  – models optimized for different mailing depths

[Table sketch: deciles (top ... bottom) of responders/profit, with models
maximized to depth 2, 3, or 4]
DMAX: Illustrative Example

[Scatter plot: ten prospects (profits $1-$10) plotted on predictors X1, X2,
with the OLS boundary ($28 profit) and the DMAX 40% boundary ($32 profit)]

OLS:      .14 X1 + .06 X2
DMAX 40%: .19 X1 + .07 X2

Profit   X1   X2
$10      45    5
$9       35   21
$8       31   38
$7       30   30
$6        6   10
$5       45   37
$4       30   10
$3       23   30
$2       16   13
$1       12   30
GA DMAX
• Representation: w1 w2 w3 ... wk
• Integrated variable selection
• Fitness evaluation
  – classification accuracy
  – model reliability
  – maximize specified decile performance
    • response, profit, etc.
• Hybrid algorithm
Comparative Performance: Case I
• Response modeling
  – maximize response in top 3 deciles
  – 4.6% response to mailing

DMAX (30%):     -0.01X1 - 2.51X2 - 0.008X3 - 0.08X4
LOGIT:          -0.40 - 0.01X2 - 0.007X3 - 3.25X4
Neural network: 3 layers, 2 hidden nodes, 12 coefficients
Case I: Genetic Algorithm DMAX (30%)

Decile   Number of   Number of   Decile          Cum.            Cum.
         Customers   Responses   Response Rate   Response Rate   Response Lift
top      4,617       865         18.7%           18.7%           411
2        4,617       382          8.3%           13.5%           296
3        4,617       290          6.3%           11.1%           244
4        4,617       128          2.8%            9.0%           198
5        4,617        97          2.1%            7.6%           167
6        4,617        81          1.8%            6.7%           146
7        4,617        79          1.7%            5.9%           130
8        4,617        72          1.6%            5.4%           118
9        4,617        67          1.5%            5.0%           109
bottom   4,617        43          0.9%            4.6%           100
TOTAL   46,170     2,104          4.6%
Case I: Cum. Response Lift Comparison

Decile   GA DMAX (30%)   Logistic Regression   Neural Network
top      411             384                   385
2        296             284                   277
3        244             227                   221
4        198             194                   186
5        167             166                   164
6        146             146                   146
7        130             131                   131
8        118             119                   118
9        109             108                   108
bottom   100             100                   100
Case II (2% Response Rate): Cum. Response Lift Comparison

Decile   GA DMAX   GA DMAX   GA DMAX   GA DMAX   Logistic
         (10%)     (20%)     (30%)     (40%)     Regression
1        220       186       191       192       194
2        174       195       166       166       165
3        157       173       179       150       148
4        148       158       158       161       154*
5        139       145       146       146       146
6        131       135       138       138       138
7        122       124       127       127       127
8        114       116       117       117       117
9        108       108       109       109       109
bottom   100       100       100       100       100
Case II (2% Response Rate), Smoothness: Logistic Regression

Decile   Number of   Number of   Decile          Cum.            Cum.
         Customers   Responses   Response Rate   Response Rate   Response Lift
top      7,203       283         3.9%            3.9%            194
2        7,220       200         2.8%            3.3%            165
3        7,225       165         2.3%            3.0%            148
4        7,215       255*        3.5%            3.1%            154*
5        7,227       167         2.3%            3.0%            146
6        7,220       140         1.9%            2.8%            138
7        7,209        89         1.2%            2.6%            127
8        7,228        68         0.9%            2.4%            117
9        7,205        65         0.9%            2.2%            109
bottom   7,232        32         0.4%            2.0%            100
TOTAL   72,184     1,464         2.0%
Case II (2% Response Rate), Smoothness: GA DMAX (10%)

Decile   Number of   Number of   Decile          Cum.            Cum.
         Customers   Responses   Response Rate   Response Rate   Response Lift
top      7,203       322         4.5%            4.5%            220
2        7,220       188         2.6%            3.5%            174
3        7,225       178         2.5%            3.2%            157
4        7,215       178         2.5%            3.0%            148
5        7,227       151         2.1%            2.8%            139
6        7,220       133         1.8%            2.7%            131
7        7,209       103         1.4%            2.5%            122
8        7,228        84         1.2%            2.3%            114
9        7,205        81         1.1%            2.2%            108
bottom   7,232        46         0.6%            2.0%            100
TOTAL   72,184     1,464         2.0%
Case II (2% Response Rate), Smoothness: GA DMAX (20%)

Decile   Number of   Number of   Decile          Cum.            Cum.
         Customers   Responses   Response Rate   Response Rate   Response Lift
top      7,203       271         3.8%            3.8%            186
2        7,220       299*        4.1%            4.0%            195*
3        7,225       191         2.6%            3.5%            173
4        7,215       162         2.2%            3.2%            158
5        7,227       140         1.9%            2.9%            145
6        7,220       119         1.8%            2.7%            135
7        7,209        90         1.2%            2.5%            124
8        7,228        85         1.2%            2.3%            116
9        7,205        69         1.0%            2.2%            108
bottom   7,232        38         0.5%            2.0%            100
TOTAL   72,184     1,464         2.0%
Comparative Performance: Case III
• Profit modeling
  – maximize profit in top 2 deciles
  – mailing (profit / size):
    » Non-responder:    -$0.29 / 92.55%
    » Unpaid responder: -$5.65 /  7.10%
    » Paid responder:  +$275   /  0.35%
  – average profit per mailing: +$0.32

DMAX (20%):  -.36X1 - .23X2 + .005X3 + .24X4
LOGIT (PR):  -.01X1 - .03X2 + .322X3 + .25X4
Case IV: Profit Model, Genetic Algorithm DMAX (20%)

Decile   Number of   Percent PAID   Percent UNPAID   Decile Avg.   Cum. Avg.   Cum.
         Customers   Responders     Responders       Profit        Profit      Profit Lift
top      8,171       0.82%          10.1%             $1.43        $1.43       444
2        8,171       0.62%           8.7%             $0.96        $1.20       371
3        8,171       0.37%           8.2%             $0.28        $0.89       277
4        8,171       0.34%           8.4%             $0.20        $0.72       223
5        8,171       0.29%           5.9%             $0.20        $0.62       191
6        8,171       0.32%           7.4%             $0.19        $0.54       169
7        8,171       0.23%           4.0%             $0.13        $0.49       151
8        8,171       0.18%           4.8%            -$0.04        $0.42       130
9        8,171       0.24%           8.3%            -$0.06        $0.37       114
bottom   8,171       0.17%           4.9%            -$0.08        $0.32       100
TOTAL   81,710       0.35%           7.1%
Case IV: Profit Model, Cum. Profit Lift Comparison

Decile   GA DMAX (20%)   Logistic Regression
top      444             385
2        371             294
3        277             235
4        223             190
5        191             184
6        169             163
7        151             146
8        130             123
9        114             111
bottom   100             100
Modeling on Multiple Objectives
• Model [y1, ..., yk] = f(x)
  – simultaneously optimize on multiple objectives
• Some common DM modeling desirables
  – response and high purchase revenues
  – likely churners with high usage of services
  – high tenure and usage
  – purchase and non-return
  – cross-selling, etc.
  [or CPR (Combined Profit and Response) models]
Multiple objectives
• Traditional approaches
  – multiple single-objective models, then combine
  – weighted average of objectives
• Conflicting objectives
  – different levels of tradeoffs
• Frontier of non-dominated solutions
  – choice of final model based on diverse decision-maker objectives;
    can also be subjective
Pareto Frontier
• Non-dominated solutions
  – with multiple objectives πi, f^a(x) is better than f^b(x) if
      ∀i: πi(f^a(x)) ≥ πi(f^b(x))   and   ∃j: πj(f^a(x)) > πj(f^b(x))
• A single GA run obtains
  – the tradeoff frontier of non-dominated solutions f^k(x)

[Figure: objective space (π1, π2) showing non-dominated models on the
frontier and dominated models behind it]
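The dominance condition translates directly into a filter that extracts the non-dominated frontier from a population (maximizing all objectives; a sketch with my function names):

```python
def dominates(a, b):
    # a dominates b: no worse on every objective, strictly better on one.
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))

def non_dominated(points):
    # Keep the points no other point dominates (the Pareto frontier).
    return [p for p in points
            if not any(dominates(q, p) for q in points if q != p)]
```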
Multi-objective GA
• Pareto-based selection (Louis and Rawlins, ’93)
  – randomly select a pair of solutions from the population
  – generate two new “offspring”
  – determine the Pareto-optimal set from parents and offspring,
    and choose two solutions for the new population
• Elitism
  – retain the best solution intact in the next population
    • fosters local search around the best solution
  – retain the non-dominated set of solutions intact in the next generation
Fitness evaluation
• DMAX approach
  – fitness at a specified depth-of-file d
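One illustrative reading of a DMAX-style fitness (my sketch, not necessarily the exact function used in the study): score every case with the candidate model, then count the responders captured in the top fraction d of the file:

```python
def dmax_fitness(scores, responded, depth=0.3):
    # Rank cases by model score, keep the top `depth` fraction
    # (depth-of-file d), and count the responders captured there.
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    cutoff = int(len(scores) * depth)
    return sum(responded[i] for i in order[:cutoff])
```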
Experimental Study: Data
• Cellular-phone provider seeking to identify potential high-value churners
  – two dependent variables
    • binary Churn variable
    • continuous variable measuring revenue ($)
  – predictors: minutes-of-use (peak and off-peak), average charges,
    payment information, etc.
    • obtained after EDA, normalized to 0 mean, 1 s.d.
  – 50,000 sample: 25,000 for training, 25,000 for the test set
Multiple Objectives: Performance
• Churn-Lift at depth d:
    Churn-Lift_d = (NC_d / NC) / (N_d / N) × 100
  (NC_d = churners in the top d of the file, N_d = customers there;
  NC, N = overall totals)
  – a model capturing more churners in the top deciles is better
• $-Lift at depth d:
    $-Lift_d = (NR_d / NR) / (N_d / N) × 100   (NR = revenue)
  – a model placing high-revenue customers in the upper deciles is better
• Overall modeling objective
  – maximize expected revenue saved through identification of
    high-value churners
  – Churn-Lift × $-Lift
Experimental Study
Non-dominated models: Decile 1 (Training)

[Scatter plot: Churn-Lift vs. $-Lift for GP, GA, Logistic, and OLS models]

5 independent GA runs; the sets of non-dominated solutions are aggregated.
Experimental Study
Non-dominated models: Decile 1 (Test)

[Scatter plot: Churn-Lift vs. $-Lift for GP, GA, Logistic, and OLS models]
Experimental Study
Non-dominated models: Decile 2 (Test)

[Scatter plot: Churn-Lift vs. $-Lift for GP, GA, Logistic, and OLS models]
Experimental Study
Non-dominated models: Decile 3 (Test)

[Scatter plot: Churn-Lift vs. $-Lift for GP, GA, Logistic, and OLS models]
Experimental Study
Non-dominated models: Decile 7 (Test)

[Scatter plot: Churn-Lift vs. $-Lift for GP, GA, Logistic, and OLS models]
Experimental Study: Performance Summary

                      Performance          Decile 1       Decile 2       Decile 3       Decile 7
GA-best               Churn-Lift, $-Lift   304.9, 261.7   265.4, 207.4   272.3, 155.0   138.8, 126.9
                      Product of Lifts     797.8          550.4          422.2          176.1
GP-best               Churn-Lift, $-Lift   343.7, 256.5   343.5, 182.1   275.1, 178.3   139.4, 131.2
                      Product of Lifts     881.5          625.5          490.4          182.9
Logistic Regression   Churn-Lift, $-Lift   447.1, 111.8   403.4, 72.6    295.9, 57.4    137.8, 66.7
                      Product of Lifts     499.8          292.7          169.96         91.9
OLS Regression        Churn-Lift, $-Lift   116.2, 360.5   108.1, 271.7   99.7, 223.2    91.8, 136.2
                      Product of Lifts     418.8          293.71         222.5          125.1
OLS × Logistic        Churn-Lift, $-Lift   79, 357        76, 263        74, 217        78, 136
                      Product of Lifts     282            201            160            106
General Optimization of Lifts
• Fitness function
  – seeks a general maximization of lifts at all deciles
Specific vs. General Lift Optimization

                      Performance          Decile 1       Decile 2       Decile 3       Decile 7
GA-best Lift-Opt      Churn-Lift, $-Lift   304.9, 261.7   265.4, 207.4   272.3, 155.0   138.8, 126.9
                      Product of Lifts     797.8          550.4          422.2          176.1
GA-best General-Opt   Churn-Lift, $-Lift   303.2, 261     288.3, 188.8   276.7, 151.3   138.1, 104.5
                      Product of Lifts     791.4          544.3          418.6          144.3
GP-best Lift-Opt      Churn-Lift, $-Lift   343.7, 256.5   343.5, 182.1   275.1, 178.3   139.4, 131.2
                      Product of Lifts     881.5          625.5          490.4          182.9
GP-best General-Opt   Churn-Lift, $-Lift   332, 252.5     265, 223.1     233.9, 186.5   132.3, 133.1
                      Product of Lifts     838.3          591.2          436.2          176.1

Table: Best Product-of-Lifts by decile
Specific vs. General Lift Optimization

                      Decile 1             Decile 2             Decile 3             Decile 7
Performance           $-Lift   Churn-Lift  $-Lift   Churn-Lift  $-Lift   Churn-Lift  $-Lift   Churn-Lift
GA-best Lift-Opt      361.4    464.7       271.6    401.3       223.9    309.8       136.6    139.5
GA-best General-Opt   361.7    421         273.3    398.1       223.9    304.1       136.6    138.4
GP-best Lift-Opt      372.7    475.2       276.5    417.9       226.1    310.3       137.2    139.8
GP-best General-Opt   372.1    421.3       276.8    378.3       226.6    296.7       137.1    139.8

Table: Best $-Lift and Churn-Lift by decile
Case Study – “EC challenge”: EDA, Variable Selection
• Problem
  – 15,178 obs., 79 variables, “response” dependent
  – seeking maximum lift in the top decile
• Logistic regression model
  – 15 variables, after EDA and transformation (this is the hard part!)
    (many of them combinations of multiple vars.)
  – lift of 126 in the top decile
• EC approach
  – include all variables
  – explore simple “terms”: non-linear GP models
    • small populations, looking for robust terms
  – final model(s) using the obtained terms

Case Study – “EC challenge” (contd.)
• Various 2-5 variable terms show some predictability
  – lifts ranging in 122-127
• Models on these terms
  – non-linear and linear models: lifts in 126-132
• Examples
  – 3 tan(HC211) + EC31                                      Trg: 122.5  Test: 122.5
  – (OCC81 - log10(ORDTERM1/IC191)) * STATE2 * HHAS21        Trg: 124.9  Test: 126.4
  – STATE2 * HHAS21                                          Trg: 121.3  Test: 121.3
  – (OCC81 - log10(B)) * B * (A + B + ORDTERM1*(A + B))      Trg: 131.5  Test: 126.9
      where A = (STATE2 - SECGENDE) and B = STATE2*HHAS21
  – B + tan(2B + HHAS21) + EC31
    + ORDTERM1*(B + tan[B + HHAS21 + (HHAS21*HV31)/2.1])     Trg: 131.1  Test: 127.8
  – AB^3 (1 + OCC81) + AB(OCC81) + 2DEB(OCC81)^2             Trg: 134.4  Test: 131.6
  – 4A + B + C + 2D + E + 2*OCC81  (10 vars. total)          Trg: 132.5  Test: 131.7