Dynamically converging to the best package designs SKIM at ART 2013
expect great answers
Dynamically converging to the best package designs
A simple yet effective approach
Carlo Borghi, Eline van der Gaast, Virginie Jesionka and Gerard Loosschilder 10 June 2013
2
Creating a package design that has impact at point of sale is partly …
ART: catch the eye, make the brand stand out, position the product and inform the customer
CRAFT: optimize the mix of elements and make the most of the potential of the creative design template
3
We’re not the first to crack this nut. But we have a remarkably simple yet effective approach
An immense search space of potential package designs, combining ART and CRAFT
A convergent procedure involving consumer opinions
One ‘best’ or a few segment-specific package designs
4
• 11-point scale purchase intent measurements for pack designs
• Pack designs drawn from an orthogonal design on the nth parameter space
• Collect a set number of observations
• Run least-squares (LS) regression through Perl-embedded R code on the accumulated data
• Define the (n+1)th parameter space by dropping the bottom-ranking items (attribute levels)
[Diagram: the nth parameter space shrinks to a smaller, (n+1)th parameter space]
We systematically limit the parameter space throughout the search steps
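One pass of the screening loop on this slide could look roughly like the sketch below. It is a minimal illustration, not the authors' Perl/R implementation: the function names `level_scores` and `shrink_space` are hypothetical, the least-squares regression is replaced by a plain shortcut (each level's mean rating minus the grand mean, which matches the main-effects LS estimate only for a balanced design), and the rule that every attribute keeps at least one level is an assumption.

```python
from collections import defaultdict
from statistics import mean


def level_scores(observations):
    """Return {(attribute, level): mean rating at that level minus the grand mean}."""
    grand_mean = mean(rating for _, rating in observations)
    by_level = defaultdict(list)
    for design, rating in observations:
        for attribute, level in design.items():
            by_level[(attribute, level)].append(rating)
    return {key: mean(vals) - grand_mean for key, vals in by_level.items()}


def shrink_space(space, observations, n_drop):
    """Drop the n_drop lowest-scoring levels (assumption: keep >= 1 level per attribute)."""
    scores = level_scores(observations)
    remaining = {attr: list(levels) for attr, levels in space.items()}
    dropped = 0
    # Walk through all levels from worst to best score.
    for (attr, level), _ in sorted(scores.items(), key=lambda kv: kv[1]):
        if dropped == n_drop:
            break
        if level in remaining[attr] and len(remaining[attr]) > 1:
            remaining[attr].remove(level)
            dropped += 1
    return remaining


# Toy data (hypothetical): every pack combination rated once on the 11-point scale.
space = {"color": ["red", "blue", "green"], "claim": ["A", "B"]}
observations = [
    ({"color": "red", "claim": "A"}, 9),
    ({"color": "red", "claim": "B"}, 8),
    ({"color": "blue", "claim": "A"}, 6),
    ({"color": "blue", "claim": "B"}, 5),
    ({"color": "green", "claim": "A"}, 3),
    ({"color": "green", "claim": "B"}, 2),
]
smaller = shrink_space(space, observations, n_drop=2)
print(smaller)
```

In this toy run the two lowest-scoring levels are "green" and claim "B", so the next loop would draw packs only from the red and blue packs with claim "A".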
5
Nice, but does it work?
It works!
In theory, to identify the optimal pack design using simulated data
In practice, to improve on the current package design and be as good or better than expert opinions
We test the effectiveness of our approach in a hypothetical study in the petcare category
Set up of the study
6
7
8
Benefit statement: 20 levels
Design: 4 levels
Cat picture: 15 levels
Insect logo: 3 levels
Vet logo: 3 levels
Color: 5 levels
Chemicals: 2 levels
9
Parameter space for a pet anti-parasite pack design
#  Attribute                    Levels  Nature
1  Package concept              4       Visual concepts
2  Pet picture                  15      Pet pictures
3  Package color                5       Integrated in pack concepts
4  Claims                       20      Text statements
5  List of chemical components  2       Present / absent
6  Icons                        3       Present (2 versions) / absent
7  Vet only icon                3       Present (2 versions) / absent
Number of possible combinations: 108,000
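The size of the parameter space is the product of the level counts in the table. A short check, with the dictionary name `levels` chosen here purely for illustration:

```python
from math import prod

# Level counts per attribute, copied from the parameter-space table.
levels = {
    "Package concept": 4,
    "Pet picture": 15,
    "Package color": 5,
    "Claims": 20,
    "List of chemical components": 2,
    "Icons": 3,
    "Vet only icon": 3,
}

total_combinations = prod(levels.values())  # full-factorial number of packs
total_levels = sum(levels.values())         # number of attribute levels to screen
print(total_combinations, total_levels)
```

The product reproduces the 108,000 combinations on the slide, spread over 52 attribute levels in total.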
10
Respondents go through the survey to narrow down the packages to 18 combinations
Levels deleted according to pre-specified order
Loop #  N    # deleted levels  Remaining combinations after deletion
1       200  9                 51,840
2       70   8                 15,552
3       50   9                 2,592
4       50   7                 384
5       30   7                 18
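The schedule can be sanity-checked with simple arithmetic. The sketch below only re-uses numbers from the slides (the 108,000 starting combinations, the 52 levels obtained by summing the level counts of the seven attributes, and the loop table); the variable names are illustrative.

```python
# (loop, respondents N, deleted levels, remaining combinations), from the loop table.
loops = [
    (1, 200, 9, 51840),
    (2, 70, 8, 15552),
    (3, 50, 9, 2592),
    (4, 50, 7, 384),
    (5, 30, 7, 18),
]
START_COMBINATIONS = 108000
START_LEVELS = 52  # 4 + 15 + 5 + 20 + 2 + 3 + 3

total_respondents = sum(n for _, n, _, _ in loops)
total_deleted = sum(deleted for _, _, deleted, _ in loops)
levels_left = START_LEVELS - total_deleted

# Share of the previous search space that survives each loop.
previous = START_COMBINATIONS
for loop, n, deleted, remaining in loops:
    print(f"loop {loop}: {remaining / previous:.1%} of the previous space kept")
    previous = remaining

print(total_respondents, total_deleted, levels_left)
```

Under these numbers, 400 respondents in total remove 40 of the 52 levels, leaving 12 levels that span the final 18 combinations.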
…
On simulated data, the hit rate is surprisingly high, even with few “respondents”
Step 1
12
The algorithm correctly eliminates the worst levels most of the time
[Chart: true average population utility of each level, listed in increasing utility; X marks a level eliminated in one of the loops]
Design: X -0.8, X 0.3, 0.4, 2.2
Cat picture: X -7.3, X -5.4, X -4.9, X -4.7, X -3.1, X -2.9, X -2.4, X -1.7, X -0.8, X -0.5, X 0.0, 0.8, X 1.0, 1.4, 3.3
Color: X -1.9, X -1.8, X 0.1, X 0.9, 2.3
Claim: X -3.1, X -2.7, X -1.4, X -0.9, X -0.3, X 0.5, X 0.9, X 0.9, X 1.0, X 1.2, X 1.3, X 1.6, X 1.6, X 2.12, X 2.6, X 3.6, X 3.8, 3.9, 5.5, 5.5
Chemicals: X 1.1, 3.1
Insect logo: X -3.6, -1.5, 2.7
Vet logo: X -1.9, X -0.5, 4.2, X 1.9
In the 5th and final loop, a level is incorrectly eliminated
5 iterations with 200, 70, 50, 50 and 30 respondents each, +/-20% error uniformly distributed in ratings
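A simulated respondent of the kind described here could be generated as in the sketch below. Several things are assumptions rather than facts from the slides: the error is read as multiplicative (the noise-free score is scaled by a factor drawn uniformly from 0.8 to 1.2), packs are drawn at random instead of from an orthogonal design, the baseline of 6.0 is hypothetical, ratings are not clipped to the 1-11 scale, and the function names are made up. The four utilities are the lowest and highest Design and Color values from the chart.

```python
import random


def simulate_rating(design, true_utility, baseline, rng, error=0.20):
    """Noise-free score of a pack, scaled by a uniform error of +/- `error`."""
    score = baseline + sum(true_utility[(attr, lvl)] for attr, lvl in design.items())
    return score * (1.0 + rng.uniform(-error, error))


def simulate_loop(space, true_utility, baseline, n_respondents, rng):
    """One randomly drawn pack and one noisy rating per simulated respondent."""
    observations = []
    for _ in range(n_respondents):
        design = {attr: rng.choice(levels) for attr, levels in space.items()}
        rating = simulate_rating(design, true_utility, baseline, rng)
        observations.append((design, rating))
    return observations


# Two attributes with two levels each; level names are placeholders.
space = {"design": ["d1", "d2"], "color": ["c1", "c2"]}
true_utility = {
    ("design", "d1"): -0.8,
    ("design", "d2"): 2.2,
    ("color", "c1"): -1.9,
    ("color", "c2"): 2.3,
}
rng = random.Random(2013)  # fixed seed so the run is repeatable
first_loop = simulate_loop(space, true_utility, baseline=6.0, n_respondents=200, rng=rng)
print(len(first_loop))
```

Feeding such observations into the level-elimination step, loop after loop, and comparing the eliminated levels with `true_utility` is one way to measure a hit rate on simulated data.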
13
Even with half the respondents, the algorithm is quite effective
[Chart: true average population utility of each level, listed in increasing utility; X marks a level eliminated in one of the loops]
Design: X -0.8, 0.3, X 0.4, 2.2
Cat picture: X -7.3, X -5.4, X -4.9, X -4.7, X -3.1, X -2.9, X -2.4, X -1.7, X -0.8, X -0.5, X 0.0, 0.8, X 1.0, 1.4, 3.3
Color: X -1.9, X -1.8, X 0.1, X 0.9, 2.3
Claim: X -3.1, X -2.7, X -1.4, X -0.9, X -0.3, X 0.5, X 0.9, X 0.9, X 1.0, X 1.2, X 1.3, X 1.6, X 1.6, X 2.12, X 2.6, X 3.6, X 3.8, 3.9, 5.5, 5.5
Chemicals: X 1.1, 3.1
Insect logo: X -3.6, -1.5, 2.7
Vet logo: X -1.9, X -0.5, 4.2, X 1.9
Incorrectly eliminated in the 2nd loop (callout shown twice in the chart)
5 iterations with 100, 35, 25, 25 and 15 respondents each, +/-20% error uniformly distributed in ratings
14
The procedure identifying clusters is inserted between loops in the screening process
[Flow diagram: Loop 1 (108,000 combs), then rating of polarizing concepts, leading to Cluster 1 and Cluster 2. Each cluster branch shows Loop 2 (51,840 combs), Loop 3 (15,552 combs) and Loop 5 (384 combs), followed by extra observations on all remaining packages, treated holistically (18 combs).]
Step 2
15
In an actual consumer survey, we test whether our solution is better than:
• the current package
• the expert opinion
16
The experts selected their suggested winners from the full set of pack alternatives
[Pack images: Current, Optimization outcome, Experts]
17
Our design process has delivered a better pack design than the current package design
[Bar chart: purchase intent rating (1-11 scale), axis shown from 6 to 11]
Package              n    Purchase intent rating
Current package      415  8.21
4th best             96   8.78
3rd best             67   8.82
2nd best             91   8.87
Best from algorithm  76   9.01
These variations would also be “good enough”
All differences versus the current package are statistically significant at a 95% confidence level
18
Two experts are as good as the algorithm
Although expert opinions are not always consistent
[Bar chart: purchase intent rating (1-11 scale), axis shown from 6 to 11]
Package              n    Purchase intent rating
Expert A             61   8.10
Current package      415  8.21
4th best             96   8.78
3rd best             67   8.82
2nd best             91   8.87
Expert C             61   8.98
Best from algorithm  76   9.01
Expert B             61   9.30
Difference not significant @ 95%
Conclusions and next steps
19
20
Conclusion: we have an effective approach to identify an optimal package design
Advantages over other methods (e.g., conjoint analysis) are:
• Show one package concept per screen
• Can handle a very large parameter space with (several) attributes with many levels
• Can target questions to the best packages, such as extra scores or details through the “why” question
Further research steps:
• Exclude levels based on confidence tests (and create design dynamically)
• Better real-time respondent quality checks
• Test the method for complex stimuli, e.g. video advertisement
contact us or follow us online!
Carlo Borghi | [email protected]
Eline van der Gaast | [email protected]
Virginie Jesionka | [email protected]
Gerard Loosschilder | [email protected]
22
Current package
23
Algorithm’s best package
24
Experts’ best package
25
The algorithm is effective in predicting the scores of the best concepts, but not those of the experts’ concepts
[Bar chart: real vs. estimated average score, axis from 0.0 to 10.0]
Concept        Estimated average score  Real average score
Expert A       8.0                      8.1
Expert B       8.0                      9.3
Expert C       7.7                      9.0
Our concept 1  8.8                      8.9
Our concept 2  8.9                      9.0
Our concept 3  8.9                      8.8
Our concept 4  8.6                      8.8