Proved.co scores & metrics

Post on 09-May-2015


description

A short description of the Proved.co approach to weighting data and calculating and normalizing scores and metrics.


Proved.co Scores and Metrics

January 2014

Introduction

For each concept, proved.co calculates:

— A score showing overall concept performance

— Five key metrics showing the concept's strengths, weaknesses and areas for improvement

This document outlines the proved.co approach to score and metrics calculation, as well as the background and framework behind them.


Architecture

[Architecture diagram: front-end and back-end components — Questionnaire, Survey distribution DB, Collection, Analytics DB, Computation, Visualization, Dashboard]

Weighting — RIM procedure: individual weights for each respondent to fit age and gender proportions to census data

Calculation — Raw scores, i.e. direct results of applying statistical formulae to the questionnaire data collected

Normalization — Normalized scores, i.e. raw scores rescaled to 0..100 using benchmarks

Output — Proved Scores & Metrics

Weighting

— Two target variables:

— Gender: males and females (2 targets)

— Age: 18-29, 30-49, 50+ (3 targets)

— Random Iterative Method (RIM):

— Data is weighted by gender. Gender weighting factors are calculated and applied (Iteration 1)

— Data is weighted by age. Age weighting factors are multiplied by gender's (Iteration 1's) and applied (Iteration 2)

— Data is re-weighted by gender. Gender weighting factors are multiplied by Iteration 2's and applied (Iteration 3)

— Iterations continue until all targets are met or precision changes by no more than 1%, while weight factors stay within [0.25; 4] limits.
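The iterative procedure above can be sketched as standard raking (iterative proportional fitting). This is an illustrative implementation, not Proved.co's production code; the function name, tolerance handling and category encoding are assumptions.

```python
import numpy as np

def rim_weight(gender, age, gender_targets, age_targets,
               tol=0.01, max_iter=50, limits=(0.25, 4.0)):
    """RIM-style raking: iterate gender and age adjustments until the
    weighted sample proportions match the target (census) proportions.

    gender, age: integer category codes, one per respondent.
    gender_targets, age_targets: dicts mapping category code -> target share.
    """
    w = np.ones(len(gender))
    for _ in range(max_iter):
        prev = w.copy()
        for cats, targets in ((gender, gender_targets), (age, age_targets)):
            total = w.sum()
            # Scale each category so its weighted share hits the target
            for c, t in targets.items():
                share = w[cats == c].sum() / total
                w[cats == c] *= t / share
        w = np.clip(w, *limits)  # keep weight factors within [0.25; 4]
        if np.abs(w - prev).max() / prev.max() < tol:  # change below 1%: stop
            break
    return w
```

With two crossed variables the factors typically converge in a handful of iterations; the clip step trades some precision on the margins for bounded individual weights.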


Weighting targets


United Kingdom (based on 2012 census data):

— Males 49%, Females 52%

— 18-29 YO 17%, 30-49 YO 38%, 50+ YO 47%

USA (based on 2011 census data):

— Males 49%, Females 52%

— 18-29 YO 23%, 30-49 YO 36%, 50+ YO 43%

Special notes

Weighting is not applied to:

— Self-service plans, i.e. samples from clients' contact lists and river samples

— Audiences targeted on criteria other than age and gender, e.g. moms or car owners


Calculation


Framework

— Proved.co is a project of Bojole (UK) Ltd, a traditional market research company with eight years of concept testing experience

— At the moment of proved.co development, Bojole had norms for 1228 concept tests:

— Raw data, i.e. more than 250,000 completed questionnaires

— Corroboration data, i.e. post-tests, ranking data, and instrumental variables


Corroboration

— Post-tests: market data for launched concepts; available for a limited number of concepts; the most reliable corroboration

— Ranking data: max-diff ratings for sets of 30-90 concepts; available for 1116 concepts; quite reliable corroboration

— Instrumental variables: overall liking, purchase intent, etc.; available for all 1228 concepts; questionable reliability

Score modeling

— Bojole decided to develop a single score that best represents the overall performance of a concept under test

— Bojole used iterative regression modeling to determine:

— Variables to include in the score calculation

— Score formulae
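A single pass of the regression modeling described above might look like the following. This is purely an illustrative sketch on synthetic data; the real modeling used Bojole's norms database, and the variable names and impact values here are assumptions, not Proved.co's actual coefficients.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 500
# Columns stand in for: relevance, word of mouth, value for money, uniqueness
diagnostics = rng.normal(size=(n, 4))
true_impact = np.array([0.5, 0.3, 0.3, 0.1])  # assumed, for illustration only
# Synthetic corroboration variable (e.g. a max-diff ranking score)
corroboration = diagnostics @ true_impact + rng.normal(scale=0.1, size=n)

# Ordinary least squares: fitted coefficients estimate each variable's
# impact on the corroboration measure, i.e. candidate score weights
X = np.column_stack([np.ones(n), diagnostics])  # prepend an intercept
coefs, *_ = np.linalg.lstsq(X, corroboration, rcond=None)
impacts = coefs[1:]
```

Iterating this over different variable subsets and corroboration measures, then keeping the variables with stable, meaningful coefficients, is one plausible reading of "iterative regression modeling".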


[Modeling diagram: corroboration variables regressed against all available scaled diagnostic variables, including relevance, uniqueness and word of mouth]

Score modeling

— The following set of variables and formula coefficients was determined:


— Concept relevance: high impact

— Concept's word of mouth: mid impact

— Concept's value for money: mid impact

— Concept uniqueness: low impact

Raw score calculations

On individual level

— Sum of weighted:

— Concept relevance

— Word of mouth

— Value for money

— Uniqueness

— Weighting coefficients reflect the score modeling described above

On aggregate level

— Weighted average of individual scores

— Weights reflect fit to age/gender proportions of target population
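Both levels of the calculation above can be sketched in a few lines. The metric names and impact coefficients below are illustrative placeholders (the document only says relevance has high impact, word of mouth and value for money mid, uniqueness low), not Proved.co's actual values.

```python
import numpy as np

# Hypothetical impact coefficients, ordered as in the score modeling slide
IMPACT = {"relevance": 0.5, "word_of_mouth": 0.3,
          "value_for_money": 0.3, "uniqueness": 0.1}

def individual_raw_score(answers):
    """Individual level: weighted sum of a respondent's diagnostic answers."""
    return sum(IMPACT[m] * answers[m] for m in IMPACT)

def aggregate_raw_score(respondents, rim_weights):
    """Aggregate level: weighted average of individual scores, where the
    weights reflect the age/gender fit to the target population."""
    scores = np.array([individual_raw_score(r) for r in respondents])
    w = np.asarray(rim_weights, dtype=float)
    return float((scores * w).sum() / w.sum())
```

For example, two respondents with equal weights simply average their individual scores; up-weighting an under-represented demographic pulls the aggregate toward its answers.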


Normalization


Framework

— We believe concept test results are useful only in context, i.e. against benchmarks

— Thus, we normalize the raw score and each raw metric to a 0..100 scale representing its performance against benchmarks

— A little extra benefit: 0..100 scores are easier to read and compare


Benchmarks

— For each idea we store:

— Description and sample

— All calculated raw metrics and scores, i.e. raw benchmarks

— All normalized metrics and scores, i.e. scaled benchmarks

— The list is updated with each new computation


Benchmarks for idea score


The distribution of raw idea scores is close to normal and can thus be used for sensible 0..100 scaling.

Normalization


— First, we calculate the average (avg) and standard deviation (sdev) of the distribution of all raw benchmarks

— Then we calculate the normalized value, measuring the deviation of the raw score / metric (rvalue) from the average (avg) in standard deviations (sdev)

— The normalized score shows how a given concept (or its metric) benchmarks against the whole distribution of other concepts in our database
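The two steps above can be sketched as a z-score rescaled to 0..100. The deck specifies the z-scoring against benchmark avg and sdev but not the exact mapping to 0..100, so this sketch assumes a common choice: centre at 50, 20 points per standard deviation (so roughly ±2.5 sdev spans the scale), clipped to [0, 100].

```python
import numpy as np

def normalize(rvalue, raw_benchmarks):
    """Rescale a raw score/metric against the benchmark distribution.

    avg/sdev come from all stored raw benchmarks; the 50 + 20*z mapping
    is an assumed rescaling, not necessarily Proved.co's exact formula.
    """
    avg = np.mean(raw_benchmarks)
    sdev = np.std(raw_benchmarks)
    z = (rvalue - avg) / sdev           # deviation in standard deviations
    return float(np.clip(50 + 20 * z, 0, 100))
```

Under this mapping an average concept scores 50, one standard deviation above the benchmark pool scores 70, and extreme outliers are pinned at 0 or 100.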