Sampling and non sampling errors in the Italian Television Audience Measurement system

Sampling and non sampling errors in the Italian Television Audience Measurement system European Conference on Quality in Official Statistics - Q2008 Rome, 9-11 July 2008 Participants to research group: De Vitiis, D’Alò, Di Consiglio, P.D. Falorsi (chief), S. Falorsi, Orsini, Pallara, Russo, Seeber, Tuoto Speaker : Alessandro Pallara Istituto Nazionale di Statistica


Page 1: Sampling and non sampling errors in the Italian Television Audience Measurement system

Sampling and non sampling errors in the Italian Television Audience Measurement system

European Conference on Quality in Official Statistics - Q2008

Rome, 9-11 July 2008

Participants to research group: De Vitiis, D’Alò, Di Consiglio, P.D. Falorsi (chief), S. Falorsi, Orsini, Pallara, Russo, Seeber, Tuoto

Speaker : Alessandro Pallara Istituto Nazionale di Statistica

Page 2: Sampling and non sampling errors in the Italian Television Audience Measurement system

Outline of the talk

Television Audience Measurement (TAM) and the “meter panel”

Survey parameters and sampling design

Estimation of sampling error

Sources of bias in TAM estimates

Measurement errors: E&I

Panel attrition and conditioning

Comments and concluding remarks

Rome, 10 July 2008


Page 3: Sampling and non sampling errors in the Italian Television Audience Measurement system

Television Audience Measurement (TAM) data have a high social and economic impact.

Essential information for:

• Broadcasters, for programming policy and programme scheduling

• Broadcasters and advertising agencies, for agreeing upon the price of commercial air-time and advertising campaigns

Television Audience Measurement


Page 4: Sampling and non sampling errors in the Italian Television Audience Measurement system

Context and purposes of the research

Purposes (and Research reports)

1) review the current procedures for estimating daily ratings and the associated sampling errors (released June ’07);

2) evaluate the accuracy of the survey estimates with respect to the various sources of non sampling errors (Dec. ’07);

3) put forward tools and recommendations for checking the statistical quality (both sampling and non sampling errors) of the output of the TAM survey (under release, July ’08)


The context for this research is the agreement signed in 2006 between the Italian NSI (Istat) and the Italian Communications Regulatory Authority (Agcom), under which Istat was appointed to carry out a study of the statistical methodology behind the national TAM system.

Page 5: Sampling and non sampling errors in the Italian Television Audience Measurement system

The current worldwide standard in TAM methodology has two basic features:

• a panel sample of viewing households (the people meter panel), selected according to certain household demographic characteristics (age of the householder, number of household members, city size, geographical region)

• a measurement device (the people meter) that registers (a) TV set status (i.e. which channel is being tuned to, with certainty) and (b) viewer presence, which is quite demanding on panelists (they must press their remote control button each time they enter or leave a television viewing session)

Standard TAM methodology


Page 6: Sampling and non sampling errors in the Italian Television Audience Measurement system

Survey Parameters

Let r denote a generic TV channel and T a given time interval (daypart, day, week).

Main Parameters

\[
A_{rT} = \sum_{k \in U} y_{rk,aT} = \frac{1}{T} \sum_{t=1}^{T} \sum_{k \in U} y_{rk,at}
\]

The Reach (or cover/cume) is the cumulative percentage or total (usually expressed in thousands) of a population that has been counted as viewers at least once during a specified interval.

The Audience is the average number of individuals (homes or target groups) viewing a TV channel over a given time interval (e.g. programme, daypart).


\[
C_{rT} = \sum_{k \in U} y_{rk,cT}
\]

Page 7: Sampling and non sampling errors in the Italian Television Audience Measurement system

The Share (of Audience) is defined as the percent of Households Using Television (HUT) or Persons Viewing Television (PVT) which are tuned to a specific program or station at a specific time.

The Rating is the size of television audience relative to the total universe, expressed as a percentage

Rating:

\[
P_{rT} = 100 \, \frac{A_{rT}}{N}
\]

Share:

\[
SH_{rT} = 100 \, \frac{A_{rT}}{\sum_{r=1}^{R} A_{rT}}
\]
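As a rough illustration of how the four parameters relate to one another, the sketch below computes audience, reach, rating and share for a tiny made-up universe. The function names, the dictionary layout and the data are all assumptions introduced here for illustration; the real TAM figures are weighted sample estimates, whereas this toy example simply treats the four individuals as the whole universe.

```python
def audience(viewing):
    """Average number of viewers per minute over the interval."""
    n_minutes = len(next(iter(viewing.values())))
    return sum(sum(minutes) for minutes in viewing.values()) / n_minutes


def reach(viewing):
    """Number of individuals who viewed for at least one minute."""
    return sum(1 for minutes in viewing.values() if any(minutes))


def rating(aud, universe_size):
    """Audience as a percentage of the total universe."""
    return 100.0 * aud / universe_size


def share(aud_by_channel, channel):
    """Audience of one channel as a percentage of total viewing on all channels."""
    total = sum(aud_by_channel.values())
    return 100.0 * aud_by_channel[channel] / total if total else 0.0


# Hypothetical universe of 4 individuals observed over 3 minutes;
# each list holds 0/1 viewing indicators, one per minute.
channel_a = {"k1": [1, 1, 0], "k2": [0, 1, 0], "k3": [0, 0, 0], "k4": [1, 1, 1]}
channel_b = {"k1": [0, 0, 1], "k2": [0, 0, 0], "k3": [1, 1, 1], "k4": [0, 0, 0]}

aud = {"A": audience(channel_a), "B": audience(channel_b)}
print(aud["A"])                 # 2.0 viewers on average
print(reach(channel_a))         # 3 individuals reached
print(rating(aud["A"], 4))      # 50.0
print(round(share(aud, "A")))   # 60
```

Note that reach counts an individual once however long they watch, while audience averages over minutes, so a channel can have a high reach and a low audience at the same time.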

Survey Parameters (cont.)


Page 8: Sampling and non sampling errors in the Italian Television Audience Measurement system

Survey population, statistical units, data analyzed

Survey population: household members aged 4 or over

→ Survey estimates refer to in-home TV viewing (persons and households, including viewing by guests of the sample households), for the total population and selected target subpopulations

Elementary data used for estimating parameters

Individual viewing statement: meter records (raw data) converted after data processing into summary statements of individual viewing over time (each minute). Each statement contains information concerning (a) Start and end time of the viewing session; (b) identification of signal source and TV set being viewed; (c) identity of viewer
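One possible way to represent such a viewing statement in code is sketched below. The class name, field names and helper function are hypothetical and chosen only to mirror items (a) to (c) above; the actual record layout used in the TAM system is not described in the talk.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class ViewingStatement:
    household_id: str      # panel household (assumed identifier)
    viewer_id: str         # (c) identity of viewer
    tv_set: str            # (b) TV set being viewed
    signal_source: str     # (b) signal source, e.g. a channel code
    start: datetime        # (a) start of the viewing session
    end: datetime          # (a) end of the viewing session

    def minutes(self) -> int:
        """Whole minutes covered by the viewing session."""
        return int((self.end - self.start).total_seconds() // 60)


def minutes_by_viewer(statements):
    """Total viewing minutes per (household, viewer) pair."""
    totals = {}
    for s in statements:
        key = (s.household_id, s.viewer_id)
        totals[key] = totals.get(key, 0) + s.minutes()
    return totals
```

Keeping start and end times rather than one row per minute keeps the records compact; minute-level indicators can be derived from them when estimates are computed.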

Data analyzed

→ Raw and validated panel meter micro-data (daily data for 4 weeks between Sept. ’05 and June ’06)

→ Population totals of auxiliary variables and sampling weights used in the estimation procedure


Page 9: Sampling and non sampling errors in the Italian Television Audience Measurement system

Sampling design of most TAM surveys

Two phases in the TAM sampling strategy

1. In the first phase, a face-to-face interview survey (the Establishment Survey, ES) is carried out each year, based (in 2006) on a two-stage stratified sample of approximately 30,000 households. The ES:

• provides certain universe estimates (in terms of both individuals and households) that are used in the TAM estimation procedure, such as educational attainment, socio-economic status or number of children per household;

• provides a database of potential households for recruitment in the second-phase sampling.

2. In the second phase, a panel of about 5,100 households (the people meter panel sample) is “broadly” randomly selected, within control strata, from the ES respondents.


Page 10: Sampling and non sampling errors in the Italian Television Audience Measurement system

The Meter panel sample

Selected characteristics of the meter panel (used for panel turnover control): active vs. lost panelists, compared with total population benchmarks (percent distributions)

Columns: (1) Universe (demographic data 08/05); (2) panel distribution (unweighted), active households at 4 Sept. ’05; (3) panel distribution (unweighted), active households at 10 June ’06; (4) panel distribution (unweighted), lost panelists (Sept. ’05/Jun. ’06)

Characteristic                 (1)     (2)     (3)     (4)
Region
  C                            19,6    19,5    19,3    21,1
  NE                           19,5    19,2    19,4    19,0
  NW                           28,7    28,3    28,7    28,9
  SI                           32,3    32,9    32,7    31,0
City size
  <100,000 inh.                74,9    73,9    74,8    72,6
  >100,000 inh.                25,1    26,1    25,2    27,4
Age of householder
  ≤45*                         32,6    32,6    30,5    36,0
  46-64*                       34,9    34,5    35,8    32,2
  ≥65*                         32,4    32,9    33,7    31,8
Number of household members
  1                            24,9    23,2    23,7    22,2
  2                            27,1    26,3    27,5    22,6
  3                            21,6    22,7    21,9    22,1
  4                            18,9    20,2    19,2    23,2
  5+                           7,5     7,5     7,6     9,8

* Estimated through ES


Page 11: Sampling and non sampling errors in the Italian Television Audience Measurement system


Problems with TAM sampling design in Italy

• quota sampling

• unknown selection probabilities of units from the database of recruitment households (which originates from different ESs)

• rules for field substitution of non responding households: contact rates differ between basic households and substitutes, and interviewers may influence substitutions

• very high total non response rate (non response to the ES plus refusal of panel recruitment): >90%

Non respondents may differ in the amount of television they view: light viewers are out of home a lot and less available for interview, and they may feel that their cooperation is less important


Page 12: Sampling and non sampling errors in the Italian Television Audience Measurement system

MSE of an estimator of an unknown population parameter

Approach to quality assessment :

•Direct (smooth) estimators of the sampling variance

•(Indirect) indicators of the Bias

Approach to measuring accuracy of TAM estimates

\[
MSE(\hat{\theta}) = E(\hat{\theta} - \theta)^2 = V(\hat{\theta}) + B^2(\hat{\theta})
\]
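For readers who want the intermediate step, the standard decomposition of the mean squared error into variance and squared bias can be written out as follows. This is the textbook derivation, not something specific to the talk; the symbol θ for the unknown parameter and the definition B(θ̂) = E(θ̂) − θ are the usual conventions and are assumed here.

```latex
\begin{aligned}
MSE(\hat{\theta})
  &= E\bigl[(\hat{\theta} - E\hat{\theta}) + (E\hat{\theta} - \theta)\bigr]^2 \\
  &= E(\hat{\theta} - E\hat{\theta})^2
     + 2\,(E\hat{\theta} - \theta)\,E(\hat{\theta} - E\hat{\theta})
     + (E\hat{\theta} - \theta)^2 \\
  &= V(\hat{\theta}) + B^2(\hat{\theta}),
\end{aligned}
\qquad \text{since } E(\hat{\theta} - E\hat{\theta}) = 0 .
```

One consequence is that a bias as large as the standard error already doubles the MSE, which is why the variance estimates are paired with indicators of the bias in the quality assessment.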


Page 13: Sampling and non sampling errors in the Italian Television Audience Measurement system

Estimation of sampling variance

• The model-assisted approach cannot be used because the inclusion probabilities of the observed panel are not known (units are selected by different sampling designs, some of which use purposive selection, and the non response rate is very high).

• On the other hand, using some suitable approximations, a linear model can be found whose parameter estimates properly approximate the actual TAM estimates (details in the proceedings paper).

• Sampling variance has then been estimated through a robust estimation technique (sandwich estimator, Valliant et al., 2000) based on the residuals of the linear model.

• How robust? The estimators are model-unbiased and consistent under a quite general variance structure, which may differ from the one used for producing the survey estimates.
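To give a feel for what a residual-based variance estimate looks like, here is a deliberately simplified sketch. It assumes a post-stratification working model (one mean per stratum), which is an assumption made for this illustration and not the linear model used in the study (that one is described in the proceedings paper). The variance is the sum of squared weighted residuals, a basic sandwich-type form that omits the leverage and finite-population adjustments discussed in the literature. The function name, arguments and data layout are hypothetical.

```python
from collections import defaultdict
from math import sqrt


def poststratified_total_and_cv(y, stratum, pop_counts):
    """Return (estimated total, residual-based variance, CV in percent).

    y          -- observed values for the sample units
    stratum    -- post-stratum label of each sample unit
    pop_counts -- known population size N_h of each post-stratum
    """
    groups = defaultdict(list)
    for value, h in zip(y, stratum):
        groups[h].append(value)

    total = 0.0
    variance = 0.0
    for h, values in groups.items():
        n_h = len(values)
        w_h = pop_counts[h] / n_h      # weight N_h / n_h of each unit in stratum h
        mean_h = sum(values) / n_h     # fitted value under the working model
        total += w_h * sum(values)
        # Squared weighted residuals: robust to unequal residual variances.
        variance += sum((w_h * (v - mean_h)) ** 2 for v in values)

    cv = 100.0 * sqrt(variance) / total if total > 0 else float("nan")
    return total, variance, cv
```

For example, `poststratified_total_and_cv([1, 0, 1, 1], ["a", "a", "b", "b"], {"a": 10, "b": 20})` gives an estimated total of 25 viewers; only stratum "a" contributes to the variance, because both units in stratum "b" have zero residuals.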


Page 14: Sampling and non sampling errors in the Italian Television Audience Measurement system

[Figure: Estimates of Audience (each minute) and Coefficient of Variation (CV) for a large channel of the public network, 4 Sept. 2005. Horizontal axis: time (HH:MM); left axis: Audience (0 to 12,000,000); right axis: CV % (0 to 45); plotted series: CV % estimates and CI.]

Variance estimation – an example


Page 15: Sampling and non sampling errors in the Italian Television Audience Measurement system

Scatter plot of CV by Audience Size (minutes and dayparts)


Variance estimation – an example (cont.d)

Page 16: Sampling and non sampling errors in the Italian Television Audience Measurement system

Sources of bias in TAM estimates

Potential sources of bias in meter panel sample

• coverage errors (e.g. non-TV homes not included in estimates, ≈ 1,500,000 est. persons in Italy)

• (wave) non responses

• model assumptions errors

• measurement errors

• attrition and panel conditioning


Page 17: Sampling and non sampling errors in the Italian Television Audience Measurement system

Measurement Errors in meter panel data

Measurement errors: mismatch between the signal source of a TV set being viewed and the person registered as a viewer through the people meter

Main sources of measurement errors (data gathering and editing phases):

• Meter statements indicating that the TV set is switched on, but without any person registered as present (uncovered viewing)

• Long viewing sessions without any change in registered set use or viewer presence, i.e. no signing on/off of viewing individuals and no channel switching (long/constant viewing)

• TV OFF viewing

• The same individual registered as a viewer for two or more TV sets at the same time (concurrent viewing)

• Undue or wrong re-assignment of uncovered viewing to a household member (processing errors)


Page 18: Sampling and non sampling errors in the Italian Television Audience Measurement system

Processing TAM data – E&I

Editing checks:

• rejection of certain panel households from the daily reporting samples because of suspected faulty compliance by panelists [excess (24 hours) viewing, long/constant viewing above set threshold values]

• records of individual viewing are canceled out (concurrent viewing, overnight constant viewing, unassigned uncovered viewing)

• records of individual viewing are edited in (uncovered viewing assigned to viewer)
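The kinds of checks listed above could be sketched roughly as follows. This is only an illustration of the logic: the record layout, the function names and the threshold argument are hypothetical, and the actual threshold values and assignment rules used in the TAM editing procedure are not given in the talk.

```python
from collections import namedtuple

# Hypothetical record layout: start and end are minutes from the start of the day;
# viewer is None when the set is on but nobody is registered as present.
Record = namedtuple("Record", "household viewer tv_set start end")


def uncovered(records):
    """Records with the TV set on and no registered viewer (uncovered viewing)."""
    return [r for r in records if r.viewer is None]


def long_constant(records, max_minutes):
    """Viewer sessions longer than max_minutes with no registered change."""
    return [r for r in records
            if r.viewer is not None and r.end - r.start > max_minutes]


def concurrent(records):
    """Pairs of records where one person is registered on two sets at once."""
    covered = [r for r in records if r.viewer is not None]
    pairs = []
    for i, a in enumerate(covered):
        for b in covered[i + 1:]:
            same_person = (a.household, a.viewer) == (b.household, b.viewer)
            overlap = a.start < b.end and b.start < a.end
            if same_person and a.tv_set != b.tv_set and overlap:
                pairs.append((a, b))
    return pairs
```

Flagging is kept separate from the decision to cancel or assign a record, so the effect of different thresholds on the estimates can be compared before any data are changed.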


Page 19: Sampling and non sampling errors in the Italian Television Audience Measurement system

Percent variation of audience estimates (unweighted) from raw to validated data resulting from treatment of uncovered viewing

Processing data - Editing and Imputation


Page 20: Sampling and non sampling errors in the Italian Television Audience Measurement system

Processing data - Editing and Imputation

Percent variation of audience estimates using different cut-off values and criteria for deletion of records with long constant viewing


Page 21: Sampling and non sampling errors in the Italian Television Audience Measurement system

Panel attrition

Annual rates of panel attrition (years 2005 – 2006)

           

Reason for leaving the panel     2005     2006
House moving                     2,8%     2,4%
Fatigue (drop-out)               10,9%    9,6%
Fatigue (discard)                1,8%     0,9%
Discard for stratification       1,2%     1,7%
Inability to continue            1,2%     1,2%
Total                            18%      16%


Page 22: Sampling and non sampling errors in the Italian Television Audience Measurement system

Annual attrition rates by subgroup (indices – avg. years 2005-2006)

             

Subgroup                 Drop-out   Discard
Number in household
  1                      69         105
  2                      94         96
  3                      101        94
  4                      125        102
  5+                     150        113
Age of householder
  ≤45                    109        77
  46-64                  106        82
  65+                    86         141
Region
  NW                     101        88
  NE                     111        64
  C                      103        96
  SI                     91         135

Attrition rates by subgroup of population


Page 23: Sampling and non sampling errors in the Italian Television Audience Measurement system

Panel attrition and conditioning

[Figure: Months-in-sample percent distribution of the household panel sample. Horizontal axis: months in sample (0 to 223); vertical axis: 0,00 to 0,05.]

Installed households at 10 June ’06 (age in months in sample)

# households      5.093
Average age       63,7
SD                50,8
Max               231
75%               102
Median age        51
25%               20


Page 24: Sampling and non sampling errors in the Italian Television Audience Measurement system

Age effects


[Figure: four panels, one per daypart (12:00:00-14:59:59, 15:00:00-17:59:59, 18:00:00-20:29:59, 20:30:00-22:29:59). Horizontal axis: t; plotted series: can11_sot and can11_sop.]

Daily estimates of audience (thousands of individuals) of satellite TV channels for selected dayparts (4 weeks between Sept. ’05 and June ’06): households below and above the median time-in-sample
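The comparison behind this figure, splitting the panel at the median time-in-sample and comparing viewing in the two halves, can be sketched as below. This is an unweighted toy version under assumed inputs: the function name and the (months, minutes) pair layout are hypothetical, and the estimates shown in the talk are weighted audience estimates rather than simple means.

```python
from statistics import mean, median


def split_by_time_in_sample(households):
    """Compare mean viewing below and above the median time-in-sample.

    households -- list of (months_in_sample, viewing_minutes) pairs
    Returns (median months, mean viewing at or below it, mean viewing above it).
    """
    cut = median(months for months, _ in households)
    below = [minutes for months, minutes in households if months <= cut]
    above = [minutes for months, minutes in households if months > cut]
    return cut, mean(below), mean(above)
```

A persistent gap between the two groups is compatible with panel conditioning, but it could also reflect differences in household composition, so the two groups would need to be compared on their demographic profile as well.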

Page 25: Sampling and non sampling errors in the Italian Television Audience Measurement system

Comments and concluding remarks - 1

Sampling errors

• The CV decreases as the size of the estimate increases: the larger estimates (major networks) are quite reliable, while the smaller estimates (local networks) are quite unreliable.

• The CV decreases slowly as the length of the time interval of the estimates increases.
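A back-of-the-envelope calculation shows why small audiences have large CVs. The sketch below uses the simple-random-sampling formula for the CV of an estimated proportion, CV = sqrt((1 - p) / (n p)). This is only an approximation chosen for illustration: the study estimates variances with a model-based robust estimator, and the sample size used in the example is hypothetical.

```python
from math import sqrt


def srs_cv_percent(rating_percent, n):
    """CV (in percent) of an estimated proportion under simple random sampling."""
    p = rating_percent / 100.0
    return 100.0 * sqrt((1.0 - p) / (n * p))


# Hypothetical sample of 12,000 individuals.
n = 12000
print(srs_cv_percent(10.0, n))   # about 2.7 for a channel with a 10% rating
print(srs_cv_percent(0.1, n))    # about 29 for a channel with a 0.1% rating
```

Under this approximation, a rating one hundred times smaller has a CV roughly ten times larger, which matches the pattern reported for major and local networks.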

Non sampling errors

• coverage errors related to list problems (non-TV homes, non-voting resident households, ...)

• non standardized criteria for the substitution of households not responding to the ES may lead to the selection of heavy-viewer households into the panel

• some evidence of an upward bias in survey estimates: editing checks seem to be unbalanced towards editing viewing statements in rather than out, and the threshold values for treating long viewing as unrealistic result in viewing statements being canceled out only in the case of overnight viewing

• the lack of an upper limit to time-in-sample for households in the panel suggests the presence of panel attrition and conditioning, because panelists’ viewing behavior and compliance with the measurement device change during their time in the sample


Page 26: Sampling and non sampling errors in the Italian Television Audience Measurement system

Comments and concluding remarks – 2

Recommendations for improving quality

• coincidental surveys on a regular basis to check the real viewing status of panelists against registered meter data

• occasional surveys of non respondents to analyze the independence of the response mechanism from viewing behavior

• introducing a method for panel rotation, with an upper limit to time-in-sample of panel households
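As a minimal sketch of the last recommendation, a rotation rule with an upper limit on time-in-sample could start by listing the households that have reached the limit. The function name, the dictionary layout and the limit itself are hypothetical; the talk does not propose a specific limit or replacement scheme.

```python
def rotation_candidates(months_in_sample, max_months):
    """Household ids at or above the time-in-sample limit, longest-serving first.

    months_in_sample -- dict mapping household id to months spent in the panel
    max_months       -- assumed upper limit on time-in-sample
    """
    over = [(months, hh) for hh, months in months_in_sample.items()
            if months >= max_months]
    return [hh for months, hh in sorted(over, reverse=True)]
```

In practice the households rotated out would have to be replaced within the same control strata, so that the panel keeps matching the population benchmarks.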


Page 27: Sampling and non sampling errors in the Italian Television Audience Measurement system

Thank you for your attention!
