Robert Voogt, Dutch Ministry of Social Affairs and Employment (formerly of the University of Amsterdam). Nonresponse in survey research: why is it a problem?
Date posted: 22-Dec-2015

Page 1

Robert Voogt
Dutch Ministry of Social Affairs and Employment
(formerly of the University of Amsterdam)

Nonresponse in survey research: why is it a problem?

Page 2

Overview

• What is nonresponse, why is it a problem, and why does the traditional way of correcting for nonresponse not solve the problem?
• Overview of general correction techniques
• An alternative approach to correct for nonresponse bias
• Real-life illustration

Page 3

What is nonresponse, why is it a problem, and why does the traditional way of correcting for nonresponse not solve the problem?

Page 4

Survey research

• The population is sampled
• The sample is a good representation of the population when good sampling techniques are used
• Not all sample elements will respond

Page 5

Unit vs item nonresponse

• Some sample elements are not reached, others refuse or do not send back the questionnaire: unit nonresponse
• Some who do answer the questionnaire do so incompletely: item nonresponse

Page 6

MCAR, MAR, MNAR

Three general nonresponse mechanisms can be distinguished:

• MCAR: Missing Completely At Random
• MAR: Missing At Random
• MNAR: Missing Not At Random

Page 7

Missing Completely At Random (MCAR)

• Consider the conditional distribution of the missingness indicator M given the survey outcomes Y and the survey design variables Z. Let $f(M \mid Y, Z, \phi)$ denote this distribution, with $\phi$ the unknown parameters.
• If MCAR: $f(M \mid Y, Z, \phi) = f(M \mid \phi)$ for all $Y, Z, \phi$
• Not a realistic assumption

Page 8

Example MCAR

• Taking a random subsample of a group of nonrespondents
• If a random subsample of nonrespondents is analysed (after obtaining answers from all of them), the nonsampled nonrespondents can be said to be MCAR
• So correction methods that rely on the MCAR assumption can be used

Page 9

Missing At Random (MAR)

• MAR: $f(M \mid Y, Z, \phi) = f(M \mid Y_{obs}, Z, \phi)$ for all $Y_{mis}, \phi$
• where $Y_{obs}$ denotes all the observed survey data and $Y_{mis}$ the missing survey data
• This means that missingness depends on the observed variables, the observed values of incomplete variables or the design variables, but not on the variables or values that are missing

Page 10

Example MAR

• For both respondents and nonrespondents we know their level of education
• Respondents and nonrespondents who share the same level of education have the same distribution on the unobserved variables
• Most survey nonresponse adjustment methods assume MAR

Page 11

Missing Not At Random (MNAR)

• MNAR: $f(M \mid Y, Z, \phi) = f(M \mid Y_{obs}, Y_{mis}, Z, \phi)$ for all $Y_{obs}, Y_{mis}, \phi$, which does not reduce to $f(M \mid Y_{obs}, Z, \phi)$
• This means that missingness depends on the missing values, even after conditioning on the observed data
• To obtain unbiased estimates, a joint model of the data and the nonresponse mechanism is necessary

Page 12

Example MNAR

• For both respondents and nonrespondents we know their level of education

• Given the level of education nonresponse on the variables of interest is not random

• This means it is not sufficient to use only level of education to correct for nonresponse bias.
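The contrast between the MAR and MNAR examples can be sketched numerically. This is a minimal illustration, not part of the original slides: every count and response probability below is hypothetical, and the population mean of the variable of interest is 0.40 by construction. Weighting by education recovers that mean when response depends only on education, but not when response also depends on the answer itself.

```python
# Hypothetical population cells: (education, y, count, response probability).
# y = 1 means "yes" on the variable of interest; 400 of 1000 people have y = 1.
MAR_CELLS = [
    ("low", 1, 100, 0.3), ("low", 0, 400, 0.3),    # response depends on education only
    ("high", 1, 300, 0.6), ("high", 0, 200, 0.6),
]
MNAR_CELLS = [
    ("low", 1, 100, 0.6), ("low", 0, 400, 0.3),    # response also depends on y
    ("high", 1, 300, 0.8), ("high", 0, 200, 0.4),
]


def education_adjusted_mean(cells):
    """Expected respondent mean of y after weighting each education
    group back to its share of the population."""
    population = sum(count for _, _, count, _ in cells)
    estimate = 0.0
    for level in sorted({edu for edu, _, _, _ in cells}):
        group = [cell for cell in cells if cell[0] == level]
        share = sum(count for _, _, count, _ in group) / population
        responders = sum(count * prob for _, _, count, prob in group)
        responders_yes = sum(count * prob * y for _, y, count, prob in group)
        estimate += share * responders_yes / responders
    return estimate


mar_estimate = education_adjusted_mean(MAR_CELLS)
mnar_estimate = education_adjusted_mean(MNAR_CELLS)
print(f"MAR: {mar_estimate:.3f}  MNAR: {mnar_estimate:.3f}  population: 0.400")
```

Under the hypothetical MNAR probabilities, the education-adjusted estimate stays well above the population value, which is the situation the example on this page describes.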

Page 13

Nonresponse bias

If nonresponse is not a result of the design, the data are almost always MNAR, and are biased by nonresponse as a result.

The amount of nonresponse bias depends on:

1. the correlation between the target variable(s) and the nonresponse mechanism;
2. the level of nonresponse.

Page 14

Nonresponse bias

$$C(\rho, Y) = \frac{1}{N} \sum_{k=1}^{N} (\rho_k - \bar{\rho})(Y_k - \bar{Y})$$

with

• $Y_k$: the score of element k in the population on the target variable
• $\rho_k$: the probability of response of element k in the population when contacted in the sample
• $C(\rho, Y)$: the population covariance between the response probabilities and the values of the target variable

Page 15

Nonresponse bias

$$C(\rho, Y) = \frac{1}{N} \sum_{k=1}^{N} (\rho_k - \bar{\rho})(Y_k - \bar{Y})$$

with

• $(Y_k - \bar{Y})$: the difference between the score of element k on the variable of interest and the population mean score
• $(\rho_k - \bar{\rho})$: the difference between the probability to respond of element k and the mean probability to respond
• It follows from this equation that the response level in itself does not say everything: the amount of bias depends on the relation between the first and second factor of the equation
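A minimal sketch of the covariance between response probabilities and target values, using two hypothetical four-element populations with the same mean response probability (0.5). The approximation "bias is roughly the covariance divided by the mean response probability" is the form commonly given in the nonresponse literature; it is not stated on the slide and is used here as an assumption.

```python
def response_covariance(rho, y):
    """Population covariance between response probabilities rho_k and scores Y_k."""
    n = len(y)
    rho_bar = sum(rho) / n
    y_bar = sum(y) / n
    return sum((r - rho_bar) * (v - y_bar) for r, v in zip(rho, y)) / n


y = [1, 1, 0, 0]                       # hypothetical target values
rho_related = [0.8, 0.6, 0.4, 0.2]     # response probability rises with Y
rho_unrelated = [0.8, 0.2, 0.8, 0.2]   # same mean response, unrelated to Y

cov_related = response_covariance(rho_related, y)
cov_unrelated = response_covariance(rho_unrelated, y)

# Assumed approximation: bias of the respondent mean ~ C(rho, Y) / mean(rho).
approx_bias_related = cov_related / (sum(rho_related) / len(rho_related))
approx_bias_unrelated = cov_unrelated / (sum(rho_unrelated) / len(rho_unrelated))
print(cov_related, cov_unrelated, approx_bias_related, approx_bias_unrelated)
```

Both populations have a 50% expected response level, yet only the first yields a nonzero covariance, which illustrates why the response level alone does not determine the bias.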

Page 16

Traditional correction methods

• Use population information to compare the respondent group with the population
• Use information that is available for both respondents and nonrespondents
• Use information about the difficulty of obtaining data from the respondents
• In fact, the assumption is that the data are MAR, given the values of the variables for which population information or information about the nonrespondents is available

Page 17

Traditional correction methods

• No information about the difference between respondents and nonrespondents on the variables of interest
• No information about the difference in response probabilities between sample elements that score differently on the variables of interest
• So there is no reason why this way of correcting should work

Page 18

Overview of general correction techniques

Page 19

Different correction techniques

• Weighting: assigning each observed element an adjustment weight

• Extrapolation: respondents who are most like the nonrespondents are used for correction

• Imputation: missing values are substituted by estimates

Page 20

Weighting

• Weighting: assigning each observed element an adjustment weight
• Sample elements that belong to groups that seem underrepresented on the variables used in the weighting will have a high adjustment weight
• Sample elements that belong to groups that seem overrepresented among the respondents will have a low adjustment weight

Page 21

Weighting Example

• Question: Have you ever visited Lugano? (Y/N)
• Population information available about age (18-30, 31-64, 65 and older)
• Comparison of respondents and population
• Weighting

Age     Resp   Popul   Weight
18-30   20%    30%     30/20 = 1.5
31-64   70%    50%     50/70 = 0.7
65+     10%    20%     20/10 = 2.0

Lug   18-30      31-64      65+       Unw        W*
Yes   20% (4)    50% (35)   10% (1)   40% (40)   33% (33)
No    80% (16)   50% (35)   90% (9)   60% (60)   67% (67)
N     20         70         10        100        100

Yes: 4*1.5 + 35*0.7 + 1*2.0 = 6 + 24.5 + 2 = 32.5
No: 16*1.5 + 35*0.7 + 9*2.0 = 24 + 24.5 + 18 = 66.5
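The weighting arithmetic above can be sketched in a few lines. The counts and population shares are the ones in the tables; the function and variable names are illustrative only. The sketch keeps the weights unrounded (50/70 rather than 0.7), so the weighted counts add up to exactly 100.

```python
# Respondent counts per age group as (yes, no), from the table above.
respondents = {"18-30": (4, 16), "31-64": (35, 35), "65+": (1, 9)}
# Population share of each age group, from the table above.
population_share = {"18-30": 0.30, "31-64": 0.50, "65+": 0.20}


def weighted_yes_share(respondents, population_share):
    """Share answering yes after weighting each group by
    population share / respondent share."""
    n_total = sum(yes + no for yes, no in respondents.values())
    weighted_yes = 0.0
    weighted_all = 0.0
    for group, (yes, no) in respondents.items():
        respondent_share = (yes + no) / n_total
        weight = population_share[group] / respondent_share
        weighted_yes += weight * yes
        weighted_all += weight * (yes + no)
    return weighted_yes / weighted_all


unweighted = sum(yes for yes, _ in respondents.values()) / sum(
    yes + no for yes, no in respondents.values()
)
weighted = weighted_yes_share(respondents, population_share)
print(f"unweighted: {unweighted:.2f}, weighted: {weighted:.2f}")
```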

Page 22

Extrapolation

• Central idea: some groups of respondents are more like the nonrespondents than others are

• For example, sample elements that first refused, but when contacted for the second time, were persuaded to participate, can be used as proxies for the final refusals

Page 23

Extrapolation Example

• Question: Have you ever visited Lugano? (Y/N)
• Two respondent groups: early respondents and late respondents
• Calculate the distribution among the nonrespondents using the last respondent method

Lug   R1         R2         TR         NR         TS
Yes   48% (29)   28% (11)   40% (40)   20% (10)   33% (50)
No    52% (31)   72% (29)   60% (60)   80% (41)   67% (100)
N     60         40         100        50         150

Last respondent: $L = A_2 + (A_2 - A_1)\,\frac{X_2 - X_1}{X_2}$, with:

L: theoretical last respondent

A: % response to an item in a wave

X: cumulative % respondents at the end of a wave

L = 50+(50-40) (67-40/67) = 50+*.40=18%
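A minimal sketch of the last respondent formula. The inputs are assumptions read from the table: A1 and A2 are taken to be the Yes percentages in R1 and R2 (48 and 28), and X1 and X2 the cumulative response percentages 60/150 and 100/150 of the total sample. With these inputs the sketch gives about 20%, the value shown in the NR column of the table.

```python
def last_respondent(a1, a2, x1, x2):
    """Extrapolate to the theoretical last respondent.

    a1, a2: % answering yes in wave 1 and wave 2
    x1, x2: cumulative % of the sample that has responded after each wave
    """
    return a2 + (a2 - a1) * (x2 - x1) / x2


a1, a2 = 48.0, 28.0              # assumed: Yes % among early (R1) and late (R2) respondents
x1 = 100 * 60 / 150              # cumulative response after wave 1
x2 = 100 * (60 + 40) / 150       # cumulative response after wave 2

estimate = last_respondent(a1, a2, x1, x2)
print(f"estimated Yes % among nonrespondents: {estimate:.1f}")
```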

Page 24

Imputation

• Imputation: missing values are substituted by estimates

Different methods of imputation:

• Single Imputation: for each variable one value is imputed
• Hot Deck Imputation: a missing value is replaced by an observed value of a comparable respondent
• Multiple Imputation: for each variable several values are imputed; in this way the uncertainty that imputation brings with it is also taken into account

Page 25

Hot Deck Imputation Example

• Divide the respondents into homogeneous groups, for example by using CHAID
• CHAID recursively partitions a sample into groups so that the variance of the dependent variable is minimized within groups and maximized among groups
• Link each nonrespondent to the group it fits in best
• Substitute the values of a random respondent from the same group for the missing values of the nonrespondent

Page 26

Hot Deck Imputation Example, part 2

CHAID finds groups: age 18-30, 31-64/low education, 31-64/high education, 65+/male and 65+/female

Grp          R          HDI NR        TS
18-30        20% (4)    25*.20 = 5    9
31-64/low    33% (10)   4*.33 = 1     11
31-64/high   63% (25)   1*.63 = 1     26
65+/male     20% (1)    8*.20 = 2     3
65+/female   0% (0)     12*.0 = 0     0
% Lug Yes    40% (40)   18% (9)       33% (49)
N            100        50            150
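The table's arithmetic can be reproduced with a short sketch. It uses the group rates and nonrespondent counts from the table and, like the table, works with the expected number of Yes answers per group (rate times count, rounded) rather than drawing a random donor for each nonrespondent; the variable names are illustrative.

```python
# (group, Yes rate among respondents, number of nonrespondents), from the table.
groups = [
    ("18-30", 0.20, 25),
    ("31-64/low", 0.33, 4),
    ("31-64/high", 0.63, 1),
    ("65+/male", 0.20, 8),
    ("65+/female", 0.00, 12),
]
respondent_yes = 40      # Yes answers among the 100 respondents
n_respondents = 100

# Expected number of imputed Yes answers per group, rounded as on the slide.
imputed_yes = {name: round(rate * count) for name, rate, count in groups}
n_nonrespondents = sum(count for _, _, count in groups)

nonrespondent_share = sum(imputed_yes.values()) / n_nonrespondents
total_share = (respondent_yes + sum(imputed_yes.values())) / (
    n_respondents + n_nonrespondents
)
print(imputed_yes)
print(f"nonrespondents: {nonrespondent_share:.0%}, total sample: {total_share:.0%}")
```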

Page 27

Multiple Imputation Example

• For each case, 5 values are calculated for each missing variable, using a regression equation and adding a random error term
• These values are combined into one single value, for example by taking the mean
• The variance takes the uncertainty due to the imputed values into account by combining the within-imputation variance (the variance within each estimated data set) and the between-imputation variance (in which all 5 data sets are used)
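A minimal sketch of how within- and between-imputation variance are commonly combined. The slide does not give a formula, so the pooling rule used here (total variance = within + (1 + 1/m) × between, usually attributed to Rubin) is an assumption about the intended method, and the five estimates and their variances are hypothetical.

```python
from statistics import mean, variance


def pool(estimates, within_variances):
    """Pool m completed-data estimates and their variances.

    Returns (pooled estimate, total variance), where
    total = within + (1 + 1/m) * between.
    """
    m = len(estimates)
    pooled = mean(estimates)
    within = mean(within_variances)   # average variance within each data set
    between = variance(estimates)     # variance across data sets, divisor m - 1
    total = within + (1 + 1 / m) * between
    return pooled, total


# Hypothetical proportions from 5 imputed data sets and their sampling variances.
estimates = [0.30, 0.34, 0.32, 0.36, 0.33]
within_variances = [0.004] * 5

pooled_estimate, total_variance = pool(estimates, within_variances)
print(f"pooled: {pooled_estimate:.3f}, total variance: {total_variance:.4f}")
```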

Page 28

Multiple Imputation Example, part 2

Percentage that has visited Lugano

        Imp1   Imp2   Imp3   Imp4   Imp5   Mean
NR 1    .41    .56    .34    .62    .44    .47
NR 2    .67    .77    .81    .56    .64    .69
NR 3    .28    .11    .07    .15    .22    .17
NR 4    .02    .10    .06    .23    .09    .10
….
NR 50   .21    .32    .46    .16    .20    .27
TNR                                        .33

Page 29

An alternative approach to correct for nonresponse

Page 30

Key to success of correction methods

• The information used in the correction method
• The correction method must model the nonresponse mechanism
• The variables used in correction should have a relation with:
  – the variables of interest
  – the probability to respond of a sample element

Page 31

Central Question Method (Bethlehem & Kersten, 1984)

• Nonrespondents are asked to answer one (or more) questions central to the subject of the study
• The central questions are believed to have a strong relation with both the nonresponse process and the subject of the study
• The central questions are used in the correction

Page 32

Central Question Example

• Central Question: Have you ever visited Switzerland? (Y/N)
• Question of interest: Have you ever visited Lugano? (Y/N)
• Comparison of respondents and nonrespondents
• Weighting as correction technique

CQ    Resp   Nonr   TS     Weight
Yes   60%    10%    43%    43/60 = 0.72
No    40%    90%    57%    57/40 = 1.43
N     100    50     150

Lug   CQ:Y       CQ:N        Unw        W*
Yes   67% (40)   0% (0)      40% (40)   29% (29)
No    33% (20)   100% (40)   60% (60)   71% (71)
N     60         40          100        100

Yes: 40*.72 + 0*1.43 = 28.8 + 0 = 29
No: 20*.72 + 40*1.43 = 14.4 + 57.2 = 71
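A minimal sketch of the central-question weighting, using the counts from the tables (the 10% and 90% among the 50 nonrespondents are taken as 5 and 45 people). Names are illustrative. Unlike the slide, the weights are kept unrounded, so the weighted counts sum to exactly 100.

```python
# Respondents by central-question answer: (visited Lugano, did not visit Lugano).
respondents = {"CQ yes": (40, 20), "CQ no": (0, 40)}
# Nonrespondents by central-question answer: 10% and 90% of 50.
nonrespondents = {"CQ yes": 5, "CQ no": 45}

n_resp = sum(yes + no for yes, no in respondents.values())
n_total = n_resp + sum(nonrespondents.values())

weights = {}
for answer, (yes, no) in respondents.items():
    total_sample_share = (yes + no + nonrespondents[answer]) / n_total
    respondent_share = (yes + no) / n_resp
    weights[answer] = total_sample_share / respondent_share

weighted_yes = sum(weights[a] * yes for a, (yes, _) in respondents.items())
weighted_no = sum(weights[a] * no for a, (_, no) in respondents.items())
lugano_share = weighted_yes / (weighted_yes + weighted_no)
print({a: round(w, 2) for a, w in weights.items()}, f"Lugano yes: {lugano_share:.0%}")
```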

Page 33

Real Life Illustration

Page 34

Illustration

• Election study
• High levels of nonresponse
• External information available to test the success of the correction procedures

Page 35

Our research questions

• Does nonresponse cause a problem in election studies?
• Is using background variables sufficient or do we need central questions?
• Do different correction techniques lead to different results?
• Is it really necessary to recontact nonrespondents?

Page 36

Data Collection

• City of Zaanstad, The Netherlands
• N=995; 901 used
• Recontacting refusals
• Mixed mode data collection
• Two central questions:
  – Voted in 1998 national elections
  – Political interest

Page 37

Response rate

Method                                   N     %
Telephone      Complete questionnaire    452   50.2
               Central questions         81    9.0
Mail           Complete questionnaire    94    10.4
               Central questions         27    3.0
Face-to-face   Complete questionnaire    158   17.5
               Central questions         37    4.1
Nonresponse                              52    5.8
Total sample                             901   100

Page 38

Does nonresponse cause problems?

We distinguish four groups:

• Response at first contact (470)
• Response after two contacts (76)
• Response after three or four contacts (158)
• Nonrespondents (including those who answered the central questions) (197)

Page 39

Comparison of response groups

                         R1   R2   R3   NR
Voted nat. elections     86   70   60   62
Voted prov. elections    47   46   25   29
Interested in politics   79   76   55   27
Voting not important     9    17   38   -

Conclusion: nonresponse bias is present

Page 40

How to correct?

Use the Central Question Procedure and compare it with more traditional correction methods

Two central questions:

• Voted at national elections (0-1) – from election lists (so no response bias)
• Political interest (0-1) – from short nonresponse questionnaire

Page 41

Correction methods

• Weighting by background variables / + central questions
• Extrapolation
• Hot Deck Imputation by background variables / + central questions
• Multiple Imputation by background variables / + central questions

for response levels of 52% and 78%

Page 42

Weighting

• On background variables: age, ethnicity, gender, household composition, education, residential value, number of years living in current residence, social cohesion in neighborhood; using an iterative procedure
• As above plus validated voter turnout in the 1998 national elections and political interest (central questions)
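The slide does not name the iterative procedure. A common choice for weighting on several variables at once is iterative proportional fitting (raking), which is assumed here purely for illustration. The 2x2 table of respondent counts and the target margins below are hypothetical.

```python
def rake(table, row_targets, col_targets, max_iterations=100, tol=1e-9):
    """Iterative proportional fitting: rescale cell counts until the row
    and column totals match the target margins."""
    cells = [row[:] for row in table]
    for _ in range(max_iterations):
        # Scale each row to its target total.
        for i, target in enumerate(row_targets):
            row_total = sum(cells[i])
            cells[i] = [value * target / row_total for value in cells[i]]
        # Scale each column to its target total.
        for j, target in enumerate(col_targets):
            col_total = sum(row[j] for row in cells)
            for row in cells:
                row[j] *= target / col_total
        # Columns now match exactly; stop once the rows match as well.
        if all(abs(sum(row) - t) < tol for row, t in zip(cells, row_targets)):
            break
    return cells


# Hypothetical respondent counts: rows = age (young, old), columns = gender.
observed = [[20.0, 30.0], [25.0, 25.0]]
raked = rake(observed, row_targets=[40.0, 60.0], col_targets=[50.0, 50.0])

# The adjustment weight of a cell is its raked count over its observed count.
cell_weights = [[r / o for r, o in zip(raked_row, observed_row)]
                for raked_row, observed_row in zip(raked, observed)]
print(cell_weights)
```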

Page 43

Extrapolation

• Last Respondent Method

Page 44

Hot Deck Imputation

• Obtain subgroups by using CHAID
• Assign nonrespondents to the groups
• Decide the exact value to be imputed using a regression model (multiple imputation)
• For background variables / background variables and central questions

Page 45

Multiple Imputation

• Use AMELIA (King et al., 1998) to calculate 10 discrete imputation values for each variable
• Calculate the mean distribution by summing the 10 proportions of each of the categories of the variable and dividing by 10
• Compute the variance so that both within- and between-imputation variance are taken into account
• For background variables / background variables and central questions

Page 46

Dependent variables

• Voted at national elections
• Voted at provincial elections
• Self-reported political interest
• Importance of voting

Page 47

Results for weighting

                     52%                    78%
                     Rsp    BG     CQ       Rsp    BG     CQ       TS
Voted national       85.5   83.3   74.5     78.0   77.5   74.5     74.5
Political Interest   78.8   78.0   65.2     73.0   72.1   65.2     65.2
Voted provincial     47.4   46.1   40.6     42.3   42.0   40.1     39.5
Importance Voting    69.5   68.7   63.4     59.7   59.5   56.5     -

Rsp: Respondents, BG: Background variables, CQ: Central Questions, TS: Total sample

Page 48

Compare different methods (78%)

                     Rsp    W      HDI    MI     EX     W      HDI    MI     TS
                            BG     BG     BG            CQ     CQ     CQ
Voted National       78.0   77.5   77.8   75.7   73.0   74.5   75.4   75.1   74.5
Political Interest   73.0   72.1   72.2   71.7   69.2   65.2   65.6   64.6   65.2
Voted Provincial     42.3   42.0   42.8   42.5   39.0   40.1   41.1   41.2   39.5
Importance Voting    59.7   59.5   59.7   57.8   52.7   56.5   56.2   56.1   -

Rsp: Respondents, W: Weighting, HDI: Hot Deck Imputation, MI: Multiple Imputation, EX: Extrapolation, BG: Background variables, CQ: Central Questions, TS: Total sample

Page 49

Relations: regression turnout provincial elections

52 BV CQ 78 BV CQ
Resp W W HDI MI Resp W W HDI MI TS

VtNat   * * * * * * * * * * *
Age     * * * * * * * * * * *
Urb
Sex     *
Educ    * * * * * * *
Ethn
Value   * *
Mobil   * * *
Cohe

Page 50

Conclusions

• Using central questions leads to better estimates than using only background variables
• Higher response levels lead to better estimates
• All correction techniques perform equally well: the information used in the correction is more important than the technique used
• Correcting bias in regression parameters is less successful

Page 51

Recommendations

• Always reapproach nonrespondents, to try to reach a response level of 75%
• Always ask (a sample of) nonrespondents to answer a small number of central questions
• Always try to get as much information as possible from external sources
• The technique used is not so important – simple techniques perform as well as more complex ones.

Page 52

Thank you for your attention!

• Questions?

• Contact: [email protected]