Adjusting for non-ignorable non-response: Application to Gulf War Study
Angela Wood (1), Ian White (1) and Matthew Hotopf (2)
(1) MRC Biostatistics Unit, Cambridge, UK.
(2) GKT School of Medicine & Institute of Psychiatry, London, UK.
Non-ignorable non-response in surveys
[Diagram: recruits divided into responders and non-responders.]
• Non-ignorable non-response: non-response is related to unrecorded characteristics of interest.
• Bias is likely to occur if non-responders are ignored.
Useful information
• Reasons for non-response.
• Proxy outcomes.
• Intensive follow-up on sample.
• Number of failed contact attempts.
The problem
Mailing Wave 1: questionnaires sent to N participants.
– n1 responded; N-n1 did not respond.
Mailing Wave 2: questionnaires sent to non-responders.
– n2 responded; N-(n1+n2) did not respond.
Mailing Wave 3: questionnaires sent to non-responders.
– n3 responded; N-(n1+n2+n3) did not respond.
Data are only observed for responders.
Example: Fatigue (N=4822)
Mailing wave      Non-case        Case
1                 1054 (51.5%)    991 (48.5%)
2                 381 (56.4%)     295 (43.6%)
3                 251 (56.8%)     191 (43.2%)
Non-responders    1659
Notation
• i=1,…,N participants.
• m waves.
• n1, n2, …, nm responders at waves 1, 2, …, m respectively.
• N-(n1+n2+…+nm) non-responders.
• Outcome of interest Yi for individual i.
• Confounders Xi for individual i.
• Yi and Xi are only known for responders.
Response Model
$p_{i1}$ = P(i responds at 1st attempt);
$p_{i2}$ = P(i responds at 2nd attempt | i did not respond at 1st attempt);
$p_{i3}$ = P(i responds at 3rd attempt | i did not respond at 1st or 2nd attempt):
$\operatorname{logit}(p_{ij}) = \alpha_j + Y_i^T \beta$ (i=1,…,N; j=1,…,3).
The effect of the outcome on the probability of response is assumed to be the same at all waves, which is a strong assumption.
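A minimal sketch of the response model above may help make the common-effect assumption concrete. It assumes a wave-specific intercept (called `alpha` here) and a single outcome coefficient (called `beta` here); both names and all numeric values are illustrative assumptions, not estimates from the study.

```python
import math

def expit(x):
    """Inverse of the logit function."""
    return 1.0 / (1.0 + math.exp(-x))

def odds(p):
    """Odds corresponding to a probability p."""
    return p / (1.0 - p)

def conditional_response_probs(y, alpha, beta):
    """p_ij for one participant: P(respond at wave j | no response before wave j).

    alpha: list of wave-specific intercepts (hypothetical values).
    beta:  common effect of the binary outcome y on response at every wave.
    """
    return [expit(a + beta * y) for a in alpha]

# Illustrative parameter values only (not taken from the talk).
alpha = [-0.5, -1.0, -1.2]
beta = 0.3

p_case = conditional_response_probs(1, alpha, beta)
p_noncase = conditional_response_probs(0, alpha, beta)

# Under the common-effect assumption, the response odds ratio
# (case vs non-case) equals exp(beta) at every wave.
wave_ors = [odds(pc) / odds(pn) for pc, pn in zip(p_case, p_noncase)]
```

The response probabilities themselves differ from wave to wave through `alpha`, but the odds ratio comparing cases with non-cases is identical at each wave, which is exactly what makes the assumption strong.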
How does it work?
Mailing wave      Y=1    Y=0
1                 200    400
2                 100    300
Non-responders    120 in total (split between Y=1 and Y=0 unknown)
• If the 120 non-responders are split 60 (Y=1) / 60 (Y=0): the odds ratio for response (Y=1 vs Y=0) is 1.13 at wave 1 but 0.33 at wave 2, so the effect of Y on response differs between waves.
• If they are split 20 (Y=1) / 100 (Y=0): OR = 1.67 at wave 1 and OR = 1.67 at wave 2, which is consistent with a common effect of the outcome on response.
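The arithmetic behind this toy example can be checked directly. The sketch below uses only the counts on the slide; the function name and the simple scan over possible splits are illustrative choices, not the estimation method used in the talk.

```python
def wave_odds_ratios(k):
    """Response odds ratios (Y=1 vs Y=0) at waves 1 and 2,
    if k of the 120 final non-responders have Y=1."""
    w1_y1, w1_y0 = 200, 400   # wave 1 responders
    w2_y1, w2_y0 = 100, 300   # wave 2 responders
    non_y1, non_y0 = k, 120 - k
    # Wave 1: responders vs everyone who had not responded at wave 1.
    or1 = (w1_y1 / (w2_y1 + non_y1)) / (w1_y0 / (w2_y0 + non_y0))
    # Wave 2: responders vs those still not responding after wave 2.
    or2 = (w2_y1 / non_y1) / (w2_y0 / non_y0)
    return or1, or2

# An even 60/60 split gives unequal odds ratios (about 1.13 and 0.33).
or1_even, or2_even = wave_odds_ratios(60)

# Scan all splits for the one where the two odds ratios agree most closely.
best_k = min(range(1, 120),
             key=lambda k: abs(wave_odds_ratios(k)[0] - wave_odds_ratios(k)[1]))
or1_best, or2_best = wave_odds_ratios(best_k)
```

Requiring the same odds ratio at both waves is what pins down the unobserved split of non-responders: here the scan selects 20 with Y=1 and 100 with Y=0.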
Estimation procedure
• Modified conditional likelihood method (Alho 1990, Biometrika)
– Conditional likelihood, product over responders: $\prod_{i,j} P(i \text{ responds at wave } j \mid Y_i)$.
– Use additional estimating equations which include information about the number of non-responders.
Weighted Outcome model
• Unconditional response probabilities:
$\pi_{i1} = p_{i1}$
$\pi_{i2} = p_{i2}(1-p_{i1})$
$\pi_{i3} = p_{i3}(1-p_{i1})(1-p_{i2})$
• The probability of responding at any wave = $\pi_{i1}+\pi_{i2}+\pi_{i3}$.
• Use inverse response probabilities $(\pi_{i1}+\pi_{i2}+\pi_{i3})^{-1}$ to weight the observed data $Y_i$.
• Easily extends to multivariate case.
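As a rough sketch of the weighting step, the code below turns conditional response probabilities into an inverse-probability weight and a weighted prevalence. It assumes the wave-3 term is the probability of no response at waves 1 and 2 followed by a response at wave 3; function names and the example probabilities are hypothetical.

```python
def response_weight(p1, p2, p3):
    """Inverse probability of responding at any of three mailing waves.

    p1, p2, p3 are the conditional response probabilities at waves 1 to 3.
    """
    pi1 = p1                          # respond at wave 1
    pi2 = p2 * (1 - p1)               # miss wave 1, respond at wave 2
    pi3 = p3 * (1 - p1) * (1 - p2)    # miss waves 1 and 2, respond at wave 3
    return 1.0 / (pi1 + pi2 + pi3)

def weighted_prevalence(outcomes, weights):
    """Weighted mean of a binary outcome observed among responders."""
    return sum(w * y for y, w in zip(outcomes, weights)) / sum(weights)

# Hypothetical conditional response probabilities for one participant.
w = response_weight(0.4, 0.25, 0.2)

# Hypothetical responders: two cases with weight 1.3, two non-cases with weight 1.7.
prev = weighted_prevalence([1, 1, 0, 0], [1.3, 1.3, 1.7, 1.7])
```

Responders who resemble likely non-responders receive larger weights, so they stand in for the people whose outcomes were never observed.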
Incorporating uncertainty in the weights
(1) Bootstrapping
(2) Multiple weights
– Generate K sets of weights from K non-parametric bootstrap samples.
– Perform a weighted analysis for each set of weights.
– Pool the results together, rather like the multiple imputation technique.
– The sets of weights only need to be derived once and can then conveniently be used in any subsequent analyses.
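The pooling step might look like the sketch below. The slides say only that results are pooled "rather like" multiple imputation, so the use of Rubin's combining rules here is an assumption, and the input estimates and variances are made-up numbers for illustration.

```python
import math
import statistics

def pool_estimates(estimates, variances):
    """Combine K weighted-analysis results using Rubin's rules
    (assumed pooling rule; the talk does not spell out the formula).

    estimates: point estimates, one per set of weights.
    variances: squared standard errors, one per set of weights.
    """
    k = len(estimates)
    pooled = statistics.mean(estimates)
    within = statistics.mean(variances)        # average within-analysis variance
    between = statistics.variance(estimates)   # between-analysis variance (divisor k-1)
    total_var = within + (1 + 1 / k) * between
    return pooled, math.sqrt(total_var)

# Hypothetical results from K=3 sets of weights.
pooled_est, pooled_se = pool_estimates([0.40, 0.42, 0.41], [0.0004, 0.0004, 0.0004])
```

The between-analysis term is what carries the uncertainty in the weights into the final standard error.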
Application: Gulf War Survey
• Various symptoms in military personnel in the Persian Gulf War 1990-91 have caused international speculation and concern.
• Cross-sectional postal survey on UK servicemen.
• Three mailing attempts.
Application: Gulf War Survey
• Compare various health problems between
– Gulf Cohort: Persian Gulf War veterans
– Bosnia Cohort: Servicemen deployed to the Bosnia conflict
– Era Cohort: Those serving during the Gulf war but not deployed there.
• Outcome of interest: fatigue
• Confounders:
– age, marital status, rank, education, employment, whether still serving or discharged, smoking, alcohol.
Response waves
                         Gulf N=4822      Bosnia N=2983    Era N=3905
Wave 1   Responded       2099 (43.5%)     995 (33.4%)      1417 (36.3%)
         Not responded   2723             1988             2488
Wave 2   Responded       701 (14.5%)      431 (14.4%)      552 (14.1%)
         Not responded   2022             1557             1936
Wave 3   Responded       483 (10.0%)      389 (13.0%)      436 (11.1%)
         Not responded   1539 (31.9%)     1168 (39.2%)     1500 (38.4%)
Fatigue case
                  Gulf Cohort                   Bosnia Cohort                 Era Cohort
Mailing wave      Non-case       Case           Non-case       Case           Non-case       Case
1                 1054 (51.5%)   991 (48.5%)    706 (72.7%)    265 (27.3%)    1092 (78.7%)   295 (21.3%)
2                 381 (56.4%)    295 (43.6%)    309 (75.2%)    102 (24.8%)    442 (81.3%)    102 (18.7%)
3                 251 (56.8%)    191 (43.2%)    278 (77.4%)    81 (22.6%)     326 (78.9%)    87 (21.1%)
Non-responders    1659                          1242                          1561
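For reference, the responders-only comparison can be computed directly from the counts in this table. The sketch below pools the three waves and ignores non-responders and confounders; the helper names are illustrative.

```python
# (non-cases, cases) among responders at waves 1 to 3, from the table above.
counts = {
    "Gulf":   [(1054, 991), (381, 295), (251, 191)],
    "Bosnia": [(706, 265), (309, 102), (278, 81)],
    "Era":    [(1092, 295), (442, 102), (326, 87)],
}

def totals(cohort):
    """Total non-cases and cases among responders in one cohort."""
    non_cases = sum(n for n, _ in counts[cohort])
    cases = sum(c for _, c in counts[cohort])
    return non_cases, cases

def prevalence(cohort):
    """Proportion of fatigue cases among responders."""
    non_cases, cases = totals(cohort)
    return cases / (cases + non_cases)

def crude_odds_ratio(cohort_a, cohort_b):
    """Unadjusted odds ratio of fatigue, cohort_a vs cohort_b."""
    na, ca = totals(cohort_a)
    nb, cb = totals(cohort_b)
    return (ca / na) / (cb / nb)

gulf_prev = prevalence("Gulf")
or_gulf_bosnia = crude_odds_ratio("Gulf", "Bosnia")
or_gulf_era = crude_odds_ratio("Gulf", "Era")
```

These quantities are the naive baseline against which the non-response adjustment is judged: within every cohort the proportion of cases is highest at wave 1, so an analysis restricted to responders may overstate prevalence.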
Univariate Models
• Response model for EACH cohort:
$\operatorname{logit}(p_{ij}) = \alpha_j + \beta \cdot \text{fatigue}_i$
• Outcome model compares fatigue across cohorts using inverse response probability weights.
Estimated Fatigue cases in Gulf cohort
                  Fatigue non-case                Fatigue case
Mailing wave      Observed        Estimated       Observed        Estimated
1                 1054 (51.5%)    1056            991 (48.5%)     986
2                 381 (56.4%)     374             295 (43.6%)     306
3                 251 (56.8%)     254             191 (43.2%)     185
Non-responders (1659)             1158 (69.9%)                    502 (30.1%)
Weights = 1.7 (non-case), 1.3 (case); chi-squared = 0.77
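The quoted weights and the adjusted prevalence can be reproduced from these counts. The sketch assumes each weight is the inverse of the estimated overall response probability in its outcome group, that is (observed responders + estimated non-responders) / observed responders; variable names are illustrative.

```python
# Observed Gulf responders, summed over the three mailing waves.
observed = {"non_case": 1054 + 381 + 251, "case": 991 + 295 + 191}

# Estimated split of the 1659 non-responders, as given on the slide.
estimated_nonresp = {"non_case": 1158, "case": 502}

# Weight = estimated group total / observed responders in that group.
weights = {
    group: (observed[group] + estimated_nonresp[group]) / observed[group]
    for group in observed
}

# Weighted prevalence of fatigue among all Gulf participants.
weighted_total = sum(observed[g] * weights[g] for g in observed)
adjusted_prevalence = observed["case"] * weights["case"] / weighted_total
```

Non-cases receive the larger weight because the model estimates that they were less likely to respond, which pulls the adjusted prevalence to about 41%, below the responders-only figure.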
Results
                                                     Estimated percentage of fatigue case (se)   OR (95% CI)
                                                     Gulf         Bosnia       Era                G vs B          G vs E
Responders only                                      46.7         25.7         20.6               2.5 (2.2-2.9)   3.4 (3.0-3.8)
Adjusting for non-responders:
  Without adjusting for uncertainty in the weights   41.0         21.1         19.5               2.6 (2.2-3.0)   2.9 (2.5-3.2)
  Adjusting for uncertainty in weights using
  1000 bootstrap samples                             41.0 (2.0)   21.1 (2.2)   19.5 (2.5)         2.6 (1.9-3.5)   2.9 (2.0-4.1)
  Adjusting for uncertainty in weights using
  multiple weights (K=10)                            41.8 (2.5)   21.6 (2.4)   19.3 (2.3)         2.6 (2.0-3.4)   3.0 (2.1-4.3)
Multivariate Response and Outcome models
• Response model for EACH cohort
$\operatorname{logit}(p_{ij}) = \alpha_j + Z_i^T \beta$
– where $Z_i$ may include the outcome $Y_i$ and other characteristics collected.
• Outcome model adjusting for confounders.
– Inverse response probability weights.
– Multiple weights K=10.
The multivariate response model (Gulf cohort only)
Variable          Level                        Estimate    SE
Outcome           Fatigue                      0.77        0.28
Military status   Still in military service    baseline
                  Discharged                   -0.85       0.23
Rank              Officer                      1.03        0.32
                  Other                        baseline
Also adjusted for employment, education, age, smoking, alcohol intake, marital status.
Fatigue
                                 Frequency (95% CI)                                           Adjusted odds ratios (95% CI)
                                 Gulf                Bosnia              Era                  G vs B           G vs E
Responders only                  46.9%               25.8%               20.5%                2.2 (1.9-2.6)    3.6 (3.2-4.2)
Adjusting for non-response
(multivariate response model)    38.7% (34.1-43.4)   22.2% (17.9-26.5)   19.4% (12.6-26.2)    2.2 (1.6-3.0)    3.2 (2.3-4.5)
Post traumatic stress reaction
                                 Frequency (95% CI)                                           Adjusted odds ratios (95% CI)
                                 Gulf                Bosnia              Era                  G vs B           G vs E
Responders only                  13.2%               4.7%                4.1%                 2.6 (1.9-3.4)    3.8 (2.8-4.9)
Adjusting for non-response
(multivariate response model)    10.5% (8.0-12.8)    3.8% (2.1-5.4)      3.7% (1.9-5.6)       3.0 (1.9-4.6)    3.9 (2.5-6.1)
Conclusions
• Participants responding earlier had more symptoms than those responding later or not at all, particularly amongst Gulf veterans.
• Observed excess of symptoms in Gulf veterans reduced but not eliminated.
• Standard errors were increased when allowing uncertainty in response probabilities.
Discussion (1)
• Relax modelling assumptions
– Current assumption: a common effect of covariates/outcomes on the probability of response across all waves.
– Not possible: $\operatorname{logit}(p_{ij}) = \alpha_j + Z_i^T \beta_j$
– Possible: $\operatorname{logit}(p_{ij}) = \alpha_j + Z_i^T (\beta_0 + \beta_1 j)$
• Never responders?
Discussion (2)
• Estimation procedures
– Considered full likelihood methods
  • EM algorithm
  • Bayesian approach, WinBUGS
– Produce similar results
– No need to use multiple weights (“all-in-one” methods).
• Extension to dealing with item non-responders and refusals: little change in results.