Discussion Alan Zaslavsky Harvard Medical School.



Fabrication as a Statistical Procedure

• Fabrication is like imputation
  – Duplication is like hot deck
  – Duplication with random modifications is like multiple imputation
  – Duplication is like weight modification
• Fabrication is a multilevel process
  – Interview, interviewer, area, … project level
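The imputation analogy can be made concrete with a small sketch. The code below is illustrative only: the record layout, the 1 to 5 answer scale, and the `flip_prob` parameter are assumptions, not details from the talk. It shows a hot-deck fill (copy a randomly chosen observed donor) next to a "duplicate with random modifications" of the kind a fabricator might produce.

```python
import random

def hot_deck_impute(records, rng):
    """Fill each missing record (None) with a copy of a randomly chosen observed donor."""
    donors = [r for r in records if r is not None]
    return [dict(r) if r is not None else dict(rng.choice(donors)) for r in records]

def perturbed_duplicate(record, rng, flip_prob=0.2, choices=(1, 2, 3, 4, 5)):
    """Copy a record, then re-draw each answer with probability flip_prob.

    flip_prob and the answer scale are hypothetical values chosen for illustration.
    """
    return {
        key: (rng.choice(choices) if rng.random() < flip_prob else value)
        for key, value in record.items()
    }

rng = random.Random(0)  # fixed seed so the sketch is repeatable
records = [{"q1": 1, "q2": 4}, None, {"q1": 3, "q2": 2}, None]

completed = hot_deck_impute(records, rng)            # missing interviews replaced by donors
near_copy = perturbed_duplicate(completed[0], rng)   # a duplicate with random changes
```

In both cases the filled-in records add no new information beyond what the donors already carry, which is the sense in which duplication also acts like a change of weights on the donor records.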


Fabrication as a Game

• Payoffs/risks to fabricator
  – Reduce effort while receiving payment
  – Risks greater for higher-level organization/person
• Detection/deterrence
• Costs/risks to data purchaser
  – Paying more for less information
  – Wrong decisions
  – Loss of credibility (cliff loss function)
• Risks may change with greater expertise on either side


Assumptions about Fabricators

• Fabricators are not very sophisticated
  – No fancy synthesis models
• Fabricators are not trying to work hard
  – Falsifying must be easier than data collection
  – Will not know how to “beat” moderately sophisticated detection techniques
• If fabricators try harder …
  – Good standard synthesis methods could be hard to detect
  – Learning on both sides


Fabrication on the Continuum of Survey Management

• Related to other survey errors at scale
  – Inadequately designed survey questions and tools
    • Not adapted to conditions under which survey fielded
  – Interviewer errors
    • Misinterpretation of questions, procedures
    • Interpersonal interview technique
    • Training and motivation
• Monitoring of “honesty”, accuracy, technique


Detection techniques

• Good survey management
  – Timely, at all levels
  – Recruitment, observation
  – Metadata and paradata
• Post-survey analysis
  – Replication of survey: interpenetrating samples
  – Subject-matter expertise
  – Statistical outliers (single and patterns)
• Earlier is better


Regina Faranda

• Extensive checking
  – Subject-matter and survey expertise
  – Checklist: QC
• Statistical assumptions?
  – Can be stated and tested


Rita Thissen

• Detailed specifics of monitoring and detection systems
  – Technology: CARI, CAPI, …
• (Anecdotes rarely heard)


Mike Robbins



Robbins – Duplicate detection

• Duplicate detection is like record linkage
  – Likelihood ratio
• Duplicate detection also important in other settings
  – US Census (2000?): match 330M × 330M possible record pairs
• Would models be different for fabricated data, processing errors, repeated real interviews?
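As a rough illustration of the likelihood-ratio idea, the sketch below scores a record pair in the style of Fellegi-Sunter record linkage: each compared item adds the log of the ratio between the probability of the observed agreement among true duplicates (`m`) and among unrelated pairs (`u`). The `m` and `u` values are invented for illustration and are not from the talk; in practice they would be estimated from the data. The last lines work out the size of the 330M × 330M comparison space.

```python
import math

def match_weight(agreements, m, u):
    """Log2 likelihood ratio that a record pair is a duplicate.

    agreements: per-item True/False, or None when the item cannot be compared.
    m[i]: P(item i agrees | pair is a duplicate)   (hypothetical values below)
    u[i]: P(item i agrees | pair is not a duplicate)
    """
    total = 0.0
    for agree, m_i, u_i in zip(agreements, m, u):
        if agree is None:
            continue  # item not comparable: contributes no evidence
        if agree:
            total += math.log2(m_i / u_i)
        else:
            total += math.log2((1 - m_i) / (1 - u_i))
    return total

# Assumed agreement probabilities for three items (illustration only).
m = [0.95, 0.95, 0.95]
u = [0.30, 0.25, 0.50]

all_agree = match_weight([True, True, True], m, u)        # positive: evidence for duplicate
all_disagree = match_weight([False, False, False], m, u)  # negative: evidence against

# Size of the comparison space for 330M records.
n = 330_000_000
ordered_pairs = n * n                # the "330M × 330M" count, about 1.1e17
unordered_pairs = n * (n - 1) // 2   # distinct pairs, about 5.4e16
```

Whether the same `m` and `u` would suit fabricated data, processing errors, and repeated real interviews is exactly the open question raised in the last bullet; each mechanism would plausibly imply different agreement probabilities.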


Example: Medicare CAHPS survey

• Pulled ~5000 responses (out of ~400K/year)
• Examined 27 substantive items
• Complex features
  – Substantial amount of screening/skipped items
  – Multiple choice items
  – Blocks of closely related items


Agreement – all pairs
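An all-pairs agreement calculation of this kind might look like the sketch below. It is an assumption about how such a comparison could be done, not the method used in the talk: agreement is taken as the share of matching answers among items that both respondents answered, with skipped items (`None`) left out of the comparison. The toy responses are made up. With about 5000 responses, the number of distinct pairs would be 5000 × 4999 / 2 = 12,497,500.

```python
from itertools import combinations

def agreement(a, b):
    """Fraction of items on which two responses agree, over items both answered.

    Returns None when the two responses share no answered item.
    """
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        return None
    return sum(x == y for x, y in shared) / len(shared)

def all_pairs_agreement(responses):
    """Score every distinct pair and return (score, i, j) sorted from best to worst."""
    scored = []
    for i, j in combinations(range(len(responses)), 2):
        score = agreement(responses[i], responses[j])
        if score is not None:
            scored.append((score, i, j))
    return sorted(scored, reverse=True)

# Toy data: three respondents, four items, None marks a skipped item.
responses = [
    [1, 2, None, 4],
    [1, 2, 3, 4],
    [2, 1, 3, None],
]
ranked = all_pairs_agreement(responses)  # best-agreeing pairs come first
```

Ignoring skipped items is a simplification. With heavy screening and blocks of closely related items, matching skip patterns may themselves be informative, and high agreement within a block is weaker evidence than agreement across unrelated items.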


Best agreement: duplicates?


Conclusions

• Know your data and survey methodology
• Thanks to speakers for sharing their experience and methods