1
Lecture 4 - Survey design
• Sampling
• Sample size/precision
• Data collection issues
• Sources of bias
• Critical review of survey reports
2
Why do surveys?
• Information on particular population
– prevalence of a disease
– behaviour, knowledge, attitude
• Planning of services
• Collect information on data not routinely available:
– e.g., mental health status, health behaviours
• Repeat surveys to monitor trends (serial cross-sectional studies)
3
Bias and precision of the survey estimates
• Bias:
– selection bias relates to sample selection
– information bias relates to information collected (measurements)
• Precision:
– relates to sample size
4
Study bias and precision vs measurement validity and reliability
• Bias/validity:
– does the measurement/study estimate reflect the true state of affairs?
• Precision/reliability:
– if the measurement/study is repeated, will a similar result be obtained?
6
Definitions
• Sampling unit
– person or group (e.g., household)
• Sampling frame
– list of sampling units in the population
  • censuses
  • electoral lists
  • telephone lists
– are institutional populations excluded (e.g., prisons, nursing homes)?
7
Target and study population
• Target population:
– population for generalization of results
• Study population:
– population for collection of data
– may be total target population or a sample
8
Types of sample
• Non-representative
– convenience
– volunteers
• Representative
– simple random
– systematic
– cluster
– multistage
9
Simple random sample
• Each sampling unit in the population has equal probability of being included
• Sampling with replacement:
– each unit selected is placed back in the pool
• Sampling without replacement (usual method):
– each unit selected is kept out of the pool
10
Simple random sample (cont’d)
• Methods:
– manual
– tables of random numbers
– computer-generated random numbers
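As a minimal sketch of the computer-generated approach, Python's standard `random` module can draw a simple random sample with or without replacement. The sampling frame of 1,000 numbered units, the sample size of 50, and the seed are illustrative assumptions, not part of the lecture.

```python
import random

# Hypothetical sampling frame: unit IDs 1..1000 (illustrative only).
frame = list(range(1, 1001))
rng = random.Random(42)  # fixed seed so the draw can be reproduced

# Without replacement (usual method): each unit can be selected at most once.
srs_without = rng.sample(frame, k=50)

# With replacement: each selected unit returns to the pool, so repeats are possible.
srs_with = rng.choices(frame, k=50)

print(len(set(srs_without)))  # number of distinct units drawn without replacement
print(len(set(srs_with)))     # may be smaller than 50 if any unit was drawn twice
```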
11
Systematic sample
• Select every nth individual from a list
– can use existing numbers
– e.g., patient appointments, medical records
• Advantages:
– Does not require complete sampling frame
– Simple to carry out
• Disadvantages:
– May be unsuitable for cyclic or ordered data (e.g., every 5th patient when only 5 patients are seen per day)
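The cyclic-data problem can be illustrated with a short sketch. The helper `systematic_sample` and the appointment list (5 slots per day for 20 days) are hypothetical; the random start is a common convention and is an assumption here rather than something the slide specifies.

```python
import random

def systematic_sample(units, n):
    """Select every k-th unit after a random start, where k = len(units) // n."""
    k = len(units) // n          # sampling interval
    if k < 1:
        raise ValueError("sample size exceeds list length")
    start = random.randrange(k)  # random start position in 0..k-1
    return units[start::k][:n]

# Hypothetical appointment list: 5 slots per day for 20 days.
appointments = [(day, slot) for day in range(20) for slot in range(5)]
sample = systematic_sample(appointments, 20)  # interval k = 5

# Because k equals the daily cycle length, every sampled patient
# comes from the same appointment slot.
slots = {slot for _, slot in sample}
print(slots)
```

If first appointments of the day differ systematically from later ones, a sample like this would not represent the full patient list.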
12
Stratified sampling
• Separate sample selected from different strata of population
• Requires separate sampling frame for each stratum
• Useful if there are small but important subgroups of the population (e.g., very old, very young, institutionalized, sick)
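A minimal sketch of stratified selection, assuming one list per stratum and a separate sampling fraction for each. The stratum names, sizes, and fractions are made up for illustration; the small "very old" stratum gets a larger fraction so that it yields enough subjects.

```python
import random

def stratified_sample(frames, fractions, seed=0):
    """Draw a simple random sample within each stratum.

    frames:    dict of stratum name -> list of units (one frame per stratum)
    fractions: dict of stratum name -> sampling fraction between 0 and 1
    """
    rng = random.Random(seed)
    sample = {}
    for stratum, units in frames.items():
        n = round(len(units) * fractions[stratum])
        sample[stratum] = rng.sample(units, n)
    return sample

# Hypothetical frames: a large stratum and a small but important one.
frames = {
    "general": [f"g{i}" for i in range(1000)],
    "very old": [f"v{i}" for i in range(100)],
}
fractions = {"general": 0.05, "very old": 0.50}

sample = stratified_sample(frames, fractions)
print({stratum: len(units) for stratum, units in sample.items()})
```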
13
Cluster sampling
• Sampling unit is a group (e.g., household, village, school)
• Step 1: Simple random sample of groups
• Step 2: All members of group included in sample
• Advantages:
– enumeration of population not needed
– more efficient use of resources
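The two steps can be sketched as follows, assuming a hypothetical population of 40 households of varying size. Only the list of households is needed to draw the sample; individual members are enumerated only for the households selected.

```python
import random

rng = random.Random(1)

# Hypothetical population: 40 households, each with 1 to 6 members.
households = {
    h: [f"h{h}-p{i}" for i in range(rng.randint(1, 6))]
    for h in range(40)
}

# Step 1: simple random sample of groups (households).
selected = rng.sample(sorted(households), k=8)

# Step 2: all members of each selected group are included.
sample = [person for h in selected for person in households[h]]

print(len(selected), len(sample))
```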
14
Multistage sampling
• Larger units sampled in first stage, smaller units later
• e.g.:
– stage 1 - sample of towns
– stage 2 - sample of city blocks or census tracts
– stage 3 - sample of households
15
Sampling for “hidden populations”
• Homosexual men:
– gay bars, newspapers
• Injection drug users:
– convenience sample (e.g., treatment facilities)
– snowball sampling (through networks)
• Capture-recapture methods
– identify biases of sampling method
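The slide does not say which capture-recapture method is meant. One common two-source form is the Lincoln-Petersen estimator, sketched below under the assumption that the two sources are independent; the counts are invented for illustration.

```python
def lincoln_petersen(n1, n2, m):
    """Estimate total population size from two overlapping lists.

    n1: number of people identified by source 1
    n2: number of people identified by source 2
    m:  number of people identified by both sources
    """
    if m == 0:
        raise ValueError("no overlap between sources; estimate is undefined")
    return n1 * n2 / m

# Hypothetical counts: 200 found through treatment facilities,
# 150 through a street survey, 30 appearing in both.
estimate = lincoln_petersen(200, 150, 30)
print(estimate)  # 1000.0
```

Comparing such an estimate with the number actually reached by one source gives a rough idea of how much of the hidden population that source misses.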
16
Planning a survey
• Define target population
• Select method of sampling
– sampling unit, sampling frame, etc.
• Calculate sample size
• Define survey data collection methods
• Non-respondents
– number of attempts to reach
– different days, times
17
Sample size estimations
• Requirements:
– level of precision (width of confidence interval)
– expected variability (estimated from previous studies, pilot study, or literature)
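For a prevalence estimate, the two requirements combine in the usual normal-approximation formula n = z² p(1 − p) / d², where p is the expected prevalence and d is the desired half-width of the confidence interval. The sketch below assumes simple random sampling from a large population; the function name and input values are illustrative.

```python
import math

def sample_size_for_proportion(p, half_width, z=1.96):
    """Sample size so that a 95% CI for a proportion is p +/- half_width.

    Assumes simple random sampling and the normal approximation.
    """
    return math.ceil(z ** 2 * p * (1 - p) / half_width ** 2)

# Expected prevalence 50%, desired precision +/- 5 percentage points.
print(sample_size_for_proportion(0.50, 0.05))  # 385

# Expected prevalence 10%, desired precision +/- 2 percentage points.
print(sample_size_for_proportion(0.10, 0.02))  # 865
```

Cluster or multistage designs generally need a larger sample than this formula suggests for the same precision.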
18
Design of questionnaires
• List study variables
• Collect existing questions and instruments
• Adapt and/or develop new questions
• Format questionnaire
• Pre-testing (timing, responses, clarity, etc.)
• Revise, determine priorities, shorten
19
Question wording: clarity
• Use concrete rather than abstract terms, e.g.,
– During a typical week, how many hours do you spend doing vigorous exercise?
– Not: How much exercise do you get?
• Avoid jargon, technical terms, slang
• Avoid double negatives (Do you disagree that doctors should not make house calls?)
• Use active vs passive voice (Has a doctor ever told you vs Have you ever been told by a doctor?)
20
Question wording: clarity
– Break long sentences into short ones (20 words or fewer)
– Use good grammar but use informal style
– Avoid hypothetical questions
– Evaluate reading level (normally not more than 8th grade)
21
Question wording: neutrality
• Do not suggest desirable response, e.g.:
– Not: do you ever drink alcohol?
– Better: how often do you drink alcohol?
• Give permission to give undesirable response, e.g.:
– Sometimes people forget to take medications their doctor prescribes. Do you ever forget (or how often do you forget) to take your medications?
22
Question wording
• Introduce attitude questions, e.g.:
– People have different opinions about their medical care. We are interested in your opinion.
• Avoid double-barreled questions
– Not: How much coffee or tea do you drink each day?
• Avoid assumptions
– Not: How much help do you get from your family?
23
Response wording
• Make them short
• Use as few options as possible
• Consider different types of non-response:
– refuse
– don’t know
– no opinion
– not applicable
– omission by subject or interviewer
24
Response wording
• Make sure responses are mutually exclusive (or give instructions to “check all that apply”)
• Consider use of response card for multiple questions with same set of responses
25
Organization of questionnaire
• Group questions by subject matter
• Introduce each group with short descriptive statement (e.g., now I am going to ask you some questions about your use of health services)
• Begin with more emotionally neutral questions
• More sensitive questions (e.g., income, sexual function) near end of questionnaire
26
Organization of questionnaire
• interviewer-administered: repeat time frame fairly frequently
• self-administered: repeat time frame at top of each page or each set of questions, e.g.:
During the past year, how many times have you:
– Visited a doctor?
– Been a patient in an emergency department?
– Been admitted to hospital?
27
Organization of questionnaires
• Group questions with similar response scale
• Format skip patterns
– screener questions
– branching questions
• Time frame
– group questions that ask about same time frame
– “usual” behavior vs specified time period
– assist respondent with milestones to help define reference time frame
29
Face-to-face interviews: advantages
• reduce items with no response
• easier for respondents who are older, less educated, or not fluent in the language
• some formats easier to administer:
– skip patterns to avoid irrelevant questions
– open-ended questions - can probe for more complete response
30
Face-to-face interviews: disadvantages
• cost
• time
• effort (interviewer training, evaluation of inter-rater reliability)
• interviewer biases
• differences in sociodemographic characteristics of interviewer and subject
31
Telephone interviews: advantages
• less expensive than face-to-face
• reduce items with non-response
• some formats easier to administer:
– skip patterns to avoid irrelevant questions
– open-ended questions - can probe for more complete response
• large, representative samples can be organized from one office
• avoids bias associated with appearance of interviewer
32
Telephone interviews: disadvantages
• misses households without telephone
• misses those with unlisted phone numbers
• bias when calls made during day
• multiple calls may be needed
• perceived as intrusive by some
• difficult to administer items with multiple response options
33
Mailed questionnaires: advantages
• least expensive
• can be coordinated from one office
• social desirability minimized
• inconsistent results on completeness of reporting (e.g., for # MD visits)
34
Mailed questionnaires: disadvantages
• relatively low response rates
– can be improved with multiple mailings, cover letter, letterhead, advance warning, token of appreciation, stamped self-addressed envelope (SSAE)
• difficult to get information on non-respondents
– differences between early and late responders
• items may be omitted: 5-10% may be unusable
• cannot control order of questions
• postal strikes
35
Analysis of surveys
• Missing data
– exclude
– imputation: e.g., based on characteristics of respondents
– sensitivity of estimate to method of imputation
• Weighting of estimates
– for stratified samples
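Weighting matters when strata are sampled at different fractions, because the crude sample proportion then over-represents the oversampled stratum. The sketch below weights each stratum's prevalence by its share of the population; the function name and all counts are hypothetical.

```python
def weighted_prevalence(strata):
    """Population-weighted prevalence from a stratified sample.

    strata: list of (population_size, sample_size, cases_in_sample) tuples.
    """
    total_population = sum(pop for pop, _, _ in strata)
    return sum((pop / total_population) * (cases / n) for pop, n, cases in strata)

# Hypothetical strata: the smaller, higher-prevalence stratum is oversampled.
strata = [
    (8000, 200, 10),  # 5% prevalence in the sample from the large stratum
    (2000, 200, 60),  # 30% prevalence in the sample from the small stratum
]

crude = sum(cases for _, _, cases in strata) / sum(n for _, n, _ in strata)
weighted = weighted_prevalence(strata)

print(round(crude, 3), round(weighted, 3))  # 0.175 0.1
```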
36
Analysis of surveys (cont’d)
• Crude estimates, confidence intervals
– Continuous data: mean, median, quartiles
– Categorical data: proportion
– Confidence intervals to describe precision
37
Bias and precision of the survey estimates
• Bias:
– selection bias relates to sample selection
– information bias relates to information collected
• Precision:
– relates to sample size
38
Selection bias in surveys
• Does the final analysis sample represent the original target population?
• Sources of bias:
– sampling method
– non-response
– missing data
39
Information bias in surveys
• Bias in measurement of outcomes
• Sources of information bias:
– non-validated measurement instrument
– unblinded or poorly trained data collectors
– response set
– etc.
40
Critical review of an article describing prevalence or incidence
(Loney et al, 1998)
• Are the study methods valid?
• What is the interpretation of the results?
• What is the applicability of the results?
41
Are the study methods valid?
• Appropriate study design and sampling methods
• Appropriate sampling frame
• Adequate sample size
• Suitable outcome
• Unbiased measurement of outcome
• Adequate response rate
42
What is the interpretation of the results?
• Are the estimates of prevalence or incidence given with confidence intervals and in detail by subgroup, if appropriate?
43
What is the applicability of the results?
• Are the study subjects and the setting described in detail and similar to those of interest to you?
44
CSHA: Are the study methods valid?
• Appropriate study design and sampling methods
• Appropriate sampling frame
• Adequate sample size
• Suitable outcome
• Unbiased measurement of outcome
• Adequate response rate
45
CSHA: study design and sampling methods
• Prevalence survey with 2 analytic studies appended
• Target population: Canadian population aged 65 and over
• Exclusions:
– Yukon and Northwest Territories
– Indian reserves, military units
– persons with life-threatening illnesses
– persons not fluent in French or English
46
CSHA: Appropriate study design and sampling methods (cont’d)
• 18 study centres across Canada
• 36 cities and surrounding rural areas
– selected for accessibility to study centres
– included 60% of population aged 65+
47
Sampling frame: community sample
• Sampling frame for community sample:
– Medicare (provincial health insurance plans)
– In Ontario: used Enumeration Composite Record (aggregate based on election records and municipal records)
• Stratified random sampling by age:
– 65-74
– 75-84 (twice sampling fraction of 65-74)
– 85+ (2.5x sampling fraction of 65-74)
48
Sampling frame: institutional sample
• Nursing homes, chronic care facilities, collective dwellings (e.g., convents)
• 3 centres sampled from insurance lists
• Other centres used multistage sampling:
– stratified sample of institutions:
  • small (up to 25 beds)
  • medium (26 - 100 beds)
  • large (more than 100 beds)
– random sampling within selected institutions
49
Sampling (cont’d)
• Person who could not be contacted or who refused was replaced with another from same age group, same sex, same geographic region.
• Target for each region:
– 1800 from community sample
– 250 in institutional sample
50
Adequate sample size?
• Target sample in each region:
– 1800 in community
– 250 in institutions
• Assuming institutional prevalence of 50%
– 95% CI of ±6%
• Assuming community prevalence of 5%
– 95% CI of ±1%
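These precision figures can be checked with the normal-approximation half-width z·sqrt(p(1 − p)/n). The sketch below assumes simple random sampling within each regional sample and ignores any design effect; the function name is illustrative.

```python
import math

def ci_half_width(p, n, z=1.96):
    """Half-width of a normal-approximation 95% CI for a proportion."""
    return z * math.sqrt(p * (1 - p) / n)

institution = ci_half_width(0.50, 250)   # institutional sample, prevalence 50%
community = ci_half_width(0.05, 1800)    # community sample, prevalence 5%

print(f"{institution:.3f} {community:.3f}")  # 0.062 0.010
```

Both values agree with the slide: roughly 6 percentage points for the institutional sample and 1 percentage point for the community sample.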
51
Suitable outcome
• 2-stage process
• 3MS screen in subject’s home
• all with positive screen (score of <78) and random sample of those with negative screen referred for clinical evaluation
• DSM III-R criteria for final diagnosis
52
Unbiased measurement of outcome
• Interviewers and clinical team (nurse, psychometrician, neuropsychologist, physician) were blind to screening result
• Negative screens included
53
Response rate: community sample
• 19,398 people on community sample lists
– 3,753 had died, were wrong age, had left study area, or institutionalized
– 1,020 could not speak English or French
– 534 away or in hospital during study period
• 14,091 (72.6%) eligible for study
– 1,601 could not be contacted
– 3,482 refused
• 9,008 participated (63.9% of those eligible)
• 8,949 screened (59 who could not be screened referred for clinical assessment)
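The flow of numbers on this slide can be reproduced step by step. The sketch below uses only the counts given above; the variable names are illustrative.

```python
listed = 19398                      # people on community sample lists
ineligible = 3753 + 1020 + 534      # died/moved/etc., language, away or in hospital
eligible = listed - ineligible

not_contacted = 1601
refused = 3482
participated = eligible - not_contacted - refused

eligible_pct = round(100 * eligible / listed, 1)
participation_pct = round(100 * participated / eligible, 1)

print(eligible, eligible_pct)            # 14091 72.6
print(participated, participation_pct)   # 9008 63.9
```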
54
Response rate: community sample (cont’d)
• Among those with positive screen (1,614):
– 508 (31%) refused clinical assessment
• Among sample of those with negative screen (731):
– 228 (31%) refused clinical assessment
• Total participation rate (screening and clinical assessment): 0.69 x 0.64 = 0.44
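The total participation rate is the product of the two stage-specific rates. The sketch below recomputes it from the counts reported for the community sample (9,008 screened participants out of 14,091 eligible, and the refusals at clinical assessment); pooling the positive and negative screens into one clinical rate is an assumption about how the 0.69 was obtained.

```python
# Stage 1: screening participation among eligible subjects.
screening_rate = 9008 / 14091

# Stage 2: clinical assessment among those referred (positive and negative screens pooled).
referred = 1614 + 731
refused_clinical = 508 + 228
clinical_rate = 1 - refused_clinical / referred

total_rate = screening_rate * clinical_rate

print(round(clinical_rate, 2), round(screening_rate, 2), round(total_rate, 2))
# 0.69 0.64 0.44
```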
55
Response rate: institutional sample
• 1,817 subjects in sample
– 154 died, assigned to wrong age group, left study area or institution
– 46 could not speak French or English
– 31 in hospital
• 1,586 (87.3%) eligible
– 50 could not be contacted
– 281 refused
• 1,255 (79.1%) participated in screening