How I Learned to Stop Worrying and Love the Survey Cycle
DEPARTMENT OF STATISTICS
The Survey Cycle
Sampling Overview
A quote …
“Why do they call it common sense?
It isn’t that common.”
- Mark Twain
The brief
Intro/first considerations
Contracting out surveys
Survey management
Sampling issues
Questionnaire development
Pilot surveys/Sources of error
Data collection/processing
Data presentation
Completing the loop
Major themes
First considerations
Who do I need to survey?
How do I get representative samples?
Representative sampling strategies
Accuracy statements
Developing the questionnaire
Presenting the results
How do I manage this beast?
Excellent online resources
www.stats.govt.nz/NR/rdonlyres/CA923AA8-BDF6-4EAD-834F-573F04EEF7A9/0/AGuidetoagoodSurvey.pdf
www.perseus.com/surveytips/Survey_101.htm
www.whatisasurvey.info
The process
STAGE 1: RESEARCH DEFINITION
Understand the Problem
Identify Questions
Refine/Revise Questions
STAGE 2: RESEARCH PLAN/DESIGN
Choose Design
Inventory Resources
Assess Feasibility
Determine Trade-offs
First considerations
Motivating case study: crime & punishment
“The report presents the findings of the first comprehensive national survey of the views of a sample of adult New Zealanders about crime and the criminal justice system’s response to crime.”
…“the survey results were available to the Ministry’s policy staff working on the sentencing and parole reforms.”
“Since the survey was conducted in 1999, a major reform of the sentencing and parole regimes in New Zealand has taken place, with the commencement of the Sentencing Act 2002 and the Parole Act 2002 on 30 June 2002.”
What do you want to achieve?
What are the objectives?
What are the critical questions to be answered?
How will the results be used?
How will the results be communicated?
“Fools rush in where angels fear to tread...”
Do I have to do a survey?
Has this been done by someone else?
Literature search
Published Statistics/Other Government agencies
Surrogate information - proxies
Expert advice
Motivating case study: crime & punishment
Introduction 1
1.1 National surveys overseas
1.2 Research at home
1.3 The present study
Published Stats/proxies example: Race and politics in New Caledonia
Recent presidential election in France – and therefore New Caledonia
Nicolas Sarkozy and Ségolène Royal
Anecdotal evidence suggests Kanaks (Melanesians) were more likely to vote for Ségolène
Election results available by region
No ethnicity question in the latest census (2004); Chirac banned it
NC’s statisticians have come up with a ‘proxy’ measure
% of people (14+ years) by administrative region who speak a Melanesian language
Voting data available from “Les Nouvelles” newspaper
[Figure: scatter plot, % Voted for Sarkozy (who voted) vs % Speak Melanesian Language. x-axis: % Speak Melanesian; y-axis: % Voted Sarkozy (who voted). A second version of the chart reports R² = 85%.]
Failing this, I will need to conduct a survey
Population → (select) → Sample
Statistic → (estimate) → Parameter
Statistics: sample proportion, sample mean
Parameters: true proportion, true mean
Motivating case study: crime & punishment
“While no nation-wide survey focussing solely on attitudes towards crime and criminal justice issues has previously been conducted in New Zealand, some studies have touched on related topics. For example, in 1996, the National Survey of Crime Victims (Young et al. 1997)”….
Who do I need to survey?
Define who your target population is.
Examples:
Main household purchaser
Eligible voters
Recent insurance claimant
Motivating case study: crime & punishment
The sample comprised 1,000 interviews amongst the general population aged 18 years and over (the main sample)
Person-to-person survey was conducted…
How do I need to survey?
Types of surveys:
The three most common types of surveys:
mail/web surveys
telephone surveys
person-to-person interviews
Types of surveys
Survey costs are lowest for mail/web surveys
More expensive for telephone surveys
Most expensive for personal interviews
With well-trained interviewers, higher response rates and longer questionnaires are possible with personal interviews
The design of the questionnaire is critical
Telephone survey example
METHOD: Conducted by CATI (Computer Assisted Telephone Interviewing)
How much $$$ is needed?
Communication with Consumer Link
How do I sample these people?
Non-representative samples
Send out letters, web requests, or 0800/0900 telephone requests, then wait for replies
Self-selection bias
Convenience/judgment/snowball sampling
Sampling cost is lower and implementation easier
Statistically valid statements cannot be made about the precision of the estimates
There is some information, but it cannot be ‘retro-fitted’ to a different population
Why? You have no idea if the respondents are ‘representative’ of the people you are interested in.
Non-representative samples: Disaster
To prepare for her book Women and Love (1987), Shere Hite:
sent questionnaires to 100,000 women asking about love, sex, and relationships
4.5% responded
Hite used those responses to write her book
Moore (Statistics: Concepts and Controversies, 1997) noted:
respondents “were fed up with men and eager to fight them…”
“the anger became the theme of the book…”
“but angry women are more likely” to respond
When parts of the population cannot be selected...
…the sample cannot represent the whole population.
Selection bias
Population
Sample
How do I get representative samples?
Representative samples
The method used to pick interviewees relies on the bedrock of random sampling:
when the chance of selecting each person in the target population is known, then, and only then, do the results of the sample survey reflect the entire population
This is the reason that interviews with 1,000 NZ adults can accurately reflect the opinions of about 2 million NZ adults
Representative = random sample
Each person in a population has a KNOWN RANDOM PROBABILITY of being selected
Arrange yourselves randomly about the room
E.g. randomly choose ½ of people from today
How?
Representative samples: sample frames
A critical element in any survey is to locate (or “cover”) all the members of the population being studied so that they have a chance to be sampled.
To achieve this, a list - termed a “sampling frame” - is usually constructed
The quality of the sampling frame is probably the dominant feature for ensuring adequate coverage of the desired population.
Sample frames
Any procedure and data that effectively enables the selection of a sample
Good frames require development and maintenance efforts
E.g. Statistics NZ runs an annual survey (the Annual Business Frame Update Survey) simply to update their Business Frame
Most frames are imperfect, exhibiting
Undercoverage
Duplicated units (perhaps under different spellings or ID numbers)
Out-of-date or missing data
Population
Sample frame
Telephone sampling of households
Under-coverage is a fundamental problem for telephone surveys of households
Only 92% of households have a land-line
Less than 80% of Maori or Pacific households
Households without phones are also different in other ways; e.g. they are generally low-income households
Duplicates also occur
i.e. some households have more than one phone number, and thus have more chance of being selected
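One common way to correct for duplicate listings is to weight each responding household by the inverse of its number of phone lines, so that households with two numbers do not count double. The sketch below assumes each respondent is asked how many lines the household has; the function name and the five responses are hypothetical, made up purely for illustration.

```python
def weighted_yes_share(responses):
    """responses: list of (said_yes, phone_lines_in_household) tuples."""
    # A household with k lines is k times as likely to be dialled,
    # so it gets weight 1/k.
    weights = [1 / lines for _, lines in responses]
    yes_weight = sum(w for (said_yes, _), w in zip(responses, weights) if said_yes)
    return yes_weight / sum(weights)

# Hypothetical responses: (answered "yes", number of phone lines)
responses = [(True, 1), (False, 1), (True, 2), (True, 2), (False, 1)]

unweighted = sum(said_yes for said_yes, _ in responses) / len(responses)
weighted = weighted_yes_share(responses)
print(unweighted, weighted)  # 0.6 0.5
```

Weighting fixes the duplicate problem only; it cannot recover households that have no phone at all.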
Telephone sampling frames …
White Pages
Telecom sells random samples of listed numbers
Unlisted numbers not included
So another 15% of phone numbers are lost
May be cheaper to use paper directories instead, but these are out of date (even when just distributed)
Random digit dialing (RDD)
Naïve approach
List all possible numbers, and select at random
Many non-working numbers - success rate <10%
Better approaches
E.g. Mitofsky-Waksberg
Take banks of possible phone numbers, and select phone numbers more intensively from banks that have larger proportions of listed numbers
Increased hit rate to 60% in US
Pseudo-RDD methods using banks centered on valid “seed” phone numbers are sometimes used
Household sampling for in-home surveys
Multi-stage approach widely used
Area sample
take list of areas and select sample of areas
38,366 mesh blocks in NZ Geostatistical System
Household sample
Interviewers list all dwellings within selected mesh-blocks (following mesh-block maps)
Sample of households selected in each area
Variations on this approach exist
Random route within area (i.e. route follows rules from random starting point), or ignoring area boundaries
Motivating case study: crime & punishment
“The main sample comprising 1006 adults was drawn from 1500 households in 14 locations throughout New Zealand.”
“The locations were defined in terms of region and area type and were designed to ensure a fully representative cross-section of the New Zealand population aged 18 years and over.”
The population consists of all households in NZ
Sampling frame = area units
200 regions chosen randomly within 14 regional strata
5 households per region
Random adult chosen within each household
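The stages listed above can be sketched in code. This is an illustration only, not the survey company's procedure: the synthetic frame (50 areas per stratum, 30 households per area, 1 to 4 adults per household) and the split of the 200 areas across the 14 strata are all assumptions made up for the example.

```python
import random

rng = random.Random(2024)

N_STRATA = 14
AREAS_PER_STRATUM = 50      # hypothetical frame size
HOUSEHOLDS_PER_AREA = 30    # hypothetical frame size
# Hypothetical allocation of the 200 sampled areas over the 14 strata
areas_to_sample = [15] * 4 + [14] * 10

# frame[stratum][area] = list of household sizes (number of adults)
frame = {
    s: {a: [rng.randint(1, 4) for _ in range(HOUSEHOLDS_PER_AREA)]
        for a in range(AREAS_PER_STRATUM)}
    for s in range(N_STRATA)
}

respondents = []
for s, n_areas in zip(range(N_STRATA), areas_to_sample):
    # Stage 1: random areas within each regional stratum
    for a in rng.sample(range(AREAS_PER_STRATUM), n_areas):
        households = frame[s][a]
        # Stage 2: 5 random households within the area
        for h in rng.sample(range(HOUSEHOLDS_PER_AREA), 5):
            # Stage 3: one random adult within the household
            adult = rng.randrange(households[h])
            respondents.append((s, a, h, adult))

print(len(respondents))  # 200 areas x 5 households = 1000
```

Note that an adult in a four-adult household is less likely to be picked than one living alone, so analyses of such a design usually weight by household size.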
Business frames
Business Directory
Excellent frame held by Statistics NZ
Contained 278,000 non-farming enterprises in Feb ‘01
Not available for market research surveys
Other business frames are marketing databases
Dun & Bradstreet, UBD, Yellow Pages
Representative sampling strategies
Types of representative sampling strategies
Simple random sampling
Stratified random sampling
Cluster sampling
Systematic sampling
Quota/booster sampling
Combinations of the above
Multistage sampling
Simple random sampling
Allocate labels 1, 2, …, N to the population
Randomly select a sample of size n from these labels using random numbers
This ensures that each element in the sampled population has the same probability of being selected.
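A minimal sketch of this labelling-and-drawing step, using Python's standard `random` module. The function name, the population size of 5,000 and the sample size of 10 are hypothetical choices for illustration.

```python
import random

def simple_random_sample(population_size, sample_size, seed=None):
    """Return the sorted labels (1..N) of a simple random sample, drawn without replacement."""
    rng = random.Random(seed)
    labels = range(1, population_size + 1)  # labels 1, 2, ..., N
    return sorted(rng.sample(labels, sample_size))

sample = simple_random_sample(population_size=5000, sample_size=10, seed=1)
print(sample)
```

Fixing the seed makes the draw reproducible, which helps when the selection has to be audited later.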
Stratified simple random sampling
The population is first divided into sub-groups, called strata
Take a random sample from each stratum
The basis for forming the various strata depends on the amount of information known about the sample frame
Can lead to more accurate estimates
Strata can be region of country (rural/urban) used in political polls
Other auxiliary information – e.g. sex, income, age…
Especially useful for customer databases
If you sample in direct proportion to strata size, you reduce variation in estimates
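Sampling in direct proportion to stratum size can be sketched as follows. The two strata, their sizes and the function names are hypothetical; with awkward stratum sizes the rounded allocations may not add up exactly to the requested total, which a real design would need to handle.

```python
import random

def proportional_allocation(stratum_sizes, total_sample):
    """Allocate the sample across strata in proportion to stratum size."""
    population = sum(stratum_sizes.values())
    return {name: round(total_sample * size / population)
            for name, size in stratum_sizes.items()}

def stratified_sample(frame, total_sample, seed=None):
    """Draw a simple random sample within each stratum of the frame."""
    rng = random.Random(seed)
    sizes = {name: len(units) for name, units in frame.items()}
    allocation = proportional_allocation(sizes, total_sample)
    return {name: rng.sample(units, allocation[name])
            for name, units in frame.items()}

# Hypothetical frame: 8,000 urban and 2,000 rural units
frame = {
    "urban": [f"U{i}" for i in range(8000)],
    "rural": [f"R{i}" for i in range(2000)],
}
chosen = stratified_sample(frame, total_sample=100, seed=1)
print({name: len(units) for name, units in chosen.items()})  # {'urban': 80, 'rural': 20}
```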
Cluster sampling
Cluster sampling requires that the population be divided into N groups of elements called clusters.
We then select a simple random sample of n clusters.
A primary application of cluster sampling involves area sampling, where the clusters are counties, city blocks, or other well-defined geographic sections.
Can increase variation, as the information may no longer be ‘unique’ for individuals within a cluster (people in the same cluster tend to be similar)
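A small sketch of the selection step: draw a simple random sample of whole clusters, then take every element in the chosen clusters. The 20 blocks of 5 houses each are hypothetical.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Select n_clusters whole clusters at random; return them with all their members."""
    rng = random.Random(seed)
    chosen_ids = rng.sample(sorted(clusters), n_clusters)
    return {cid: clusters[cid] for cid in chosen_ids}

# Hypothetical frame: 20 city blocks with 5 houses each
clusters = {
    f"block-{b}": [f"block-{b}/house-{h}" for h in range(1, 6)]
    for b in range(1, 21)
}
selected = cluster_sample(clusters, n_clusters=4, seed=1)
households = [house for members in selected.values() for house in members]
print(len(selected), len(households))  # 4 20
```

Only a list of clusters is needed up front, not a list of every household, which is why this design is cheap for area sampling.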
Systematic sampling
Choosing, say, every 10th person in your sampling frame
Assumes no relationship between the selection interval and the ordering of the sampling frame
Used in transportation studies…
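A minimal sketch of systematic selection with a random start. The frame of 100 labels and the interval of 10 are hypothetical.

```python
import random

def systematic_sample(frame, interval, seed=None):
    """Take every `interval`-th unit of the frame, beginning at a random start."""
    rng = random.Random(seed)
    start = rng.randrange(interval)  # random start in 0..interval-1
    return frame[start::interval]

frame = list(range(1, 101))  # hypothetical frame of 100 labelled units
sample = systematic_sample(frame, interval=10, seed=1)
print(sample)
```

If the frame has a cycle that matches the interval (for example, every 10th house is a corner house), the sample will over-represent that pattern, which is the assumption flagged above.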
Quota/booster sampling
Some groups are of particular interest
E.g. in NZ, Maori/Pacific Island (PI) people
In a simple random sample (SRS) we will typically get only small numbers of these people, as the sample reflects the general population
So these people are contacted until pre-specified numbers are reached, so that we can do more in-depth analysis
Strictly speaking this is not a random sample
Motivating case study: crime & punishment
The sampling frame consists of all households in NZ
200 regions chosen randomly within 14 regional strata
5 households per region
Random adult chosen within each household
Motivating case study: crime & punishment
Sample design:
“The sample design used by ACNielsen in the Ministry’s project is best described as a fully national multi-stage stratified probability sample with clustering.”
Quota/ Booster samples
“The main sample was supplemented with ‘booster’ samples of 250 Mäori and 250 Pacific Peoples adults aged 18 years and over.”…
Accuracy statements
Sampling errors vs. non-sampling errors
Sampling errors
This is not an "error" in the sense of making a mistake. Rather, it is a measure of the possible range of approximation in the results because a sample was used
Interviews with a representative sample of 1,000 adults can accurately reflect the opinions of about 2 million NZ adults
This range of possible results is called the error due to sampling, often called the margin of error (MOE)
More on sampling – a heuristic presentation
Sampling errors
[Figure: population distribution (e.g. income) with population mean $\mu$, and a sample drawn from it with sample mean $\bar{x}$]
Sampling error: the sample mean falls where it does only because certain randomly selected observations were included in the sample
Margin of error
A margin of error of 3% means that over the long run, 95% of the samples would give results within plus or minus 3% of the truth. 5% of the time the error would be greater
Quick method to calculate MOE for a proportion from a simple random sample:
$$\text{Margin of error} \approx \frac{1}{\sqrt{n}}$$
where n is the sample size.
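The quick rule is the usual 95% formula for a proportion, $1.96\sqrt{p(1-p)/n}$, evaluated at the worst case $p = 0.5$ with 1.96 rounded up to 2. The sketch below compares the two for a few sample sizes; the function names are hypothetical and a simple random sample is assumed.

```python
import math

def quick_moe(n):
    """Quick 95% margin of error for a proportion: 1 / sqrt(n)."""
    return 1 / math.sqrt(n)

def moe_95(p, n):
    """Standard 95% margin of error for a proportion p from a simple random sample of size n."""
    return 1.96 * math.sqrt(p * (1 - p) / n)

for n in (100, 400, 1000, 2500):
    print(n, f"quick: {quick_moe(n):.1%}", f"standard at p=0.5: {moe_95(0.5, n):.1%}")
```

Because the margin shrinks with the square root of n, halving it requires four times the sample.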
Sampling errors
This does not address the issue of whether people cooperate with the survey, or if the questions are understood, or if any other methodological issue exists.
The sampling error is only the portion of the potential error in a survey introduced by using a sample rather than interviewing the entire population
Example: One News Colmar Brunton Poll
MOE: Based on the total sample of 1000 Eligible Voters, the maximum sampling error estimated is plus or minus 3.2%, expressed at the 95% confidence level
When looking for a difference between two parties at any one point in time, the gap needs to be about 2 × MOE = 6.4%
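The 2 × MOE rule is a conservative shortcut. When both shares come from the same poll, the margin of error of the gap can be computed directly from the variance of a difference of two multinomial proportions, $(p_1 + p_2 - (p_1 - p_2)^2)/n$. The sketch below uses hypothetical party shares of 45% and 40%; these are not figures from the poll.

```python
import math

def moe_single(n, p=0.5, z=1.96):
    """95% margin of error for one party's share (worst case p = 0.5 by default)."""
    return z * math.sqrt(p * (1 - p) / n)

def moe_difference(p1, p2, n, z=1.96):
    """95% margin of error for the gap p1 - p2 when both shares come from the same poll."""
    return z * math.sqrt((p1 + p2 - (p1 - p2) ** 2) / n)

n = 1000
p1, p2 = 0.45, 0.40  # hypothetical party shares
gap = p1 - p2

print(f"single-share MOE: {moe_single(n):.1%}")
print(f"rule-of-thumb threshold (2 x MOE): {2 * moe_single(n):.1%}")
print(f"MOE of the gap: {moe_difference(p1, p2, n):.1%}")
print("gap exceeds its MOE:", gap > moe_difference(p1, p2, n))
```

With these assumed shares, a five-point lead is still inside the sampling noise of a 1,000-person poll.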
Meanwhile, in the US, Bush and approval
This chart plots all the different polls (grey dots) at once;
the blue line is the estimated approval rate over time
while the scatter of grey dots provides an estimate of the reliability of the blue line
Different polls are different random samples of the population
Random sampling is not fool-proof; any one sample has a chance, albeit small, to poorly represent the population. That's why the dots add greatly to the chart
Non-sampling errors…
Process errors:
Examples include measurement error, interviewer error, and processing error.
These can be minimised by proper interviewer training, good questionnaire design, pre-testing, and careful management of the data recording process.
The problem is most serious when a bias is created.
Non-sampling errors…
Errors in data acquisition:
Selection bias
Randomly select people; don’t let them (or you) choose who takes part!
Non-response errors
Anonymity, questionnaire design, relevance
Call backs, substitution, re-weighting data
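One simple form of re-weighting is post-stratification: when a group responds less often than its known share of the population, each group's mean is weighted by that population share rather than by its share of the respondents. The sketch below assumes the population shares are known from another source (for example a census); the groups, shares and responses are hypothetical.

```python
from collections import defaultdict

def poststratified_mean(sample, population_shares):
    """sample: list of (group, value). Weight each group's mean by its known population share."""
    by_group = defaultdict(list)
    for group, value in sample:
        by_group[group].append(value)
    return sum(population_shares[g] * sum(vals) / len(vals)
               for g, vals in by_group.items())

# Hypothetical: both age groups are half the population,
# but only 2 of the 10 respondents are under 40.
population_shares = {"under_40": 0.5, "40_plus": 0.5}
sample = [("under_40", 1), ("under_40", 1)] + \
         [("40_plus", v) for v in (0, 0, 0, 0, 1, 0, 0, 1)]

raw_mean = sum(v for _, v in sample) / len(sample)
adjusted = poststratified_mean(sample, population_shares)
print(raw_mean, adjusted)  # 0.4 0.625
```

Re-weighting helps only if respondents and non-respondents within a group are alike; it does not remove bias when the people who answer differ from those who do not.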
Motivating case study: crime & punishment
“In order to maximise the chances of obtaining interviews at initially-selected dwellings and to minimise replacement of dwellings, a maximum of three trips into any urban area and two trips into rural areas were permitted.”
“Up to six call-backs were made to a household before it was replaced …”
…then the sample mean is affected
Non-sampling error
Sampling error + Non-sampling error
Population
Sample
Non-sampling errors
Never be fooled by the number of responses
Literary Digest's non-representative (self-selected) sample of about 2.4 million respondents, from roughly 10 million ballots mailed out, said Landon would beat Roosevelt in the 1936 Presidential election; Roosevelt won in a landslide
Increasing sample size will not reduce all of the above types of errors!
Think long and hard about how any of these errors may occur
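A small simulation illustrates why sample size cannot rescue a biased sample. Every number below is an assumption made up for the illustration: a population in which 60% support a proposal, and a self-selected survey in which opponents are three times as likely to reply as supporters.

```python
import random

rng = random.Random(36)

TRUE_SUPPORT = 0.60  # hypothetical true proportion
population = [rng.random() < TRUE_SUPPORT for _ in range(200_000)]

def response_probability(supports):
    """Hypothetical self-selection: opponents reply three times as often as supporters."""
    return 0.10 if supports else 0.30

# Large self-selected sample: everyone is asked, but only some choose to reply
self_selected = [v for v in population if rng.random() < response_probability(v)]
# Small simple random sample of 1,000
random_sample = rng.sample(population, 1000)

biased_estimate = sum(self_selected) / len(self_selected)
random_estimate = sum(random_sample) / len(random_sample)

print(f"self-selected: n={len(self_selected)}, estimate={biased_estimate:.1%}")
print(f"random sample: n={len(random_sample)}, estimate={random_estimate:.1%}")
```

Under these assumptions the self-selected sample is many times larger, yet its estimate settles near one third rather than 60%; collecting more replies only makes the wrong answer more precise.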
Dealing with non-sampling errors…
Mistakes – check/ re-check data
Rule of thumb: if it looks too good to be true, it probably is
Training of interviewers
Pilot questionnaire
Wording of the questions
Developing the questionnaire
A good questionnaire must:
Address the research questions of interest
Ask short, simple, and clearly-worded questions
Usually, start with demographic questions to help respondents get started comfortably
Use dichotomous and multiple-choice questions.
Be as short as possible
Use open-ended questions cautiously
Avoid using leading questions
“Should a smack as part of good parental correction be a criminal offence in New Zealand?"
Pretest a questionnaire on a small number of people
Think about the way you intend to use the collected data when preparing the questionnaire
Questions will also depend on how you are getting the data e.g. CATI, person to person, mail/web
Using focus groups
If possible, focus groups are a great way to assist in questionnaire design
In-depth discussion led by a trained interviewer with a small group of people
Great way to understand the language used by people
Gets to the ‘qualities’ of interest
Can eliminate your biases/assumptions
What type of population are you sampling?
Consider number of qualities respondents possess:
Education (specifically reading level)
Web/mail surveys
Limits of attention
avoid fatiguing respondents
telephone surveys – very important
Motivation
Why is respondent going to/not participate
Political polls
Do I need incentives $$$$
Some types of questions
Reports of fact - self-disclosure of some objective information
e.g., age, sex, education, behavior.
Ratings of opinion or preference - evaluative response to statement
e.g., satisfaction, agreement, like/dislike.
Reports of intended behavior - self-disclosure of motivation or intention
e.g., likeliness to purchase.
What type of response format is appropriate for each question?
Open-ended questions
permits subject freedom to answer question in own words.
without pre-specified alternatives.
Advantages:
Obtains unanticipated answers
May better reflect respondent’s thoughts/beliefs
Appropriate when list of possible answers is excessive
Disadvantages:
Flexibility in responses difficult to code and analyse
Provides incomplete or unintelligible answers
Closed-ended questions
Subject selects from list of pre-determined, acceptable responses
Can sometimes include an "Other (please specify)" option
Types of closed-ended questions
Checklists - respondent selects certain number of pre-specified categories (nominal data)
Types of Exercises: Aerobics Basketball Swimming Weightlifting
Two-way forced choice
respondent must select between two alternatives (crude ordinal/nominal)
Do you always wake up before 8:00am?
Yes No
Ranked
respondent must place items in order of importance or value (ordinal)
Rank in order of importance: Career Social life Love life Children
Multiple-Choice (Likert scale)
respondent selects between range of alternatives along pre-specified continuum (ordinal/interval?)
Strongly Agree | Agree | Neutral | Disagree | Strongly Disagree
1 | 2 | 3 | 4 | 5
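A short sketch of how such responses might be coded for analysis, using the 1 to 5 coding shown on the scale. The seven responses are hypothetical. The median is reported because it needs only the ordering of the categories; taking a mean would additionally assume equal spacing between them, which is the ordinal/interval question raised above.

```python
from collections import Counter
from statistics import median

# Coding taken from the scale above
LIKERT = {
    "Strongly Agree": 1,
    "Agree": 2,
    "Neutral": 3,
    "Disagree": 4,
    "Strongly Disagree": 5,
}

# Hypothetical responses to one item
responses = ["Agree", "Strongly Agree", "Neutral", "Agree",
             "Disagree", "Agree", "Strongly Disagree"]

codes = [LIKERT[r] for r in responses]
counts = Counter(responses)

for label in LIKERT:
    print(f"{label:18} {counts[label]}")
print("median code:", median(codes))
```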
Closed-ended questions
Advantages:
Obtains more reliable answers
Responses more meaningful to researcher
Straightforward analysis
Disadvantages:
Answers relative to response scale provided
Respondent's choice may not be among the listed alternatives
Choices listed communicate kind of response wanted
Writing good survey questions
Differences in answers should stem from differences among respondents rather than differences in the stimuli
Question's wording is obviously a central part of the stimulus
Simple sentences
No double negatives
It is not the case that I have never cheated on my tax returns
Eliminate vagueness or poorly-defined terms
How many times in the past year have you talked with a doctor about your health?
Objectionable/Irrelevant question
How old are you?
Discrete questions/responses
Exhaustive/mutually exclusive categories
How did you last travel to the supermarket?
car, bus, foot, walking, public transportation
Limit response format (7±2)
Even vs. odd categories
Allow expression of variability
Even: Strongly Agree | Agree | Disagree | Strongly Disagree
Odd: Strongly Agree | Agree | Neutral | Disagree | Strongly Disagree
Match response to item
Frequency (Never-All the time)
Likert Scaling (Disagree-Agree)
Quality (Poor-Excellent)
Service (Not Well-Extremely Well)
Overall format
General to specific order of questions
Employ "filtering" questions (If “Yes”)
Mix question/response types to remove response bias
Minimise judgment and emphasise accuracy (social desirability)
Example: One News Colmar Brunton Poll
Party Support
“Under MMP you get two votes.
One is for a political party and is called a party vote. The other is for your local M.P. and is called an electorate vote.”
Party Vote*
“Firstly thinking about the Party Vote which is for a political party.
Which political party would you vote for?”
IF DON’T KNOW –
“Which one would you be most likely to vote for?”
Always seek others’ advice
Pre-test on colleagues
Ask for outside advice
Run a pilot study
After a while you can become too close to the subject, and a fresh perspective is needed
Presenting the results
The data is the story, not the graph
Motivating case study: crime & punishment
“Only 5% of the sample were within the correct range in their estimate of the amount of violent crime.”
Motivating case study: crime & punishment
[Bar chart: Violent crime perception. x-axis: Violent crime/100 incidents, in bands <=10%, 10-19, 20-29, 30-39, 40-49, 50-59, 60-69, 70-79, 80-89; y-axis: % (0 to 35). Actual rate 10%.]
* Data made to fit original numbers – pseudo-fictitious
Motivating case study: crime & punishment
What are they trying to report here?
Order tables by most common crime to least
See if there are any changes over the years
Don’t use a 3D object when you are presenting 1D info.
Motivating case study: crime & punishment
[Bar chart: % reported crime by type of crime. x-axis: Crime (Dishonesty, Drugs and antisocial, Violence, Property Damage, Property Abuse, Administrative, Sexual); y-axis: % (0 to 70).]
The story:
Lessons
Here, the real story was that people, on average, believed the violent crime rate to be 5x worse than what is actually reported
Just because you can produce a pretty graph, doesn’t mean you should
The simplest graph shows the real story
You don’t have to be boring
Graphical excellence
Show the data
Make the viewer consider the substance rather than the form
Avoid distortion
Present many numbers concisely
Make large datasets coherent
Graphical excellence…
Make your graphics friendly:
Avoid abbreviations and encodings.
Run words left-to-right.
Explain data with little messages.
Label graphic; don’t use elaborate shadings and a complex legend.
Avoid red/green distinctions.
Use clean serif fonts in mixed case.
Tabular displays
CRIME GROUP (%)         1998  1999  2000
Dishonesty                63    61    60
Drugs and antisocial      12    13    13
Violence                   9     9    10
Property Damage            8     9    10
Property Abuse             5     5     5
Administrative             3     3     3
Sexual                     1     1     1
Tabular excellence
Encourage comparisons
Reveal the data at several levels of detail
Serve a clear purpose: description, exploration …
Be closely integrated with the text
Tabular excellence…
Round drastically
Arrange the numbers to be compared in columns, not rows
Order the columns by size (or some other natural ordering)
Use row and column averages as a focus
Provide verbal summaries
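The points above can be sketched with the crime figures from the "Tabular displays" slide: rows are ordered by size, values are rounded to whole percentages, and a row average is added as a focus. The helper `row_average` and the column label `Avg` are assumptions for illustration, not part of the original table.

```python
# Percentages of reported crime by group, copied from the "Tabular displays" slide.
years = [1998, 1999, 2000]
crime = {
    "Dishonesty": [63, 61, 60],
    "Drugs and antisocial": [12, 13, 13],
    "Violence": [9, 9, 10],
    "Property Damage": [8, 9, 10],
    "Property Abuse": [5, 5, 5],
    "Administrative": [3, 3, 3],
    "Sexual": [1, 1, 1],
}


def row_average(values):
    """Average of a row, rounded to a whole percent (hypothetical helper)."""
    return round(sum(values) / len(values))


# Order rows from the most to the least common crime group.
ordered = sorted(crime.items(), key=lambda item: row_average(item[1]), reverse=True)

# Years run across the columns so each year's figures can be compared down a column.
print(f"{'CRIME GROUP (%)':<22}" + "".join(f"{y:>6}" for y in years) + f"{'Avg':>6}")
for group, values in ordered:
    print(f"{group:<22}" + "".join(f"{v:>6}" for v in values) + f"{row_average(values):>6}")
```

A one-line verbal summary, such as noting that dishonesty offences make up roughly 60% of reported crime in every year, would still accompany the table.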
Getting it right …
Presentations largely stand or fall on the quality, relevance, and integrity of the content. If your numbers are boring, then you've got the wrong numbers. If your words or images are not on point, making them dance in colour won't make them relevant. Audience boredom is usually a content failure, not a decoration failure.
Edward Tufte, writing in Wired Magazine
Sept 2003
Managing the beast
Keep it simple
Bad survey statement:
"We want to establish fiscal parameters in the customer decision making process in the plumbing and bathroom products arenas, testing price points and elasticity. After gaining this information, we will analyze its effects on marketing strategies and tactics."
Keep it simple
Good survey statement:
"We want to know how much customers are willing to pay for sinks to see if we can make more money."
The clearer you see the target, the more easily you can see if you hit it or not.
Always communicate
Always discuss:
Your goals
What you know/don’t know
What you need
Give clear expectations/timelines
Be flexible – situations change
Always communicate …
Be prepared to make mistakes
Fix them quickly
Be honest
Assume nothing
If anything can go wrong, it will
Does “anal retentive” have a hyphen in it?
Always communicate …
Ask for assistance
Use professional data collection/research agencies
They are the experts
Understand the Problem
Identify Questions
Refine/Revise Questions
Choose Design
Inventory Resources
Assess Feasibility
Determine Trade-offs
STAGE 1: RESEARCH DEFINITION
STAGE 2: RESEARCH PLAN/DESIGN
The process
Is it worth all the effort?
Compared to the alternative?
Yes, because reputable surveying organisations consistently do good work
In spite of the difficulties, surveys correctly conducted are still the best objective measure of the state of the views of the population of interest
A quote …
“Why do they call it common sense?
It isn’t that common.”
- Mark Twain
Fini
Presentation
Crime & punishment case study
‘Big drink’ proposal
Statistics NZ’s A Guide To Good Survey Design