How I Learned to Stop Worrying and Love the Survey Cycle
DEPARTMENT OF STATISTICS
The Survey Cycle
Sampling Overview
A quote …
“Why do they call it common sense?
It isn’t that common.”
- Mark Twain
The brief
Intro/first considerations
Contracting out surveys
Survey management
Sampling issues
Questionnaire development
Pilot surveys/Sources of error
Data collection/processing
Data presentation
Completing the loop
Major themes
First considerations
Who do I need to survey?
How do I get representative samples?
Representative sampling strategies
Accuracy statements
Developing the questionnaire
Presenting the results
How do I manage this beast?
Excellent online resources
www.stats.govt.nz/NR/rdonlyres/CA923AA8-BDF6-4EAD-834F-573F04EEF7A9/0/AGuidetoagoodSurvey.pdf
www.perseus.com/surveytips/Survey_101.htm
www.whatisasurvey.info
The process
STAGE 1: RESEARCH DEFINITION
  Understand the Problem
  Identify Questions
  Refine/Revise Questions
STAGE 2: RESEARCH PLAN/DESIGN
  Choose Design
  Inventory Resources
  Assess Feasibility
  Determine Trade-offs
First considerations
Motivating case study: crime & punishment
“The report presents the findings of the first comprehensive national survey of the views of a sample of adult New Zealanders about crime and the criminal justice system’s response to crime.”
…“the survey results were available to the Ministry’s policy staff working on the sentencing and parole reforms.”
“Since the survey was conducted in 1999, a major reform of the sentencing and parole regimes in New Zealand has taken place, with the commencement of the Sentencing Act 2002 and the Parole Act 2002 on 30 June 2002.”
What do you want to achieve?
What are the objectives?
What are the critical questions to be answered?
How will the results be used?
How will the results be communicated?
“Fools rush in where angels fear to tread...”
Do I have to do a survey?
Has this been done by someone else?
  Literature search
  Published statistics/other government agencies
  Surrogate information - proxies
  Expert advice
Motivating case study: crime & punishment
1 Introduction
  1.1 National surveys overseas
  1.2 Research at home
  1.3 The present study
Published stats/proxies example: Race and politics in New Caledonia
Recent presidential election in France – and therefore New Caledonia
  Nicolas Sarkozy and Ségolène Royal
Anecdotal evidence suggests Kanaks (Melanesians) were more likely to vote for Ségolène
Election results available by region
No ethnicity question in latest census (2004) – Chirac banned it
NC’s statisticians have come up with a ‘proxy’ measure
  % of people (14+ years) by administrative region who speak a Melanesian language
Voting data available from “Les Nouvelles” newspaper
[Figure: % Voted for Sarkozy (who voted) vs % Speak Melanesian Language. Scatter plot with x-axis "% Speak Melanesian" and y-axis "% Voted Sark (who voted)", shown without and then with a fitted line, R² = 85%]
Failing this, I will need to conduct a survey
Population → (select) → Sample
Statistic → (estimate) → Parameter
  sample proportion estimates true proportion
  sample mean estimates true mean
Motivating case study: crime & punishment
“While no nation-wide survey focussing solely on attitudes towards crime and criminal justice issues has previously been conducted in New Zealand, some studies have touched on related topics. For example, in 1996, the National Survey of Crime Victims (Young et al. 1997)”….
Who do I need to survey?
Define who your target population is. Examples:
  Main household purchaser
  Eligible voters
  Recent insurance claimant
Motivating case study: crime & punishment
The sample comprised 1,000 interviews amongst the general population aged 18 years and over (the main sample)
Person-to-person survey was conducted…
How do I need to survey?
Types of surveys: the three most common types are
  mail/web surveys
  telephone surveys
  person-to-person interviews
Types of surveys
Survey costs are lowest for mail/web surveys
  More expensive for telephone surveys
  Most expensive for personal interviews
With well-trained interviewers, higher response rates and longer questionnaires are possible with personal interviews
The design of the questionnaire is critical
Web survey example:
Telephone survey example
METHOD: Conducted by CATI (Computer Assisted Telephone Interviewing)
How much $$$ is needed?
Communication with Consumer Link
How do I sample these people?
Non-representative samples
  Send letters out/web requests, 0800/0900 telephone requests – wait for replies
    Self-selection bias
  Convenience/judgment/snowball sampling
Non-representative samples
Sampling cost is lower and implementation easier
Statistically valid statements cannot be made about the precision of the estimates
There is some information but it cannot be ‘retro-fitted’ to a different population
Why? You have no idea if the respondents are ‘representative’ of the people you are interested in.
Non-representative samples: Disaster
To prepare for her book Women and Love (1987), Shere Hite:
  sent questionnaires to 100,000 women asking about love, sex, and relationships
  4.5% responded
  Hite used those responses to write her book
Moore (Statistics: Concepts and Controversies, 1997) noted:
  respondents “were fed up with men and eager to fight them…”
  “the anger became the theme of the book…”
  “but angry women are more likely” to respond
Selection bias
When parts of the population cannot be selected...
…the sample cannot represent the whole population.
[Diagram labels: Population, Sample]
How do I get representative samples?
Representative samples
The method used to pick interviewees relies on the bedrock of random sampling:
  when the chance of selecting each person in the target population is known,
  then, and only then, do the results of the sample survey reflect the entire population
This is the reason that interviews with 1,000 NZ adults can accurately reflect the opinions of about 2 million NZ adults
Representative = random sample
Each person in a population has a KNOWN RANDOM PROBABILITY of being selected
Arrange yourselves randomly about the room
Distribute yourselves randomly in the room
E.g. randomly choose ½ of the people from today
  How?
Representative samples: sample frames
A critical element in any survey is to locate (or “cover”) all the members of the population being studied so that they have a chance to be sampled.
To achieve this, a list - termed a “sampling frame” - is usually constructed
The quality of the sampling frame is probably the dominant feature for ensuring adequate coverage of the desired population.
Sample frames
Any procedure and data that effectively enable the selection of a sample
Good frames require development and maintenance efforts
  E.g. Statistics NZ runs an annual survey (the Annual Business Frame Update Survey) simply to update their Business Frame
Most frames are imperfect, exhibiting
  Undercoverage
  Duplicated units (perhaps under different spellings or ID numbers)
  Out-of-date or missing data
[Diagram labels: Population, Sample frame]
Telephone sampling of households
Under-coverage is a fundamental problem for telephone surveys of households
  Only 92% of households have a land-line
  Less than 80% of Maori or Pacific households
  Households without phones are also different in other ways; e.g. they are generally low-income households
Duplicates also occur, i.e. some households have more than one phone number, and thus have more chance of being selected
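
One common way to compensate for households with several phone numbers is to weight each response by the inverse of the number of lines that could have reached it. The sketch below is only an illustration: the `responses` list, its field names and its values are invented, and real surveys may use more elaborate weighting.

```python
# Hypothetical responses: each household reports how many phone lines it has
# and whether it supports some proposal (1 = yes, 0 = no).
responses = [
    {"household": "A", "phone_lines": 1, "supports": 1},
    {"household": "B", "phone_lines": 2, "supports": 0},
    {"household": "C", "phone_lines": 1, "supports": 1},
    {"household": "D", "phone_lines": 3, "supports": 0},
]

def weighted_proportion(rows):
    """Weight each household by 1 / (number of lines), because a household
    with k lines is k times as likely to be dialled at random."""
    weights = [1 / r["phone_lines"] for r in rows]
    total = sum(w * r["supports"] for w, r in zip(weights, rows))
    return total / sum(weights)

unweighted = sum(r["supports"] for r in responses) / len(responses)
weighted = weighted_proportion(responses)
print(f"unweighted: {unweighted:.3f}, weighted: {weighted:.3f}")
```

In this toy data the multi-line households all answer "no", so ignoring the duplicates understates support.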
Telephone sampling frames …
White Pages
  Telecom sells random samples of listed numbers
  Unlisted numbers not included
    So have lost another 15% of phone numbers
  May be cheaper to use paper directories instead, but these are out of date (even when just distributed)
Random digit dialing (RDD)
  Naïve approach
    List all possible numbers, and select at random
    Many non-working numbers - success rate <10%
Better approaches
  E.g. Mitofsky-Waksberg
    Take banks of possible phone numbers, and select phone numbers more intensively from banks that have larger proportions of listed numbers
    Increased hit rate to 60% in US
  Pseudo-RDD methods using banks centered on valid “seed” phone numbers are sometimes used
Household sampling for in-home surveys
Multi-stage approach widely used
  Area sample
    take list of areas and select sample of areas
    38,366 mesh blocks in NZ Geostatistical System
Household sample
  Interviewers list all dwellings within selected mesh-blocks (following mesh-block maps)
  Sample of households selected in each area
Variations on this approach exist
  Random route within area (i.e. route follows rules from random starting point), or ignoring area boundaries
Motivating case study: crime & punishment
“The main sample comprising 1006 adults was drawn from 1500 households in 14 locations throughout New Zealand.”
“The locations were defined in terms of region and area type and were designed to ensure a fully representative cross-section of the New Zealand population aged 18 years and over.”
The population consists of all households in NZ
Sampling frame = area units
200 regions chosen randomly within 14 regional strata
5 households per region
Random adult chosen within each household
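
The three stages listed above (areas within strata, households within areas, one adult within each household) can be sketched as follows. This is an illustration only: the frame is made up, the names `build_frame` and `multistage_sample` are hypothetical, and allocating the 200 areas to strata in proportion to stratum size is an assumption, since the slides do not say how areas were allocated.

```python
import random

def build_frame(rng):
    """Hypothetical frame: 14 strata, 1,000 areas, 20 households per area,
    each household holding 1 to 3 adults."""
    frame = {}
    for s in range(14):
        n_areas = 80 if s < 2 else 70  # two larger strata, twelve smaller
        frame[f"stratum{s}"] = {
            f"s{s}-area{a}": [
                [f"s{s}-a{a}-h{h}-adult{i}" for i in range(rng.randint(1, 3))]
                for h in range(20)
            ]
            for a in range(n_areas)
        }
    return frame

def multistage_sample(frame, n_areas=200, households_per_area=5, seed=0):
    rng = random.Random(seed)
    total_areas = sum(len(areas) for areas in frame.values())
    sample = []
    for stratum, areas in frame.items():
        # Stage 1: areas per stratum, proportional to stratum size (assumption).
        k = round(n_areas * len(areas) / total_areas)
        for area in rng.sample(sorted(areas), k):
            # Stage 2: simple random sample of households within the area.
            for household in rng.sample(areas[area], households_per_area):
                # Stage 3: one adult chosen at random within the household.
                sample.append((stratum, area, rng.choice(household)))
    return sample

frame = build_frame(random.Random(1))
sample = multistage_sample(frame)
print(len(sample), "adults from", len({area for _, area, _ in sample}), "areas")
```

With 200 areas and 5 households per area the design yields 1,000 interviews, which matches the size of the main sample quoted earlier.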
Business frames
Business Directory
  Excellent frame held by Statistics NZ
  Contained 278,000 non-farming enterprises in Feb ‘01
  Not available for market research surveys
Other business frames are marketing databases
  Dun & Bradstreet, UBD, Yellow Pages
Representative sampling strategies
Types of representative sampling strategies
Simple random sampling
Stratified random sampling
Cluster sampling
Systematic sampling
Quota/booster sampling
Combinations of the above
  Multistage sampling
Simple random sampling
Allocate labels 1, 2, …, N to the population
Randomly select a sample of size n from the above via the use of random numbers
This is used to ensure that each element in the sampled population has the same probability of being selected.
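
A minimal sketch of the labelling-and-drawing procedure, using Python's standard `random` module. The population size, sample size and seed are arbitrary values chosen for illustration.

```python
import random

def simple_random_sample(N, n, seed=None):
    """Return n distinct labels drawn without replacement from 1..N,
    so every label has the same chance n/N of being selected."""
    rng = random.Random(seed)
    return rng.sample(range(1, N + 1), n)

labels = simple_random_sample(N=5000, n=10, seed=42)
print(sorted(labels))
```

The selected labels are then looked up in the sampling frame to find the people to contact.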
Stratified simple random sampling
The population is first divided into sub-groups, called strata
Take a random sample from each stratum
The basis for forming the various strata depends on the amount of info. known about the sample frame
Can lead to more accurate estimates
Strata can be region of country (rural/urban), as used in political polls
Other auxiliary information – e.g. sex, income, age…
  Especially useful for customer databases
If you sample in direct proportion to strata size, you reduce variation in estimates
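
Sampling in direct proportion to strata size can be sketched as below. The urban/rural split and the frame contents are invented for illustration, and the largest-remainder rounding is just one reasonable way to make the allocations add up to n.

```python
import random

def proportional_allocation(stratum_sizes, n):
    """Split a total sample of n across strata in proportion to their size.
    Largest-remainder rounding makes the allocations sum exactly to n."""
    total = sum(stratum_sizes.values())
    exact = {s: n * size / total for s, size in stratum_sizes.items()}
    alloc = {s: int(x) for s, x in exact.items()}
    leftover = n - sum(alloc.values())
    by_remainder = sorted(exact, key=lambda s: exact[s] - alloc[s], reverse=True)
    for s in by_remainder[:leftover]:
        alloc[s] += 1
    return alloc

def stratified_sample(frame, n, seed=None):
    """frame: dict mapping stratum name -> list of units in that stratum."""
    rng = random.Random(seed)
    alloc = proportional_allocation({s: len(u) for s, u in frame.items()}, n)
    return {s: rng.sample(frame[s], alloc[s]) for s in frame}

# Hypothetical frame: 8,600 urban and 1,400 rural units.
frame = {"urban": [f"u{i}" for i in range(8600)],
         "rural": [f"r{i}" for i in range(1400)]}
sample = stratified_sample(frame, n=1000, seed=1)
print({s: len(units) for s, units in sample.items()})
```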
Cluster sampling
Cluster sampling requires that the population be divided into N groups of elements called clusters.
We then select a simple random sample of n clusters.
A primary application of cluster sampling involves area sampling, where the clusters are counties, city blocks, or other well-defined geographic sections.
Can increase variation, as information may no longer be ‘unique’ for individuals within a cluster
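
A minimal sketch of one-stage cluster sampling, assuming a made-up frame of 50 city blocks with 12 households each. The function name and data are hypothetical.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """clusters: dict mapping cluster id -> list of members.
    Select n_clusters whole clusters by simple random sampling and
    keep every member of each selected cluster."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {c: clusters[c] for c in chosen}

# Hypothetical frame: 50 blocks of 12 households.
clusters = {f"block{b}": [f"block{b}-house{h}" for h in range(12)]
            for b in range(50)}
chosen = cluster_sample(clusters, n_clusters=5, seed=7)
print({block: len(members) for block, members in chosen.items()})
```

Interviewing 60 households in 5 blocks is cheaper than visiting 60 scattered households, at the price of the extra variation noted above.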
Systematic sampling
Choosing, say, every 10th person in your data frame
Assumes no relationship between selection choice and sampling frame
Used in transportation studies…
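
Taking every 10th entry can be sketched as follows. The random starting point is an assumption added here (it is the usual way to give every entry a chance of selection), and the frame is a stand-in list of labels.

```python
import random

def systematic_sample(frame, step, seed=None):
    """Pick a random start among the first `step` entries,
    then take every step-th entry after it."""
    rng = random.Random(seed)
    start = rng.randrange(step)
    return frame[start::step]

frame = list(range(1, 101))  # hypothetical frame of 100 labelled people
sample = systematic_sample(frame, step=10, seed=3)
print(sample)
```

If the frame is ordered in a cycle that matches the step (for example, every 10th entry is a corner house), the sample inherits that pattern, which is why the ordering must be unrelated to what is being measured.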
Quota/booster sampling
Some groups are of particular interest
  E.g., in NZ, Maori/PI people
In SRS we will typically get smaller proportions of these people – as it will reflect the general population
So these people are contacted until pre-specified numbers are reached so we can do more in-depth analysis
Strictly speaking this is not a random sample
Motivating case study: crime & punishment
The sampling frame consists of all households in NZ
200 regions chosen randomly within 14 regional strata
5 households per region
Random adult chosen within each household
Sample design: “The sample design used by ACNielsen in the Ministry’s project is best described as a fully national multi-stage stratified probability sample with clustering.”
Quota/ Booster samples
“The main sample was supplemented with ‘booster’ samples of 250 Mäori and 250 Pacific Peoples adults aged 18 years and over.”…
Accuracy statements
Sampling Errors vs. Non sampling errors
Sampling errors
This is not an "error" in the sense of making a mistake. Rather, it is a measure of the possible range of approximation in the results because a sample was used
Interviews with a representative sample of 1,000 adults can accurately reflect the opinions of about 2 million NZ adults
This range of possible results is called the error due to sampling, often called the margin of error (MOE)
More on sampling – a heuristic presentation
Sampling errors
[Diagram: population distribution, e.g. income, with population mean μ; a sample drawn from it, with sample mean x̄]
Sampling error: the sample mean falls here only because certain randomly selected observations were included in the sample
Margin of error
A margin of error of 3% means that over the long run, 95% of the samples would give results within plus or minus 3% of the truth. 5% of the time the error would be greater
Quick method to calculate MOE for a proportion from a simple random sample:
$$\text{Margin of error} \approx \frac{1}{\sqrt{n}}$$
where n is the sample size.
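
The quick rule can be compared with the usual normal-approximation formula for a proportion, 1.96·√(p(1−p)/n), which is largest at p = 0.5 where it equals about 0.98/√n. The sketch below is only an illustration; the function names are made up, and both formulas assume a simple random sample.

```python
import math

def quick_moe(n):
    """Rule of thumb from the slide: 1 / sqrt(n)."""
    return 1 / math.sqrt(n)

def moe_95(n, p=0.5):
    """Normal-approximation 95% margin of error for a proportion p
    estimated from a simple random sample of size n."""
    return 1.96 * math.sqrt(p * (1 - p) / n)

for n in (100, 400, 1000, 2500):
    print(f"n={n:5d}  quick={quick_moe(n):.3f}  95%={moe_95(n):.3f}")
```

For n = 1,000 the quick rule gives roughly 3.2%. Halving the margin of error requires four times the sample size.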
Sampling errors
This does not address the issue of whether people cooperate with the survey, or if the questions are understood, or if any other methodological issue exists.
The sampling error is only the portion of the potential error in a survey introduced by using a sample rather than interviewing the entire population
Example: One News Colmar Brunton Poll
MOE: Based on the total sample of 1000 Eligible Voters, the maximum sampling error estimated is plus or minus 3.2%, expressed at the 95% confidence level
Looking for a difference between parties at any point in time
  Needs to be a difference of 2 × MOE = 6.4%
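
The 2 × MOE rule of thumb can be applied mechanically, as in the sketch below. The party shares are hypothetical, the function name is made up, and the MOE is taken from the quick 1/√n rule rather than from the pollster's own calculation.

```python
import math

def lead_is_clear(p1, p2, n):
    """Rule of thumb from the slide: the gap between two parties in the
    same poll should exceed 2 x MOE, with MOE approximated by 1/sqrt(n)."""
    moe = 1 / math.sqrt(n)
    return abs(p1 - p2) > 2 * moe

n = 1000
threshold = 2 / math.sqrt(n)  # about 6.3%; the slide's 6.4% doubles the rounded 3.2%
print(f"threshold: {threshold:.1%}")
print(lead_is_clear(0.45, 0.40, n))  # hypothetical 5-point gap
print(lead_is_clear(0.47, 0.39, n))  # hypothetical 8-point gap
```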
Meanwhile, in the US, Bush and approval
This chart plots all the different polls (grey dots) at once; the blue line is the estimated approval rate over time while the scatter of grey dots provides an estimate of the reliability of the blue line
Different polls are different random samples of the population
Random sampling is not fool-proof; any one sample has a chance, albeit small, to poorly represent the population. That's why the dots add greatly to the chart
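
The scatter of individual polls around the truth can be reproduced with a small simulation. Everything here is assumed for illustration: a true approval rate of 50%, polls of 1,000 people, and 2,000 repeated polls.

```python
import random

def simulate_polls(true_p=0.5, n=1000, polls=2000, seed=0):
    """Draw `polls` independent simple random samples of size n and
    return the estimated proportion from each one."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(polls):
        hits = sum(rng.random() < true_p for _ in range(n))
        estimates.append(hits / n)
    return estimates

estimates = simulate_polls()
moe = 1 / 1000 ** 0.5  # quick margin of error for n = 1,000
within = sum(abs(e - 0.5) <= moe for e in estimates) / len(estimates)
print(f"share of polls within +/-{moe:.3f} of the truth: {within:.3f}")
print(f"lowest and highest poll: {min(estimates):.3f}, {max(estimates):.3f}")
```

Most simulated polls land inside the margin of error, but a few do not, which is the point of plotting every poll rather than only the trend line.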
Non-sampling errors…
Process errors:
  Examples include measurement error, interviewer error, and processing error.
  These can be minimised by proper interviewer training, good questionnaire design, pre-testing, and careful management of the data recording process.
  The problem is most serious when a bias is created.
Errors in data acquisition:
  Selection bias
    Randomly select people – don’t let them/you select these people!!
  Non-response errors
    Anonymity, questionnaire design, relevance
    Call backs, substitution, re-weighting data
Motivating case study: crime & punishment
“In order to maximise the chances of obtaining interviews at initially-selected dwellings and to minimise replacement of dwellings, a maximum of three trips into any urban area and two trips into rural areas were permitted.”
“Up to six call-backs were made to a household before it was replaced …”
Non-sampling error
…then the sample mean is affected
[Diagram labels: Population, Sample; the gap between the two means is Sampling error + Non-sampling error]
Non-sampling errors
Never be fooled by the number of responses
  Literary Digest's non-representative (self-selection) sample of about 2.4 million people (from 10 million ballots mailed) said Landon would beat Roosevelt in the 1936 Presidential election
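
A small simulation shows why a huge self-selected sample can be worse than a small random one. All numbers are assumptions made up for illustration: 60% true support, and opponents three times as likely as supporters to send back a reply.

```python
import random

def simulate(population=200_000, true_support=0.60, seed=0):
    rng = random.Random(seed)
    # 1 = supports the candidate, 0 = does not.
    votes = [1 if rng.random() < true_support else 0 for _ in range(population)]
    # Assumed response rates: supporters 10%, opponents 30%.
    respond_prob = {1: 0.10, 0: 0.30}
    self_selected = [v for v in votes if rng.random() < respond_prob[v]]
    # A simple random sample of only 1,000 people from the same population.
    srs = rng.sample(votes, 1000)
    return {
        "truth": sum(votes) / len(votes),
        "self_selected_n": len(self_selected),
        "self_selected_estimate": sum(self_selected) / len(self_selected),
        "srs_estimate": sum(srs) / len(srs),
    }

result = simulate()
for key, value in result.items():
    print(f"{key}: {value:.3f}" if isinstance(value, float) else f"{key}: {value}")
```

Under these assumptions the self-selected estimate settles near one third however many replies arrive, while the small random sample stays close to the truth.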
Increasing sample size will not reduce all of the above types of errors!
Think long and hard about how any of these errors may occur
Dealing with non-sampling errors…
Mistakes – check/re-check data
  Rule of thumb – if it’s too good to be true, it is
Training of interviewers
Pilot questionnaire
Wording of the questions
Developing the questionnaire
A good questionnaire must:
Address the research questions of interest
Ask short, simple, and clearly-worded questions
Usually, start with demographic questions to help respondents get started comfortably
Use dichotomous and multiple-choice questions.
Be as short as possible
Use open-ended questions cautiously
Avoid using leading questions
  “Should a smack as part of good parental correction be a criminal offence in New Zealand?”
Pretest a questionnaire on a small number of people
Think about the way you intend to use the collected data when preparing the questionnaire
Questions will also depend on how you are getting the data e.g. CATI, person to person, mail/web
Using focus groups
If possible, focus groups are a great way to assist in questionnaire design
In-depth discussion led by a trained interviewer with a small group of people
Great way to understand the language used by people
Gets to the ‘qualities’ of interest
Can eliminate your biases/assumptions
Motivating case study: crime & punishment
What type of population are you sampling?
Consider a number of qualities respondents possess:
  Education (specifically reading level)
    Web/mail surveys
  Limits of attention
    avoid fatiguing respondents
    telephone surveys – very important
Motivation
  Why is the respondent going to participate, or not?
  Political polls
  Do I need incentives $$$$
Some types of questions
Reports of fact - self-disclosure of some objective information e.g., age, sex, education, behavior.
Ratings of opinion or preference - evaluative response to statement e.g., satisfaction, agreement, like/dislike.
Reports of intended behavior - self-disclosure of motivation or intention e.g., likeliness to purchase.
What type of response format is appropriate for each question?
Open-ended questions
  permit the subject freedom to answer the question in their own words, without pre-specified alternatives.
Open-ended questions
Advantages:
  Obtains unanticipated answers
  May better reflect respondent’s thoughts/beliefs
  Appropriate when list of possible answers is excessive
Disadvantages:
  Flexibility in responses difficult to code and analyse
  Provides incomplete or unintelligible answers
Closed-ended questions
Subject selects from list of pre-determined, acceptable responses
Can sometimes use an ‘Other (please specify)’ option
Types of closed-ended questions
Checklists - respondent selects certain number of pre-specified categories (nominal data)
  Types of Exercises:
    Aerobics
    Basketball
    Swimming
    Weightlifting
Two-way forced choice
respondent must select between two alternatives (crude ordinal/nominal)
  Do you always wake up before 8:00am?
    Yes
    No
Ranked
respondent must place items in order of importance or value (ordinal)
  Rank in order of importance:
    Career
    Social life
    Love life
    Children
Multiple-Choice (Likert scale)
respondent selects between range of alternatives along pre-specified continuum (ordinal/interval?)
Strongly Agree   Agree   Neutral   Disagree   Strongly Disagree
      1            2        3         4              5
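
When the answers are entered for analysis, the labels are usually converted to the 1 to 5 codes shown above. The sketch below is an illustration with invented answers. It summarises with the median and a frequency count, which are safe choices if the scale is treated as ordinal.

```python
import statistics
from collections import Counter

LIKERT = {
    "Strongly Agree": 1,
    "Agree": 2,
    "Neutral": 3,
    "Disagree": 4,
    "Strongly Disagree": 5,
}

def code_responses(answers):
    """Map Likert labels to their 1-5 codes; an unknown label raises KeyError."""
    return [LIKERT[a] for a in answers]

# Hypothetical answers to one question.
answers = ["Agree", "Strongly Agree", "Neutral", "Agree", "Disagree"]
codes = code_responses(answers)
print(codes)
print("median code:", statistics.median(codes))
print("frequencies:", Counter(answers))
```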
Closed-ended questions
Advantages:
  Obtains more reliable answers
  Responses more meaningful to researcher
  Straightforward analysis
Disadvantages:
  Answers relative to response scale provided
  Respondent's choice may not be among listed alternatives
  Choices listed communicate kind of response wanted
Writing good survey questions
Differences in answers should stem from differences among respondents rather than differences in the stimuli
Question's wording is obviously a central part of the stimulus
Simple sentences
No double negatives
  It is not the case that I have never cheated on my tax returns
Eliminate vagueness or poorly-defined terms
  How many times in the past year have you talked with a doctor about your health?
Objectionable/irrelevant questions
  How old are you?
Discrete questions/responses
Exhaustive/mutually exclusive categories
  How did you last travel to the supermarket?
    car, bus, foot, walking, public transportation
Limit response format (7±2)
Even vs. odd categories
Allow expression of variability
Strongly Agree   Agree   Disagree   Strongly Disagree
Strongly Agree   Agree   Neutral   Disagree   Strongly Disagree
Match response to item
Frequency (Never-All the time)
Likert Scaling (Disagree-Agree)
Quality (Poor-Excellent)
Service (Not Well-Extremely Well)
Overall format
General to specific order of questions
Employ "filtering" questions (If “Yes”)
Mix question/response types to remove response bias
Minimise judgment and emphasise accuracy (social desirability)
Example: One News Colmar Brunton Poll
Party Support
  “Under MMP you get two votes. One is for a political party and is called a party vote. The other is for your local M.P. and is called an electorate vote.”
Party Vote*
  “Firstly thinking about the Party Vote which is for a political party. Which political party would you vote for?”
  IF DON’T KNOW – “Which one would you be most likely to vote for?”
Always seek others’ advice
Pre-test on colleagues
Ask for outside advice
Run a pilot study
After a while you can become too close to the subject and a fresh perspective is needed
Presenting the results
The data is the story, not the graph
Published stats/proxies example: Race and politics in New Caledonia
Motivating case study: crime & punishment
“Only 5% of the sample were within the correct range in their estimate of the amount of violent crime.”
[Figure: Violent crime perception. Bar chart of % of respondents by estimated violent crime per 100 incidents, in bands from <=10% up to 80-89; actual rate 10%]
* Data made to fit original numbers – pseudo-fictitious
Motivating case study: crime & punishment
What are they trying to report here?
Order tables by most common crime to least
See if there are any changes over the years
Don’t use a 3D object when you are presenting 1D info.
Motivating case study: crime & punishment
The story:
[Figure: % reported crime by type of crime. Bar chart of % (y-axis, 0 to 70) by crime group (x-axis): Dishonesty, Drugs and antisocial, Violence, Property Damage, Property Abuse, Administrative, Sexual.]
Lessons
Here, the real story was that people, on average, believed the violent crime rate to be 5x worse than what is actually reported
Just because you can produce a pretty graph doesn’t mean you should
The simplest graph shows the real story
You don’t have to be boring
Graphical excellence
Show the data
Make the viewer consider the substance rather than the form
Avoid distortion
Present many numbers concisely
Make large datasets coherent
Graphical excellence…
Make your graphics friendly:
  Avoid abbreviations and encodings
  Run words left-to-right
  Explain data with little messages
  Label the graphic directly; don’t use elaborate shadings and a complex legend
  Avoid red/green distinctions
  Use clean serif fonts in mixed case
Tabular displays
CRIME GROUP (%)         1998  1999  2000
Dishonesty                63    61    60
Drugs and antisocial      12    13    13
Violence                   9     9    10
Property Damage            8     9    10
Property Abuse             5     5     5
Administrative             3     3     3
Sexual                     1     1     1
Tabular excellence
Encourage comparisons
Reveal the data at several levels of detail
Serve a clear purpose: description, exploration …
Be closely integrated with the text
Tabular excellence…
Round drastically
Arrange the numbers to be compared in columns, not rows
Order the columns by size (or some other natural ordering)
Use row and column averages as a focus
Provide verbal summaries
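As a rough illustration of these rules, the sketch below takes the reported-crime percentages from the "Tabular displays" slide, orders the crime groups by size, adds a rounded row average, and produces a one-line verbal summary. The function names (`build_rows`, `render`, `summary`) are made up for this example, and only row averages are shown. Column averages are left out to keep it short.

```python
# Percentages copied from the "Tabular displays" slide (1998, 1999, 2000),
# deliberately entered out of order so the sorting step has work to do.
YEARS = (1998, 1999, 2000)
CRIME_PCT = {
    "Sexual": (1, 1, 1),
    "Violence": (9, 9, 10),
    "Dishonesty": (63, 61, 60),
    "Property Abuse": (5, 5, 5),
    "Drugs and antisocial": (12, 13, 13),
    "Administrative": (3, 3, 3),
    "Property Damage": (8, 9, 10),
}


def build_rows(data):
    """Return (group, yearly values, rounded row average), largest average first."""
    ordered = sorted(
        data.items(),
        key=lambda item: sum(item[1]) / len(item[1]),
        reverse=True,
    )
    return [(group, vals, round(sum(vals) / len(vals))) for group, vals in ordered]


def render(rows, years):
    """Lay out the table so the numbers to be compared run down columns."""
    header = f"{'CRIME GROUP (%)':<22}" + "".join(f"{y:>6}" for y in years) + f"{'Avg':>6}"
    lines = [header]
    for group, vals, avg in rows:
        lines.append(f"{group:<22}" + "".join(f"{v:>6}" for v in vals) + f"{avg:>6}")
    return "\n".join(lines)


def summary(rows, years):
    """One-sentence verbal summary of the largest group."""
    group, _, avg = rows[0]
    return (f"{group} is the largest group, averaging about {avg}% "
            f"of reported crime over {years[0]} to {years[-1]}.")


rows = build_rows(CRIME_PCT)
print(render(rows, YEARS))
print(summary(rows, YEARS))
```

Sorting uses the unrounded average, so near-ties such as Violence and Property Damage are ordered correctly before rounding for display.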
Getting it right …
Presentations largely stand or fall on the quality, relevance, and integrity of the content. If your numbers are boring, then you've got the wrong numbers. If your words or images are not on point, making them dance in colour won't make them relevant. Audience boredom is usually a content failure, not a decoration failure.
Edward Tufte, writing in Wired Magazine, Sept 2003
Managing the beast
Keep it simple
Bad survey statement:
"We want to establish fiscal parameters in the customer decision making process in the plumbing and bathroom products arenas, testing price points and elasticity. After gaining this information, we will analyze its effects on marketing strategies and tactics."
Keep it simple
Good survey statement:
"We want to know how much customers are willing to pay for sinks to see if we can make more money."
The clearer you see the target, the more easily you can see if you hit it or not.
Always communicate
Always discuss:
  Your goals
  What you know/don’t know
  What you need
Give clear expectations/timelines
Be flexible – situations change
Always communicate …
Be prepared to make mistakes
Fix them quickly
Be honest
Assume nothing
If anything can go wrong, it will
Does “anal retentive” have a hyphen in it?
Always communicate …
Ask for assistance
Use professional data collection/research agencies
They are the experts
The process
STAGE 1: RESEARCH DEFINITION
  Understand the Problem
  Identify Questions
  Refine/Revise Questions
STAGE 2: RESEARCH PLAN/DESIGN
  Choose Design
  Inventory Resources
  Assess Feasibility
  Determine Trade-offs
Is it worth all the effort?
Compared to the alternative? Yes.
Because reputable surveying organisations consistently do good work
In spite of the difficulties, correctly conducted surveys are still the best objective measure of the views of the population of interest
A quote …
“Why do they call it common sense?
It isn’t that common.”
- Mark Twain
Fini
Presentation
Crime & punishment case study
‘Big drink’ proposal
Statistics NZ’s A Guide To Good Survey Design