The Survey Cycle

How I Learned to Stop Worrying and Love the Survey Cycle DEPARTMENT OF STATISTICS The Survey Cycle Sampling Overview

Transcript of The Survey Cycle

Page 1: The Survey Cycle

How I Learned to Stop Worrying and Love the Survey Cycle

DEPARTMENT OF STATISTICS

The Survey Cycle

Sampling Overview

Page 2: The Survey Cycle

A quote …

“Why do they call it common sense?

It isn’t that common.”

- Mark Twain

Page 3: The Survey Cycle

The brief

Intro/first considerations

Contracting out surveys

Survey management

Sampling issues

Questionnaire development

Pilot surveys/Sources of error

Data collection/processing

Data presentation

Completing the loop

Page 4: The Survey Cycle

Major themes

First considerations

Who do I need to survey?

How do I get representative samples?

Representative sampling strategies

Accuracy statements

Developing the questionnaire

Presenting the results

How do I manage this beast?

Page 5: The Survey Cycle

Excellent online resources

www.stats.govt.nz/NR/rdonlyres/CA923AA8-BDF6-4EAD-834F-573F04EEF7A9/0/AGuidetoagoodSurvey.pdf

www.perseus.com/surveytips/Survey_101.htm

www.whatisasurvey.info

Page 6: The Survey Cycle

The process

STAGE 1: RESEARCH DEFINITION

Understand the Problem

Identify Questions

Refine/Revise Questions

STAGE 2: RESEARCH PLAN/DESIGN

Choose Design

Inventory Resources

Assess Feasibility

Determine Trade-offs

Page 7: The Survey Cycle

Page 8: The Survey Cycle

First considerations

Page 9: The Survey Cycle

Motivating case study: crime & punishment

“The report presents the findings of the first comprehensive national survey of the views of a sample of adult New Zealanders about crime and the criminal justice system’s response to crime.”

Page 10: The Survey Cycle

Motivating case study: crime & punishment

…“the survey results were available to the Ministry’s policy staff working on the sentencing and parole reforms.”

Page 11: The Survey Cycle

Motivating case study: crime & punishment

“Since the survey was conducted in 1999, a major reform of the sentencing and parole regimes in New Zealand has taken place, with the commencement of the Sentencing Act 2002 and the Parole Act 2002 on 30 June 2002.”

Page 12: The Survey Cycle

What do you want to achieve?

What are the objectives?

What are the critical questions to be answered?

How will the results be used?

How will the results be communicated?

Page 13: The Survey Cycle

“Fools rush in where angels fear to tread...”

Do I have to do a survey?

Has this been done by someone else?

Literature search

Published statistics/other government agencies

Surrogate information (proxies)

Expert advice

Page 14: The Survey Cycle

Motivating case study: crime & punishment

1 Introduction

1.1 National surveys overseas

1.2 Research at home

1.3 The present study

Page 15: The Survey Cycle

Published stats/proxies example: Race and politics in New Caledonia

Recent presidential election in France, and therefore New Caledonia

Nicolas Sarkozy and Ségolène Royal

Anecdotal evidence suggests Kanaks (Melanesians) were more likely to vote for Ségolène Royal

Election results available by region

No ethnicity question in latest census (2004) – Chirac banned it

Page 16: The Survey Cycle

Published stats/proxies example: Race and politics in New Caledonia

NC’s statisticians have come up with a ‘proxy’ measure

% of people (14+ years) by administrative region who speak a Melanesian language

Voting data available from “Les Nouvelles” newspaper

Page 17: The Survey Cycle

Published stats/proxies example: Race and politics in New Caledonia

Page 18: The Survey Cycle

Published stats/proxies example: Race and politics in New Caledonia

[Figure: scatter plot, “% Voted for Sarkozy (who voted) vs % Speak Melanesian Language”. x-axis: % speak Melanesian; y-axis: % voted Sarkozy (of those who voted). Fitted line: R² = 85%]
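The R² quoted on this chart measures how much of the region-to-region variation in the Sarkozy vote a straight-line fit on the language proxy accounts for. A minimal Python sketch of that calculation follows. The regional percentages in it are made up for illustration and are not the New Caledonia figures, and `r_squared` is a hypothetical helper name.

```python
def r_squared(x, y):
    """R-squared of a least-squares straight-line fit of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    # For a simple linear fit, R-squared equals the squared correlation.
    return (sxy ** 2) / (sxx * syy)


# Invented regional percentages (assumption, for illustration only).
pct_speak_melanesian = [5, 20, 40, 60, 80, 95]
pct_voted_sarkozy = [85, 70, 62, 35, 30, 10]

print(round(r_squared(pct_speak_melanesian, pct_voted_sarkozy), 2))
```

A high R² here only says the proxy tracks the vote at the regional level; it does not by itself show how individuals voted.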

Page 19: The Survey Cycle

Failing this, I will need to conduct a survey

Population → (select) → Sample

Statistic → (estimate) → Parameter

Statistics: sample proportion, sample mean

Parameters: true proportion, true mean
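The relationship between a parameter and a statistic can be sketched in a few lines of Python. This is an illustration under stated assumptions: the population of 100,000 people and its true proportion of 0.30 are invented, and `sample_proportion` is a hypothetical helper.

```python
import random


def sample_proportion(population, n, rng):
    """Draw a simple random sample of size n and return the sample proportion."""
    sample = rng.sample(population, n)
    return sum(sample) / n


rng = random.Random(1)

# Invented population: 1 = has the attribute, 0 = does not (assumption).
population = [1] * 30_000 + [0] * 70_000

true_p = sum(population) / len(population)      # the parameter
p_hat = sample_proportion(population, 1000, rng)  # the statistic

print("true proportion:", true_p)
print("sample proportion:", p_hat)
```

In practice the true proportion is unknown; the survey exists precisely because only the sample value can be observed.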

Page 20: The Survey Cycle

Motivating case study: crime & punishment

“While no nation-wide survey focussing solely on attitudes towards crime and criminal justice issues has previously been conducted in New Zealand, some studies have touched on related topics. For example, in 1996, the National Survey of Crime Victims (Young et al. 1997)”….

Page 21: The Survey Cycle

Page 22: The Survey Cycle

Who do I need to survey?

Page 23: The Survey Cycle

Who do I need to survey?

Define who your target population is. Examples:

Main household purchaser

Eligible voters

Recent insurance claimant

Page 24: The Survey Cycle

Motivating case study: crime & punishment

The sample comprised 1,000 interviews amongst the general population aged 18 years and over (the main sample)

Person-to-person survey was conducted…

Page 25: The Survey Cycle

How do I need to survey?

Types of surveys: the three most common types are

mail/web surveys

telephone surveys

person-to-person interviews

Page 26: The Survey Cycle

Types of surveys

Survey costs are lowest for mail/web surveys

More expensive for telephone surveys

Most expensive for personal interviews

With well-trained interviewers, higher response rates and longer questionnaires are possible with personal interviews

The design of the questionnaire is critical

Page 27: The Survey Cycle

Web survey example:

Page 28: The Survey Cycle

Telephone survey example

METHOD: Conducted by CATI (Computer Assisted Telephone Interviewing)

Page 29: The Survey Cycle

How much $$$ is needed?

Communication with Consumer Link

Page 30: The Survey Cycle

How much $$$ is needed?

Page 31: The Survey Cycle

How do I sample these people?

Non-representative samples

Send letters out/web requests, 0800/0900 telephone requests – wait for replies

Self-selection bias

Convenience/judgment/snowball sampling

Page 32: The Survey Cycle

Non-representative samples

Sampling cost is lower and implementation easier

Statistically valid statements cannot be made about the precision of the estimates

There is some information, but it cannot be ‘retro-fitted’ to a different population

Why? You have no idea if the respondents are ‘representative’ of the people you are interested in.

Page 33: The Survey Cycle

Non-representative samples: Disaster

To prepare for her book Women and Love (1987), Shere Hite:

sent questionnaires to 100,000 women asking about love, sex, and relationships

4.5% responded

Hite used those responses to write her book

Page 34: The Survey Cycle

Non-representative samples: Disaster

Moore (Statistics: Concepts and Controversies, 1997) noted:

respondents “were fed up with men and eager to fight them…”

“the anger became the theme of the book…”

“but angry women are more likely” to respond

Page 35: The Survey Cycle

Selection bias

When parts of the population cannot be selected, the sample cannot represent the whole population.

[Diagram: Population and Sample]

Page 36: The Survey Cycle

How do I get representative samples?

Page 37: The Survey Cycle

Representative samples

The method used to pick interviewees relies on the bedrock of random sampling: when the chance of selecting each person in the target population is known, then, and only then, do the results of the sample survey reflect the entire population

This is the reason that interviews with 1,000 NZ adults can accurately reflect the opinions of roughly 2 million NZ adults

Page 38: The Survey Cycle

Representative = random sample

Each person in a population has a KNOWN RANDOM PROBABILITY of being selected

Distribute yourselves randomly in the room

E.g. randomly choose ½ of the people here today

How?

Page 39: The Survey Cycle

Representative samples: sample frames

A critical element in any survey is to locate (or “cover”) all the members of the population being studied so that they have a chance to be sampled.

To achieve this, a list - termed a “sampling frame” - is usually constructed

The quality of the sampling frame is probably the dominant feature for ensuring adequate coverage of the desired population.

Page 40: The Survey Cycle

Sample frames

Any procedure and data that effectively enables the selection of a sample

Good frames require development and maintenance efforts

E.g. Statistics NZ runs an annual survey (the Annual Business Frame Update Survey) simply to update their Business Frame

Page 41: The Survey Cycle

Sample frames

Most frames are imperfect, exhibiting:

Undercoverage

Duplicated units (perhaps under different spellings or ID numbers)

Out-of-date or missing data

[Diagram: Population and Sample frame]

Page 42: The Survey Cycle

Telephone sampling of households

Under-coverage is a fundamental problem for telephone surveys of households

Only 92% of households have a land-line

Less than 80% of Maori or Pacific households have one

Households without phones also differ in other ways; e.g. they are generally low-income households

Duplicates also occur: some households have more than one phone number, and thus have more chance of being selected
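One common way to compensate for duplicate listings is to weight each responding household by the inverse of its number of phone lines, so that households that were easier to reach count for less. The sketch below illustrates the idea only; the answers and line counts are invented, and `weighted_proportion` is a hypothetical helper rather than a method taken from these slides.

```python
def weighted_proportion(answers, phone_lines):
    """Proportion of 'yes' answers, weighting each household by 1 / phone lines."""
    weights = [1 / lines for lines in phone_lines]
    weighted_yes = sum(w * a for w, a in zip(weights, answers))
    return weighted_yes / sum(weights)


# Invented data: 1 = yes, 0 = no; the last two households have two lines each.
answers = [1, 0, 1, 1]
phone_lines = [1, 1, 2, 2]

unweighted = sum(answers) / len(answers)
weighted = weighted_proportion(answers, phone_lines)

print("unweighted:", unweighted)
print("weighted:", round(weighted, 3))
```

This adjusts for duplicates only. It does nothing for households with no land-line at all, since they can never appear in the sample.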

Page 43: The Survey Cycle

Telephone sampling frames …

White Pages

Telecom sells random samples of listed numbers

Unlisted numbers are not included, so another 15% of phone numbers are lost

May be cheaper to use paper directories instead, but these are out of date (even when just distributed)

Page 44: The Survey Cycle

Telephone sampling frames …

Random digit dialing (RDD)

Naïve approach: list all possible numbers, and select at random

Many non-working numbers - success rate <10%

Page 45: The Survey Cycle

Telephone sampling frames …

Better approaches, e.g. Mitofsky-Waksberg

Take banks of possible phone numbers, and select phone numbers more intensively from banks that have larger proportions of listed numbers

Increased hit rate to 60% in US

Pseudo-RDD methods using banks centered on valid “seed” phone numbers are sometimes used
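A simplified sketch of the two-stage idea behind Mitofsky-Waksberg sampling, as it is usually described in textbooks: dial one random number in a randomly chosen bank, keep the bank only if that number is residential, then keep dialling within retained banks. Everything concrete here is an assumption for illustration: the simulated banks, the `is_residential` stand-in for actually dialling, and the parameter names `n_banks` and `k`.

```python
import random


def mitofsky_waksberg(banks, is_residential, n_banks, k, rng):
    """Two-stage sketch.

    banks: dict mapping a bank id to the list of phone numbers in that bank.
    is_residential: callable standing in for "dial the number and see".
    n_banks: how many banks to retain; k: extra residential numbers per bank.
    """
    sample = []
    retained = 0
    bank_ids = sorted(banks)
    rng.shuffle(bank_ids)            # visit banks in random order
    for bank_id in bank_ids:
        if retained == n_banks:
            break
        numbers = list(banks[bank_id])
        rng.shuffle(numbers)         # dial within the bank in random order
        first, rest = numbers[0], numbers[1:]
        if not is_residential(first):
            continue                 # stage 1 miss: drop the whole bank
        retained += 1
        hits = [first]
        for number in rest:          # stage 2: keep dialling in this bank
            if len(hits) == k + 1:
                break
            if is_residential(number):
                hits.append(number)
        sample.extend(hits)
    return sample


# Simulated frame: bank 0 is dead, bank 1 is 80% residential, bank 2 is 20%.
rng = random.Random(0)
banks = {b: [b * 100 + i for i in range(100)] for b in range(3)}
residential = set(banks[1][:80]) | set(banks[2][:20])

sample = mitofsky_waksberg(banks, residential.__contains__, n_banks=1, k=4, rng=rng)
print(sample)
```

Because a bank survives stage 1 only when a random dial hits a residential number, banks rich in residential numbers are retained more often, which is where the higher hit rate comes from.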

Page 46: The Survey Cycle

Household sampling for in-home surveys

Multi-stage approach widely used

Area sample: take a list of areas and select a sample of areas

38,366 mesh blocks in NZ

Geostatistical System

Page 47: The Survey Cycle

Household sampling for in-home surveys

Household sample

Interviewers list all dwellings within selected mesh-blocks (following mesh-block maps)

Sample of households selected in each area

Variations on this approach exist: random route within area (i.e. route follows rules from random starting point), or ignoring area boundaries

Page 48: The Survey Cycle

Motivating case study: crime & punishment

“The main sample comprising 1006 adults was drawn from 1500 households in 14 locations throughout New Zealand.”

“The locations were defined in terms of region and area type and were designed to ensure a fully representative cross-section of the New Zealand population aged 18 years and over.”

Page 49: The Survey Cycle

Motivating Case Study: Crime & Punishment

The population consists of all households in NZ

Sampling frame = area units

200 regions chosen randomly within 14 regional strata

5 households per region

Random adult chosen within each household
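The stages listed above (strata, then areas, then households, then one adult) can be sketched as nested random selections. This is a toy illustration, not the ACNielsen procedure: the frame below is invented and far smaller than the real one, and `multistage_sample` and its parameter names are hypothetical.

```python
import random


def multistage_sample(frame, areas_per_stratum, households_per_area, rng):
    """frame: {stratum: {area: {household: [adults]}}}.

    Returns one (stratum, area, household, adult) tuple per selected household.
    """
    selected = []
    for stratum, areas in frame.items():
        # Stage 1: random areas within each stratum.
        for area in rng.sample(sorted(areas), areas_per_stratum):
            households = areas[area]
            # Stage 2: random households within each selected area.
            for household in rng.sample(sorted(households), households_per_area):
                # Stage 3: one random adult within the household.
                adult = rng.choice(households[household])
                selected.append((stratum, area, household, adult))
    return selected


rng = random.Random(0)

# Toy frame (assumption): 2 strata x 4 areas x 6 households, 1 to 3 adults each.
frame = {
    f"stratum_{s}": {
        f"area_{s}_{a}": {
            f"household_{s}_{a}_{h}": [
                f"adult_{s}_{a}_{h}_{p}" for p in range(rng.randint(1, 3))
            ]
            for h in range(6)
        }
        for a in range(4)
    }
    for s in range(2)
}

sample = multistage_sample(frame, areas_per_stratum=2, households_per_area=3, rng=rng)
print(len(sample), "adults selected")
```

Note that picking one adult per household gives adults in larger households a smaller chance of selection, which is one reason such designs are usually analysed with weights.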

Page 50: The Survey Cycle

Business frames

Business Directory

Excellent frame held by Statistics NZ

Contained 278,000 non-farming enterprises in Feb ‘01

Not available for market research surveys

Other business frames are marketing databases

Dun & Bradstreet, UBD, Yellow Pages

Page 51: The Survey Cycle

Page 52: The Survey Cycle

Representative sampling strategies

Page 53: The Survey Cycle

Types of representative sampling strategies

Simple random sampling

Stratified random sampling

Cluster sampling

Systematic sampling

Quota/booster sampling

Combinations of the above

Multistage sampling

Page 54: The Survey Cycle

Simple random sampling

Allocate labels 1, 2, …, N to the population

Randomly select a sample of size n from these labels using random numbers

This ensures that each element in the sampled population has the same probability of being selected.
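The labelling procedure above maps directly onto a few lines of Python. A minimal sketch, assuming a population of N = 5000 labelled units (an invented figure) and a hypothetical helper name `simple_random_sample`:

```python
import random


def simple_random_sample(N, n, rng):
    """Select n distinct labels from 1..N, each equally likely to be chosen."""
    return rng.sample(range(1, N + 1), n)


rng = random.Random(0)  # fixed seed so the draw can be reproduced
labels = simple_random_sample(N=5000, n=10, rng=rng)
print(sorted(labels))
```

The selected labels are then matched back to the sampling frame to find the people or households to contact.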

Page 55: The Survey Cycle

Stratified simple random sampling

The population is first divided into sub-groups, called strata

Take a random sample from each stratum

The basis for forming the various strata depends on the amount of information known about the sample frame

Can lead to more accurate estimates

Page 56: The Survey Cycle

Stratified simple random sampling…

Strata can be region of country (rural/urban), as used in political polls

Other auxiliary information, e.g. sex, income, age…

Especially useful for customer databases

If you sample in direct proportion to strata size, you reduce variation in estimates
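Sampling in direct proportion to stratum size (proportional allocation) can be sketched as follows. The urban/rural frame and its sizes are invented for illustration, and both function names are hypothetical.

```python
import random


def proportional_allocation(strata_sizes, n):
    """Split a total sample of n across strata in proportion to their sizes."""
    total = sum(strata_sizes.values())
    # Rounding can make the pieces sum to slightly more or less than n.
    return {name: round(n * size / total) for name, size in strata_sizes.items()}


def stratified_sample(frame_by_stratum, n, rng):
    """Draw a simple random sample inside each stratum, sized proportionally."""
    sizes = {name: len(units) for name, units in frame_by_stratum.items()}
    allocation = proportional_allocation(sizes, n)
    return {
        name: rng.sample(units, allocation[name])
        for name, units in frame_by_stratum.items()
    }


rng = random.Random(0)

# Invented frame: 8000 urban and 2000 rural units.
frame = {
    "urban": [f"U{i}" for i in range(8000)],
    "rural": [f"R{i}" for i in range(2000)],
}

sample = stratified_sample(frame, 100, rng)
print({name: len(units) for name, units in sample.items()})
```

With these sizes the 100 interviews split 80 urban to 20 rural, mirroring the frame, so no stratum is over- or under-represented by chance.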

Page 57: The Survey Cycle

Cluster sampling

Cluster sampling requires that the population be divided into N groups of elements called clusters.

We then select a simple random sample of n clusters.

A primary application of cluster sampling involves area sampling, where the clusters are counties, city blocks, or other well-defined geographic sections.

Can increase variation, because individuals within the same cluster tend to be similar, so each extra person adds less ‘unique’ information
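A minimal sketch of one-stage cluster sampling: choose whole clusters by simple random sampling, then take everyone in them. The ten "blocks" of four people are invented, and `cluster_sample` is a hypothetical helper name.

```python
import random


def cluster_sample(clusters, n_clusters, rng):
    """Pick n_clusters whole clusters at random and return every member of them."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    members = [person for name in chosen for person in clusters[name]]
    return chosen, members


rng = random.Random(0)

# Invented population: 10 city blocks with 4 people each.
clusters = {
    f"block_{b}": [f"block_{b}_person_{i}" for i in range(4)] for b in range(10)
}

chosen, members = cluster_sample(clusters, 3, rng)
print(chosen)
print(len(members), "people interviewed")
```

The appeal is practical: interviewers visit 3 blocks instead of travelling to people scattered over all 10.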

Page 58: The Survey Cycle

Systematic sampling

Choosing, say, every 10th person in your sampling frame

Assumes the order of the sampling frame is unrelated to what is being measured

Used in transportation studies…
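Taking every 10th unit is usually done from a random starting point, so that every unit still has a chance of selection. A short sketch, assuming a frame of 100 numbered units (invented) and a hypothetical helper name `systematic_sample`:

```python
import random


def systematic_sample(frame, k, rng):
    """Take every k-th unit of the frame, starting at a random position below k."""
    start = rng.randrange(k)
    return frame[start::k]


rng = random.Random(0)
frame = list(range(1, 101))  # invented frame: units numbered 1 to 100
sample = systematic_sample(frame, 10, rng)
print(sample)
```

If the frame has a cycle that matches the step (for example every 10th dwelling being a corner house), the sample can be badly unrepresentative, which is the ordering assumption noted above.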

Page 59: The Survey Cycle

Quota/booster sampling

Some groups are of particular interest, e.g. Maori/Pacific Island people in NZ

A simple random sample (SRS) will typically contain only small numbers of these people, because it reflects the general population

So these people are contacted until pre-specified numbers are reached, so that we can do more in-depth analysis

Strictly speaking this is not a random sample

Page 60: The Survey Cycle

Motivating case study: crime & punishment

The sampling frame consists of all households in NZ

200 regions chosen randomly within 14 regional strata

5 households per region

Random adult chosen within each household

Page 61: The Survey Cycle

Motivating case study: crime & punishment

The sampling frame consists of all households in NZ

200 Regions chosen randomly within 14 regional strata

Page 62: The Survey Cycle

Motivating case study: crime & punishment

Sample design: “The sample design used by ACNielsen in the Ministry’s project is best described as a fully national multi-stage stratified probability sample with clustering.”

Page 63: The Survey Cycle

Motivating case study: crime & punishment

Quota/ Booster samples

“The main sample was supplemented with ‘booster’ samples of 250 Mäori and 250 Pacific Peoples adults aged 18 years and over.”…

Page 64: The Survey Cycle

Accuracy statements

Sampling errors vs. non-sampling errors

Page 65: The Survey Cycle

Sampling errors

This is not an "error" in the sense of making a mistake. Rather, it is a measure of the possible range of approximation in the results because a sample was used

Interviews with a representative sample of 1,000 adults can accurately reflect the opinions of roughly 2 million NZ adults

This range of possible results is called the error due to sampling, often called the margin of error (MOE)

Page 66: The Survey Cycle

More on sampling – a heuristic presentation

Page 67: The Survey Cycle

Sampling errors

[Figure: population distribution (e.g. income) with the population mean μ marked, and a sample drawn from it with the sample mean x̄ marked. The gap between x̄ and μ is the sampling error.]

Sampling error: the sample mean falls here only because certain randomly selected observations were included in the sample

Page 68: The Survey Cycle

Margin of error

A margin of error of 3% means that over the long run, 95% of the samples would give results within plus or minus 3% of the truth. 5% of the time the error would be greater

Quick method to calculate MOE for a proportion from a simple random sample:

$$\text{Margin of error} \approx \frac{1}{\sqrt{n}}$$

where n is the sample size.
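The quick method can be checked against the usual 95% formula for a proportion, 1.96 × √(p(1 − p)/n), which is largest at p = 0.5. The sketch below compares the two; the function names are hypothetical, and the standard formula is a textbook result rather than something stated on the slide.

```python
import math


def quick_moe(n):
    """Quick margin of error for a proportion: 1 / sqrt(n)."""
    return 1 / math.sqrt(n)


def moe_95(p, n):
    """Standard 95% margin of error for a proportion p from an SRS of size n."""
    return 1.96 * math.sqrt(p * (1 - p) / n)


for n in (100, 400, 1000):
    print(n, round(quick_moe(n), 3), round(moe_95(0.5, n), 3))
```

For n = 1000 the quick method gives about 3.2%, slightly above the standard value at p = 0.5, so it errs on the cautious side. Quadrupling the sample size only halves the margin of error.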

Page 69: The Survey Cycle

Sampling errors

This does not address the issue of whether people cooperate with the survey, or if the questions are understood, or if any other methodological issue exists.

The sampling error is only the portion of the potential error in a survey introduced by using a sample rather than interviewing the entire population

Page 70: The Survey Cycle

Example: One News Colmar Brunton Poll

Page 71: The Survey Cycle

Example: One News Colmar Brunton Poll

MOE: Based on the total sample of 1000 Eligible Voters, the maximum sampling error estimated is plus or minus 3.2%, expressed at the 95% confidence level

When looking for a difference between parties at any point in time, the gap needs to be at least 2 × MOE = 6.4 percentage points
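The rule of thumb for comparing two parties in the same poll can be written as a one-line check. This sketch uses the quick 1/√n margin of error, and the party shares are invented; `lead_is_clear` is a hypothetical helper name.

```python
import math


def lead_is_clear(share_a, share_b, n):
    """True when the gap between two shares exceeds twice the quick MOE."""
    moe = 1 / math.sqrt(n)
    return abs(share_a - share_b) > 2 * moe


# Invented poll results with n = 1000, so 2 x MOE is about 6.3 points.
print(lead_is_clear(0.45, 0.40, 1000))  # 5-point gap
print(lead_is_clear(0.48, 0.40, 1000))  # 8-point gap
```

A 5-point gap falls inside twice the margin of error and an 8-point gap falls outside it, so only the second would count as a real lead under this rule.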

Page 72: The Survey Cycle

Example: One News Colmar Brunton Poll

Page 73: The Survey Cycle

Meanwhile, in the US, Bush and approval

Page 74: The Survey Cycle

Meanwhile, in the US, Bush and approval

This chart plots all the different polls (grey dots) at once; the blue line is the estimated approval rate over time, while the scatter of grey dots provides an estimate of the reliability of the blue line

Different polls are different random samples of the population

Random sampling is not fool-proof; any one sample has a chance, albeit small, to poorly represent the population. That's why the dots add greatly to the chart

Page 75: The Survey Cycle

Non-sampling errors…

Process errors:

Examples include measurement error, interviewer error, and processing error.

They can be minimised by proper interviewer training, good questionnaire design, pre-testing, and careful management of the data recording process.

The problem is most serious when a bias is created.

Page 76: The Survey Cycle

Non-sampling errors…

Errors in data acquisition:

Selection bias

Randomly select people – don’t let them/you select these people!!

Non-response errors

Anonymity, questionnaire design, relevance

Call backs, substitution, re-weighting data

Page 77: The Survey Cycle

Motivating case study: crime & punishment

“In order to maximise the chances of obtaining interviews at initially-selected dwellings and to minimise replacement of dwellings, a maximum of three trips into any urban area and two trips into rural areas were permitted.”

“Up to six call-backs were made to a household before it was replaced …”

Page 78: The Survey Cycle

…then the sample mean is affected

Non-sampling error

Sampling error + Non–sampling error

Population

Sample

Page 79: The Survey Cycle

Non-sampling errors

Never be fooled by the number of responses

The Literary Digest's non-representative (self-selected) poll, with about 10 million ballots mailed out and roughly 2.4 million returned, said Landon would beat Roosevelt in the 1936 Presidential election. Roosevelt won by a landslide
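The point that sheer sample size does not cure self-selection can be illustrated with a small simulation. Everything in this sketch is invented for illustration and none of it is the 1936 data: the population of 100,000, the 60/40 split, and the response rates of 10% and 30%.

```python
import random

rng = random.Random(42)

# Invented population: 1 = supports candidate A, 0 = supports candidate B.
population = [1] * 60_000 + [0] * 40_000


def self_selected(population, rate_a, rate_b, rng):
    """Each person decides whether to respond; rates differ by opinion."""
    return [v for v in population if rng.random() < (rate_a if v else rate_b)]


def support_for_a(sample):
    return sum(sample) / len(sample)


# Large self-selected sample: B supporters are three times as likely to reply.
biased = self_selected(population, rate_a=0.10, rate_b=0.30, rng=rng)

# Small simple random sample of 1000.
srs = rng.sample(population, 1000)

print("true support for A:", support_for_a(population))
print("self-selected, n =", len(biased), "->", round(support_for_a(biased), 3))
print("random sample, n =", len(srs), "->", round(support_for_a(srs), 3))
```

The self-selected sample is many times larger yet points to the wrong winner, while the random sample of 1000 lands close to the true 60%.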

Page 80: The Survey Cycle

Non-sampling errors

Increasing sample size will not reduce all of the above types of errors!

Think long and hard about how any of these errors may occur

Page 81: The Survey Cycle

Dealing with non-sampling errors…

Mistakes – check/re-check data

Rule of thumb – if it’s too good to be true, it probably is

Training of interviewers

Pilot questionnaire

Wording of the questions

Page 82: The Survey Cycle

Developing the questionnaire

Page 83: The Survey Cycle

A good questionnaire must:

Address the research questions of interest

Ask short, simple, and clearly-worded questions

Usually, start with demographic questions to help respondents get started comfortably

Use dichotomous and multiple-choice questions.

Be as short as possible

Page 84: The Survey Cycle

A good questionnaire must:

Use open-ended questions cautiously

Avoid using leading questions

“Should a smack as part of good parental correction be a criminal offence in New Zealand?”

Pretest a questionnaire on a small number of people

Think about the way you intend to use the collected data when preparing the questionnaire

Questions will also depend on how you are getting the data e.g. CATI, person to person, mail/web

Page 85: The Survey Cycle

Using focus groups

If possible, focus groups are a great way to assist in questionnaire design

In-depth discussion led by a trained interviewer with a small group of people

Great way to understand the language used by people

Gets to the ‘qualities’ of interest

Can eliminate your biases/assumptions

Page 86: The Survey Cycle

Motivating case study: crime & punishment

Page 87: The Survey Cycle

Motivating case study: crime & punishment

Page 88: The Survey Cycle

What type of population are you sampling?

Consider a number of qualities respondents possess:

Education (specifically reading level): matters for web/mail surveys

Limits of attention: avoid fatiguing respondents; very important for telephone surveys

Page 89: The Survey Cycle

What type of population are you sampling?

Motivation

Why is the respondent going to participate, or not?

Political polls

Do I need incentives ($$$$)?

Page 90: The Survey Cycle

Some types of questions

Reports of fact - self-disclosure of some objective information e.g., age, sex, education, behavior.

Ratings of opinion or preference - evaluative response to statement e.g., satisfaction, agreement, like/dislike.

Reports of intended behavior - self-disclosure of motivation or intention e.g., likeliness to purchase.

Page 91: The Survey Cycle

What type of response format is appropriate for each question?

Open-ended questions

Permit the subject freedom to answer the question in their own words, without pre-specified alternatives.

Page 92: The Survey Cycle

Open-ended questions

Advantages:

Obtains unanticipated answers

May better reflect respondent’s thoughts/beliefs

Appropriate when list of possible answers is excessive

Disadvantages:

Flexibility in responses makes them difficult to code and analyse

May provide incomplete or unintelligible answers

Page 93: The Survey Cycle

Closed-ended questions

Subject selects from list of pre-determined, acceptable responses

Can sometimes include an “Other (please specify)” option

Page 94: The Survey Cycle

Types of closed-ended questions

Checklists - respondent selects certain number of pre-specified categories (nominal data)

Types of Exercises:

Aerobics

Basketball

Swimming

Weightlifting

Page 95: The Survey Cycle

Two-way forced choice

respondent must select between two alternatives (crude ordinal/nominal)

Do you always wake up before 8:00am?

Yes No

Page 96: The Survey Cycle

Ranked

respondent must place items in order of importance or value (ordinal)

Rank in order of importance:
    Career
    Social life
    Love life
    Children

Page 97: The Survey Cycle

Multiple-Choice (Likert scale)

respondent selects between range of alternatives along pre-specified continuum (ordinal/interval?)

Strongly Agree   Agree   Neutral   Disagree   Strongly Disagree
      1            2        3         4               5
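As a rough illustration of how responses on this scale might be coded for analysis, the following Python sketch maps the five labels to the codes 1 to 5 shown on the slide. The function name `summarise_likert` and the sample responses are hypothetical, and reporting counts plus a median reflects the cautious "ordinal" reading of the scale rather than anything prescribed in the talk.

```python
from collections import Counter
from statistics import median

# Codes follow the slide: 1 = Strongly Agree ... 5 = Strongly Disagree.
LIKERT_CODES = {
    "Strongly Agree": 1,
    "Agree": 2,
    "Neutral": 3,
    "Disagree": 4,
    "Strongly Disagree": 5,
}


def summarise_likert(responses):
    """Return per-label counts and the median code (hypothetical helper)."""
    codes = [LIKERT_CODES[r] for r in responses]
    counts = Counter(responses)
    return {
        # Keep every label, including those nobody chose, in scale order.
        "counts": {label: counts.get(label, 0) for label in LIKERT_CODES},
        # The median only needs the ordering of the codes, not equal spacing.
        "median_code": median(codes),
    }


# Made-up responses, for illustration only.
sample = ["Agree", "Strongly Agree", "Neutral", "Agree", "Disagree"]
summary = summarise_likert(sample)
print(summary["counts"])
print(summary["median_code"])
```

Averaging the codes instead would treat the scale as interval data, which is the stronger assumption flagged by the question mark on the slide.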

Page 98: The Survey Cycle

Closed-ended questions

Advantages:
    Obtains more reliable answers
    Responses are more meaningful to the researcher
    Straightforward analysis

Disadvantages:
    Answers are relative to the response scale provided
    Respondent's choice may not be among the listed alternatives
    Choices listed communicate the kind of response wanted

Page 99: The Survey Cycle

Writing good survey questions

Differences in answers should stem from differences among respondents rather than differences in the stimuli

Question's wording is obviously a central part of the stimulus

Page 100: The Survey Cycle

Simple sentences

No double negatives
    It is not the case that I have never cheated on my tax returns
Eliminate vagueness or poorly-defined terms
    How many times in the past year have you talked with a doctor about your health?
Objectionable/irrelevant questions
    How old are you?

Page 101: The Survey Cycle

Discrete questions/responses

Exhaustive/mutually exclusive categories
    How did you last travel to the supermarket?
        car, bus, foot, walking, public transportation
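For numeric answer brackets, the "exhaustive and mutually exclusive" requirement can be checked mechanically. The sketch below is a minimal Python illustration, assuming inclusive integer brackets that lie within the stated range; the function `check_brackets` and the age brackets are hypothetical and not from the talk. Word-based option lists, like the travel example above, still need a human read-through.

```python
def check_brackets(brackets, lowest, highest):
    """List gaps and overlaps in inclusive integer brackets.

    Hypothetical helper: `brackets` is a list of (low, high) pairs that are
    assumed to lie within [lowest, highest].
    """
    problems = []
    expected = lowest  # next value that still needs a bracket
    for low, high in sorted(brackets):
        if low > expected:
            problems.append(f"gap: {expected}-{low - 1} not covered")
        elif low < expected:
            problems.append(
                f"overlap: {low}-{min(high, expected - 1)} covered twice"
            )
        expected = max(expected, high + 1)
    if expected <= highest:
        problems.append(f"gap: {expected}-{highest} not covered")
    return problems


# Made-up age brackets: 25 appears twice and nothing covers 65 and over.
bad_brackets = [(18, 25), (25, 40), (41, 64)]
good_brackets = [(18, 24), (25, 40), (41, 64), (65, 99)]

bad_report = check_brackets(bad_brackets, 18, 99)
good_report = check_brackets(good_brackets, 18, 99)
print(bad_report)
print(good_report)
```

An empty list means every value in the range falls in exactly one bracket.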

Page 102: The Survey Cycle

Discrete questions/responses

Page 103: The Survey Cycle

Limit response format (7±2)

Even vs. odd categories
Allow expression of variability

Even (4 categories): Strongly Agree   Agree   Disagree   Strongly Disagree

Odd (5 categories):  Strongly Agree   Agree   Neutral   Disagree   Strongly Disagree

Page 104: The Survey Cycle

Match response to item

Frequency (Never-All the time)
Likert Scaling (Disagree-Agree)
Quality (Poor-Excellent)
Service (Not Well-Extremely Well)

Page 105: The Survey Cycle

Overall format

General to specific order of questions

Employ "filtering" questions (If “Yes”)

Mix question/response types to remove response bias

Minimise judgment and emphasise accuracy (social desirability)

Page 106: The Survey Cycle

Example: One News Colmar Brunton Poll

Party Support
    “Under MMP you get two votes. One is for a political party and is called a party vote. The other is for your local M.P. and is called an electorate vote.”

Page 107: The Survey Cycle

Example: One News Colmar Brunton Poll

Party Vote*
    “Firstly thinking about the Party Vote which is for a political party. Which political party would you vote for?”
    IF DON’T KNOW – “Which one would you be most likely to vote for?”

Page 108: The Survey Cycle

Always seek others’ advice

Pre-test on colleagues
Ask for outside advice
Run a pilot study
After a while you can become too close to the subject, and fresh perspectives are needed

Page 109: The Survey Cycle

Presenting the results

The data is the story, not the graph

Page 110: The Survey Cycle

Published stats/proxies example: Race and politics in New Caledonia

Page 111: The Survey Cycle

Motivating case study: crime & punishment

“Only 5% of the sample were within the correct range in their estimate of the amount of violent crime.”

Page 112: The Survey Cycle

Motivating case study: crime & punishment

[Figure: Violent crime perception. Bar chart; x-axis: violent crime/100 incidents, in bands from <=10% to 80-89; y-axis: % (0-35). Actual rate: 10%.]

* Data made to fit original numbers – pseudo-fictitious

Page 113: The Survey Cycle

Motivating case study: crime & punishment

What are they trying to report here?

Order tables by most common crime to least

See if there are any changes over the years

Don’t use a 3D object when you are presenting 1D info.

Page 114: The Survey Cycle

Motivating case study: crime & punishment

[Figure: % reported crime by type of crime. Bar chart; y-axis: % crime (0-70); x-axis: Dishonesty, Drugs and antisocial, Violence, Property Damage, Property Abuse, Administrative, Sexual.]

The story:

Page 115: The Survey Cycle

Lessons

Here, the real story was that people, on average, believed the violent crime rate to be 5x worse than what is actually reported

Just because you can produce a pretty graph, doesn’t mean you should

The simplest graph shows the real story

Page 116: The Survey Cycle

You don’t have to be boring

Page 117: The Survey Cycle

Graphical excellence

Show the data
Make the viewer consider the substance rather than the form
Avoid distortion
Present many numbers concisely
Make large datasets coherent

Page 118: The Survey Cycle

Graphical excellence…

Make your graphics friendly: Avoid abbreviations and encodings. Run words left-to-right. Explain data with little messages. Label graphic; don’t use elaborate shadings and a complex legend. Avoid red/green distinctions. Use clean serif fonts in mixed case.

Page 119: The Survey Cycle

Tabular displays

CRIME GROUP (%)        1998  1999  2000
Dishonesty               63    61    60
Drugs and antisocial     12    13    13
Violence                  9     9    10
Property Damage           8     9    10
Property Abuse            5     5     5
Administrative            3     3     3
Sexual                    1     1     1
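A table like this can be ordered and compared across years with a few lines of code. The Python sketch below uses the percentages from the table above; the function names are illustrative only, and sorting by the latest year is one assumed reading of "most common to least".

```python
# Percentages of reported crime by group, copied from the table above.
YEARS = (1998, 1999, 2000)
CRIME_SHARE = {
    "Dishonesty": (63, 61, 60),
    "Drugs and antisocial": (12, 13, 13),
    "Violence": (9, 9, 10),
    "Property Damage": (8, 9, 10),
    "Property Abuse": (5, 5, 5),
    "Administrative": (3, 3, 3),
    "Sexual": (1, 1, 1),
}


def ordered_rows(table):
    """Rows sorted from most to least common in the latest year."""
    return sorted(table.items(), key=lambda row: row[1][-1], reverse=True)


def change_over_period(table):
    """Percentage-point change from the first year to the last."""
    return {group: values[-1] - values[0] for group, values in table.items()}


rows = ordered_rows(CRIME_SHARE)
changes = change_over_period(CRIME_SHARE)

# Print one aligned row per crime group, with the change as a signed column.
header = f"{'CRIME GROUP (%)':<22}" + "".join(f"{y:>6}" for y in YEARS)
print(header + f"{'Change':>8}")
for group, values in rows:
    cells = "".join(f"{v:>6}" for v in values)
    print(f"{group:<22}{cells}{changes[group]:>+8}")
```

The extra "Change" column makes movements over the three years visible at a glance: Dishonesty falls by three points while Property Damage rises by two.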

Page 120: The Survey Cycle

Tabular excellence

Encourage comparisons
Reveal the data at several levels of detail
Serve a clear purpose: description, exploration …
Be closely integrated with the text

Page 121: The Survey Cycle

Tabular excellence…

Round drastically
Arrange the numbers to be compared in columns, not rows
Order the columns by size (or some other natural ordering)
Use row and column averages as a focus
Provide verbal summaries

Page 122: The Survey Cycle

Getting it right …

Presentations largely stand or fall on the quality, relevance, and integrity of the content. If your numbers are boring, then you've got the wrong numbers. If your words or images are not on point, making them dance in colour won't make them relevant. Audience boredom is usually a content failure, not a decoration failure.

Edward Tufte, writing in Wired Magazine, Sept 2003

Page 123: The Survey Cycle

Managing the beast

Page 124: The Survey Cycle

Keep it simple

Bad survey statement:
    "We want to establish fiscal parameters in the customer decision making process in the plumbing and bathroom products arenas, testing price points and elasticity. After gaining this information, we will analyze its effects on marketing strategies and tactics."

Page 125: The Survey Cycle

Keep it simple

Good survey statement:
    "We want to know how much customers are willing to pay for sinks to see if we can make more money."

The clearer you see the target, the more easily you can see if you hit it or not.

Page 126: The Survey Cycle

Always communicate

Always discuss:
    Your goals
    What you know/don’t know
    What you need
Give clear expectations/timelines
Be flexible – situations change

Page 127: The Survey Cycle

Always communicate …

Be prepared to make mistakes
Fix them quickly
Be honest
Assume nothing
If anything can go wrong, it will

Does “anal retentive” have a hyphen in it?

Page 128: The Survey Cycle

Always communicate …

Ask for assistance
Use professional data collection/research agencies
    They are the experts

Page 129: The Survey Cycle

The process

STAGE 1: RESEARCH DEFINITION
    Understand the Problem
    Identify Questions
    Refine/Revise Questions

STAGE 2: RESEARCH PLAN/DESIGN
    Choose Design
    Inventory Resources
    Assess Feasibility
    Determine Trade-offs

Page 130: The Survey Cycle

Is it worth all the effort?

Compared to the alternative?
Yes. Because reputable surveying organisations consistently do good work
In spite of the difficulties, surveys correctly conducted are still the best objective measure of the state of the views of the population of interest

Page 131: The Survey Cycle

A quote …

“Why do they call it common sense?

It isn’t that common.”

- Mark Twain

Page 132: The Survey Cycle

Fini

Page 133: The Survey Cycle

Presentation

Page 134: The Survey Cycle

Crime & punishment case study

Page 135: The Survey Cycle

‘Big drink’ proposal

Page 136: The Survey Cycle

Statistics NZ’s A Guide To Good Survey Design