sampling.... Quota Sampling..

Sampling

Dr. Mary Wolfinbarger

Marketing Research

Sample vs. Census

Census -- every population member included

With sampling, researcher infers population characteristics from a sample

Why sample?

Saves money

Saves time

A sample can be more accurate; it has fewer “nonsampling” errors than a census

Sampling terms

Population (or Universe): a complete listing of a set of elements having a given characteristic(s) of interest

An example of population definition: Americans, registered voters, voters, swing voters

(Which is more relevant to politicians?)

Sampling terms

Element: Unit about which information is sought

Most common units in marketing: individuals, households

Sudman and Blair suggest a conceptual sample: sales dollars or potential sales dollars

Sampling terms

Sample Frame: A list of population members

May get a complete listing of population, but often population and sample frame are different

Example: “American recreational tennis players?”

Differences between the sample frame and population: “sample frame error”

Sampling terms

Parameter: The actual characteristic of the population, the true value of which can only be known by taking an error-free census

Statistic: The estimate of a characteristic obtained from the sample

Sampling terms

Non-response error: error created when chosen sample members do not participate

Non-response creates two problems:

Need a larger initial sample size to allow for non-response

More seriously, non-respondents may differ from respondents (“questionnaire freaks?”)

Sample Types

Two broad categories:

Probability: each population element has a known, non-zero chance of being included in the sample

Non-probability: cannot mathematically estimate the probability of a population element being included in the sample

Sample Types

Statistician’s opinion: all N-P samples are worthless because you cannot estimate the degree to which your results are generalizable

So, why are N-P samples ever used?

Non-probability Samples

Convenience

Judgment

Quota

Convenience Samples

“Accidental samples” -- those in the sample simply happen to be where the data are being collected

One major form in marketing: “Mall Intercept”

What do statisticians think? “Rarely do samples selected on a convenience sample basis, regardless of size, prove representative, and are not recommended for descriptive or causal research.”

Convenience Samples

I agree, but….

Minimizing drawbacks of convenience samples:

compare sample characteristics and findings to those collected on a census/random sample basis

speculate intelligently about bias, and how it is likely to have affected results

Convenience Samples

When possible, collect the sample where your population is likely to be (retailers collecting in-store surveys)

Cultivate diversity in the sample (e.g. mall intercept using multiple locations)

May be better at understanding relationships between variables than at making descriptive estimates

Judgment Samples

Also called purposive sampling

Sample elements are hand picked because it is felt that they are representative of some population of interest

Typically a small sample (maybe as small as 10) in which the researcher tries to represent all groups or segments from the population

Judgment Samples

Snowball design: a special form of judgment sample

Appropriate for small specialized populations

Each respondent is asked to identify one or more other population members

Judgment Samples

Drawbacks?

Those with more ties to sample members are selected

Similar people are more likely to be named

Quota Sampling

Attempt to be representative by selecting sample elements in proportion to their known incidence in the population

Quota Sampling

Example: Surveying undergraduate students about campus food services

Step 1: Identify attributes the researcher believes are important, e.g. sex and class level

Step 2: Look at incidence of sex and class level in population

Quota Sampling

Class Level

Freshmen 3200

Sophomores 2600

Juniors 2200

Seniors 2000

Sex

Males 4500

Females 5500

If I sample 100, how many of each type do I select?
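The question above comes down to proportional allocation. A minimal sketch, assuming each quota is simply the group's share of the population times the sample size (the function name `quota` is illustrative, not from the slides):

```python
# Population counts taken from the slide above.
class_level = {"Freshmen": 3200, "Sophomores": 2600, "Juniors": 2200, "Seniors": 2000}
sex = {"Males": 4500, "Females": 5500}

def quota(counts, sample_size):
    """Allocate sample_size across groups in proportion to population counts."""
    total = sum(counts.values())
    return {group: round(sample_size * n / total) for group, n in counts.items()}

class_quota = quota(class_level, 100)
sex_quota = quota(sex, 100)
print(class_quota)  # {'Freshmen': 32, 'Sophomores': 26, 'Juniors': 22, 'Seniors': 20}
print(sex_quota)    # {'Males': 45, 'Females': 55}
```

With other counts, rounding may leave the quotas a unit or two off the target total, so they may need a manual adjustment. Filling cells that cross sex with class level would also require the joint incidence, which the slide does not give.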

Quota Sampling

Don’t be fooled -- relies on personal, subjective selection of quota attributes

The sample can still be non-representative with respect to some other characteristic (e.g. in this example, perhaps race)

I plead guilty -- I have sinned -- and will do so again -- …….so shoot me………….

Probability Sampling

Does not guarantee representativeness, but does allow for the assessment of sampling error

Sampling error: error that occurs because a sample rather than a census is used

Simple Random Sampling (SRS)

Each sample element has a known, non-zero, equal chance of being selected

Example: Lottery numbers

Or, put everyone’s name in a hat

Major polling firms use random digit dialing to approximate random samples

Or, use a random numbers table (actually pseudo-random I’m told)
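In software, the "names in a hat" draw is usually a call to a random number generator. A minimal sketch, assuming a hypothetical sampling frame of 10,000 member IDs; like the random numbers table, Python's generator is pseudo-random:

```python
import random

# Hypothetical sampling frame: 10,000 member IDs (illustrative only).
frame = list(range(1, 10001))

rng = random.Random(42)        # fixed seed so the draw can be repeated
srs = rng.sample(frame, 1000)  # draw 1,000 members without replacement

print(len(srs), len(set(srs)))  # both 1000: no member is drawn twice
```

Every member of the frame has the same chance of ending up in `srs`, which is the defining property of SRS.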

Systematic Sampling

Systematically spreads sample through a list of population members

Example: If a population contains 10,000 people and a sample of 1,000 is needed, select every 10th name on the list

In nearly all practical examples, the procedure results in a sample equivalent to SRS
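The every-10th-name procedure can be sketched as follows. The random starting point within the first interval is an assumption (a common convention, not stated on the slide), and the frame of 10,000 IDs is hypothetical:

```python
import random

def systematic_sample(frame, sample_size, rng=random):
    """Select every k-th element after a random start within the first interval."""
    k = len(frame) // sample_size   # sampling interval (10 in the slide's example)
    start = rng.randrange(k)        # random start at index 0..k-1 (assumed convention)
    return frame[start::k][:sample_size]

frame = list(range(1, 10001))       # hypothetical list of 10,000 members
sample = systematic_sample(frame, 1000, random.Random(7))
print(len(sample))                  # 1000
```

If the list has a regular pattern whose period matches the interval `k`, every selected element falls at the same point in the pattern, which is why regularities in the list matter.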

Systematic Sampling

Only exception: when there are “regularities” in the list

Systematic Sampling

Another application of systematic sampling: choose a fixed distance (a number of millimeters or inches) down a page or column and select the entry found at each interval (it’s easier than counting!)

Stratified Sampling

Information about subgroups in the sample frame is used to improve the efficiency of the sample plan

Stratified Sampling

Three major reasons to use

Some subgroups are more homogeneous than others, so smaller samples are needed from those groups to obtain the same level of precision

Group comparison is the purpose of the study (disproportionate stratified sampling)

Some elements are more important in determining outcome of research interest than are others

How is this different from quota sampling?

Within strata, selection of sample elements is random, not first available
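The contrast with quota sampling can be sketched in code. This is a minimal illustration, assuming proportionate allocation and a hypothetical frame of student IDs grouped by class level (the stratum sizes reuse the counts from the quota example):

```python
import random

def stratified_sample(frame_by_stratum, sample_size, rng):
    """Proportionate stratified sample: random selection within each stratum."""
    total = sum(len(members) for members in frame_by_stratum.values())
    sample = {}
    for stratum, members in frame_by_stratum.items():
        n = round(sample_size * len(members) / total)  # proportional allocation
        sample[stratum] = rng.sample(members, n)       # random, not first available
    return sample

# Hypothetical frame: student IDs grouped by class level.
sizes = {"Freshmen": 3200, "Sophomores": 2600, "Juniors": 2200, "Seniors": 2000}
frame_by_stratum = {s: [f"{s}-{i}" for i in range(n)] for s, n in sizes.items()}

picked = stratified_sample(frame_by_stratum, 100, random.Random(1))
print({s: len(ids) for s, ids in picked.items()})
# {'Freshmen': 32, 'Sophomores': 26, 'Juniors': 22, 'Seniors': 20}
```

The allocation matches what a quota sample would use; the difference is the `rng.sample` call, which gives every member of a stratum the same chance of selection. A disproportionate design would change only the line that computes `n`.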

Bad Uses of Stratification

To satisfy people who distrust that random sampling will be representative

To correct for MAJOR problems with survey cooperation

Poststratification is OK

Is done after sampling

Corrects for MINOR differences between sample and population produced by non-cooperation

Area (or Cluster) Sampling

Elements are geographically grouped into relatively homogenous clusters (e.g. a city is divided into 40 areas)

From these areas, 10 are randomly selected

From these larger areas, blocks within areas will be randomly selected

Within each block, attempt to survey each household
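The multi-stage selection can be sketched as below. The 40 areas and the draw of 10 come from the slide; the number of blocks per area (12) and blocks drawn per area (3) are assumptions made only for illustration:

```python
import random

rng = random.Random(3)

# Hypothetical city: 40 areas, each with 12 blocks (block count is an assumption).
areas = {a: [f"area{a}-block{b}" for b in range(12)] for a in range(40)}

# Stage 1: randomly select 10 of the 40 areas.
chosen_areas = rng.sample(sorted(areas), 10)

# Stage 2: randomly select blocks within each chosen area (3 per area, assumed).
chosen_blocks = []
for a in chosen_areas:
    chosen_blocks += rng.sample(areas[a], 3)

# Stage 3 (fieldwork): attempt to survey every household in each chosen block.
print(len(chosen_areas), len(chosen_blocks))  # 10 30
```

Interviewers then visit only 30 blocks rather than households scattered across the whole city, which is where the cost saving comes from.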

Area (or Cluster) Sampling

Especially useful for door-to-door personal surveys (significantly reduces costs)

However, clustering increases sampling errors (people who live close together tend to be more similar)

Statistical formulas suggest that, in marketing research, 20-25 clusters with 20-25 observations per site are appropriate

Determining Sample Size

Ad Hoc Methods (non-statistical)

Rules of thumb: Collect sample size large enough so that when divided into groups, each group will have a minimum sample of 100 or so (Sudman)

Budget constraints: calculate the cost of interview and data analysis per respondent. Divide total budget by this amount to get maximum sample size.

Ad Hoc Methods (non-statistical)

Comparable studies: Find similar studies that were successful and obtained sufficiently reliable results, and use a comparable sample size

Most general formula

Total sampling error =

desired confidence level (Z) * standard deviation of sample (SD) / square root of sample size (N)

Sampling error: the standard deviation of the distribution of sample means

Sampling error is expressed in absolute units of the measurement scale, not as a percentage: it is the amount by which your measurement may differ from the true value

Re-arranging Algebraically

$$N = \frac{Z^2 \sigma^2}{(\text{sampling error})^2}$$

Where N = sample size

Z = z score from normal curve table (1.96 for a confidence level of 95%)

$\sigma$ = standard deviation (obtained from previous survey or estimated, e.g. 95% of responses fall between 3 and 5, so 1 SD = .5)

Example:

For example, if allowable sampling error = .20 (on a 7 point scale), SD = 1.34, and a 95% confidence level (Z = 1.96) is being used,

$$N = \frac{1.96^2 \times 1.34^2}{.20^2}$$

N=172
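The arithmetic in this example can be checked with a few lines of code. A minimal sketch; the function name `sample_size` is illustrative:

```python
import math

def sample_size(z, sd, error):
    """N = Z^2 * SD^2 / error^2, from rearranging error = Z * SD / sqrt(N)."""
    return (z * sd / error) ** 2

n = sample_size(z=1.96, sd=1.34, error=0.20)
print(round(n, 2))   # 172.45, which the slide reports as 172
print(math.ceil(n))  # 173 if you round up to be conservative
```

Rounding up rather than down guarantees the allowable error is not exceeded, though the difference of one respondent is immaterial in practice.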

What this formula suggests

If the sample is more varied, a larger sample is required

If more precision is required, a larger sample is necessary

If a higher confidence level is desired, a larger sample is necessary

The sample size required to achieve ever more precision and confidence grows at an increasing rate!
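To see how quickly the required sample grows, the same formula can be evaluated for several allowable errors. A small sketch, reusing Z = 1.96 and SD = 1.34 from the example; the list of error values is chosen only for illustration:

```python
def sample_size(z, sd, error):
    """N = Z^2 * SD^2 / error^2."""
    return (z * sd / error) ** 2

errors = [0.40, 0.20, 0.10, 0.05]
sizes = [sample_size(1.96, 1.34, e) for e in errors]

for e, n in zip(errors, sizes):
    print(f"allowable error {e:.2f} -> N of about {n:.0f}")
```

Because the allowable error is squared in the denominator, each halving of the error multiplies the required sample size by four.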
