HR: Samples, Sampling, and Sample size


Page 1: HR: Samples, Sampling, and Sample size

HR: SAMPLES, SAMPLING, AND SAMPLE SIZE

A practical guide

Page 2: HR: Samples, Sampling, and Sample size

Samples, Sampling, and Sample Size

Samples: Used in research (i.e., for estimation and hypothesis testing); concerns theories around sampling and why we sample (i.e., sampling distributions).

Sampling: The process of taking samples; must guard against bias and threats to validity.

Sample size: The practical issue of how many subjects or units are needed for valid estimation or inference.

Page 3: HR: Samples, Sampling, and Sample size

Process of Sampling

Involves:

1. Identification of study population (target)
2. Determination of sampling population (sampling frame)
3. Definition of the sampling unit (individual, family, etc.)
4. Choice of sampling method (what is possible, what is optimal)
5. Estimation of the sample size (depends on study question and study design)

Page 4: HR: Samples, Sampling, and Sample size

Basic Questions about Sampling

Why sample? Efficiency and quality.

Who to sample? Usually a representative subset of the population of interest.

How to sample? Use the most appropriate sampling method.

Number to sample? As many as required so that potential sampling error is limited.

Page 5: HR: Samples, Sampling, and Sample size

Why sample?

- To acquire information about larger populations
- Lower costs
- Less field time
- When it is impossible to study the whole population
- More accuracy: a better job of data collection (more time per sample unit, higher quality data)

Page 6: HR: Samples, Sampling, and Sample size

Who to sample? Identification of study population

Sampling is the process of selecting a number of units from a defined study population.

The study or target population is the one to which the results of the study will be generalized.

It is crucial that the study population is clearly defined, since it is the most important determinant of the sampling population.

Page 7: HR: Samples, Sampling, and Sample size

The Sampling Frame

The sampling frame is the one from which the sample is drawn.

The definition of the sampling frame by the investigator is governed by two factors:

- Feasibility: reachable sampling population
- External validity: the ability to generalize from the study results to the target population

Page 8: HR: Samples, Sampling, and Sample size

The Sampling Unit

To define the sampling unit, set:

- Inclusion criteria
- Exclusion criteria

May sample individuals, households, or larger units.

Consider unit of analysis: individual income, household income, city median income.

Page 9: HR: Samples, Sampling, and Sample size

How to sample? Choices in sampling method

- Non-probability sampling
- Probability sampling

Page 10: HR: Samples, Sampling, and Sample size

Non-probability sampling

Types of non-probability sampling:

- Convenience sampling (selected from elements of a population that are easily accessible)
- Quota sampling (set number by type)
- Purposeful sampling (you choose who you think should be in the study)
- Snowball sampling (friend of a friend, etc.)

Not recommended in health research if generalization or statistical analysis is intended:

- By far the most bias-prone sampling approach, as selection is not random (not everyone in the population has an equal chance of being selected to participate in the study).
- Analytical/statistical procedures usually assume the sampled units came randomly from the assumed statistical distribution.

Page 11: HR: Samples, Sampling, and Sample size

Probability sampling

“There is a known non-zero probability of selection for each sampling unit”

Types:

- Simple random sampling
- Systematic random sampling
- Stratified random sampling
- Cluster sampling

Others:

- Multi-stage random sampling
- Multi-phase sampling

Page 12: HR: Samples, Sampling, and Sample size

Simple random sample

In this method, all subjects or elements have an equal probability of being selected.

There are two major ways of drawing a random sample. The first is to consult a random number table, and the second is to have the computer select a random sample.

Enumeration of the population is required/assumed.
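As an illustration of the "have the computer select" route, here is a minimal Python sketch. The frame of 1,000 numbered subjects, the sample size of 50, and the function name `simple_random_sample` are assumptions made for the example, not part of the original slides.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n units without replacement; each unit is equally likely to be chosen."""
    rng = random.Random(seed)  # a seed makes the draw reproducible
    return rng.sample(frame, n)

# Hypothetical enumerated sampling frame: subject IDs 1..1000.
frame = list(range(1, 1001))
sample = simple_random_sample(frame, 50, seed=42)
print(len(sample), "units drawn, all distinct:", len(set(sample)) == 50)
```

Note that the draw needs the full enumerated list, which is the practical cost of this method.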

Page 13: HR: Samples, Sampling, and Sample size

Systematic random sample

A systematic sample is conducted by randomly selecting a first case on a list of the population and then selecting every Nth case until your sample is complete. This is particularly useful if your list of the population is long.

For example, if your list was the phone book, it would be easiest to start at perhaps the 17th person, and then select every 50th person from that point on.

Sampling fraction: the ratio of sample size to population size.
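A short Python sketch of the same procedure follows. The list of 5,000 entries and the interval of 50 are illustrative assumptions; the random start plays the role of "the 17th person" in the phone book example.

```python
import random

def systematic_sample(frame, interval, seed=None):
    """Pick a random start within the first interval, then take every interval-th unit."""
    rng = random.Random(seed)
    start = rng.randrange(interval)  # 0-based position of the first selected case
    return frame[start::interval]

# Hypothetical population list of 5,000 numbered entries.
frame = list(range(1, 5001))
sample = systematic_sample(frame, interval=50, seed=1)

sampling_fraction = len(sample) / len(frame)
print(len(sample), "units selected; sampling fraction =", sampling_fraction)
```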

Page 14: HR: Samples, Sampling, and Sample size

Stratified sample

In a stratified sample, we sample either proportionately or equally to represent various strata or subpopulations.

For example, if our strata were cities in a country, we would make sure to sample from each of the cities. If our strata were gender, we would sample both men and women.
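The proportionate variant can be sketched in Python as below. The two strata, their sizes (400 and 600), and the 10% fraction are made-up values for illustration; equal allocation would instead draw the same fixed number from each stratum.

```python
import random

def stratified_sample(strata, fraction, seed=None):
    """Proportionate stratified sample: draw the same fraction from every stratum."""
    rng = random.Random(seed)
    sample = {}
    for name, units in strata.items():
        k = max(1, round(len(units) * fraction))  # at least one unit per stratum
        sample[name] = rng.sample(units, k)
    return sample

# Hypothetical strata: 400 men and 600 women.
strata = {
    "men": [f"M{i}" for i in range(400)],
    "women": [f"W{i}" for i in range(600)],
}
sample = stratified_sample(strata, fraction=0.10, seed=7)
for name, units in sample.items():
    print(name, len(units))
```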

Page 15: HR: Samples, Sampling, and Sample size

Cluster sampling

Cluster: a group of sampling units close to each other, i.e., crowding together in the same area or neighborhood.

In cluster sampling, we take a random sample of clusters and then survey every member of each selected cluster.

For example, if our clusters were individual schools in a city, we would randomly select a number of schools and then test all of the students within those schools.
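A minimal Python sketch of the school example: whole groups are selected at random, and every member of a selected group is kept. The six schools of 30 students each and the function name `cluster_sample` are hypothetical.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly select whole clusters, then keep every unit inside the chosen ones."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: list(clusters[name]) for name in chosen}

# Hypothetical city with six schools of 30 students each.
schools = {f"School {c}": [f"{c}-{i}" for i in range(30)] for c in "ABCDEF"}
sample = cluster_sample(schools, n_clusters=2, seed=3)

for school, students in sample.items():
    print(school, len(students), "students tested")
```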

Page 16: HR: Samples, Sampling, and Sample size

[Figure: Cluster Samples of Households, with areas labeled Section 1 through Section 5]

Credit: Dr. Moataza Mahmoud Abdel Wahab, Lecturer of Biostatistics, High Institute of Public Health, University of Alexandria

Page 17: HR: Samples, Sampling, and Sample size

More Complex Sampling Methods

Multi-stage sampling: State > County > Town > Households > Person

Multi-phase sampling: Population > Sample T1 (Test 1) > Sample T2 (Test 2)

Page 18: HR: Samples, Sampling, and Sample size

Number to sample? Estimation of the sample size

“How many subjects should be studied?”

The sample size depends on the following factors:

I. Difference to be found
II. Variability of the measurement
III. Level of significance
IV. Power of the study

Page 19: HR: Samples, Sampling, and Sample size

Difference to detect

“The magnitude of the difference to be detected”

A large sample size is needed to detect a small difference.

Thus, the sample size is inversely related to the size of the difference to be detected.
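To make the inverse relationship concrete, the sketch below uses the standard two-sample formula for comparing means, n per group = 2((z_alpha + z_beta) / ES)^2 with ES = difference / SD. The SD of 10, the candidate differences, and the defaults of z = 1.96 (alpha .05) and z = .84 (power .8) are assumptions chosen for illustration.

```python
def n_per_group(difference, sd, z_alpha=1.96, z_beta=0.84):
    """Per-group sample size for detecting a difference between two means."""
    effect_size = abs(difference) / sd
    return 2 * ((z_alpha + z_beta) / effect_size) ** 2

# Illustrative values: SD fixed at 10, difference to detect shrinking by half each time.
for difference in (10, 5, 2.5):
    n = n_per_group(difference, sd=10)
    print(f"difference = {difference}: about {n:.1f} per group")
```

Under these assumptions, each halving of the difference roughly quadruples the required sample size.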

Page 20: HR: Samples, Sampling, and Sample size

Variability of the measurement

The variability of measurements is reflected by the standard deviation or the variance.

The higher the standard deviation, the larger the sample size required.

Thus, sample size is directly related to the SD.
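A quick numerical sketch of this relationship, using the standard one-sample estimation formula n = (z × SD / E)^2. The margin of error of 5 and the three SD values are illustrative assumptions.

```python
def n_for_mean(sd, margin, z=1.96):
    """Sample size to estimate a mean within +/- margin at the confidence level implied by z."""
    return (z * sd / margin) ** 2

# Illustrative values: margin of error fixed at 5, SD doubling each time.
for sd in (10, 20, 40):
    print(f"SD = {sd}: n is about {n_for_mean(sd, margin=5):.1f}")
```

Because the SD enters the formula squared, doubling the SD multiplies the required n by four.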

Page 21: HR: Samples, Sampling, and Sample size

Level of significance

Relies on α error or type I error. The usual level of α has been arbitrarily set to 5% or 0.05.

Alpha error can be reduced to 0.01 or even 0.001, but this consequently increases the sample size.

Thus, sample size is inversely related to the level of α error.

Alpha error is considered before the study begins, but is only important when a significant difference or association is found.
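The effect of a stricter α can be seen through the two-sided critical value z, which grows as α shrinks. The sketch below uses Python's standard library `statistics.NormalDist` to compute z. The "relative n" column assumes an estimation-type formula in which n is proportional to z squared, with α = 0.05 as the baseline.

```python
from statistics import NormalDist

def z_two_sided(alpha):
    """Critical value z for a two-sided test or confidence interval at level alpha."""
    return NormalDist().inv_cdf(1 - alpha / 2)

baseline = z_two_sided(0.05)
for alpha in (0.05, 0.01, 0.001):
    z = z_two_sided(alpha)
    # Assumes n is proportional to z**2, as in n = (z * SD / E)**2.
    print(f"alpha = {alpha}: z = {z:.3f}, relative n = {(z / baseline) ** 2:.2f}")
```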

Page 22: HR: Samples, Sampling, and Sample size

Power of the study

The power of the study is the probability that it will yield a statistically significant result when a true difference or association exists. It is related to β or type II error.

Power is equal to (1 - β); consequently, the power of the study is increased by decreasing the beta error.

Thus, sample size is inversely related to the level of β error, or directly related to the power of the study.

Beta error is considered before the study begins, but is only of consequence when no difference or association is found (in hypothesis testing studies).

Beta error is not a consideration in surveys that are only estimating parameters (descriptive studies). Estimations are only concerned with confidence (i.e. confidence level) in the estimate.
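For hypothesis testing studies, the link between power and sample size can be sketched with the standard two-sample formula n per group = 2((z_alpha + z_beta) / ES)^2. The effect size of 0.5 and the power levels tried here are illustrative assumptions; the z values come from Python's `statistics.NormalDist`.

```python
from statistics import NormalDist

def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Per-group n for a two-sample comparison at the given alpha and power."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_beta = NormalDist().inv_cdf(power)           # z for power = 1 - beta
    return 2 * ((z_alpha + z_beta) / effect_size) ** 2

# Illustrative effect size of 0.5 at increasing power.
for power in (0.80, 0.90, 0.95):
    print(f"power = {power}: about {n_per_group(0.5, power=power):.1f} per group")
```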

Page 23: HR: Samples, Sampling, and Sample size

Sample Size related to the Research Question, Design, and Analysis

The research question usually informs the variables to be considered and the level of measurement to be used. It also points to the design type and analysis to be used.

Research type/design may address:

- Exploration, description, estimation (Descriptive Studies)
- Hypothesis testing of differences or relationships (Analytic Studies)
- Modeling of variables for relationships or survival (Multivariable Studies)

Sample size must consider the type/design plus the measurement level of the variables:

- Descriptive studies only ask how good the estimate is (an alpha error question).
- Analytic studies must also consider power (a beta error question).
- Additional variables (three or more) normally require larger sample sizes to maintain power in subgroups.

Page 24: HR: Samples, Sampling, and Sample size

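Sample Size Determination: Calculations^

For Confidence in an Estimation (beta error not considered)
Example: Survey data (descriptive)

1. Interval level variable

a. 1 sample: $n = \left(\frac{z\sigma}{E}\right)^2$

b. 2 samples: $n_i = 2\left(\frac{z\sigma}{E}\right)^2$, where $\sigma$ is estimated by the pooled standard deviation $s_{pooled} = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}}$

2. Nominal level variable

a. 1 sample: $n = p(1 - p)\left(\frac{z}{E}\right)^2$

b. 2 samples: $n_i = [p_1(1 - p_1) + p_2(1 - p_2)]\left(\frac{z}{E}\right)^2$

For Hypothesis Testing
Example: analytic studies

3. Interval level variable

1 sample: $n = \left(\frac{z_{1-\alpha/2} + z_{1-\beta}}{ES}\right)^2$, where $ES = \frac{|\mu_1 - \mu_0|}{\sigma}$

2 samples: $n_i = 2\left(\frac{z_{1-\alpha/2} + z_{1-\beta}}{ES}\right)^2$, where $ES = \frac{|\mu_1 - \mu_2|}{\sigma}$

4. Nominal level variable

1 sample: $n = \left(\frac{z_{1-\alpha/2} + z_{1-\beta}}{ES}\right)^2$, where $ES = \frac{|p_1 - p_0|}{\sqrt{p_0(1 - p_0)}}$

2 samples: $n_i = 2\left(\frac{z_{1-\alpha/2} + z_{1-\beta}}{ES}\right)^2$, where $ES = \frac{|p_1 - p_2|}{\sqrt{p(1 - p)}}$ and $p$ is the mean of $p_1$ and $p_2$

^SEE: Sullivan, Lisa M. (2008). Essentials of Biostatistics in Public Health. Jones and Bartlett, Sudbury, MA.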
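The calculations on this slide can be sketched as a handful of small Python functions. The function and parameter names are invented for this illustration and do not come from Sullivan or the original slides. The defaults of z = 1.96 (95% confidence, alpha .05) and z = .84 (power .8) match the values used in the worked questions later in the deck, and the two usage lines reuse the blood sugar inputs from those questions.

```python
import math

def n_estimate_mean(sd, margin, z=1.96, groups=1):
    """Estimation, interval data: n = (z*sd/E)^2, doubled per group for 2 samples."""
    n = (z * sd / margin) ** 2
    return n if groups == 1 else 2 * n

def n_estimate_proportion(p, margin, z=1.96):
    """Estimation, nominal data, 1 sample: n = p(1-p)(z/E)^2."""
    return p * (1 - p) * (z / margin) ** 2

def n_estimate_two_proportions(p1, p2, margin, z=1.96):
    """Estimation, nominal data, 2 samples: n per group."""
    return (p1 * (1 - p1) + p2 * (1 - p2)) * (z / margin) ** 2

def effect_size_means(mean_a, mean_b, sd):
    """ES for interval data: |difference in means| / sd."""
    return abs(mean_a - mean_b) / sd

def effect_size_one_proportion(p1, p0):
    """ES for 1 sample against a known proportion p0."""
    return abs(p1 - p0) / math.sqrt(p0 * (1 - p0))

def effect_size_two_proportions(p1, p2):
    """ES for 2 samples, using the mean of the two proportions."""
    p = (p1 + p2) / 2
    return abs(p1 - p2) / math.sqrt(p * (1 - p))

def n_hypothesis(effect_size, z_alpha=1.96, z_beta=0.84, groups=1):
    """Hypothesis testing: n = ((z_alpha + z_beta)/ES)^2, doubled per group for 2 samples."""
    n = ((z_alpha + z_beta) / effect_size) ** 2
    return n if groups == 1 else 2 * n

# Blood sugar inputs from the worked questions: SD 25, margin 5; means 95 vs 100, SD 10.
print(round(n_estimate_mean(sd=25, margin=5), 2))
print(math.ceil(n_hypothesis(effect_size_means(95, 100, 10), groups=2)))
```

Results are left unrounded by the functions so that the rounding convention (usually rounding up to the next whole subject) stays an explicit choice.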

Page 25: HR: Samples, Sampling, and Sample size

Four Research Questions

1. What is the blood sugar level in college students?
2. What proportion of male and female college students smoke?
3. Are smoking levels in college students different from the overall population?
4. Are blood sugar levels in college students different between males and females?

Page 26: HR: Samples, Sampling, and Sample size

Question 1: What is the blood sugar level in college students?

Estimation/interval data/1 sample

- If 95% confidence needed, z = 1.96
- Pilot survey estimate of standard deviation is 25
- And, E (margin of error) is not to exceed 5 mg/dl

Then:

$n = \left(\frac{z\sigma}{E}\right)^2 = \left(\frac{1.96 \times 25}{5}\right)^2 = 96.04$

n = 96

Page 27: HR: Samples, Sampling, and Sample size

Question 2: What proportion of male and female college students smoke?

Estimation/nominal data/2 samples

- If 95% confidence needed, z = 1.96
- Pilot survey estimate of male p = .25; female p = .2
- And, E (margin of error) is not to exceed 10% (.1)

Then:

$n_i = [p_1(1 - p_1) + p_2(1 - p_2)]\left(\frac{z}{E}\right)^2 = [.25(1 - .25) + .2(1 - .2)]\left(\frac{1.96}{.1}\right)^2 = .3475 \times 384.16 \approx 133.5$

n = 134 (per group)
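The arithmetic for this question is easy to check by evaluating the two-sample estimation formula directly. The sketch below is one way to do that; the function name is hypothetical, and rounding up to the next whole subject is an assumed convention.

```python
import math

def n_two_proportions(p1, p2, margin, z=1.96):
    """Per-group n for estimating a difference in proportions within +/- margin."""
    variance_sum = p1 * (1 - p1) + p2 * (1 - p2)
    return variance_sum * (z / margin) ** 2

# Inputs from Question 2: male p = .25, female p = .2, margin of error = .1.
n_raw = n_two_proportions(0.25, 0.20, margin=0.10)
n_per_group = math.ceil(n_raw)  # round up to a whole subject
print(round(n_raw, 1), n_per_group)
```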

Page 28: HR: Samples, Sampling, and Sample size

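Question 3: Are smoking levels in college students different from the overall population?

Hypothesis testing/nominal data/1 sample

- Set acceptable alpha at .05 ($z_{1-\alpha/2}$ = 1.96); power (1 - β) at .8 ($z_{1-\beta}$ = .84)
- Pilot survey estimate of college students p = .22; national average p = .3

Then:

$ES = \frac{|p_1 - p_0|}{\sqrt{p_0(1 - p_0)}} = \frac{|.22 - .3|}{\sqrt{.3(1 - .3)}} \approx .17$

$n = \left(\frac{z_{1-\alpha/2} + z_{1-\beta}}{ES}\right)^2 = \left(\frac{1.96 + .84}{.17}\right)^2 \approx 271.3$

n = 272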
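The result here is sensitive to how the effect size is rounded. The sketch below evaluates the one-sample formula twice, once with ES rounded to two decimals (as in the slide's .17) and once unrounded; the function names are hypothetical.

```python
import math

def effect_size_one_proportion(p1, p0):
    """ES = |p1 - p0| / sqrt(p0 * (1 - p0))."""
    return abs(p1 - p0) / math.sqrt(p0 * (1 - p0))

def n_one_sample(effect_size, z_alpha=1.96, z_beta=0.84):
    """n = ((z_alpha + z_beta) / ES)^2 for a one-sample hypothesis test."""
    return ((z_alpha + z_beta) / effect_size) ** 2

# Inputs from Question 3: college p = .22, national p = .3.
es = effect_size_one_proportion(0.22, 0.30)
n_rounded_es = math.ceil(n_one_sample(round(es, 2)))  # ES rounded to .17
n_exact_es = math.ceil(n_one_sample(es))              # ES left unrounded
print(round(es, 4), n_rounded_es, n_exact_es)
```

With ES rounded to .17 the formula gives 272; with the unrounded ES (about .1746) it gives 258, so the rounded figure is the more conservative of the two.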

Page 29: HR: Samples, Sampling, and Sample size

Question 4: Are blood sugar levels in college students different between males and females?

Hypothesis testing/interval data/2 samples

- Set acceptable alpha at .05 ($z_{1-\alpha/2}$ = 1.96); power (1 - β) at .8 ($z_{1-\beta}$ = .84)
- Pilot survey estimate of females: mean = 95 mg/dl, sd = 10; males: mean = 100 mg/dl, sd = 10

Then:

$ES = \frac{|\mu_1 - \mu_2|}{\sigma} = \frac{|95 - 100|}{10} = .5$

$n_i = 2\left(\frac{z_{1-\alpha/2} + z_{1-\beta}}{ES}\right)^2 = 2\left(\frac{1.96 + .84}{.5}\right)^2 = 62.72$

n = 63 (per group)

Page 30: HR: Samples, Sampling, and Sample size

Other sources on Samples and Sample Size:

Many statistical programs have sample size generators. Example: “statcalc” utility in EpiInfo
http://www.cdc.gov/epiinfo/downloads.htm

Many web sites include sample size information:
http://www.stat.uiowa.edu/~rlenth/Power/

Additional lecture materials on sampling and sample size:
http://www.pitt.edu/~super1/lecture/lec19041/index.htm
http://www.pitt.edu/~super1/lecture/lec0542/index.htm