Lecturer: Dr. Adote Anum, Dept. of Psychology Contact Information… · 2018. 3. 4. · College of...
College of Education
School of Continuing and Distance Education 2014/2015 – 2016/2017
Lecturer: Dr. Adote Anum, Dept. of Psychology Contact Information: [email protected]
Session Overview
• In this Session we will discuss Sampling in Psychological Research and sample size determination. Sampling is the process of selecting units (e.g., people, organizations) from a population of interest so that by studying the sample we may fairly generalize our results back to the population from which they were chosen.
• We will describe probability and non-probability methods and the different types of each method. At the end of the session you will be able to explain the difference between probability and non-probability sampling, and describe the major types of both sampling methods.
Session Outline
The key topics to be covered in the session are as follows:
• Topic One: What is Sampling?
• Topic Two: Types of Sampling - Probability
• Topic Three: Types of Sampling – Non-Probability
• Topic Four: Determining Sample Size
Reading List
• Cozby, P. C. (2004). Methods in behavioral research (8th Ed.). Mayfield Pub. Co. CA.
• http://open.lib.umn.edu/psychologyresearchmethods/ (Chapter 9, pages 165-167). Please refer to Sakai for the PDF version of this textbook.
WHAT IS SAMPLING? Topic One
SAMPLING
• A sample is “a smaller collection of units from a population used to determine truths about that population” (Field, 2005)
• Why do we sample?
– Lack of Resources (time, money) & workload
– Gives results with known accuracy that can be calculated mathematically
• What is a sampling frame?
– The list from which the potential respondents are drawn
Steps in Sampling Process
• Definition of target population
• Selection of a sampling frame (list)
• Probability or Nonprobability sampling
• Sampling Unit
• Error
– Random sampling error (chance fluctuations)
– Nonsampling error (design errors)
Step 1 - Target Population
• Who has the information/data you need?
• How do you define your target population?
- Geography/location
- Demographics
- Use
- Awareness
Step 2 - Sampling Frame
• List of elements
• Sampling Frame error
– Error that occurs when certain sample elements are not listed or available and are not represented in the sampling frame
Step 3 - Probability or Nonprobability
Probability Sample
A sampling technique in which every member of the population will have a known, nonzero probability of being selected
Non-Probability Sample
– Units of the sample are chosen on the basis of personal judgment or convenience
– There are NO statistical techniques for measuring random sampling error in a non-probability sample
– Generalizability is never statistically appropriate
SAMPLING
• Three factors that influence sample representativeness:
– Sampling procedure
– Sample size
– Participation (response rate)
• When might you sample the entire population?
– When your population is very small
– When you have extensive resources
– When you don’t expect a very high response
TYPES OF SAMPLING – PROBABILITY SAMPLING
Topic Two
Probability Sampling Methods
Simple Random Sampling
• The purest form of probability sampling.
• Assures each element in the population has an equal chance of being included in the sample
• Random number generators
• Probability of Selection = Sample Size / Population Size
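As a minimal sketch of simple random sampling, the Python snippet below draws a sample with the standard library's random number generator and computes the probability of selection. The sampling frame of 500 student IDs, the sample size of 50, and the function names are illustrative assumptions, not part of the lecture.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n distinct units from the sampling frame, each equally likely."""
    if n > len(frame):
        raise ValueError("sample size cannot exceed population size")
    rng = random.Random(seed)      # seeded only so the example is repeatable
    return rng.sample(frame, n)    # sampling without replacement

def selection_probability(sample_size, population_size):
    """Probability of selection = sample size / population size."""
    return sample_size / population_size

# Hypothetical sampling frame: a list of 500 student IDs.
frame = [f"student_{i}" for i in range(1, 501)]
sample = simple_random_sample(frame, 50, seed=1)
prob = selection_probability(len(sample), len(frame))
print(prob)  # 0.1
```

Each student in this hypothetical frame has a 50/500 = 0.1 chance of being selected.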
Simple random sampling
Advantages
Minimal knowledge of population needed
External validity high
Internal validity high
Easy to analyze data
Disadvantages
High cost; low frequency of use
Requires sampling frame
Not applicable when the population is large
Likelihood of excluding minority groups or subgroups
Does not use researchers’ expertise
Larger risk of random error than stratified
SYSTEMATIC SAMPLING
• Systematic sampling relies on arranging the target population according to some ordering scheme and then selecting elements at regular intervals through that ordered list.
• Systematic sampling involves a random start and then proceeds with the selection of every kth element from then onwards. In this case, k=(population size/sample size).
SYSTEMATIC SAMPLING
• It is important that the starting point is not automatically the first in the list, but is instead randomly chosen from within the first to the kth element in the list.
• A simple example would be to select every 10th name from the telephone directory (an 'every 10th' sample, also referred to as 'sampling with a skip of 10').
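The random start and the skip interval k can be sketched in Python as follows. This is an illustrative sketch only: the directory of 1,000 numbered entries and the function name are assumptions, and k is rounded down when the population size is not an exact multiple of the sample size.

```python
import random

def systematic_sample(frame, n, seed=None):
    """Select every kth element after a random start within the first k."""
    k = len(frame) // n            # sampling interval: population size / sample size
    if k < 1:
        raise ValueError("sample size cannot exceed population size")
    rng = random.Random(seed)
    start = rng.randrange(k)       # random position among the first k elements
    return [frame[start + i * k] for i in range(n)]

# Hypothetical directory of 1,000 entries; a sample of 100 gives k = 10.
directory = list(range(1, 1001))
sample = systematic_sample(directory, 100, seed=7)
print(sample[:3])
```

With these numbers the function takes every 10th entry, starting from a randomly chosen entry among the first ten.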
Systematic Sampling
• ADVANTAGES:
• Sample easy to select
• Suitable sampling frame can be identified easily
• Sample evenly spread over entire reference population
• DISADVANTAGES:
• Sample may be biased if hidden periodicity in population coincides with that of selection.
• Difficult to assess precision of estimate from one survey.
Stratified Sampling
• If the population has identifiable subgroups, the sample is selected separately from each subgroup (stratum).
• Every unit in a stratum has same chance of being selected.
• Using same sampling fraction for all strata ensures proportionate representation in the sample.
• Adequate representation of minority subgroups of interest can be ensured by stratification & varying sampling fraction between strata as required.
• Identify variable(s) as an efficient basis for stratification. Must be known to be related to dependent variable. Usually a categorical variable
• Complete list of population elements must be obtained
• Use randomization to take a simple random sample from each stratum
Stratified Sampling
Types of Stratified Samples
Proportional Stratified Sample: The number of sampling units drawn from each stratum is in proportion to the relative population size of that stratum
Disproportional Stratified Sample: The number of sampling units drawn from each stratum is allocated according to analytical considerations, e.g. as variability increases, the sample size of the stratum should increase
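A proportional stratified sample can be sketched in Python as below. The four strata (student levels of 400, 300, 200 and 100 people) and the function names are hypothetical, and the simple rounding used here may make the stratum sizes add up to slightly more or less than n when the proportions are not whole numbers.

```python
import random

def proportional_allocation(stratum_sizes, n):
    """Allocate n sampling units to strata in proportion to stratum size."""
    total = sum(stratum_sizes.values())
    return {name: round(n * size / total) for name, size in stratum_sizes.items()}

def stratified_sample(strata, n, seed=None):
    """Take a simple random sample from each stratum using proportional allocation."""
    rng = random.Random(seed)
    sizes = {name: len(units) for name, units in strata.items()}
    allocation = proportional_allocation(sizes, n)
    return {name: rng.sample(units, allocation[name]) for name, units in strata.items()}

# Hypothetical population of 1,000 students split into four strata by level.
strata = {
    "level_100": [f"L100_{i}" for i in range(400)],
    "level_200": [f"L200_{i}" for i in range(300)],
    "level_300": [f"L300_{i}" for i in range(200)],
    "level_400": [f"L400_{i}" for i in range(100)],
}
sample = stratified_sample(strata, 100, seed=2)
counts = {name: len(units) for name, units in sample.items()}
print(counts)
```

A disproportional design would replace `proportional_allocation` with stratum sizes chosen by the researcher, for example larger samples from more variable strata.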
Stratified Sampling
Advantages
Assures representation of all groups in the sample
Characteristics of each stratum can be estimated and comparisons made
Reduces variability compared with systematic sampling
Stratified Sampling
• Limitations
– First, sampling frame of entire population has to be prepared separately for each stratum
– Requires accurate information on proportions of each stratum
– Stratified lists costly to prepare
– Finally, in some cases (such as designs with a large number of strata, or those with a specified minimum sample size per group), stratified sampling can potentially require a larger sample than would other methods
Cluster Sampling
The primary sampling unit is not the individual element, but a large cluster of elements. Either the cluster is randomly selected or the elements within are randomly selected
Frequently used when no list of population available or because of cost
Is the cluster as heterogeneous as the population? Can we assume it is representative?
Cluster Sampling
• Cluster sampling is an example of 'two-stage sampling'.
• In the first stage, a sample of areas is chosen;
• In the second stage, a sample of respondents within those areas is selected.
• Population divided into clusters of homogeneous units, usually based on geographical contiguity.
• Sampling units are groups rather than individuals.
• A sample of such clusters is then selected.
• All units from the selected clusters are studied.
Cluster Sampling
Two types of cluster sampling methods.
One-stage sampling. All of the elements within selected clusters are included in the sample.
Two-stage sampling. A subset of elements within selected clusters is randomly selected for inclusion in the sample.
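The difference between one-stage and two-stage cluster sampling can be sketched in Python as follows. The six schools of 30 pupils each and the function name are hypothetical; the sketch assumes clusters are chosen by simple random sampling.

```python
import random

def cluster_sample(clusters, n_clusters, n_per_cluster=None, seed=None):
    """Randomly select clusters, then take all units (one-stage)
    or a random subset of units (two-stage) from each selected cluster."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)   # stage 1: pick clusters
    sample = {}
    for name in chosen:
        units = clusters[name]
        if n_per_cluster is None:
            sample[name] = list(units)                  # one-stage: keep every unit
        else:
            size = min(n_per_cluster, len(units))
            sample[name] = rng.sample(units, size)      # stage 2: subsample units
    return sample

# Hypothetical population: six schools (clusters) of 30 pupils each.
clusters = {
    f"school_{c}": [f"school_{c}_pupil_{i}" for i in range(1, 31)]
    for c in "ABCDEF"
}
one_stage = cluster_sample(clusters, n_clusters=2, seed=3)
two_stage = cluster_sample(clusters, n_clusters=2, n_per_cluster=10, seed=3)
print(sorted(one_stage), [len(units) for units in two_stage.values()])
```

Note that only the list of clusters is needed up front; lists of individuals are needed only for the clusters that are chosen.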
Advantages
Low cost/high frequency of use
Requires list of all clusters, but only of individuals within chosen clusters
Can estimate characteristics of both cluster and population
For multistage, has strengths of used methods
Often used to evaluate vaccination coverage in EPI (the Expanded Programme on Immunization)
Cluster Sampling
Disadvantages
Larger error for comparable size than other probability methods
Multistage very expensive and validity depends on other methods used
TYPES OF SAMPLING – NON-PROBABILITY SAMPLING
Topic Three
Quota Sampling
• The population is first segmented into mutually exclusive sub-groups, just as in stratified sampling.
• Then judgment is used to select subjects or units from each segment based on a specified proportion.
• For example, an interviewer may be told to sample 200 females and 300 males between the ages of 45 and 60.
• It is this second step which makes the technique one of non-probability sampling.
• In quota sampling the selection of the sample is non-random.
• For example interviewers might be tempted to interview those who look most helpful. The problem is that these samples may be biased because not everyone gets a chance of selection.
Convenience Sampling
• Sometimes known as grab or opportunity sampling or accidental or haphazard sampling.
• The researcher using such a sample cannot scientifically make generalizations about the total population from this sample because it would not be representative enough.
• For example, if the interviewer were to conduct a survey at a shopping center early in the morning on a given day, the people he/she could interview would be limited to those present at that time. Their views would not represent those of other members of society in the area, who might have been reached had the survey been conducted at different times of day and several times per week.
• This type of sampling is most useful for pilot testing.
Snowball
• Snowball sampling is a technique, in which existing study subjects are used to recruit more subjects into the sample
• Useful when the respondents are difficult to recruit
Judgmental or Purposive sampling
• The researcher chooses the sample based on who they think would be appropriate for the study. This is used primarily when there is a limited number of people that have expertise in the area being researched
DETERMINATION OF SAMPLE SIZE Topic Four
Is Sample Size Important?
• Sample size calculations are important to ensure that estimates are obtained with the required precision or confidence.
• In experiments concerned with detecting an effect, an adequate sample size ensures that:
– If an effect deemed to be clinically or biologically important exists, then there is a high chance of it being detected, i.e. that the analysis will be statistically significant.
– If the sample is too small, then even if large differences are observed, it will be impossible to show that these are due to anything more than sampling variation.
Importance of Sample Size calculation
• Scientific reasons
• Ethical reasons
• Economic reasons
Scientific Reasons
• In a trial with negative results and a sufficient sample size, the result is conclusive
• In a trial with negative results and insufficient power (insufficient sample size), one may mistakenly conclude that the treatment under study made no difference
Ethical Reasons
• An undersized study can expose subjects to potentially harmful treatments without the capability to advance knowledge
• An oversized study has the potential to expose an unnecessarily large number of subjects to potentially harmful treatments
• Or lead to wrong conclusions
Economic Reasons
• Undersized study is a waste of resources due to its inability to yield useful results
• Oversized study may result in statistically significant result with doubtful clinical importance leading to waste of resources
Classic Approaches to Sample Size Calculation
• Precision analysis
– Bayesian: Bayesian inference is a method of statistical inference in which Bayes' theorem is used to update the probability for a hypothesis as more evidence or information becomes available
– Frequentist: a type of statistical inference that draws conclusions from sample data by emphasizing the frequency or proportion of the data
• Power analysis
– Most common
What Is Statistical Power? Essential concepts
• The null hypothesis Ho
• Significance level, α
• Type I error
• Type II error
Statistical Hypothesis Testing
• When you perform a statistical hypothesis test, there are four possible outcomes, depending on:
– Whether the null hypothesis (Ho) is true or false
– Whether you decide either to reject, or else to retain, provisional belief in Ho
Statistical Hypothesis Testing
Decision | Ho is really true (i.e., there is really no effect to find) | Ho is really false (i.e., there really is an effect to be found)
Retain Ho | Correct decision: prob = 1 - α | Type II error: prob = β
Reject Ho | Type I error: prob = α | Correct decision: prob = 1 - β
Type I Error- When Ho Is True & It is Rejected
• When there really is no effect, but the statistical test comes out significant by chance, you make a Type I error.
• When Ho is true, the probability of making a Type I error is called alpha (α). This probability is the significance level associated with your statistical test.
Type II Error- When Ho is False but You Fail To Reject It
• When, in the population, there really is an effect, but your statistical test comes out non-significant, due to inadequate power and/or bad luck with sampling error, you make a Type II error.
• When Ho is false, (so that there really is an effect there waiting to be found) the probability of making a Type II error is called beta (β).
The Definition Of Statistical Power
• Statistical power is the probability of not missing an effect, due to sampling error, when there really is an effect to be found.
• Power is the probability (prob = 1 - β) of correctly rejecting Ho when it really is false.
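The meaning of α, β and power can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions: it uses a two-sided z-test of Ho: mean = 0 with a known standard deviation of 1, samples of 25, a true effect of 0.5 when Ho is false, and 2,000 simulated studies. None of these numbers come from the lecture.

```python
import random
from statistics import NormalDist, mean

def rejection_rate(true_mean, n=25, sigma=1.0, alpha=0.05, trials=2000, seed=0):
    """Share of simulated studies in which a two-sided z-test rejects Ho: mean = 0."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # critical value, about 1.96 for alpha = 0.05
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        z = mean(sample) / (sigma / n ** 0.5)      # test statistic under Ho
        if abs(z) > z_crit:
            rejections += 1
    return rejections / trials

type_1_rate = rejection_rate(true_mean=0.0)   # Ho true: every rejection is a Type I error
power = rejection_rate(true_mean=0.5)         # Ho false: every rejection is a correct decision
type_2_rate = 1 - power                       # Ho false: every non-rejection is a Type II error
print(type_1_rate, type_2_rate, power)
```

The simulated Type I error rate should land close to the chosen α of 0.05, while the Type II error rate depends on the effect size and the sample size.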
Calculating Statistical Power
Calculating statistical power depends on:
1. The sample size
2. The level of statistical significance required
3. The minimum size of effect that it is reasonable to expect.
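How these three ingredients combine can be sketched for one simple case. The function below assumes a two-sided one-sample z-test with a known standard deviation; this test, the function name and the example numbers are assumptions chosen for illustration, and other designs need other formulas.

```python
from statistics import NormalDist

def z_test_power(effect, sigma, n, alpha=0.05):
    """Power of a two-sided one-sample z-test to detect a mean shift of size `effect`."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)   # set by the significance level
    shift = effect / (sigma / n ** 0.5)          # effect size in standard-error units
    # Probability that the test statistic falls beyond either critical value.
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)

# Power rises with sample size for a fixed effect (0.5) and alpha (0.05).
for n in (10, 25, 50, 100):
    print(n, round(z_test_power(effect=0.5, sigma=1.0, n=n), 3))
```

Holding the effect size and significance level fixed, a larger sample gives higher power; a stricter significance level or a smaller expected effect lowers it.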
Sample Size Equations
• There are several equations for calculating sample size but we will discuss one common example here
Determining The Sample Size With a Specified Level Of Precision
Calculate an initial sample size using the following equation:
$$n = \frac{Z_\alpha^{2}\, s^{2}}{B^{2}}$$
n: The uncorrected sample size estimate.
Zα: The standard normal coefficient from the statistical table.
s: The standard deviation.
B: The desired precision level expressed as half of the maximum acceptable confidence interval width. This needs to be specified in absolute terms rather than as a percentage.
Recall that a confidence interval for the mean has the form $\bar{x} \pm z\,\sigma/\sqrt{n}$. Setting the half-width equal to B, so that $B = z\,\sigma/\sqrt{n}$, and solving for n gives:
$$n = \frac{z^{2}\sigma^{2}}{B^{2}}$$
Confidence level | Alpha (α) level | Zα
80% | 0.20 | 1.28
90% | 0.10 | 1.64
95% | 0.05 | 1.96
99% | 0.01 | 2.58
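The precision formula n = Zα² s² / B² can be applied as in the Python sketch below. The standard deviation of 15 and the precision of 3 are made-up example values, and the function names are hypothetical. The code computes Zα from the normal distribution rather than reading it from the table, and rounds n up to the next whole person.

```python
import math
from statistics import NormalDist

def z_for_confidence(confidence):
    """Two-sided standard normal coefficient for a given confidence level."""
    alpha = 1 - confidence
    return NormalDist().inv_cdf(1 - alpha / 2)

def sample_size_for_precision(s, B, confidence=0.95):
    """Uncorrected sample size n = Z^2 * s^2 / B^2, rounded up."""
    z = z_for_confidence(confidence)
    return math.ceil((z ** 2) * (s ** 2) / (B ** 2))

# Hypothetical example: standard deviation 15, desired precision of +/- 3 units.
for confidence in (0.80, 0.90, 0.95, 0.99):
    n = sample_size_for_precision(s=15, B=3, confidence=confidence)
    print(f"{confidence:.0%}: Z = {z_for_confidence(confidence):.2f}, n = {n}")
```

Higher confidence levels and tighter precision (smaller B) both increase the required sample size.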
References
• Cozby, P. C. (2004). Methods in behavioral research (8th Ed.). Mayfield Pub. Co. CA.
• http://open.lib.umn.edu/psychologyresearchmethods/ (Chapter 9, pages 165-167). Please refer to Sakai for the PDF version of this textbook.