sampling methods in research design

Bahir Dar University

Bahir Dar Institute of Technology

Faculty of Computing

Department of Computer Science

By: -

Tesfahunegn Minwuyelet

[email protected]

Sampling methods in research design

Probability Sampling

1. Simple random sampling

2. Systematic sampling

3. Stratified sampling

4. Cluster sampling

5. Multi-stage sampling

Non-probability Sampling

1. Convenience sampling

2. Judgmental sampling

3. Quota sampling

4. Snowball sampling

5. Self-selection sampling

6. Expert sampling


A research design is the arrangement of conditions for the collection and analysis of data in a manner that

aims to address the research problem.

We may split the overall research design into three:

The sampling design - which deals with the method of selecting items to be observed.

The statistical design - which concerns the question of how many items are to be

observed and how the information and data gathered are to be analysed.

The operational design - which deals with the techniques by which the procedures specified

in the sampling, statistical and observational designs can be carried out.

Two Types of Sampling Methods:

Probability sampling: members of the population have a known chance of being selected

Non-probability sampling: the chances of selecting members from the population are

unknown.

1. Simple Random Sampling

A sampling procedure that ensures that each element in the population will have an equal chance

of being included in the sample.

Selection process

Identify and define the population

Determine the desired sample size

List all members of the population

Assign all members on the list a consecutive number

Select an arbitrary starting point from a table of random numbers and read the appropriate

number of digits
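The selection process above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed procedure: a pseudo-random number generator stands in for the table of random numbers, and the function name `simple_random_sample`, the `seed` parameter and the example population are assumptions made for the sketch.

```python
import random


def simple_random_sample(population, sample_size, seed=None):
    """Draw sample_size members without replacement, each with equal chance."""
    if sample_size > len(population):
        raise ValueError("sample size cannot exceed population size")
    rng = random.Random(seed)  # stands in for the table of random numbers
    # Assign every listed member a consecutive number 1..N.
    numbered = dict(enumerate(population, start=1))
    # Pick sample_size distinct numbers, then look up the members they identify.
    chosen_numbers = rng.sample(sorted(numbered), sample_size)
    return [numbered[number] for number in chosen_numbers]


# Hypothetical population list of 200 members and a desired sample of 20.
population = [f"member_{i:03d}" for i in range(1, 201)]
sample = simple_random_sample(population, 20, seed=42)
```

Fixing the seed only makes the draw repeatable for checking; leaving it as `None` gives a fresh draw each time.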

Advantage

o Easy to implement with random dialing

o Easy to conduct

o High probability of achieving a representative sample

o Meets the assumptions of many statistical procedures.

Disadvantage

o Requires list of population elements.

o Time consuming

o Larger sample needed

o Produces larger errors

o High cost

o Identification of all members of the population can be difficult

o Contacting all members of the sample can be difficult

2. Systematic Sampling

Selecting every kth subject from a list of the members of the population

A simple process

Every kth name from the list will be drawn

Selection process



Obtain a list of the population

Determine what k is equal to by dividing the size of the population by the desired

sample size

Start at some random place in the population list

Take every kth individual on the list
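As a rough sketch of these steps in Python: the interval k is the population size divided by the desired sample size, and every kth member is taken from a random start. Restricting the random start to the first k positions is an assumption of this sketch (it guarantees the full sample fits in the list); the function name and example list are also illustrative.

```python
import random


def systematic_sample(population, sample_size, seed=None):
    """Take every kth member of the list, starting from a random position."""
    n = len(population)
    if not 0 < sample_size <= n:
        raise ValueError("sample size must be between 1 and the population size")
    k = n // sample_size              # sampling interval
    rng = random.Random(seed)
    start = rng.randrange(k)          # random start within the first interval
    return population[start::k][:sample_size]


# Hypothetical list of 200 numbered members, desired sample of 20, so k = 10.
member_ids = list(range(1, 201))
systematic_ids = systematic_sample(member_ids, 20, seed=1)
```

If the list itself has a repeating pattern whose period matches k, the selected members will all fall at the same point of the cycle, which is the periodicity problem listed among the disadvantages.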

Advantage

o Simple to design

o Easier to conduct than simple random sampling

o Easy to determine sampling distribution of mean or proportion.

Disadvantage

o Periodicity within population may skew sample and results.

o Trends in list may bias results

o Moderate cost

3. Stratified sampling

Probability sample

Subsamples are drawn within different strata

Members of each stratum are more or less alike on some characteristic

Do not confuse with quota sample

The population is divided into two or more groups called strata, according to

some criterion, such as geographic location, grade level, age, or income, and

subsamples are randomly selected from each stratum.

Selection process



Identify the variable and subgroups (i.e., strata) for which you want to guarantee

appropriate representation

Classify all members of the population as members of one of the identified

subgroups
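A minimal Python sketch of stratified selection follows. It classifies each member into a stratum and then draws a random subsample within each one, as described in the definition above. The proportional allocation by a single `fraction`, the function name `stratified_sample` and the grade-level data are assumptions chosen for illustration; a non-proportional design would set the size of each stratum separately.

```python
import random
from collections import defaultdict


def stratified_sample(population, stratum_of, fraction, seed=None):
    """Randomly sample the same fraction of members from every stratum."""
    rng = random.Random(seed)
    # Classify all members of the population into their strata.
    strata = defaultdict(list)
    for member in population:
        strata[stratum_of(member)].append(member)
    # Draw a simple random subsample within each stratum (at least one member).
    sample = {}
    for name, members in strata.items():
        size = max(1, round(len(members) * fraction))
        sample[name] = rng.sample(members, size)
    return sample


# Hypothetical population: 60, 30 and 10 students in three grade levels.
students = (
    [("grade_9", i) for i in range(60)]
    + [("grade_10", i) for i in range(30)]
    + [("grade_11", i) for i in range(10)]
)
by_grade = stratified_sample(students, lambda s: s[0], fraction=0.1, seed=7)
```

With a 10% fraction the strata contribute 6, 3 and 1 members, so each grade level appears in the sample in proportion to its share of the population.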

Advantage

o Control of sampling size in strata

o More accurate sample

o Can be used with both proportional and non-proportional samples

o Increased statistical efficiency

o Provides data to represent and analyze subgroups

o Enables use of different methods in strata

Disadvantage

o Increased error if subgroups are selected at different rates and identifying

members of all subgroups can be difficult

o Especially expensive if strata in the population must be created

o High cost

o Identification of all members of the population can be difficult

4. Cluster sampling

The purpose of cluster sampling is to sample economically while retaining the

characteristics of a probability sample.

The primary sampling unit is no longer the individual element in the population

The primary sampling unit is a larger cluster of elements located in proximity to

one another

The process of randomly selecting intact groups, not individuals, within the

defined population sharing similar characteristics.

Clusters are locations within which an intact group of members of the population

can be found.

Selection process



Identify and define a logical cluster

List all clusters that make up the population of clusters

Estimate the average number of population members per cluster

Determine the number of clusters needed by dividing the sample size by the

estimated size of a cluster

Randomly select the needed number of clusters

Include in the study all individuals in each selected cluster
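The cluster selection steps can be sketched as follows. This is an illustrative reading of the procedure: rounding the number of clusters up with `ceil` is an assumption of the sketch, and the function name `cluster_sample` and the school data are hypothetical.

```python
import math
import random


def cluster_sample(clusters, desired_sample_size, seed=None):
    """Randomly select whole clusters and include every member of each."""
    rng = random.Random(seed)
    # Estimate the average number of population members per cluster.
    average_size = sum(len(members) for members in clusters.values()) / len(clusters)
    # Number of clusters needed = sample size / estimated cluster size (rounded up).
    needed = min(len(clusters), math.ceil(desired_sample_size / average_size))
    # Randomly select intact clusters, not individuals.
    chosen = rng.sample(sorted(clusters), needed)
    members = [member for name in chosen for member in clusters[name]]
    return chosen, members


# Hypothetical population: 10 schools (clusters) of 30 students each.
schools = {f"school_{i}": [f"s{i}_{j}" for j in range(30)] for i in range(10)}
chosen_schools, sampled_students = cluster_sample(schools, 90, seed=3)
```

Only the list of clusters is needed up front; individual members are enumerated only for the clusters that were selected.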

Advantage

o Provides an unbiased estimate of population parameters if properly done.

o Economically more efficient than simple random

o Easy to do without list

o Very useful when populations are large and spread over a large geographic

region

o Convenient and expedient

o Do not need the name of everyone in the population

Disadvantage

o Often lower statistical efficiency due to subgroups being homogeneous rather

than heterogeneous

o Moderate cost

o Representation is likely to become an issue

5. Multistage sampling

Can substantially reduce sampling costs, where the complete population list

would need to be constructed (before other sampling methods could be applied).

By eliminating the work involved in describing clusters that are not selected,

multistage sampling can reduce the large costs associated with traditional cluster

sampling. However, each sample may not be fully representative of the whole

population.

6. Convenience sampling

The process of including whoever happens to be available at the time; also called

“accidental” or “haphazard” sampling.

Advantage

o A sample selected for ease of access, immediately known population group

and response rate.

Disadvantage

o Difficulty in determining how much of the effect (dependent variable) results

from the cause (independent variable)

o Cannot generalize findings (do not know what population group the sample is

representative of) so cannot move beyond describing the sample.

o Problems of reliability

o Uncertain whether respondents represent the target population

o Results are not generalizable

7. Purposive sampling

The process whereby the researcher selects a sample based on experience or

knowledge of the group to be sampled; also called “judgment” sampling

Advantage

o Based on an experienced person's judgment

Disadvantage

o Potential for inaccuracy in the researcher’s criteria and resulting sample

selection

o Cannot measure the representativeness of the sample


8. Quota sampling

The process whereby a researcher gathers data from individuals possessing

identified characteristics and quotas.

Non-proportionate quota sampling

o Use when it is important to ensure that a number of sub-groups in the field of

study are well-covered.

o Use when you want to compare results across sub-groups.

o Use when there is likely to be a wide variation in the studied characteristic within

minority groups.

Method

o Identify sub-groups from which you want to ensure sufficient coverage.

o Specify a minimum sample size from each sub-group.

Example

A study of the prosperity of ethnic groups across a city specifies that a minimum

of 50 people in ten named groups must be included in the study. The distribution

of incomes across each ethnic group is then compared against one another.

Discussion

In proportionate quota sampling, the sample size from each sub-group is

proportionate to the size of the sub-group in relation to the overall population.

The non-proportionate method does not do this balancing, perhaps because the

exact proportions are not known. In the proportionate method, if the sub-group is

2% of the population and 100 people are being studied, it may be feared that

sampling only 2 people in a group may give results that are not typical for that

group. A minimum of 10 people taken from each sub-group would reduce the

chance of non-typical people biasing the results.
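The arithmetic in this discussion can be illustrated with a short Python sketch. It computes proportionate quotas and then raises any quota that falls below a minimum, matching the 2% sub-group and the minimum of 10 people mentioned above. The function name `quota_allocation` and the three group shares are assumptions made for the example.

```python
def quota_allocation(group_shares, total_sample, minimum=0):
    """Proportionate quota per sub-group, raised to at least `minimum`."""
    return {
        group: max(minimum, round(share * total_sample))
        for group, share in group_shares.items()
    }


# Hypothetical sub-group shares of the population; group_c is the 2% group.
shares = {"group_a": 0.60, "group_b": 0.38, "group_c": 0.02}

proportionate = quota_allocation(shares, 100)
with_minimum = quota_allocation(shares, 100, minimum=10)
```

Under the purely proportionate quotas `group_c` gets 2 people; with the minimum it gets 10, and the total rises from 100 to 108 because the other quotas are left unchanged in this sketch.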

Advantage

o Contains specific subgroups in the proportions desired

o May reduce bias

o Easy to manage, quick

o Used when research budget is limited

o Very extensively used/understood

o No need for list of population elements

Disadvantages

o People who are less accessible (more difficult to contact, more reluctant to

participate) are under-represented

o Only the selected traits of the population are taken into account in forming the

subgroups

o Dependent on subjective decisions

o Not possible to generalize

o Only reflects population in terms of the quota, possibility of bias in selection, no

standard error

o Time consuming

o Project data beyond sample not justified

o Variability and bias cannot be measured/controlled

9. Snowball sampling

It is used when you do not know the best people to study because of the unfamiliarity of the

topic or the complexity of events, so you ask participants during interviews to suggest

other individuals to be sampled.

It can be classified into three

o Linear snowball sampling

o Exponential non-discriminative snowball sampling

o Exponential discriminative snowball sampling
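The chain-referral idea can be sketched in Python as a walk over a referral network. The notes do not define the three variants, so this sketch rests on a common reading that is stated here as an assumption: in linear snowball sampling each participant contributes one new referral, while in exponential non-discriminative sampling every referral is followed. The function name, the `referrals` dictionary and the people in it are hypothetical.

```python
from collections import deque


def snowball_sample(seeds, referrals, referrals_per_person, max_size):
    """Recruit participants by following referrals from the initial seeds."""
    sample = []
    seen = set(seeds)
    queue = deque(seeds)
    while queue and len(sample) < max_size:
        person = queue.popleft()
        sample.append(person)  # "interview" this participant
        # Ask the participant to suggest others who are not yet known.
        suggested = [p for p in referrals.get(person, []) if p not in seen]
        for new_person in suggested[:referrals_per_person]:
            seen.add(new_person)
            queue.append(new_person)
    return sample


# Hypothetical referral network: who each participant would suggest.
referrals = {"A": ["B", "C"], "B": ["D", "E"], "C": ["F"]}

linear = snowball_sample(["A"], referrals, referrals_per_person=1, max_size=10)
exponential = snowball_sample(["A"], referrals, referrals_per_person=2, max_size=10)
```

Here the linear chain reaches A, B and D, while following both referrals reaches all six people. Neither result is a probability sample, since inclusion depends on who the earlier participants happen to know.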

Advantage

o The chain referral process allows the researcher to reach population that is difficult to

sample when using other sampling methods.

o Cheap, simple and cost-efficient.

o Little planning and a smaller workforce are needed compared to other sampling techniques

o Useful for identifying small, hard-to-reach, uniquely defined target populations

o Useful in qualitative research

Disadvantage

o Little control over the sampling method

o Representativeness of the sample is not guaranteed

o Bias can be present

o Limited generalizability

o Not representative of the population and may well result in a biased sample as it is

self-selecting.

10. Judgmental sampling

Involves selecting a group of people because they have particular traits that the

researcher wants to study

This type of sampling technique is also known as purposive sampling and

authoritative sampling

Advantage

o There is an assurance of quality response

o Meet the specific objective

Disadvantage

o Time-consuming process

11. Multi-stage random sampling

Cluster sampling repeated at a number of levels.

Carried out in stages

Using smaller and smaller sampling units at each stage
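A two-stage version of this idea can be sketched in Python: clusters are selected at random first, and then individuals are selected at random within each chosen cluster. Limiting the sketch to two stages, along with the function name `two_stage_sample` and the district and household data, is an assumption made for illustration; further stages would repeat the same pattern on smaller units.

```python
import random


def two_stage_sample(clusters, n_clusters, n_per_cluster, seed=None):
    """Stage 1: random clusters. Stage 2: random members within each cluster."""
    rng = random.Random(seed)
    # Stage 1: select the larger sampling units (clusters).
    chosen = rng.sample(sorted(clusters), n_clusters)
    # Stage 2: select smaller sampling units (members) inside each chosen cluster.
    sample = {}
    for name in chosen:
        members = clusters[name]
        sample[name] = rng.sample(members, min(n_per_cluster, len(members)))
    return sample


# Hypothetical population: 8 districts with 50 households each.
districts = {f"district_{i}": [f"hh_{i}_{j}" for j in range(50)] for i in range(8)}
staged = two_stage_sample(districts, n_clusters=3, n_per_cluster=10, seed=5)
```

Only the three selected districts need a household list, which is where the cost saving over a full population list comes from.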

Advantage

o More accurate

o More effective

Disadvantage

o Costly

o Each stage of sampling introduces sampling error: the more stages there

are, the more error there tends to be

12. Expert sampling

Expert sampling involves the assembling of a sample of persons with known or demonstrable

experience in some area. Often, we convene such a sample under the auspices of a

"panel of experts." There are actually two reasons you might do expert sampling.

First, because it would be the best way to elicit the views of persons who have

specific expertise. In this case, expert sampling is essentially just a specific subcase

of purposive sampling. But the other reason you might use expert sampling is to

provide evidence for the validity of another sampling approach you've chosen. For

instance, let's say you do modal instance sampling and are concerned that the criteria

you used for defining the modal instance are subject to criticism.

Advantage

o The advantage of doing this is that you aren't out on your own trying to defend

your decisions -- you have some acknowledged experts to back you.

Disadvantage

o The disadvantage is that even the experts can be, and often are, wrong.

13. Steps in Sampling Process

Defining the population

Specifying the sampling unit

Specifying the sampling frame (the means of representing the elements of the

population)

Specifying the sampling method

Determining the sample size

Specifying the sampling plan

Selecting the sample

Sample size

The determination of sample size is a common task for many organizational researchers.

Inappropriate, inadequate, or excessive sample sizes continue to influence the quality and

accuracy of research. A discussion and illustration of sample size formulas, including the

formula for adjusting the sample size for smaller populations, is included. A table is provided

that can be used to select the sample size for a research problem based on three alpha levels and

a set error rate. Procedures for determining the appropriate sample size for multiple regression

and factor analysis, and common issues in sample size determination are examined. Formulas,

tables, and power function charts are well known approaches to determine sample size.

Foundations for Sample Size Determination

1. Primary Variables of Measurement

The researcher must make decisions as to which variables will be incorporated into formula

calculations. For example, if the researcher plans to use a seven-point scale to measure a

continuous variable, e.g., job satisfaction, and also plans to determine if the respondents

differ by certain categorical variables, e.g., gender, tenured, educational level, etc., which

variable(s) should be used as the basis for sample size? This is important because the use of

gender as the primary variable will result in a substantially larger sample size than if one

used the seven-point scale as the primary variable of measure.

2. Error Estimation: the sample size formula uses two key factors:

(1) the risk the researcher is willing to accept in the study, commonly called the margin

of error, or the error the researcher is willing to accept, and

(2) the alpha level, the level of acceptable risk the researcher is willing to accept that the

true margin of error exceeds the acceptable margin of error; i.e., the probability that

differences revealed by statistical analyses really do not exist; also known as Type I error.

Another type of error will not be addressed further here, namely, Type II error, also

known as beta error. Type II error occurs when statistical procedures result in a judgment

of no significant differences when these differences do indeed exist.

3. Variance Estimation

There are four ways of estimating population variances for sample size determinations:

take the sample in two steps, and use the results of the first step to determine

how many additional responses are needed to attain an appropriate sample size

based on the variance observed in the first step data;

use pilot study results;

use data from previous studies of the same or a similar population; or

estimate or guess the structure of the population assisted by some logical

mathematical results.

http://en.wikipedia.org/wiki/Sample_size

A researcher typically needs to estimate the variance of scaled and categorical variables.

$$S = \frac{\text{number of points on the scale}}{\text{number of standard deviations}}$$
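As a quick numeric check of this estimate, the sketch below divides the number of scale points by the number of standard deviations assumed to span the scale. The default of 6 standard deviations is an assumption of this sketch; it is chosen because, for a seven-point scale, it reproduces the value s = 1.167 used in the continuous-data example of these notes. The function name is illustrative.

```python
def estimate_scale_sd(points_on_scale, sds_spanning_scale=6):
    """Estimate the standard deviation of a scaled variable from its range."""
    return points_on_scale / sds_spanning_scale


# Seven-point scale, assuming 6 standard deviations span the whole scale.
s = estimate_scale_sd(7)
```

The result is approximately 1.167.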

Basic Sample Size Determination

Continuous Data

Before proceeding with sample size calculations, assuming continuous data, the researcher

should determine if a categorical variable will play a primary role in data analysis. If so, the

categorical sample size formulas should be used. If this is not the case, the sample size formulas

for continuous data described in this section are appropriate.

$$n_0 = \frac{t^2 \, s^2}{d^2}$$

Where t = value for selected alpha level of .025 in each tail = 1.96

Where s = estimate of standard deviation in the population = 1.167.

Where d = acceptable margin of error for mean being estimated = .21.

Therefore, for a population of 1,679, the required sample size is 118. However, since this sample

size exceeds 5% of the population (1,679*.05=84), Cochran’s (1977) correction formula should

be used to calculate the final sample size. These calculations are as follows:

$$n_1 = \frac{n_0}{1 + n_0 / \text{Population}} = \frac{118}{1 + 118/1679} = 111$$

Where population size = 1,679.

Where n0 = required return sample size according to Cochran’s formula = 118.

Where n1 = required return sample size because sample > 5% of population.
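The continuous-data calculation can be reproduced with a short Python sketch using the values given in the text (t = 1.96, s = 1.167, d = .21, population = 1,679). The function names are assumptions made for illustration. Note that the unrounded initial value comes out slightly above 118; the text reports 118, and the sketch takes that reported figure as the input to the correction step.

```python
import math


def cochran_continuous(t, s, d):
    """Initial sample size for continuous data: n0 = t^2 * s^2 / d^2."""
    return (t ** 2) * (s ** 2) / (d ** 2)


def finite_population_correction(n0, population):
    """Corrected sample size: n1 = n0 / (1 + n0 / population)."""
    return n0 / (1 + n0 / population)


n0_raw = cochran_continuous(t=1.96, s=1.167, d=0.21)   # a little over 118
needs_correction = 118 > 0.05 * 1679                   # sample exceeds 5% of population
n1_raw = finite_population_correction(118, 1679)       # about 110.25
n1 = math.ceil(n1_raw)                                 # rounded up to whole respondents
```

Rounding the corrected value up gives the final sample size of 111 reported in the text.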

Categorical Data

The sample size formulas and procedures used for categorical data are very similar, but some

variations do exist. Assume a researcher has set the alpha level a priori at .05, plans to use a

proportional variable, has set the level of acceptable error at 5%, and has estimated the standard

deviation of the scale as .5.

$$n_0 = \frac{t^2 \, p \, q}{d^2} = \frac{(1.96)^2 (.5)(.5)}{(.05)^2} = 384$$

Where t = value for selected alpha level of .025 in each tail = 1.96 (an overall alpha level of .05).

Where (p)(q) = estimate of variance = .25

Where d = acceptable margin of error for the proportion being estimated = .05

Therefore, for a population of 1,679, the required sample size is 384. However, since this

sample size exceeds 5% of the population (1,679*.05=84), Cochran’s (1977) correction

formula should be used to calculate the final sample size. These calculations are as follows:

$$n_1 = \frac{n_0}{1 + n_0 / \text{Population}} = \frac{384}{1 + 384/1679} = 313$$

Where population size = 1,679

Where n0 = required return sample size according to Cochran’s formula = 384

Where n1 = required return sample size because sample > 5% of population
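The categorical-data calculation can be checked the same way. The sketch below uses the values from the text (t = 1.96, p = q = .5, d = .05, population = 1,679); the function names are assumptions made for illustration, and rounding the initial size to the nearest whole number and the corrected size up are choices of this sketch that reproduce the reported figures.

```python
import math


def cochran_categorical(t, p, d):
    """Initial sample size for categorical data: n0 = t^2 * p * q / d^2."""
    q = 1 - p
    return (t ** 2) * p * q / (d ** 2)


def finite_population_correction(n0, population):
    """Corrected sample size: n1 = n0 / (1 + n0 / population)."""
    return n0 / (1 + n0 / population)


n0_cat = round(cochran_categorical(t=1.96, p=0.5, d=0.05))          # 384
n1_cat = math.ceil(finite_population_correction(n0_cat, 1679))      # 313
```

Using p = .5 gives the largest possible variance (.25), so the resulting sample size is conservative when the true proportion is unknown.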

Steps for using sample size tables

1. Postulate the effect size of interest, α, and β.

2. Check sample size table

2.1. Select the table corresponding to the selected α

2.2. Locate the row corresponding to the desired power

2.3. Locate the column corresponding to the estimated effect size.

2.4. The intersection of the column and row is the minimum sample size required.
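Where no printed table is at hand, a small table of this kind can be generated. The sketch below is one possible way to do so and rests on assumptions that the notes do not state: a two-sided comparison of two group means, a standardized effect size (Cohen's d), and the normal approximation n = 2((z_{1-α/2} + z_{power}) / d)^2 per group. Published tables based on the t distribution may give slightly larger values. The function and variable names are illustrative.

```python
import math
from statistics import NormalDist


def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate minimum sample size per group (normal approximation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_power = NormalDist().inv_cdf(power)          # power = 1 - beta
    return math.ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


# One table for the selected alpha: rows are power, columns are effect size.
alpha = 0.05
powers = [0.80, 0.90]
effect_sizes = [0.2, 0.5, 0.8]
table = {
    power: {d: n_per_group(d, alpha, power) for d in effect_sizes}
    for power in powers
}

# Steps 2.2 to 2.4: look up the row for the desired power and the column for
# the estimated effect size.
minimum_n = table[0.80][0.5]
```

Reading the table follows the steps above: the row for the desired power and the column for the estimated effect size intersect at the minimum sample size per group, and smaller effects or higher power both push that number up.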

Figure 1. Determining the sample size using a table
