Unit 2


UNIT-2: Sampling (Theory Only):

Points to be covered:

• Terminology: population, Sample, Parameter, Statistics.

• Characteristics of ideal sample, population survey, sample survey, sampling errors.

• Sampling methods: procedure, merits and demerits of (i) Simple Random Sampling, (ii) Stratified Random Sampling, (iii) Systematic Sampling, (iv) Cluster Sampling.

Population

• A population is any entire collection of people, animals, plants or things from which we may collect data.

• It is the entire group we are interested in, which we wish to describe or draw conclusions about.

• It is important that the investigator carefully and completely defines the population before collecting the sample, including a description of the members to be included.

Sample:

• A sample is a group of units selected from a larger group (the population). By studying the sample it is hoped to draw valid conclusions about the larger group.

• A sample is generally selected for study because the population is too large to study in its entirety.

• The sample should be representative of the general population.

• Also, before collecting the sample, it is important that the researcher carefully and completely defines the population, including a description of the members to be included.

Parameter:

• A parameter is a value, usually unknown (and which therefore has to be estimated), used to represent a certain population characteristic.

• For example, the population mean is a parameter that is often used to indicate the average value of a quantity.

Statistics

• A statistic is a quantity that is calculated from a sample of data. It is used to give information about unknown values in the corresponding population.

• For example, the average of the data in a sample is used to give information about the overall average in the population from which that sample was drawn.

• Statistics are often assigned Roman letters (e.g. m and s), whereas the equivalent unknown values in the population (parameters) are assigned Greek letters (e.g. µ and σ).
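The distinction between a parameter and a statistic can be illustrated with a small simulation. This is only a sketch: the population of marks is made up, and the names `population`, `mu`, `m` and `s` are chosen here for illustration. In a real study the parameter would not be available for comparison.

```python
import random
import statistics

# Hypothetical population: marks (0-100) of 1,000 students, made up for illustration.
rng = random.Random(0)
population = [rng.randint(0, 100) for _ in range(1000)]

# Parameter: computed from the whole population (usually unknown in practice).
mu = statistics.mean(population)

# Statistics: computed from a sample and used to estimate the parameters.
sample = rng.sample(population, 50)
m = statistics.mean(sample)    # sample mean, an estimate of mu
s = statistics.stdev(sample)   # sample standard deviation, an estimate of sigma

print(f"parameter mu = {mu:.2f}; statistics m = {m:.2f}, s = {s:.2f}")
```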

Characteristics of a good sample:

1. The sample must be representative of the population, i.e. all the characteristics of the population should be present in the sample.

2. Each unit of the population must be independent.

3. All units of the sample should be selected during the same period of time.

4. No unit of the population should be favoured at the time of selecting a sample.

5. The number of units in the sample should be adequate. If too few units are selected, the desired accuracy cannot be achieved. If too many units are selected, the expenditure, time etc. will be greater.

Population Survey and Sample Survey

| Sr. No. | Population Survey | Sample Survey |
|---|---|---|
| 1 | In a population study all units are examined, and hence it requires more time. | In a sample study only a few units are examined, and hence it requires less time. |
| 2 | The cost of the survey is more. | The cost of the survey is less. |
| 3 | As more observations are to be studied, the accuracy cannot be maintained. | As fewer observations are to be studied, the desired accuracy can be maintained. |
| 4 | As more persons are to be employed in the survey work, experts may not be available. | As few persons are to be employed, experts can be appointed. |
| 5 | As all observations are to be studied, a detailed study cannot be done. | As few observations are to be studied, a detailed study can be done. |
| 6 | When the field of inquiry is very large, this method is lengthy and difficult. | The sample survey is less lengthy and relatively easy. |
| 7 | When complete information is needed, this method is to be used. | When complete information on all units is required, this method cannot be used. |

Sampling Error

• “Sampling error is the error that arises in a data collection process as a result of taking a sample from a population rather than using the whole population.

• Sampling error is one of two reasons for the difference between an estimate of a population parameter and the true, but unknown, value of the population parameter.

• The sampling error for a given sample is unknown but when the sampling is random, for some estimates (for example, sample mean, sample proportion) theoretical methods may be used to measure the extent of the variation caused by sampling error.”
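As a rough illustration of these points, the sketch below simulates a population so that the true mean is known, draws one random sample, and compares the actual sampling error with the estimated standard error of the mean, s / sqrt(n). The simulated population and all variable names are assumptions made for this example, and the finite population correction is ignored.

```python
import math
import random
import statistics

rng = random.Random(1)

# Simulated population (an assumption for illustration); its mean is
# known here only because we generated every value ourselves.
population = [rng.gauss(50, 10) for _ in range(10_000)]
mu = statistics.fmean(population)

n = 100
sample = rng.sample(population, n)   # simple random sample without replacement
m = statistics.fmean(sample)

# Actual sampling error of this sample (unknown in a real survey).
sampling_error = m - mu

# Estimated standard error of the sample mean, computed from the sample alone.
standard_error = statistics.stdev(sample) / math.sqrt(n)

print(f"sampling error = {sampling_error:.3f}, standard error = {standard_error:.3f}")
```

The standard error does not give the error of this particular sample; it indicates how much sample means typically vary from one random sample to another.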

Sampling Methods

(1) Simple Random Sampling:

• If each and every unit of the population is given an equal chance to enter into the sample, the method of sampling is known as simple random sampling.

• If the given population is homogeneous, this method of sampling gives very reliable results, but if the population is not homogeneous, the method does not give a representative sample.

• Random sampling plays a very important role in higher studies.

Methods of Drawing a random sample

(1) Lottery Method:

• This is a very popular method of drawing a random sample.

• In this method the names or numbers of all the units of the population are written on different slips of paper.

• These slips should be identical in shape, size and color.

• The slips are folded up, placed in a container and mixed up.

• From the container, a number of slips equal to the desired size of the sample is chosen at random.

• The names or numbers on the slips constitute a random sample.

• In this method there is no partiality or prejudice in the selection of any unit of the population.
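The lottery procedure above can be mimicked in a few lines of Python. This is a minimal sketch, not a prescribed implementation: the function name `lottery_sample` and the list of 60 students are hypothetical.

```python
import random

def lottery_sample(units, sample_size, seed=None):
    """Draw a simple random sample in the manner of the lottery method."""
    rng = random.Random(seed)
    slips = list(units)          # one slip per unit of the population
    rng.shuffle(slips)           # mix the slips in the container
    return slips[:sample_size]   # draw the required number without replacement

# Hypothetical population of 60 students.
students = [f"Student {i}" for i in range(1, 61)]
chosen = lottery_sample(students, 10, seed=42)
print(chosen)
```

Because the slips are shuffled uniformly, every unit has the same chance of being among the first `sample_size` slips.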

(2) Using random numbers tables:

• The lottery method of selecting a random sample becomes tedious when the population is very large.

• In such cases a random sample can be drawn by using tables of random numbers.

• Random number tables are scientifically prepared and they are free from personal bias.

• Commonly used random number tables are (i) Tippett's random number tables, (ii) Fisher and Yates's random numbers, (iii) the tables of random numbers by the RAND Corporation.
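One common way of using such a table, sketched below under stated assumptions, is to number the units 1 to N, read the digits in groups as wide as N, and skip any group that is out of range or already chosen. The digit string here is made up for illustration and is not taken from any published table; the function name is hypothetical.

```python
def select_with_random_digits(digits, population_size, sample_size):
    """Pick unit numbers 1..population_size by reading fixed-width digit groups."""
    width = len(str(population_size))   # e.g. 2 digits for a population of 80
    chosen = []
    for start in range(0, len(digits) - width + 1, width):
        number = int(digits[start:start + width])
        # Skip numbers outside the population and numbers already selected.
        if 1 <= number <= population_size and number not in chosen:
            chosen.append(number)
        if len(chosen) == sample_size:
            break
    return chosen

# Made-up digits standing in for one row of a random number table.
digits = "299552291366410782"
sample_units = select_with_random_digits(digits, population_size=80, sample_size=5)
print(sample_units)   # [29, 52, 13, 66, 41]  (95 is out of range, the second 29 is a repeat)
```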

Advantages of random sampling:

1. As all the members of the population have equal chance of being selected in the sample, the selection is without any partiality or prejudice.

2. A random sample is a representative sample. As the size of the sample increases, the reliability of the result also increases.

3. The error made in drawing conclusions about the population from the results of the sample can be estimated.

4. The reliability of the results can also be checked by this method.
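The second advantage, that reliability improves as the sample size grows, can be checked by simulation. The sketch below draws many random samples of different sizes from a made-up population and measures how much the sample means vary; the population, the sample sizes and the function name are all assumptions for illustration.

```python
import random
import statistics

rng = random.Random(7)

# Hypothetical population of 5,000 values.
population = [rng.uniform(0, 100) for _ in range(5000)]

def spread_of_sample_means(n, repeats=300):
    """Standard deviation of the means of `repeats` random samples of size n."""
    means = [statistics.fmean(rng.sample(population, n)) for _ in range(repeats)]
    return statistics.pstdev(means)

spread_by_size = {n: spread_of_sample_means(n) for n in (10, 40, 160)}
for n, spread in spread_by_size.items():
    print(f"n = {n:3d}: spread of sample means = {spread:.2f}")
```

Larger samples give sample means that cluster more tightly around the population mean, which is what "more reliable" means here.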

Limitations of Random Sampling

1. A list of all the units of the population is required. In absence of such a list, the method cannot be used.

2. S.R.S. can give reliable results only when the population is homogeneous.

3. The work of preparing slips or giving numbers is tedious and tiresome particularly when the population is large.

4. At times, the results obtained by the method are biased.

5. When the population is very small, this method fails to give reliable results.

(2) Stratified Random Sampling:

• If the population is not homogeneous, i.e. there is considerable variation among the units of the population, a simple random sample may not be representative.

• In this case, the population is divided into different parts, such that each part is internally homogeneous.

• These homogeneous parts of the population are known as strata.

• Simple random samples are independently drawn from different strata and are combined into a single sample known as a Stratified Random Sample.

• The method of taking a sample in this way is called Stratified Random Sampling.

For example:

• To determine the standard of English of the students of a commerce college, we can divide the students into three different strata, viz. students of the F.Y. class, students of the S.Y. class and students of the T.Y. class.

• From each class, i.e. from each stratum, a random sample is drawn independently.

• When these samples are combined into a single sample we get a stratified random sample.
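The college example can be sketched in code as follows. The class sizes are hypothetical, and the sketch assumes proportional allocation (the same sampling fraction in every stratum), which is only one of several possible ways to decide how many units to take from each stratum.

```python
import random

def stratified_sample(strata, fraction, seed=None):
    """Draw an independent simple random sample from each stratum and combine them."""
    rng = random.Random(seed)
    combined = []
    for name, units in strata.items():
        size = max(1, round(len(units) * fraction))   # proportional allocation (assumed)
        combined.extend((name, unit) for unit in rng.sample(units, size))
    return combined

# Hypothetical class lists for the three strata.
strata = {
    "F.Y.": [f"FY-{i}" for i in range(1, 201)],   # 200 students
    "S.Y.": [f"SY-{i}" for i in range(1, 151)],   # 150 students
    "T.Y.": [f"TY-{i}" for i in range(1, 101)],   # 100 students
}
sample = stratified_sample(strata, fraction=0.1, seed=3)
print(len(sample))   # 45 students: 20 + 15 + 10
```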

Advantages:

1. As the population is divided into different strata, all the parts of the population are adequately represented.

2. Administrative convenience increases in this type of sampling.

3. Since each stratum is internally homogeneous, even a small sample from that stratum will give reliable information about the stratum.

4. Sampling problems differ in different parts of the population, and stratification allows each part to be handled separately.

5. If different standards of accuracy are required for different strata, this method is convenient.

Limitations:

1. It is always difficult to divide the population into homogeneous strata.

2. The results obtained by this method are not reliable if the stratification is not proper.

3. The mathematics involved in estimating the population characteristics from the sample results is difficult in this method.

4. Simple random samples are to be drawn from different strata. In absence of required number of efficient persons, the desired standard of accuracy cannot be attained.

(3) Systematic Sampling

• Systematic sampling is a random sampling technique which is frequently chosen by researchers for its simplicity and its periodic quality.

• In systematic random sampling, the researcher first randomly picks the first item or subject from the population. Then, the researcher selects every n'th subject from the list.

• The procedure involved in systematic random sampling is very easy and can be done manually.

• The results are representative of the population unless certain characteristics of the population are repeated for every n'th individual, which is highly unlikely.

• The process of obtaining the systematic sample is much like an arithmetic progression.

• Starting number: The researcher selects an integer that must be less than the total number of individuals in the population. This integer will correspond to the first subject.

• Interval: The researcher picks another integer which will serve as the constant difference between any two consecutive numbers in the progression.

• The integer is typically selected so that the researcher obtains the correct sample size.
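The starting number and interval can be put together as in the sketch below. It assumes one common convention that the text above does not spell out: the interval is the population size divided by the sample size (rounded down), and the random start is chosen within the first interval so that the progression never runs past the end of the list. The function name and the population of 100 units are hypothetical.

```python
import random

def systematic_sample(units, sample_size, seed=None):
    """Random start, then every k-th unit, like an arithmetic progression."""
    rng = random.Random(seed)
    interval = len(units) // sample_size   # constant difference k
    start = rng.randrange(interval)        # random starting index in [0, k)
    return [units[start + i * interval] for i in range(sample_size)]

# Hypothetical population of 100 numbered units.
population = list(range(1, 101))
sample = systematic_sample(population, 10, seed=5)
print(sample)
```

With 100 units and a sample of 10, the interval is 10, so a start at unit 4 would give units 4, 14, 24, and so on up to 94.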

Advantages of Systematic Sampling

• The main advantage of using systematic sampling over simple random sampling is its simplicity.

• It allows the researcher to add a degree of system or process into the random selection of subjects.

• Another advantage of systematic random sampling over simple random sampling is the assurance that the population will be evenly sampled.

• In simple random sampling there is a chance that the selected subjects are clustered in one part of the population.

• This possibility is systematically eliminated in systematic sampling.

Disadvantage of Systematic Sampling

• The process of selection can interact with a hidden periodic trait within the population.

• If the sampling technique coincides with the periodicity of the trait, the sampling technique will no longer be random and representativeness of the sample is compromised.
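A small made-up example shows how this can happen. The sketch assumes hypothetical daily sales that are higher on two days of every week; a sampling interval of 7 then always lands on the same kind of day, while an interval of 10 does not.

```python
# Hypothetical daily sales over 52 weeks: two days in every seven sell 200, the rest 100.
daily_sales = [200 if day % 7 in (5, 6) else 100 for day in range(364)]
true_mean = sum(daily_sales) / len(daily_sales)

def systematic_mean(values, start, interval):
    """Mean of a systematic sample taken at the given start and interval."""
    picked = values[start::interval]
    return sum(picked) / len(picked)

# Interval 7 coincides with the weekly cycle: every sampled day is a high-sales day.
mean_aligned = systematic_mean(daily_sales, start=5, interval=7)

# Interval 10 does not coincide with the cycle: all kinds of days are sampled.
mean_unaligned = systematic_mean(daily_sales, start=5, interval=10)

print(f"true mean = {true_mean:.1f}")
print(f"interval 7 = {mean_aligned:.1f}, interval 10 = {mean_unaligned:.1f}")
```

The estimate from the interval of 7 is far above the true mean, whereas the interval of 10 gives a value close to it.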

(4) Cluster Sampling:

• The population is divided into N groups, called clusters.

• The researcher randomly selects n clusters to include in the sample.

• The number of observations within each cluster, $M_i$, is known, and $M = M_1 + M_2 + M_3 + \dots + M_{N-1} + M_N$.

• Each element of the population can be assigned to one, and only one, cluster.

• Two types of cluster sampling are considered here.

• One-stage sampling: all of the elements within the selected clusters are included in the sample.

• Two-stage sampling: a subset of elements within each selected cluster is randomly selected for inclusion in the sample.
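The difference between the two types can be sketched as follows. The stores, customers and the function name `cluster_sample` are hypothetical; passing `per_cluster=None` is simply this sketch's way of requesting one-stage sampling.

```python
import random

def cluster_sample(clusters, n_clusters, per_cluster=None, seed=None):
    """Select n_clusters at random, then take all members or a random subset of each."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)   # stage 1: pick clusters
    sample = []
    for name in chosen:
        members = clusters[name]
        if per_cluster is None:
            sample.extend(members)                      # one-stage: keep every element
        else:
            size = min(per_cluster, len(members))
            sample.extend(rng.sample(members, size))    # stage 2: sample within cluster
    return chosen, sample

# Hypothetical population: 6 stores (clusters) with 20 customers each.
clusters = {f"Store {c}": [f"{c}-customer-{i}" for i in range(1, 21)] for c in "ABCDEF"}

stores_1, one_stage = cluster_sample(clusters, 2, seed=11)
stores_2, two_stage = cluster_sample(clusters, 2, per_cluster=5, seed=11)
print(len(one_stage), len(two_stage))   # 40 10
```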

When to Use Cluster Sampling

• Cluster sampling should be used only when it is economically justified - when reduced costs can be used to overcome losses in precision. This is most likely to occur in the following situations.

• Constructing a complete list of population elements is difficult, costly, or impossible.

• For example, it may not be possible to list all of the customers of a chain of hardware stores.

• However, it would be possible to randomly select a subset of stores (stage 1 of cluster sampling) and then interview a random sample of customers who visit those stores (stage 2 of cluster sampling).

• The population is concentrated in "natural" clusters (city blocks, schools, hospitals, etc.).

• For example, to conduct personal interviews of operating room nurses, it might make sense to randomly select a sample of hospitals (stage 1 of cluster sampling) and then interview all of the operating room nurses at those hospitals.

• Using cluster sampling, the interviewer could conduct many interviews in a single day at a single hospital.

• Simple random sampling, in contrast, might require the interviewer to spend all day traveling to conduct a single interview at a single hospital.

Advantages and Disadvantages of Cluster Sampling

• This sampling technique is cheap, quick and easy. Instead of sampling an entire country, as simple random sampling would require, the researcher can allocate his limited resources to the few randomly selected clusters or areas when using cluster samples.

• Related to the first advantage, the researcher can also increase his sample size with this technique. Considering that the researcher will only have to take the sample from a number of areas or clusters, he can then select more subjects since they are more accessible.

• Of all the different types of probability sampling, this technique is the least representative of the population.

• Individuals within a cluster tend to have similar characteristics, and with a cluster sample there is a chance that a cluster is overrepresented or underrepresented, which can skew the results of the study.

• This is also a probability sampling technique with a possibility of high sampling error.

• This is caused by the limited number of clusters included in the sample, which leaves a significant proportion of the population unsampled.

The Difference Between Strata and Clusters

• Although strata and clusters are both non-overlapping subsets of the population, they differ in several ways.

• All strata are represented in the sample, but only a subset of clusters is in the sample.

• With stratified sampling, the best survey results occur when elements within strata are internally homogeneous. However, with cluster sampling, the best results occur when elements within clusters are internally heterogeneous.

REFERENCES

• http://www.stats.gla.ac.uk/steps/glossary/basic_definitions.html

• http://stattrek.com/survey-research/cluster-sampling.aspx

• https://explorable.com/cluster-sampling