Sampling distribution


Page 1: Sampling distribution

A PRESENTATION ON

Sampling Distribution

A Brief Explanation

Page 2: Sampling distribution

SAMPLING DISTRIBUTION

There are three distinct types of distribution of data:

1. Population distribution, which characterizes the distribution of elements of a population.
2. Sample distribution, which characterizes the distribution of elements of a sample drawn from a population.
3. Sampling distribution, which describes the expected behavior of a large number of simple random samples drawn from the same population.

Sampling distributions constitute the theoretical basis of statistical inference and are of considerable importance in business decision-making. They are important in statistics because they provide a major simplification on the route to statistical inference.

Page 3: Sampling distribution

DEFINITION

A sampling distribution is a theoretical probability distribution of a statistic obtained through a large number of samples drawn from a specific population (McTavish: 435).

A sampling distribution is often presented as a graph of a statistic (e.g. the mean, mean absolute value of the deviation from the mean, range, standard deviation of the sample, unbiased estimate of variance, or variance of the sample) computed from sample data.

A sampling distribution is a theoretical distribution of an infinite number of sample means, from samples of equal size taken from a population (Walsh: 95).

Page 4: Sampling distribution

CHARACTERISTICS

Usually a univariate distribution.

Often closely approximates a normal distribution, especially for large samples.

The sample statistic (for example, the sample mean or the sample proportion) is a random variable.

It is a theoretical probability distribution.

The form of a sampling distribution refers to the shape of the particular curve that describes the distribution.

Page 5: Sampling distribution

FUNCTIONS OF SAMPLING DISTRIBUTION

A sampling distribution is commonly drawn as a graph to show the behavior of a statistic. A sampling distribution can be constructed for:

Mean
Mean absolute value of the deviation from the mean
Range
Standard deviation of the sample
Unbiased estimate of variance
Variance of the sample

Page 6: Sampling distribution

WHY IS SAMPLING DISTRIBUTION IMPORTANT?

PROPERTIES OF STATISTICS

SELECTION OF DISTRIBUTION TYPE TO MODEL SCORES

HYPOTHESIS TESTING

Page 7: Sampling distribution

i) Properties of Statistics: Statistics have different properties as estimators of population parameters. The sampling distribution of a statistic provides a window into some of the important properties. For example, if the expected value of a statistic is equal to the corresponding population parameter, the statistic is said to be unbiased. Efficiency is another valuable property to have in the estimation of a population parameter: everything else being equal, the statistic with the smallest standard error is preferred as an estimator of the corresponding population parameter.
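The idea of an unbiased statistic can be checked by simulation. The following is a minimal sketch, not part of the original slides: it assumes a small illustrative population (2, 3, 6, 8, 11), samples of size 2 drawn with replacement, and a helper named `mean_of_estimates` introduced only for this example. It compares the variance estimate that divides by n with the one that divides by n - 1.

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible

population = [2, 3, 6, 8, 11]  # illustrative population (an assumption)
mu = sum(population) / len(population)
pop_variance = sum((x - mu) ** 2 for x in population) / len(population)

def sample_variance(sample, ddof):
    """Variance of a sample, dividing by len(sample) - ddof."""
    m = sum(sample) / len(sample)
    return sum((x - m) ** 2 for x in sample) / (len(sample) - ddof)

def mean_of_estimates(ddof, n=2, trials=50_000):
    """Average a variance estimator over many samples drawn with replacement."""
    total = 0.0
    for _ in range(trials):
        sample = random.choices(population, k=n)
        total += sample_variance(sample, ddof)
    return total / trials

biased = mean_of_estimates(ddof=0)    # divides by n
unbiased = mean_of_estimates(ddof=1)  # divides by n - 1

print(f"population variance:               {pop_variance:.2f}")
print(f"mean of divide-by-n estimates:     {biased:.2f}")
print(f"mean of divide-by-(n-1) estimates: {unbiased:.2f}")
```

In this setup the divide-by-(n - 1) estimates should average close to the population variance, while the divide-by-n estimates should average close to half of it, which is what "biased" means here.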

Page 8: Sampling distribution

ii) Selection of distribution type to model scores: The sampling distribution provides the theoretical foundation to select a distribution for many useful measures. For example, the central limit theorem describes why a measure, such as intelligence, that may be considered a summation of a number of independent quantities would be expected to follow an approximately normal (Gaussian) curve.

iii) Hypothesis Testing: The sampling distribution is integral to the hypothesis testing procedure. It is used to create a model of what the world would look like if the null hypothesis were true and the statistic were collected an infinite number of times. A single sample is taken, the sample statistic is calculated, and then it is compared to the model created by the sampling distribution of that statistic when the null hypothesis is true. If the sample statistic is unlikely given the model, then the model is rejected and a model with real effects is considered more plausible.
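The procedure described above can be sketched with a simulation. This is only an illustration under assumed values: the null hypothesis is a fair coin, each sample is 120 tosses, and the "observed" result of 75 heads is hypothetical. The simulated results play the role of the sampling distribution under the null hypothesis.

```python
import random

random.seed(1)  # fixed seed so the sketch is reproducible

n_tosses = 120          # size of each sample
observed_heads = 75     # hypothetical observed statistic (an assumption)
n_simulations = 20_000  # a large number standing in for "infinitely many" samples

def heads_under_null():
    """Number of heads in one sample when the null hypothesis (p = 0.5) is true."""
    return sum(random.random() < 0.5 for _ in range(n_tosses))

# Approximate sampling distribution of the statistic under the null hypothesis.
null_distribution = [heads_under_null() for _ in range(n_simulations)]

# Share of simulated samples at least as extreme as the observed one.
p_value = sum(h >= observed_heads for h in null_distribution) / n_simulations
print(f"estimated P(heads >= {observed_heads} | fair coin) = {p_value:.4f}")
```

A small estimated probability means the observed statistic would be unusual if the null model were true, which is the basis for rejecting that model.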

Page 9: Sampling distribution

TYPES OF SAMPLING DISTRIBUTION

The types of sampling distribution are as follows:

1) Sampling Distribution of the Mean: The sampling distribution of the mean is defined as the theoretical probability distribution of the sample means obtained by drawing all possible samples of the same size from the given population. Consider a population with mean μ and variance σ². When sampling from a normally distributed population, it can be shown that the distribution of the sample mean will have the following properties:

The mean of the sample means equals the population mean μ.
The standard deviation of the sample means (the standard error) equals σ/√n, where n is the sample size.
The sample means are themselves normally distributed.

Page 10: Sampling distribution

CENTRAL LIMIT THEOREM

The central limit theorem, first introduced by De Moivre during the early eighteenth century, is often regarded as the most important theorem in statistics. According to this theorem, if we select a large number of simple random samples of size n from any population distribution with mean µ and finite variance σ², and determine the mean of each sample, the distribution of these sample means will tend to be described by the normal probability distribution with mean µ and variance σ²/n, provided n is sufficiently large. In other words, the sampling distribution of sample means approaches a normal distribution. Symbolically, the theorem can be stated as follows:

Page 11: Sampling distribution

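When given n independent random variables $X_1, X_2, X_3, \ldots, X_n$ which have the same distribution (no matter what distribution), then

$$X = X_1 + X_2 + X_3 + \cdots + X_n$$

is approximately a normal variate when n is large. The mean µ and variance σ² of X are

$$\mu = \mu_1 + \mu_2 + \cdots + \mu_n = n\mu_1, \qquad \sigma^2 = \sigma_1^2 + \sigma_2^2 + \cdots + \sigma_n^2 = n\sigma_1^2$$

where $\mu_1$ and $\sigma_1^2$ are the mean and variance of each $X_i$.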
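The statement of the theorem can be illustrated numerically. The sketch below is an assumption-laden example rather than anything from the slides: it sums n = 30 independent uniform(0, 1) variables, each with mean 1/2 and variance 1/12, and checks that the sums have mean and variance close to n times those values and that roughly 68% of them fall within one standard deviation of the mean, as a normal variate would.

```python
import random
import statistics

random.seed(2)  # fixed seed so the sketch is reproducible

n = 30           # number of variables summed (an assumption)
trials = 20_000  # number of simulated sums

# Each X_i is uniform on (0, 1): mean 1/2, variance 1/12.
mu_1, var_1 = 0.5, 1 / 12

sums = [sum(random.random() for _ in range(n)) for _ in range(trials)]

expected_mean = n * mu_1  # mean of the sum: n times the individual mean
expected_var = n * var_1  # variance of the sum: n times the individual variance
observed_mean = statistics.fmean(sums)
observed_var = statistics.pvariance(sums)

# For a normal variate, about 68% of values lie within one standard deviation.
sd = expected_var ** 0.5
within_one_sd = sum(abs(s - expected_mean) <= sd for s in sums) / trials

print(f"mean of sums:     {observed_mean:.3f} (theory {expected_mean:.3f})")
print(f"variance of sums: {observed_var:.3f} (theory {expected_var:.3f})")
print(f"share within one standard deviation: {within_one_sd:.3f}")
```

The uniform distribution is far from bell-shaped, so agreement with the normal figure here comes from the summation, not from the shape of the individual variables.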

Page 12: Sampling distribution

UTILITY : The utility of this theorem is that it requires virtually no conditions on the distribution patterns of the individual random variables being summed. As a result, it furnishes a practical method of computing approximate probability values associated with sums of arbitrarily distributed independent random variables. This theorem helps to explain why a vast number of phenomena show approximately a normal distribution. Because of its theoretical and practical significance, this theorem is considered the most remarkable theoretical formulation of all probability laws. Moreover, most of hypothesis testing and sampling theory is based on this theorem, so the central limit theorem is perhaps the most fundamental result in all of statistics.

Page 13: Sampling distribution

2) SAMPLING DISTRIBUTION OF THE PROPORTION :

The sampling distribution of the proportion describes how the sample proportion (the proportion of successes in a sample) varies from sample to sample around the population proportion of successes.

Properties :

Sample proportions tend to target the value of the population proportion.

Under certain conditions (a common rule of thumb is np ≥ 5 and nq ≥ 5), the distribution of sample proportions can be approximated by a normal distribution.

Page 14: Sampling distribution

Example: Sampling distribution of the proportion of girls in two randomly selected births.

Sample space: bb, bg, gb, gg. All four outcomes are equally likely.

Probabilities: P(0 girls) = 0.25, P(1 girl) = 0.50, P(2 girls) = 0.25
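The two-births example is small enough to enumerate directly. The following sketch (variable names are chosen for illustration only) lists the four equally likely outcomes, computes the proportion of girls in each, and tabulates the resulting sampling distribution with exact fractions, so the probabilities can be seen to sum to 1.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All equally likely outcomes for two births: bb, bg, gb, gg.
outcomes = list(product("bg", repeat=2))

# Proportion of girls in each outcome, counted over the whole sample space.
counts = Counter(Fraction(outcome.count("g"), 2) for outcome in outcomes)

# Sampling distribution: proportion of girls -> probability.
distribution = {prop: Fraction(c, len(outcomes)) for prop, c in counts.items()}

# Mean of the sampling distribution of the proportion.
mean_proportion = sum(prop * prob for prop, prob in distribution.items())

for prop in sorted(distribution):
    print(f"proportion of girls = {prop}: probability = {distribution[prop]}")
print(f"mean of sample proportions = {mean_proportion}")
```

The mean of the sample proportions equals 1/2, the population proportion of girls, which is what "targeting the population proportion" refers to.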

Page 15: Sampling distribution

STANDARD ERROR OF THE SAMPLING DISTRIBUTION

The sampling distribution has a standard deviation. The mean of the sampling distribution of the mean will be the same as the population mean, but its standard deviation will be smaller than the population standard deviation whenever the samples contain more than one observation. The standard deviation of the sampling distribution has a special name: 'the standard error', or sometimes 'the standard error of the mean'. The variation of sample means around the population mean is the sampling error, and it is measured using a statistic known as the standard error of the mean. This is an estimate of the amount that a sample mean is likely to differ from the population mean. This consideration is important because sampling theory tells us that about 68% of all sample means will lie within plus or minus one standard error of the population mean, and that 95% of all sample means will lie within plus or minus 1.96 standard errors of the population mean (Bryman, Alan, 2004, p. 96).

Page 16: Sampling distribution

Formula : The standard error of a sampling distribution is equal to the standard deviation of the population divided by the square root of the sample size. The formula of the standard error is as follows:

$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$$

Here, $\sigma_{\bar{x}}$ = standard deviation of the sample mean (the standard error), $\sigma$ = standard deviation of the population, and $n$ = sample size.

How to reduce error : When the sample size increases, the sampling error decreases.
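The effect of sample size on the standard error can be checked by simulation. This is a rough sketch under assumed values: a normal population with mean 50 and standard deviation 10 (both hypothetical), and sample sizes 4, 16 and 64. For each size it compares the observed standard deviation of many sample means with the population standard deviation divided by the square root of the sample size.

```python
import math
import random
import statistics

random.seed(3)  # fixed seed so the sketch is reproducible

mu, sigma = 50.0, 10.0  # hypothetical population mean and standard deviation

def empirical_standard_error(n, trials=5_000):
    """Standard deviation of the means of `trials` random samples of size n."""
    means = []
    for _ in range(trials):
        sample = [random.gauss(mu, sigma) for _ in range(n)]
        means.append(statistics.fmean(sample))
    return statistics.pstdev(means)

results = {}
for n in (4, 16, 64):
    simulated = empirical_standard_error(n)
    theoretical = sigma / math.sqrt(n)  # standard error from the formula
    results[n] = (simulated, theoretical)
    print(f"n = {n:2d}: simulated SE = {simulated:.3f}, formula SE = {theoretical:.3f}")
```

Quadrupling the sample size halves the standard error, so gains in precision become more expensive as samples grow.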

Page 17: Sampling distribution

Purpose :

1. Allows us to quantify the extent to which a 'test' provides accurate scores.
2. If the standard error is smaller, the range of plausible values for the population mean will be narrower.
3. When the standard error is larger, the range of plausible values for the population mean will be wider.

Application :

95% CI = Mean ± (1.96 × SEM)
99% CI = Mean ± (2.58 × SEM)
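As an illustration of the two confidence interval formulas, the sketch below computes both intervals for a made-up sample. The data (100 simulated test scores) and the helper name `confidence_interval` are assumptions for this example, and the sample standard deviation is used as a stand-in for the unknown population standard deviation, which is reasonable only for fairly large samples.

```python
import math
import random
import statistics

random.seed(4)  # fixed seed so the hypothetical data are reproducible

# Hypothetical sample of 100 test scores (not data from the presentation).
sample = [random.gauss(70, 12) for _ in range(100)]

def confidence_interval(data, z):
    """Return the (lower, upper) limits of Mean ± z × SEM."""
    mean = statistics.fmean(data)
    # Estimated standard error of the mean: sample SD over sqrt(sample size).
    sem = statistics.stdev(data) / math.sqrt(len(data))
    return mean - z * sem, mean + z * sem

ci_95 = confidence_interval(sample, 1.96)
ci_99 = confidence_interval(sample, 2.58)

print(f"sample mean = {statistics.fmean(sample):.2f}")
print(f"95% CI = ({ci_95[0]:.2f}, {ci_95[1]:.2f})")
print(f"99% CI = ({ci_99[0]:.2f}, {ci_99[1]:.2f})")
```

The 99% interval is always the wider of the two, since more confidence requires a larger multiple of the standard error.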

Page 18: Sampling distribution

STANDARD ERROR TABLE

Sampling distribution | Standard error
Means | $\sigma_{\bar{x}} = \sigma/\sqrt{n}$
Proportions | $\sigma_p = \sqrt{pq/n}$

[The table also lists standard errors for standard deviations, variances, medians, first and third quartiles, semi-interquartile ranges, and coefficients of variation; those formulas are not preserved in this transcript.]

Page 19: Sampling distribution

Point & Interval Estimates

There are two kinds of estimates of population parameters from sample statistics:

A point estimate is a single value and an interval estimate is a range of values.

POINT ESTIMATES

INTERVAL ESTIMATES

Page 20: Sampling distribution

POINT ESTIMATION :

A point estimate of a population parameter is a single value of a statistic. 

For example, the sample mean x̄ is a point estimate of the population mean μ. Similarly, the sample proportion p is a point estimate of the population proportion P.

INTERVAL ESTIMATION :

An interval estimate is defined by two numbers, between which a population parameter is said to lie.

Page 21: Sampling distribution

For example, a < μ < b is an interval estimate of the population mean μ. It indicates that the population mean is greater than a but less than b. In any estimation problem, we need to obtain both a point estimate and an interval estimate. The point estimate is our best guess of the true value of the parameter, while the interval estimate gives a measure of the accuracy of that point estimate by providing an interval that contains plausible values.

Page 22: Sampling distribution

MATHEMATICAL PROBLEMS

Sampling Distribution of Means

Prob. 1 : A population consists of the five numbers 2, 3, 6, 8 and 11. Consider all possible samples of size 2 that can be drawn with and without replacement from this population. Find:

a) The mean of the population.
b) The standard deviation of the population.
c) The mean of the sampling distribution of means.
d) The standard deviation of the sampling distribution of means (the standard error of means).

Page 23: Sampling distribution

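Answer :

a) Mean of the population: $\mu = \frac{2+3+6+8+11}{5} = \frac{30}{5} = 6$

b) Variance of the population: $\sigma^2 = \frac{(2-6)^2+(3-6)^2+(6-6)^2+(8-6)^2+(11-6)^2}{5} = \frac{16+9+0+4+25}{5} = \frac{54}{5} = 10.8$, so the standard deviation of the population is $\sigma = \sqrt{10.8} = 3.29$.

With replacement :

c) There are 5(5) = 25 samples of size 2 that can be drawn with replacement. These are:

(2,2) (2,3) (2,6) (2,8) (2,11)
(3,2) (3,3) (3,6) (3,8) (3,11)
(6,2) (6,3) (6,6) (6,8) (6,11)
(8,2) (8,3) (8,6) (8,8) (8,11)
(11,2) (11,3) (11,6) (11,8) (11,11)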

Page 24: Sampling distribution

The corresponding sample means are:

2.0 2.5 4.0 5.0 6.5
2.5 3.0 4.5 5.5 7.0
4.0 4.5 6.0 7.0 8.5
5.0 5.5 7.0 8.0 9.5
6.5 7.0 8.5 9.5 11.0

The mean of the sampling distribution of means is

$$\mu_{\bar{x}} = \frac{\text{sum of all sample means}}{25} = \frac{150}{25} = 6.0$$

illustrating the fact that $\mu_{\bar{x}} = \mu$.

Page 25: Sampling distribution

d) The variance of the sampling distribution of means, $\sigma_{\bar{x}}^2$, is obtained by subtracting the mean 6 from each of the 25 sample means, squaring the results, adding all 25 numbers thus obtained and dividing by 25:

$$\sigma_{\bar{x}}^2 = \frac{135}{25} = 5.40, \qquad \sigma_{\bar{x}} = \sqrt{5.40} = 2.32$$

This illustrates the fact that for finite populations involving sampling with replacement, $\sigma_{\bar{x}}^2 = \sigma^2/n$ (with sample size n = 2), since the right-hand side is 10.8/2 = 5.40, agreeing with the above value.

Without replacement :

c) There are 10 samples of size 2 that can be drawn without replacement from the population:

(2,3) (2,6) (2,8) (2,11) (3,6) (3,8) (3,11) (6,8) (6,11) (8,11)

Page 26: Sampling distribution

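The corresponding sample means are: 2.5, 4.0, 5.0, 6.5, 4.5, 5.5, 7.0, 7.0, 8.5, 9.5.

The mean of the sampling distribution of means is

$$\mu_{\bar{x}} = \frac{2.5+4.0+5.0+6.5+4.5+5.5+7.0+7.0+8.5+9.5}{10} = \frac{60}{10} = 6.0 = \mu$$

d) The variance of the sampling distribution of means is

$$\sigma_{\bar{x}}^2 = \frac{(2.5-6.0)^2 + (4.0-6.0)^2 + \cdots + (9.5-6.0)^2}{10} = \frac{40.5}{10} = 4.05$$

and $\sigma_{\bar{x}} = \sqrt{4.05} = 2.01$. This illustrates

$$\sigma_{\bar{x}}^2 = \frac{\sigma^2}{n}\left(\frac{N-n}{N-1}\right) = \frac{10.8}{2}\left(\frac{5-2}{5-1}\right) = 4.05$$

as obtained above, where N = 5 is the population size and n = 2 is the sample size.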
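The calculations in Prob. 1 can be reproduced by brute-force enumeration. The sketch below uses only the Python standard library; the variable names are illustrative. It lists every sample of size 2 with and without replacement, computes the sample means, and compares the variance of those means with the two formulas for the variance of the sampling distribution.

```python
from itertools import combinations, product
from statistics import fmean, pvariance

population = [2, 3, 6, 8, 11]
N = len(population)  # population size
n = 2                # sample size

mu = fmean(population)          # population mean
sigma2 = pvariance(population)  # population variance

# Means of all 25 ordered samples drawn with replacement.
means_with = [fmean(s) for s in product(population, repeat=n)]
# Means of all 10 samples drawn without replacement.
means_without = [fmean(s) for s in combinations(population, n)]

var_with = pvariance(means_with)
var_without = pvariance(means_without)

print(f"population: mean = {mu}, variance = {sigma2:.2f}")
print(f"with replacement:    {len(means_with)} samples, "
      f"mean = {fmean(means_with):.2f}, variance = {var_with:.2f}, "
      f"formula = {sigma2 / n:.2f}")
print(f"without replacement: {len(means_without)} samples, "
      f"mean = {fmean(means_without):.2f}, variance = {var_without:.2f}, "
      f"formula = {sigma2 / n * (N - n) / (N - 1):.2f}")
```

Sampling without replacement gives the smaller variance because the factor (N - n)/(N - 1) is less than 1 whenever the sample contains more than one element.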

Page 27: Sampling distribution

SAMPLING DISTRIBUTION OF PROPORTIONS

Prob. 2 : Find the probability that in 120 tosses of a fair coin, a) between 40% and 60% will be heads and b) 5/8 or more will be heads.

Answer: We consider the 120 tosses of the coin to be a sample from the infinite population of all possible tosses of the coin. In this population the probability of heads is p = 1/2 and the probability of tails is q = 1 - p = 1/2.

Page 28: Sampling distribution

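a) The standard error of the proportion is

$$\sigma_p = \sqrt{\frac{pq}{n}} = \sqrt{\frac{(1/2)(1/2)}{120}} = 0.0456$$

where n = 120 is the number of tosses.

40% in standard units = $\frac{0.40 - 0.50}{0.0456} = -2.19$

60% in standard units = $\frac{0.60 - 0.50}{0.0456} = 2.19$

Required probability = (area under normal curve between z = -2.19 and z = 2.19) = 2(0.4857) = 0.9714

Although this result is accurate to two significant figures, it is not exact, since we have not used the fact that the proportion is actually a discrete variable. To account for this, we subtract 1/(2n) = 1/(2 × 120) from 0.40 and add 1/(2n) = 1/(2 × 120) to 0.60; thus, since 1/240 = 0.00417, the required proportions in standard units are

$\frac{0.40 - 0.00417 - 0.50}{0.0456} = -2.28$ and $\frac{0.60 + 0.00417 - 0.50}{0.0456} = 2.28$

so the corrected probability is 2(0.4887) = 0.9774.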

Page 29: Sampling distribution

b) Proceeding as in (a), since 5/8 = 0.6250,

(0.6250 - 0.00417) in standard units = $\frac{0.6250 - 0.00417 - 0.50}{0.0456} = 2.65$

Required probability = (area under normal curve to right of z = 2.65) = (area to right of z = 0) - (area between z = 0 and z = 2.65) = 0.5 - 0.4960 = 0.0040.
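The normal-curve areas used in Prob. 2 can be cross-checked in code. This is a sketch under the assumptions of the problem (120 tosses, p = 1/2): it applies the normal approximation with the 1/(2n) discreteness correction using `statistics.NormalDist`, and compares it with the exact binomial probabilities computed from `math.comb`. The counts 48, 72 and 75 are simply 40%, 60% and 5/8 of 120.

```python
import math
from statistics import NormalDist

n, p = 120, 0.5
q = 1 - p
sigma_p = math.sqrt(p * q / n)  # standard error of the proportion
correction = 1 / (2 * n)        # discreteness correction, 1/240
z = NormalDist()                # standard normal distribution

def standard_units(proportion):
    """Convert a sample proportion to standard units."""
    return (proportion - p) / sigma_p

# a) between 40% and 60% heads, with the correction applied at both ends
approx_a = z.cdf(standard_units(0.60 + correction)) - z.cdf(standard_units(0.40 - correction))
# b) 5/8 or more heads
approx_b = 1 - z.cdf(standard_units(5 / 8 - correction))

def binom_pmf(k):
    """Exact probability of exactly k heads in n tosses."""
    return math.comb(n, k) * p**k * q**(n - k)

exact_a = sum(binom_pmf(k) for k in range(48, 73))     # 48 to 72 heads inclusive
exact_b = sum(binom_pmf(k) for k in range(75, n + 1))  # 75 or more heads

print(f"sigma_p = {sigma_p:.4f}")
print(f"a) normal approximation = {approx_a:.4f}, exact binomial = {exact_a:.4f}")
print(f"b) normal approximation = {approx_b:.4f}, exact binomial = {exact_b:.4f}")
```

With the correction applied, the normal approximation should track the exact binomial values closely for a sample of this size.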

Page 30: Sampling distribution

REFERENCES :

1. Statistics for the Social Sciences with Computer Applications, Anthony Walsh
2. Schaum's Outline of Theory and Problems of Statistics, Murray R. Spiegel
3. Business Statistics, S. P. Gupta & M. P. Gupta
4. Descriptive and Inferential Statistics: An Introduction, Herman J. Loether & Donald G. McTavish