Sampling Design and Analysis MTH 494 Lecture-22 Ossam Chohan Assistant Professor CIIT Abbottabad.

Review

Regression Estimation

• We observed that the ratio estimator is most appropriate when the relationship between y and x is linear through the origin.

• If there is evidence of a linear relationship between the observed y’s and x’s, but not necessarily one that would pass through the origin, then this extra information provided by the auxiliary variable x may be taken into account through a regression estimator of the mean µy.

• One must still have knowledge of µx before the estimator can be employed, as it was in the case of ratio estimation of µy.

• The underlying line that shows the basic relationship between y’s and x’s is sometimes referred to as the regression line of y upon x.

• Thus the subscript L in the ensuing formulas is used to denote linear regression.

• The estimator given in the next section assumes the x’s to be fixed in advance and the y’s to be random variables.

• We can think of the x values as something that has already been observed, like last year’s first quarter earnings, and the y response as a random variable yet to be observed, such as the current quarterly earnings of a company for which x is already known.

• The probabilistic properties of the estimator then depend only on y for a given set of x’s.

• If stratum sample sizes are very small, or if the within-stratum ratios are all approximately equal, then the combined ratio estimator may perform better.

• Of course, an estimator of the population total can be found by multiplying either of the estimators above by the population size N, and the variances can be adjusted accordingly.

• Thus we might use the notation $\hat{\tau}_{yRS} = N\hat{\mu}_{yRS}$.

Estimators

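• Regression estimator of the population mean µy:

$\hat{\mu}_{yL} = \bar{y} + b(\mu_x - \bar{x})$, where $b = \dfrac{\sum_{i=1}^{n}(y_i - \bar{y})(x_i - \bar{x})}{\sum_{i=1}^{n}(x_i - \bar{x})^2}$ (3.28)

• Estimated variance of $\hat{\mu}_{yL}$:

$\hat{V}(\hat{\mu}_{yL}) = \left(\dfrac{N-n}{Nn}\right)\left(\dfrac{1}{n-2}\right)\left[\sum_{i=1}^{n}(y_i - \bar{y})^2 - b^2\sum_{i=1}^{n}(x_i - \bar{x})^2\right]$ (3.29)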

Estimator

• Bound on the error of estimation:

$2\sqrt{\hat{V}(\hat{\mu}_{yL})}$ (3.30)

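• When calculating b from observed pairs (y1, x1), …, (yn, xn), we may use the fact that

$\dfrac{\sum_{i=1}^{n}(y_i - \bar{y})(x_i - \bar{x})}{\sum_{i=1}^{n}(x_i - \bar{x})^2} = \dfrac{\sum_{i=1}^{n} y_i x_i - n\bar{y}\bar{x}}{\sum_{i=1}^{n} x_i^2 - n\bar{x}^2}$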

Example 3.9

• A mathematical achievement test was given to 486 students prior to their entering a certain college. From these students a simple random sample of n=10 students was selected and their progress in calculus observed. Final calculus grades were then reported, as given in the accompanying table.

• It is known that µx=52 for all 486 students taking the achievement test.

• Estimate µy for this population, and place a bound on the error of estimation.

Data for problem

Student   Achievement test score, x   Final calculus grade, y
1         39                          65
2         43                          78
3         21                          52
4         64                          82
5         57                          92
6         47                          89
7         28                          73
8         75                          98
9         34                          56
10        52                          75

Solution
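The worked solution did not survive in this transcript, so the following is only a sketch of how the numbers could be reproduced. It assumes the regression estimator takes the usual form ȳ + b(µx − x̄), with b computed from the computational formula quoted earlier, and that the estimated variance uses the residual sum of squares divided by n − 2 together with the finite population correction (N − n)/(Nn). The variable names are illustrative.

```python
import math

# Data from the Example 3.9 table: achievement test score x, final calculus grade y.
x = [39, 43, 21, 64, 57, 47, 28, 75, 34, 52]
y = [65, 78, 52, 82, 92, 89, 73, 98, 56, 75]
N = 486     # students who took the achievement test
mu_x = 52   # known population mean of x

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Corrected sums of squares and cross-products (computational forms).
s_xy = sum(xi * yi for xi, yi in zip(x, y)) - n * x_bar * y_bar
s_xx = sum(xi ** 2 for xi in x) - n * x_bar ** 2
s_yy = sum(yi ** 2 for yi in y) - n * y_bar ** 2

b = s_xy / s_xx                          # estimated slope
mu_hat_yL = y_bar + b * (mu_x - x_bar)   # regression estimate of the mean

# Assumed variance form: fpc * residual sum of squares / (n - 2).
var_hat = (N - n) / (N * n) * (s_yy - b ** 2 * s_xx) / (n - 2)
bound = 2 * math.sqrt(var_hat)

print(f"b = {b:.3f}, estimate = {mu_hat_yL:.2f}, "
      f"variance = {var_hat:.2f}, bound = {bound:.2f}")
```

Because x̄ = 46 is below the known µx = 52, the estimate is adjusted upward from the sample mean ȳ = 76.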

• A close examination of the data on sugar content and weight of oranges given in Example 3.2 might suggest that a regression estimator is more appropriate than a ratio estimator.

• A plot of the points will show that the regression line does not appear to go through the origin.

• However, the regression estimator of a total is of the form $N\hat{\mu}_{yL}$, specifically requiring knowledge of N.

• Since the ratio estimator also works well in this case, determining the number of oranges in the truckload may not be worth the extra cost and time.

• In other cases N may be known or easily found.

• Thus one should carefully consider the choice between ratio and regression estimators when estimating population means or totals.

Difference Estimation

• The difference method of estimating a population mean or total is similar to the regression method in that it adjusts the $\bar{y}$ value up or down by an amount depending on the difference $(\mu_x - \bar{x})$.

• However, the regression coefficient b is not computed. In effect, b is set equal to unity.

• The difference method is, then, easier to employ than the regression method and frequently works just as well.

• It is commonly employed in auditing procedures, and we will consider such an example in this section.

• The following formulas hold provided that simple random sampling was employed.

Estimators

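• Difference estimator of a population mean µy:

$\hat{\mu}_{yD} = \bar{y} + (\mu_x - \bar{x}) = \mu_x + \bar{d}$, where $\bar{d} = \bar{y} - \bar{x}$ (3.31)

• Estimated variance of $\hat{\mu}_{yD}$:

$\hat{V}(\hat{\mu}_{yD}) = \left(\dfrac{N-n}{Nn}\right)\dfrac{\sum_{i=1}^{n}(d_i - \bar{d})^2}{n-1}$, where $d_i = y_i - x_i$ (3.32)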

Estimators

• Bound on the error of estimation:

$2\sqrt{\hat{V}(\hat{\mu}_{yD})}$ (3.33)

Example 3.10

• Auditors are often interested in comparing the audited value of an item with its book value. Generally, book values are known for every item in the population, and audit values are obtained for a sample of these items. The book values can be used to obtain a good estimate of the total or average audit value for the population.

• Suppose a population contains 180 inventory items with a stated total book value of $13,320. Let xi denote the book value and yi the audit value of the ith item. A simple random sample of n=10 items yields the results shown in the accompanying table. Estimate the mean audit value µy by the difference method and estimate the variance of $\hat{\mu}_{yD}$.

Data for Problem

Sample   Audit value, yi   Book value, xi   di = yi − xi
1        9                 10               -1
2        14                12               2
3        7                 8                -1
4        29                26               3
5        45                47               -2
6        109               112              -3
7        40                36               4
8        238               240              -2
9        60                59               1
10       170               167              3

Solution
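The worked solution is missing from this transcript. As a rough sketch, the calculation could be reproduced as follows, assuming the difference estimator is µx + d̄ with di = yi − xi, and that its estimated variance is the sample variance of the di multiplied by the finite population correction (N − n)/(Nn). The variable names are illustrative.

```python
import math

# Data from the Example 3.10 table.
y = [9, 14, 7, 29, 45, 109, 40, 238, 60, 170]    # audit values
x = [10, 12, 8, 26, 47, 112, 36, 240, 59, 167]   # book values
N = 180               # inventory items in the population
total_book = 13320    # stated total book value
mu_x = total_book / N # mean book value per item

n = len(y)
d = [yi - xi for yi, xi in zip(y, x)]   # differences d_i = y_i - x_i
d_bar = sum(d) / n

mu_hat_yD = mu_x + d_bar                # difference estimate of the mean audit value

# Assumed variance form: fpc * sample variance of the differences.
s2_d = sum((di - d_bar) ** 2 for di in d) / (n - 1)
var_hat = (N - n) / (N * n) * s2_d
bound = 2 * math.sqrt(var_hat)

print(f"mu_x = {mu_x:.1f}, d_bar = {d_bar:.1f}, estimate = {mu_hat_yD:.1f}, "
      f"variance = {var_hat:.2f}, bound = {bound:.2f}")
```

The differences are small compared with the audit values themselves, which is why the estimated variance comes out so low in this example.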

Systematic Sampling

Session Objectives

• To introduce the basic concepts of systematic sampling

• To demonstrate how to select a sample using a systematic sampling design

• To estimate different parameters under systematic random sampling

Sample Selection Procedure

• List all the units in the population from 1, 2, …, N (the sampling frame)

• Select a random number g in the interval 1 ≤ g ≤ k, using a random mechanism, e.g. random number tables, where k = N/n

• k is called the sampling interval

• N is the population size; n is the sample size

• The random number g is called the random start and constitutes the first unit of the sample

Sample Selection Procedure

• Take every kth unit after the random start

• The selected units will be g, g+k, g+2k, g+3k, g+4k, …, g+(n-1)k, until we have n units

• Example: N = 10000, n = 100

• k = 10000/100 = 100

• Suppose g = 87

Sample Selection Procedure

• We select the following units: 87, 187, 287, 387, …, 9987

• NB: This procedure is, however, only valid if k is an integer (whole number)

• If k is not an integer (whole number), there are a number of methods we can use. We will consider just two of them
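Before turning to those two methods, the integer-interval procedure described so far can be sketched in a few lines. This is a minimal illustration: the function name `systematic_sample` and its arguments are hypothetical, and the standard `random` module stands in for the random number tables mentioned in the slides.

```python
import random

def systematic_sample(N, n, g=None):
    """Return the unit numbers of a systematic sample when k = N / n is an integer."""
    if N % n != 0:
        raise ValueError("this sketch only covers the case where N / n is a whole number")
    k = N // n                      # sampling interval
    if g is None:
        g = random.randint(1, k)    # random start, 1 <= g <= k
    if not 1 <= g <= k:
        raise ValueError("the random start g must satisfy 1 <= g <= k")
    # Take the random start, then every kth unit until n units are selected.
    return [g + j * k for j in range(n)]

# The example from the slides: N = 10000, n = 100, random start g = 87.
sample = systematic_sample(10000, 100, g=87)
print(sample[:4], "...", sample[-1])   # [87, 187, 287, 387] ... 9987
```

Passing `g` explicitly reproduces the example; leaving it out draws a fresh random start each time.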

Sample Selection Procedure

• Method 1: Use Circular Sampling

• Treat the list as circular so that the last unit is followed by the first

• Select a random start g between 1 and N, using a random mechanism

• Add the intervals k until n units are selected

• Any convenient interval k will result in a random sample
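The circular method can be sketched as below. The function name, the default choice of k (the integer closest to N/n) and the small population of N = 23 units are assumptions made for illustration. The duplicate check is an extra safeguard added here, since some intervals return to a unit already chosen before n units are reached.

```python
import random

def circular_systematic_sample(N, n, k=None, g=None):
    """Circular systematic sample of n unit numbers from a list of N units."""
    if k is None:
        # One convenient choice: the integer closest to N / n (at least 1).
        k = max(1, round(N / n))
    if g is None:
        g = random.randint(1, N)   # random start anywhere on the list
    # Step forward by k, wrapping from unit N back to unit 1.
    units = [(g - 1 + j * k) % N + 1 for j in range(n)]
    if len(set(units)) < n:
        raise ValueError("interval k revisits a unit; choose a different k")
    return units

# Hypothetical illustration: N = 23, n = 5, so k = 5; random start g = 20.
sample = circular_systematic_sample(N=23, n=5, g=20)
print(sample)   # [20, 2, 7, 12, 17]
```

With g = 20 the second step passes the end of the list and wraps around to unit 2, which is the sense in which the last unit is followed by the first.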

Sample Selection Procedure

• One suitable suggestion is to choose the integer k closest to the ratio N/n

• Method 2: Use Fractional Intervals

• Suppose we want to select a sample of 100 units from a population of 21,156.

• Calculate k = 21156/100 = 211.56

• Select a random start g between 1 and 21156 using a random mechanism

Sample Selection Procedure

• Suppose g = 582

• Add the interval 21156 (that is, k multiplied by 100 to clear the decimals) successively, obtaining exactly 100 numbers

• The numbers will be 582, 21738, 42894, …

• Divide each number by 100 and round to the nearest whole number to get the selected sample, i.e. 6, 217, 429, etc.
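A small sketch of the fractional-interval steps follows. It generalises the slide's "multiply by 100, divide by 100" to "multiply by n, divide by n", which coincides here because n = 100; that reading is an assumption. The function name is hypothetical. Rounding is done with integer arithmetic (halves round up), and a result of 0, which can occur for very small starts, is mapped to unit 1; the slides do not address that case, so this is also an assumption.

```python
import random

def fractional_interval_sample(N, n, g=None):
    """Systematic sample with fractional interval k = N / n, worked in units of 1/n."""
    if g is None:
        g = random.randint(1, N)   # random start on the scaled list, 1 <= g <= N
    # On the scaled list the interval k * n equals N, so add N successively.
    scaled = [g + j * N for j in range(n)]
    # Divide by n and round to the nearest whole number (halves round up).
    units = [(2 * v + n) // (2 * n) for v in scaled]
    # Assumption: a rounded value of 0 is not a valid unit, so use unit 1 instead.
    return [max(1, u) for u in units]

# The example from the slides: N = 21156, n = 100, g = 582.
sample = fractional_interval_sample(21156, 100, g=582)
print(sample[:3])   # [6, 217, 429]
```

The gaps between consecutive selected units alternate between 211 and 212, so the average interval stays at 211.56.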

Advantages and Disadvantages of Systematic sampling

• Advantages:

– The major advantage is that it is easy, almost foolproof and flexible to implement

– It is especially easy to give instructions to fieldworkers

– If we order our list prior to taking the sample, the sample will reflect the ordering and as such can easily give a proportionate sample

Advantages and Disadvantages of Systematic sampling

• Disadvantages:

– The main disadvantage is that if there is an ordering (monotonic trend or periodicity) in the list which is unknown to the researcher, this may bias the resulting estimates

– There is a problem of estimating the variance from a systematic sample: the usual variance estimate is biased
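Both disadvantages can be seen in a deliberately extreme, made-up population whose values repeat with a period equal to the sampling interval. The data below are hypothetical and chosen only to make the effect obvious; real lists are rarely this regular.

```python
# Hypothetical population of N = 1000 units whose values cycle 0, 1, ..., 9.
N, n = 1000, 100
k = N // n   # sampling interval, equal to the period of the list
population = [(i - 1) % 10 for i in range(1, N + 1)]
true_mean = sum(population) / N

sample_means = []
within_sample_ranges = []
for g in range(1, k + 1):                       # every possible random start
    sample = [population[g - 1 + j * k] for j in range(n)]
    sample_means.append(sum(sample) / n)
    within_sample_ranges.append(max(sample) - min(sample))

print("true mean:", true_mean)
print("sample means by start:", sample_means)
print("within-sample ranges:", within_sample_ranges)
```

Each start yields a sample of identical values, so any single sample mean can be far from the true mean of 4.5. A variance estimate computed from within one sample would be zero even though the sample means differ widely from start to start.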