Statistics for Business and Economics: bab 21


description

Course material for Statistics for Business and Economics (Anderson, Sweeney, Williams), Chapter 21

Transcript of Statistics for Business and Economics: bab 21

Page 1: Statistics for Business and Economics: bab 21

1 Slide

Slides Prepared by JOHN S. LOUCKS

St. Edward’s University

© 2002 South-Western College Publishing/Thomson Learning

Page 2: Statistics for Business and Economics: bab 21

2 Slide

Chapter 21 Sample Survey

• Terminology Used in Sample Surveys
• Types of Surveys and Sampling Methods
• Survey Errors
• Simple Random Sampling
• Stratified Simple Random Sampling
• Cluster Sampling
• Systematic Sampling

Page 3: Statistics for Business and Economics: bab 21

3 Slide

Terminology Used in Sample Surveys

An element is the entity on which data are collected.

A population is the collection of all elements of interest.

A sample is a subset of the population.

Page 4: Statistics for Business and Economics: bab 21

4 Slide

Terminology Used in Sample Surveys

The target population is the population we want to make inferences about.

The sampled population is the population from which the sample is actually selected.

These two populations are not always the same.

If inferences from a sample are to be valid, the sampled population must be representative of the target population.

Page 5: Statistics for Business and Economics: bab 21

5 Slide

Terminology Used in Sample Surveys

The population is divided into sampling units which are groups of elements or the elements themselves.

A list of the sampling units for a particular study is called a frame.

The choice of a particular frame is often determined by the availability and reliability of a list.

The development of a frame can be one of the most difficult and important steps in conducting a sample survey.

Page 6: Statistics for Business and Economics: bab 21

6 Slide

Types of Surveys

Surveys Involving Questionnaires
• Three common types are mail surveys, telephone surveys, and personal interview surveys.
• Survey costs are lower for mail and telephone surveys.
• With well-trained interviewers, higher response rates and longer questionnaires are possible with personal interviews.
• The design of the questionnaire is critical.

Page 7: Statistics for Business and Economics: bab 21

7 Slide

Types of Surveys

Surveys Not Involving Questionnaires
• Often, someone simply counts or measures the sampled items and records the results.
• An example is sampling a company’s inventory of parts to estimate the total inventory value.

Page 8: Statistics for Business and Economics: bab 21

8 Slide

Sampling Methods

Sample surveys can also be classified in terms of the sampling method used.

The two categories of sampling methods are:
• Probabilistic sampling
• Nonprobabilistic sampling

Page 9: Statistics for Business and Economics: bab 21

9 Slide

Nonprobabilistic Sampling Methods

The probability of obtaining each possible sample cannot be computed.

Statistically valid statements cannot be made about the precision of the estimates.

Sampling cost is lower and implementation is easier.

Methods include convenience and judgment sampling.

Page 10: Statistics for Business and Economics: bab 21

10 Slide

Nonprobabilistic Sampling Methods

Convenience Sampling
• The units included in the sample are chosen because of accessibility.
• In some cases, convenience sampling is the only practical approach.

Page 11: Statistics for Business and Economics: bab 21

11 Slide

Nonprobabilistic Sampling Methods

Judgment Sampling
• A knowledgeable person selects sampling units that he/she feels are most representative of the population.
• The quality of the result is dependent on the judgment of the person selecting the sample.
• Generally, no statistical statement should be made about the precision of the result.

Page 12: Statistics for Business and Economics: bab 21

12 Slide

Probabilistic Sampling Methods

The probability of obtaining each possible sample can be computed.

Confidence intervals can be developed which provide bounds on the sampling error.

Methods include simple random, stratified simple random, cluster, and systematic sampling.

Page 13: Statistics for Business and Economics: bab 21

13 Slide

Survey Errors

Two types of errors can occur in conducting a survey:
• Sampling error
• Nonsampling error

Page 14: Statistics for Business and Economics: bab 21

14 Slide

Survey Errors

Sampling Error
• It is defined as the magnitude of the difference between the point estimate, developed from the sample, and the population parameter.
• It occurs because not every element in the population is surveyed.
• It cannot occur in a census.
• It cannot be avoided in a sample survey, but it can be controlled.

Page 15: Statistics for Business and Economics: bab 21

15 Slide

Survey Errors

Nonsampling Error
• It can occur in both a census and a sample survey.
• Examples include:
  • Measurement error
  • Errors due to nonresponse
  • Errors due to lack of respondent knowledge
  • Selection error
  • Processing error

Page 16: Statistics for Business and Economics: bab 21

16 Slide

Survey Errors

Nonsampling Error
• Measurement Error
  • Measuring instruments are not properly calibrated.
  • People taking the measurements are not properly trained.

Page 17: Statistics for Business and Economics: bab 21

17 Slide

Survey Errors

Nonsampling Error
• Errors Due to Nonresponse
  • They occur when no data can be obtained, or only partial data are obtained, for some of the units surveyed.
  • The problem is most serious when a bias is created.

Page 18: Statistics for Business and Economics: bab 21

18 Slide

Survey Errors

Nonsampling Error
• Errors Due to Lack of Respondent Knowledge
  • These errors are common in technical surveys.
  • Some respondents might be more capable than others of answering technical questions.

Page 19: Statistics for Business and Economics: bab 21

19 Slide

Survey Errors

Nonsampling Error
• Selection Error
  • An inappropriate item is included in the survey.
  • For example, in a survey of “small truck owners” some interviewers include SUV owners while other interviewers do not.

Page 20: Statistics for Business and Economics: bab 21

20 Slide

Survey Errors

Nonsampling Error
• Processing Error
  • Data are incorrectly recorded.
  • Data are incorrectly transferred from recording forms to computer files.

Page 21: Statistics for Business and Economics: bab 21

21 Slide

Simple Random Sampling

A simple random sample of size n from a finite population of size N is a sample selected such that every possible sample of size n has the same probability of being selected.

We begin by developing a frame or list of all elements in the population.

Then a selection procedure, based on the use of random numbers, is used to ensure that each element in the sampled population has the same probability of being selected.
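The selection step can be sketched in a few lines of Python. This is only an illustration: the frame of 200 numbered elements, the function name `simple_random_sample`, and the seed are assumptions, not part of the slides.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n distinct elements from the frame without replacement.

    random.sample makes every subset of size n equally likely,
    which is the defining property of a simple random sample.
    """
    rng = random.Random(seed)
    return rng.sample(frame, n)

# Hypothetical frame: elements numbered 1..200
frame = list(range(1, 201))
sample = simple_random_sample(frame, 40, seed=1)
print(sorted(sample)[:5])
```

Fixing the seed only makes the draw repeatable for checking; in a real survey the random numbers would not be chosen by the analyst.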

Page 22: Statistics for Business and Economics: bab 21

22 Slide

Simple Random Sampling

We will see in the upcoming slides how to:
• Estimate the following population parameters:
  • Population mean
  • Population total
  • Population proportion
• Determine the appropriate sample size

Page 23: Statistics for Business and Economics: bab 21

23 Slide

Simple Random Sampling

In a sample survey it is common practice to provide an approximate 95% confidence interval estimate of the population parameter.

Assuming the sampling distribution of the point estimator can be approximated by a normal probability distribution, we use a value of z = 2 for a 95% confidence interval.

The interval estimate is: Point Estimator ± 2(Estimate of the Standard Error of the Point Estimator)

The bound on the sampling error is: 2(Estimate of the Standard Error of the Point Estimator)

Page 24: Statistics for Business and Economics: bab 21

24 Slide

Simple Random Sampling

Population Mean
• Point Estimator

$$\bar{x}$$

• Estimate of the Standard Error of the Mean

$$s_{\bar{x}} = \sqrt{\frac{N-n}{N}}\left(\frac{s}{\sqrt{n}}\right)$$

Page 25: Statistics for Business and Economics: bab 21

25 Slide

Simple Random Sampling

Population Mean
• Interval Estimate

$$\bar{x} \pm z_{\alpha/2}\, s_{\bar{x}}$$

• Approximate 95% Confidence Interval Estimate

$$\bar{x} \pm 2 s_{\bar{x}}$$

Page 26: Statistics for Business and Economics: bab 21

26 Slide

Simple Random Sampling

Population Total
• Point Estimator

$$\hat{X} = N\bar{x}$$

• Estimate of the Standard Error of the Total

$$s_{\hat{X}} = N s_{\bar{x}}$$

Page 27: Statistics for Business and Economics: bab 21

27 Slide

Simple Random Sampling

Population Total
• Interval Estimate

$$N\bar{x} \pm z_{\alpha/2}\, s_{\hat{X}}$$

• Approximate 95% Confidence Interval Estimate

$$N\bar{x} \pm 2 s_{\hat{X}}$$

Page 28: Statistics for Business and Economics: bab 21

28 Slide

Simple Random Sampling

Population Proportion
• Point Estimator

$$\bar{p}$$

• Estimate of the Standard Error of the Proportion

$$s_{\bar{p}} = \sqrt{\left(\frac{N-n}{N}\right)\left(\frac{\bar{p}(1-\bar{p})}{n-1}\right)}$$

Page 29: Statistics for Business and Economics: bab 21

29 Slide

Simple Random Sampling

Population Proportion
• Interval Estimate

$$\bar{p} \pm z_{\alpha/2}\, s_{\bar{p}}$$

• Approximate 95% Confidence Interval Estimate

$$\bar{p} \pm 2 s_{\bar{p}}$$

Page 30: Statistics for Business and Economics: bab 21

30 Slide

Determining the Sample Size

An important consideration in sample design is the choice of sample size.

The best choice usually involves a tradeoff between cost and precision (size of the confidence interval).

Larger samples provide greater precision, but are more costly.

A budget might dictate how large the sample can be.

A specified level of precision might dictate how small a sample can be.

Page 31: Statistics for Business and Economics: bab 21

31 Slide

Determining the Sample Size

Smaller confidence intervals provide more precision.

The size of the approximate confidence interval depends on the bound B on the sampling error.

Choosing a level of precision amounts to choosing a value for B.

Given a desired level of precision, we can solve for the value of n.

Page 32: Statistics for Business and Economics: bab 21

32 Slide

Simple Random Sampling

Necessary Sample Size for Estimating the Population Mean

$$B = 2\sqrt{\frac{N-n}{N}}\left(\frac{s}{\sqrt{n}}\right)$$

Hence,

$$n = \frac{N s^2}{N\left(\frac{B^2}{4}\right) + s^2}$$
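The algebra between the bound and the sample size is short. A sketch of the intermediate steps, using the same symbols (B for the bound on the sampling error, s for the sample standard deviation), squaring the bound and solving for n:

```latex
\begin{align*}
B &= 2\sqrt{\frac{N-n}{N}}\,\frac{s}{\sqrt{n}} \\
\frac{B^2}{4} &= \frac{(N-n)\,s^2}{N n} \\
N n\,\frac{B^2}{4} &= N s^2 - n s^2 \\
n\left(N\,\frac{B^2}{4} + s^2\right) &= N s^2 \\
n &= \frac{N s^2}{N\left(\frac{B^2}{4}\right) + s^2}
\end{align*}
```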

Page 33: Statistics for Business and Economics: bab 21

33 Slide

Example: Innis Investments

Simple Random Sampling

Innis is a financial advisor for 200 clients. A sample of 40 clients has been taken to obtain various demographic data and information about the clients’ investment objectives. Statistics of particular interest are the clients’ age, clients’ total net worth, and the proportion favoring fixed-income investments.

Page 34: Statistics for Business and Economics: bab 21

34 Slide

Example: Innis Investments

Simple Random Sampling

For the sample, the mean age was 52 (with a standard deviation of 10), the mean net worth was $480,000 (with a standard deviation of $120,000), and the proportion favoring fixed-income investments was .30.

Page 35: Statistics for Business and Economics: bab 21

35 Slide

Example: Innis Investments

Estimate of Standard Error of Mean Age

$$s_{\bar{x}} = \sqrt{\frac{N-n}{N}}\left(\frac{s}{\sqrt{n}}\right) = \sqrt{\frac{200-40}{200}}\left(\frac{10}{\sqrt{40}}\right) = 1.41$$

Approximate 95% Confidence Interval for Mean Age

$$\bar{x} \pm 2 s_{\bar{x}} = 52 \pm 2(1.41) = 52 \pm 2.82 = 49.18 \text{ to } 54.82$$
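As a check on the arithmetic, the mean-age calculation can be reproduced with a short Python sketch. The function name `srs_mean_interval` is hypothetical; the inputs are the Innis figures given on the slides. Because the code carries the unrounded standard error, it gives 49.17 to 54.83, while rounding the standard error to 1.41 first gives 49.18 to 54.82.

```python
import math

def srs_mean_interval(xbar, s, n, N, z=2.0):
    """Return (standard error, lower, upper) for a mean under simple random sampling."""
    # Finite population correction times the usual s / sqrt(n)
    se = math.sqrt((N - n) / N) * s / math.sqrt(n)
    return se, xbar - z * se, xbar + z * se

# Innis Investments: N = 200 clients, n = 40, mean age 52, s = 10
se_age, lower_age, upper_age = srs_mean_interval(xbar=52, s=10, n=40, N=200)
print(f"s_xbar = {se_age:.2f}, interval = {lower_age:.2f} to {upper_age:.2f}")
# prints: s_xbar = 1.41, interval = 49.17 to 54.83
```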

Page 36: Statistics for Business and Economics: bab 21

36 Slide

Example: Innis Investments

Point Estimate of Total Net Worth (TNW) of Clients

$$\hat{X} = N\bar{x} = 200(480) = 96{,}000 \text{ thousand} = \$96{,}000{,}000$$

Estimate of Standard Error of TNW

$$s_{\hat{X}} = N s_{\bar{x}} = N\sqrt{\frac{N-n}{N}}\left(\frac{s}{\sqrt{n}}\right) = 200\sqrt{\frac{200-40}{200}}\left(\frac{120}{\sqrt{40}}\right) = 3{,}394.1 \text{ thousand} = \$3{,}394{,}100$$

Approximate 95% Confidence Interval for TNW

$$N\bar{x} \pm 2 s_{\hat{X}} = 96{,}000 \pm 2(3{,}394.1) = 89{,}211.8 \text{ to } 102{,}788.2 \text{ thousand}$$

= $89,211,800 to $102,788,200
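The total-net-worth estimate follows directly from the two formulas for a population total under simple random sampling, with N = 200. A minimal sketch that evaluates them (the function name `srs_total_interval` is an assumption, and amounts are in thousands of dollars):

```python
import math

def srs_total_interval(xbar, s, n, N, z=2.0):
    """Return (point estimate, standard error, lower, upper) for a population total."""
    se_mean = math.sqrt((N - n) / N) * s / math.sqrt(n)
    total = N * xbar          # X-hat = N * x-bar
    se_total = N * se_mean    # s_X-hat = N * s_x-bar
    return total, se_total, total - z * se_total, total + z * se_total

# Net worth in thousands of dollars: mean 480, s = 120, n = 40, N = 200
tnw, se_tnw, lower_tnw, upper_tnw = srs_total_interval(xbar=480, s=120, n=40, N=200)
print(f"total = {tnw:,.0f}, s_Xhat = {se_tnw:,.1f}")
print(f"interval = {lower_tnw:,.1f} to {upper_tnw:,.1f}")
```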

Page 37: Statistics for Business and Economics: bab 21

37 Slide

Example: Innis Investments

Point Estimate of Population Proportion Favoring Fixed-Income Investments

$$\bar{p} = .30$$

Estimate of Standard Error of Proportion

$$s_{\bar{p}} = \sqrt{\left(\frac{N-n}{N}\right)\left(\frac{\bar{p}(1-\bar{p})}{n-1}\right)} = \sqrt{\left(\frac{200-40}{200}\right)\left(\frac{.3(1-.3)}{40-1}\right)} = .0656$$

Approximate 95% Confidence Interval

$$\bar{p} \pm 2 s_{\bar{p}} = .30 \pm 2(.0656) = .1688 \text{ to } .4312$$
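The proportion formula divides by n − 1 = 39, the sample size minus one, and applies the finite population correction with N = 200. A small sketch that evaluates it for the Innis data (the function name `srs_proportion_interval` is hypothetical):

```python
import math

def srs_proportion_interval(p, n, N, z=2.0):
    """Return (standard error, lower, upper) for a proportion under simple random sampling."""
    se = math.sqrt(((N - n) / N) * (p * (1 - p) / (n - 1)))
    return se, p - z * se, p + z * se

# Innis Investments: 30% of the n = 40 sampled clients favor fixed-income investments
se_p, lower_p, upper_p = srs_proportion_interval(p=0.30, n=40, N=200)
print(f"s_p = {se_p:.4f}, interval = {lower_p:.4f} to {upper_p:.4f}")
```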

Page 38: Statistics for Business and Economics: bab 21

38 Slide

Example: Innis Investments

One year later Innis wants to survey his clients again. He now has 250 clients and wants to set a bound of $30,000 on the error of the estimate of their mean net worth.

Necessary Sample Size (values in thousands of dollars)

$$n = \frac{N s^2}{N\left(\frac{B^2}{4}\right) + s^2} = \frac{250(120)^2}{250\left(\frac{(30)^2}{4}\right) + (120)^2} = 50.96$$

He will need a sample size of 51.
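The sample-size formula is easy to evaluate programmatically, rounding up because a fractional client cannot be sampled. A minimal sketch (the function name is an assumption; s = 120 is last year's sample standard deviation in thousands of dollars, used as a planning value):

```python
import math

def srs_sample_size_for_mean(N, s, B):
    """Return (exact n, n rounded up) for estimating a mean with bound B."""
    n_exact = (N * s**2) / (N * B**2 / 4 + s**2)
    return n_exact, math.ceil(n_exact)

# N = 250 clients, s = 120 and B = 30 (both in thousands of dollars)
n_exact, n_needed = srs_sample_size_for_mean(N=250, s=120, B=30)
print(f"n = {n_exact:.2f}, so sample {n_needed} clients")
# prints: n = 50.96, so sample 51 clients
```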

Page 39: Statistics for Business and Economics: bab 21

39 Slide

Stratified Simple Random Sampling

The population is first divided into H groups, called strata.

Then for stratum h, a simple random sample of size n_h is selected.

The data from the H simple random samples are combined to develop an estimate of a population parameter.

If the variability within each stratum is smaller than the variability across the strata, a stratified simple random sample can lead to greater precision.

The basis for forming the various strata depends on the judgment of the designer of the sample.
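Drawing a stratified sample amounts to running an independent simple random sample inside each stratum. A sketch under assumed inputs: the stratum names, the numbered employee IDs, and the per-stratum sample sizes below are illustrative placeholders.

```python
import random

def stratified_sample(strata, sizes, seed=None):
    """Draw an independent simple random sample of size sizes[h] from each stratum h."""
    rng = random.Random(seed)
    return {h: rng.sample(units, sizes[h]) for h, units in strata.items()}

# Hypothetical frame: employee IDs grouped into three age strata
strata = {
    "under_30": list(range(0, 100)),
    "30_to_49": list(range(100, 350)),
    "50_plus": list(range(350, 475)),
}
sizes = {"under_30": 30, "30_to_49": 45, "50_plus": 30}
samples = stratified_sample(strata, sizes, seed=7)
print({h: len(units) for h, units in samples.items()})
```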

Page 40: Statistics for Business and Economics: bab 21

40 Slide

Stratified Simple Random Sampling

Population Mean
• Point Estimator

$$\bar{x}_{st} = \sum_{h=1}^{H}\left(\frac{N_h}{N}\right)\bar{x}_h$$

where:
H = number of strata
$\bar{x}_h$ = sample mean for stratum h
$N_h$ = number of elements in the population in stratum h
N = total number of elements in the population (all strata)

Page 41: Statistics for Business and Economics: bab 21

41 Slide

Stratified Simple Random Sampling

Population Mean
• Estimate of the Standard Error of the Mean

$$s_{\bar{x}_{st}} = \sqrt{\frac{1}{N^2}\sum_{h=1}^{H} N_h(N_h-n_h)\frac{s_h^2}{n_h}}$$

Page 42: Statistics for Business and Economics: bab 21

42 Slide

Stratified Simple Random Sampling

Population Mean
• Interval Estimate

$$\bar{x}_{st} \pm z_{\alpha/2}\, s_{\bar{x}_{st}}$$

• Approximate 95% Confidence Interval Estimate

$$\bar{x}_{st} \pm 2 s_{\bar{x}_{st}}$$

Page 43: Statistics for Business and Economics: bab 21

43 Slide

Stratified Simple Random Sampling

Population Total
• Point Estimator

$$\hat{X} = N\bar{x}_{st}$$

• Estimate of the Standard Error of the Total

$$s_{\hat{X}} = N s_{\bar{x}_{st}}$$

Page 44: Statistics for Business and Economics: bab 21

44 Slide

Stratified Simple Random Sampling

Population Total
• Interval Estimate

$$\hat{X} \pm z_{\alpha/2}\, s_{\hat{X}}$$

• Approximate 95% Confidence Interval Estimate

$$\hat{X} \pm 2 s_{\hat{X}}$$

Page 45: Statistics for Business and Economics: bab 21

45 Slide

Stratified Simple Random Sampling

Population Proportion
• Point Estimator

$$\bar{p}_{st} = \sum_{h=1}^{H}\left(\frac{N_h}{N}\right)\bar{p}_h$$

where:
H = number of strata
$\bar{p}_h$ = sample proportion for stratum h
$N_h$ = number of elements in the population in stratum h
N = total number of elements in the population (all strata)

Page 46: Statistics for Business and Economics: bab 21

46 Slide

Stratified Simple Random Sampling

Population Proportion
• Estimate of Standard Error of the Proportion

$$s_{\bar{p}_{st}} = \sqrt{\frac{1}{N^2}\sum_{h=1}^{H} N_h(N_h-n_h)\left(\frac{\bar{p}_h(1-\bar{p}_h)}{n_h-1}\right)}$$

Page 47: Statistics for Business and Economics: bab 21

47 Slide

Stratified Simple Random Sampling

Population Proportion
• Interval Estimate

$$\bar{p}_{st} \pm z_{\alpha/2}\, s_{\bar{p}_{st}}$$

• Approximate 95% Confidence Interval Estimate

$$\bar{p}_{st} \pm 2 s_{\bar{p}_{st}}$$

Page 48: Statistics for Business and Economics: bab 21

48 Slide

Stratified Simple Random Sampling

Total Sample Size When Estimating Population Mean

$$n = \frac{\left(\sum_{h=1}^{H} N_h s_h\right)^2}{N^2\left(\frac{B^2}{4}\right) + \sum_{h=1}^{H} N_h s_h^2}$$

Page 49: Statistics for Business and Economics: bab 21

49 Slide

Stratified Simple Random Sampling

Total Sample Size When Estimating Population Total

$$n = \frac{\left(\sum_{h=1}^{H} N_h s_h\right)^2}{\frac{B^2}{4} + \sum_{h=1}^{H} N_h s_h^2}$$

Page 50: Statistics for Business and Economics: bab 21

50 Slide

Stratified Simple Random Sampling

Allocating Total Sample Size When Estimating Population Mean or Total

$$n_h = n\left(\frac{N_h s_h}{\sum_{h=1}^{H} N_h s_h}\right)$$
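To see how the total sample size for a mean and its allocation fit together, here is a small sketch. The two strata, their standard deviations, and the bound B = 2 are invented for illustration and do not come from the slides; the function names are likewise assumptions.

```python
import math

def stratified_sample_size_for_mean(N_h, s_h, B):
    """Total n for estimating a population mean with bound B on the sampling error."""
    N = sum(N_h)
    numerator = sum(Nh * sh for Nh, sh in zip(N_h, s_h)) ** 2
    denominator = N**2 * B**2 / 4 + sum(Nh * sh**2 for Nh, sh in zip(N_h, s_h))
    return math.ceil(numerator / denominator)

def allocate(n, N_h, s_h):
    """Split n across strata in proportion to N_h * s_h."""
    weights = [Nh * sh for Nh, sh in zip(N_h, s_h)]
    total = sum(weights)
    return [n * w / total for w in weights]

# Hypothetical strata, not taken from the slides
N_h = [200, 300]
s_h = [10, 20]
n_total = stratified_sample_size_for_mean(N_h, s_h, B=2)
n_by_stratum = allocate(n_total, N_h, s_h)
print(n_total, n_by_stratum)
```

The allocation generally returns fractional values, so in practice each n_h would be rounded, usually upward. Larger or more variable strata receive a larger share of the sample.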

Page 51: Statistics for Business and Economics: bab 21

51 Slide

Stratified Simple Random Sampling

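Total Sample Size When Estimating Population Proportion

$$n = \frac{\left(\sum_{h=1}^{H} N_h\sqrt{\bar{p}_h(1-\bar{p}_h)}\right)^2}{N^2\left(\frac{B^2}{4}\right) + \sum_{h=1}^{H} N_h\,\bar{p}_h(1-\bar{p}_h)}$$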

Page 52: Statistics for Business and Economics: bab 21

52 Slide

Stratified Simple Random Sampling

Allocating Total Sample Size When Estimating Population Proportion

$$n_h = n\left(\frac{N_h\sqrt{\bar{p}_h(1-\bar{p}_h)}}{\sum_{h=1}^{H} N_h\sqrt{\bar{p}_h(1-\bar{p}_h)}}\right)$$

Page 53: Statistics for Business and Economics: bab 21

53 Slide

Example: Mill Creek Co.

Stratified Simple Random Sampling

Mill Creek Co. has used stratified simple random sampling to obtain demographic information and preferences regarding health care coverage for its employees and their families.

The population of employees has been divided into 3 strata on the basis of age: under 30, 30-49, and 50 or over. Some of the sample data are shown on the next slide.

Page 54: Statistics for Business and Economics: bab 21

54 Slide

Example: Mill Creek Co.

Data

                           Annual Family Dental Expense    Proportion
Stratum        N_h    n_h    Mean        St. Dev.          Married
Under 30       100     30    $250        $75               .60
30-49          250     45     400        100               .70
50 or Over     125     30     425        130               .68
Total          475    105

Page 55: Statistics for Business and Economics: bab 21

55 Slide

Example: Mill Creek Co.

Point Estimate of Mean Annual Dental Expense

$$\bar{x}_{st} = \sum_{h=1}^{3}\left(\frac{N_h}{N}\right)\bar{x}_h = \frac{100}{475}(250) + \frac{250}{475}(400) + \frac{125}{475}(425) = \$375$$

Estimate of Standard Error of Mean

$$s_{\bar{x}_{st}} = \sqrt{\frac{1}{N^2}\sum_{h=1}^{3} N_h(N_h-n_h)\frac{s_h^2}{n_h}} = \sqrt{\frac{1}{(475)^2}(19{,}390{,}972)} = 9.27$$
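The stratified mean and its standard error are weighted sums over the three strata, which makes them convenient to verify in code. A sketch using the Mill Creek data from the slides (variable names are assumptions):

```python
import math

# Mill Creek Co. data: stratum sizes, sample sizes, sample means, sample std. devs.
N_h = [100, 250, 125]
n_h = [30, 45, 30]
xbar_h = [250, 400, 425]
s_h = [75, 100, 130]

N = sum(N_h)
# Point estimate: strata means weighted by N_h / N
xbar_st = sum(Nh / N * xb for Nh, xb in zip(N_h, xbar_h))
# Sum inside the square root of the standard-error formula
variance_sum = sum(Nh * (Nh - nh) * sh**2 / nh for Nh, nh, sh in zip(N_h, n_h, s_h))
se_st = math.sqrt(variance_sum / N**2)
lower, upper = xbar_st - 2 * se_st, xbar_st + 2 * se_st

print(f"xbar_st = {xbar_st:.2f}, sum = {variance_sum:,.0f}, s = {se_st:.2f}")
print(f"approximate 95% interval: {lower:.2f} to {upper:.2f}")
```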

Page 56: Statistics for Business and Economics: bab 21

56 Slide

Example: Mill Creek Co.

Approximate 95% Confidence Interval for Mean Annual Dental Expense

$$\bar{x}_{st} \pm 2 s_{\bar{x}_{st}} = 375 \pm 2(9.27) = 356.46 \text{ to } 393.54$$

An approximate 95% confidence interval for mean annual family dental expense is $356.46 to $393.54.

Page 57: Statistics for Business and Economics: bab 21

57 Slide

Example: Mill Creek Co.

Point Estimate of Total Family Expense for All Employees

$$\hat{X} = N\bar{x}_{st} = 475(375) = 178{,}125 = \$178{,}125$$

Approximate 95% Confidence Interval

$$\hat{X} \pm 2 N s_{\bar{x}_{st}} = 178{,}125 \pm 2(475)(9.27) = 178{,}125 \pm 8{,}807$$

= $169,318 to $186,932

Page 58: Statistics for Business and Economics: bab 21

58 Slide

Example: Mill Creek Co.

Point Estimate of Proportion Married

$$\bar{p}_{st} = \sum_{h=1}^{3}\left(\frac{N_h}{N}\right)\bar{p}_h = \frac{100}{475}(.6) + \frac{250}{475}(.7) + \frac{125}{475}(.68) = .6737$$

Estimate of Standard Error of Proportion

$$s_{\bar{p}_{st}} = \sqrt{\frac{1}{N^2}\sum_{h=1}^{3} N_h(N_h-n_h)\left(\frac{\bar{p}_h(1-\bar{p}_h)}{n_h-1}\right)} = \sqrt{\frac{1}{(475)^2}(391.6372)} = .0417$$

Approximate 95% Confidence Interval for Proportion

$$\bar{p}_{st} \pm 2 s_{\bar{p}_{st}} = .6737 \pm 2(.0417) = .5903 \text{ to } .7571$$
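The same weighted-sum pattern applies to the proportion married, with p(1 − p)/(n_h − 1) taking the place of s_h²/n_h. A sketch using the Mill Creek data (variable names are assumptions):

```python
import math

# Mill Creek Co. data: stratum sizes, sample sizes, sample proportions married
N_h = [100, 250, 125]
n_h = [30, 45, 30]
p_h = [0.60, 0.70, 0.68]

N = sum(N_h)
p_st = sum(Nh / N * p for Nh, p in zip(N_h, p_h))
variance_sum = sum(
    Nh * (Nh - nh) * p * (1 - p) / (nh - 1)
    for Nh, nh, p in zip(N_h, n_h, p_h)
)
se_p_st = math.sqrt(variance_sum / N**2)

print(f"p_st = {p_st:.4f}, sum = {variance_sum:.4f}, s = {se_p_st:.4f}")
print(f"approximate 95% interval: {p_st - 2 * se_p_st:.4f} to {p_st + 2 * se_p_st:.4f}")
```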

Page 59: Statistics for Business and Economics: bab 21

59 Slide

Cluster Sampling

Cluster sampling requires that the population be divided into N groups of elements called clusters.

We would define the frame as the list of N clusters.

We then select a simple random sample of n clusters.

We would then collect data for all elements in each of the n clusters.

Page 60: Statistics for Business and Economics: bab 21

60 Slide

Cluster Sampling

Cluster sampling tends to provide better results than stratified sampling when the elements within the clusters are heterogeneous.

A primary application of cluster sampling involves area sampling, where the clusters are counties, city blocks, or other well-defined geographic sections.

Page 61: Statistics for Business and Economics: bab 21

61 Slide

Cluster Sampling

Notation
N = number of clusters in the population
n = number of clusters selected in the sample
$M_i$ = number of elements in cluster i
M = number of elements in the population
$\bar{M}$ = M/N = average number of elements in a cluster
$x_i$ = total of all observations in cluster i
$a_i$ = number of observations in cluster i with a certain characteristic

Page 62: Statistics for Business and Economics: bab 21

62 Slide

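Cluster Sampling

Population Mean
• Point Estimator

$$\bar{x}_c = \frac{\sum_{i=1}^{n} x_i}{\sum_{i=1}^{n} M_i}$$

• Estimate of Standard Error of the Mean

$$s_{\bar{x}_c} = \sqrt{\left(\frac{N-n}{N n \bar{M}^2}\right)\frac{\sum_{i=1}^{n}(x_i - \bar{x}_c M_i)^2}{n-1}}$$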

Page 63: Statistics for Business and Economics: bab 21

63 Slide

Cluster Sampling

Population Mean
• Interval Estimate

$$\bar{x}_c \pm z_{\alpha/2}\, s_{\bar{x}_c}$$

• Approximate 95% Confidence Interval Estimate

$$\bar{x}_c \pm 2 s_{\bar{x}_c}$$

Page 64: Statistics for Business and Economics: bab 21

64 Slide

Cluster Sampling

Population Total
• Point Estimator

$$\hat{X} = M\bar{x}_c$$

• Estimate of Standard Error of the Total

$$s_{\hat{X}} = M s_{\bar{x}_c}$$

Page 65: Statistics for Business and Economics: bab 21

65 Slide

Cluster Sampling

Population Total
• Interval Estimate

$$M\bar{x}_c \pm z_{\alpha/2}\, s_{\hat{X}}$$

• Approximate 95% Confidence Interval Estimate

$$M\bar{x}_c \pm 2 s_{\hat{X}}$$

Page 66: Statistics for Business and Economics: bab 21

66 Slide

Cluster Sampling

Population Proportion
• Point Estimator

$$\bar{p}_c = \frac{\sum_{i=1}^{n} a_i}{\sum_{i=1}^{n} M_i}$$

Page 67: Statistics for Business and Economics: bab 21

67 Slide

Cluster Sampling

Population Proportion
• Estimate of Standard Error of the Proportion

$$s_{\bar{p}_c} = \sqrt{\left(\frac{N-n}{N n \bar{M}^2}\right)\frac{\sum_{i=1}^{n}(a_i - \bar{p}_c M_i)^2}{n-1}}$$

Page 68: Statistics for Business and Economics: bab 21

68 Slide

Cluster Sampling

Population Proportion
• Interval Estimate

$$\bar{p}_c \pm z_{\alpha/2}\, s_{\bar{p}_c}$$

• Approximate 95% Confidence Interval Estimate

$$\bar{p}_c \pm 2 s_{\bar{p}_c}$$

Page 69: Statistics for Business and Economics: bab 21

69 Slide

Example: Cooper County Schools

Cluster Sampling

There are 40 high schools in Cooper County. School officials are interested in the effect of participation in athletics on academic preparation for college.

A cluster sample of 5 schools has been taken and a questionnaire administered to all the seniors on the football team at those schools. There are a total of 1200 high school seniors in the county playing football.

Data obtained from the questionnaire are shown on the next slide.

Page 70: Statistics for Business and Economics: bab 21

70 Slide

Example: Cooper County Schools

Data

           Number        Average       Number Planning
School     of Players    SAT Score     to Attend College
1          45            840           15
2          20            980           16
3          30            905           12
4          38            880           18
5          40            970           23
Total      173                         84

Page 71: Statistics for Business and Economics: bab 21

71 Slide

Example: Cooper County Schools

Point Estimate of Mean SAT Score

$$\bar{x}_c = \frac{\sum_{i=1}^{5} x_i}{\sum_{i=1}^{5} M_i} = \frac{45(840) + 20(980) + \cdots + 40(970)}{45 + 20 + 30 + 38 + 40} = 906$$
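Each cluster total x_i is the school's number of players times its average SAT score, so the point estimate is a ratio of two sums. A short sketch with the Cooper County data (variable names are assumptions):

```python
# Cooper County data: players per sampled school (M_i) and average SAT score
players = [45, 20, 30, 38, 40]
avg_sat = [840, 980, 905, 880, 970]

# x_i = total of all SAT scores in cluster i
x_i = [m * avg for m, avg in zip(players, avg_sat)]
xbar_c = sum(x_i) / sum(players)

print(sum(x_i), sum(players), round(xbar_c, 2))
# prints: 156790 173 906.3
```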

Page 72: Statistics for Business and Economics: bab 21

72 Slide

Example: Cooper County Schools

Estimate of Standard Error of the Mean

$$s_{\bar{x}_c} = \sqrt{\left(\frac{N-n}{N n \bar{M}^2}\right)\frac{\sum_{i=1}^{5}(x_i - \bar{x}_c M_i)^2}{n-1}}$$

With N = 40 clusters in the population, n = 5 clusters in the sample, and $\bar{M}$ = 1200/40 = 30:

$$s_{\bar{x}_c} = \sqrt{\left(\frac{40-5}{(40)(5)(30)^2}\right)\frac{18{,}541{,}944}{5-1}} = 30.02$$

Page 73: Statistics for Business and Economics: bab 21

73 Slide

Example: Cooper County Schools

Approximate 95% Confidence Interval

$$\bar{x}_c \pm 2 s_{\bar{x}_c} = 906 \pm 2(30.02) = 846 \text{ to } 966$$

Point Estimate of Proportion Planning to Attend College

$$\bar{p}_c = \frac{\sum_{i=1}^{5} a_i}{\sum_{i=1}^{5} M_i} = \frac{84}{173} = .49$$
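The cluster formulas for the mean and the proportion have the same shape: a ratio of sums, with a standard error built from the deviations of each cluster total from estimate × M_i. The sketch below wraps that in one hypothetical helper, `cluster_estimate`, taking N as the number of clusters in the population (40 schools) and M as the number of elements (1200 players). It carries the unrounded estimate through the calculation, so the standard error for the mean comes out a few hundredths away from a hand calculation that rounds the mean to 906 first. The standard error for the proportion is not worked on the slides and is shown only as an application of the same formula.

```python
import math

def cluster_estimate(totals, sizes, N, M):
    """Ratio estimate sum(totals) / sum(sizes) and its standard error under cluster sampling."""
    n = len(totals)                      # number of sampled clusters
    M_bar = M / N                        # average cluster size in the population
    estimate = sum(totals) / sum(sizes)
    ss = sum((t - estimate * m) ** 2 for t, m in zip(totals, sizes))
    se = math.sqrt((N - n) / (N * n * M_bar**2) * ss / (n - 1))
    return estimate, se

# Cooper County data from the slides
sizes = [45, 20, 30, 38, 40]                                       # M_i
sat_totals = [m * avg for m, avg in zip(sizes, [840, 980, 905, 880, 970])]  # x_i
college = [15, 16, 12, 18, 23]                                     # a_i

xbar_c, se_xbar_c = cluster_estimate(sat_totals, sizes, N=40, M=1200)
p_c, se_p_c = cluster_estimate(college, sizes, N=40, M=1200)

print(f"mean SAT: {xbar_c:.1f} +/- {2 * se_xbar_c:.1f}")
print(f"proportion: {p_c:.3f} +/- {2 * se_p_c:.3f}")
```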

Page 74: Statistics for Business and Economics: bab 21

74 Slide

Systematic Sampling

Systematic sampling is often used as an alternative to simple random sampling, which can be time-consuming if a large population is involved.

If a sample size of n from a population of size N is desired, we might sample one element for every N/n elements in the population.

We would randomly select one of the first N/n elements and then select every (N/n)th element thereafter.

Since the first element selected is a random choice, a systematic sample is often assumed to have the properties of a simple random sample.
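The procedure can be sketched as follows. The function name `systematic_sample`, the 200-element frame, and the seed are assumptions for illustration, and the sketch assumes n is no larger than N and uses the integer part of N/n as the sampling interval.

```python
import random

def systematic_sample(frame, n, seed=None):
    """Pick a random start among the first k = N // n elements, then take every k-th element."""
    k = len(frame) // n              # sampling interval
    rng = random.Random(seed)
    start = rng.randrange(k)         # random position within the first k elements
    return frame[start::k][:n]

# Hypothetical frame: elements numbered 1..200, sample of 40, so every 5th element
frame = list(range(1, 201))
sample = systematic_sample(frame, 40, seed=3)
print(sample[:4])
```

Only one random number is needed, which is what makes the method quick for large populations. If the frame has a periodic pattern that lines up with the interval, the sample may not behave like a simple random sample.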

Page 75: Statistics for Business and Economics: bab 21

75 Slide

End of Chapter 21