Chapter 9 Estimating the Value of a ParameterThe methods for estimating parameters (such as or p) by...

18
Chapter 9 Estimating the Value of a Parameter Section 9.1 Estimating a Population Proportion Objectives 1. Obtain a point estimate for the population proportion 2. Construct and interpret a confidence interval for the population proportion 3. Determine the sample size necessary for estimating a population proportion within a specified margin of error Obtain a Point Estimate for the Population Proportion Example Obtaining a Point Estimate of a Population Proportion In a survey of 2019 adult Americans aged 18 years or older, 1252 stated that they frequently worry about their financial situation. Find the sample proportion of adult Americans aged 18 years or older who frequently worry about their financial situation. Note: Round proportions to three decimal places. Construct and Interpret a Confidence Interval for the Population Proportion Two questions. 1. Why does the level of confidence represent the expected proportion of intervals that contain the parameter if a large number of different samples is obtained? 2. How is the margin of error determined?

Transcript of Chapter 9 Estimating the Value of a ParameterThe methods for estimating parameters (such as or p) by...

Page 1: Chapter 9 Estimating the Value of a ParameterThe methods for estimating parameters (such as or p) by constructing confidence intervals rely on parametric methods. Parametric statistics

Chapter 9 Estimating the Value of a Parameter Section 9.1 Estimating a Population Proportion Objectives 1. Obtain a point estimate for the population proportion

2. Construct and interpret a confidence interval for the population proportion

3. Determine the sample size necessary for estimating a population proportion within a specified margin of error

❶ Obtain a Point Estimate for the Population Proportion

Example Obtaining a Point Estimate of a Population Proportion In a survey of 2019 adult Americans aged 18 years or older, 1252 stated that they frequently worry about their financial situation. Find the sample proportion of adult Americans aged 18 years or older who frequently worry about their financial situation. Note: Round proportions to three decimal places. ❷ Construct and Interpret a Confidence Interval for the Population Proportion

Two questions.

1. Why does the level of confidence represent the expected proportion of intervals that contain the parameter if a large number of different samples is obtained?

2. How is the margin of error determined?

Page 2: Chapter 9 Estimating the Value of a ParameterThe methods for estimating parameters (such as or p) by constructing confidence intervals rely on parametric methods. Parametric statistics

A review of the sampling distribution of the sample proportion, .

95% of all sample proportions are in the following inequality: Rewrite the inequality with the population proportion, p, in the middle: So, 95% of all sample proportions will result in confidence interval estimates that contain the population proportion, while 5% of all sample proportions (those in the tails of the distribution above), will result in confidence interval estimates that do not contain the population proportion. Write the 95% confidence interval in the form point estimate margin of error: Whatisthemarginoferrorfora95%confidenceinterval?

±

Page 3: Chapter 9 Estimating the Value of a ParameterThe methods for estimating parameters (such as or p) by constructing confidence intervals rely on parametric methods. Parametric statistics

ActivityIllustratingtheMeaningofLevelofConfidenceGotoStatCrunchandselectApplets>Confidenceintervals>foraproportion.Checkthe“Proportionwithcharacteristic”radiobuttonandsetpto0.3.ClickCompute!Orgotowww.pearsonhigher.com/sullivanstatsandselectthe“ConfidenceIntervalsforaProportionwithp=0.3”applet.Changethesamplesizeto150(thisistoensurethesamplingdistributionofthesampleproportionisapproximatelynormal).(a)Click“100intervals”twotimestogenerateconfidenceintervalsfor200independentsimplerandomsamplesofsizen=150fromapopulationwithp=0.3.Whatproportionofthe200intervalsincludethepopulationproportion? (b) What causes an interval to not include the population proportion?

Page 4: Chapter 9 Estimating the Value of a ParameterThe methods for estimating parameters (such as or p) by constructing confidence intervals rely on parametric methods. Parametric statistics

Constructing Any Confidence Interval

The value is called the critical value.

1−α( ) ⋅100%

2

Page 5: Chapter 9 Estimating the Value of a ParameterThe methods for estimating parameters (such as or p) by constructing confidence intervals rely on parametric methods. Parametric statistics

Example Constructing a Confidence Interval for a Population Proportion In a survey of 2019 adult Americans aged 18 years or older, 1252 stated that they frequently worry about their financial situation. Obtain a 90% confidence interval for the proportion of adult Americans who frequently worry about their financial situation.

Page 6: Chapter 9 Estimating the Value of a ParameterThe methods for estimating parameters (such as or p) by constructing confidence intervals rely on parametric methods. Parametric statistics

Example The Role of the Level of Confidence on the Margin of Error Redo the previous example by increasing the level of confidence from 90% to 95%. Example The Role of the Sample Size on the Margin of Error Suppose the number of individuals surveyed is increased four-fold to 8076 (but the sample proportion remains at 0.620. Compute the margin of error. Compare this margin of error to that when the sample size is 2019. ➌ Determine the Sample Size Necessary for Estimating a Population Proportion within a Specified Margin of Error

Solve for n.

E = zα

2

⋅p̂ 1− p̂( )

n

Page 7: Chapter 9 Estimating the Value of a ParameterThe methods for estimating parameters (such as or p) by constructing confidence intervals rely on parametric methods. Parametric statistics

Example Determining Sample Size A social worker wants to know the proportion of the U.S. population over age 16 who provide eldercare – unpaid care for someone with a condition related to aging – to others. What size sample should be obtained if the social worker wants an estimate within 3 percentage points of the true proportion with 95% confidence if

(a) the social worker uses the 2015 estimate from the American Time Use Survey of 16%. (b) the social worker does not use any prior estimates?

Page 8: Chapter 9 Estimating the Value of a ParameterThe methods for estimating parameters (such as or p) by constructing confidence intervals rely on parametric methods. Parametric statistics

9.2 Estimating a Population Mean Objectives 1.Obtainapointestimateforthepopulationmean

2.StatepropertiesofStudent’st-distribution

3.Determinet-values

4.Constructandinterpretaconfidenceintervalforapopulationmean

5.Determinethesamplesizeneededtoestimatethepopulationmeanwithinaspecifiedmarginoferror

The point estimate of the population mean, , is the sample mean, . Example Computing a Point Estimate of the Population Mean The following data represent the amount spent on Valentine’s Day among males who purchased gifts for the holiday. The data is based on reports from fundivo.com.

$216 $156 $125 $133 $229 $155 $146 $120 $189 $201 $137 $262

Obtain a point estimate of the population mean amount spent on Valentine’s Day among males who purchased gifts for the holiday. ❷ State Properties of Student’s t-distribution Using the logic used in constructing a confidence interval about a population proportion, the formula for a confidence interval about a population mean should be of the form

Theproblemwiththisformulaisthatwedonotknowthepopulationstandarddeviation, .Thesolutiontothisproblemwouldbetoreplace withthesamplestandarddeviation,s.The“new”formulafortheconfidenceintervalwillbe

µ x

σσ

Page 9: Chapter 9 Estimating the Value of a ParameterThe methods for estimating parameters (such as or p) by constructing confidence intervals rely on parametric methods. Parametric statistics

Theproblemwiththisapproachisthatthesamplestandarddeviationisarandomvariable.

Wecannotusethecriticalvalue inthisformulabecause doesnotfollowthe

normaldistributionwithmean0andstandarddeviation1.ThesolutionistouseanewprobabilitydistributioncalledStudent’st-distribution.

ActivityComparingtheStandardNormalDistributiontothet-DistributionUsingSimulation (a)UseStatCrunchtoobtain2000simplerandomsamplesofsizen=9fromanormalpopulationwithμ = 100andstandarddeviationσ=15.Calculatethesamplemeanand

samplestandarddeviationforeachsample.Compute and

foreachsample.Drawhistogramsfrobothzandt.

2

x − µsn

z = x − µσn

= x −10015

9

= x −1005

t = x − µsn

= x −100s9

Page 10: Chapter 9 Estimating the Value of a ParameterThe methods for estimating parameters (such as or p) by constructing confidence intervals rely on parametric methods. Parametric statistics

(b)Repeatpart(a)for2000simplerandomsamplesofsizen=16.

Page 11: Chapter 9 Estimating the Value of a ParameterThe methods for estimating parameters (such as or p) by constructing confidence intervals rely on parametric methods. Parametric statistics

❸ Determine t-Values The notation is the z-score such that the area under the standard normal curve to the right of

is . Similarly, will be the t-value such that the area under the t-distribution to the right

of will be . Example Finding t-Values Find the t-value such that the area under the t-distribution to the right of the t-value is 0.05, assuming 20 degrees of freedom. That is, find t0.05 with 20 degrees of freedom. ❹ Construct and Interpret a Confidence Interval for a Population Mean

TheprocedureforconstructingaconfidenceintervalaboutthepopulationmeanusingStudent’st-distributionisrobust–meaningitisaccuratedespiteminordeparturesfromnormality.

α

zα α α

tα α

Page 12: Chapter 9 Estimating the Value of a ParameterThe methods for estimating parameters (such as or p) by constructing confidence intervals rely on parametric methods. Parametric statistics

ExampleConstructingaConfidenceIntervalaboutaPopulationMeanThe following data represent the amount spent on Valentine’s Day among males who purchased gifts for the holiday. The data is based on reports from fundivo.com. Construct a 90% confidence interval for the mean amount spent on Valentine’s Day among males who purchased gifts for the holiday.

$216 $156 $125 $133 $229 $155 $146 $120 $189 $201 $137 $262

❺ Determine the Sample Size Needed to Estimate a Population Mean within a Specified Margin of Error

Example Determining Sample Size Again consider the Valentine’s Day gift data. How large a sample is required to estimate the mean amount spent on Valentine’s Day among males who purchased gifts for the holiday within $10 with 90% confidence.

Page 13: Chapter 9 Estimating the Value of a ParameterThe methods for estimating parameters (such as or p) by constructing confidence intervals rely on parametric methods. Parametric statistics

9.4 Putting It Together: Which Procedure Do I Use? Objective1.DeterminetheAppropriateConfidenceIntervaltoConstruct❶ Determine the Appropriate Confidence Interval to Construct

ExampleConstructingaConfidenceIntervalInarecentsurveyof2221adultAmericans,1666indicatedthattheyturnofflights,televisions,orotherapplianceswhennotinuse.

(a) Whatisthevariableofinterestinthisstudy?Isitqualitativeorquantitative?(b) Whattypeofconfidenceintervalwouldmakesensetoconstructforthisvariableof

interest?(c) Verifythemodelrequirementstoconstructtheconfidenceintervalaresatisfied.(d) Constructa95%confidenceintervalforthevariableofinterest.

Page 14: Chapter 9 Estimating the Value of a ParameterThe methods for estimating parameters (such as or p) by constructing confidence intervals rely on parametric methods. Parametric statistics

ExampleConstructingaConfidenceIntervalThefollowingdatarepresentthepitchspeed,inmilesperhour,ofarandomsampleof12slidersthrownbyJakeArrietta,pitcherfortheChicagoCubs.88.09 89.01 90.45 89.19 89.74 86.7188.58 88.82 86.92 87.44 89.07 89.00Source:BrooksBaseball.net

(a) Whatisthevariableofinterestinthisstudy?Isitqualitativeorquantitative?(b) Whattypeofconfidenceintervalwouldmakesensetoconstructforthisvariableof

interest?(c) Verifythemodelrequirementstoconstructtheconfidenceintervalaresatisfied.(d) Constructa90%confidenceintervalforthevariableofinterest.

Page 15: Chapter 9 Estimating the Value of a ParameterThe methods for estimating parameters (such as or p) by constructing confidence intervals rely on parametric methods. Parametric statistics

Bootstrapping Objective

1. EstimateaparameterusingthebootstrapmethodThe methods for estimating parameters (such as or p ) by constructing confidence intervals rely on parametric methods. Parametric statistics use estimates of parameters along with probability distributions to make inferences about the parameter. For example, we use the normal probability distribution to make inferences about the population proportion by constructing a confidence interval. To make inferences about a parameter using parametric statistics certain conditions (such as a normality requirement) must be satisfied. What if the conditions for using parametric statistics are not satisfied? In this case, nonparametric statistics, which do not require sample data follow a specified distribution, may be used. The bootstrap method was invented by Bradley Efron, from Stanford University, in 1979. Bootstrapping is a method in which sample data may be used to build a sampling distribution of a sample statistic without having access to the entire population. To use bootstrapping, however, two basic requirements must be satisfied.

1. The “center” of the bootstrap distribution must be close to the “center” of the original sample data. For example, the mean of all bootstrap means must be close to the mean of the original data.

2. The distribution of the bootstrap sample statistic must be symmetric.

If using the bootstrap method to obtain a confidence interval for a population mean,

we can use the distribution of the sample means obtained from the B resamples as an approximation for the actual sampling distribution. A confidence interval

may be obtained using the percentile method. With this method, determine the

percentile (the lower bound) and the percentile (the upper bound) of the

distribution. For example, the lower bound of a 95% confidence interval would be the 2.5th percentile and the upper bound would be the 97.5th percentile.

µ

( )1 100%a- ×

100th2a×

1 100th2aæ ö- ×ç ÷

è ø

DEFINITIONBootstrappingisacomputer-intensiveapproachtostatisticalinferencewherebyparametersareestimatedbytreatingasetofsampledataasapopulation.Acomputerisusedtoresamplewithreplacementnobservationsfromthesampledata.Theprocessisrepeatedmanytimes.Foreachresample,thestatistic(suchasthesamplemean)isobtained.

Page 16: Chapter 9 Estimating the Value of a ParameterThe methods for estimating parameters (such as or p) by constructing confidence intervals rely on parametric methods. Parametric statistics

TheBootstrapAlgorithm

1. SelectBindependentbootstrapsamplesofsizenwithreplacement.Notethatnisthenumberofobservationsintheoriginalsample.So,iftheoriginalsamplehas10observations,eachbootstrapsamplewillalsohave10observations.Samplingwithreplacementmeansthatonceanobservationisselectedtobeinthesample,itsvalueis“putbackintothehat”soitmaybesampledagain.

2. Determinethevalueofthestatisticofinterest,suchasthemean,foreachoftheBsamples.

3. UsethedistributionoftheBstatisticstomakeajudgementaboutthevalueoftheparameter.Forexample,findthe2.5thand97.5thpercentilestodeterminethelowerandupperboundofa95%confidenceinterval.

EXAMPLE 1 Using the Bootstrap Method to Construct a 95% Confidence Interval Thefollowingdatarepresentthepitchspeed,inmilesperhour,ofarandomsampleof12slidersthrownbyJakeArrietta,pitcherfortheChicagoCubs.Constructa95%confidenceintervalforthemeanpitchspeed,inmilesperhour,ofaJakeArriettaslider.88.09 89.01 90.45 89.19 89.74 86.7188.58 88.82 86.92 87.44 89.07 89.00Source:BrooksBaseball.netBootstrap Confidence Intervals using Simulation 1. Enter the raw data into column var1. Name the column. 2. Select the Data menu, highlight Sample. Select the variable from column var1. In the “Sample Size:” cell, enter the number of observations, n. In the “Number of samples:” cell, enter the number of resamples. Check the “Sample with replacement” box. Check the radio button “Stacked with a sample id”. Click Compute!. 3. Select the Stat menu, highlight Summary Stats, and select Columns. Select the Sample(var1) under “Select Columns:” For example, if the variable name in column 1 is MPG, select Sample(MPG). In the dropdown menu “Group by:” select “Sample”. Select only the Mean (or statistic of interest) in the Statistics menu. Check the box “Store output in data table”. Click Compute!. A popup window will appear that says, “Whoa! Lots of numeric. . .” Click Cancel. 4. Select the Stat menu, highlight Summary Stats, and select Columns. Select the column “Mean” (or statistic of interest). Highlight “Mean” (or statistic of interest) in the “Statistics:” cell. Enter the percentiles corresponding to the lower and upper bound in the “Percentiles” cell. For example, enter 2.5, 97.5 for a 95% confidence interval. Click Compute!.

Page 17: Chapter 9 Estimating the Value of a ParameterThe methods for estimating parameters (such as or p) by constructing confidence intervals rely on parametric methods. Parametric statistics

Whatwasthe95%confidenceintervalforanArriettasliderusingStudent’st-distribution?The approach used in Example 1 illustrates the bootstrap method and logic, but many statistical spreadsheets (such as StatCrunch) have built-in algorithms that do bootstrapping. EXAMPLE 2 Using StatCrunch’s Resample Command and Bootstrap Applet to Obtain a Bootstrap Confidence Interval UseStatCrunchtoestimatea95%confidenceintervalforthemeanspeedofanArriettasliderusingStatCrunch’sResampleCommandandBootstrapapplet.Bootstrap Confidence Intervals Using the Resample Command Select the Stat menu, highlight Resample, and select Statistic. Under “Columns to resample:” select the variable you wish to estimate. Enter the statistic you wish to estimate in the “Statistic:” cell. For example, enter “mean(variable name)” for the mean, “median(variable name)” for the median. Select the “Bootstrap” radio button. Select the “Univariate” radio button. Enter B, the number of resamples, in the “Number of resamples:” cell. Enter the percentiles corresponding to the confidence interval you wish to determine. Check any boxes you wish. Click Compute!. Bootstrap Confidence Intervals Using the Bootstrapping Applet

1. Select the Applets menu, highlight Resampling, and select Bootstrap a statistic. Select the “From data table:” radio button. In the pull-down menu “Samples in:”, choose the column that contains the original sample data. In the pull-down menu “Statistic:”, select the statistic of interest. Click Compute!.

Click 1000 times for as many bootstrap samples as you want (at least 1000). Determine the lower and upper bounds of the confidence interval from the percentiles.

Page 18: Chapter 9 Estimating the Value of a ParameterThe methods for estimating parameters (such as or p) by constructing confidence intervals rely on parametric methods. Parametric statistics

Bootstrapping to Estimate a Population Proportion To construct a bootstrap confidence interval for a proportion, we need raw data using 0 for a failure and 1 for a success. The sample proportion is then estimated using the mean of the 0s and 1s. Some Final Thoughts Bootstrapping appears to be a powerful method for constructing confidence intervals without the need for underlying model requirements (such as the normality condition for constructing confidence intervals for the mean when sample sizes are small). However, there are some items to be aware of when constructing confidence intervals using the bootstrap approach.

1. If you have small samples, it is possible that the bootstrap distribution will not accurately reflect the true sampling distribution. This could happen if the sample data has less variability than the underlying population.

2. The standard error of bootstrap distributions tends to be narrower by a factor of

than those obtained using .

3. Although Efron originally stated that 1000 resamples is enough, it is better to have 10,000 resamples.

1nn- s

n