Section 5.2 Confidence Intervals and P-values using Normal Distributions.

17
Section 5.2 Confidence Intervals and P-values using Normal Distributions

Transcript of Section 5.2 Confidence Intervals and P-values using Normal Distributions.

Page 1: Section 5.2 Confidence Intervals and P-values using Normal Distributions.

Section 5.2

Confidence Intervals and P-values using

Normal Distributions

Page 2: Section 5.2 Confidence Intervals and P-values using Normal Distributions.

slope (thousandths)-60 -40 -20 0 20 40 60

Measures from Scrambled RestaurantTips Dot Plot

r-0.6 -0.4 -0.2 0.0 0.2 0.4 0.6

Measures from Scrambled Collection 1 Dot Plot

Nullxbar98.2 98.3 98.4 98.5 98.6 98.7 98.8 98.9 99.0

Measures from Sample of BodyTemp50 Dot Plot

Diff-4 -3 -2 -1 0 1 2 3 4

Measures from Scrambled CaffeineTaps Dot Plot

xbar26 27 28 29 30 31 32

Measures from Sample of CommuteAtlanta Dot Plot

Slope :Restaurant tips

Correlation: Malevolent uniforms

Mean :Body Temperatures

Diff means: Finger taps

Mean : Atlanta commutes

phat0.3 0.4 0.5 0.6 0.7 0.8

Measures from Sample of Collection 1 Dot PlotProportion : Owners/dogs

Bootstrap and Randomization Distributions

Page 3: Section 5.2 Confidence Intervals and P-values using Normal Distributions.

Central Limit Theorem

If random samples with large enough sample size are taken, then

the distribution of sample statistics for a mean or a proportion is

normally distributed.

Page 4: Section 5.2 Confidence Intervals and P-values using Normal Distributions.

CLT and Sample Size

The CLT is true for ANY original distribution, although “sufficiently large sample size” varies

• The more skewed the original distribution is, the larger n has to be for the CLT to work

• For quantitative variables that are not very skewed, n ≥ 30 is usually sufficient

• For categorical variables, counts of at least 10 within each category is usually sufficient

Page 5: Section 5.2 Confidence Intervals and P-values using Normal Distributions.

Confidence Interval using N(0,1)

If a statistic is normally distributed, we find a confidence interval for the parameter using

statistic z* SE

where the area between –z* and +z* in the standard normal distribution is the desired

level of confidence.

Page 6: Section 5.2 Confidence Intervals and P-values using Normal Distributions.

Hearing Loss

• In a random sample of 1771 Americans aged 12 to 19, 19.5% had some hearing loss

• What proportion of Americans aged 12 to 19 have some hearing loss? Give a 95% CI.

Rabin, R. “Childhood: Hearing Loss Grows Among Teenagers,” www.nytimes.com, 8/23/10.

Page 7: Section 5.2 Confidence Intervals and P-values using Normal Distributions.

Hearing Loss

(0.177, 0.214)

Page 8: Section 5.2 Confidence Intervals and P-values using Normal Distributions.

Hearing Loss

Page 9: Section 5.2 Confidence Intervals and P-values using Normal Distributions.

Hearing LossFind a 99% confidence interval for the

proportion of Americans aged 12-19 with some hearing loss.

statistic z* SE0.195 2.575 0.0095

(0.171, 0.219)

Page 10: Section 5.2 Confidence Intervals and P-values using Normal Distributions.

News Sources• “A new national survey shows that the majority

(64%) of American adults use at least three different types of media every week to get news and information about their local community”

• The standard error for this statistic is 1%

• Find a 90% CI for the true proportion.

Source: http://pewresearch.org/databank/dailynumber/?NumberID=1331

statistic z* SE0.64 1.645 0.01

(0.624, 0.656)

Page 11: Section 5.2 Confidence Intervals and P-values using Normal Distributions.

P-values

If the randomization distribution is normal:

To calculate a P-value, we just need to find the area in the appropriate tail(s) beyond the observed statistic of the distribution

N(null value, SE)

Page 12: Section 5.2 Confidence Intervals and P-values using Normal Distributions.

P-value using N(0,1)

If a statistic is normally distributed under H0, the p-value is the probability a standard normal is beyond

Page 13: Section 5.2 Confidence Intervals and P-values using Normal Distributions.

Standardized Test Statistic

The standardized test statistic is the number of standard errors a statistic is from the null:

• Calculating the number of standard errors a statistic is from the null value allows us to assess extremity on a common scale

Page 14: Section 5.2 Confidence Intervals and P-values using Normal Distributions.

First Born Children• Are first born children actually smarter?

• Explanatory variable: first born or not• Response variable: combined SAT score

• Based on a sample of college students, we find

• From a randomization distribution, we find SE = 37

Page 15: Section 5.2 Confidence Intervals and P-values using Normal Distributions.

First Born Children

SE = 37

1) Find the standardized test statistic

2) Compute the p-value

= = 0.818

p-value

Page 16: Section 5.2 Confidence Intervals and P-values using Normal Distributions.

Summary: Confidence Intervals

From original data

From bootstrap

distribution

From N(0,1)

statistic z* SE

Page 17: Section 5.2 Confidence Intervals and P-values using Normal Distributions.

Summary: P-values

From randomization distribution

From H0From original

data

Compare to N(0,1) for p-value

𝒔𝒕𝒂𝒕𝒊𝒔𝒕𝒊𝒄−𝒏𝒖𝒍𝒍𝑺𝑬