September 15. In Chapter 11: 11.1 Estimated Standard Error of the Mean 11.2 Student’s t...

Jun 18, 2022 Chapter 11: Inference About a Mean


Apr 19, 2023

Chapter 11: Inference About a Mean


In Chapter 11:

11.1 Estimated Standard Error of the Mean

11.2 Student’s t Distribution

11.3 One-Sample t Test

11.4 Confidence Interval for μ

11.5 Paired Samples

11.6 Conditions for Inference

11.7 Sample Size and Power


§11.1 Estimated Standard Error of the Mean

• We rarely know the population standard deviation σ. Instead, we calculate the sample standard deviation s and use it as an estimate of σ.

• We then use s to calculate the estimated standard error of the mean:

$SE_{\bar{x}} = \dfrac{s}{\sqrt{n}}$

• Using s instead of σ adds a source of uncertainty, so z procedures no longer apply. Use t procedures instead.
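As a quick numerical check, the estimated standard error can be computed directly from s and n. This is a minimal sketch: the function name `estimated_standard_error` is hypothetical, and the inputs s = 720 and n = 10 are the summary values of this chapter's birth-weight example.

```python
import math

def estimated_standard_error(s, n):
    """Estimated standard error of the mean: s / sqrt(n)."""
    return s / math.sqrt(n)

# Illustrative inputs: s = 720 grams, n = 10 observations.
se = estimated_standard_error(720, 10)
print(f"SE = {se:.1f} grams")
```

Because n sits under a square root, quadrupling the sample size only halves the standard error.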


§11.2 Student’s t distributions

• A family of distributions identified by “Student” (William Sealy Gosset) in 1908

• t family members are identified by their degrees of freedom, df.

• t distributions are similar to z distributions but with broader tails

• As df increases → t tails get skinnier → t becomes more like z
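The tail behavior described above can be checked numerically. The sketch below is an illustration only: it integrates the t density with Simpson's rule using the Python standard library (a statistics package would normally do this), and the helper names `t_pdf` and `t_upper_tail`, the cutoff 1.96, and the chosen df values are all assumptions made for the example.

```python
import math
from statistics import NormalDist

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_coef = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    coef = math.exp(log_coef) / math.sqrt(df * math.pi)
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def t_upper_tail(t, df, steps=2000):
    """Pr(T_df > t) for t >= 0, via Simpson's rule on [0, t] (steps must be even)."""
    h = t / steps
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

cutoff = 1.96  # illustrative cutoff; the z upper tail beyond it is about 0.025
z_tail = 1 - NormalDist().cdf(cutoff)
t_tails = {df: t_upper_tail(cutoff, df) for df in (2, 9, 30, 100)}

for df, tail in t_tails.items():
    print(f"df = {df:>3}: Pr(T > {cutoff}) = {tail:.4f}")
print(f"z       : Pr(Z > {cutoff}) = {z_tail:.4f}")
```

The upper-tail area shrinks toward the Normal value as df grows, which is the "tails get skinnier" statement in numbers.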


t table (Table C)

Table C: Entries are t values; rows are df; columns are probabilities.

Use Table C to look up t values and probabilities


Understanding Table C

Let $t_{df,p}$ ≡ a t value with df degrees of freedom and cumulative probability p. For example, $t_{9,.90}$ = 1.383.

Table C. Traditional t table

Cumulative p   0.75   0.80   0.85   0.90   0.95   0.975
Upper-tail p   0.25   0.20   0.15   0.10   0.05   0.025
df = 9         0.703  0.883  1.100  1.383  1.833  2.262

Left tail: Pr(T9 < −1.383) = 0.10

Right tail: Pr(T9 > 1.383) = 0.10
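The table lookup, including the symmetry that gives left-tail values, can be sketched as a small dictionary lookup. The names `TABLE_C_DF9` and `t_value_df9` are hypothetical, and only the df = 9 row shown above is encoded.

```python
# Row of Table C for df = 9, keyed by cumulative probability (values from the table).
TABLE_C_DF9 = {0.75: 0.703, 0.80: 0.883, 0.85: 1.100,
               0.90: 1.383, 0.95: 1.833, 0.975: 2.262}

def t_value_df9(cumulative_p):
    """Return t_{9,p}. Lower-tail probabilities use symmetry: t_{9,p} = -t_{9,1-p}."""
    p = round(cumulative_p, 3)
    if p in TABLE_C_DF9:
        return TABLE_C_DF9[p]
    mirrored = round(1 - p, 3)
    if mirrored in TABLE_C_DF9:
        return -TABLE_C_DF9[mirrored]
    raise KeyError(f"cumulative probability {cumulative_p} is not tabulated")

print(t_value_df9(0.90))  # right-tail landmark: Pr(T9 > 1.383) = 0.10
print(t_value_df9(0.10))  # left-tail landmark:  Pr(T9 < -1.383) = 0.10
```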


§11.3 One-Sample t Test

A. Hypotheses. H0: µ = µ0 vs. Ha: µ ≠ µ0 (two-sided) [Ha: µ < µ0 (left-sided) or Ha: µ > µ0 (right-sided)]

B. Test statistic. $t_{\text{stat}} = \dfrac{\bar{x} - \mu_0}{s/\sqrt{n}}$ with $df = n - 1$

C. P-value. Convert tstat to a P-value [Table C or software]. Small P → strong evidence against H0

D. Significance level (optional). See Ch 9 for guidelines.


One-Sample t Test: Example

Statement of the problem:

• Do SIDS babies have lower than average birth weights?

• We know from prior research that the mean birth weight of the non-SIDS babies in this population is 3300 grams

• We study n = 10 SIDS babies, determine their birth weights, and calculate x-bar = 2890.5 grams and s = 720 grams.

• Do these data provide significant evidence that SIDS babies have different birth weights than the rest of the population?


One-Sample t Test: Example

A. H0: µ = 3300 versus Ha: µ ≠ 3300 (two-sided)

B. Test statistic: $t_{\text{stat}} = \dfrac{\bar{x} - \mu_0}{SE_{\bar{x}}} = \dfrac{2890.5 - 3300}{720/\sqrt{10}} = -1.80$ with $df = n - 1 = 10 - 1 = 9$

C. P = 0.1054 [next slide]. Weak evidence against H0

D. (optional) Data are not significant at α = .10
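The test statistic for this example can be reproduced in a few lines. This is a sketch under the stated summary values (x-bar = 2890.5, µ0 = 3300, s = 720, n = 10); the function name `one_sample_t` is hypothetical.

```python
import math

def one_sample_t(xbar, mu0, s, n):
    """Return (t_stat, df) for a one-sample t test from summary statistics."""
    se = s / math.sqrt(n)          # estimated standard error of the mean
    return (xbar - mu0) / se, n - 1

t_stat, df = one_sample_t(xbar=2890.5, mu0=3300, s=720, n=10)
print(f"t_stat = {t_stat:.2f}, df = {df}")
```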


Converting the tstat to a P-value

tstat → P-value via Table C. Wedge |tstat| between critical value landmarks on Table C. Here |tstat| = 1.80 falls between 1.383 (upper-tail p = 0.10) and 1.833 (upper-tail p = 0.05), so one-tailed 0.05 < P < 0.10 and two-tailed 0.10 < P < 0.20.

tstat → P-value via software. Use a software utility to determine that a t of −1.80 with 9 df has a two-tailed P-value of 0.1054.

Table C. Traditional t table

Cumulative p   0.75   0.80   0.85   0.90   0.95   0.975
Upper-tail p   0.25   0.20   0.15   0.10   0.05   0.025
df = 9         0.703  0.883  1.100  1.383  1.833  2.262
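The software route can be illustrated without a statistics package by integrating the t density numerically. This is only a sketch: the helpers `t_pdf` and `t_upper_tail` are hypothetical names, Simpson's rule is an assumed stand-in for whatever a real utility does internally, and the inputs are the rounded t of −1.80 with 9 df quoted above.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_coef = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    coef = math.exp(log_coef) / math.sqrt(df * math.pi)
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def t_upper_tail(t, df, steps=2000):
    """Pr(T_df > t) for t >= 0, via Simpson's rule on [0, t] (steps must be even)."""
    h = t / steps
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

t_stat, df = -1.80, 9
one_sided = t_upper_tail(abs(t_stat), df)   # area beyond |t_stat| in one tail
two_sided = 2 * one_sided                   # both tails, by symmetry
print(f"one-sided P = {one_sided:.4f}, two-sided P = {two_sided:.4f}")
```

The one-sided result should land inside the 0.05 to 0.10 bracket obtained from Table C, and doubling it gives the two-sided P-value.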


Two-sided P-value associated with a t statistic of -1.80 and 9 df


§11.4 Confidence Interval for µ

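$(1-\alpha)100\%$ CI for µ: $\bar{x} \pm t_{n-1,\,1-\alpha/2} \cdot \dfrac{s}{\sqrt{n}}$

• Typical “point estimate ± margin of error” formula

• $t_{n-1,\,1-\alpha/2}$ is from the t table (see bottom row for conf. level)

• Similar to z procedure except uses s instead of σ

• Similar to z procedure except uses t instead of z

• Alternative formula: $\bar{x} \pm t_{n-1,\,1-\alpha/2} \cdot SE_{\bar{x}}$ where $SE_{\bar{x}} = \dfrac{s}{\sqrt{n}}$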


Confidence Interval: Example 1

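Let us calculate a 95% confidence interval for μ for the birth weight of SIDS babies.

$\bar{x} = 2890.5,\quad s = 720,\quad n = 10$

95% CI for µ $= \bar{x} \pm t_{10-1,\,1-.05/2} \cdot \dfrac{s}{\sqrt{n}} = 2890.5 \pm 2.262 \cdot \dfrac{720}{\sqrt{10}} = 2890.5 \pm 515.1$ = (2375.4 to 3405.6) grams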
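The same interval can be computed from the summary statistics. This is a sketch: `t_confidence_interval` is a hypothetical helper name, and the critical value 2.262 is typed in from Table C (df = 9, cumulative p = 0.975) instead of being computed.

```python
import math

def t_confidence_interval(xbar, s, n, t_crit):
    """Return (lower, upper) for xbar ± t_crit * s / sqrt(n)."""
    margin = t_crit * s / math.sqrt(n)
    return xbar - margin, xbar + margin

# SIDS birth-weight summary values; t_crit = t_{9,.975} from Table C.
lower, upper = t_confidence_interval(xbar=2890.5, s=720, n=10, t_crit=2.262)
print(f"95% CI for mu: {lower:.1f} to {upper:.1f} grams")
```

With the three-decimal table value the limits agree with the slide to within about 0.1 gram; the small gap comes from rounding of the critical value.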


Confidence Interval: Example 2

Data are “% of ideal body weight” in 18 diabetics: {107, 119, 99, 114, 120, 104, 88, 114, 124, 116, 101, 121, 152, 100, 125, 114, 95, 117}. Based on these data we calculate a 95% CI for μ.

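$n = 18,\quad \bar{x} = 112.778,\quad s = 14.424$

$SE_{\bar{x}} = \dfrac{s}{\sqrt{n}} = \dfrac{14.424}{\sqrt{18}} = 3.400$

$t_{n-1,\,1-\alpha/2} = t_{18-1,\,1-.05/2} = t_{17,.975} = 2.110$ (from table)

95% CI for µ $= \bar{x} \pm (t_{n-1,\,1-\alpha/2})(SE_{\bar{x}}) = 112.778 \pm (2.110)(3.400) = 112.778 \pm 7.17$ = (105.6, 120.0)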
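Starting from the raw data instead of summary statistics, the calculation can be sketched as follows. The variable names are illustrative, and the critical value 2.110 is taken from the t table (df = 17) as stated in the example.

```python
import math
import statistics

# "% of ideal body weight" in 18 diabetics (data from the example).
weights = [107, 119, 99, 114, 120, 104, 88, 114, 124,
           116, 101, 121, 152, 100, 125, 114, 95, 117]

n = len(weights)
xbar = statistics.mean(weights)
s = statistics.stdev(weights)        # sample standard deviation (n - 1 denominator)
se = s / math.sqrt(n)                # estimated standard error of the mean

t_crit = 2.110                       # t_{17,.975} from the t table
lower, upper = xbar - t_crit * se, xbar + t_crit * se

print(f"n = {n}, xbar = {xbar:.3f}, s = {s:.3f}, SE = {se:.3f}")
print(f"95% CI for mu: ({lower:.1f}, {upper:.1f})")
```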


§11.5 Paired Samples

• Paired samples: Each point in one sample is matched to a unique point in the other sample

• Pairs can be achieved via sequential samples within individuals (e.g., pre-test/post-test), cross-over trials, and matching procedures

• Also called “matched-pairs” and “dependent samples”


Example: Paired Samples

• A study addresses whether oat bran reduces LDL cholesterol with a cross-over design.

• Subjects “cross over” from a cornflake diet to an oat bran diet.

– Half of the subjects start on CORNFLK, half on OATBRAN

– Two weeks on diet 1

– Measure LDL cholesterol

– Washout period

– Switch diet

– Two weeks on diet 2

– Measure LDL cholesterol


Example, Data

Subject CORNFLK OATBRAN
------- ------- -------
   1      4.61    3.84
   2      6.42    5.57
   3      5.40    5.85
   4      4.54    4.80
   5      3.98    3.68
   6      3.82    2.96
   7      5.01    4.41
   8      4.34    3.72
   9      3.80    3.49
  10      4.56    3.84
  11      5.35    5.26
  12      3.89    3.73


Calculate Difference Variable “DELTA”

• Step 1 is to create difference variable “DELTA”

• Let DELTA = CORNFLK − OATBRAN

• Order of subtraction does not materially affect results (but does change the sign of the differences)

• Positive values represent lower LDL on oat bran

• Here are the first three observations:

ID   CORNFLK OATBRAN DELTA
---- ------- ------- -----
 1     4.61    3.84   0.77
 2     6.42    5.57   0.85
 3     5.40    5.85  -0.45
 ↓      ↓       ↓      ↓


Explore DELTA Values

Stemplot

−0 | 42
+0 | 0133
+0 | 667788
×1

Here are all twelve paired differences (DELTAs): 0.77, 0.85, −0.45, −0.26, 0.30, 0.86, 0.60, 0.62, 0.31, 0.72, 0.09, 0.16

EDA shows a slight negative skew, a median of about 0.45, with results varying from −0.4 to 0.8.


Descriptive stats for DELTA

Data (DELTAs): 0.77, 0.85, −0.45, −0.26, 0.30, 0.86, 0.60, 0.62, 0.31, 0.72, 0.09, 0.16

$n = 12,\quad \bar{x}_d = 0.3808,\quad s_d = 0.4335$

The subscript d will be used to denote statistics for difference variable DELTA
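The difference variable and its descriptive statistics can be reproduced from the two diet columns. This is a sketch with illustrative variable names; differences are rounded to two decimals only to avoid floating-point noise in the printed values.

```python
import statistics

# LDL cholesterol for the 12 subjects on each diet (data from the example).
cornflk = [4.61, 6.42, 5.40, 4.54, 3.98, 3.82, 5.01, 4.34, 3.80, 4.56, 5.35, 3.89]
oatbran = [3.84, 5.57, 5.85, 4.80, 3.68, 2.96, 4.41, 3.72, 3.49, 3.84, 5.26, 3.73]

# DELTA = CORNFLK - OATBRAN, one value per subject.
delta = [round(c - o, 2) for c, o in zip(cornflk, oatbran)]

n = len(delta)
mean_d = statistics.mean(delta)
sd_d = statistics.stdev(delta)       # sample standard deviation of the differences
median_d = statistics.median(delta)

print(delta)
print(f"n = {n}, mean = {mean_d:.4f}, sd = {sd_d:.4f}, median = {median_d:.3f}")
```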


95% Confidence Interval for µd

A t procedure directed toward the DELTA variable calculates the confidence interval for the mean difference:

$(1-\alpha)100\%$ CI for $\mu_d$ $= \bar{x}_d \pm t_{n-1,\,1-\alpha/2} \cdot \dfrac{s_d}{\sqrt{n}}$

“Oat bran” data:

For 95% confidence use $t_{12-1,\,1-.05/2} = t_{11,.975} = 2.201$ (from Table C)

95% CI for $\mu_d$ $= 0.3808 \pm 2.201 \cdot \dfrac{0.4335}{\sqrt{12}} = 0.3808 \pm 0.2754$ = (0.105 to 0.656)


Paired t Test

• Similar to one-sample t test

• μ0 is usually set to 0, representing “no mean difference”, i.e., H0: μd = 0

• Test statistic: $t_{\text{stat}} = \dfrac{\bar{x}_d - \mu_0}{s_d/\sqrt{n}}$ with $df = n - 1$


Paired t Test: Example

“Oat bran” data

A. Hypotheses. H0: µd = 0 vs. Ha: µd ≠ 0

B. Test statistic. $t_{\text{stat}} = \dfrac{\bar{x}_d - \mu_0}{s_d/\sqrt{n}} = \dfrac{0.38083 - 0}{0.4335/\sqrt{12}} = 3.043$ with $df = n - 1 = 12 - 1 = 11$

C. P-value. P = 0.011 (via computer). The evidence against H0 is statistically significant.

D. Significance level (optional). The evidence against H0 is significant at α = .05 but is not significant at α = .01
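The whole paired test can be run end to end from the raw columns. This is a sketch: `paired_t_test`, `t_pdf`, and `t_upper_tail` are hypothetical helper names, and the P-value comes from a Simpson's-rule integration of the t density, which is an assumed stand-in for the statistical software used in the slides.

```python
import math
import statistics

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_coef = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    coef = math.exp(log_coef) / math.sqrt(df * math.pi)
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def t_upper_tail(t, df, steps=2000):
    """Pr(T_df > t) for t >= 0, via Simpson's rule on [0, t] (steps must be even)."""
    h = t / steps
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

def paired_t_test(first, second, mu0=0.0):
    """Two-sided paired t test on the differences first - second."""
    deltas = [a - b for a, b in zip(first, second)]
    n = len(deltas)
    se = statistics.stdev(deltas) / math.sqrt(n)
    t_stat = (statistics.mean(deltas) - mu0) / se
    df = n - 1
    return t_stat, df, 2 * t_upper_tail(abs(t_stat), df)

cornflk = [4.61, 6.42, 5.40, 4.54, 3.98, 3.82, 5.01, 4.34, 3.80, 4.56, 5.35, 3.89]
oatbran = [3.84, 5.57, 5.85, 4.80, 3.68, 2.96, 4.41, 3.72, 3.49, 3.84, 5.26, 3.73]

t_stat, df, p_value = paired_t_test(cornflk, oatbran)
print(f"t_stat = {t_stat:.3f}, df = {df}, two-sided P = {p_value:.3f}")
```

A P-value between .01 and .05 matches step D: significant at α = .05 but not at α = .01.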


SPSS Output: Oat Bran data


§11.6 Conditions for Inference

t procedures require these conditions:

• SRS (individual observations or DELTAs)

• Valid information (no information bias)

• Normal population or large sample (central limit theorem)


The Normality Condition

The Normality condition applies to the sampling distribution of the mean, not the population. Therefore, it is OK to use t procedures when:

• The population is Normal

• The population is not Normal but is symmetrical and n is at least 5 to 10

• The population is skewed and n is at least 30 to 100 (depending on the extent of the skew)


Can t procedures be used?

• Dataset A is skewed and small: avoid t procedures

• Dataset B has a mild skew and is moderate in size: use t procedures

• Dataset C is highly skewed and is small: avoid t procedures


§11.7 Sample Size and Power

• Questions:

– How big a sample is needed to limit the margin of error to m?

– How big a sample is needed to test H0 with 1−β power at significance level α?

– What is the power of a test given certain conditions?

• In this presentation, we cover only the last question


Power

$1 - \beta = \Phi\left(-z_{1-\alpha/2} + \dfrac{|\Delta|\sqrt{n}}{\sigma}\right)$

where:

• α ≡ (two-sided) alpha level of the test

• Δ ≡ “the mean difference worth detecting” (i.e., the mean under the alternative hypothesis minus the mean under the null hypothesis)

• n ≡ sample size

• σ ≡ standard deviation in the population

• Φ(z) ≡ the cumulative probability of z on a Standard Normal distribution [Table B]


Power: Illustrative Example

SIDS birth weight example. Consider the SIDS illustration in which n = 10 and σ is assumed to be 720 gms. Let α = 0.05 (two-sided). What is the power of a test under these conditions to detect a mean difference of 300 gms?

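$1 - \beta = \Phi\left(-z_{1-\alpha/2} + \dfrac{|\Delta|\sqrt{n}}{\sigma}\right) = \Phi\left(-1.96 + \dfrac{|300|\sqrt{10}}{720}\right) = \Phi(-0.64) = 0.2611$

The power is about 26%.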
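The power calculation can be sketched with the standard library's Normal distribution. The function name `power_one_sample` is hypothetical; it implements the Normal-approximation formula Φ(−z + |Δ|√n/σ) used in this example and takes σ as known.

```python
import math
from statistics import NormalDist

def power_one_sample(delta, n, sigma, alpha=0.05):
    """Approximate power: Phi(-z_{1-alpha/2} + |delta| * sqrt(n) / sigma)."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)   # about 1.96 for alpha = 0.05
    return std_normal.cdf(-z_crit + abs(delta) * math.sqrt(n) / sigma)

# SIDS illustration: detect a 300-gram difference with n = 10 and sigma = 720.
power = power_one_sample(delta=300, n=10, sigma=720, alpha=0.05)
print(f"power = {power:.2f}")
```

Using the unrounded z argument gives a value within a fraction of a percentage point of the hand calculation, which rounds the argument to −0.64 before consulting Table B. Raising n in the call shows how quickly power improves with sample size.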