Chapter 9: Inferences for Two –Samples Yunming Mu Department of Statistics Texas A&M University.
![Page 1: Chapter 9: Inferences for Two –Samples Yunming Mu Department of Statistics Texas A&M University.](https://reader036.fdocuments.in/reader036/viewer/2022062304/56649d215503460f949f6729/html5/thumbnails/1.jpg)
Chapter 9: Inferences for Two –Samples
Yunming Mu
Department of Statistics
Texas A&M University
![Page 2: Chapter 9: Inferences for Two –Samples Yunming Mu Department of Statistics Texas A&M University.](https://reader036.fdocuments.in/reader036/viewer/2022062304/56649d215503460f949f6729/html5/thumbnails/2.jpg)
Outline
1 Overview
2 Inferences about Two Means: Independent and Small Samples
3 Inferences about Two Means: Independent and Large Samples
4 Inferences about Two Proportions
5 Inferences about Two Means: Matched Pairs
![Page 3: Chapter 9: Inferences for Two –Samples Yunming Mu Department of Statistics Texas A&M University.](https://reader036.fdocuments.in/reader036/viewer/2022062304/56649d215503460f949f6729/html5/thumbnails/3.jpg)
Overview
There are many important and meaningful situations in which it becomes necessary to compare two sets of sample data.
![Page 4: Chapter 9: Inferences for Two –Samples Yunming Mu Department of Statistics Texas A&M University.](https://reader036.fdocuments.in/reader036/viewer/2022062304/56649d215503460f949f6729/html5/thumbnails/4.jpg)
Definitions
Two Samples: Independent
The sample values selected from one population are not related to or somehow paired with the sample values selected from the other population.
If the values in one sample are related to the values in the other sample, the samples are dependent. Such samples are often referred to as matched pairs or paired samples.
![Page 5: Chapter 9: Inferences for Two –Samples Yunming Mu Department of Statistics Texas A&M University.](https://reader036.fdocuments.in/reader036/viewer/2022062304/56649d215503460f949f6729/html5/thumbnails/5.jpg)
Example
Population of all female college students
Sample of n2 = 21 females report average of 85.7 mph
Population of all male college students
Sample of n1 = 17 males report average of 102.1 mph
Do male and female college students differ with respect to their fastest reported driving speed?
![Page 6: Chapter 9: Inferences for Two –Samples Yunming Mu Department of Statistics Texas A&M University.](https://reader036.fdocuments.in/reader036/viewer/2022062304/56649d215503460f949f6729/html5/thumbnails/6.jpg)
Graphical summary of sample data
[Figure: fastest driving speed (mph), axis from 75 to 145, plotted separately by gender (female, male)]
![Page 7: Chapter 9: Inferences for Two –Samples Yunming Mu Department of Statistics Texas A&M University.](https://reader036.fdocuments.in/reader036/viewer/2022062304/56649d215503460f949f6729/html5/thumbnails/7.jpg)
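Numerical summary of sample data

| Gender | N | Mean | Median | TrMean | StDev |
| --- | --- | --- | --- | --- | --- |
| female | 21 | 85.71 | 85.00 | 85.26 | 9.39 |
| male | 17 | 102.06 | 100.00 | 101.00 | 17.05 |

| Gender | SE Mean | Minimum | Maximum | Q1 | Q3 |
| --- | --- | --- | --- | --- | --- |
| female | 2.05 | 75.00 | 105.00 | 77.50 | 92.50 |
| male | 4.14 | 75.00 | 145.00 | 90.00 | 115.00 |
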
The difference in the sample means is 102.06 - 85.71 = 16.35 mph
![Page 8: Chapter 9: Inferences for Two –Samples Yunming Mu Department of Statistics Texas A&M University.](https://reader036.fdocuments.in/reader036/viewer/2022062304/56649d215503460f949f6729/html5/thumbnails/8.jpg)
The Question in Statistical Notation
Let µM = the average fastest speed of all male students, and µF = the average fastest speed of all female students.
Then we want to know whether µM ≠ µF.
This is equivalent to knowing whether µM - µF ≠ 0.
![Page 9: Chapter 9: Inferences for Two –Samples Yunming Mu Department of Statistics Texas A&M University.](https://reader036.fdocuments.in/reader036/viewer/2022062304/56649d215503460f949f6729/html5/thumbnails/9.jpg)
All possible questions in statistical notation
In general, we can always compare two averages by seeing how their difference compares to 0:

| This comparison… | is equivalent to … |
| --- | --- |
| µ1 ≠ µ2 | µ1 - µ2 ≠ 0 |
| µ1 > µ2 | µ1 - µ2 > 0 |
| µ1 < µ2 | µ1 - µ2 < 0 |

![Page 10: Chapter 9: Inferences for Two –Samples Yunming Mu Department of Statistics Texas A&M University.](https://reader036.fdocuments.in/reader036/viewer/2022062304/56649d215503460f949f6729/html5/thumbnails/10.jpg)
Set up hypotheses
• Null hypothesis:
  – H0: µM = µF [equivalent to µM - µF = 0]
• Alternative hypothesis:
  – Ha: µM ≠ µF [equivalent to µM - µF ≠ 0]
![Page 11: Chapter 9: Inferences for Two –Samples Yunming Mu Department of Statistics Texas A&M University.](https://reader036.fdocuments.in/reader036/viewer/2022062304/56649d215503460f949f6729/html5/thumbnails/11.jpg)
Inferences about Two Means: Independent and Small Samples
![Page 12: Chapter 9: Inferences for Two –Samples Yunming Mu Department of Statistics Texas A&M University.](https://reader036.fdocuments.in/reader036/viewer/2022062304/56649d215503460f949f6729/html5/thumbnails/12.jpg)
Pooled Two-Sample T Test and T Interval
Assumptions:
1. The two samples are independent.
2. Both samples come from normal populations and the two sample sizes are small, n1 < 30 and n2 < 30.
3. Both variances are unknown but equal. Assume the variances are equal only if neither sample standard deviation is more than twice the other.
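The rule of thumb in assumption 3 can be written as a quick check. This is a minimal sketch in Python; the function name `sds_look_equal` is hypothetical, and the inputs are sample standard deviations taken from the examples in this chapter.

```python
def sds_look_equal(s1: float, s2: float) -> bool:
    """Rule of thumb from assumption 3: treat the variances as equal only if
    neither sample standard deviation is more than twice the other."""
    larger, smaller = max(s1, s2), min(s1, s2)
    return larger <= 2 * smaller


# Sample standard deviations from the chapter's examples.
speed_ok = sds_look_equal(9.39, 17.05)   # female vs male fastest speed
mpg_ok = sds_look_equal(2.1, 2.0)        # leaded vs unleaded mpg
brick_ok = sds_look_equal(0.10, 0.24)    # type 1 vs type 2 brick density
print(speed_ok, mpg_ok, brick_ok)
```

Under this rule the driving-speed and mpg examples qualify for the pooled procedure, while the brick example (0.24 is more than twice 0.10) does not.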
![Page 13: Chapter 9: Inferences for Two –Samples Yunming Mu Department of Statistics Texas A&M University.](https://reader036.fdocuments.in/reader036/viewer/2022062304/56649d215503460f949f6729/html5/thumbnails/13.jpg)
Confidence Intervals: Normal Samples w/ Unknown Equal Variance

$$(\bar{x}_1 - \bar{x}_2) - E < (\mu_1 - \mu_2) < (\bar{x}_1 - \bar{x}_2) + E$$

where

$$E = t_{\alpha/2,\,n_1+n_2-2}\sqrt{S_p^2\left(\frac{1}{n_1}+\frac{1}{n_2}\right)}, \qquad S_p^2 = \frac{(n_1-1)S_1^2 + (n_2-1)S_2^2}{n_1+n_2-2}$$
![Page 14: Chapter 9: Inferences for Two –Samples Yunming Mu Department of Statistics Texas A&M University.](https://reader036.fdocuments.in/reader036/viewer/2022062304/56649d215503460f949f6729/html5/thumbnails/14.jpg)
Leaded vs Unleaded
Each of the cars selected for the EPA study was tested and the number of miles per gallon for each was obtained and recorded (Leaded=1 and Unleaded=2).

| | Leaded (1) | Unleaded (2) |
| --- | --- | --- |
| n | 11 | 10 |
| x̄ | 17.2 | 19.9 |
| S | 2.1 | 2.0 |

![Page 15: Chapter 9: Inferences for Two –Samples Yunming Mu Department of Statistics Texas A&M University.](https://reader036.fdocuments.in/reader036/viewer/2022062304/56649d215503460f949f6729/html5/thumbnails/15.jpg)
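95% Confidence Interval

$$\alpha = 0.05, \qquad t_{\alpha/2,\,n_1+n_2-2} = t_{0.025,\,11+10-2} = 2.093$$

$$S_p^2 = \frac{(11-1)\,2.1^2 + (10-1)\,2.0^2}{11+10-2} = 4.216$$

$$17.2 - 19.9 \pm 2.093\sqrt{4.216\left(\frac{1}{11}+\frac{1}{10}\right)}$$

(-4.58, -0.82)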
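The interval above can be reproduced from the summary statistics alone. The following is a sketch, not a definitive implementation: the function names are hypothetical, and the critical value 2.093 is passed in as read from a t table rather than computed.

```python
import math


def pooled_variance(s1, n1, s2, n2):
    """S_p^2 = ((n1 - 1) S1^2 + (n2 - 1) S2^2) / (n1 + n2 - 2)."""
    return ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)


def pooled_ci(xbar1, s1, n1, xbar2, s2, n2, t_crit):
    """Confidence interval for mu1 - mu2 under the equal-variance assumption.

    t_crit is t_{alpha/2, n1+n2-2}, looked up in a t table by the caller.
    """
    sp2 = pooled_variance(s1, n1, s2, n2)
    margin = t_crit * math.sqrt(sp2 * (1 / n1 + 1 / n2))  # E in the formula
    diff = xbar1 - xbar2
    return diff - margin, diff + margin


# Leaded (1) vs unleaded (2) summary statistics from the example.
sp2 = pooled_variance(2.1, 11, 2.0, 10)
low, high = pooled_ci(17.2, 2.1, 11, 19.9, 2.0, 10, t_crit=2.093)
print(f"Sp^2 = {sp2:.3f}, 95% CI = ({low:.2f}, {high:.2f})")
```

Because the whole interval lies below 0, it is consistent with leaded cars having the lower mean mpg.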
![Page 16: Chapter 9: Inferences for Two –Samples Yunming Mu Department of Statistics Texas A&M University.](https://reader036.fdocuments.in/reader036/viewer/2022062304/56649d215503460f949f6729/html5/thumbnails/16.jpg)
Pooled Two-Sample T Tests: Normal Samples w/ Unknown Variance

$$t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{S_p^2\left(1/n_1 + 1/n_2\right)}}$$

P-value: Use the t distribution with n1 + n2 - 2 degrees of freedom and find the P-value by following the same procedure for t tests summarized in Ch 8.
Critical values: Based on the significance level α, use $t_{\alpha,\,n_1+n_2-2}$ for upper-tail tests, $-t_{\alpha,\,n_1+n_2-2}$ for lower-tail tests, and $\pm t_{\alpha/2,\,n_1+n_2-2}$ for two-tailed tests.
![Page 17: Chapter 9: Inferences for Two –Samples Yunming Mu Department of Statistics Texas A&M University.](https://reader036.fdocuments.in/reader036/viewer/2022062304/56649d215503460f949f6729/html5/thumbnails/17.jpg)
Leaded vs Unleaded
Claim: µ1 < µ2
H0: µ1 = µ2
H1: µ1 < µ2
α = 0.05
Critical value: $-t_{0.05,\,19} = -1.729$
[Figure: t distribution with "Reject H0" to the left of -1.729 and "Fail to reject H0" to the right]
![Page 18: Chapter 9: Inferences for Two –Samples Yunming Mu Department of Statistics Texas A&M University.](https://reader036.fdocuments.in/reader036/viewer/2022062304/56649d215503460f949f6729/html5/thumbnails/18.jpg)
Leaded vs Unleaded
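Pooled Two-Sample T Test
Claim: µ1 < µ2
H0: µ1 = µ2, H1: µ1 < µ2, α = 0.05

$$t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{S_p^2\left(1/n_1 + 1/n_2\right)}} = \frac{17.2 - 19.9 - 0}{\sqrt{4.216\left(1/11 + 1/10\right)}} = -3.01$$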
![Page 19: Chapter 9: Inferences for Two –Samples Yunming Mu Department of Statistics Texas A&M University.](https://reader036.fdocuments.in/reader036/viewer/2022062304/56649d215503460f949f6729/html5/thumbnails/19.jpg)
Leaded vs Unleaded
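Claim: µ1 < µ2
H0: µ1 = µ2
H1: µ1 < µ2
α = 0.05
[Figure: t distribution with "Reject H0" to the left of -1.729 and "Fail to reject H0" to the right; the sample value falls in the rejection region]
Sample data: t = -3.01
Reject the null hypothesis.
There is significant evidence to support the claim that leaded cars have a lower mean mpg than unleaded cars.
P-value = 0.0077 (= area of red region)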
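The test statistic and the critical-value decision for this lower-tail test can be checked with a few lines of Python. This is a sketch under the pooled (equal-variance) assumption; `pooled_t_statistic` is a hypothetical helper, and the critical value 1.729 is taken from the t table as on the slide.

```python
import math


def pooled_t_statistic(xbar1, s1, n1, xbar2, s2, n2, delta0=0.0):
    """t = ((xbar1 - xbar2) - delta0) / sqrt(Sp^2 (1/n1 + 1/n2)).

    delta0 is the hypothesized value of mu1 - mu2 under H0 (0 here).
    """
    sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)
    return (xbar1 - xbar2 - delta0) / math.sqrt(sp2 * (1 / n1 + 1 / n2))


# Leaded (1) vs unleaded (2); H1: mu1 < mu2, so this is a lower-tail test.
t_stat = pooled_t_statistic(17.2, 2.1, 11, 19.9, 2.0, 10)
t_crit = 1.729                    # t_{0.05, 19}, read from a t table
reject_h0 = t_stat < -t_crit      # reject when t falls below -t_crit
print(f"t = {t_stat:.2f}, reject H0: {reject_h0}")
```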
![Page 20: Chapter 9: Inferences for Two –Samples Yunming Mu Department of Statistics Texas A&M University.](https://reader036.fdocuments.in/reader036/viewer/2022062304/56649d215503460f949f6729/html5/thumbnails/20.jpg)
Two-Sample T Test and T Interval
Assumptions:
1. The two samples are independent.
2. Both samples come from normal populations and the two sample sizes are small, n1 < 30 and n2 < 30.
3. Both variances are unknown and unequal.
![Page 21: Chapter 9: Inferences for Two –Samples Yunming Mu Department of Statistics Texas A&M University.](https://reader036.fdocuments.in/reader036/viewer/2022062304/56649d215503460f949f6729/html5/thumbnails/21.jpg)
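Confidence Intervals: Normal Samples w/ Unknown Unequal Variance

$$(\bar{x}_1 - \bar{x}_2) - E < (\mu_1 - \mu_2) < (\bar{x}_1 - \bar{x}_2) + E$$

where

$$E = t_{\alpha/2,\,v}\sqrt{\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}}$$

$$v = \frac{\left[(se_1)^2 + (se_2)^2\right]^2}{\dfrac{(se_1)^4}{n_1 - 1} + \dfrac{(se_2)^4}{n_2 - 1}}, \qquad se_1 = \frac{S_1}{\sqrt{n_1}}, \qquad se_2 = \frac{S_2}{\sqrt{n_2}}$$

(round v down to the nearest integer)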
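The degrees-of-freedom formula is easy to get wrong by hand, so a small sketch may help. The function names are hypothetical; the data are the female and male driving-speed summaries from earlier in the chapter, and the critical value 2.069 for 23 degrees of freedom is an assumption read from a standard t table, not a value given on the slides.

```python
import math


def unpooled_df(s1, n1, s2, n2):
    """Degrees of freedom v for the unequal-variance procedure, rounded down."""
    se1_sq = s1**2 / n1   # (se_1)^2
    se2_sq = s2**2 / n2   # (se_2)^2
    v = (se1_sq + se2_sq) ** 2 / (se1_sq**2 / (n1 - 1) + se2_sq**2 / (n2 - 1))
    return math.floor(v)


def unpooled_ci(xbar1, s1, n1, xbar2, s2, n2, t_crit):
    """Confidence interval for mu1 - mu2; t_crit is t_{alpha/2, v} from a table."""
    margin = t_crit * math.sqrt(s1**2 / n1 + s2**2 / n2)  # E in the formula
    diff = xbar1 - xbar2
    return diff - margin, diff + margin


# Female (1) vs male (2) fastest driving speed.
v = unpooled_df(9.39, 21, 17.05, 17)
low, high = unpooled_ci(85.71, 9.39, 21, 102.06, 17.05, 17, t_crit=2.069)
print(f"v = {v}, 95% CI = ({low:.1f}, {high:.1f})")
```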
![Page 22: Chapter 9: Inferences for Two –Samples Yunming Mu Department of Statistics Texas A&M University.](https://reader036.fdocuments.in/reader036/viewer/2022062304/56649d215503460f949f6729/html5/thumbnails/22.jpg)
Unpooled Two-Sample T-Test: Normal Samples w/ Unknown Variance

$$t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{S_1^2/n_1 + S_2^2/n_2}}$$

P-value: Use the t distribution with v degrees of freedom and find the P-value by following the same procedure for t tests summarized in Ch 8.
Critical values: Based on the significance level α, use $t_{\alpha,\,v}$ for upper-tail tests, $-t_{\alpha,\,v}$ for lower-tail tests, and $\pm t_{\alpha/2,\,v}$ for two-tailed tests.
![Page 23: Chapter 9: Inferences for Two –Samples Yunming Mu Department of Statistics Texas A&M University.](https://reader036.fdocuments.in/reader036/viewer/2022062304/56649d215503460f949f6729/html5/thumbnails/23.jpg)
Example
We compare the density of two different types of brick. Assuming normality of the two density distributions and unequal unknown variances, test whether there is a difference in the mean densities of the two types of brick.

| | Type 1 brick | Type 2 brick |
| --- | --- | --- |
| n | 6 | 5 |
| x̄ | 22.73 | 21.95 |
| S | 0.10 | 0.24 |

![Page 24: Chapter 9: Inferences for Two –Samples Yunming Mu Department of Statistics Texas A&M University.](https://reader036.fdocuments.in/reader036/viewer/2022062304/56649d215503460f949f6729/html5/thumbnails/24.jpg)
Unpooled Two-Sample T-Test
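H0: µ1 = µ2, H1: µ1 ≠ µ2, α = 0.05

$$t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{S_1^2/n_1 + S_2^2/n_2}} = \frac{22.73 - 21.95 - 0}{\sqrt{0.1^2/6 + 0.24^2/5}} = 6.792$$

$$v = 5, \qquad |t_{0.025,\,5}| = 2.571$$

P-value = 0.001. Since |t| = 6.792 exceeds 2.571, reject the null hypothesis and conclude that there is a significant difference in the mean densities of the two types of brick.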
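The brick calculation can be sketched in Python from the summary statistics. The helper name `unpooled_t_and_df` is hypothetical; it applies the unpooled test statistic and the degrees-of-freedom formula (rounded down) given earlier in this section.

```python
import math


def unpooled_t_and_df(xbar1, s1, n1, xbar2, s2, n2, delta0=0.0):
    """Return (t, v) for the unequal-variance two-sample t test.

    delta0 is the hypothesized value of mu1 - mu2 under H0 (0 here).
    """
    se1_sq = s1**2 / n1
    se2_sq = s2**2 / n2
    t = (xbar1 - xbar2 - delta0) / math.sqrt(se1_sq + se2_sq)
    v = (se1_sq + se2_sq) ** 2 / (se1_sq**2 / (n1 - 1) + se2_sq**2 / (n2 - 1))
    return t, math.floor(v)   # v is rounded down to the nearest integer


# Type 1 brick vs type 2 brick densities.
t_stat, v = unpooled_t_and_df(22.73, 0.10, 6, 21.95, 0.24, 5)
print(f"t = {t_stat:.3f}, v = {v}")
```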
![Page 25: Chapter 9: Inferences for Two –Samples Yunming Mu Department of Statistics Texas A&M University.](https://reader036.fdocuments.in/reader036/viewer/2022062304/56649d215503460f949f6729/html5/thumbnails/25.jpg)
Two-sample t-test in Minitab
• Select Stat. Select Basic Statistics.
• Select 2-sample t to get a Pop-Up window.
• Click on the radio button before Samples in one Column. Put the measurement variable in the Samples box, and put the grouping variable in the Subscripts box.
• Specify your alternative hypothesis.
• If appropriate, select Assume Equal Variances.
• Select OK.
![Page 26: Chapter 9: Inferences for Two –Samples Yunming Mu Department of Statistics Texas A&M University.](https://reader036.fdocuments.in/reader036/viewer/2022062304/56649d215503460f949f6729/html5/thumbnails/26.jpg)
Pooled two-sample t-test
Two sample T for Fastest

| Gender | N | Mean | StDev | SE Mean |
| --- | --- | --- | --- | --- |
| female | 21 | 85.71 | 9.39 | 2.0 |
| male | 17 | 102.1 | 17.1 | 4.1 |

95% CI for mu (female) - mu (male): (-25.2, -7.5)
T-Test mu (female) = mu (male) (vs not =): T = -3.75 P = 0.0006 DF = 36
Both use Pooled StDev = 13.4
![Page 27: Chapter 9: Inferences for Two –Samples Yunming Mu Department of Statistics Texas A&M University.](https://reader036.fdocuments.in/reader036/viewer/2022062304/56649d215503460f949f6729/html5/thumbnails/27.jpg)
(Unpooled) two-sample t-test
Two sample T for Fastest

| Gender | N | Mean | StDev | SE Mean |
| --- | --- | --- | --- | --- |
| female | 21 | 85.71 | 9.39 | 2.0 |
| male | 17 | 102.1 | 17.1 | 4.1 |

95% CI for mu (female) - mu (male): (-25.9, -6.8)
T-Test mu (female) = mu (male) (vs not =): T = -3.54 P = 0.0017 DF = 23
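For readers without Minitab, both tests can be approximated from the summary statistics with SciPy. This is a sketch that assumes SciPy is installed; it uses `scipy.stats.ttest_ind_from_stats`, where `equal_var=True` gives the pooled test and `equal_var=False` the unpooled one. SciPy does not round the unpooled degrees of freedom down to an integer, so its P-value can differ slightly from the Minitab output.

```python
from scipy import stats

# Summary statistics for fastest driving speed (mph) from the numerical summary.
female = {"mean": 85.71, "sd": 9.39, "n": 21}
male = {"mean": 102.06, "sd": 17.05, "n": 17}

results = {}
for label, equal_var in [("pooled", True), ("unpooled", False)]:
    # Two-sided test of H0: mu(female) = mu(male).
    results[label] = stats.ttest_ind_from_stats(
        female["mean"], female["sd"], female["n"],
        male["mean"], male["sd"], male["n"],
        equal_var=equal_var,
    )
    res = results[label]
    print(f"{label}: T = {res.statistic:.2f}, P = {res.pvalue:.4f}")
```

Both versions reject the null hypothesis at the 0.05 level, in agreement with the Minitab results above.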