Stats chapter 13
richard-ferreria -
Transcript of Stats chapter 13
Chapter 13
Comparing Two Population Parameters
13.1 COMPARING TWO MEANS
Two-Sample Problems
The goal of this type of inference:
• compare the responses of two treatments
-or-
• compare the characteristics of two populations
• Separate samples from each population
• Responses of each group are independent of those in the other group
Before We Begin
• This is another set of PHANTOMS procedures
• It is important to note that “two populations” means that there is no overlap in the samples
• The sample sizes do not need to be equal
Hypotheses
There are two styles of writing hypotheses
Style 1
H0: μ1 = μ2
Ha: μ1 ≠ μ2, or
Ha: μ1 > μ2, or
Ha: μ1 < μ2
Style 2
H0: μ1 - μ2 = 0
Ha: μ1 - μ2 ≠ 0, or
Ha: μ1 - μ2 > 0 (this implies μ1 > μ2), or
Ha: μ1 - μ2 < 0 (this implies μ1 < μ2)
This style is more versatile since it allows you to use a difference other than zero
Assumptions
Simple Random Sample
Each sample must be from an SRS
Independence
Samples may not influence each other
No paired data!
N1 > 10n1 and N2 > 10n2 (if sampling w/o replacement)
Normality (of sampling distribution)
large samples (n1 > 30 and n2 > 30)
this is the Central Limit Theorem
-OR-
medium samples (15 < n1 < 30 and 15 < n2 < 30)
- Histogram symmetric or slight skew and single peak
- Normal probability plots for both samples are linear
- No outliers
-OR-
small samples (n1 < 15 and n2 < 15)
- Histogram symmetric and single peak
- Normal probability plots for both samples are linear
- No outliers
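2-sample test statistic
z-tests
$$ z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} $$
t-tests
$$ t_{df} = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} $$
df = smaller of n1 - 1 or n2 - 1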
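The t statistic above can be sketched in a few lines of Python. This is a minimal illustration, not a replacement for the calculator routine: the function name `two_sample_t` and the summary statistics passed to it are hypothetical, chosen only to exercise the formula with the conservative df used in these notes.

```python
import math

def two_sample_t(mean1, sd1, n1, mean2, sd2, n2, null_diff=0.0):
    """Return (t, df) for the unpooled two-sample t statistic.

    df is the conservative choice from these notes:
    the smaller of n1 - 1 and n2 - 1.
    """
    standard_error = math.sqrt(sd1**2 / n1 + sd2**2 / n2)
    t = ((mean1 - mean2) - null_diff) / standard_error
    df = min(n1 - 1, n2 - 1)
    return t, df

# Hypothetical summary statistics (not from the chapter).
t_stat, df = two_sample_t(mean1=12.0, sd1=3.0, n1=9,
                          mean2=10.0, sd2=4.0, n2=16)
print(f"t = {t_stat:.3f}, df = {df}")
```

Here the standard error is sqrt(9/9 + 16/16) = sqrt(2), so t = 2 / sqrt(2), about 1.414, with df = 8.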
Example 13.2
Researchers designed a randomized comparative experiment to establish the relationship between calcium intake and blood pressure in black men. Group 1 (n1 = 10) took a calcium supplement, Group 2 (n2 = 11) took a placebo. The response is the decrease in systolic blood pressure.
Group 1: 7, -4, 18, 17, -3, -5, 1, 10, 11, -2
Group 2: -1, 12, -1, -3, 3, -5, 5, 2, -11, -1, -3
Example 13.2
Parameter
μ1 - μ2 = difference in average decrease in systolic blood pressure in healthy black men between the calcium regimen and the placebo regimen
xbar1 - xbar2 = difference in average decrease in systolic blood pressure in the two samples between the calcium regimen and the placebo regimen
Example 13.2
Hypotheses
H0: μ1 - μ2 = 0
Ha: μ1 - μ2 > 0
Example 13.2
Assumptions
Simple Random Sample
We are told that both samples come from a randomized design
Independence
Both samples are independent, and N1 > 10(n1) = 10(10) = 100, N2 > 10(n2) = 10(11) = 110: the population of black men is greater than 110
Example 13.2
Assumptions (cont)
[Figure: plots of Sample 1 and Sample 2]
Normality
Both samples are single-peaked with moderate skewness and approximately normal with no outliers. Although sample 1 shows some skewness, the t-procedures are robust enough to handle this skew.
Example 13.2
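Name of Test
We will conduct a 2-sample t-test for population means
Test Statistic
n1 = 10, xbar1 = 5, s1 = 8.7433
n2 = 11, xbar2 = -0.2727, s2 = 5.9007
df = 9
$$ t_{df} = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} $$
$$ t_9 = \frac{(5 - (-0.2727)) - 0}{\sqrt{\dfrac{8.7433^2}{10} + \dfrac{5.9007^2}{11}}} $$
$$ t_9 = 1.604 $$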
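The summary statistics and test statistic for Example 13.2 can be reproduced from the raw data. The sketch below uses only the Python standard library; the variable names are my own, and the data are the two groups listed in the example.

```python
import math
import statistics

calcium = [7, -4, 18, 17, -3, -5, 1, 10, 11, -2]        # Group 1
placebo = [-1, 12, -1, -3, 3, -5, 5, 2, -11, -1, -3]    # Group 2

n1, n2 = len(calcium), len(placebo)
xbar1, xbar2 = statistics.mean(calcium), statistics.mean(placebo)
s1, s2 = statistics.stdev(calcium), statistics.stdev(placebo)  # sample SDs

# Unpooled standard error of the difference in sample means.
standard_error = math.sqrt(s1**2 / n1 + s2**2 / n2)
t_stat = ((xbar1 - xbar2) - 0) / standard_error   # null difference is 0
df = min(n1 - 1, n2 - 1)                          # conservative df

print(f"xbar1 = {xbar1:.4f}, s1 = {s1:.4f}")
print(f"xbar2 = {xbar2:.4f}, s2 = {s2:.4f}")
print(f"t = {t_stat:.3f}, df = {df}")
```

Note that the Group 2 mean is negative (about -0.2727), which is why the observed difference is 5.2727 rather than 4.7273.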
Example 13.2
P Value
P-value = P(t9 > 1.604)
P-value = 0.0716
Decision
Fail to reject H0 at the 5% significance level
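The calculator reports this tail area directly. As a rough cross-check that needs only the Python standard library, the sketch below integrates the t density numerically with Simpson's rule. The helper names `t_pdf` and `t_upper_tail` are hypothetical, not part of any library, and a statistics package would normally be used instead.

```python
import math

def t_pdf(x, df):
    """Density of the t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def t_upper_tail(t, df, steps=2000):
    """Approximate P(T > t) with Simpson's rule on [0, |t|]; steps must be even."""
    h = abs(t) / steps
    total = t_pdf(0.0, df) + t_pdf(abs(t), df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3            # area between 0 and |t|
    return 0.5 - area if t >= 0 else 0.5 + area

p_value = t_upper_tail(1.604, df=9)
print(f"P(t9 > 1.604) = {p_value:.4f}")
```

The result agrees with the 0.0716 quoted above to about four decimal places, and it lies above 0.05, which matches the decision.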
Example 13.2
Summary
If H0 were true, approximately 7% of the time samples of size 10 and 11 would produce a difference at least as large as 5.2727. Since this p-value is not less than the presumed α = 0.05, we fail to reject H0.
We do not have enough evidence to conclude that calcium intake reduces the average blood pressure in healthy black men.
Confidence Intervals
Confidence interval for a difference of two means
$$ C = (\bar{x}_1 - \bar{x}_2) \pm t^* \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} $$
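As an illustration of the interval formula, the sketch below applies it to the Example 13.2 summary statistics. The 95% confidence level is my assumption (the chapter does not compute an interval for this example), and t* = 2.262 is the standard table value for df = 9 at that level.

```python
import math

# Summary statistics from Example 13.2.
n1, xbar1, s1 = 10, 5.0, 8.7433
n2, xbar2, s2 = 11, -0.2727, 5.9007

t_star = 2.262  # assumed: 95% confidence, conservative df = 9 (t table)

standard_error = math.sqrt(s1**2 / n1 + s2**2 / n2)
margin = t_star * standard_error
lower = (xbar1 - xbar2) - margin
upper = (xbar1 - xbar2) + margin

print(f"95% CI for mu1 - mu2: ({lower:.3f}, {upper:.3f})")
```

The interval runs from roughly -2.16 to 12.71. It includes 0, which is in line with the decision not to reject H0.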
Robustness
2-sample t-procedures are more robust than one-sample procedures. They can be used for sample sizes as small as n1 = n2 = 5 when the samples have similar shapes.
Guidelines for using t-procedures
• n1 + n2 < 15: data must be approx normal, no outliers
• n1 + n2 > 15: data can have slight skew, no outliers
• n1 + n2 > 30: data can have skew
Degrees of Freedom
• We have been using the smaller of n1 - 1 or n2 - 1 to determine the df
• This ensures that the true p-value is no larger than the p-value we calculate, and that the true confidence interval is no wider than the one we calculate.
• These are “worst case scenario” calculations
• There is a more exact df formula on p792
• Your calculator also uses a df formula for two samples
• You do not need to memorize these other formulas!
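For reference, the usual "more exact" choice is the Welch-Satterthwaite approximation. I am assuming this is the formula the textbook page and the calculator refer to; the slides do not state it. The sketch below evaluates it for the Example 13.2 summary statistics, and the function name `welch_df` is hypothetical.

```python
def welch_df(s1, n1, s2, n2):
    """Welch-Satterthwaite approximate degrees of freedom (unpooled variances)."""
    v1 = s1**2 / n1   # estimated variance of xbar1
    v2 = s2**2 / n2   # estimated variance of xbar2
    return (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))

exact_df = welch_df(8.7433, 10, 5.9007, 11)
conservative_df = min(10 - 1, 11 - 1)

print(f"Welch df = {exact_df:.2f}, conservative df = {conservative_df}")
```

For these data the approximation gives about 15.59 degrees of freedom, compared with the conservative value of 9, so the conservative p-value is the larger of the two.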
Calculators
• The tests we are using are located in the [STAT] -> “TESTS” menu
• 2-SampZTest = two sample z-test for means
• 2-SampTTest = two sample t-test for means
• 2-SampZInt = two sample z Confidence Interval for difference of means
• 2-SampTInt = two sample t Confidence Interval for difference of means
• Freq1 and Freq2 should be set to “1”
• Pooled should be set to “NO”
13.2 COMPARING TWO PROPORTIONS
2-Sample Inference for Proportions
• We are testing to see if
– Two populations have the same proportion
OR
– A treatment affects the proportion
• Remember: this is not a procedure for paired data (matched pair design/pre- and post-test)
Combined Proportion
• Under H0, the test assumes that the two sample proportions come from populations with the same proportion.
• The test makes use of the “combined proportion” as below:
$$ \hat{p}_c = \frac{\text{combined successes}}{\text{combined individuals}} = \frac{X_1 + X_2}{n_1 + n_2} $$
Hypotheses
There are two styles of writing hypotheses
Style 1
H0: p1 = p2
Ha: p1 ≠ p2, or
Ha: p1 > p2, or
Ha: p1 < p2
Style 2
H0: p1 - p2 = 0
Ha: p1 - p2 ≠ 0, or
Ha: p1 - p2 > 0 (this implies p1 > p2), or
Ha: p1 - p2 < 0 (this implies p1 < p2)
This style is more versatile since it allows you to use a difference other than zero
Assumptions
• Simple Random Sample
Both samples must be viewed as an SRS from their respective population or two groups from a randomized experiment
• Independence
N1 > 10n1 and N2 > 10n2
• Normality
n1(pchat) > 5, n1(qchat) > 5 and n2(pchat) > 5, n2(qchat) > 5
Test Statistic
• The test statistic for proportions is always from the Normal distribution
$$ z = \frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{\hat{p}_c \hat{q}_c \left( \dfrac{1}{n_1} + \dfrac{1}{n_2} \right)}} $$
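The pooled z statistic can be sketched as a small Python function. The name `two_prop_z` and the counts used to call it are hypothetical, chosen so the arithmetic is easy to follow by hand; the null difference is taken to be 0, which is the case where pooling applies.

```python
import math

def two_prop_z(x1, n1, x2, n2):
    """Pooled two-proportion z statistic for H0: p1 - p2 = 0."""
    p1_hat = x1 / n1
    p2_hat = x2 / n2
    pc_hat = (x1 + x2) / (n1 + n2)      # combined proportion
    qc_hat = 1 - pc_hat
    standard_error = math.sqrt(pc_hat * qc_hat * (1 / n1 + 1 / n2))
    return (p1_hat - p2_hat) / standard_error

# Hypothetical counts: 30 of 50 successes versus 20 of 50 successes.
z = two_prop_z(30, 50, 20, 50)
print(f"z = {z:.3f}")
```

With these counts the combined proportion is 0.5, the standard error is sqrt(0.25 * 0.04) = 0.1, and z = 0.2 / 0.1 = 2.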
Example 13.9
A study was conducted to find the effects of preschool programs on poor children. Group 1 (n = 61) had no preschool and group 2 (n = 62) had similar backgrounds and attended preschool. The study measured the need for social services when the children became adults. After investigation it was found that p1hat = 49/61 and p2hat = 38/62.
Do the data support the claim that preschool reduced the social services claimed?
Example 13.9
Parameters
• p1 = proportion of adults who did not receive preschool and filed for social services
• p2 = proportion of adults who received preschool and filed for social services
• p1hat = proportion of adults in group 1 (no preschool) who filed for social services
• p2hat = proportion of adults in group 2 (preschool) who filed for social services
Example 13.9
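Hypotheses
• H0: p1 - p2 = 0
• Ha: p1 - p2 > 0
• The proportion filing for social services is greater in the non-preschool group than in the preschool group
Combined proportion
$$ \hat{p}_c = \frac{49 + 38}{61 + 62} = \frac{87}{123} = 0.7073 $$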
Example 13.9
Assumptions
Simple Random Sample
Since the measurements are from a randomized experiment, we can assume that they are from an SRS
Independence
N1 > 10(61) = 610: more than 610 do not attend preschool
N2 > 10(62) = 620: more than 620 attend preschool
Normality
61(.70) = 42.7 > 5, 61(.30) = 18.3 > 5
62(.70) = 43.4 > 5, 62(.30) = 18.6 > 5
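The Normality check can also be run without rounding the combined proportion to .70. This is a small sketch with my own variable names; it uses the counts from Example 13.9 and the "greater than 5" rule stated in the assumptions.

```python
x1, n1 = 49, 61   # group 1: no preschool
x2, n2 = 38, 62   # group 2: preschool

pc_hat = (x1 + x2) / (n1 + n2)   # combined proportion, about 0.7073
qc_hat = 1 - pc_hat

# Expected successes and failures in each group under H0.
expected_counts = {
    "n1 * pc_hat": n1 * pc_hat,
    "n1 * qc_hat": n1 * qc_hat,
    "n2 * pc_hat": n2 * pc_hat,
    "n2 * qc_hat": n2 * qc_hat,
}

for label, count in expected_counts.items():
    print(f"{label} = {count:.1f}")

normality_ok = all(count > 5 for count in expected_counts.values())
print("Normality condition met:", normality_ok)
```

The unrounded counts differ slightly from the slide's values (for example 43.1 instead of 42.7), but all four are still well above 5.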
Example 13.9
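Name of Test
2-Sample Z-test for proportions
Test Statistic
$$ z = \frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{\hat{p}_c \hat{q}_c \left( \dfrac{1}{n_1} + \dfrac{1}{n_2} \right)}} $$
$$ z = \frac{(0.803 - 0.613) - 0}{\sqrt{(0.7073)(0.2927) \left( \dfrac{1}{61} + \dfrac{1}{62} \right)}} $$
$$ z = 2.316 $$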
Example 13.9
P-value
P-value = P(z > 2.316) = 0.0103
Make Decision
Reject H0
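The Example 13.9 calculation can be reproduced with the Python standard library, using `statistics.NormalDist` for the Normal tail area. The variable names are my own. Because this sketch carries full precision instead of rounding the sample proportions to 0.803 and 0.613, it gives z of about 2.320 and a p-value of about 0.0102, slightly different from the 2.316 and 0.0103 above; the conclusion is unchanged.

```python
import math
from statistics import NormalDist

x1, n1 = 49, 61   # group 1: no preschool
x2, n2 = 38, 62   # group 2: preschool

p1_hat, p2_hat = x1 / n1, x2 / n2
pc_hat = (x1 + x2) / (n1 + n2)       # combined proportion
qc_hat = 1 - pc_hat

standard_error = math.sqrt(pc_hat * qc_hat * (1 / n1 + 1 / n2))
z = ((p1_hat - p2_hat) - 0) / standard_error   # null difference is 0
p_value = 1 - NormalDist().cdf(z)              # one-sided: Ha is p1 - p2 > 0

print(f"pc_hat = {pc_hat:.4f}")
print(f"z = {z:.3f}, p-value = {p_value:.4f}")
```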
Example 13.9
Summary
If H0 were true, approximately 1% of the time two samples of size 61 and 62 would produce a difference of at least 0.190. Since our p-value is less than an α of 0.05, we reject H0.
Our evidence supports the claim that enrollment in preschool reduces the proportion of adults who file social services claims.
Confidence Intervals
The confidence interval for the difference between the proportions of two samples is given as:
$$ CI = (\hat{p}_1 - \hat{p}_2) \pm z^* \sqrt{\frac{\hat{p}_1 \hat{q}_1}{n_1} + \frac{\hat{p}_2 \hat{q}_2}{n_2}} $$
• Notice that the Confidence Interval does not use pchat and qchat.
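To illustrate the interval formula, the sketch below applies it to the Example 13.9 counts. The 95% confidence level is my assumption (the chapter does not compute an interval for this example), and z* is taken from `statistics.NormalDist().inv_cdf`. Note that the standard error uses each sample's own proportion, not the combined one.

```python
import math
from statistics import NormalDist

x1, n1 = 49, 61   # group 1: no preschool
x2, n2 = 38, 62   # group 2: preschool

p1_hat, p2_hat = x1 / n1, x2 / n2
q1_hat, q2_hat = 1 - p1_hat, 1 - p2_hat

confidence = 0.95                                   # assumed level
z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

# Unpooled standard error: each sample keeps its own proportion.
standard_error = math.sqrt(p1_hat * q1_hat / n1 + p2_hat * q2_hat / n2)
margin = z_star * standard_error
lower = (p1_hat - p2_hat) - margin
upper = (p1_hat - p2_hat) + margin

print(f"z* = {z_star:.3f}")
print(f"95% CI for p1 - p2: ({lower:.4f}, {upper:.4f})")
```

The interval is roughly 0.033 to 0.347. It lies entirely above 0, which agrees with rejecting H0 in the test.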
Confidence Intervals
Assumptions
• Simple Random Sample
Both samples must be viewed as an SRS from their respective population or two groups from a randomized experiment
• Independence
N1 > 10n1 and N2 > 10n2
• Normality
n1(p1hat) > 5, n1(q1hat) > 5 and n2(p2hat) > 5, n2(q2hat) > 5
(again, not pchat or qchat)
Calculators
The tests we are using are located in the [STAT] -> “TESTS” menu
• 2-PropZTest = 2 proportion z-test
• 2-PropZInt = 2 proportion confidence interval