Chapter6.pptx
CHAPTER 6
Hypothesis Testing
Hypothesis Testing
• A statistical hypothesis is a claim about a population characteristic (and on occasion more than one).
• An example of a hypothesis is the claim that the population mean equals some specified value $\mu_0$, i.e. $\mu = \mu_0$.
Hypotheses and Test Procedures
• Null Hypothesis: H0
The claim that is initially assumed to be true
• Alternative Hypothesis: H1 or Ha
The complementary assertion to H0
The new statement that we wish to test
Hypothesis Test Example
We own a paint company. The old paint takes 60 minutes to dry. We want to see if a new paint will dry faster.
H0: μ = 60
H1: μ < 60
A test procedure is created under the assumption of H0, and then it is determined how likely that assumption is compared to its complement Ha.
The decision will be based on
Test Statistic and Rejection Region
or
p-value and Significance Level
Test Procedures
Test Statistic - a function of the sample data on which the decision (reject or do not reject H0) is made
Rejection Region - set of all test statistic values for which H0 will be rejected
The basis for choosing a particular rejection region lies in an understanding of the errors that can be made.
Hypotheses Test Errors
• Type I Error: rejecting a true H0
• Type II Error: failing to reject a false H0

            REJECT H0       FAIL TO REJECT H0
True H0     TYPE I ERROR    CORRECT
False H0    CORRECT         TYPE II ERROR
Since we wish to control the type I error, we set $\alpha = P(\text{type I error}) = P(\text{reject } H_0 \mid H_0 \text{ is true})$.
The default value of $\alpha$ (the significance level) is usually taken to be 0.05.
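The meaning of the significance level can be checked by simulation. The sketch below repeatedly draws samples for which H0 is true and counts how often a lower-tailed z-test rejects anyway. The values `sigma=5.0` and `n=30` are hypothetical choices made only for illustration; `mu0=60` is borrowed from the paint company example.

```python
import random
from statistics import NormalDist, fmean

def type_one_error_rate(mu0=60.0, sigma=5.0, n=30, alpha=0.05,
                        trials=20000, seed=1):
    """Estimate how often a lower-tailed z-test rejects H0 when H0 is true."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(alpha)   # lower-tail critical value -z_alpha
    rejections = 0
    for _ in range(trials):
        # Data generated with true mean mu0, so H0 holds by construction.
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        z = (fmean(sample) - mu0) / (sigma / n ** 0.5)
        if z < z_crit:                     # test statistic in the rejection region
            rejections += 1
    return rejections / trials

rate = type_one_error_rate()
print(f"estimated P(type I error) = {rate:.3f}")
```

The estimated rejection rate should land close to the chosen `alpha`, which is exactly what controlling the type I error means.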
Motivating the test procedure
Example: The drying time of a certain type of paint, under fixed environmental conditions, is known to be normally distributed with mean 75 min. and standard deviation 9 min. Chemists have added a new additive that is believed to decrease drying time, have obtained a sample of 35 drying times, and wish to test their assertion at significance level $\alpha$.
Solution: Here we are interested in testing the following hypotheses (let $\mu$ be the mean drying time),
$H_0: \mu = 75$ vs $H_a: \mu < 75$
An obvious candidate for a test statistic is $\bar{X}$, which is normally distributed. Thus, under $H_0$,
$\bar{X} \sim N(75, 9^2/35)$
or, $\frac{\bar{X} - 75}{9/\sqrt{35}} \sim N(0, 1)$.
If the test value is small enough, i.e. $T.S. = \frac{\bar{x} - 75}{9/\sqrt{35}} < -z_{\alpha}$, then we reject $H_0$.
What is the logic?
We assume that the sample mean $\bar{x}$ is a “good” estimate for μ, and hence under $H_0$ the difference $\bar{x} - 75$ should be close to 0, which implies T.S. should be close to zero. However, if it is not, then it implies that 75 was not a “good” hypothesized value for the true mean.
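Assume that from the 35 samples we obtain T.S. $= \frac{\bar{x} - 75}{9/\sqrt{35}} = -2.76$;
thus, taking the default $\alpha = 0.05$, $-2.76 < -z_{0.05} = -1.645$.
So, we reject the null hypothesis at the 0.05 significance level.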
We can also reach a conclusion using the p-value!
P-value
The p-value of a hypothesis test is the probability of observing the specific value of the test statistic, T.S., or a more extreme value, under the null hypothesis.
The direction of the extreme values is indicated by the alternative hypothesis.
Computing p-value for our example
In this example, values more extreme than -2.76 are those less than -2.76, as the alternative, $H_a: \mu < 75$, indicates values less than 75. Thus,
p-value $= P(Z \le -2.76) = 0.0029$
which indicates that p-value $< \alpha$,
so we reject the null hypothesis!
The null hypothesis is rejected in favor of the alternative hypothesis, as the probability of observing the test statistic value of -2.76 or a more extreme one (as indicated by Ha) is smaller than the probability of the type I error ($\alpha$) we are willing to tolerate.
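The paint example can be reproduced with a few lines of Python. This is a minimal sketch under two assumptions that are not stated in the slides: the sample mean `xbar = 70.8` is chosen only because it reproduces the test statistic of -2.76, and `alpha = 0.05` is the default significance level.

```python
from math import sqrt
from statistics import NormalDist

mu0, sigma, n = 75.0, 9.0, 35   # null mean, known sd, sample size (from the example)
alpha = 0.05                    # assumed: default significance level
xbar = 70.8                     # assumed sample mean, chosen to reproduce T.S. = -2.76

z = (xbar - mu0) / (sigma / sqrt(n))    # test statistic
z_crit = NormalDist().inv_cdf(alpha)    # -z_alpha for a lower-tailed test
p_value = NormalDist().cdf(z)           # P(Z <= z): extreme values lie to the left

print(f"T.S. = {z:.2f}, critical value = {z_crit:.3f}, p-value = {p_value:.4f}")
print("reject H0" if p_value < alpha else "fail to reject H0")
```

Both routes give the same decision: the test statistic falls in the rejection region exactly when the p-value is below `alpha`.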
Large sample test for population mean (section 6.1)
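Let $X_1, \dots, X_n$ be a random sample with mean $\mu$ and variance $\sigma^2$ ($n > 30$), so that $\bar{X}$ is approximately normally distributed. To test,
I. $H_0: \mu \le \mu_0$ vs $H_a: \mu > \mu_0$
II. $H_0: \mu \ge \mu_0$ vs $H_a: \mu < \mu_0$
III. $H_0: \mu = \mu_0$ vs $H_a: \mu \ne \mu_0$
at the significance level $\alpha$, first compute the test statistic,
$T.S. = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$
Making decision
Reject the null if,
(i) $T.S. > z_{\alpha}$, or p-value $= P(Z > T.S.) < \alpha$
(ii) $T.S. < -z_{\alpha}$, or p-value $= P(Z < T.S.) < \alpha$
(iii) $|T.S.| > z_{\alpha/2}$, or p-value $= 2P(Z > |T.S.|) < \alpha$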
Remark 6.1. If $\sigma$ is unknown and $s$ is used instead, one should use Student's t distribution and the relevant t-table instead of the z-table, but since the sample size is large the two distributions are nearly identical.
Example: A scale is to be calibrated by weighing a 1000 g test weight 60 times. The 60 scale readings have mean 1000.6 g and standard deviation 2 g. Find the P-value for testing $H_0: \mu = 1000$ versus $H_a: \mu \ne 1000$.
Solution: Assuming $H_0$ is true, from C.L.T. we can say,
$\bar{X} \sim N(1000, \sigma^2/60)$.
We can approximate $\sigma$ with $s = 2$ because the sample size is large. Thus,
$T.S. = \frac{1000.6 - 1000}{2/\sqrt{60}} = 2.32$
p-value $= 2P(Z > 2.32) = 0.0204$
The p-value is small, so we have strong evidence to reject the null hypothesis.
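A small helper makes the three possible alternatives explicit. The function name `z_test` and its `alternative` argument are illustrative choices, not part of the course material, and the call assumes the scale example is tested two-sided ($H_0: \mu = 1000$ vs $H_a: \mu \ne 1000$).

```python
from math import sqrt
from statistics import NormalDist

def z_test(xbar, mu0, sd, n, alternative):
    """Large-sample z-test; alternative is 'greater', 'less' or 'two-sided'."""
    z = (xbar - mu0) / (sd / sqrt(n))
    cdf = NormalDist().cdf
    if alternative == "greater":
        p = 1 - cdf(z)                 # P(Z > z)
    elif alternative == "less":
        p = cdf(z)                     # P(Z < z)
    elif alternative == "two-sided":
        p = 2 * (1 - cdf(abs(z)))      # 2 P(Z > |z|)
    else:
        raise ValueError("unknown alternative")
    return z, p

# Scale calibration: 60 readings, mean 1000.6 g, sd 2 g (s used in place of sigma).
z, p = z_test(xbar=1000.6, mu0=1000, sd=2, n=60, alternative="two-sided")
print(f"z = {z:.2f}, two-sided p-value = {p:.4f}")
```

The printed p-value differs slightly in the last digits from a table lookup, because the table rounds z to two decimals first.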
If a significance level $\alpha$ is given in the problem, then compare the p-value with $\alpha$ and reject $H_0$ whenever the p-value is less than $\alpha$.
Making a decision solely based on the p-value, i.e. when a significance level is not given: the smaller the p-value, the stronger the evidence against $H_0$, on a scale running from
No evidence
Weak evidence
Strong evidence
Very strong evidence
Example: In the previous example, perform a hypothesis test for $H_0: \mu \ge 1000$ versus $H_a: \mu < 1000$ at significance level $\alpha$.
Solution:
p-value $= P(Z < 2.32) = 0.9898$
So, we do not have any evidence to reject the null hypothesis.
Tests for population proportion (section 6.3)
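• Let $X$ be the number of successes in $n$ i.i.d. Bernoulli trials with probability of success $p$, then $X \sim Bin(n, p)$.
• By C.L.T. we know that under certain conditions ($np > 5$, $n(1-p) > 5$), approximately $\hat{p} = X/n \sim N(p, p(1-p)/n)$.
To test,
I. $H_0: p \le p_0$ vs $H_a: p > p_0$
II. $H_0: p \ge p_0$ vs $H_a: p < p_0$
III. $H_0: p = p_0$ vs $H_a: p \ne p_0$
we must assume that, under the null hypothesis $p = p_0$, the number of successes and failures is greater than 5, i.e. $np_0 > 5$ and $n(1-p_0) > 5$, such that under $H_0$ and using C.L.T. we can say,
$\hat{p} \sim N(p_0, p_0(1-p_0)/n)$
The test statistic is
$T.S. = \frac{\hat{p} - p_0}{\sqrt{p_0(1-p_0)/n}}$
and the r.v. corresponding to the test statistic has a standard normal distribution under the null hypothesis assumption. Reject the null if
(i) $T.S. > z_{\alpha}$, or p-value $= P(Z > T.S.) < \alpha$
(ii) $T.S. < -z_{\alpha}$, or p-value $= P(Z < T.S.) < \alpha$
(iii) $|T.S.| > z_{\alpha/2}$, or p-value $= 2P(Z > |T.S.|) < \alpha$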
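Example: For a sample of 1225 baselines, 926 gave results that were within the class C spirit leveling tolerance limits. Can we conclude that this method produces results within the tolerance limits more than 75% of the time?
Solution: First, we should write the hypotheses,
$H_0: p \le 0.75$ vs $H_a: p > 0.75$
Second, we should check the normality conditions under the null hypothesis,
$np_0 = 1225(0.75) = 918.75 > 5$ and $n(1-p_0) = 1225(0.25) = 306.25 > 5$
So, we have normality under the assumption of $H_0$. Thus,
$\hat{p} \sim N(0.75, 0.75(0.25)/1225)$
The observed sample proportion is,
$\hat{p} = 926/1225 = 0.7559$
the test statistic is,
$T.S. = \frac{0.7559 - 0.75}{\sqrt{0.75(0.25)/1225}} = 0.48$
and p-value is,
$P(Z > 0.48) = 0.3156$
So, we do not have any evidence to reject $H_0$; we cannot conclude that the method produces results within the tolerance limits more than 75% of the time.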
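The baseline example can be checked numerically. The sketch below assumes the upper-tailed alternative $H_a: p > 0.75$ implied by the question "more than 75% of the time"; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n, x, p0 = 1225, 926, 0.75      # sample size, successes, null proportion

# Normality check under H0: expected successes and failures both exceed 5.
assert n * p0 > 5 and n * (1 - p0) > 5

p_hat = x / n                                   # observed sample proportion
z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)      # test statistic under H0
p_value = 1 - NormalDist().cdf(z)               # upper tail, since Ha is p > p0

print(f"p_hat = {p_hat:.4f}, T.S. = {z:.2f}, p-value = {p_value:.3f}")
```

Note that the standard error uses `p0`, not `p_hat`, because the distribution of the test statistic is derived under the null hypothesis.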
Small sample test for population mean (section 6.4)
If the sample size is small, i.e. $n \le 30$, then the C.L.T. is not applicable for $\bar{X}$ and therefore we must assume that the individual random variables corresponding to the sample are normal random variables with mean $\mu$ and variance $\sigma^2$. As a result,
$\bar{X} \sim N(\mu, \sigma^2/n)$.
Thus, if $\sigma$ is known then we can proceed exactly as in the case of the large sample test for a population mean.
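What if $\sigma$ is unknown?
If $\sigma$ is unknown, which is usually the case, we replace it by its sample estimate $s$. Consequently, under $H_0$ we have,
$\frac{\bar{X} - \mu_0}{S/\sqrt{n}} \sim t_{n-1}$
and then for the observed value
$T.S. = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$
At the significance level $\alpha$, for the same hypothesis tests as before, we reject $H_0$ if
(i) for $H_a: \mu > \mu_0$: $T.S. > t_{\alpha, n-1}$, or p-value $= P(t_{n-1} > T.S.) < \alpha$
(ii) for $H_a: \mu < \mu_0$: $T.S. < -t_{\alpha, n-1}$, or p-value $= P(t_{n-1} < T.S.) < \alpha$
(iii) for $H_a: \mu \ne \mu_0$: $|T.S.| > t_{\alpha/2, n-1}$, or p-value $= 2P(t_{n-1} > |T.S.|) < \alpha$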
Example: Muzzle velocities of eight shells tested with a new gunpowder yield a sample mean of 2959 feet per second and a standard deviation of 39.4. The manufacturer claims that the new gunpowder produces an average velocity of no less than 3000 feet per second. Does the sample provide enough evidence to contradict the manufacturer’s claim at 0.05 significance level? (assume velocity of the new gunpowder is normally distributed)
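Solution: Let μ be the mean velocity of the new gunpowder. Here, we are interested in testing
H0: μ ≥ 3000
H1: μ < 3000
because we want to see whether there is evidence to refute the manufacturer's claim. The test statistic is,
$T.S. = \frac{2959 - 3000}{39.4/\sqrt{8}} = -2.94$
and the rejection region is
$T.S. < -t_{0.05, 7} = -1.895$
Since $-2.94 < -1.895$, we reject H0. Equivalently, p-value $= P(t_7 < -2.94) \approx 0.011 < 0.05$.
So, we have strong evidence against the null hypothesis, i.e. against the manufacturer's claim.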
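The muzzle velocity test can be sketched with the standard library alone. Python's `statistics` module has no Student's t distribution, so the helper `t_cdf` below integrates the t density numerically with Simpson's rule; it is an illustrative stand-in for a t-table, and a statistics package would normally be used instead.

```python
import math

def t_cdf(t, df, steps=2000):
    """P(T <= t) for Student's t with df degrees of freedom (Simpson's rule)."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    pdf = lambda x: c * (1 + x * x / df) ** (-(df + 1) / 2)
    a = abs(t)
    h = a / steps                      # steps is even, as Simpson's rule requires
    area = pdf(0.0) + pdf(a)
    for i in range(1, steps):
        area += (4 if i % 2 else 2) * pdf(i * h)
    area *= h / 3                      # integral of the density from 0 to |t|
    return 0.5 + area if t >= 0 else 0.5 - area

n, xbar, s, mu0 = 8, 2959.0, 39.4, 3000.0
alpha = 0.05

t_stat = (xbar - mu0) / (s / math.sqrt(n))
t_crit = -1.895                        # -t_{0.05, 7}, read from a t-table
p_value = t_cdf(t_stat, df=n - 1)      # lower tail, since H1 is mu < 3000

print(f"T.S. = {t_stat:.3f}, p-value = {p_value:.4f}")
print("reject H0" if t_stat < t_crit else "fail to reject H0")
```

The rejection-region decision and the p-value decision agree, as they must for the same `alpha`.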
Remark: The values contained within a two-sided $100(1-\alpha)\%$ C.I. are precisely those values for which the p-value of a two-sided hypothesis test will be greater than $\alpha$.
Example: The lifetime of a single-cell organism is believed to be on average 257 hours. A small preliminary study was conducted to test whether the average lifetime was different when the organism was placed in a certain medium. The measurements are assumed to be normally distributed and turned out to be 253, 261, 258, 255, and 256.
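Solution 1: Here we want to test $H_0: \mu = 257$ vs. $H_a: \mu \ne 257$. With $\bar{x} = 256.6$ and $s = 3.05$, the test statistic value is
$T.S. = \frac{256.6 - 257}{3.05/\sqrt{5}} = -0.29$
p-value $= 2P(t_4 > 0.29) \approx 0.78$. Hence, since the p-value is large we fail to reject the null hypothesis and we conclude that the population mean is not statistically different from 257.
Solution 2: Instead of hypothesis testing, if a two-sided 95% confidence interval is constructed by,
$\bar{x} \pm t_{0.025, 4} \frac{s}{\sqrt{n}} = 256.6 \pm 2.776 \frac{3.05}{\sqrt{5}} = (252.81, 260.39)$
it is clear that the null hypothesis value of $\mu = 257$ is a plausible value and consequently we do not reject $H_0$ at the 0.05 significance level.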
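Both solutions can be computed side by side to see that they agree. As a sketch, the helper `t_cdf` integrates the t density numerically (an illustrative substitute for a t-table), and the critical value 2.776 is $t_{0.025, 4}$ taken from a t-table.

```python
import math
from statistics import mean, stdev

def t_cdf(t, df, steps=2000):
    """P(T <= t) for Student's t with df degrees of freedom (Simpson's rule)."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    pdf = lambda x: c * (1 + x * x / df) ** (-(df + 1) / 2)
    a = abs(t)
    h = a / steps
    area = pdf(0.0) + pdf(a)
    for i in range(1, steps):
        area += (4 if i % 2 else 2) * pdf(i * h)
    area *= h / 3
    return 0.5 + area if t >= 0 else 0.5 - area

data = [253, 261, 258, 255, 256]       # lifetimes in hours
mu0 = 257
n = len(data)
xbar, s = mean(data), stdev(data)      # sample mean and sample standard deviation
se = s / math.sqrt(n)

# Solution 1: two-sided t-test
t_stat = (xbar - mu0) / se
p_value = 2 * (1 - t_cdf(abs(t_stat), df=n - 1))

# Solution 2: two-sided 95% confidence interval
t_crit = 2.776                         # t_{0.025, 4} from a t-table
ci = (xbar - t_crit * se, xbar + t_crit * se)

print(f"T.S. = {t_stat:.2f}, p-value = {p_value:.2f}")
print(f"95% CI = ({ci[0]:.2f}, {ci[1]:.2f}), contains mu0: {ci[0] < mu0 < ci[1]}")
```

The interval contains 257 exactly when the two-sided p-value exceeds 0.05, which is the equivalence stated in the remark on confidence intervals.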
Large sample test for difference of two means (section 6.5)
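Let $X_1, \dots, X_{n_X}$ and $Y_1, \dots, Y_{n_Y}$ represent two independent large random samples, with $n_X > 30$ and $n_Y > 30$, with means $\mu_X, \mu_Y$ and variances $\sigma_X^2, \sigma_Y^2$, respectively. By C.L.T. we have,
$\bar{X} - \bar{Y} \sim N\left(\mu_X - \mu_Y, \frac{\sigma_X^2}{n_X} + \frac{\sigma_Y^2}{n_Y}\right)$
How to test the following hypotheses?
I. $H_0: \mu_X - \mu_Y \le \Delta_0$ vs $H_a: \mu_X - \mu_Y > \Delta_0$
II. $H_0: \mu_X - \mu_Y \ge \Delta_0$ vs $H_a: \mu_X - \mu_Y < \Delta_0$
III. $H_0: \mu_X - \mu_Y = \Delta_0$ vs $H_a: \mu_X - \mu_Y \ne \Delta_0$
We assume that the variances are known and the test statistic is
$T.S. = \frac{\bar{x} - \bar{y} - \Delta_0}{\sqrt{\sigma_X^2/n_X + \sigma_Y^2/n_Y}}$
The r.v. corresponding to the test statistic has a standard normal distribution under the null hypothesis $H_0$, that $\mu_X - \mu_Y = \Delta_0$. Reject the null if
(i) $T.S. > z_{\alpha}$, or p-value $= P(Z > T.S.) < \alpha$
(ii) $T.S. < -z_{\alpha}$, or p-value $= P(Z < T.S.) < \alpha$
(iii) $|T.S.| > z_{\alpha/2}$, or p-value $= 2P(Z > |T.S.|) < \alpha$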
Example: Two welding procedures are to be tested on the diameter of inclusions, which are particles embedded in the weld. A sample of 544 inclusions in welds made using method X averaged 0.37 μm in diameter, with a standard deviation of 0.25 μm. A sample of 581 inclusions in welds made using method Y averaged 0.40 μm in diameter, with a standard deviation of 0.26 μm. Can you conclude that the mean diameter for Y exceeds that of X by more than 0.015 μm?
Solution: $H_0: \mu_Y - \mu_X \le 0.015$ vs $H_a: \mu_Y - \mu_X > 0.015$
The test statistic is
$T.S. = \frac{(0.40 - 0.37) - 0.015}{\sqrt{0.25^2/544 + 0.26^2/581}} = 0.99$
This is a one-tailed test with p-value $= P(Z > 0.99) = 0.1611$. We fail to reject the null hypothesis.
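The welding comparison can be verified as follows. The sketch assumes the upper-tailed alternative $H_a: \mu_Y - \mu_X > 0.015$ implied by the question, and uses the sample standard deviations in place of the unknown population values, which is reasonable for samples this large.

```python
from math import sqrt
from statistics import NormalDist

n_x, xbar, s_x = 544, 0.37, 0.25    # method X: size, mean, sd (micrometers)
n_y, ybar, s_y = 581, 0.40, 0.26    # method Y: size, mean, sd (micrometers)
delta0 = 0.015                      # hypothesized difference mu_Y - mu_X

se = sqrt(s_x ** 2 / n_x + s_y ** 2 / n_y)   # standard error of ybar - xbar
z = ((ybar - xbar) - delta0) / se
p_value = 1 - NormalDist().cdf(z)            # upper tail: Ha is mu_Y - mu_X > delta0

print(f"T.S. = {z:.2f}, p-value = {p_value:.4f}")
```

The difference is taken as Y minus X so that it matches the direction of the claim being tested.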
Tests for the difference between two proportions (section 6.6)
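Let $X \sim Bin(n_X, p_X)$ and $Y \sim Bin(n_Y, p_Y)$ represent two independent Binomial random variables resulting from two independent sets of i.i.d. Bernoulli trials. To test,
I. $H_0: p_X - p_Y \le 0$ vs $H_a: p_X - p_Y > 0$
II. $H_0: p_X - p_Y \ge 0$ vs $H_a: p_X - p_Y < 0$
III. $H_0: p_X - p_Y = 0$ vs $H_a: p_X - p_Y \ne 0$
we first need an appropriate test statistic.
We must assume that the number of successes and failures is greater than 10 for both samples.
As the null hypothesis values for $p_X$ and $p_Y$ are not available, we simply check that the sample successes and failures are greater than 10. By virtue of the C.L.T.
$\hat{p}_X - \hat{p}_Y \sim N\left(p_X - p_Y, \frac{p_X(1-p_X)}{n_X} + \frac{p_Y(1-p_Y)}{n_Y}\right)$
and a test statistic would be constructed in the usual way.
However, under $H_0$ it is assumed that $p_X = p_Y$, which implies that the variances of the two Bernoulli trials are equal ($p_X(1-p_X) = p_Y(1-p_Y)$).
Therefore we can replace $p_X$ and $p_Y$ in the variance by the pooled estimate,
$\hat{p} = \frac{x + y}{n_X + n_Y}$
The test statistic is then,
$T.S. = \frac{\hat{p}_X - \hat{p}_Y}{\sqrt{\hat{p}(1-\hat{p})\left(\frac{1}{n_X} + \frac{1}{n_Y}\right)}}$
and the r.v. corresponding to the test statistic has a standard normal distribution under the null hypothesis.
Thus, we reject the null hypothesis whenever,
(i) $T.S. > z_{\alpha}$, or p-value $= P(Z > T.S.) < \alpha$
(ii) $T.S. < -z_{\alpha}$, or p-value $= P(Z < T.S.) < \alpha$
(iii) $|T.S.| > z_{\alpha/2}$, or p-value $= 2P(Z > |T.S.|) < \alpha$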
Example: We want to compare the proportion of defective electric motors turned out by two shifts of workers. From the large number produced in a given week, 250 motors were selected from the output of shift I and 200 motors were selected from the output of shift II. The sample from shift one revealed 25 to be defective and the sample from shift II 30 faulty motors. Is it true to say the difference between the proportions of defective motors produced in two shifts is not equal to zero? Use a 0.05 significance level.
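Solution: Let $p_X$ be the proportion of defective motors produced by workers in shift I and $p_Y$ be the proportion of defective motors produced by workers in shift II.
The goal is testing $H_0: p_X - p_Y = 0$ versus $H_a: p_X - p_Y \ne 0$. Using $n_X = 250$, $x = 25$, $n_Y = 200$, and $y = 30$
we get $\hat{p}_X = 0.10$ and $\hat{p}_Y = 0.15$. Since $n_X\hat{p}_X = 25$, $n_X(1-\hat{p}_X) = 225$, $n_Y\hat{p}_Y = 30$, and $n_Y(1-\hat{p}_Y) = 170$ are all greater than 10, the sample sizes are large enough to use the normal approximation. Also,
$\hat{p} = \frac{25 + 30}{250 + 200} = 0.122$
Thus,
$T.S. = \frac{0.10 - 0.15}{\sqrt{0.122(0.878)\left(\frac{1}{250} + \frac{1}{200}\right)}} = -1.61$
and,
P-value $= 2P(Z < -1.61) = 0.1074$
So, we fail to reject the null hypothesis at the 0.05 significance level. That is, at the 0.05 significance level there is not enough evidence to conclude that the difference between the proportions of defective motors produced in the two shifts is not equal to zero.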
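The motor example can be recomputed from the raw counts. In this sketch the pooled estimate is formed as total defectives over total motors, $(25 + 30)/(250 + 200)$, and the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n_x, x = 250, 25     # shift I: motors sampled, defective
n_y, y = 200, 30     # shift II: motors sampled, defective
alpha = 0.05

# Sample successes and failures must all exceed 10 for the normal approximation.
assert all(count > 10 for count in (x, n_x - x, y, n_y - y))

p_x, p_y = x / n_x, y / n_y
p_pool = (x + y) / (n_x + n_y)                           # pooled estimate under H0
se = sqrt(p_pool * (1 - p_pool) * (1 / n_x + 1 / n_y))
z = (p_x - p_y) / se
p_value = 2 * NormalDist().cdf(-abs(z))                  # two-sided p-value

print(f"pooled = {p_pool:.4f}, T.S. = {z:.2f}, p-value = {p_value:.4f}")
print("reject H0" if p_value < alpha else "fail to reject H0")
```

Failing to reject here means the data do not establish a difference between the shifts; it does not prove that the two proportions are equal.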
Small sample test for the difference between two means (6.7)
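In this case, since the C.L.T. is not applicable we must assume that the two random samples are normally distributed and independent. Let $\Delta_0$ denote the hypothesized value of $\mu_X - \mu_Y$.
1. If the variances are known, the test statistic is,
$T.S. = \frac{\bar{x} - \bar{y} - \Delta_0}{\sqrt{\sigma_X^2/n_X + \sigma_Y^2/n_Y}}$
which has a standard normal distribution under the null hypothesis.
2. If variances are unknown (which is usually the case),
$T.S. = \frac{\bar{x} - \bar{y} - \Delta_0}{\sqrt{s_X^2/n_X + s_Y^2/n_Y}}$
which has a $t_{\nu}$ distribution under $H_0$, where the degrees of freedom are given by
$\nu = \frac{\left(s_X^2/n_X + s_Y^2/n_Y\right)^2}{\frac{(s_X^2/n_X)^2}{n_X - 1} + \frac{(s_Y^2/n_Y)^2}{n_Y - 1}}$
We reject $H_0$ if
(i) for $H_a: \mu_X - \mu_Y > \Delta_0$: $T.S. > t_{\alpha, \nu}$, or p-value $= P(t_{\nu} > T.S.) < \alpha$
(ii) for $H_a: \mu_X - \mu_Y < \Delta_0$: $T.S. < -t_{\alpha, \nu}$, or p-value $= P(t_{\nu} < T.S.) < \alpha$
(iii) for $H_a: \mu_X - \mu_Y \ne \Delta_0$: $|T.S.| > t_{\alpha/2, \nu}$, or p-value $= 2P(t_{\nu} > |T.S.|) < \alpha$
Remark: If we have equality of variances ($\sigma_X^2 = \sigma_Y^2$) then we replace both $s_X^2$ and $s_Y^2$ with
$s_p^2 = \frac{(n_X - 1)s_X^2 + (n_Y - 1)s_Y^2}{n_X + n_Y - 2}$
And in this case the degrees of freedom for the t distribution is $n_X + n_Y - 2$.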
Example: The prestressing wire on each of two concrete pipes manufactured at different times was compared for torsion properties. Ten specimens randomly selected from each pipe were twisted in a laboratory apparatus until they broke; the number of revolutions until complete failure was recorded. The results are as follows, with C1 and C2 denoting the two concrete pipes:
C1: 5.83, 8.66, 4.75, 3.00, 3.37, 3.63, 4.00, 4.63, 4.25, 4.13
C2: 3.38, 2.81, 7.00, 1.50, 5.88, 5.25, 4.08, 7.63, 4.50, 4.88
Is there any evidence to suggest that the true mean revolutions to failure differ for the wire on the two pipes?
Solution: MINITAB
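As an alternative to statistical software, the unequal-variance two-sample t-test for the pipe data can be sketched in plain Python. This is not the MINITAB output from the lecture; the helper `t_cdf` integrates the t density numerically as an illustrative substitute for a t-table, and the degrees of freedom are left unrounded.

```python
import math
from statistics import mean, variance

def t_cdf(t, df, steps=2000):
    """P(T <= t) for Student's t with df degrees of freedom (Simpson's rule)."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    pdf = lambda x: c * (1 + x * x / df) ** (-(df + 1) / 2)
    a = abs(t)
    h = a / steps
    area = pdf(0.0) + pdf(a)
    for i in range(1, steps):
        area += (4 if i % 2 else 2) * pdf(i * h)
    area *= h / 3
    return 0.5 + area if t >= 0 else 0.5 - area

c1 = [5.83, 8.66, 4.75, 3.00, 3.37, 3.63, 4.00, 4.63, 4.25, 4.13]
c2 = [3.38, 2.81, 7.00, 1.50, 5.88, 5.25, 4.08, 7.63, 4.50, 4.88]
n1, n2 = len(c1), len(c2)

v1, v2 = variance(c1) / n1, variance(c2) / n2       # s^2 / n for each sample
t_stat = (mean(c1) - mean(c2)) / math.sqrt(v1 + v2) # H0: mu_1 - mu_2 = 0
df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
p_value = 2 * (1 - t_cdf(abs(t_stat), df))          # two-sided alternative

print(f"T.S. = {t_stat:.3f}, df = {df:.1f}, p-value = {p_value:.3f}")
```

The two sample means are nearly identical, so the test statistic is close to zero and the p-value is large.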
Test for paired data (section 6.8)
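When two samples are dependent, i.e. paired, such as when two different measurements are made on the same experimental unit, we consider the data in the form of the pairs $(X_1, Y_1), \dots, (X_n, Y_n)$ and construct the one-dimensional, i.e. one-sample, data $D_1, \dots, D_n$ where $D_i = X_i - Y_i$ for $i = 1, \dots, n$. As shown earlier, $\mu_D = \mu_X - \mu_Y$.
To test,
I. $H_0: \mu_D \le \Delta_0$ vs $H_a: \mu_D > \Delta_0$
II. $H_0: \mu_D \ge \Delta_0$ vs $H_a: \mu_D < \Delta_0$
III. $H_0: \mu_D = \Delta_0$ vs $H_a: \mu_D \ne \Delta_0$
perform a one-sample hypothesis test by either a large or small sample inference using the test statistic
$T.S. = \frac{\bar{d} - \Delta_0}{\sigma_D/\sqrt{n}}$
or
$T.S. = \frac{\bar{d} - \Delta_0}{s_D/\sqrt{n}}$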
Example: Two drying methods for concrete were used on seven different mixes, with each mix of concrete subjected to each drying method. The resulting strength test measurements (in psi) are given below. Is there evidence of a difference between average strengths for the two drying methods at the 10% significance level?
Mix   Method I   Method II
A     3160       3170
B     3240       3220
C     3190       3160
D     3520       3530
E     3480       3440
F     3220       3210
G     3120       3120
Solution: MINITAB
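The paired analysis reduces to a one-sample t-test on the differences, which can be sketched without statistical software. This is not the MINITAB output from the lecture; the helper `t_cdf` integrates the t density numerically as an illustrative substitute for a t-table.

```python
import math
from statistics import mean, stdev

def t_cdf(t, df, steps=2000):
    """P(T <= t) for Student's t with df degrees of freedom (Simpson's rule)."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    pdf = lambda x: c * (1 + x * x / df) ** (-(df + 1) / 2)
    a = abs(t)
    h = a / steps
    area = pdf(0.0) + pdf(a)
    for i in range(1, steps):
        area += (4 if i % 2 else 2) * pdf(i * h)
    area *= h / 3
    return 0.5 + area if t >= 0 else 0.5 - area

method_1 = [3160, 3240, 3190, 3520, 3480, 3220, 3120]   # mixes A to G
method_2 = [3170, 3220, 3160, 3530, 3440, 3210, 3120]
alpha = 0.10

d = [a - b for a, b in zip(method_1, method_2)]   # paired differences per mix
n = len(d)
t_stat = mean(d) / (stdev(d) / math.sqrt(n))      # H0: mu_D = 0
p_value = 2 * (1 - t_cdf(abs(t_stat), n - 1))     # two-sided alternative

print(f"differences = {d}")
print(f"T.S. = {t_stat:.3f}, p-value = {p_value:.3f}")
print("reject H0" if p_value < alpha else "fail to reject H0")
```

Pairing removes the mix-to-mix variability: only the seven within-mix differences enter the test, so the degrees of freedom are 6 rather than 12.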
Power of the Test
The power of a test is the probability of rejecting $H_0$ whenever it is false.
Power $= P(\text{reject } H_0 \mid H_0 \text{ is false}) = 1 - P(\text{type II error})$
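Power depends on how far the true mean is from the null value. The sketch below computes it for a lower-tailed z-test using the paint example setting ($\mu_0 = 75$, $\sigma = 9$, $n = 35$); the true means 74, 72 and 70 and the function name are hypothetical, chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist

def power_lower_tailed_z(mu0, mu_true, sigma, n, alpha=0.05):
    """P(reject H0 | true mean is mu_true) for H0: mu = mu0 vs Ha: mu < mu0."""
    se = sigma / sqrt(n)
    cutoff = mu0 + NormalDist().inv_cdf(alpha) * se   # reject when xbar < cutoff
    return NormalDist(mu_true, se).cdf(cutoff)        # xbar ~ N(mu_true, se^2)

for mu_true in (74, 72, 70):       # hypothetical true means
    power = power_lower_tailed_z(75, mu_true, 9, 35)
    print(f"true mean {mu_true}: power = {power:.3f}")
```

The further the true mean lies below 75, the larger the power; when the true mean equals 75 the rejection probability reduces to `alpha`.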
Exam 2
1. Section 2.6
2. Section 4.11
3. Sections 5.1-5.7
4. Sections 6.1-6.8