8-Hypothesis 28 Oct 11
1/1/2007
Hypothesis Testing
Prof. Dhananjay M. Apte
Trainer - Six Sigma; Faculty - Statistics, Q.T., Operations Mgt.
[email protected] 98231 90939
For private circulation only. All rights reserved
Example 1
Mango farms in Ratnagiri District produce an average of 500 mangoes per farm, with a standard deviation of 96.
(Information source: Food Report journal)
After a special fertilizer was introduced, output was measured on 50 farms. The average output was 535 mangoes per farm.
• Is the output more? ...... Yes, algebraically.
• Is it statistically MORE?
• Is there a STATISTICAL DIFFERENCE between the output after and before the inclusion of fertilizer?
Z cal = |x bar − Mu| / (σ / √n) = |535 − 500| / (96 / √50) = 2.578
Always consider the absolute value.
Also need Z tab (critical value).
At 0.05 significance level, Z tab = 1.645
In the said problem…
Since Z cal > Z tab, there IS a statistical DIFFERENCE between x bar and Mu.
Hence, the inference (conclusion) is:
• X bar is not only algebraically but also statistically more than Mu.
• Output after inclusion of fertilizer is statistically more than output before inclusion of fertilizer.
• The fertilizer is effective.
Z cal = 2.58, Z tab 0.05 = 1.645
If Z cal > Z tab: there is a statistical difference between x bar and Mu.
If Z cal < Z tab: there is NO statistical difference between x bar and Mu.
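The calculation in Example 1 can be sketched in a few lines of Python. This is only an illustration of the rule stated above; the function name `z_statistic` is an assumption made here, not something from the notes.

```python
import math

def z_statistic(sample_mean, pop_mean, pop_sd, n):
    """Absolute value of Z cal for a one-sample Z test."""
    standard_error = pop_sd / math.sqrt(n)
    return abs(sample_mean - pop_mean) / standard_error

# Example 1: mango farms (numbers taken from the example)
z_cal = z_statistic(sample_mean=535, pop_mean=500, pop_sd=96, n=50)
z_tab = 1.645  # one-tailed critical value at alpha = 0.05, from the Z table

# Decision rule: a statistical difference exists when Z cal > Z tab
significant = z_cal > z_tab
print(f"Z cal = {z_cal:.3f}, Z tab = {z_tab}, statistical difference: {significant}")
```

The same function works for any one-sample problem where the standard deviation is given; only the four inputs change.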
Example 2
A Sensex Office claims that the average annual family income in Metropolis
is $48,432. But in a report prepared by the Economic Research Department
of a major bank, the Department Manager reports an average income of
$48,574 from a random sample of 400 families. Is the Sensex Office's claim
LESS than that of the Department Manager? Consider a sample standard
deviation of 2000.
Z tab 0.05 = 1.645 ……
Since Z cal < Z tab, there is NO DIFFERENCE.
Hence, the inference (conclusion) is:
• Both claims are (statistically) the same.
• The Sensex Office's claim is algebraically less, but statistically NOT.
• Whatever algebraic difference is seen is just "chance".
• No "significant" difference.
S.D. plays a vital role.
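To see why the S.D. matters, here is a small sketch that works out Z cal for Example 2 and then repeats the calculation with a smaller S.D. The value 1000 is a hypothetical figure chosen only for illustration; it does not come from the example.

```python
import math

def z_statistic(sample_mean, claimed_mean, sd, n):
    """Absolute value of Z cal."""
    return abs(sample_mean - claimed_mean) / (sd / math.sqrt(n))

z_tab = 1.645  # Z tab at alpha = 0.05, as in the example

# Example 2 as stated: S.D. = 2000, n = 400
z_cal = z_statistic(48574, 48432, sd=2000, n=400)

# Same means and sample size, but a HYPOTHETICAL smaller S.D. of 1000
z_cal_small_sd = z_statistic(48574, 48432, sd=1000, n=400)

print(f"S.D. 2000: Z cal = {z_cal:.2f}, difference: {z_cal > z_tab}")
print(f"S.D. 1000: Z cal = {z_cal_small_sd:.2f}, difference: {z_cal_small_sd > z_tab}")
```

With the same algebraic gap of 142, halving the S.D. doubles Z cal, so the conclusion can flip from "no difference" to "difference".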
Test Statistic
• Area = Significance level = α (called alpha)
• Confidence level is the level of trueness / confidence = 1 − significance level
• If not mentioned in the problem, consider α = 0.05
• α is expressed as a fraction (0.05) or as a percentage (5 %)
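The link between α, the confidence level, and the one-tailed Z tab value can be checked without a printed table. The sketch below uses `NormalDist` from Python's standard library as a stand-in for the Z table; the list of α values is an assumption chosen for illustration.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, S.D. 1

rows = []
for alpha in (0.10, 0.05, 0.01):
    confidence_level = 1 - alpha
    # One-tailed critical value: the Z with area alpha to its right
    z_tab = standard_normal.inv_cdf(1 - alpha)
    rows.append((alpha, confidence_level, z_tab))
    print(f"alpha = {alpha:.2f}  confidence = {confidence_level:.0%}  Z tab = {z_tab:.3f}")
```

For α = 0.05 this gives the 1.645 used throughout these notes.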
Example 3
The lifetime of a certain brand of heat pump is known to be normally
distributed with a standard deviation of 2. A sample of 6 heat pumps yielded
the observations: 2.0 1.3 6.0 1.9 5.1 4.0
At α = 0.05, is there reason to believe that the mean life of the heat pumps is
GREATER THAN 2?
x bar = 3.383, µ = 2
Z cal = 1.69, Z tab 0.05 = 1.645
Since Z cal > Z tab, there IS a DIFFERENCE.
Hence, the inference (conclusion) is:
• Mean life of the heat pumps is greater than 2.
• There is reason to believe that the mean life of the heat pumps is significantly greater than 2.
What if S.D. is not given? Calculate it from the data.
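For Example 3 the sample mean has to be worked out from the raw observations first. A minimal sketch, using only the data and the S.D. given in the example, also shows how the S.D. could be calculated from the data when it is not given:

```python
import math
from statistics import mean, stdev

lifetimes = [2.0, 1.3, 6.0, 1.9, 5.1, 4.0]  # observations from Example 3
sigma = 2.0   # S.D. given in the problem
mu = 2.0      # value being tested
z_tab = 1.645

x_bar = mean(lifetimes)
z_cal = abs(x_bar - mu) / (sigma / math.sqrt(len(lifetimes)))
print(f"x bar = {x_bar:.3f}, Z cal = {z_cal:.2f}, difference: {z_cal > z_tab}")

# If the S.D. were NOT given, it could be calculated from the sample itself
sample_sd = stdev(lifetimes)  # uses the n - 1 divisor
print(f"S.D. calculated from the data = {sample_sd:.2f}")
```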
Exercise
The maker of a certain model of car claimed that his car averaged at least 31 miles per gallon of gasoline. A sample of 36 cars was selected, and each car was driven with one gallon of regular gasoline. The sample showed a mean of 29.43 miles with a standard deviation of 3 miles. With a 95 % confidence level, what do you conclude about the manufacturer's claim?
(Is he making a TALLER claim?)
Ans.
Z cal = 3.14
Z tab = 1.645
Since Z cal > Z tab, it is a taller claim.
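The exercise can be checked with a short sketch. It also shows why the notes say to consider the absolute value: the signed statistic is negative here (the sample mean is below the claim), and comparing it with −Z tab gives the same decision as comparing its absolute value with Z tab. Variable names are illustrative assumptions.

```python
import math

claimed_mean = 31.0   # manufacturer's claim
sample_mean = 29.43
sample_sd = 3.0
n = 36
z_tab = 1.645         # alpha = 0.05, one-tailed

z_signed = (sample_mean - claimed_mean) / (sample_sd / math.sqrt(n))
z_cal = abs(z_signed)

# Signed view: the claim is too tall when the statistic falls below -Z tab
taller_claim_signed = z_signed < -z_tab
# Absolute-value view used in these notes
taller_claim_abs = z_cal > z_tab

print(f"signed Z = {z_signed:.2f}, Z cal = {z_cal:.2f}, taller claim: {taller_claim_abs}")
```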
"t" test
Example 4
The maker of a certain model of car claimed that his car averaged at least 31 miles per gallon of gasoline. A sample of 9 cars was selected, and each car
was driven with one gallon of regular gasoline. The sample showed a mean
of 29.43 miles with a standard deviation of 3 miles. With a 95 % confidence
level (alpha = 0.05), what do you conclude about the manufacturer's claim? Is the claim more than the sample supports?
If the S.D. of the population is NOT known and the sample size is less than or
equal to 30, use another table, the t distribution table, to find the
tab value, t tab. This is the t test.
However, the calculated value, t cal, is calculated the same way as Z cal:
t cal = |x bar − Mu| / (s / √n) = |29.43 − 31| / (3 / √9) = 1.57 (consider the absolute value)
To find t tab, we must know the degrees of freedom (DOF) = n − 1 = 9 − 1 = 8
t cal = 1.57, t tab 0.05,8 = 1.860
Since t cal < t tab, there is NO DIFFERENCE.
Hence, the inference (conclusion) is:
• The manufacturer's claim and the test result are the same.
• There is insufficient evidence to doubt the manufacturer's claim concerning the gas mileage.
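Example 4 in code form, as a rough sketch. Python's standard library has no t table, so the critical value 1.860 is simply copied from the notes rather than computed; the function name `t_statistic` is an assumption.

```python
import math

def t_statistic(sample_mean, claimed_mean, sample_sd, n):
    """Absolute value of t cal; same formula as Z cal, with the sample S.D."""
    return abs(sample_mean - claimed_mean) / (sample_sd / math.sqrt(n))

n = 9
t_cal = t_statistic(sample_mean=29.43, claimed_mean=31.0, sample_sd=3.0, n=n)
dof = n - 1        # degrees of freedom
t_tab = 1.860      # t table value for alpha = 0.05 (one-tailed), DOF = 8, as quoted in the notes

doubt_claim = t_cal > t_tab
print(f"t cal = {t_cal:.2f}, DOF = {dof}, t tab = {t_tab}, doubt the claim: {doubt_claim}")
```

Compare this with the exercise on 36 cars: the sample mean and S.D. are identical, but the smaller sample gives a much smaller calculated value, so the claim can no longer be doubted.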
2 Tailed Test…
• Tests with > or < signs (i.e. greater than, less than) are 1 tailed tests.
• Tests with = or ≠ signs are 2 tailed tests.
• "Cal" values remain unchanged in a 2 tailed test; only "Tab" values change.
• "Tab" values are determined from alpha / 2.
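The effect of splitting alpha across two tails can be sketched as follows. The helper `z_tab` is a hypothetical name, and `NormalDist` from the standard library stands in for the printed Z table.

```python
from statistics import NormalDist

def z_tab(alpha, tails):
    """Critical Z value: alpha in one tail, or alpha / 2 in each of two tails."""
    if tails not in (1, 2):
        raise ValueError("tails must be 1 or 2")
    return NormalDist().inv_cdf(1 - alpha / tails)

one_tailed = z_tab(0.05, tails=1)
two_tailed = z_tab(0.05, tails=2)
print(f"1 tailed Z tab = {one_tailed:.3f}")
print(f"2 tailed Z tab = {two_tailed:.3f}")
```

The calculated value is untouched; only the threshold it is compared against moves outward in the 2 tailed case.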
2 Tailed test…
Example 5
Suppose we have a random sample of n = 25 measurements of chest circumference from a population of newborns with σ = 0.7 inches, and the sample mean is 12.6 inches. Is it likely that the population mean has the
value µ = 13.0 inches? Consider alpha = 0.05. (Is x bar = Mu?)
1. It's a 2 tailed test, because equality is the concern.
2. And it's a Z test, because the S.D. of the population is known.
Test statistic: Z cal = |12.6 − 13.0| / (0.7 / √25) = 2.86
Z tab 0.025 = 1.96 (refer to table)
Since Z cal > Z tab, there IS a DIFFERENCE.
Hence, the inference (conclusion) is:
• Our observed value of x bar = 12.6 for the sample mean is too rare for us to believe that µ = 13.0.
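A quick sketch of Example 5, using the numbers in the problem and the 2 tailed table value 1.96 quoted in the notes:

```python
import math

n = 25
sigma = 0.7          # population S.D. (known, so this is a Z test)
x_bar = 12.6
mu = 13.0
z_tab = 1.96         # 2 tailed: table value for alpha / 2 = 0.025

z_cal = abs(x_bar - mu) / (sigma / math.sqrt(n))
difference = z_cal > z_tab
print(f"Z cal = {z_cal:.3f}, Z tab = {z_tab}, difference: {difference}")
```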
Hypotheses terms and wordings
• Null Hypothesis… H0
– Statement of innocence made before test calculations
– There is NO difference
– H0 : x bar = μ
• Alternative Hypothesis… HA
– There is a difference
– HA : x bar ≠ μ
• A hypothesis is accepted, rejected, not accepted, or not rejected
• Acceptance region, rejection region, critical value
Errors in Hypothesis Testing
A type I error consists of rejecting the
null hypothesis H0 when it was true.
A type II error consists of not rejecting
H0 when H0 is false.
α and β are the probabilities of type
I and type II error, respectively (the so-called
alpha and beta errors).
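One way to get a feel for the type I error is a small simulation: draw many samples from a population where H0 really is true and count how often the test rejects it anyway. The sketch below reuses the mango farm figures (mean 500, S.D. 96, n = 50) purely as a convenient setting; the normal population, the seed, and the number of trials are assumptions made for illustration.

```python
import math
import random
from statistics import NormalDist

random.seed(0)  # fixed seed so the run is repeatable

mu, sigma, n, alpha = 500, 96, 50, 0.05
z_tab = NormalDist().inv_cdf(1 - alpha)   # one-tailed critical value
trials = 10_000

rejections = 0
for _ in range(trials):
    # H0 is TRUE here: every sample comes from a population with mean mu
    sample = [random.gauss(mu, sigma) for _ in range(n)]
    x_bar = sum(sample) / n
    z = (x_bar - mu) / (sigma / math.sqrt(n))
    if z > z_tab:
        rejections += 1   # rejecting a true H0 is a type I error

type_1_rate = rejections / trials
print(f"Share of true H0 rejected: {type_1_rate:.3f} (alpha = {alpha})")
```

The observed share should land close to alpha. Estimating β would need a specific alternative value of the mean, which this sketch does not attempt.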
2 Samples Test
Example 6: A random sample of 15 students from each class. State whether there is a difference
between the (mean) heights of students in class C and class D (consider a 5 % significance level).
Student height (cm)
Students in class C | Students in class D
145 149 152 153 154 | 148 153 157 161 162
154 158 160 166 166 | 162 163 167 172 172
166 167 175 177 182 | 175 177 183 185 187
It's a "2 tailed" "t" test.
x1 bar = 161.6
x2 bar = 168.27
s1 = 10.86
s2 = 11.74
n1 = n2 = 15
t cal = 1.62
t tab 0.025,28 = 2.048
Since t cal < t tab, there is NO DIFFERENCE.
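The slide gives only the result for t cal. A sketch of how it could be obtained from the summary figures above is shown here; the formula used (difference of means divided by the combined standard error) is an assumption about the method, though with equal sample sizes the pooled and unpooled versions give the same standard error.

```python
import math

# Summary figures quoted in the notes
x1_bar, x2_bar = 161.6, 168.27
s1, s2 = 10.86, 11.74
n1 = n2 = 15

standard_error = math.sqrt(s1**2 / n1 + s2**2 / n2)
t_cal = abs(x1_bar - x2_bar) / standard_error
dof = n1 + n2 - 2
t_tab = 2.048   # t table value for alpha / 2 = 0.025, DOF = 28, as quoted in the notes

difference = t_cal > t_tab
print(f"t cal = {t_cal:.2f}, DOF = {dof}, t tab = {t_tab}, difference: {difference}")
```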
Paired Tests
Does the additive improve performance? Test it at 5 % significance.
(Is the mileage more? 1 tailed t test; sample size less than 30)
Determining t tab…
DOF = 9
Solution…
Thus Paired Test considers….
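The mechanics of a paired test can be sketched as below. The mileage figures are HYPOTHETICAL numbers invented for this illustration (the data from the original slide are not reproduced in this transcript); only the sample size of 10 pairs, giving DOF = 9, matches the notes. The key idea is that the test works on the per-vehicle differences rather than on the two columns separately.

```python
import math
from statistics import mean, stdev

# HYPOTHETICAL mileage for 10 vehicles, without and with the additive
without_additive = [12.0, 11.5, 13.2, 12.8, 11.9, 12.4, 13.0, 12.1, 11.7, 12.6]
with_additive    = [12.5, 11.9, 13.1, 13.4, 12.3, 12.9, 13.2, 12.6, 11.8, 13.1]

# A paired test looks only at the difference within each pair
differences = [w - wo for w, wo in zip(with_additive, without_additive)]
n = len(differences)
mean_diff = mean(differences)
sd_diff = stdev(differences)

t_cal = abs(mean_diff) / (sd_diff / math.sqrt(n))
dof = n - 1
t_tab = 1.833   # standard t table value for alpha = 0.05 (one-tailed), DOF = 9

print(f"mean difference = {mean_diff:.2f}, t cal = {t_cal:.2f}, DOF = {dof}")
print(f"mileage statistically more: {t_cal > t_tab}")
```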
Confidence Interval (C.I.) of Mean
• Point estimate: we take a sample from the population and find its mean (X bar). We use it to estimate the mean of the population.
• In some cases, the estimate of the mean is stated as an interval, called an interval estimate or confidence interval (C.I.).
• Formula of C.I.: X bar ± Z × σ / √n
"Z" needs to be replaced by "t" if n <= 30 and the S.D. of the population is not known.
Z and t are tabulated values and need two tailed consideration.
If the S.D. of the population is not known, use that of the sample.
Alpha is generally taken as 0.05.
• Example: Sample of 20 sales invoices (bills), mean amount $110.27,
sample std dev = $28.95. Find the range in which the mean of all the invoices
(in the population) lies. (To determine C.I.)
n <= 30 and S.D. of population is not known, so use t: t tab 0.025,19 = 2.093
X bar ± t(n−1) × s / √n = 110.27 ± 2.093 × 28.95 / √20 = 110.27 ± 13.55
or $96.72 ≤ µ ≤ $123.82
Hence, the inference (conclusion) is:
• The mean amount of all the invoices in the population will lie between 96.72 and 123.82.
• The surety level (confidence level) for the above statement is 95 %.
• Or: about 95 % of intervals constructed this way from repeated samples will contain the population mean.
As the confidence level increases, the range (C.I.) increases.
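The invoice interval can be reproduced with a short sketch. The 95 % table value 2.093 is the one quoted in the notes; the 99 % value 2.861 is the standard t table entry for DOF = 19 and is added here only to show the interval widening. The function name is an assumption.

```python
import math

def mean_confidence_interval(x_bar, s, n, t_tab):
    """Return (lower, upper, margin) for X bar ± t × s / √n."""
    margin = t_tab * s / math.sqrt(n)
    return x_bar - margin, x_bar + margin, margin

n, x_bar, s = 20, 110.27, 28.95

# 95 % confidence: t tab for alpha / 2 = 0.025, DOF = 19
lower95, upper95, margin95 = mean_confidence_interval(x_bar, s, n, t_tab=2.093)
# 99 % confidence: t tab for alpha / 2 = 0.005, DOF = 19
lower99, upper99, margin99 = mean_confidence_interval(x_bar, s, n, t_tab=2.861)

print(f"95 % C.I.: {lower95:.2f} to {upper95:.2f} (± {margin95:.2f})")
print(f"99 % C.I.: {lower99:.2f} to {upper99:.2f} (± {margin99:.2f})")
```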
Confidence Interval (C.I.) for Proportion
• The C.I. for P is given below, with ps = sample proportion
– When n is large, use the Z value
– When n is small, use the t value
P = ps ± z × √( ps × (1 − ps) / n )
Example
• Sample of 100 sales invoices, 10 have errors. What is the 95 % C.I.?
• ps = 10/100 = 0.1
• 95 % C.I. for P = 0.1 ± 1.96 × √(0.1 × 0.9 / 100) = 0.1 ± 0.0588
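The same proportion example as a small sketch, using the large-sample Z value 1.96. Variable names are illustrative.

```python
import math

n = 100
errors = 10
z = 1.96                      # 2 tailed Z value for 95 % confidence

p_s = errors / n              # sample proportion
margin = z * math.sqrt(p_s * (1 - p_s) / n)
lower, upper = p_s - margin, p_s + margin

print(f"ps = {p_s}, margin = {margin:.4f}")
print(f"95 % C.I. for P: {lower:.4f} to {upper:.4f}")
```

In words: between roughly 4 % and 16 % of all invoices are estimated to contain errors.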
P - Value
The P-value is the smallest level of
significance at which H0 would be
rejected when a specified test procedure
is used on a given data set.
1. P-value ≤ α ⇒ reject H0 at a level of α
2. P-value > α ⇒ do not reject H0 at a level of α
P - Value
The P-value is the probability,
calculated assuming H0 is true, of
obtaining a test statistic value at least as
contradictory to H0 as the value that
actually resulted. The smaller the P-
value, the more contradictory is the data
to H0.
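As an illustration, the P-values for Examples 1 and 2 can be approximated from their Z cal values. This sketch treats both as one-tailed tests, matching the Z tab of 1.645 used in those examples, and uses `NormalDist` from the standard library in place of the Z table; the helper name is an assumption.

```python
import math
from statistics import NormalDist

def one_tailed_p_value(z_cal):
    """Area under the standard normal curve to the right of Z cal."""
    return 1 - NormalDist().cdf(z_cal)

alpha = 0.05

# Example 1 (mango farms): Z cal = |535 - 500| / (96 / sqrt(50))
p_example_1 = one_tailed_p_value(35 / (96 / math.sqrt(50)))
# Example 2 (family income): Z cal = |48574 - 48432| / (2000 / sqrt(400))
p_example_2 = one_tailed_p_value(142 / (2000 / math.sqrt(400)))

print(f"Example 1: P-value = {p_example_1:.4f}, reject H0: {p_example_1 <= alpha}")
print(f"Example 2: P-value = {p_example_2:.4f}, reject H0: {p_example_2 <= alpha}")
```

Comparing the P-value with α leads to the same decisions as comparing Z cal with Z tab.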