8-Hypothesis 28 Oct 11


Transcript of 8-Hypothesis 28 Oct 11

Page 1: 8-Hypothesis 28 Oct 11

1/1/2007

1

Hypothesis Testing

Prof. Dhananjay M. Apte

Trainer-Six Sigma, Faculty-Statistics, Q.T. Operation Mgt.

[email protected] Cell- 98231 90939

For private circulation only. All rights reserved

Example 1

Mango farms in Ratnagiri District produce an average of 500 mangoes per farm, with a standard deviation of 96.

(Information source- Food Report journal)

After inclusion of a special fertilizer, output was measured from 50 farms. The average output was 535 mangoes per farm.

• Is the output more? .................................. Yes, algebraically.

• Is it statistically MORE?

• Is there a STATISTICAL DIFFERENCE between the output after and before the inclusion of fertilizer?

Test statistic:

\[ Z_{\text{cal}} = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} \]

Always consider the absolute value of Z cal.

Also need Z tab (critical value) at the 0.05 significance level…


In the said problem…

Z cal = 2.57, Z tab 0.05 = 1.645

Since Z cal > Z tab… There IS a statistical DIFFERENCE between x bar and Mu.

Hence… Inference (conclusion) is:

• X bar is not only algebraically but also statistically more than Mu

• Output after inclusion of fertilizer is statistically more than output before inclusion of fertilizer

• Fertilizer is effective.

Decision rule:

• Z cal > Z tab: there is a statistical difference between x bar and Mu

• Z cal < Z tab: there is NO statistical difference between x bar and Mu
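The decision rule above can be sketched in a few lines of Python using only the standard library. The function names `z_cal` and `z_tab` are illustrative choices, not part of the slides; the numbers are those of Example 1.

```python
from math import sqrt
from statistics import NormalDist

def z_cal(x_bar, mu, sigma, n):
    """Absolute value of the one-sample Z statistic."""
    return abs(x_bar - mu) / (sigma / sqrt(n))

def z_tab(alpha):
    """One-tailed critical value of the standard normal distribution."""
    return NormalDist().inv_cdf(1 - alpha)

# Example 1: mu = 500, sigma = 96, n = 50 farms, x bar = 535
z = z_cal(535, 500, 96, 50)
crit = z_tab(0.05)

print(f"Z cal = {z:.2f}, Z tab = {crit:.3f}")
print("statistical difference" if z > crit else "no statistical difference")
```

Run as written, Z cal comes out at about 2.58 (the slide quotes 2.57) and Z tab at 1.645, so the sketch reaches the same conclusion as the slide.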


Example 2

A Sensex Office claims that the average annual family income in Metropolis is $48,432. But in a report prepared by the Economic Research Department of a major bank, the Department Manager reports an average income of $48,574 from a random sample of 400 families. Is the Sensex Office's claim LESS than that of the Department Manager? Consider a sample standard deviation of 2000.

Z cal = |48,574 − 48,432| / (2000 / √400) = 1.42, Z tab 0.05 = 1.645

Since Z cal < Z tab… NO DIFFERENCE

Hence… Inference (conclusion) is:

• Both claims are (statistically) the same

• The Sensex Office's claim is algebraically less, but statistically NOT

• Whatever algebraic difference is seen is just “chance”

• No “significant” difference.

S.D. plays a vital role
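To see how much the S.D. matters, the sketch below recomputes the Example 2 statistic twice: once with the S.D. of 2000 given in the problem, and once with a smaller S.D. of 1500. The second value is purely hypothetical (it does not appear in the slides) and is there only to show the effect.

```python
from math import sqrt

def z_cal(x_bar, mu, sd, n):
    """Absolute value of the one-sample Z statistic."""
    return abs(x_bar - mu) / (sd / sqrt(n))

Z_TAB = 1.645  # one-tailed critical value at alpha = 0.05, as quoted on the slide

# Example 2 figures: means 48,574 and 48,432, n = 400, S.D. = 2000
z_given = z_cal(48574, 48432, 2000, 400)

# Hypothetical smaller S.D. (NOT from the slides), same means and sample size
z_hypothetical = z_cal(48574, 48432, 1500, 400)

for label, z in [("S.D. = 2000", z_given),
                 ("S.D. = 1500 (hypothetical)", z_hypothetical)]:
    verdict = "difference" if z > Z_TAB else "no difference"
    print(f"{label}: Z cal = {z:.2f} -> {verdict}")
```

With the same algebraic gap of 142, the smaller spread pushes Z cal past Z tab, so the verdict flips from "no difference" to "difference".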

Test Statistic

• Area = significance level = α … called alpha

• Confidence level is the level of trueness / confidence = 1 − significance level

• If not mentioned in the problem… consider α = 0.05

• α is expressed as a fraction (0.05) or as a percentage (5 %)
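As a small illustration of these relationships, the tabulated Z value can be recovered from α with Python's standard library instead of a printed table. This is only a sketch; the variable names are arbitrary.

```python
from statistics import NormalDist

alpha = 0.05                 # significance level (default when the problem is silent)
confidence = 1 - alpha       # confidence level
# One-tailed critical value: the point with area alpha to its right
z_tab = NormalDist().inv_cdf(confidence)

print(f"alpha = {alpha} ({alpha:.0%}), confidence = {confidence:.0%}, Z tab = {z_tab:.3f}")
```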


Example 3

The lifetime of a certain brand of heat pump is known to be normally distributed with a standard deviation of 2. A sample of 6 heat pumps yielded the observations: 2.0 1.3 6.0 1.9 5.1 4.0

At α = 0.05, is there reason to believe that the mean life of the heat pumps is GREATER THAN 2?

x bar = 3.383, µ = 2

Z cal = 1.69, Z tab 0.05 = 1.645

Since Z cal > Z tab… There IS a DIFFERENCE

Hence… Inference (conclusion) is:

• Mean life of the heat pumps is greater than 2

• There is reason to believe that the mean life of the heat pumps is significantly greater than 2.

What if the S.D. is not given? … Calculate it from the data.
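The Example 3 figures can be checked from the raw observations. The sketch below uses the known S.D. of 2 for the test and also computes the sample S.D. with `statistics.stdev`, which is what one would fall back on if the S.D. were not given.

```python
from math import sqrt
from statistics import mean, stdev

lifetimes = [2.0, 1.3, 6.0, 1.9, 5.1, 4.0]  # observations from Example 3
sigma = 2.0   # known population S.D.
mu_0 = 2.0    # value the mean life is tested against

x_bar = mean(lifetimes)
z = abs(x_bar - mu_0) / (sigma / sqrt(len(lifetimes)))
print(f"x bar = {x_bar:.3f}, Z cal = {z:.2f}")

# If the S.D. were not given, it would be calculated from the data
s = stdev(lifetimes)
print(f"sample S.D. = {s:.2f}")
```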

Exercise

The maker of a certain model of car claimed that his car averaged at least 31 miles per gallon of gasoline. A sample of 36 cars was selected and each car was driven with one gallon of regular gasoline. The sample showed a mean of 29.43 miles with a standard deviation of 3 miles. With a 95 % confidence level, what do you conclude about the manufacturer's claim?

(Is he making a TALLER claim?)

Ans.

Z cal = 3.14

Z tab = 1.645

Since Z cal > Z tab, the manufacturer is making a taller claim.
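A quick way to verify the answer to the exercise is sketched below. The variable names are illustrative, and the critical value is taken from the slide rather than computed.

```python
from math import sqrt

claimed_mean = 31.0            # manufacturer's claim (miles per gallon)
x_bar, s, n = 29.43, 3.0, 36   # sample results from the exercise
Z_TAB = 1.645                  # one-tailed critical value at alpha = 0.05 (from the slide)

z = abs(x_bar - claimed_mean) / (s / sqrt(n))
print(f"Z cal = {z:.2f}")
print("taller claim" if z > Z_TAB else "no evidence against the claim")
```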


“t” test

Example 4

The maker of a certain model of car claimed that his car averaged at least 31 miles per gallon of gasoline. A sample of 9 cars was selected and each car was driven with one gallon of regular gasoline. The sample showed a mean of 29.43 miles with a standard deviation of 3 miles. With a 95 % confidence level (alpha = 0.05), what do you conclude about the manufacturer's claim? Is the claim more than the test result?

If the S.D. of the population is NOT known and the sample size is less than or equal to 30, use another table, the t distribution table, to find the tab value, t tab… This is the t test.

However, the calculated value, t cal, is calculated the same way as Z cal:

t cal = |29.43 − 31| / (3 / √9) = 1.57 (consider the absolute value)

To find t tab, we must know the degrees of freedom (DOF) = n − 1 = 9 − 1 = 8


t cal = 1.57, t tab 0.05,8 = 1.860

Since t cal < t tab… There is NO DIFFERENCE

Hence… Inference (conclusion) is:

• The manufacturer's claim and the test result are (statistically) the same

• There is insufficient evidence to doubt the manufacturer's claim concerning the gas mileage.
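The t test of Example 4 can be sketched as follows. Python's standard library has no t distribution, so the tab value 1.860 is simply copied from the slide; the helper name `t_cal` is an illustrative choice.

```python
from math import sqrt

def t_cal(x_bar, mu, s, n):
    """Absolute one-sample t statistic and its degrees of freedom (n - 1)."""
    return abs(x_bar - mu) / (s / sqrt(n)), n - 1

T_TAB = 1.860  # t table value for alpha = 0.05 (one-tailed), 8 DOF, from the slide

# Example 4: claim of 31 mpg, sample of 9 cars, mean 29.43, S.D. 3
t, dof = t_cal(29.43, 31.0, 3.0, 9)

print(f"t cal = {t:.2f}, DOF = {dof}, t tab = {T_TAB}")
print("difference" if t > T_TAB else "no difference")
```

The same mean and S.D. gave a "taller claim" verdict with 36 cars. With only 9 cars the statistic is smaller and the critical value larger, so the evidence is no longer sufficient.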

2 Tailed Test…

• Tests with > or < signs (i.e. greater than, less than) are 1 tailed tests

• Tests with = or ≠ signs are 2 tailed tests

• “Cal” values remain unchanged in a 2 tailed test; only “Tab” values change.

• “Tab” values are determined from alpha / 2
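The effect of splitting alpha between two tails can be sketched numerically. The calculated value 1.69 below is borrowed from Example 3 purely as an illustration (that example is itself a 1 tailed test), to show that the same "Cal" value can pass one "Tab" value and fail the other.

```python
from statistics import NormalDist

alpha = 0.05
one_tailed_tab = NormalDist().inv_cdf(1 - alpha)      # whole alpha in one tail
two_tailed_tab = NormalDist().inv_cdf(1 - alpha / 2)  # alpha / 2 in each tail

z_cal = 1.69  # unchanged "Cal" value, borrowed from Example 3 for illustration

print(f"1 tailed Z tab = {one_tailed_tab:.3f}, 2 tailed Z tab = {two_tailed_tab:.3f}")
print("1 tailed:", "difference" if z_cal > one_tailed_tab else "no difference")
print("2 tailed:", "difference" if z_cal > two_tailed_tab else "no difference")
```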


2 Tailed test…

Example 5

Suppose we have a random sample of n = 25 measurements of chest circumference from a population of newborns with σ = 0.7 inches, and the sample mean is 12.6 inches. Is it likely that the population mean has the value µ = 13.0 inches? Consider alpha = 0.05. (Is x bar = Mu?)

1. It's a 2 tailed test… because equality is the concern

2. And it's a Z test… the S.D. of the population is known.

Test statistic: Z cal = |12.6 − 13.0| / (0.7 / √25) = 2.86

Z tab 0.025 = 1.96 (refer to table…)

Since Z cal > Z tab… There IS a DIFFERENCE

Hence… Inference (conclusion) is:

• Our observed value of x bar = 12.6 for the sample mean is too rare for us to believe that µ = 13.0
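Example 5 can be reproduced with a short standard-library sketch. The variable names are illustrative; the only difference from the 1 tailed examples is that the critical value is looked up at alpha / 2.

```python
from math import sqrt
from statistics import NormalDist

n, sigma = 25, 0.7          # sample size and known population S.D. (inches)
x_bar, mu_0 = 12.6, 13.0    # sample mean and hypothesised population mean
alpha = 0.05

z = abs(x_bar - mu_0) / (sigma / sqrt(n))
z_tab = NormalDist().inv_cdf(1 - alpha / 2)  # two-tailed: alpha / 2 in each tail
reject = z > z_tab

print(f"Z cal = {z:.2f}, Z tab = {z_tab:.2f}, difference: {reject}")
```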

Hypotheses terms and wordings

• Null Hypothesis… H0

– Statement of innocence made before test calculations

– There is NO difference

– H0 : x bar = μ

• Alternative Hypothesis…HA

– There is difference

– HA : x bar ≠ μ

• Hypothesis is Accepted, Rejected, Not accepted, Not rejected

• Acceptance, Rejection Region, Critical Value


Errors in Hypothesis Testing

A type I error consists of rejecting the null hypothesis H0 when it is true.

A type II error consists of not rejecting H0 when H0 is false.

α and β are the probabilities of type I and type II error, respectively (the so-called alpha and beta errors).

2 Samples Test

Example 6 A random sample of 15 students is taken from each of class C and class D. State whether there is a difference between the (mean) heights of students in class C and D (consider 5 % significance level).

Student height (cm)

Students in class C: 145 149 152 153 154 154 158 160 166 166 166 167 175 177 182

Students in class D: 148 153 157 161 162 162 163 167 172 172 175 177 183 185 187

It's a “2 tailed” “t” test


x̄1 = 161.6, x̄2 = 168.27

s1 = 10.86, s2 = 11.74

n1 = n2 = 15

t cal = 1.62, t tab 0.025,28 = 2.048

Since t cal < t tab… There is NO DIFFERENCE
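The two-sample calculation can be sketched from the raw heights. The slides do not show the formula used; the sketch assumes the pooled-variance form of the t statistic, which is consistent with the 28 degrees of freedom quoted for t tab. The function name `pooled_t` is illustrative, and the tab value is copied from the slide.

```python
from math import sqrt
from statistics import mean, stdev

class_c = [145, 149, 152, 153, 154, 154, 158, 160, 166, 166, 166, 167, 175, 177, 182]
class_d = [148, 153, 157, 161, 162, 162, 163, 167, 172, 172, 175, 177, 183, 185, 187]

def pooled_t(a, b):
    """Two-sample t statistic with pooled variance (assumed form); returns (|t|, DOF)."""
    n1, n2 = len(a), len(b)
    s1, s2 = stdev(a), stdev(b)
    pooled_var = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)
    t = abs(mean(a) - mean(b)) / sqrt(pooled_var * (1 / n1 + 1 / n2))
    return t, n1 + n2 - 2

T_TAB = 2.048  # t table value for alpha / 2 = 0.025, 28 DOF, from the slide

t, dof = pooled_t(class_c, class_d)
print(f"x1 = {mean(class_c):.2f}, x2 = {mean(class_d):.2f}")
print(f"s1 = {stdev(class_c):.2f}, s2 = {stdev(class_d):.2f}")
print(f"t cal = {t:.2f}, DOF = {dof}")
print("difference" if t > T_TAB else "no difference")
```

Computed without intermediate rounding, t cal comes out close to the slide's 1.62 and well below t tab, so the conclusion is unchanged.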

Paired Tests

Does the additive improve performance? Test it at the 5 % significance level

(Is the mileage more? … 1 tailed t test… sample size less than 30)


Determining t tab…

DOF = 9

Solution…


Thus Paired Test considers….

Confidence Interval (C.I.) of Mean

• Point estimate: we take a sample from the population and find its mean (X bar). With it we estimate the mean of the population.

• In some cases, the estimate of the mean is stated as an interval… called an interval estimate or Confidence Interval (C.I.)

• Formula for the C.I.:

\[ \bar{X} \pm Z \frac{\sigma}{\sqrt{n}} \]

“Z” needs to be replaced by “t” if n <= 30 and the S.D. of the population is not known.

Z and t are tabulated values and need two-tailed consideration.

If the S.D. of the population is not known, use that of the sample.

Alpha is generally taken as 0.05.


• Example: a sample of 20 sales invoices (bills) has a mean amount of $110.27 and a sample std dev of $28.95. Find the range in which the mean of all the invoices (in the population) lies. (To determine the C.I.)

n <= 30 and the S.D. of the population is not known… so use t: t tab 0.025, 19 = 2.093

\[ \bar{X} \pm t_{n-1} \frac{s}{\sqrt{n}} = 110.27 \pm 2.093 \times \frac{28.95}{\sqrt{20}} = 110.27 \pm 13.55 \]

or $96.72 to $123.82

Hence… Inference (conclusion) is:

• The mean amount of all the invoices in the population will lie between 96.72 and 123.82

• The surety level (confidence level) for the above statement is 95 %

• Or: for 95 % of samples, an interval built this way will contain the population mean

As the confidence level increases… the range (C.I.) increases
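The invoice interval can be recomputed with a short sketch. The helper name `mean_ci` is illustrative, and the t tab value is taken from the slide rather than computed, since the standard library has no t distribution.

```python
from math import sqrt

def mean_ci(x_bar, s, n, tab):
    """Confidence interval for the mean: x_bar +/- tab * s / sqrt(n)."""
    margin = tab * s / sqrt(n)
    return x_bar - margin, x_bar + margin

T_TAB = 2.093  # t table value for alpha / 2 = 0.025, 19 DOF, from the slide

# Invoice example: n = 20, mean $110.27, sample std dev $28.95
low, high = mean_ci(110.27, 28.95, 20, T_TAB)
print(f"95% C.I. for the mean: ${low:.2f} to ${high:.2f}")
```

Passing a larger tab value (a higher confidence level) to `mean_ci` widens the interval, which is the point made on the slide.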

Confidence Interval (C.I.) for Proportion

• CI for p is given below, with ps = sample proportion

– When n is large use Z value

– When n is small use t value

\[ p_s \pm z \sqrt{\frac{p_s (1 - p_s)}{n}} \]


Example

• Sample of 100 sales invoices, 10 have errors; what is the 95% CI?

• ps = 10/100 = 0.1

• 95% CI for P is:

\[ 0.1 \pm 1.96 \sqrt{\frac{0.1 \times 0.9}{100}} = 0.1 \pm 0.0588 \]
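The proportion interval can be checked with the sketch below, which uses the large-sample Z form. The function name `proportion_ci` and its `confidence` parameter are illustrative choices, not from the slides.

```python
from math import sqrt
from statistics import NormalDist

def proportion_ci(successes, n, confidence=0.95):
    """Large-sample (Z) confidence interval for a population proportion."""
    p_s = successes / n
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # two-tailed Z value
    margin = z * sqrt(p_s * (1 - p_s) / n)
    return p_s - margin, p_s + margin

# 10 invoices with errors out of a sample of 100
low, high = proportion_ci(10, 100)
print(f"95% CI for P: {low:.4f} to {high:.4f}")
```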

P - Value

The P-value is the smallest level of significance at which H0 would be rejected when a specified test procedure is used on a given data set.

1. P-value ≤ α ⇒ reject H0 at a level of α

2. P-value > α ⇒ do not reject H0 at a level of α


P - Value

The P-value is the probability, calculated assuming H0 is true, of obtaining a test statistic value at least as contradictory to H0 as the value that actually resulted. The smaller the P-value, the more contradictory is the data to H0.
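As an illustration of this definition, the sketch below computes the P-value for the mango farm data of Example 1, treated as a 1 tailed (greater than) Z test. The function name `upper_tail_p_value` is an illustrative choice; the slides themselves do not work out a P-value for this example.

```python
from math import sqrt
from statistics import NormalDist

def upper_tail_p_value(z):
    """P(Z >= z) under H0 for a 1 tailed (greater than) Z test."""
    return 1 - NormalDist().cdf(z)

# Example 1 data: mu = 500, sigma = 96, n = 50, x bar = 535
z = (535 - 500) / (96 / sqrt(50))
p = upper_tail_p_value(z)
alpha = 0.05

print(f"Z cal = {z:.2f}, P-value = {p:.4f}")
print("reject H0" if p <= alpha else "do not reject H0")
```

The P-value comes out well below 0.05, so comparing it with α gives the same decision as comparing Z cal with Z tab.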