Copyright (c) Bani K. Mallick. STAT 651 Lecture 6.

STAT 651

Lecture 6

Topics in Lecture #6

The language of hypothesis testing

Hypothesis tests are carried out using confidence intervals

Z-tests are also possible

P-values as a way of not having to do the mechanics of many hypothesis tests

Statistical power

Book Sections Covered in Lecture #6

Chapter 5.4

Chapter 5.6

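Lecture 5 Review: Confidence Interval for a Population Mean when $\sigma$ is Known

Want 90%, 95% and 99% chance of interval including $\mu$.

90%: $\bar{X} - 1.645\,\sigma/\sqrt{n}$ to $\bar{X} + 1.645\,\sigma/\sqrt{n}$

95%: $\bar{X} - 1.96\,\sigma/\sqrt{n}$ to $\bar{X} + 1.96\,\sigma/\sqrt{n}$

99%: $\bar{X} - 2.58\,\sigma/\sqrt{n}$ to $\bar{X} + 2.58\,\sigma/\sqrt{n}$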

Lecture 5 Review: Confidence Intervals

There is a general formula given on page 200

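If you want a $(1-\alpha)100\%$ confidence interval for the population mean when the population s.d. $\sigma$ is known, use the formula

$\bar{X} - z_{\alpha/2}\,\sigma/\sqrt{n}$ to $\bar{X} + z_{\alpha/2}\,\sigma/\sqrt{n}$

The term $z_{\alpha/2}$ is the value in Table 1 that gives probability $1 - \alpha/2$.

$\alpha = 0.10$, $z_{\alpha/2} = 1.645$; $\alpha = 0.05$, $z_{\alpha/2} = 1.96$; $\alpha = 0.01$, $z_{\alpha/2} = 2.58$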
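A minimal sketch of the general formula, assuming Python's standard-library `statistics.NormalDist` stands in for Table 1. The function names `z_critical` and `z_confidence_interval` are illustrative and do not come from the lecture.

```python
from math import sqrt
from statistics import NormalDist


def z_critical(alpha):
    """Return z_{alpha/2}: the standard normal value with probability 1 - alpha/2 below it."""
    return NormalDist().inv_cdf(1 - alpha / 2)


def z_confidence_interval(xbar, sigma, n, alpha):
    """Return the (1 - alpha)100% interval for the mean when sigma is known."""
    half_width = z_critical(alpha) * sigma / sqrt(n)
    return xbar - half_width, xbar + half_width


# Print the three critical values used in the lecture.
for alpha in (0.10, 0.05, 0.01):
    print(alpha, round(z_critical(alpha), 3))
```

The value for alpha = 0.01 comes out near 2.576, which the slides round to 2.58.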

Lecture 5 Review: Sample Size Determination

I want the length of a confidence interval to be

$2 \times E$

then the sample size I need is

$n = z_{\alpha/2}^2\,\sigma^2 / E^2$
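The sample size rule can be sketched as follows. The function name `sample_size` and the value E = 100 are made up for illustration. Rounding up to a whole number is an added assumption, since n must be an integer.

```python
from math import ceil
from statistics import NormalDist


def sample_size(sigma, E, alpha):
    """Smallest n for which the (1 - alpha)100% interval has length at most 2 * E."""
    z = NormalDist().inv_cdf(1 - alpha / 2)  # z_{alpha/2}
    return ceil((z * sigma / E) ** 2)        # round up: n must be a whole number


# Hypothetical example: sigma = 600, half-length E = 100, 95% confidence.
print(sample_size(sigma=600, E=100, alpha=0.05))
```

Halving E roughly quadruples the required n, because E enters the formula squared.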

Lecture 5 Review

Which changes make confidence intervals longer?

Sample sizes?

Population standard deviation?

Degree of confidence?

Hypothesis Testing: Beginnings

Suppose you want to know whether the population mean change in reported caloric intake equals zero

Note the emphasis on population

There is a reasonably elaborate structure to test such a hypothesis

Computers make this relatively simple, for simple problems, but you have to understand the language

Hypothesis Testing

Suppose you want to know whether the population mean change in reported caloric intake equals zero

We have already done this!!!!!

Confidence intervals tell you where the population mean is, with specified probability

If zero is not in the confidence interval, then what?

Hypothesis Testing

Can be thought of as a framework for decision making

I like to emphasize confidence intervals, since they give more information

The book talks about 1-tailed and 2-tailed tests. We will do only 2-tailed tests.

The Null Hypothesis

Begin with a hypothesis:

$H_0: \mu =$ hypothesized value (say 0)

This is called the null hypothesis

It is always of the form given above.

The Alternative Hypothesis

Write down possible alternatives:

$H_A: \mu \neq$ hypothesized value

This is the alternative hypothesis

For our course, it is always two-sided

$H_0: \mu =$ hypothesized value

$H_A: \mu \neq$ hypothesized value

Type I Error (False Reject)

A Type I error occurs when you say that the null hypothesis is false when in fact it is true

You can never know for certain whether or not you have made such an error

You can only control the probability that you make such an error

It is conventional to make the probability of a Type I error 5%, although 1% and 10% are also used

Type I Error Rates

Choose a confidence level, call it $1 - \alpha$

The Type I error rate is $\alpha$

90% confidence interval: $\alpha$ = 10%

95% confidence interval: $\alpha$ = 5%

99% confidence interval: $\alpha$ = 1%

Choose a Decision Rule

You can reject $H_0$

Or you can NOT reject $H_0$

For reasons described later, we never accept $H_0$

That’s a trick question on an exam!

Type II: The Other Kind of Error

The other type of error occurs when you do NOT reject $H_0$ even though it is false

This often occurs because your study sample size is too small to detect meaningful departures from $H_0$

Statisticians spend a lot of time trying to figure out a priori if a study is large enough to detect meaningful departures from a null hypothesis

Making Your Decision

You have selected your level of confidence $1-\alpha$. This means your Type I error rate is $\alpha$

You have a null hypothesis

You form a confidence interval

If the hypothesized value is not in the confidence interval, you reject $H_0$ and say it is false

Otherwise, you cannot reject $H_0$
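The decision steps can be sketched in a few lines, assuming the known-sigma interval from the Lecture 5 review. The function name `ci_decision` and the numbers in the example (sample mean 2.0, sigma 10) are invented for illustration.

```python
from math import sqrt
from statistics import NormalDist


def ci_decision(xbar, sigma, n, mu0, alpha=0.05):
    """Reject H0: mu = mu0 exactly when mu0 falls outside the (1 - alpha)100% interval."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    half_width = z * sigma / sqrt(n)
    lower, upper = xbar - half_width, xbar + half_width
    if lower <= mu0 <= upper:
        return "cannot reject H0"   # note: never "accept H0"
    return "reject H0"


# Hypothetical data: same sample mean, two different sample sizes.
print(ci_decision(xbar=2.0, sigma=10, n=25, mu0=0))
print(ci_decision(xbar=2.0, sigma=10, n=400, mu0=0))
```

The same sample mean leads to different decisions at n = 25 and n = 400, because the interval shrinks as n grows.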

The Z-Test

I like confidence intervals because they tell you two things

They tell you where the population mean is

The length of the confidence interval tells you if your study sample size is too small to be useful

The Z-Test

There is an equivalent procedure for performing the hypothesis test, called the Z-test

It is not as useful as a confidence interval, because it does not give you the confidence interval

You actually rarely see it anymore, but sometimes you do

The Z-Test

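Form the Z-statistic for testing $H_0: \mu = \mu_0$ against $H_A: \mu \neq \mu_0$:

$Z = \dfrac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}$

Reject the null hypothesis if

($\alpha$ = 0.01, confidence = 99%) $|Z| > 2.58$

($\alpha$ = 0.05, confidence = 95%) $|Z| > 1.96$

($\alpha$ = 0.10, confidence = 90%) $|Z| > 1.645$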

WISH Data Again

We already know that the population mean change in reported caloric intake is not zero with 99% confidence

$\sigma = 600$, $n = 271$, $\bar{X} = -180$, $H_0: \mu = 0$

$z_{\alpha/2} = 2.58$

$Z = \dfrac{\bar{X} - 0}{\sigma/\sqrt{n}} = \dfrac{-180 - 0}{600/\sqrt{271}} = -4.9$

$|Z| = 4.9 > 2.58$

Reject the null hypothesis!
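The WISH calculation can be checked with a short sketch. The numbers are the ones on the slide. The function name `z_statistic` is illustrative.

```python
from math import sqrt


def z_statistic(xbar, mu0, sigma, n):
    """Z = (xbar - mu0) / (sigma / sqrt(n)) for a known population s.d."""
    return (xbar - mu0) / (sigma / sqrt(n))


# WISH data from the slide: sigma = 600, n = 271, sample mean = -180, H0: mu = 0.
z = z_statistic(xbar=-180, mu0=0, sigma=600, n=271)
reject = abs(z) > 2.58  # alpha = 0.01 cutoff from the Z-test slide
print(round(z, 2), reject)
```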

P-values

Ubiquitous in journals

You need to know what they are

They are simply bookkeeping devices to save journal space, since you cannot do 3 confidence intervals for each hypothesis

Small p-values indicate that you have rejected the null hypothesis

Simple stuff!!!

P-values

Small p-values indicate that you have rejected the null hypothesis

If p < 0.10, this means that you have rejected the null hypothesis with a confidence interval of 90% or a Type I error rate of 0.10

If p > 0.10, you did not reject the null hypothesis at these levels

P-values

Small p-values indicate that you have rejected the null hypothesis

If p < 0.05, this means that you have rejected the null hypothesis with a confidence interval of 95% or a Type I error rate of 0.05

If p > 0.05, you did not reject the null hypothesis at these levels

P-values

Small p-values indicate that you have rejected the null hypothesis

If p < 0.01, this means that you have rejected the null hypothesis with a confidence interval of 99% or a Type I error rate of 0.01

If p > 0.01, you did not reject the null hypothesis at these levels

WISH Data Again

The 99% CI did not include 0. Which of the following are true, if any?

p < 0.01

p < 0.05

p < 0.10

p > 0.01

p > 0.05

p > 0.10

A Little Theory

Suppose that the p-value = 0.032. Which confidence intervals, if any, include the hypothesized value?

90%

95%

99%

A Little Theory

Suppose that the p-value = 0.002. Which confidence intervals, if any, include the hypothesized value?

90%

95%

99%

A Little Theory

Suppose that the p-value = 0.092. Which confidence intervals, if any, include the hypothesized value?

90%

95%

99%

A Little Theory

Suppose that the p-value = 0.122. Which confidence intervals, if any, include the hypothesized value?

90%

95%

99%
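One way to work through these exercises is to apply the rule from the P-values slides directly: the (1 - alpha)100% interval excludes the hypothesized value when p < alpha. The sketch below assumes the interval and the p-value come from the same test. The names `ALPHAS` and `intervals_including_null` are illustrative.

```python
# Type I error rate that goes with each confidence level.
ALPHAS = {"90%": 0.10, "95%": 0.05, "99%": 0.01}


def intervals_including_null(p_value):
    """List the confidence intervals that still include the hypothesized value."""
    return [label for label, alpha in ALPHAS.items() if p_value > alpha]


for p in (0.032, 0.002, 0.092, 0.122):
    print(p, intervals_including_null(p))
```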

Hormone Assay Data

Two assay methods for measuring the amount of a hormone

Reference Method: the old standby

Test Method: A new, cheaper method

I was asked by Becton Dickinson Company to say whether the Test method was a reliable substitute for the Reference method

Hormone Assay Data

Q-Q Plot for differences: Test - Reference

[Figure: Normal Q-Q Plot of Difference: Test - Reference. Horizontal axis: Observed Value; vertical axis: Expected Normal Value.]

Hormone Assay Data

Q-Q plot of differences of logarithms: log(test) - log(reference)

[Figure: Normal Q-Q Plot of log(Test / Reference). Horizontal axis: Observed Value; vertical axis: Expected Normal Value.]

Hormone Assay Data

Histogram for differences: Test - Reference

[Figure: histogram of Difference: Test - Reference. Std. Dev = 3.48, Mean = -0.4, N = 85.]

Hormone Assay Data

Histogram of differences of logarithms: log(test) - log(reference)

[Figure: histogram of log(Test / Reference). Std. Dev = 0.28, Mean = -0.11, N = 85.]

Hormone Assay Data

Box Plot for differences: Test - Reference

[Figure: box plot of Difference: Test - Reference, N = 85. Vertical axis from -20 to 20.]

Hormone Assay Data

Box plot of differences of logarithms: log(test) - log(reference)

[Figure: box plot of log(Test / Reference), N = 85. Vertical axis from -1.0 to 0.8.]

Hormone Assay Data

It seems to make sense to use the log(Test) and log(Reference) Data, since log(Test) - log(Reference) seems more Gaussian

Hormone Assay Data

n = 85

For log data, log(test) - log(reference)

Sample mean = -0.1083

Sample s.d. = 0.2761

Sample std. error = 0.0299

p-value < 0.01: what is the hypothesis and what is the conclusion?

Hormone Assay Data

For log data, X = log(test) - log(reference)

Sample mean = -0.1083

Sample std. error = 0.0299

Hypothesis: X has population mean = 0

p-value < 0.01: what is the conclusion?

Hormone Assay Data

For log data, X = log(test) - log(reference)

Sample mean = -0.1083

Sample std. error = 0.0299

Hypothesis: X has population mean = 0

p-value < 0.01: what is the conclusion? That a 99% confidence interval did not include zero, and hence you reject the null hypothesis that the population mean = 0

99% CI from -0.1872 to -0.0294
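A rough check of this interval, assuming the normal critical value (about 2.58) and the sample mean and standard error from the slide. The slide's interval is slightly wider than what this sketch gives. That would be consistent with a t-based critical value having been used, but that is a guess, not something the slides state.

```python
from statistics import NormalDist

mean_diff = -0.1083  # sample mean of log(test) - log(reference), from the slide
std_error = 0.0299   # sample standard error, from the slide

z = NormalDist().inv_cdf(0.995)  # two-sided 99% critical value, about 2.58
lower = mean_diff - z * std_error
upper = mean_diff + z * std_error
excludes_zero = not (lower <= 0 <= upper)

print(round(lower, 4), round(upper, 4), excludes_zero)
```

Either way the whole interval lies below zero, so the conclusion is the same.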

Hormone Assay Data

For raw data, X = test - reference

Sample mean = -0.4424

Sample std. error = 0.3771

Hypothesis: X has population mean = 0

99% CI from -1.44 to 0.55

Is p < 0.01 for this scale?

p = 0.244
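The two scales can be compared with a small sketch that turns a sample mean and standard error into a two-sided p-value. It assumes a standard normal reference distribution, so the raw-scale result lands near, but not exactly on, the 0.244 reported on the slide. The function name `two_sided_p_value` is illustrative.

```python
from statistics import NormalDist


def two_sided_p_value(mean, std_error, mu0=0.0):
    """Two-sided p-value for H0: mu = mu0, using a standard normal reference."""
    z = (mean - mu0) / std_error
    return 2 * (1 - NormalDist().cdf(abs(z)))


p_raw = two_sided_p_value(-0.4424, 0.3771)  # X = test - reference
p_log = two_sided_p_value(-0.1083, 0.0299)  # X = log(test) - log(reference)

print(round(p_raw, 3), p_log < 0.01)
```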

Hormone Assay Data

For raw data, X = test - reference, p = 0.244

For log data, X = log(test) - log(reference), p < 0.001

WOW!!! Which is right?

Hormone Assay Data

For raw data, X = test - reference, p = 0.244

For log data, X = log(test) - log(reference), p < 0.001

WOW!!! Which is right?

I believe the log data more: more nearly normal, no outliers, nice histogram

Plus, a nonparametric test (later) has p < 0.001

Statistical Power

Statistical power is defined as the probability that you will reject the null hypothesis when you should reject it.

If $\beta$ is the Type II error rate, power = $1 - \beta$

The Type I error (test level) does NOT depend on the sample size: you chose it (5%?)

The power depends crucially on the sample size

Statistical Power

Statistical power is defined as the probability that you will reject the null hypothesis when you should reject it.

The power depends crucially on the sample size

If you have a very small sample size (n), then you will have low power, i.e., a small chance of finding an effect even if it is there

Statistical Power

Statistical power is defined as the probability that you will reject the null hypothesis when you should reject it.

If you have a very small sample size (n), then you will have low power, i.e., a small chance of finding an effect even if it is there

This is why we never accept the null hypothesis: because we can manipulate through n the chance of rejecting it.
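To make the dependence on n concrete, here is a sketch of the power of the two-sided Z-test when the population s.d. is known. The true mean of -180 is purely an assumption for illustration, and the function name `power_two_sided_z` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def power_two_sided_z(mu_true, mu0, sigma, n, alpha=0.01):
    """Probability that |Z| exceeds z_{alpha/2} when the population mean is mu_true."""
    std_normal = NormalDist()
    z = std_normal.inv_cdf(1 - alpha / 2)
    shift = (mu_true - mu0) / (sigma / sqrt(n))  # mean of Z under mu_true
    # Z ~ Normal(shift, 1): add the chance of landing in either rejection tail.
    return std_normal.cdf(-z - shift) + (1 - std_normal.cdf(z - shift))


# Assumed true mean -180, sigma = 600, alpha = 0.01.
for n in (10, 100, 271):
    print(n, round(power_two_sided_z(-180, 0, 600, n), 3))
```

When the true mean equals the hypothesized value, the function returns alpha itself, whatever the sample size. That matches the point that the Type I error rate does not depend on n.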

Numerical Illustration

WISH again, $\sigma = 600$, sample mean = -180

Hypothesized value for pop. mean = 0

Set $\alpha = 0.01$, 99% CI

n = 1, |Z| = 0.30, NOT Reject

n = 10, |Z| = 0.95, NOT Reject

n = 20, |Z| = 1.34, NOT Reject

n = 100, |Z| = 3.00, Reject

n = 271, |Z| = 4.94, Reject
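The table can be reproduced, up to rounding, by holding the sample mean and s.d. fixed and changing only n. The helper names `abs_z` and `decision` are illustrative.

```python
from math import sqrt

SIGMA = 600    # population s.d. from the slide
XBAR = -180    # sample mean from the slide
MU0 = 0        # hypothesized population mean
Z_CRIT = 2.58  # cutoff for alpha = 0.01


def abs_z(n):
    """|Z| for sample size n with the sample mean held fixed."""
    return abs(XBAR - MU0) / (SIGMA / sqrt(n))


def decision(n):
    return "Reject" if abs_z(n) > Z_CRIT else "NOT Reject"


for n in (1, 10, 20, 100, 271):
    print(f"n = {n}, |Z| = {abs_z(n):.2f}, {decision(n)}")
```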