Mrs. Venneman · Web viewChapter 9: Testing a Claim 11 9.1: Significance Tests: The Basics Learning...

25
Chapter 9: Testing a Claim 9.1: Significance Tests: The Basics Learning Objectives -State the null and alternative hypotheses for a significance test about a population parameter. -Interpret a P-value in context. -Determine if the results of a study are statistically significant and draw an appropriate conclusion using a significance level. -Interpret Type I and Type II errors in context, and give a consequence of each. Significance Test A significance test is a formal procedure for comparing observed data with a claim (also called a hypothesis) whose truth we want to assess. The claim is a statement about a parameter, like the population proportion p or the population mean µ. We express the results of a significance test in terms of a probability that measures how well the data and the claim agree. Stating Hypotheses The claim we weigh evidence against in a statistical test is called the null hypothesis (H 0 ). Often the null hypothesis is a statement of “no difference.” The claim about the population that we are trying to find evidence for is the alternative hypothesis (H a ). In the free-throw shooter example, our hypotheses are where p is the long-run proportion of made free throws. In any significance test, the null hypothesis has the form H 0 : parameter = value The alternative hypothesis has one of the forms H a : parameter < value H a : parameter > value H a : parameter ≠ value The alternative hypothesis is one-sided if it states that a parameter is larger than the null hypothesis value or if it states that the parameter is smaller than the null value. 1

Transcript of Mrs. Venneman · Web viewChapter 9: Testing a Claim 11 9.1: Significance Tests: The Basics Learning...

Page 1: Mrs. Venneman · Web viewChapter 9: Testing a Claim 11 9.1: Significance Tests: The Basics Learning Objectives-State the null and alternative hypotheses for a significance test about

Chapter 9: Testing a Claim

9.1: Significance Tests: The BasicsLearning Objectives-State the null and alternative hypotheses for a significance test about a population parameter.-Interpret a P-value in context.-Determine if the results of a study are statistically significant and draw an appropriate conclusion using a significance level.-Interpret Type I and Type II errors in context, and give a consequence of each.

Significance TestA significance test is a formal procedure for comparing observed data with a claim (also called a hypothesis) whose truth we want to assess. The claim is a statement about a parameter, like the population proportion p or the population mean µ. We express the results of a significance test in terms of a probability that measures how well the data and the claim agree.

Stating HypothesesThe claim we weigh evidence against in a statistical test is called the null hypothesis (H0). Often the null hypothesis is a statement of “no difference.”The claim about the population that we are trying to find evidence for is the alternative hypothesis (Ha).

In the free-throw shooter example, our hypotheses are

where p is the long-run proportion of made free throws.

In any significance test, the null hypothesis has the formH0 : parameter = value

The alternative hypothesis has one of the formsHa : parameter < value Ha : parameter > valueHa : parameter ≠ value

The alternative hypothesis is one-sided if it states that a parameter is larger than the null hypothesis value or if it states that the parameter is smaller than the null value.

It is two-sided if it states that the parameter is different from the null hypothesis value (it could be either larger or smaller).

The hypotheses should express the hopes or suspicions we have before we see the data. It is cheating to look at the data first and then frame hypotheses to fit what the data show.

Hypotheses always refer to a population, not to a sample. Be sure to state H0 and Ha in terms of population parameters.

It is never correct to write a hypothesis about a sample statistic, such as p̂=0.64∨x=85.

1

Page 2: Mrs. Venneman · Web viewChapter 9: Testing a Claim 11 9.1: Significance Tests: The Basics Learning Objectives-State the null and alternative hypotheses for a significance test about

Chapter 9: Testing a Claim

Example: For each of the following scenarios, define the parameter of interest and state appropriate hypotheses.

(a) According to the Web site sleepdeprivation.com, 85% of teens are getting less than 8 hours of sleep a night. Jani wonders whether this result holds in her large high school. She asks an SRS of 100 students at the school how much sleep they get on a typical night. In all, 75 of the responders said less than 8 hours.

(b) Tim is an engineer who is responsible for quality control in the manufacture of certain parts of fighter jets. Tim knows that the mean diameter of a certain rivet hole is supposed to be = 0.250 inches with a standard deviation of = 0.003 inches. He is hoping that a newly developed drill bit will cut these rivet holes so that the diameter is more consistent (less variable).

P-value

Small P-values are evidence against H0 because they say that the observed result is unlikely to occur when H0 is true.

Large P-values fail to give convincing evidence against H0 because they say that the observed result is likely to occur by chance when H0 is true.

2

Page 3: Mrs. Venneman · Web viewChapter 9: Testing a Claim 11 9.1: Significance Tests: The Basics Learning Objectives-State the null and alternative hypotheses for a significance test about

Chapter 9: Testing a Claim

Example: When Tim was testing a new drill bit, the hypotheses were

: = 0.003

: < 0.003where = the true standard deviation of the diameters of the rivet holes made with the new drill bit.

Based on a sample of holes made with the new drill bit, the standard deviation was = 0.002 inch. A significance test using Tim’s sample data produced a P-value of 0.17. Interpret the P-value in this context.

Statistically SignificantIf the P-value is smaller than alpha, we say that the data are statistically significant at level α. That Greek lower-case letter alpha, α, is called the significance level. α is chosen before conducting a hypothesis test (or significance test).In that case, we reject the null hypothesis H0 and conclude that there is convincing evidence in favor of the alternative hypothesis Ha.

Conclusions to a Significance TestWhen we use a fixed level of significance to draw a conclusion in a significance test, there are two possible conclusions.

P-value < α → reject H0 → convincing evidence for Ha

Reject H0 and conclude that we have significant (or convincing) evidence that the Ha is true.

P-value ≥ α → fail to reject H0 → not convincing evidence for Ha

Fail to reject H0 and conclude that we do not have significant (or convincing) evidence that the Ha is true.

Common errors that students make in their conclusions:“Accepting” H0.Making statements about samples or statistics (past tense often implies a reference to samples).

3

Page 4: Mrs. Venneman · Web viewChapter 9: Testing a Claim 11 9.1: Significance Tests: The Basics Learning Objectives-State the null and alternative hypotheses for a significance test about

Chapter 9: Testing a Claim

Example: A student decided to investigate whether students at his school prefer the taste of a certain name-brand bottled water to a certain store brand bottled water. After collecting data, the student performed a significance test

using the hypotheses : p = 0.5 versus : p > 0.5 where p = the true proportion of students at the school who prefer the name-brand water. The resulting P-value was 0.067. What conclusion would you make at each of the following significance levels?(a) = 0.10

(b) = 0.05

Choosing a Significance Level-How plausible is H 0? If H 0represents an assumption that the people you must convince have believed for years, strong evidence (a very small P-value) will be needed to persuade them. -What are the consequences of rejecting H 0? If rejecting H 0 in favor of H a means making an expensive change of some kind, you need strong evidence that the change will be beneficial.-What are the consequences of incorrectly rejecting H 0? If they are very costly, we should lower the significance

4

Page 5: Mrs. Venneman · Web viewChapter 9: Testing a Claim 11 9.1: Significance Tests: The Basics Learning Objectives-State the null and alternative hypotheses for a significance test about

Chapter 9: Testing a Claimlevel.

Homework: pg. 551-552 #1-13 oddType I and Type II Error

Significance and Type I ErrorThe significance level α of any fixed-level test is the probability of a Type I error. That is, α is the probability that the test will reject the null hypothesis H0 when H0 is actually true. Consider the consequences of a Type I error before choosing a significance level.

5

Page 6: Mrs. Venneman · Web viewChapter 9: Testing a Claim 11 9.1: Significance Tests: The Basics Learning Objectives-State the null and alternative hypotheses for a significance test about

Chapter 9: Testing a Claim

Example: A potato chip producer and its main supplier agree that each shipment of potatoes must meet certain quality standards. If the producer determines that more than 8% of the potatoes in the shipment have “blemishes,” the truck will be sent away to get another load of potatoes from the supplier. Otherwise, the entire truckload will be used to make potato chips. To make the decision, a supervisor will inspect a random sample of potatoes from the shipment. The producer will then perform a significance test using the hypothesesH0 : p = 0.08Ha : p > 0.08where p is the actual proportion of potatoes with blemishes in a given truckload.

Describe a Type I and a Type II error in this setting, and explain the consequences of each.

6

Page 7: Mrs. Venneman · Web viewChapter 9: Testing a Claim 11 9.1: Significance Tests: The Basics Learning Objectives-State the null and alternative hypotheses for a significance test about

Chapter 9: Testing a Claim

HW: pg. 552-554 #19-23 odd, 25-289.2: Tests about a Population Proportion

Learning Objectives-State and check the Random, 10%, and Large Counts conditions for performing a significance test about a population proportion.-Perform a significance test about a population proportion.-Use a confidence interval to draw a conclusion for a two-sided test about a population proportion.-Interpret the power of a test and describe what factors affect the power of a test.-Describe the relationship among the probability of a Type I error (significance level), the probability of a Type II error, and the power of a test.

Conditions for Performing a Significance Test About a ProportionRandom: The data come from a well-designed random sample or randomized experiment.

Normal (Large Counts): Both n p0 and n(1−p¿¿0)¿ are at least 10.

Independent (10% Rule): When sampling without replacement, check that n≤ 110N .

Test StatisticA test statistic measures how far a sample statistic diverges from what we would expect if the null hypothesis H0 were true, in standardized units. That is,

4 Step Process for Significance TestsState: What hypotheses do you want to test, and at what significance level? Define any parameters you use. Plan: Choose the appropriate inference method. Check conditions.Do: If the conditions are met, perform calculations.

• Compute the test statistic. • Find the P-value.

Conclude: Make a decision about the hypotheses in the context of the problem.

One Sample z-Test for a ProportionChoose an SRS of size n from a large population that contains an unknown proportion p of successes. To test the hypothesis H0 : p = p0, compute the z statistic

7

Page 8: Mrs. Venneman · Web viewChapter 9: Testing a Claim 11 9.1: Significance Tests: The Basics Learning Objectives-State the null and alternative hypotheses for a significance test about

Chapter 9: Testing a Claim

Find the P-value by calculating the probability of getting a z statistic this large or larger in the direction specified by the alternative hypothesis Ha:

CALCULATOR: STAT→TESTS→5:1-PropZTestp0: value of null hypothesisx: number of successesn: sample sizeprop: (Choose ≠ ,<¿>¿

Example: Can you Taste the Rainbow? Many people claim that they can taste the different colors of Skittles. Today we will conduct an experiment and significance test to see if they are correct. You will each try to guess the color of 10 skittles. The flavors are: strawberry (red), orange (orange), lemon (yellow), grape (purple), and green apple (green). We will use all of the skittles as a random sample, with p = true proportion of correct guesses. We will then conduct a significance test at α = 0.05.

8

Page 9: Mrs. Venneman · Web viewChapter 9: Testing a Claim 11 9.1: Significance Tests: The Basics Learning Objectives-State the null and alternative hypotheses for a significance test about

Chapter 9: Testing a Claim

9

Page 10: Mrs. Venneman · Web viewChapter 9: Testing a Claim 11 9.1: Significance Tests: The Basics Learning Objectives-State the null and alternative hypotheses for a significance test about

Chapter 9: Testing a Claim

Example: A potato-chip producer has just received a truckload of potatoes from its main supplier. If the producer determines that more than 8% of the potatoes in the shipment have blemishes, the truck will be sent away to get another load from the supplier. A supervisor selects a random sample of 500 potatoes from the truck. An inspection reveals that 47 of the potatoes have blemishes. Carry out a significance test at the α = 0.10 significance level. What should the producer conclude?

10

Page 11: Mrs. Venneman · Web viewChapter 9: Testing a Claim 11 9.1: Significance Tests: The Basics Learning Objectives-State the null and alternative hypotheses for a significance test about

Chapter 9: Testing a Claim

Why Confidence Intervals Give More InformationThe result of a significance test is basically a decision to reject H0 or fail to reject H0. When we reject H0, we’re left wondering what the actual proportion p might be. A confidence interval might shed some light on this issue.

Example: Find the 90% confidence interval for the previous example.

Confidence Intervals and Two-Sided TestsThe 95% confidence interval gives an approximate range of p0’s that would not be rejected by a two-sided test at the α = 0.05 significance level.

A two-sided test at significance level α (say, α = 0.05) and a 100(1 –α)% confidence interval (a 95% confidence interval if α = 0.05) give similar information about the population parameter.

Homework: pg. 570-572 #31-39 odd, 45-51 odd

Power of a Test

11

Page 12: Mrs. Venneman · Web viewChapter 9: Testing a Claim 11 9.1: Significance Tests: The Basics Learning Objectives-State the null and alternative hypotheses for a significance test about

Chapter 9: Testing a Claim

Example: The potato-chip producer wonders whether the significance test of H0 : p = 0.08 versus Ha : p > 0.08 based on a random sample of 500 potatoes has enough power to detect a shipment with, say, 11% blemished potatoes. In this case, a particular Type II error is to fail to reject H0 : p = 0.08 when p = 0.11.

If the claim is true, or p = 0.08, we would reject the null for any sample proportion above 0.0999 and fail to reject the null for any sample value below 0.0999. Since we always assume the claim is true, we will make these conclusions regardless of the true value of p.

Now let’s look at what would happen if the true value was p = 0.11. Since we always assume the null is true, we will still reject if p̂>0.0999 and fail to reject if p̂<0.0999, but this boundary is at a different spot since we are looking at a new sampling distribution-one centered at p = 0.11. If we reject the null, we made the correct decision. This has probability 0.7517 and is called the power of the test. If we fail to reject the null, we have made a Type II error. This has probability 0.2483.

12

Page 13: Mrs. Venneman · Web viewChapter 9: Testing a Claim 11 9.1: Significance Tests: The Basics Learning Objectives-State the null and alternative hypotheses for a significance test about

Chapter 9: Testing a Claim

Power and Type II Error

How to Increase the Power of a Test

13

Page 14: Mrs. Venneman · Web viewChapter 9: Testing a Claim 11 9.1: Significance Tests: The Basics Learning Objectives-State the null and alternative hypotheses for a significance test about

Chapter 9: Testing a Claim

Homework: pg. 572-574 #53-57 odd, 59-62

9.3: Tests about a Population MeanLearning Objectives-State and check the Random, 10%, and Normal/Large Counts conditions for performing a significance test about a population mean.-Perform a significance test about a population mean.-Use a confidence interval to draw a conclusion for a two-sided test about a population mean.-Perform a significance test about a mean difference using paired data.

Conditions For Performing a Significance Test About a Mean-Random: The data come from a well-designed random sample or randomized experiment.-Normal (Large Sample): The population has a Normal distribution or the sample size is large (n ≥ 30). If the population distribution has unknown shape and n < 30, use a graph of the sample data to assess the Normality of the population. Do not use t procedures if the graph shows strong skewness or outliers.-Independent (10% Rule): When sampling without replacement, check that n ≤ (1/10)N.

One Sample t-Test for a MeanChoose an SRS of size n from a large population that contains an unknown mean µ. To test the hypothesis H0 : µ = µ0, compute the one-sample t statistic

Find the P-value by calculating the probability of getting a t statistic this large or larger in the direction specified by the alternative hypothesis Ha in a t-distribution with df = n – 1.

14

Page 15: Mrs. Venneman · Web viewChapter 9: Testing a Claim 11 9.1: Significance Tests: The Basics Learning Objectives-State the null and alternative hypotheses for a significance test about

Chapter 9: Testing a Claim

ON YOUR CALCULATOR: STAT→TESTS2: T-Test Inpt: Data (Use if data is entered in list) OR Stats (Use if you know the mean, st. dev, etc.) μ0: (null hyp) μ0: (null hyp) List: (Where data is entered) x: (sample mean) Freq: 1 sx: (sample standard deviation) μ :(Choose ≠ ,<,>¿) n: (sample size) Calculate (ENTER) μ :(Choose ≠ ,<,>¿) Calculate (ENTER)

A battery company claims to have developed a new AAA battery that lasts longer than its regular AAA batteries. Based on years of experience, the company knows that its regular AAA batteries last for 30 hours of continuous use, on average. An SRS of 45 new batteries lasted an average of 33.9 hours with a standard deviation of 9.8 hours. Do these data give convincing evidence that the new batteries last longer on average? Use α=0.05.

15

Page 16: Mrs. Venneman · Web viewChapter 9: Testing a Claim 11 9.1: Significance Tests: The Basics Learning Objectives-State the null and alternative hypotheses for a significance test about

Chapter 9: Testing a Claim

Two-Sided Tests and Confidence IntervalsThe connection between two-sided tests and confidence intervals is even stronger for means than it was for proportions. That’s because both inference methods for means use the standard error of the sample mean in the calculations.

-A two-sided test at significance level α (say, α = 0.05) and a 100(1 – α)% confidence interval (a 95% confidence interval if α = 0.05) give similar information about the population parameter. -When the two-sided significance test at level α rejects H0: µ = µ0, the 100(1 – α)% confidence interval for µ will not contain the hypothesized value µ0 .-When the two-sided significance test at level α fails to reject the null hypothesis, the confidence interval for µ will contain µ0 .

Example: At the Hawaii Pineapple Company, managers are interested in the sizes of the pineapples grown in the company’s fields. Last year, the mean weight of the pineapples harvested from one large field was 31 ounces. A new irrigation system was installed in this field after the growing season. Managers wonder whether this change will affect the mean weight of future pineapples grown in the field. To find out, they select and weigh a random sample of 50 pineapples from this year’s crop. The following MiniTab output shows the results of a confidence interval and significance test. Assume α=0.05.

a) Interpret the 95% confidence interval for μ.

16

Page 17: Mrs. Venneman · Web viewChapter 9: Testing a Claim 11 9.1: Significance Tests: The Basics Learning Objectives-State the null and alternative hypotheses for a significance test about

Chapter 9: Testing a Claim

b) Based on the confidence interval, make a conclusion for the significance test. Explain.

c) Name the test statistic and p-value. Using these, do you end up with the same conclusion for your test as you did in part b)?

Paired t-TestWhen paired data result from measuring the same quantitative variable twice, as in the job satisfaction study, we can make comparisons by analyzing the differences in each pair. If the conditions for inference are met, we can use one-sample t procedures to perform inference about the mean difference µd.

Example: Researchers designed an experiment to study the effects of caffeine withdrawal. They recruited 11 volunteers who were diagnosed as being caffeine dependent to serve as subjects. Each subject was barred from coffee, colas, and other substances with caffeine for the duration of the experiment. During one two-day period, subjects took capsules containing their normal caffeine intake. During another two-day period, they took placebo capsules. The order in which subjects took caffeine and the placebo was randomized. At the end of each two-day period, a test for depression was given to all 11 subjects. Researchers wanted to know if being deprived of caffeine would lead to an increase in depression.

a) Why did researchers randomly assign the order in which subjects received placebo and caffeine?

17

Page 18: Mrs. Venneman · Web viewChapter 9: Testing a Claim 11 9.1: Significance Tests: The Basics Learning Objectives-State the null and alternative hypotheses for a significance test about

Chapter 9: Testing a Claim

b) Carry out a test to investigate the researchers’ question.

18

Page 19: Mrs. Venneman · Web viewChapter 9: Testing a Claim 11 9.1: Significance Tests: The Basics Learning Objectives-State the null and alternative hypotheses for a significance test about

Chapter 9: Testing a Claim

Using Tests WiselyHow Large a Sample Do I Need?

• A smaller significance level requires stronger evidence to reject the null hypothesis.• Higher power gives a better chance of detecting a difference when it really exists.• At any significance level and desired power, detecting a small difference between the null and alternative

parameter values requires a larger sample than detecting a large difference.Statistical Significance and Practical ImportanceWhen a null hypothesis (“no effect” or “no difference”) can be rejected at the usual levels (α = 0.05 or α = 0.01), there is good evidence of a difference. But that difference may be very small. When large samples are available, even tiny deviations from the null hypothesis will be significant.Beware of Multiple AnalysesStatistical significance ought to mean that you have found a difference that you were looking for. The reasoning behind statistical significance works well if you decide what difference you are seeking, design a study to search for it, and use a significance test to weigh the evidence you get. In other settings, significance may have little meaning.

Homework: pg. 599-601 #87-93 odd, 95-102

Monty Hall ProblemThis material in this article was originally published in PARADE magazine in 1990 and 1991.

Question: Suppose you’re on a game show, and you’re given the choice of three doors. Behind one door is a car, behind the others, goats. You pick a door, say #1, and the host, who knows what’s behind the doors, opens another door, say #3, which has a goat. He says to you, "Do you want to pick door #2?" Is it to your advantage to switch your choice of doors?Craig F. WhitakerColumbia, Maryland

Marilyn Vos Savant's Response: Yes; you should switch. The first door has a 1/3 chance of winning, but the second door has a 2/3 chance. Here’s a good way to visualize what happened. Suppose there are a million doors, and you pick door #1. Then the host, who knows what’s behind the doors and will always avoid the one with the prize, opens them all except door #777,777. You’d switch to that door pretty fast, wouldn’t you?

Let’s set up a significance test!!

19

Page 20: Mrs. Venneman · Web viewChapter 9: Testing a Claim 11 9.1: Significance Tests: The Basics Learning Objectives-State the null and alternative hypotheses for a significance test about

Chapter 9: Testing a Claim

Chapter 9 Learning Objectives Section Related Exampleon Page(s)

RelevantChapter Review

Exercise(s)

Can I do this?

State the null and alternative hypotheses for a significance test about a population parameter. 9.1 540 R9.1

Interpret a P-value in context. 9.1 543, 544 R9.5

Determine if the results of a study are statistically significant and draw an appropriate conclusion using a significance level.

9.1 546 R9.5

Interpret a Type I and a Type II error in context, and give a consequence of each. 9.1 548 R9.3, R9.4

State and check the Random, 10%, and Large Counts conditions for performing a significance test about a population proportion.

9.2 555 R9.4

Perform a significance test about a population 9.2 559, 562 R9.4

20

Page 21: Mrs. Venneman · Web viewChapter 9: Testing a Claim 11 9.1: Significance Tests: The Basics Learning Objectives-State the null and alternative hypotheses for a significance test about

Chapter 9: Testing a Claimproportion.Interpret the power of a test and describe what factors affect the power of a test. 9.2 565, discussion on

568 R9.3

Describe the relationship among the probability of a Type I error (significance level), the probability of a Type II error, and the power of a test.

9.2 565 R9.3

State and check the Random, 10%, and Normal/Large Sample conditions for performing a significance test about a population mean.

9.3 575 R9.2, R9.6, R9.7

Perform a significance test about a population mean. 9.3 580, 583 R9.6Use a confidence interval to draw a conclusion for a two-sided test about a population parameter. 9.2, 9.3 563, 585 R9.5, R9.6

Perform a significance test about a mean difference using paired data. 9.3 586 R9.7

Plan of Action:

21