Assumptions and Conditions –Randomization Condition: The data arise from a random sample or...


Assumptions and Conditions

– Randomization Condition: The data arise from a random sample or suitably randomized experiment. Randomly sampled data (particularly from an SRS) are ideal.

– 10% (Independence) Condition: When a sample is drawn without replacement, the sample should be no more than 10% of the population.
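The 10% Condition is a simple arithmetic check, so it can be sketched in a few lines of Python. The function name `ten_percent_condition` is a hypothetical helper introduced here for illustration; it is not part of the slides.

```python
def ten_percent_condition(sample_size: int, population_size: int) -> bool:
    """Return True when the sample is no more than 10% of the population.

    Hypothetical helper: integer arithmetic avoids floating-point surprises
    right at the 10% boundary.
    """
    if sample_size <= 0 or population_size <= 0:
        raise ValueError("sample_size and population_size must be positive")
    return 10 * sample_size <= population_size


# A sample of 20 drawn without replacement needs a population of at least 200.
print(ten_percent_condition(20, 200))
print(ten_percent_condition(20, 150))
```

This is why a sample of 20 students leads to the assumption that the population exceeds 200.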


Assumptions and Conditions (cont.)

• Normal Population Assumption:

– We can never be certain that the data are from a population that follows a Normal model, but we can check the

– Nearly Normal Condition: The data come from a distribution that is unimodal and symmetric.

• Check this condition by making a histogram or Normal probability plot OR assume that it is true.


Assumptions and Conditions (cont.)

– Nearly Normal Condition:

• The smaller the sample size (n < 15 or so), the more closely the data should follow a Normal model.

• For moderate sample sizes (n between 15 and 40 or so), the t works well as long as the data are unimodal and reasonably symmetric.

• For larger sample sizes, the t methods are safe to use even if the data are skewed.
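The three sample-size guidelines above can be written as a small lookup. This is only a sketch: the cutoffs 15 and 40 are the approximate ("or so") values from the slide, and the function name `nearly_normal_guidance` is a hypothetical helper introduced for illustration.

```python
def nearly_normal_guidance(n: int) -> str:
    """Map a sample size to the slide's rule of thumb for the Nearly Normal Condition.

    The cutoffs (15 and 40) are approximate guidelines, not hard rules.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    if n < 15:
        return "small: the data should follow a Normal model closely"
    if n <= 40:
        return "moderate: t works if the data are unimodal and reasonably symmetric"
    return "large: t methods are safe even if the data are skewed"


for n in (10, 20, 100):
    print(n, "->", nearly_normal_guidance(n))
```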


One-Sample t-test for the Mean

• The conditions for the one-sample t-test for the mean are the same as for the one-sample t-interval.

• We test the hypothesis $H_0: \mu = \mu_0$ using the statistic

$$t_{n-1} = \frac{\bar{x} - \mu_0}{SE(\bar{x})}$$

• The standard error of the sample mean is

$$SE(\bar{x}) = \frac{s}{\sqrt{n}}$$

• When the conditions are met and the null hypothesis is true, this statistic follows a Student’s t model with n – 1 df. We use that model to obtain a P-value.
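As a minimal sketch, the standard error and the t statistic translate directly into Python. The function names and the numbers in the usage lines are made up for illustration and do not come from the slides.

```python
import math


def standard_error(s: float, n: int) -> float:
    """Standard error of the sample mean: s / sqrt(n)."""
    if n < 2:
        raise ValueError("need at least two observations")
    return s / math.sqrt(n)


def t_statistic(xbar: float, mu0: float, s: float, n: int) -> tuple[float, int]:
    """Return (t, df) for testing H0: mu = mu0 with a one-sample t-test."""
    t = (xbar - mu0) / standard_error(s, n)
    return t, n - 1


# Hypothetical summary statistics, chosen so the arithmetic is easy to follow:
# SE = 4 / sqrt(16) = 1, so t = (52 - 50) / 1 = 2 with 15 df.
t, df = t_statistic(xbar=52.0, mu0=50.0, s=4.0, n=16)
print(t, df)
```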


Example: Is the mean weight of college students still 132 pounds? To test this, you take a random sample of 20 students, finding a mean of 137 pounds with a standard deviation of 14.2 pounds. Use a significance level of 0.1.

1. Hypotheses

$\mu$ = population mean weight of college students

$H_0: \mu = 132$

$H_a: \mu \neq 132$


2. Check Assumptions/Conditions

• Random sample is stated

• Assume population of college students > 200

• Assume population is approx. normally distributed

• $\sigma$ is unknown, use t-distribution


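3. Calculate Test

$$t = \frac{\bar{x} - \mu_0}{SE(\bar{x})} = \frac{137 - 132}{14.2/\sqrt{20}} = 1.575$$

t-critical = 1.729 (df = 19)

Area in one tail beyond t = 1.575: 0.066

p-value = 0.066*2 = 0.132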
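The calculation for this example can be reproduced with the standard library alone. The sketch below approximates the tail area by numerically integrating the Student's t density with Simpson's rule; that integration is a choice made here for illustration (a statistics package would normally supply the t distribution), and the helper names `t_pdf` and `upper_tail` are hypothetical.

```python
import math


def t_pdf(x: float, df: int) -> float:
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def upper_tail(t: float, df: int, steps: int = 2000) -> float:
    """Approximate P(T > |t|) as 0.5 minus the area between 0 and |t|.

    The area is computed with composite Simpson's rule (steps must be even).
    """
    t = abs(t)
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3


# Summary statistics from the example.
xbar, mu0, s, n = 137, 132, 14.2, 20

se = s / math.sqrt(n)            # standard error of the sample mean
t = (xbar - mu0) / se            # test statistic
one_tail = upper_tail(t, n - 1)  # area in one tail
p_value = 2 * one_tail           # two-sided alternative, so double it

print(f"t = {t:.3f}, one-tail area = {one_tail:.3f}, p-value = {p_value:.3f}")
```

Up to rounding, the printed values should agree with the figures in step 3.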


4. Conclusion

Since the P-value (0.132) is greater than alpha (0.1), we fail to reject the null hypothesis that the population mean weight of students is 132 pounds.

--OR--

Since the t-statistic (1.575) is less than t-critical (1.729), we fail to reject the null hypothesis that the population mean weight of students is 132 pounds.

Thus, there is not sufficient evidence to support a claim that the true mean weight of college students has changed.


Example: A father is concerned that his teenage son is watching too much television each day, since his son watches an average of 2 hours per day. His son says that his TV habits are no different than those of his friends. Since this father has taken a stats class, he knows that he can actually test to see whether or not his son is watching more TV than his peers. The father collects a random sample of television watching times from boys at his son's high school and gets the following data:

1.9 2.3 2.2 1.9 1.6 2.6 1.4 2.0 2.0 2.2

Is the father right? That is, is there evidence that other boys average less than 2 hours of television per day?

1. Hypotheses

$\mu$ = population mean number of hours boys at the high school spend watching TV

$H_0: \mu = 2$

$H_a: \mu < 2$


2. Check Assumptions/Conditions

• Random sample is stated

• Assume boy population at the high school > 100

• Based on the linearity of the normal probability (quantile) plot, we have approx. normal data.

• $\sigma$ is unknown, use t-distribution


3. Calculate Test

$$t = \frac{\bar{x} - \mu_0}{SE(\bar{x})} = \frac{2.01 - 2}{0.345/\sqrt{10}} = 0.0918$$

$$\text{p-value} = P(\bar{x} < 2.01 \mid \mu = 2) = 0.5356$$

t-critical = -1.833
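Because this example starts from raw data, the sample mean and standard deviation can be computed before forming the statistic. The sketch below uses Python's `statistics` module and takes the critical value -1.833 from the slide rather than deriving it; the variable names are illustrative.

```python
import math
import statistics

# TV watching times (hours per day) from the example.
hours = [1.9, 2.3, 2.2, 1.9, 1.6, 2.6, 1.4, 2.0, 2.0, 2.2]
mu0 = 2.0            # hypothesized mean under H0
t_critical = -1.833  # lower-tail critical value quoted on the slide (df = 9)

n = len(hours)
xbar = statistics.mean(hours)   # sample mean
s = statistics.stdev(hours)     # sample standard deviation (n - 1 denominator)
se = s / math.sqrt(n)           # standard error of the sample mean
t = (xbar - mu0) / se           # test statistic

# Lower-tailed test (Ha: mu < 2): reject H0 only if t falls below the critical value.
reject = t < t_critical

print(f"xbar = {xbar:.2f}, s = {s:.3f}, t = {t:.4f}, reject H0: {reject}")
```

The statistic is positive while the rejection region lies below -1.833, so the data give no reason to reject the null hypothesis.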


4. Conclusion ($\alpha$ = 0.05)

Since the t-statistic (0.0918) is greater than the t-critical (-1.833), we fail to reject the null hypothesis that the population mean number of hours watching TV by the HS boys is 2.

--OR--

Since the p-value (0.5356) is greater than alpha (0.05), we fail to reject the null hypothesis that the population mean number of hours watching TV by the HS boys is 2.

Thus, we do not have evidence that supports the father’s claim.