
Copyright © 1998, Triola, Elementary Statistics, Addison Wesley Longman

Assumptions

For testing a claim about the mean of a single population

1) Sample is large (n > 30)

a) Central limit theorem applies

b) Can use normal distribution

2) Can use sample standard deviation s as estimate for σ if σ is unknown



Three Methods Discussed

1) Traditional method

2) P-value method

3) Confidence intervals

Note: These three methods are equivalent, i.e., they will provide the same conclusions.



Procedure

1. Identify the specific claim or hypothesis to be tested, and put it in symbolic form.

2. Give the symbolic form that must be true when the original claim is false.

3. Of the two symbolic expressions obtained so far, put the one you plan to reject in the null hypothesis H0 (make the formula with equality). H1 is the other statement.

Or, a simplified rule suggested in the textbook: let the null hypothesis H0 be the one that contains the condition of equality. H1 is the other statement.

Figure 7-4



4. Select the significance level α based on the seriousness of a type I error. Make α small if the consequences of rejecting a true H0 are severe. The values of 0.05 and 0.01 are very common.

5. Identify the statistic that is relevant to this test and its sampling distribution.

6. Determine the test statistic, the critical values, and the critical region. Draw a graph and include the test statistic, critical value(s), and critical region.

7. Reject H0 if the test statistic is in the critical region. Fail to reject H0 if the test statistic is not in the critical region.

8. Restate this previous decision in simple non-technical terms. (See Figure 7-2)



In the P-value method, the procedure is the same except for steps 6 and 7

Step 6: Find the P-value

Step 7: Report the P-value

Reject the null hypothesis if the P-value is less than or equal to the significance level

Fail to reject the null hypothesis if the P-value is greater than the significance level
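The decision rule in Step 7 can be written as a small function. This is only a sketch: the function name `p_value_decision` and the example P-values are hypothetical and do not come from the slides.

```python
def p_value_decision(p_value: float, alpha: float) -> str:
    """Apply the Step 7 rule: reject H0 when the P-value is <= alpha."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must be between 0 and 1")
    if p_value <= alpha:
        return "reject H0"
    return "fail to reject H0"


# Hypothetical P-value of 0.03 judged at the two common significance levels.
print(p_value_decision(0.03, alpha=0.05))  # reject H0
print(p_value_decision(0.03, alpha=0.01))  # fail to reject H0
```

The same P-value can lead to different decisions at different significance levels, which is why Step 7 reports the P-value itself.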



Testing Claims with Confidence Intervals

• A confidence interval estimate of a population parameter contains the likely values of that parameter. We should therefore reject a claim that the population parameter has a value that is not included in the confidence interval.




Claim: mean body temperature = 98.6°, where n = 106, x̄ = 98.2° and s = 0.62°

95% confidence interval from the 106 body temperatures (that is, about 95% of intervals constructed this way from random samples would contain the true value µ):

98.08° < µ < 98.32°

98.6° is not in this interval

Therefore it is very unlikely that µ = 98.6°

Thus we reject the claim µ = 98.6°
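The interval above can be reproduced from the summary statistics. The sketch below assumes the large-sample formula x̄ ± z·s/√n with the standard normal critical value 1.96 for 95% confidence. The slides do not show this calculation, so the variable names are illustrative.

```python
import math

n = 106          # sample size
x_bar = 98.2     # sample mean
s = 0.62         # sample standard deviation
z_crit = 1.96    # assumed two-tailed critical z value for 95% confidence

margin = z_crit * s / math.sqrt(n)   # margin of error
lower, upper = x_bar - margin, x_bar + margin

claimed_mu = 98.6
reject_claim = not (lower < claimed_mu < upper)

print(f"{lower:.2f} < mu < {upper:.2f}")   # 98.08 < mu < 98.32
print("reject claim" if reject_claim else "fail to reject claim")
```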



Underlying Rationale of Hypotheses Testing

• When testing a claim, we make an assumption (null hypothesis) that contains equality. We then compare the assumption and the sample results and we form one of the following conclusions:

If the sample results can easily occur when the assumption is true, we attribute the relatively small discrepancy between the assumption and the sample results to chance.

If the sample results cannot easily occur when the assumption is true, we explain the relatively large discrepancy between the assumption and the sample by concluding that the assumption is not true.



Testing a Claim about a Mean: Small Samples

Section 7-4

MARIO F. TRIOLA



Figure 7-10 Choosing between the Normal and Student t-Distributions when Testing a Claim about a Population Mean µ

1. Start: Is n > 30?

Yes: Use normal distribution with z = (x̄ – µx̄) / (σ / √n). (If σ is unknown use s instead.)

No: Go to 2.

2. Is the distribution of the population essentially normal? (Use a histogram.)

No: Use non-parametric methods, which don’t require a normal distribution.

Yes: Go to 3.

3. Is σ known?

Yes: Use normal distribution with z = (x̄ – µx̄) / (σ / √n). (This case is rare.)

No: Use the Student t-distribution with t = (x̄ – µx̄) / (s / √n).
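The decision path in Figure 7-10 can be sketched as a function. The name `choose_distribution` and its boolean parameters are hypothetical; they simply encode the three questions in the flowchart.

```python
def choose_distribution(n: int, sigma_known: bool, population_normal: bool) -> str:
    """Follow the Figure 7-10 questions in order and name the method to use."""
    if n > 30:
        # Large sample: normal distribution (use s for sigma if sigma is unknown).
        return "normal (z)"
    if not population_normal:
        # Small sample from a non-normal population.
        return "non-parametric"
    if sigma_known:
        # Small sample, normal population, sigma known (the rare case).
        return "normal (z)"
    # Small sample, normal population, sigma unknown.
    return "Student t"


print(choose_distribution(n=25, sigma_known=False, population_normal=True))   # Student t
print(choose_distribution(n=106, sigma_known=False, population_normal=False)) # normal (z)
```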



The distribution of sample means will be NORMAL and you can use Table A-2:

a) For any population distribution (shape) where n > 30 (use s for σ if σ is unknown)

b) For a normal population distribution where

1) σ is known (n any size), or

2) σ is unknown and n > 30 (use s for σ)



The Student t-distribution and Table A-3 should be used when:

Population distribution is essentially normal

σ is unknown

n ≤ 30



Test Statistic (TS) for a Student t-distribution

t = (x̄ – µx̄) / (s / √n)

Critical Values (CV)

Found in Table A-3 (formula card, back book cover, or Appendix)

Degrees of freedom (df) = n – 1

Critical t values to the left of the mean are negative
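As a rough sketch, the test statistic and its degrees of freedom can be computed as follows. The function name `t_statistic` and the numbers in the usage lines are hypothetical values chosen for easy arithmetic, not data from the slides.

```python
import math


def t_statistic(x_bar: float, mu0: float, s: float, n: int) -> tuple[float, int]:
    """Return (t, df) with t = (x_bar - mu0) / (s / sqrt(n)) and df = n - 1."""
    if n < 2:
        raise ValueError("need at least two observations")
    standard_error = s / math.sqrt(n)
    return (x_bar - mu0) / standard_error, n - 1


# Hypothetical sample: n = 16, mean 52, standard deviation 4, claimed mean 50.
t, df = t_statistic(x_bar=52.0, mu0=50.0, s=4.0, n=16)
print(t, df)  # 2.0 15
```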



Important Properties of the Student t Distribution

1. The Student t distribution is different for different sample sizes (see Figure 6-5 in Section 6-3).

2. The Student t distribution has the same general bell shape as the normal distribution; its wider shape reflects the greater variability that is expected with small samples.

3. The Student t distribution has a mean of t = 0 (just as the standard normal distribution has a mean of z = 0).

4. The standard deviation of the Student t distribution varies with the sample size and is greater than 1 (unlike the standard normal distribution, which has σ = 1).

5. As the sample size n gets larger, the Student t distribution gets closer to the normal distribution. For values of n > 30, the differences are so small that we can use the z values instead of developing a much larger table of critical t values. (The values in the bottom row of Table A-3 are equal to the corresponding critical z values from the standard normal distribution.)
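Property 5 can be checked numerically. The sketch below uses only the Python standard library: it integrates the Student t density with Simpson's rule and finds right-tailed critical values by bisection. This is an illustrative approach assumed here, not how Table A-3 was built, and the helper names are hypothetical.

```python
import math
from statistics import NormalDist


def t_pdf(t, df):
    """Density of the Student t distribution with df degrees of freedom."""
    log_c = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    return math.exp(log_c) / math.sqrt(df * math.pi) * (1 + t * t / df) ** (-(df + 1) / 2)


def right_tail_area(t, df, steps=1000):
    """P(T > t) for t >= 0, using Simpson's rule on [0, t] (steps must be even)."""
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3


def t_critical(alpha, df):
    """Right-tailed critical value, found by bisection on the tail area."""
    lo, hi = 0.0, 50.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if right_tail_area(mid, df) > alpha:
            lo = mid   # tail still too large: move right
        else:
            hi = mid
    return (lo + hi) / 2


z_crit = NormalDist().inv_cdf(0.95)  # right-tailed critical z for alpha = 0.05
for df in (5, 24, 30, 1000):
    print(f"df = {df:4d}: t = {t_critical(0.05, df):.3f}   (z = {z_crit:.3f})")
```

The printed critical t values shrink toward the critical z value as the degrees of freedom grow.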



All three methods (1. traditional method, 2. P-value method, 3. confidence intervals) and the testing procedure (Step 1 to Step 8) are still valid, except that the test statistic, and therefore the corresponding table, is different.



The larger Student t critical values show that with small samples the sample evidence must be more extreme before the difference is significant.



P-Value Method

Table A-3 includes only selected values of α

Specific P-values usually cannot be found

Use Table to identify limits that contain the P-value

Some calculators and computer programs will find exact P-values
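Finding limits on a P-value from a table row can be sketched as below. The row holds the commonly tabulated one-tailed critical t values for 24 degrees of freedom. The function name `bracket_p_value` is hypothetical, the value 2.2 is a made-up test statistic, and 3.8125 is the test statistic from the salary example in this section.

```python
# One-tailed areas and critical t values for df = 24, ordered by increasing t.
ROW_DF_24 = [(0.10, 1.318), (0.05, 1.711), (0.025, 2.064), (0.01, 2.492), (0.005, 2.797)]


def bracket_p_value(t, row):
    """Return (lower, upper) limits on a right-tailed P-value.

    None marks an open end: (None, 0.005) means P-value < 0.005.
    """
    lower, upper = None, None
    for alpha, crit in row:
        if t >= crit:
            upper = alpha      # t is at or beyond this critical value
        else:
            lower = alpha      # first critical value that t does not reach
            break
    return lower, upper


print(bracket_p_value(2.2, ROW_DF_24))     # (0.01, 0.025)
print(bracket_p_value(3.8125, ROW_DF_24))  # (None, 0.005)
```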



Example

Conjecture: “the average starting salary for a computer science graduate is $30,000 per year”.

For a randomly picked group of 25 computer science graduates, their average starting salary is $36,100 and the sample standard deviation is $8,000.



Example

Solution

Step 1: µ = 30k

Step 2: µ > 30k (if µ is believed to be no less than 30k)

Step 3: H0: µ = 30k versus H1: µ > 30k

Step 4: Select α = 0.05 (significance level)

Step 5: The sample mean is relevant to this test and its sampling distribution is the t-distribution with 25 – 1 = 24 degrees of freedom.



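Step 6: t-distribution (df = 24). Assume the conjecture is true!

Test statistic: t = (x̄ – µx̄) / (s / √n)

Critical value: t = 1.71, or in salary units x̄ = 1.71 × 8000/5 + 30000 = 32,736

Graph: t-distribution centered at 30k (t = 0), with the critical value at 32.7k (t = 1.71). Fail to reject H0 to the left of the critical value; reject H0 to the right.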




Step 7: Sample data: x̄ = 36.1k, or t = 3.8125

Graph: the same t-distribution (30k at t = 0, critical value 32.7k at t = 1.71), with the sample mean 36.1k (t = 3.8125) marked to the right of the critical value, in the Reject H0 region.



Step 8: Conclusion: Based on the sample, there is sufficient evidence to warrant rejection of the claim that “the average starting salary for a computer science graduate is $30,000 per year”.
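The arithmetic of the traditional method for this example can be checked with a few lines. The numbers are those given in the example. The critical value 1.71 is taken from the slides, and the variable names are illustrative.

```python
import math

mu0 = 30000      # claimed mean salary (H0)
x_bar = 36100    # sample mean
s = 8000         # sample standard deviation
n = 25           # sample size
t_crit = 1.71    # right-tailed critical t for alpha = 0.05, df = 24 (from the slides)

standard_error = s / math.sqrt(n)        # 8000 / 5 = 1600
t = (x_bar - mu0) / standard_error       # test statistic
x_crit = mu0 + t_crit * standard_error   # critical value in salary units

print(f"t = {t}")                         # t = 3.8125
print(f"critical x-bar = {x_crit:.0f}")   # critical x-bar = 32736
print("reject H0" if t > t_crit else "fail to reject H0")
```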




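Step 6 (P-value method)

Test statistic: t = (x̄ – µx̄) / (s / √n) = (36.1 – 30) / (8 / 5) = 3.8125

P-value = area to the right of the test statistic = .0004225 (by a computer program)

Graph: t-distribution centered at 30k, with the P-value shown as the area to the right of 36.1k.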



Step 7 (P-value method): t-distribution (df = 24)

t = 3.8125, P-value = .0004225

P-value < 0.01

Highly statistically significant (very strong evidence against the null hypothesis)
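A P-value of this size can be approximated without a statistics package. The sketch below integrates the Student t density numerically with Simpson's rule. It is an assumed illustration, not the computer program the slides refer to, and the helper names are hypothetical.

```python
import math


def t_pdf(t, df):
    """Density of the Student t distribution with df degrees of freedom."""
    log_c = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    return math.exp(log_c) / math.sqrt(df * math.pi) * (1 + t * t / df) ** (-(df + 1) / 2)


def right_tail_area(t, df, steps=2000):
    """P(T > t) for t >= 0, using Simpson's rule on [0, t] (steps must be even)."""
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3


t = (36100 - 30000) / (8000 / math.sqrt(25))   # 3.8125
p_value = right_tail_area(t, df=24)

print(f"t = {t}, P-value = {p_value:.7f}")
print("reject H0" if p_value <= 0.05 else "fail to reject H0")
```

The result should land close to the .0004225 reported above, well below both 0.05 and 0.01.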
