STA291 Statistical Methods Lecture 25. Goodness-of-Fit Tests Given the following… 1) Counts of...


STA291 Statistical Methods

Lecture 25


Goodness-of-Fit Tests

Given the following…

1) Counts of items in each of several categories

2) A model that predicts the distribution of the relative frequencies

…this question naturally arises:

“Does the actual distribution differ from the model because of random error, or do the differences mean that the model does not fit the data?”

In other words, “How good is the fit?”


Goodness-of-Fit Tests

Example: Credit Cards

At a major credit card bank, the percentages of people who historically apply for the Silver, Gold, and Platinum cards are 60%, 30%, and 10% respectively. In a recent sample of customers, 110 applied for Silver, 55 for Gold, and 35 for Platinum. Is there evidence to suggest the percentages have changed?

Null Hypothesis: The distribution of types of credit card applications is no different from the historic distribution.

Test the hypothesis with a chi-square goodness-of-fit test.


Goodness-of-Fit Tests

Assumptions and Conditions

Counted Data Condition – The data must be counts for the categories of a categorical variable.

Independence Assumption – The counts should be independent of each other. Think about whether this is reasonable.

Randomization Condition – The counted individuals should be a random sample of the population. Guard against auto-correlated samples.


Goodness-of-Fit Tests

Sample Size Assumption

There must be enough data, so check the following condition:

Expected Cell Frequency Condition – The expected count must be at least 5 individuals per cell.
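As a minimal sketch of this check in Python, assume the credit card example's historic proportions (60%, 30%, 10%) and its sample of 110 + 55 + 35 = 200 applications. The expected count in each cell is the sample size times the model proportion; the helper name `expected_counts` is illustrative and not part of the lecture.

```python
def expected_counts(n, proportions):
    """Expected count per cell under the null model: n times each proportion."""
    return [n * p for p in proportions]

# Credit card example: total number of applications in the sample.
n = 110 + 55 + 35
historic = [0.60, 0.30, 0.10]  # Silver, Gold, Platinum

expected = expected_counts(n, historic)

# Expected Cell Frequency Condition: every expected count is at least 5.
condition_met = all(e >= 5 for e in expected)
print(expected, condition_met)
```

Note that the condition is checked on the expected counts, not on the observed counts.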


Goodness-of-Fit Tests

Chi-Square Model

To decide if the null model is plausible, look at the differences between the observed values and the values expected if the model were true.

$$\chi^2 = \sum_{\text{all cells}} \frac{(\text{Observed} - \text{Expected})^2}{\text{Expected}} = \sum_{\text{all cells}} \frac{(f_o - f_e)^2}{f_e}$$

Note that χ² “accumulates” the relative squared deviation of each cell from its expected value.

So, χ² gets “big” when i) the data set is large and/or ii) the model is a poor fit.
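A minimal Python sketch of the statistic, assuming lists of observed and expected counts in matching order. The function name `chi_square_statistic` and the small counts below are made up for illustration. Doubling every count while keeping the same proportions doubles the statistic, which is the sense in which χ² gets “big” for large data sets.

```python
def chi_square_statistic(observed, expected):
    """Sum over all cells of (observed - expected)^2 / expected."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical counts, chosen only to illustrate the effect of sample size.
small = chi_square_statistic([18, 22, 20], [20, 20, 20])
large = chi_square_statistic([36, 44, 40], [40, 40, 40])  # same proportions, doubled

print(small, large)
```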


Goodness-of-Fit Tests

The Chi-Square Calculation

1. Find the expected values. These come from the null hypothesis value.
2. Compute the residuals, $f_o - f_e$.
3. Square the residuals, $(f_o - f_e)^2$.
4. Compute the components. Find $\frac{(f_o - f_e)^2}{f_e}$ for each cell.
5. Find the sum of the components, $\chi^2 = \sum_{\text{all cells}} \frac{(f_o - f_e)^2}{f_e}$.
6. Find the degrees of freedom (no. of cells – 1).
7. Test the hypothesis, finding the p-value or comparing the test statistic from step 5 to the appropriate critical value.
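The seven steps can be sketched in Python for the credit card example (observed 110, 55, 35; historic proportions 60%, 30%, 10%). The 5% significance level is an assumption for illustration. The critical value uses the closed form −2 ln(α), which holds only for 2 degrees of freedom; other degrees of freedom need a table or software. Exact arithmetic gives χ² = 12.5; the 12.499 quoted in the lecture reflects rounded intermediate terms.

```python
import math

observed = [110, 55, 35]          # Silver, Gold, Platinum
proportions = [0.60, 0.30, 0.10]  # null hypothesis (historic) proportions
n = sum(observed)

# 1. Expected values from the null hypothesis.
expected = [n * p for p in proportions]
# 2. Residuals.
residuals = [o - e for o, e in zip(observed, expected)]
# 3. Squared residuals.
squared = [r ** 2 for r in residuals]
# 4. Components: squared residual divided by the expected count.
components = [s / e for s, e in zip(squared, expected)]
# 5. Sum of the components.
chi2 = sum(components)
# 6. Degrees of freedom: number of cells minus 1.
df = len(observed) - 1
# 7. Compare with the critical value (closed form valid for df = 2 only).
alpha = 0.05  # assumed significance level
critical = -2 * math.log(alpha)
reject = chi2 > critical

print(round(chi2, 3), df, round(critical, 3), reject)
```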


Goodness-of-Fit Tests

Example: Credit Cards

At a major credit card bank, the percentages of people who historically apply for the Silver, Gold, and Platinum cards are 60%, 30%, and 10% respectively. In a recent sample of customers, 110 applied for Silver, 55 for Gold, and 35 for Platinum. Is there evidence to suggest the percentages have changed?

What type of test do you conduct?

What are the expected values?

Find the test statistic and p-value.

State conclusions.


Goodness-of-Fit Tests

Example: Credit Cards

At a major credit card bank, the percentages of people who historically apply for the Silver, Gold, and Platinum cards are 60%, 30%, and 10% respectively. In a recent sample of customers, 110 applied for Silver, 55 for Gold, and 35 for Platinum. Is there evidence to suggest the percentages have changed?

What type of test do you conduct?

This is a goodness-of-fit test comparing a single sample to previous information (the null model).

What are the expected values?

The expected values are the sample size (110 + 55 + 35 = 200) times the historic proportions:

Card type   Silver   Gold   Platinum
Observed    110      55     35
Expected    120      60     20


Goodness-of-Fit Tests

Example: Credit Cards

At a major credit card bank, the percentages of people who historically apply for the Silver, Gold, and Platinum cards are 60%, 30%, and 10% respectively. In a recent sample of customers, 110 applied for Silver, 55 for Gold, and 35 for Platinum. Is there evidence to suggest the percentages have changed?

Find the test statistic and p-value.

$$\chi^2 = \sum_{\text{all cells}} \frac{(\text{Obs} - \text{Exp})^2}{\text{Exp}} = \frac{(110 - 120)^2}{120} + \frac{(55 - 60)^2}{60} + \frac{(35 - 20)^2}{20} \approx 12.499$$


Interpreting Chi-Square Values

The Chi-Square Distribution

The χ² distribution is right-skewed and becomes broader with increasing degrees of freedom.

The χ² test is a one-sided test.


Goodness-of-Fit Tests

Example: Credit Cards

Is there evidence to suggest the percentages have changed?

With the test statistic χ² = 12.499, find the p-value:

Using df = 2 and technology (Excel: “=1 - CHISQ.DIST(12.499, 2, TRUE)”), the p-value = 0.001931.

State conclusions.

Reject the null hypothesis. There is sufficient evidence that customers are not applying for cards in the traditional proportions.
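The same p-value can be checked without a spreadsheet. For an even number of degrees of freedom the upper-tail chi-square probability has a closed form, which this Python sketch uses. The function name `chi2_upper_tail_even_df` is illustrative; for odd degrees of freedom a statistics library or table would be needed instead.

```python
import math

def chi2_upper_tail_even_df(x, df):
    """P(chi-square with df degrees of freedom > x), valid for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form requires a positive even df")
    half = x / 2
    # exp(-x/2) * sum over k = 0 .. df/2 - 1 of (x/2)^k / k!
    return math.exp(-half) * sum(half ** k / math.factorial(k) for k in range(df // 2))

# Test statistic and degrees of freedom from the credit card example.
p_value = chi2_upper_tail_even_df(12.499, 2)
print(round(p_value, 6))
```

Because the test is one-sided, only the upper tail is used.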


Examining the Residuals

When we reject a null hypothesis, we can examine the residuals in each cell to discover which values are extraordinary.

Because we might compare residuals for cells with very different counts, we should examine standardized residuals:

$$\frac{f_o - f_e}{\sqrt{f_e}}$$

Note that standardized residuals from goodness-of-fit tests are distributed as z-scores (which we already know how to interpret and analyze).


Examining the Residuals

Standardized residuals for the credit card data:

Card Type   Standardized Residual
Silver      -0.91287
Gold        -0.6455
Platinum    3.354102

• Neither the Silver nor the Gold value is remarkable.

• The largest, Platinum, at 3.35, is where the difference from historic values lies.
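These values can be reproduced with a short Python sketch, assuming the observed and expected counts from the credit card example; the variable names are illustrative.

```python
import math

cards = ["Silver", "Gold", "Platinum"]
observed = [110, 55, 35]
expected = [120, 60, 20]

# Standardized residual for each cell: (observed - expected) / sqrt(expected).
std_residuals = {
    card: (o - e) / math.sqrt(e)
    for card, o, e in zip(cards, observed, expected)
}

for card, z in std_residuals.items():
    print(f"{card}: {z:.4f}")
```

Squaring these residuals and summing them recovers the χ² statistic, so the Platinum cell accounts for most of it.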


The Chi-Square Test for Homogeneity

Assumptions and Conditions

Counted Data Condition – Data must be counts.

Independence Assumption – Counts need to be independent from each other. Check for randomization.

Randomization Condition – Random samples or a stratified sample are needed.

Sample Size Assumption – There must be enough data, so check the following condition.

Expected Cell Frequency Condition – Expect at least 5 individuals per cell.


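The Chi-Square Test for Homogeneity

Following the pattern of the goodness-of-fit test, compute the component for each cell:

$$\text{Component} = \frac{(f_o - f_e)^2}{f_e}$$

Then, sum the components:

$$\chi^2 = \sum_{\text{all cells}} \frac{(f_o - f_e)^2}{f_e}$$

The degrees of freedom are $(R - 1)(C - 1)$, where $R$ is the number of rows and $C$ is the number of columns.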


The Chi-Square Test for Homogeneity

Example: More Credit Cards

A market researcher for the credit card bank wants to know if the distribution of applications by card is the same for the past 3 mailings. She takes a random sample of 200 from each mailing and counts the number of applications for each type of card.

Type of Card
Mailing     Silver   Gold   Platinum   Total
Mailing 1   120      50     30         200
Mailing 2   115      50     35         200
Mailing 3   105      55     40         200
Total       340      155    105        600


The Chi-Square Test for Homogeneity

Example: More Credit Cards

A market researcher for the credit card bank wants to know if the distribution of applications by card is the same for the past 3 mailings.

[Figure: chart of applications by mailing (Mailing 1, Mailing 2, Mailing 3) and card type (Platinum, Gold, Silver); vertical axis from 0 to 250.]

But, are the differences real or just natural sampling variation?

Our null hypothesis is that the relative frequency distributions are the same (homogeneous) for each mailing.

Test the hypothesis with a chi-square test for homogeneity.


The Chi-Square Test for Homogeneity

Example: More Credit Cards

A market researcher for the credit card bank wants to know if the distribution of applications by card is the same for the past 3 mailings.

Use the total % to determine the expected counts for each table column (type of card). For example, Silver makes up 340/600 of all applications, so each mailing of 200 is expected to yield 200 × 340/600 ≈ 113.33 Silver applications:

Type of Card
Mailing     Silver   Gold    Platinum   Total
Mailing 1   113.33   51.67   35         200
Mailing 2   113.33   51.67   35         200
Mailing 3   113.33   51.67   35         200
Total       340      155     105        600
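A sketch of the expected-count calculation in Python, assuming the observed table from the mailing example. Each expected count is the row total times the column total divided by the grand total, and the degrees of freedom are (rows − 1) × (columns − 1). Variable names are illustrative.

```python
observed = [
    [120, 50, 30],  # Mailing 1: Silver, Gold, Platinum
    [115, 50, 35],  # Mailing 2
    [105, 55, 40],  # Mailing 3
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected count = row total * column total / grand total.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

# Degrees of freedom: (rows - 1) * (columns - 1).
df = (len(observed) - 1) * (len(observed[0]) - 1)

for row in expected:
    print([round(e, 2) for e in row])
print("df =", df)
```

Because every mailing has the same sample size of 200, all three rows of expected counts are identical.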


The Chi-Square Test for Homogeneity

Example: More Credit Cards

A market researcher for the credit card bank wants to know if the distribution of applications by card is the same for the past 3 mailings. She takes a random sample of 200 from each mailing and counts the number of applications for each type of card.

Find the test statistic.

$$\chi^2 = \sum_{\text{all cells}} \frac{(\text{Obs} - \text{Exp})^2}{\text{Exp}} = \frac{(120 - 113.33)^2}{113.33} + \frac{(50 - 51.67)^2}{51.67} + \cdots + \frac{(40 - 35)^2}{35} = 2.7806$$

Given p-value = 0.5952, state conclusions.

Fail to reject the null. There is insufficient evidence to suggest that the distributions are different for the three mailings.
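The statistic and the quoted p-value can be checked with a Python sketch, assuming the observed table from the mailing example. With 4 degrees of freedom the upper-tail probability has the closed form exp(−χ²/2)(1 + χ²/2). That shortcut is specific to df = 4 and is used here only to avoid a library dependency.

```python
import math

observed = [
    [120, 50, 30],  # Mailing 1: Silver, Gold, Platinum
    [115, 50, 35],  # Mailing 2
    [105, 55, 40],  # Mailing 3
]
row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Sum (observed - expected)^2 / expected over all nine cells.
chi2 = 0.0
for i, row in enumerate(observed):
    for j, o in enumerate(row):
        e = row_totals[i] * col_totals[j] / grand_total
        chi2 += (o - e) ** 2 / e

# Upper-tail probability for df = (3 - 1) * (3 - 1) = 4.
p_value = math.exp(-chi2 / 2) * (1 + chi2 / 2)

print(round(chi2, 4), round(p_value, 4))
```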


Looking back

• Recognize when a chi-square test of goodness of fit or homogeneity is appropriate.

• For each test, find the expected cell frequencies.

• For each test, check the assumptions and corresponding conditions and know how to complete the test.

• Interpret a chi-square test.

• Examine the standardized residuals.