Chi sequare

42
CHI-SQUARE To: Sir Shahid Mahmood By: Abuzar Tabassum M.Sc Zoology 3 rd semester 10040814-017 University Of Gujrat

description

 

Transcript of Chi sequare

Page 1: Chi sequare

CHI-SQUARE

To:Sir Shahid Mahmood

By:Abuzar Tabassum

M.Sc Zoology 3rd semester10040814-017

University Of Gujrat

Page 2: Chi sequare

Chi-Square Test

• A fundamental problem in genetics is determining whether the experimentally determined data fits the results expected from theory (i.e. Mendel’s laws as expressed in the Punnett square).

• A statistical method used to determine GOODNESS OF FIT– Goodness of fit refers to how close the

observed data are to those predicted from a hypothesis

Page 3: Chi sequare

Goodness of Fit

• Mendel has no way of solving this problem. Shortly after the rediscovery of his work in 1900, Karl Pearson and R.A. Fisher developed the “chi-square” test for this purpose.

• The chi-square test is a “goodness of fit” test: it answers the question of how well do experimental data fit expectations.

• We start with a theory for how the offspring will be distributed: the “null hypothesis”. We will discuss the offspring of a self-pollination of a heterozygote. The null hypothesis is that the offspring will appear in a ratio of 3/4 dominant to 1/4 recessive.

Page 4: Chi sequare

Formula

• To calculate the chi-square statistic following formula is used.

• The “Χ” is the Greek letter chi; the “∑” is a sigma; it means to sum the following terms for all phenotypes.

• “obs” is the number of individuals of the given phenotype observed;

• “exp” is the number of that phenotype expected from the null hypothesis.

exp

exp)( 22 obs

Page 5: Chi sequare

• How can you tell if an observed set of offspring counts is legitimately the result of a given underlying simple ratio?

• For example, you do a cross and see 290 purple flowers and 110 white flowers in the offspring. This is pretty close to a 3/4 : 1/4 ratio, but how do you formally define "pretty close"? What about 250:150?

Page 6: Chi sequare
Page 7: Chi sequare

Example• As an example, you count F2 offspring, and get 290 purple and 110 white

flowers. This is a total of 400 (290 + 110) offspring.

• We expect a 3/4 : 1/4 ratio. We need to calculate the expected numbers, this is done by multiplying the total offspring by the expected proportions. This we expect 400 * 3/4 = 300 purple, and 400 * 1/4 = 100 white.

• Thus, for purple, obs = 290 and exp = 300. For white, obs = 110 and exp = 100.

• Now it's just a matter of plugging into the formula:

2 = (290 - 300)2 / 300 + (110 - 100)2 / 100

= (-10)2 / 300 + (10)2 / 100

= 100 / 300 + 100 / 100

= 0.333 + 1.000

= 1.333.

• This is our chi-square value: now we need to see what it means and how to use it.

Page 8: Chi sequare

The Critical Question• Using the example here, how can you tell if your 290: 110 offspring

ratio really fits a 3/4 : 1/4 ratio (as expected from selfing a heterozygote).You can’t be certain, but you can at least determine whether your result is reasonable.

Page 9: Chi sequare

Reasonable• What is a “reasonable” result is subjective and arbitrary.• For most work a result is said to not differ significantly from

expectations if it could happen at least 1 time in 20. That is, if the difference between the observed results and the expected results is small enough that it would be seen at least 1 time in 20 over thousands of experiments, we “fail to reject” the null hypothesis.

• For technical reasons, we use “fail to reject” instead of “accept”.

• “1 time in 20” can be written as a probability value p = 0.05, because 1/20 = 0.05.

• Another way of putting this. If your experimental results are worse than 95% of all similar results, they get rejected because you may have used an incorrect null hypothesis.

Page 10: Chi sequare

• The test statistic is compared to a theoretical probability distribution

• In order to use this distribution properly you need to determine the degrees of freedom

• If the level of significance read from the table is greater than .05 or 5% then your hypothesis is accepted and the data is useful

• The hypothesis is termed the null hypothesis which states that there is no substantial statistical deviation between observed and expected data.

Page 11: Chi sequare

Degrees of Freedom

• A critical factor in using the chi-square test is the “degrees of freedom”.

• Degrees of freedom is the number of phenotypic possibilities in your cross minus one.Or

• Degrees of freedom is simply the number of classes of offspring minus 1.

• For our example, there are 2 classes of offspring: purple and white. Thus, degrees of freedom (d.f.) = 2 -1 = 1.

Page 12: Chi sequare

Critical Chi-Square

• Critical values for chi-square are found on tables, sorted by degrees of freedom and probability levels. Be sure to use p = 0.05.

• If your calculated chi-square value is greater than the critical value from the table, you “reject the null hypothesis”.

• If your chi-square value is less than the critical value, you “fail to reject” the null hypothesis (that is, you accept that your genetic theory about the expected ratio is correct).

Page 13: Chi sequare

Chi-Square Table

Page 14: Chi sequare

Using the Table

• In our example of 290 purple to 110 white, we calculated a chi-square value of 1.333, with 1 degree of freedom.

• Looking at the table, 1 d.f. is the first row, and p = 0.05 is the sixth column. Here we find the critical chi-square value, 3.841.

• Since our calculated chi-square, 1.333, is less than the critical value, 3.841, we “fail to reject” the null hypothesis. Thus, an observed ratio of 290 purple to 110 white is a good fit to a 3/4 to 1/4 ratio.

Page 15: Chi sequare

Chi-square with more than one degree of freedom

Page 16: Chi sequare

9 : 3 : 3 : 1

Page 17: Chi sequare

phenotype observed expected proportion

expected number

round yellow

315 9/16 312.75

round green

101 3/16 104.25

wrinkled yellow

108 3/16 104.25

wrinkled green

32 1/16 34.75

total 556 556

Page 18: Chi sequare

Finding the Expected Numbers• You are given the observed numbers, and you determine the

expected proportions from a Punnett square.• To get the expected numbers of offspring, first add up the

observed offspring to get the total number of offspring. In this case, 315 + 101 + 108 + 32 = 556.

• Then multiply total offspring by the expected proportion: --expected round yellow = 9/16 * 556 = 312.75 --expected round green = 3/16 * 556 = 104.25 --expected wrinkled yellow = 3/16 * 556 = 104.25 --expected wrinkled green = 1/16 * 556 = 34.75• Note that these add up to 556, the observed total offspring.

Page 19: Chi sequare

Calculating the Chi-Square Value

• Use the formula.• X2 = (315 - 312.75)2 / 312.75

+ (101 - 104.25)2 / 104.25

+ (108 - 104.25)2 / 104.25

+ (32 - 34.75)2 / 34.75

= 0.016 + 0.101 + 0.135 + 0.218

= 0.470.

exp

exp)( 22 obs

Page 20: Chi sequare

D.F. and Critical Value• Degrees of freedom is 1 less than the number of

classes of offspring. Here, 4 - 1 = 3 d.f.• For 3 d.f. and p = 0.05, the critical chi-square value is

7.815.• Since the observed chi-square (0.470) is less than the

critical value, we fail to reject the null hypothesis. We accept Mendel’s conclusion that the observed results for a 9/16 : 3/16 : 3/16 : 1/16 ratio.

• It should be mentioned that all of Mendel’s numbers are unreasonably accurate.

Page 21: Chi sequare

Chi-Square Table

Page 22: Chi sequare

Consider Another example:For Drosophila melanogaster

• Gene affecting wing shape– c+ = Normal wing– c = Curved wing

• Gene affecting body color– e+ = Normal (gray)– e = ebony

• Note:– The wild-type allele is designated with a + sign– Recessive mutant alleles are designated with lowercase

letters

• The Cross:– A cross is made between two true-breeding flies (c+c+e+e+ and

ccee). The flies of the F1 generation are then allowed to mate with each other to produce an F2 generation.

Page 23: Chi sequare

• The outcome– F1 generation

• All offspring have straight wings and gray bodies– F2 generation

• 193 straight wings, gray bodies• 69 straight wings, ebony bodies• 64 curved wings, gray bodies• 26 curved wings, ebony bodies• 352 total flies

Page 24: Chi sequare

Applying the chi square test

Step 1: Propose a null hypothesis that allows us to calculate the expected values based on Mendel’s laws• The two traits are independently assorting

Page 25: Chi sequare

Step 2: Calculate the expected values of the four phenotypes, based on the hypothesis• According to our hypothesis, there should be a 9:3:3:1 ratio

on the F2 generation

Phenotype Expected probability

Expected number

Observed number

straight wings, gray bodies

9/16 9/16 X 352 = 198 193

straight wings, ebony bodies

3/16 3/16 X 352 = 66 64

curved wings, gray bodies

3/16 3/16 X 352 = 66 62

curved wings, ebony bodies

1/16 1/16 X 352 = 22 24

Page 26: Chi sequare

Step 3: Apply the chi square formula

c2 = (O1 – E1)2

E1

(O2 – E2)2

E2

(O3 – E3)2

E3

(O4 – E4)2

E4 + + +

c2 = (193 – 198)2

198 (69 – 66)2

66 (64 – 66)2

66 (26 – 22)2

22 + + +

c2 = 0.13 + 0.14 + 0.06 + 0.73

c2 = 1.06

Expected number

Observed number

198 193

66 64

66 62

22 24

Page 27: Chi sequare

Step 4: Interpret the chi square value– The calculated chi square value can be used to obtain

probabilities, or P values, from a chi square table• These probabilities allow us to determine the likelihood that the

observed deviations are due to random chance alone

– Low chi square values indicate a high probability that the observed deviations could be due to random chance alone

– High chi square values indicate a low probability that the observed deviations are due to random chance alone

– If the chi square value results in a probability that is less than 0.05 (ie: less than 5%) it is considered statistically significant• The hypothesis is rejected

Page 28: Chi sequare

Step 4: Interpret the chi square value

– Before we can use the chi square table, we have to determine the degrees of freedom (df)• The df is a measure of the number of categories that are

independent of each other• If you know the 3 of the 4 categories you can deduce the• df = n – 1

– where n = total number of categories• In our experiment, there are four phenotypes/categories

– Therefore, df = 4 – 1 = 3

– Refer to Table

Page 29: Chi sequare

1.06

Page 30: Chi sequare

Step 4: Interpret the chi square value

– With df = 3, the chi square value of 1.06 is slightly greater than 1.005 (which corresponds to P-value = 0.80)

– P-value = 0.80 means that Chi-square values equal to or greater than 1.005 are expected to occur 80% of the time due to random chance alone; that is, when the null hypothesis is true.

– Therefore, it is quite probable that the deviations between the observed and expected values in this experiment can be explained by random sampling error and the null hypothesis is not rejected. What was the null hypothesis?

Page 31: Chi sequare

If your hypothesis is supported by data • you are claiming that mating is random and so is

segregation and independent assortment.

If your hypothesis is not supported by data• you are seeing that the deviation between observed

and expected is very far apart…something non-random must be occurring….

Page 32: Chi sequare

Let’s look at an other fruit fly cross

x

Black body, eyeless wild

F1: all wild

Page 33: Chi sequare

F1 x F1

56101881

1896 622

Page 34: Chi sequare

Analysis of the results

• Once the numbers are in, you have to determine the cross that you were using.

• What is the expected outcome of this cross?• 9/16 wild type: 3/16 normal body eyeless:

3/16 black body wild eyes: 1/16 black body eyeless.

Page 35: Chi sequare

Now Conduct the Analysis:

Phenotype

Observed

Hypothesis

Wild

5610

Eyeless

1881

Black body

1896

Eyeless, black body

622

Total

10009

To compute the hypothesis value take 10009/16 = 626

Page 36: Chi sequare

Now Conduct the Analysis:

To compute the hypothesis value take 10009/16 = 626

Phenotype

Observed

Hypothesis

Wild

5610

5634

Eyeless

1881

1878

Black body

1896

1878

Eyeless, black body

622

626

Total

10009

Page 37: Chi sequare

• Using the chi square formula compute the chi square total for this cross:

• (5610 - 5630)2/ 5630 = .07• (1881 - 1877)2/ 1877 = .01• (1896 - 1877 )2/ 1877 = .20• (622 - 626) 2/ 626 = .02• 2= .30• How many degrees of freedom?

exp

exp)( 22 obs

Page 38: Chi sequare

• Using the chi square formula compute the chi square total for this cross:

• (5610 - 5630)2/ 5630 = .07• (1881 - 1877)2/ 1877 = .01• (1896 - 1877 )2/ 1877 = .20• (622 - 626) 2/ 626 = .02• 2= .30• How many degrees of freedom? 3

exp

exp)( 22 obs

Page 39: Chi sequare

CHI-SQUARE DISTRIBUTION TABLE

Accept Hypothesis Reject Hypothesis

Probability (p)

Degrees of Freedom

0.95 0.90 0.80 0.70 0.50 0.30 0.20 0.10 0.05 0.01 0.001

1 0.004 0.02 0.06 0.15 0.46 1.07 1.64 2.71 3.84 6.64 10.83

2 0.10 0.21 0.45 0.71 1.39 2.41 3.22 4.60 5.99 9.21 13.82

3 0.35 0.58 1.01 1.42 2.37 3.66 4.64 6.25 7.82 11.34 16.27

4 0.71 1.06 1.65 2.20 3.36 4.88 5.99 7.78 9.49 13.38 18.47

5 1.14 1.61 2.34 3.00 4.35 6.06 7.29 9.24 11.07 15.09 20.52

6 1.63 2.20 3.07 3.83 5.35 7.23 8.56 10.64 12.59 16.81 22.46

7 2.17 2.83 3.82 4.67 6.35 8.38 9.80 12.02 14.07 18.48 24.32

8 2.73 3.49 4.59 5.53 7.34 9.52 11.03 13.36 15.51 20.09 26.12

9 3.32 4.17 5.38 6.39 8.34 10.66 12.24 14.68 16.92 21.67 27.88

10 3.94 4.86 6.18 7.27 9.34 11.78 13.44 15.99 18.31 23.21 29.59

Page 40: Chi sequare

• When reporting chi square data use the following formula sentence….With degrees of freedom, my chi square

value is , which gives me a p value between % and %, I therefore

my null hypothesis.• This sentence would go in the “reults” section

of your formal lab.• Your explanation of the significance of this

data would go in the “discussion” section of the formal lab.

Page 41: Chi sequare

• Looking this statistic up on the chi square distribution table tells us the following:

• the P value read off the table places our chi square number of .30 close to .95 or 95%

• This means that 95% of the time when our observed data is this close to our expected data, this deviation is due to random chance.

• We therefore accept our null hypothesis.

Page 42: Chi sequare

• What is the critical value at which we would reject the null hypothesis?

• For three degrees of freedom this value for our chi square is > 7.815

• What if our chi square value was 8.0 with 4 degrees of freedom, do we accept or reject the null hypothesis?

• Accept, since the critical value is >9.48 with 4 degrees of freedom.