The logic behind a statistical test.
A statistical test is the comparison of the probabilities in favour of a hypothesis H1 with the respective probabilities of an appropriate null hypothesis H0.
                      Hypothesis correct          Hypothesis wrong
Hypothesis accepted   1 - β (power of a test)     Type I error (α)
Hypothesis rejected   Type II error (β)           1 - α

Accepting the wrong hypothesis H1 is termed type I error. Rejecting the correct hypothesis H1 is termed type II error.
Lecture 11: Parametric hypothesis testing
Testing simple hypotheses
Karl Pearson tossed a coin 24000 times and wanted to see whether, in the real world, deviations from the expectation of 12000 numbers and 12000 eagles (the two sides of the coin) occur. He got numbers 12012 times. Does this result deviate from our expectation?
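The exact solution of the binomial:
$$p(X \ge 12012) = \sum_{i=12012}^{24000} \binom{24000}{i} 0.5^{24000}$$
The normal approximation:
$$\mu = np = 24000 \cdot 0.5 = 12000, \qquad \sigma^2 = npq = np(1-p) = 24000 \cdot 0.5 \cdot 0.5 = 6000$$
$$p(X \ge 12012) = 1 - p(X < 12012) = 1 - \Phi\left(\frac{12012 - 12000}{\sqrt{6000}}\right) = 0.438$$
where $\Phi$ is the cumulative standard normal distribution.
$$CL_{0.95} = \bar{x} \pm 1.96 s = 12000 \pm 1.96 \sqrt{6000} = 12000 \pm 151.8$$
χ² test:
$$\chi^2 = \frac{(12000 - 11988)^2}{12000} + \frac{(12000 - 12012)^2}{12000} = 0.024$$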
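The three approaches to the coin example can be checked numerically. The following is a minimal sketch using only the Python standard library; the variable names are illustrative and not part of the lecture.

```python
import math
from statistics import NormalDist

n, k, p = 24000, 12012, 0.5

# Exact binomial tail P(X >= k): add up the binomial coefficients with
# exact integer arithmetic and divide once by 2**n at the end.
coeff = math.comb(n, k)
tail = 0
for i in range(k, n + 1):
    tail += coeff
    coeff = coeff * (n - i) // (i + 1)  # C(n, i+1) derived from C(n, i)
p_exact = tail / 2 ** n

# Normal approximation with mu = np and sigma^2 = np(1 - p).
mu = n * p
sigma = math.sqrt(n * p * (1 - p))
p_normal = 1 - NormalDist(mu, sigma).cdf(k)

# Half width of the 95% confidence limits around the expectation.
half_width = 1.96 * sigma

# Chi-square statistic for the two outcome classes.
chi2 = (mu - (n - k)) ** 2 / mu + (mu - k) ** 2 / mu

print(round(p_exact, 3), round(p_normal, 3), round(half_width, 1), round(chi2, 3))
```

The exact tail and the normal approximation differ slightly because the approximation above applies no continuity correction.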
Assume a sum of variances of Z-transformed variables:
$$\chi^2_n = \sum_{i=1}^{n} (x_i - E[x])^2 = (x_1 - E[x])^2 + (x_2 - E[x])^2 + (x_3 - E[x])^2 + \dots + (x_n - E[x])^2$$
Each variance is one. Thus the expected value of χ² is n.
The χ² distribution is a family of distributions of variances that depends on the number of elements n.
Observed values of χ² can be compared to predicted values and allow for statistical hypothesis testing.
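That the expected value of χ² equals n can be illustrated by simulation. This is a rough sketch: the seed, the number of trials and the choice n = 5 are arbitrary assumptions made only for the illustration.

```python
import random

random.seed(1)  # fixed seed so the run is repeatable


def chi2_sample(n):
    """Sum of n squared standard normal (Z-transformed) values."""
    return sum(random.gauss(0.0, 1.0) ** 2 for _ in range(n))


n = 5           # number of elements (illustrative choice)
trials = 20000  # number of simulated chi-square values

mean_chi2 = sum(chi2_sample(n) for _ in range(trials)) / trials
print(round(mean_chi2, 2))  # should lie close to n
```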
Pearson’s coin example
Probability of H0
9 times green, yellow seed
3 times green, green seed
3 times yellow, yellow seed
1 time yellow, green seed

Combination   Ratio   Observed   Predicted
GY            9       61         65.25
GG            3       16         21.75
YY            3       28         21.75
YG            1       11         7.25
Sum           16      116        116
[Figure: bar chart of the number of observations (observed and predicted) for each character combination GY, GG, YY, YG.]
Does the observation confirm the prediction?
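$$\chi^2 = \sum_{i=1}^{K} \frac{(\text{expected value}_i - \text{observed value}_i)^2}{\text{expected value}_i}$$
$$\chi^2 = \frac{(65.25 - 61)^2}{65.25} + \frac{(21.75 - 16)^2}{21.75} + \frac{(21.75 - 28)^2}{21.75} + \frac{(7.25 - 11)^2}{7.25} \approx 5.53$$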
The χ² test has K - 1 degrees of freedom.
All statistical programs give the probability level for the null hypothesis H0, that is, the probability of a χ² value at least as large as the observed one if H0 is true.
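The seed-colour example can be reproduced as follows. This is a sketch, not the output of any particular statistics program; `chi2_sf` is a hypothetical helper written here for integer degrees of freedom, using the closed forms of the χ² upper tail.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of the chi-square distribution (integer df)."""
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{k < df/2} (x/2)^k / k!
        term = math.exp(-half)
        total = term
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return total
    # Odd df: start from the 1-df tail erfc(sqrt(x/2)) and add the
    # half-integer terms (x/2)^(k+1/2) * exp(-x/2) / Gamma(k+3/2).
    total = math.erfc(math.sqrt(half))
    term = math.sqrt(half) * math.exp(-half) / math.gamma(1.5)
    for k in range((df - 1) // 2):
        total += term
        term *= half / (k + 1.5)
    return total


ratios = [9, 3, 3, 1]          # expected 9:3:3:1 ratio
observed = [61, 16, 28, 11]    # GY, GG, YY, YG

total = sum(observed)
expected = [total * r / sum(ratios) for r in ratios]
chi2 = sum((e - o) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1         # K - 1 degrees of freedom
p_value = chi2_sf(chi2, df)

print(expected, round(chi2, 2), df, round(p_value, 3))
```

With these counts the probability stays well above 0.05, so the observations do not contradict the predicted ratio.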
Advice for applying a χ² test
• χ² tests compare observations and expectations. Total numbers of observations and expectations must be equal.
• The absolute values should not be too small (as a rule the smallest expected value should be larger than 10). At small event numbers the Yates correction should be used.
• The classification of events must be unequivocal.
• χ² tests were found to be quite robust. That means they are conservative and rather favour H0, the hypothesis of no deviation.
• The applicability of the χ² test does not depend on the underlying distributions. They need not be normally or binomially distributed.
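Dealing with frequencies:
$$\chi^2 = N \sum_{i=1}^{K} \frac{(\text{expected frequency}_i - \text{observed frequency}_i)^2}{\text{expected frequency}_i}$$
With the Yates correction:
$$\chi^2 = \sum_{i=1}^{K} \frac{(|\text{expected value}_i - \text{observed value}_i| - 0.5)^2}{\text{expected value}_i}$$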
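The frequency form and the Yates correction can be sketched in a few lines. The counts below (7 and 3 against an expectation of 5 and 5) are chosen purely for illustration, and the function names are hypothetical.

```python
def chi2_statistic(observed, expected, yates=False):
    """Chi-square from absolute counts, optionally with the Yates correction."""
    correction = 0.5 if yates else 0.0
    return sum((abs(e - o) - correction) ** 2 / e
               for o, e in zip(observed, expected))


def chi2_from_frequencies(obs_freq, exp_freq, n):
    """Chi-square from relative frequencies and the total number n."""
    return n * sum((fe - fo) ** 2 / fe for fo, fe in zip(obs_freq, exp_freq))


observed = [7, 3]   # illustrative small counts
expected = [5, 5]

plain = chi2_statistic(observed, expected)
corrected = chi2_statistic(observed, expected, yates=True)
from_freq = chi2_from_frequencies([0.7, 0.3], [0.5, 0.5], 10)

print(plain, corrected, round(from_freq, 6))
```

The correction shrinks each absolute deviation by 0.5 before squaring, which lowers χ² and makes the test more conservative at small counts.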
G-test or log likelihood test
The χ² test relies on absolute differences between observed and expected frequencies. However, it is also possible to take the quotient L = observed / expected as a measure of goodness of fit:
$$G = 2 \ln(L) = 2 \ln\left(\frac{p_{observed}}{p_{expected}}\right)$$
$$G = 2 \sum_{i=1}^{k} O_i \ln\left(\frac{O_i}{E_i}\right)$$
G is approximately χ² distributed with k - 1 degrees of freedom.
A species-area relation is expected to follow a power function of the form $S = 10 A^{0.3}$. Do the following data points (area, species number) confirm this expectation: A1 (1, 12), A2 (2, 18), A3 (4, 14), A4 (8, 30), A5 (16, 35), A6 (32, 38), A7 (64, 33), A8 (128, 35), A9 (256, 56), A10 (512, 70)? We try different tests.
Area   Richness   Estimate   Chi2       G
1      12         10         0.4        2.187859
2      18         12.31144   2.628422   6.837165
4      14         15.15717   0.088343   -1.11183
8      30         18.66066   6.890466   14.24339
16     35         22.97397   6.295189   14.73452
32     38         28.28427   3.337381   11.22065
64     33         34.82202   0.095335   -1.7735
128    35         42.87094   1.445074   -7.09961
256    56         52.78032   0.196406   3.315948
512    70         64.98019   0.387787   5.208893
Sum                          21.7644    95.52699
df                           9          9
Chi2 distribution            0.009656   1.26E-16

The G column lists the terms O ln(O/E); the G sum is twice the column total.
[Figure: species richness plotted against area on linear axes.]
Both tests indicate that the regression line doesn’t fit.
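The two test statistics for the expected power function can be recomputed directly from the data points. This is a minimal sketch with illustrative variable names; it reproduces the sums only, not the probabilities.

```python
import math

area = [1, 2, 4, 8, 16, 32, 64, 128, 256, 512]
richness = [12, 18, 14, 30, 35, 38, 33, 35, 56, 70]

# Expected species numbers from S = 10 * A^0.3.
expected = [10 * a ** 0.3 for a in area]

# Chi-square: squared deviations scaled by the expectation.
chi2 = sum((e - o) ** 2 / e for o, e in zip(richness, expected))

# G: twice the sum of O * ln(O / E).
g = 2 * sum(o * math.log(o / e) for o, e in zip(richness, expected))

df = len(area) - 1
print(round(chi2, 4), round(g, 4), df)
```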
[Figure: species richness plotted against area on double logarithmic axes.]
The pattern is better seen in a double log plot.
We have seven points above and 3 points below the regression line.
Is there a systematic error?
Tests for systematic errors.
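The binomial:
$$p(x \le 3) = \sum_{i=0}^{3} \binom{10}{i} \left(\frac{1}{2}\right)^{10} = 0.17$$
The χ² test:
$$\chi^2 = \frac{(7 - 5)^2}{5} + \frac{(3 - 5)^2}{5} = 1.6, \qquad p(1.6; 1) = 0.21$$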
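Both checks for a systematic error (seven points above and three below the line) are short enough to compute by hand or in code. The sketch below uses the standard library only; for one degree of freedom the χ² upper tail equals `erfc(sqrt(x / 2))`.

```python
import math

above, below = 7, 3
n = above + below

# Binomial: probability of 3 or fewer points on one side if both sides
# are equally likely.
p_binomial = sum(math.comb(n, i) for i in range(below + 1)) / 2 ** n

# Chi-square with an expectation of n / 2 points on each side.
expected = n / 2
chi2 = (above - expected) ** 2 / expected + (below - expected) ** 2 / expected
p_chi2 = math.erfc(math.sqrt(chi2 / 2))  # upper tail for 1 degree of freedom

print(round(p_binomial, 2), chi2, round(p_chi2, 2))
```

Neither probability falls below 0.05, so the split of 7 to 3 alone does not demonstrate a systematic error.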
Area   Richness   Estimate   Chi2       G
1      12         13.584     0.184707   -1.48783
2      18         16.17099   0.206868   1.928747
4      14         19.25067   1.432132   -4.45884
8      30         22.91684   2.189268   8.079756
16     35         27.28122   2.183902   8.720228
32     38         32.47677   0.939318   5.968316
64     33         38.66179   0.829135   -5.22536
128    35         46.0247    2.640844   -9.58406
256    56         54.78984   0.026729   1.223428
512    70         65.22425   0.349683   4.946478
Sum                          10.98258   20.22174
df                           9          9
Chi2 distribution            0.276905   0.016592
[Figure: species richness plotted against area with the fitted power function $y = 13.584 x^{0.2515}$.]
Now we try the best fit model.
The G-test identified even the best fit model as having larger deviations than expected from a simple normal random sample model.
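The parameters of the best fit model can be recovered with an ordinary least-squares regression of ln(richness) on ln(area). That the lecture's fit was obtained this way is an assumption, although the sketch below arrives at the same coefficients as $y = 13.584 x^{0.2515}$.

```python
import math

area = [1, 2, 4, 8, 16, 32, 64, 128, 256, 512]
richness = [12, 18, 14, 30, 35, 38, 33, 35, 56, 70]

# Linearise the power function S = a * A^b: ln S = ln a + b * ln A.
x = [math.log(a) for a in area]
y = [math.log(s) for s in richness]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Ordinary least-squares slope and intercept in log-log space.
slope = (sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
         / sum((xi - mean_x) ** 2 for xi in x))
intercept = mean_y - slope * mean_x

a, b = math.exp(intercept), slope
print(round(a, 3), round(b, 4))
```

A fit in log-log space minimises relative rather than absolute deviations, which is one reason why tests on the untransformed counts can still flag the model.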
Observation and expectation can be compared by a Kolmogorov-Smirnov test.
The test compares the maximum cumulative deviation with that expected from a normal
distribution.
Area   Richness   Estimate   Difference   Cumulative difference
1      12         13.584     -1.584       -1.584
2      18         16.17099   1.829006     0.245006
4      14         19.25067   -5.25067     -5.00566
8      30         22.91684   7.083156     2.077496
16     35         27.28122   7.718776     9.796272
32     38         32.47677   5.523225     15.3195
64     33         38.66179   -5.66179     9.657709
128    35         46.0247    -11.0247     -1.36699
256    56         54.78984   1.210161     -0.15683
512    70         65.22425   4.775754     4.618923
Maximum                                    15.3195
df                                         9
Chi2 distribution                          0.082526
Probability of difference                  0.917474
Kolmogorov-Smirnov test
Both results are qualitatively identical but differ quantitatively.
The programs use different algorithms.
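The maximum cumulative deviation in the table can be reproduced with a running sum of the differences between observation and estimate. This sketch mirrors the table's arithmetic only; it is not a full Kolmogorov-Smirnov implementation and does not compute a probability.

```python
from itertools import accumulate

richness = [12, 18, 14, 30, 35, 38, 33, 35, 56, 70]
# Estimates of the best fit model, copied from the table above.
estimate = [13.584, 16.17099, 19.25067, 22.91684, 27.28122,
            32.47677, 38.66179, 46.0247, 54.78984, 65.22425]

# Difference between observation and estimate for each area.
differences = [o - e for o, e in zip(richness, estimate)]

# Running sum of the differences and its largest absolute value.
cumulative = list(accumulate(differences))
max_deviation = max(abs(c) for c in cumulative)

print(round(max_deviation, 4))
```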
2x2 contingency table
1000 Drosophila flies with normal and curled wings and two alleles A and B supposed to influence wing form.

         A     B     Sum
Curled   110   475   585
Normal   90    325   415
Sum      200   800   1000

Do flies with allele A have curled wings more often than flies with allele B?
Combination         Observed   Predicted   Chi2
A-curled            110        117         0.418803
A-normal            90         83          0.590361
B-curled            475        468         0.104701
B-normal            325        332         0.14759
Sum                 1000       1000        1.261456
Chi2 distribution                          0.73830541

Sum curled 585
Sum normal 415
Sum A 200
Sum B 800
$$\chi^2 = \frac{(117 - 110)^2}{117} + \frac{(83 - 90)^2}{83} + \frac{(468 - 475)^2}{468} + \frac{(332 - 325)^2}{332} = 1.26$$
A contingency table χ² test with n rows and m columns has (n-1) * (m-1) degrees of freedom.
The 2x2 table has 1 degree of freedom.
Predicted number of allele A and curled wings:
$$P(A \wedge \text{curled}) = \frac{585}{1000} \cdot 200 = 117$$
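The predicted cell counts and the χ² value of the Drosophila table follow from the row and column sums alone. The sketch below is one way to compute them with the standard library. With 1 degree of freedom the upper-tail probability comes out near 0.26; the value 0.738 listed in the table matches the upper tail for 3 degrees of freedom instead.

```python
import math

# Rows: curled, normal. Columns: allele A, allele B.
table = [[110, 475],
         [90, 325]]

row_sums = [sum(row) for row in table]
col_sums = [sum(col) for col in zip(*table)]
total = sum(row_sums)

# Expected count of a cell = row sum * column sum / total.
expected = [[r * c / total for c in col_sums] for r in row_sums]

chi2 = sum((e - o) ** 2 / e
           for obs_row, exp_row in zip(table, expected)
           for o, e in zip(obs_row, exp_row))

df = (len(table) - 1) * (len(table[0]) - 1)
p_value = math.erfc(math.sqrt(chi2 / 2))  # this closed form holds for df == 1 only

print(expected, round(chi2, 4), df, round(p_value, 3))
```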
Relative abundance distributions
[Figure: relative abundance and log relative abundance plotted against species rank order. Labels: dominant species, intermediate species, rare species, the hollow curve, evenness.]
Abundance is the total number of individuals in a population.
Density refers to the number of individuals in a unit of measurement.
The log-normal distribution
[Figure: log relative abundance plotted against species rank order.]
The distribution of species abundance distributions across vertebrates and invertebrates
Three types of distributions: log-series, power function, lognormal.
We compare 99 such distributions from all over the world.
Vertebrates
Distribution     Number   Good fit   Intermediate fit   Bad fit
Lognormal        59       29         21                 9
Logseries        59       17         14                 28
Power function   59       13         24                 22

Invertebrates
Distribution     Number   Good fit   Intermediate fit   Bad fit
Lognormal        40       19         12                 9
Logseries        40       9          14                 17
Power function   40       12         14                 14
Row and column sums are identical due to our classification. We expect equal entries for each cell:
$$P(\text{Distr} \wedge \text{Fit class}) = \frac{(29 + 17 + 13)(29 + 21 + 9)}{59 + 59 + 59} = \frac{59}{3} = 19.67$$
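The expected cell entry can be checked from the row and column sums of the vertebrate table. The sketch below also carries the comparison through to a χ² value; treating the 3x3 table as a contingency table with (3-1) * (3-1) = 4 degrees of freedom is an assumption about how the comparison would be run, not a result quoted in the lecture.

```python
import math

# Vertebrates. Rows: lognormal, logseries, power function.
# Columns: good fit, intermediate fit, bad fit.
vertebrates = [[29, 21, 9],
               [17, 14, 28],
               [13, 24, 22]]

row_sums = [sum(row) for row in vertebrates]
col_sums = [sum(col) for col in zip(*vertebrates)]
total = sum(row_sums)

# Equal row and column sums give the same expectation for every cell.
expected = [[r * c / total for c in col_sums] for r in row_sums]

chi2 = sum((e - o) ** 2 / e
           for obs_row, exp_row in zip(vertebrates, expected)
           for o, e in zip(obs_row, exp_row))

df = (len(vertebrates) - 1) * (len(vertebrates[0]) - 1)
# Closed form of the chi-square upper tail for 4 degrees of freedom.
p_value = math.exp(-chi2 / 2) * (1 + chi2 / 2)

print(round(expected[0][0], 2), round(chi2, 2), df, p_value)
```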
Do vertebrates and invertebrates differ in abundance distributions?
Number of log-normal best fits only: vertebrates observed 29; expected from the invertebrates $19 \cdot 59 / 40 = 28$.
But if we take the whole pattern we get:
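Bivariate comparisons of means

Student’s t-test for equal sample sizes and similar variances:
$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s_1^2}{n} + \frac{s_2^2}{n}}} = \sqrt{n}\, \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{s_1^2 + s_2^2}}$$
The test statistic relates the difference of the means to the sum of the variances.

Welch t-test for unequal variances and sample sizes:
$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$$
$$df = \frac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}{\frac{s_1^4}{n_1^2 (n_1 - 1)} + \frac{s_2^4}{n_2^2 (n_2 - 1)}}$$

F-test:
$$F = \frac{\sigma_1^2}{\sigma_2^2}, \qquad df_1 = n_1 - 1, \quad df_2 = n_2 - 1$$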
In a physiological experiment mean metabolic rates were measured. A first treatment gave mean = 100, variance = 45, a second treatment mean = 120, variance = 55. In the first case 30 animals, in the second case 50 animals were tested. Do means and variances differ?
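$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} = \frac{120 - 100}{\sqrt{\frac{55}{50} + \frac{45}{30}}} = 12.4$$
Degrees of freedom: N1 + N2 - 2.
The program gives the probability level for the null hypothesis.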
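The t value of the metabolism example can be recomputed from the summary statistics. The sketch below uses the unequal-variance (Welch) form of the statistic together with its approximate degrees of freedom; `welch_t` is a hypothetical helper name, and the treatment with 50 animals is taken as the first sample.

```python
import math


def welch_t(mean1, var1, n1, mean2, var2, n2):
    """t statistic and approximate degrees of freedom for unequal variances."""
    se2_1 = var1 / n1   # squared standard error of the first mean
    se2_2 = var2 / n2   # squared standard error of the second mean
    t = (mean1 - mean2) / math.sqrt(se2_1 + se2_2)
    df = (se2_1 + se2_2) ** 2 / (se2_1 ** 2 / (n1 - 1) + se2_2 ** 2 / (n2 - 1))
    return t, df


t, df_welch = welch_t(120, 55, 50, 100, 45, 30)
df_pooled = 50 + 30 - 2   # N1 + N2 - 2

print(round(t, 1), round(df_welch, 1), df_pooled)
```

With either choice of degrees of freedom a t value above 12 lies far beyond the usual critical values, so the means differ.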
The comparison of variances
$$F = \frac{\sigma_1^2}{\sigma_2^2} = \frac{55}{45} = 1.22, \qquad F(50; 30)$$
Degrees of freedom: N - 1.
The probability for the null hypothesis of no difference, H0.
One-sided test: 1 - 0.287 = 0.713 is the probability that the first variance (n = 50) is larger than the second (n = 30).
Two-sided test: 0.57 = 2 * 0.287. Past gives the probability for a two-sided test that one variance is either larger or smaller than the second.
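Power analysis
$$t = \frac{\mu_1 - \mu_2}{\sqrt{\sigma_1^2 + \sigma_2^2}} \sqrt{N} \quad \Rightarrow \quad N = \frac{t^2 (\sigma_1^2 + \sigma_2^2)}{(\mu_1 - \mu_2)^2}$$
The ratio $(\mu_1 - \mu_2) / \sqrt{\sigma_1^2 + \sigma_2^2}$ is the effect size.
In an experiment you estimated two means:
$$\bar{x}_1 = 150,\ s_1 = 20; \qquad \bar{x}_2 = 180,\ s_2 = 50$$
Each time you took 20 replicates. Was this sample size large enough to confirm differences between both means?
We use the t-distribution with 19 degrees of freedom.
$$N = \frac{2.09^2 (50^2 + 20^2)}{(180 - 150)^2} \approx 14.1$$
Rounded up, you needed 15 replicates to confirm a difference at the 5% error level.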
The t-test can be used to estimate the number of observations to detect a significant signal for a given effect size.
In a physiological experiment we want to test whether a certain drug enhances short-term memory.
How many persons should you test (with and without the treatment) to confirm a difference in memory of about 5%?
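We don’t know the variances and assume a Poisson random sample. Hence $s^2 = \mu$.
$$t = \frac{1.05\mu - \mu}{\sqrt{1.05\mu + \mu}} \sqrt{N} = \frac{0.05\mu}{\sqrt{2.05\mu}} \sqrt{N}$$
With the mean scaled to $\mu = 1$:
$$N = \frac{2.05}{0.05^2}\, t^2 = 820\, t^2$$
We don’t know the degrees of freedom. We use a large number and get t = 1.96:
$$N = 820 \cdot (1.96)^2 = 3150$$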
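Both sample-size estimates of this section follow from the same rearranged t formula. The sketch below is a minimal illustration; `required_n` is a hypothetical helper, and the mean of 1 in the memory example is an assumption (memory measured in units of the untreated mean), under which the Poisson variances are 1.05 and 1.

```python
import math


def required_n(t, mean1, var1, mean2, var2):
    """Replicates needed so that the given t value is reached."""
    return t ** 2 * (var1 + var2) / (mean1 - mean2) ** 2


# Two means with standard deviations 20 and 50, t for 19 degrees of freedom.
n_means = required_n(2.09, 150, 20 ** 2, 180, 50 ** 2)

# Memory example: 5% difference, Poisson assumption (variance = mean),
# mean scaled to 1 (assumption), t for a large number of degrees of freedom.
mu = 1.0
n_memory = required_n(1.96, 1.05 * mu, 1.05 * mu, mu, mu)

print(math.ceil(n_means), round(n_memory))
```

Under the Poisson assumption the required N scales with 1/μ, so a larger mean count per person would lower the estimate.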
Homework and literature
Refresh:
• χ² test
• Mendel rules
• t-test
• F-test
• Contingency table
• G-test
Prepare for the next lecture:
• Coefficient of correlation
• Maximum, minimum of functions
• Matrix multiplication
• Eigenvalue
Literature:
Łomnicki: Statystyka dla biologów