Kolmogorov-Smirnov and the others - UNIGEobseyer/CAFSTAT/paltani_060315.pdf · Correct usage of...
Transcript of Kolmogorov-Smirnov and the others - UNIGEobseyer/CAFSTAT/paltani_060315.pdf · Correct usage of...
1/23
How to compare distributions?
or rather
Has my sample been drawn from this distribution?
or even
Kolmogorov-Smirnov and the others...
Stéphane Paltani
ISDC/Observatoire de Genève
2/23
The Problem... (I)
Sample of observed values (random variable): {Xi}, i=1,...,N
Expected distribution: f(x), Xmin ≤ x ≤ Xmax
→ Is my {Xi}, i=1,...,N sample a probable outcome from a draw of N values
from a distribution f(X), Xmin ≤ X≤ Xmax ?
3/23
Example... (I)
An X-ray detector collects photons at specific times: {Xi}≡{ti}, i=1,...,N
Is the source constant ?
Expected distribution: tmin ≤ t ≤ tmax f t = 1 tmax−tmin
,
4/23
The Problem... (II)
Sample of observed values (random variable): {Xi}, i=1,...,N
Sample of observed values (random variable): {Yj}, j=1,...,M
→ Are the {Xi}, i=1,...,N distributed the same way as the {Yj}, j=1,...,M ?
5/23
Example... (II)
We measure the amplitude of variability of different types of active galactic nuclei.
Seyfert 1 galaxies: {σiS}, i=1,...,NQSOs: {σjQ}, j=1,...,MBL Lacs: {σkB}, k=1,...,PDo these different types of objects have different variability properties ?
6/23
Distribution Cumulatives
For continuous distributions, we can easily compare visually how well two distributions agree by building the cumulatives of these distributions. The cumulative of a distribution is simply its integral:
F x=∫X min
xf ydy
C {X i } x=∑i
H x−X i
Different approaches can then be followed to obtain a quantitative assessment
The cumulative of a sample can be calculated in a similar way:
(Heaviside function)
7/23
Statistique de Kolmogorov-Smirnov
The simplest one: We determine the maximum separation between the two cumulatives :
D = maxX min≤ x≤X max
∣C X i−F x−X min∣
8/23
Virtues of Kolmogorov-Smirnov (I)
It is possible to estimate the expected distribution of D in the case of a uniform distribution:
P D=2∑j=1
∞
−1 j−1 e−2j22
, =N0.120.11
N D
P(λ>D) is a one-sided probability distribution, so the smaller D, the larger the probability
If one compares two samples, their variance must be added, and:
N e=N⋅M
NM
9/23
Virtues of Kolmogorov-Smirnov (II)
D is preserved under any transformation x → y = ψ(x), where ψ(x) is an arbitrary strictly monotonic function
→ P(λ>D) does not depend on the shape of the underlying distributions !
10/23
Correct usage of Kolmogorov-Smirnov
● Decide for a threshold α, for instance 5%
● If P(λ>D)>α, then one cannot exclude that the sample has been drawn from the given distribution
● One can never be sure that the sample has been actually drawn from the given distribution
DO NOT SAY:
The probability that the sample has been drawn from this distribution is X%
SAY:
If P(λ>D)>α:We cannot exclude that the sample has been drawn from this distribution
If P(λ>D)<α:The probability of such a high KS value to be obtained is only X%
11/23
Caveats (I)
No correlation in the sample is allowed: Sample must be in the Poisson regime
There is a general problem among non-frequentists (the Bayesians) with the notion of null hypothesis: Such frequentist tests implicitly specify classes of alternative hypotheses and exclude others...
Furthermore, one may reject the hypothesis if the model parameters are slightly wrong
12/23
Caveats (II)
In cases where the model has parameters, if one estimates them using the same data,one cannot use KS statistics anymore!
→ Use Monte Carlo simulations
→ For Gaussian distribution, use Shapiro-Wilk normality test
13/23
Caveats (III)
Sensitivity of the Kolmogorov-Smirnov tes is not constant between Xmin and Xmaxsince the variance of C{Xi} is proportional to F(x-Xmin).(1-F(x-Xmin)), which
reaches a maximum for F(x-Xmin) = 0.5
80 photons/1000 addedin the tails
80 photons/1000 addedin the center
14/23
Non-uniformity Correction
Anderson-Darling's statistics:
D∗= maxX min≤x≤X max
∣C X i−F x−X min∣
F x−X min1−F x−X min
Or:
D∗∗= ∫X min≤ x≤X max
∣C X i−F x−X min∣
F x−X min1−F x−X mindF
One can think of other ones...
However, none of them provides a simple (or even workable) expression for
P(λ>D*(*))...
15/23
The simplest modification to KS is the best one!
V=DD−= maxX min≤x≤X max
C {X i}−F x−X min max
X min≤x≤X max
F x−X min−C {X i }
And:
P V =2∑j=1
∞
4j22−1e−2j22
, =N0.1550.24
N V
Kuiper's Statistics
16/23
Kuiper on a Circle
If the distribution is defined over a periodic domain, Kuiper's statistics does not depend on the choice of the origin, contrarily to KS.
60 photons/1000 addedin the tails
60 photons/1000 addedin the center
17/23
Standard test is Rayleigh test, which is nothing else than the Fourier power spectrum
for signals expressed in the form:
Application of Kuiper's statistics to the search of periodic signals
Rayleigh test Kuiper test
0 1
S t =∑i t−t i
18/23
0 1
Interrupted Observations
But X-ray observations are often interrupted because of various instrumental problems,
so the expected distribution is not uniform!
Kuiper test can take into account very naturally the non-uniformity of the phase
exposure map
“good time intervals” Phase exposure map Expected “uniform” distribution
19/23
Examples with simulated data
Continuouscase
Real GTIs Real GTIswith signal
20/23
Examples with real data
EX Hya
UW Pic
21/23
Ad nauseam: Kuiper for the fanatical... (I)
α,β :
22/23
Ad nauseam: Kuiper for the fanatical... (II)
Tail probabilities
Expected
Best
Asymptoticequation, instead of analytical expression
23/23
References
● Any text book in statistics...
● Press W. et al., Numerical Recipes in C,F777, ..., Sect. 14
● Kuiper N.H., 1962, Proc. Koninklijke Nederlandse Akad. Wetenshappen A 63, 38
● Stephens M.A., 1965, Biometrika 70, 11
● Paltani S., 2004, A&A 420, 789