K. Desch – Statistical methods of data analysis SS10 3. Distributions 3.6 Central limit theorem...
Date posted: 15-Jan-2016
K. Desch – Statistical methods of data analysis SS10
3. Distributions 3.6 Central limit theorem
Central limit theorem:
If $w = \sum_{i=1}^{n} x_i$ is a sum of n independent random variables $x_i$ with
arbitrary probability densities, each with expectation value $\langle x \rangle$ and finite variance $\sigma^2$,
then as $n \to \infty$, w becomes Gaussian with $E[w] = n\langle x \rangle$ and $V[w] = n\sigma^2$.
a) Expectation:
$E[w] = \langle w \rangle = E\left[\sum_{i=1}^{n} x_i\right] = \sum_{i=1}^{n} E[x_i] = n\,E[x] = n\langle x \rangle$
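b) Variance:
$V[w] = E[(w - \langle w \rangle)^2] = E\left[\left(\sum_i x_i - \sum_i \langle x_i \rangle\right)^2\right] = E\left[\left(\sum_i (x_i - \langle x_i \rangle)\right)^2\right]$
$= E\left[\sum_i (x_i - \langle x_i \rangle)^2 + \sum_{i \neq k} (x_i - \langle x_i \rangle)(x_k - \langle x_k \rangle)\right]$
$= \sum_i E[(x_i - \langle x_i \rangle)^2] + \sum_{i \neq k} \mathrm{cov}[x_i, x_k] = \sum_i \sigma_i^2 = n\sigma^2$, since $\mathrm{cov}[x_i, x_k] = 0$ for independent variables
(remember V[x+y] = V[x] + V[y] + 2 cov[x,y])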
c) Take $y_i = \dfrac{x_i - \langle x_i \rangle}{\sqrt{n}}$ instead of $x_i$;
then: $E[y_i] = 0$, $E[y_i^2] = \dfrac{\sigma^2}{n}$
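The characteristic function of $y_i$ is:
$\varphi_{y_i}(k) = \int e^{iky_i} f(y_i)\,dy_i = \sum_{m=0}^{\infty} \frac{k^m}{m!} \left.\frac{d^m \varphi_{y_i}}{dk^m}\right|_{k=0} = \sum_{m=0}^{\infty} \frac{(ik)^m}{m!} E[y_i^m]$
$= 1 + 0 - \frac{k^2\sigma^2}{2n} + \frac{(ik)^3}{3!}\,\frac{E[(x_i - \langle x_i \rangle)^3]}{n^{3/2}} + \dots$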
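For the sum $z = \sum_i y_i$:
$\varphi_z(k) = \prod_{i=1}^{n} \varphi_{y_i}(k) = \left[1 - \frac{k^2\sigma^2}{2n} + \frac{(ik)^3}{3!}\,\frac{E[(x_i - \langle x_i \rangle)^3]}{n^{3/2}} + \dots\right]^n$
$\xrightarrow{n \to \infty} \left(1 - \frac{k^2\sigma^2}{2n}\right)^n \to \exp\left(-\frac{1}{2}k^2\sigma^2\right)$, because the higher-order terms → 0.
This is the characteristic function of a Gaussian with mean 0 and variance $\sigma^2$, so $w = \sqrt{n}\,z + n\langle x \rangle$ is Gaussian with mean $n\langle x \rangle$ and variance $n\sigma^2$.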
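The statement can be checked numerically. The following is a minimal sketch, assuming exponentially distributed x_i with <x> = 1 and σ = 1 (a choice made here for illustration, not taken from the slides); the function names are hypothetical. It builds the standardized sum z from step c) and prints its mean, variance, and skewness, which should move towards 0, σ² = 1, and 0 as n grows.

```python
import numpy as np

def standardized_sums(n, n_experiments, rng):
    """Return z = sum_i (x_i - <x>) / sqrt(n) for exponential x_i with <x> = 1, sigma = 1."""
    x = rng.exponential(scale=1.0, size=(n_experiments, n))
    return (x - 1.0).sum(axis=1) / np.sqrt(n)

def skewness(z):
    """Sample skewness (third standardized moment); it vanishes for a Gaussian."""
    d = z - z.mean()
    return (d**3).mean() / (d**2).mean() ** 1.5

rng = np.random.default_rng(12345)
for n in (1, 10, 100):
    z = standardized_sums(n, 20_000, rng)
    print(f"n={n:4d}  E[z]={z.mean():+.3f}  V[z]={z.var():.3f}  skewness={skewness(z):.3f}")
```

The skewness is driven by the third-moment term of the expansion, which carries the factor n^{-3/2} and is the first correction to vanish in the limit.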
Notes:
- The CLT depends crucially on the convergence of the Taylor series, whose coefficients are moment integrals over the p.d.f.; convergence is slow in case of long tails (Breit-Wigner distribution, Landau distribution)
- The generalization to a sum of differently distributed variables is mathematically more difficult; convergence holds under certain conditions (generally met in physical problems)
Example:
The energy loss per unit length of ionizing particles, dE/dx, is approximated by a Landau distribution.
No precise measurement of dE/dx is possible from a single measurement, i.e. from the charge measurement in one detector layer.
Many layers: the mean approaches a Gaussian distribution.
- Get rid of outliers by forming a "truncated mean" (reject the F_up (~70%) largest single measurements)
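A truncated mean can be sketched as follows. This is an illustration under stated assumptions, not the procedure of any particular experiment: the Landau distribution is replaced by the Moyal distribution (a common analytic stand-in, sampled here as -ln(Z²) for standard-normal Z), the 159 layers per track are taken from the figure caption, and `truncated_mean` with its `keep_fraction` parameter and the value 0.7 are hypothetical choices.

```python
import numpy as np

def sample_moyal(size, rng):
    """Moyal-distributed numbers: if Z is standard normal, -ln(Z^2) follows a Moyal law."""
    z = rng.standard_normal(size)
    return -np.log(z**2)

def truncated_mean(values, keep_fraction):
    """Mean of the smallest keep_fraction of the values (the largest ones are rejected)."""
    values = np.sort(np.asarray(values, dtype=float))
    n_keep = max(1, int(round(keep_fraction * values.size)))
    return values[:n_keep].mean()

rng = np.random.default_rng(2010)
n_tracks, n_layers = 2000, 159
losses = sample_moyal((n_tracks, n_layers), rng)  # one row per track

plain = losses.mean(axis=1)
truncated = np.array([truncated_mean(track, keep_fraction=0.7) for track in losses])
print(f"spread of plain mean:     {plain.std():.3f}")
print(f"spread of truncated mean: {truncated.std():.3f}")
```

The truncated mean estimates a different quantity than the plain mean, but it fluctuates less from track to track because the long upper tail no longer contributes.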
(~159 measurements of the same particle)
3. Distributions 3.7 Uniform distribution
$f(x) = \begin{cases} \dfrac{1}{b-a} & a \le x \le b \\ 0 & \text{otherwise} \end{cases}$
$E[x] = \dfrac{a+b}{2}$
$V[x] = \dfrac{(b-a)^2}{12}$
Example: digital readout of Si-strip detector
Strip edges at a and b, pitch d = b − a
d = 50 μm
$\sigma_x = \dfrac{d}{\sqrt{12}} = 14.4\ \mu\text{m}$
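The resolution d/√12 can be reproduced with a short simulation. This is a sketch assuming that hits are spread uniformly across one strip and that the digital readout always reports the strip centre; the variable names are illustrative.

```python
import numpy as np

pitch = 50.0  # strip pitch d in micrometres, as in the example above
rng = np.random.default_rng(7)

# True hit positions, uniformly distributed across one strip centred at 0.
true_x = rng.uniform(-pitch / 2, pitch / 2, size=200_000)
# Digital readout only reports which strip fired, i.e. the strip centre.
measured_x = 0.0
residual = measured_x - true_x

print(f"simulated resolution: {residual.std():.2f} um")
print(f"d / sqrt(12):         {pitch / np.sqrt(12):.2f} um")
```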
Any continuous random variable x with probability density function f(x) and known cumulative distribution F(x) can be transformed into a new variable a that is uniformly distributed:
$a := F(x), \quad g(a) = f(x)\left|\dfrac{dx}{da}\right| = f(x)\left|\dfrac{da}{dx}\right|^{-1} = 1, \quad 0 \le a \le 1$
Important part of Monte Carlo methods
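Read backwards, the transformation gives a recipe for generating random numbers: draw a uniformly and solve a = F(x) for x. A minimal sketch, assuming an exponential density f(x) = exp(-x/τ)/τ as the target (chosen here because F can be inverted in closed form; the function name and τ = 2 are illustrative):

```python
import numpy as np

def sample_exponential(tau, size, rng):
    """Inverse-transform sampling: solve a = F(x) = 1 - exp(-x / tau) for x."""
    a = rng.uniform(0.0, 1.0, size)   # a is uniform on [0, 1)
    return -tau * np.log(1.0 - a)     # x = F^{-1}(a)

rng = np.random.default_rng(42)
tau = 2.0
x = sample_exponential(tau, 100_000, rng)

# Forward direction: a = F(x) should again be flat on [0, 1].
a_back = 1.0 - np.exp(-x / tau)
counts, _ = np.histogram(a_back, bins=10, range=(0.0, 1.0))
print("mean of x:", x.mean())
print("bin counts of F(x):", counts)
```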
3. Distributions 3.8 Breit-Wigner (Cauchy, Lorentz) distribution
$f(E; M, \Gamma) = \dfrac{1}{2\pi}\,\dfrac{\Gamma}{(E - M)^2 + \Gamma^2/4}$
Γ: "width" = FWHM
Distribution has long tails: the moment integrals do not converge
Mean, variance, and higher moments are not defined!
Physics: any resonance phenomenon
Sum of BW-distributed variables: again BW-distributed (the CLT does not apply)
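That averaging does not help for Breit-Wigner variables can be seen in a small experiment. The sketch below uses NumPy's `standard_cauchy`, which corresponds to M = 0 and Γ = 2, and measures the width of the distribution of sample means by the interquartile range (IQR), a choice made here because the variance does not exist; for this density the IQR equals Γ.

```python
import numpy as np

def iqr(values):
    """Interquartile range, a width measure that exists even without a variance."""
    q25, q75 = np.percentile(values, [25, 75])
    return q75 - q25

rng = np.random.default_rng(3)
n_experiments = 5000
widths = {}
for n in (1, 10, 400):
    # standard_cauchy: Breit-Wigner with M = 0 and Gamma = 2 (FWHM)
    means = rng.standard_cauchy((n_experiments, n)).mean(axis=1)
    widths[n] = iqr(means)
    print(f"n={n:4d}  IQR of the sample mean = {widths[n]:.2f}")
```

For a distribution with finite variance the width would shrink like 1/√n; here it stays at about 2 for every n.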
“Relativistic BW”:
3. Distributions 3.9 χ² distribution
Let x_1,…,x_n be independent Gauss-distributed random variables with μ_i = 0, σ_i = 1. Then the
sum of squares, $u = \chi^2 = \sum_{i=1}^{n} x_i^2$, follows a χ² distribution with n degrees of
freedom:
$f_n(u) = \dfrac{\frac{1}{2}\left(\frac{u}{2}\right)^{n/2-1} e^{-u/2}}{\Gamma(n/2)}$
The χ² distribution plays an important role in statistical tests
f_n(u) has its maximum at u = n − 2 (for n ≥ 2)
Mean E[u] = n
Variance V[u] = 2n
Approaches Gaussian for large n (CLT for x_i²)
Γ(x): Euler Gamma function (interpolation of factorial function)
$\mathrm{Prob}(\chi^2) = 1 - F_n(\chi^2) = 1 - \int_0^{\chi^2} f_n(u)\,du$
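Mean, variance, and Prob(χ²) can be checked by building u directly from its definition. A minimal sketch with hypothetical names; the observed value 3.0 is an arbitrary example, and n = 2 is used for the probability because there f_2(u) = exp(-u/2)/2 integrates in closed form.

```python
import numpy as np

def sample_chi2(n, size, rng):
    """u = sum of squares of n independent standard-normal variables."""
    x = rng.standard_normal((size, n))
    return (x**2).sum(axis=1)

rng = np.random.default_rng(9)
for n in (2, 10):
    u = sample_chi2(n, 200_000, rng)
    print(f"n={n:2d}  E[u]={u.mean():.3f} (expect {n})  V[u]={u.var():.3f} (expect {2 * n})")

# For n = 2 the density is f_2(u) = exp(-u/2) / 2, so Prob(chi2) = exp(-chi2 / 2).
u2 = sample_chi2(2, 200_000, rng)
chi2_obs = 3.0
p_mc = (u2 > chi2_obs).mean()
print(f"Prob(chi2 > {chi2_obs}) from MC: {p_mc:.4f}, exact: {np.exp(-chi2_obs / 2):.4f}")
```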
4. MC Methods 4.1 Random generators
• Monte Carlo methods are important statistical tools
• Simulation of random processes
MC simulation is a method for iteratively evaluating a deterministic model using sets of random numbers as input (D.E. Knuth, ‘The art of computer programming’)
Random generators: quick, machine independent
Example: Linear congruential generator (pseudo-random algorithm)
$I_j = (a \cdot I_{j-1} + c) \bmod m, \quad u_j = I_j / m$
a, c, m: constants; I_0 is a 'seed'
The period is at most m; the full period m is achieved when:
1. c ≠ 0
2. c and m are relatively prime
3. (a-1) is divisible by all prime factors of m
4. (a-1) is a multiple of 4 if m is a multiple of 4
Example: a = 205, c = 29573, m = 139968
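The recurrence and the period claim can be tried out directly. The following is a small sketch using the example constants from the slide; the function names and the seed are illustrative. Since m = 2^6 · 3^7 and a − 1 = 204 = 4 · 3 · 17, the four conditions hold and the measured period should equal m.

```python
from itertools import islice

def lcg(seed, a=205, c=29573, m=139968):
    """Yield (I_j, u_j) with I_j = (a * I_{j-1} + c) mod m and u_j = I_j / m."""
    i = seed % m
    while True:
        i = (a * i + c) % m
        yield i, i / m

def period(seed, a=205, c=29573, m=139968):
    """Count the steps until the integer state returns to its first value."""
    gen = lcg(seed, a, c, m)
    first, _ = next(gen)
    for steps, (state, _) in enumerate(gen, start=1):
        if state == first:
            return steps
        if steps > m:
            return None  # first state is never revisited: not a full-period generator

print("period:", period(seed=1))
print("first numbers:", [round(u, 4) for _, u in islice(lcg(seed=1), 5)])
```

A period of about 1.4 · 10^5 is far too short for real simulations; the constants serve only to make the full cycle easy to enumerate.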
'Fibonacci generators' are based on a generalization of the Fibonacci sequence; for example:
$U_n = (U_{n-24} + U_{n-55}) \bmod m$
Need to be initialized!
Generators in ROOT:
TRandom P ≈ 10^9 (linear congruential, don't use!!)
TRandom1 P ≈ 10^171
TRandom2 P ≈ 10^26
TRandom3 P ≈ 10^6000
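The lagged Fibonacci recurrence above can be sketched in a few lines. Assumptions made here for illustration: m = 2^32, and the 55 starting values are filled from Python's built-in `random.Random`; neither is specified on the slide, and the function name is hypothetical. This is not how the ROOT generators are implemented.

```python
from collections import deque
import random

def lagged_fibonacci(seed, short_lag=24, long_lag=55, m=2**32):
    """Yield u_n = U_n / m with U_n = (U_{n-24} + U_{n-55}) mod m."""
    # Initialisation: the first long_lag values must come from elsewhere.
    # Here Python's own generator fills them (an illustrative choice).
    init = random.Random(seed)
    state = deque((init.randrange(m) for _ in range(long_lag)), maxlen=long_lag)
    while True:
        new = (state[-short_lag] + state[-long_lag]) % m
        state.append(new)  # the oldest value drops out automatically
        yield new / m

gen = lagged_fibonacci(seed=2010)
print([round(next(gen), 4) for _ in range(5)])
```

The `deque` with `maxlen=55` keeps exactly the history the recurrence needs, which is why the generator has to be initialized with a full set of 55 numbers rather than a single seed value.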
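Tests for random generators
1. Uniform distribution: are the numbers uniform on [0,1]?
2. Correlation test:
Fill successive pairs in a 2D-histogram (see Blobel)
With N numbers filled into k bins and N_i entries in bin i:
$\chi^2 = \sum_{i=1}^{k} \dfrac{(N_i - N/k)^2}{N/k}$ follows a χ² distribution!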
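Both tests can be sketched with the statistic above. The example below applies them to NumPy's default generator; the function names, bin counts, and sample size are illustrative choices. For a good generator the values should scatter around the number of bins minus one (the degrees of freedom), here about 99 in both cases.

```python
import numpy as np

def uniformity_chi2(u, k):
    """chi2 = sum_i (N_i - N/k)^2 / (N/k) for N numbers filled into k equal bins."""
    counts, _ = np.histogram(u, bins=k, range=(0.0, 1.0))
    expected = len(u) / k
    return ((counts - expected) ** 2 / expected).sum()

def pair_chi2(u, k):
    """Same statistic for successive pairs (u_0,u_1), (u_2,u_3), ... in a k x k histogram."""
    pairs = u[: len(u) // 2 * 2].reshape(-1, 2)
    counts, _, _ = np.histogram2d(pairs[:, 0], pairs[:, 1], bins=k, range=[[0, 1], [0, 1]])
    expected = len(pairs) / k**2
    return ((counts - expected) ** 2 / expected).sum()

rng = np.random.default_rng(2024)
u = rng.random(100_000)
print("1D chi2 (k = 100 bins):  ", uniformity_chi2(u, 100))
print("2D chi2 (10 x 10 cells): ", pair_chi2(u, 10))
```

A much larger value signals non-uniformity or correlations between successive numbers; feeding in `u**2` instead of `u`, for instance, makes the 1D statistic explode.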