K. Desch – Statistical methods of data analysis SS10 3. Distributions 3.6 Central limit theorem...

  • date post

    15-Jan-2016

K. Desch – Statistical methods of data analysis SS10

3. Distributions 3.6 Central limit theorem

Central limit theorem:

If $w = \sum_{i=1}^{n} x_i$ is a sum of n independent random variables $x_i$ with arbitrary probability densities, each with expectation value $\langle x \rangle$ and variance $\sigma^2$, then as $n \to \infty$, w becomes Gaussian with $E[w] = n\langle x \rangle$ and $V[w] = n\sigma^2$.

a) Expectation:

$$E[w] = E\left[\sum_{i=1}^{n} x_i\right] = \sum_{i=1}^{n} E[x_i] = n\,E[x] = n\langle x \rangle$$

b) Variance:

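$$V[w] = E[(w - \langle w \rangle)^2] = E\left[\left(\sum_i x_i - \sum_i \langle x_i \rangle\right)^2\right] = E\left[\left(\sum_i (x_i - \langle x_i \rangle)\right)^2\right]$$

$$= E\left[\sum_i (x_i - \langle x_i \rangle)^2 + \sum_{i \neq k} (x_i - \langle x_i \rangle)(x_k - \langle x_k \rangle)\right]$$

The mixed terms vanish because the $x_i$ are independent, i.e. $\mathrm{cov}[x_i, x_k] = 0$ for $i \neq k$, so

$$V[w] = \sum_i E[(x_i - \langle x_i \rangle)^2] = \sum_i \sigma_i^2 = n\sigma^2$$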

(remember V[x+y] = V[x] + V[y] + 2 cov[x,y])
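
The expectation and variance results can be checked numerically. The following is a minimal sketch, assuming uniformly distributed x_i on [0, 1] (so ⟨x⟩ = 1/2 and σ² = 1/12); the function name `sample_sums`, the seed and the sample sizes are illustrative choices and not part of the lecture.

```python
import math
import random
import statistics

def sample_sums(n, n_samples, rng):
    """Return n_samples realisations of w = x_1 + ... + x_n, x_i uniform on [0, 1]."""
    return [sum(rng.random() for _ in range(n)) for _ in range(n_samples)]

rng = random.Random(1)          # fixed seed so the run is reproducible
n = 12
w = sample_sums(n, 20000, rng)

mean_w = statistics.fmean(w)
var_w = statistics.pvariance(w)

# Fraction of sums within one standard deviation of the mean;
# a Gaussian would give about 0.683.
sigma_w = math.sqrt(var_w)
frac_1sigma = sum(abs(v - mean_w) < sigma_w for v in w) / len(w)

# From the derivation: E[w] = n<x> = n/2 and V[w] = n sigma^2 = n/12
print(f"E[w]: sample {mean_w:.3f}, expected {n * 0.5:.3f}")
print(f"V[w]: sample {var_w:.3f}, expected {n / 12:.3f}")
print(f"fraction within 1 sigma: {frac_1sigma:.3f}")
```

Already for n = 12 the sum of uniform variables is close to Gaussian in its central region, which is why this sum was historically used as a simple Gaussian generator.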


c) Take $y_i = \dfrac{x_i - \langle x_i \rangle}{\sqrt{n}}$ instead of $x_i$; then:

$$E[y_i] = 0, \qquad E[y_i^2] = \frac{\sigma^2}{n}$$

The characteristic function of yi is:

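$$\phi_{y_i}(k) = \int e^{iky_i} f(y_i)\,dy_i = \sum_{m=0}^{\infty} \frac{k^m}{m!} \left.\frac{d^m \phi_{y_i}}{dk^m}\right|_{k=0} = \sum_{m=0}^{\infty} \frac{(ik)^m}{m!} E[y_i^m]$$

$$= 1 + 0 - \frac{k^2}{2n}\sigma^2 + \frac{(ik)^3}{3!} \frac{E[(x_i - \langle x_i \rangle)^3]}{n^{3/2}} + \dots$$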

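For the sum $z = \sum_i y_i$:

$$\phi_z(k) = \prod_{i=1}^{n} \phi_{y_i}(k) = \left(1 - \frac{k^2}{2n}\sigma^2 + \frac{(ik)^3}{3!} \frac{E[(x_i - \langle x_i \rangle)^3]}{n^{3/2}} + \dots\right)^n$$

For $n \to \infty$ the cubic and higher terms → 0 faster than the quadratic term, so

$$\phi_z(k) \to \left(1 - \frac{k^2\sigma^2}{2n}\right)^n \to \exp\left(-\frac{1}{2} k^2 \sigma^2\right),$$

which is the characteristic function of a Gaussian with mean 0 and variance $\sigma^2$.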
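
The last limit, $(1 - k^2\sigma^2/2n)^n \to \exp(-k^2\sigma^2/2)$, can be made concrete with a few numbers. This is only an illustrative sketch: the function names `phi_truncated` and `phi_gauss` and the values of k and σ are arbitrary choices, and the O(n^{-3/2}) terms are dropped by hand rather than computed.

```python
import math

def phi_truncated(k, sigma, n):
    """(1 - k^2 sigma^2 / (2n))^n: characteristic function of z without the higher-order terms."""
    return (1.0 - k * k * sigma * sigma / (2.0 * n)) ** n

def phi_gauss(k, sigma):
    """exp(-k^2 sigma^2 / 2): characteristic function of a Gaussian with mean 0, variance sigma^2."""
    return math.exp(-0.5 * k * k * sigma * sigma)

k, sigma = 1.5, 1.0
for n in (10, 100, 1000, 10000):
    diff = abs(phi_truncated(k, sigma, n) - phi_gauss(k, sigma))
    print(f"n = {n:6d}  |difference| = {diff:.2e}")
```

The difference shrinks roughly like 1/n, which is the rate at which the truncated product approaches the exponential.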


Notes:

- The CLT depends crucially on the convergence of the Taylor series (the moments are integrals over the p.d.f.): convergence is slow for distributions with long tails (Breit-Wigner distribution, Landau distribution)

- The generalization to a sum of variables with different p.d.f.s is mathematically more difficult; convergence holds under certain conditions (generally met in physical problems)


Example:

The energy loss per unit length of ionizing particles, dE/dx, is approximated by a Landau distribution.

A single measurement (i.e. the charge measured in one detector layer) gives no precise measurement of dE/dx.

Many layers: the average over the layers approaches a Gaussian distribution

- get rid of outliers by forming a "truncated mean" (reject the F_up (~70%) largest single measurements)
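
A truncated mean can be sketched in a few lines. Landau variates are not available in the Python standard library, so a Pareto distribution with a heavy tail stands in for the energy-loss distribution here; that substitution, the function name `truncated_mean`, the seed, and the reading of F_up as the rejected fraction of largest values are all assumptions made for illustration.

```python
import random
import statistics

def truncated_mean(values, f_up):
    """Mean of the values left after rejecting the fraction f_up of largest entries."""
    n_keep = max(1, round(len(values) * (1.0 - f_up)))
    return statistics.fmean(sorted(values)[:n_keep])

def relative_spread(xs):
    """Standard deviation divided by the mean, as a simple resolution measure."""
    return statistics.pstdev(xs) / statistics.fmean(xs)

rng = random.Random(7)
n_layers, n_tracks, f_up = 159, 2000, 0.7

plain, truncated = [], []
for _ in range(n_tracks):
    # Toy stand-in for Landau-like energy loss: Pareto tail with infinite variance
    samples = [rng.paretovariate(1.5) for _ in range(n_layers)]
    plain.append(statistics.fmean(samples))
    truncated.append(truncated_mean(samples, f_up))

print(f"plain mean:     relative spread {relative_spread(plain):.3f}")
print(f"truncated mean: relative spread {relative_spread(truncated):.3f}")
```

Because the rejected entries are exactly the ones that populate the long tail, the truncated mean fluctuates far less from track to track than the plain mean does in this toy setup.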


(~159 measurements of the same particle)


3. Distributions 3.7 Uniform distribution

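$$f(x) = \begin{cases} \dfrac{1}{b-a} & a \le x \le b \\ 0 & \text{otherwise} \end{cases}$$

$$E[x] = \frac{a+b}{2}$$

$$V[x] = \sigma^2 = \frac{(b-a)^2}{12}$$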

Example: digital readout of Si-strip detector

Strip boundaries a and b, pitch d = b − a

d = 50 μm

$$\sigma_x = \frac{d}{\sqrt{12}} = 14.4\ \mu\text{m}$$
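
The resolution d/√12 can be cross-checked with a short simulation. This sketch assumes an idealised digital readout in which the true hit position is uniform across one strip and the reconstructed position is the strip centre; the variable names and the seed are illustrative.

```python
import math
import random
import statistics

pitch_um = 50.0                      # strip pitch d from the example, in micrometres
sigma_analytic = pitch_um / math.sqrt(12.0)

# Monte Carlo cross-check: the residual (true position minus strip centre)
# is uniform in [-d/2, d/2] under the assumption stated above.
rng = random.Random(3)
residuals = [rng.uniform(-pitch_um / 2, pitch_um / 2) for _ in range(100000)]
sigma_mc = statistics.pstdev(residuals)

print(f"analytic  sigma_x = {sigma_analytic:.2f} um")
print(f"simulated sigma_x = {sigma_mc:.2f} um")
```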


Any continuous random variable x with probability density function f(x) and known cumulative distribution F(x) can be transformed into a new variable a, which is distributed uniformly:

$$a := F(x), \qquad g(a) = f(x) \left|\frac{dx}{da}\right| = f(x) \left|\frac{da}{dx}\right|^{-1} = \frac{f(x)}{f(x)} = 1, \qquad a \in [0, 1]$$

Important part of Monte Carlo methods
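
Read backwards, this transformation is the inverse-transform method of Monte Carlo sampling: draw a uniformly and solve a = F(x) for x. A minimal sketch, assuming an exponential p.d.f. f(x) = exp(−x/τ)/τ as the example distribution (the choice of distribution, the function name `sample_exponential` and the seed are illustrative):

```python
import math
import random
import statistics

def sample_exponential(tau, rng):
    """Inverse-transform sampling: solve a = F(x) = 1 - exp(-x/tau) for x."""
    a = rng.random()                 # a uniform in [0, 1)
    return -tau * math.log(1.0 - a)  # 1 - a lies in (0, 1], so the log is finite

rng = random.Random(11)
tau = 2.0
xs = [sample_exponential(tau, rng) for _ in range(50000)]

# Forward direction: a = F(x) should again be uniform on [0, 1]
a_values = [1.0 - math.exp(-x / tau) for x in xs]

print(f"mean of x:        {statistics.fmean(xs):.3f} (expected tau = {tau})")
print(f"mean of a = F(x): {statistics.fmean(a_values):.3f} (uniform: 0.5)")
print(f"variance of a:    {statistics.pvariance(a_values):.4f} (uniform: {1 / 12:.4f})")
```

The method needs F to be invertible in closed form or numerically; where that is impractical, other techniques such as acceptance-rejection are commonly used instead.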



3. Distributions 3.8 Breit-Wigner (Cauchy, Lorentz) distribution

$$f(E; M, \Gamma) = \frac{1}{2\pi} \frac{\Gamma}{(E - M)^2 + (\Gamma^2/4)}$$

$\Gamma$: "width" = FWHM

The distribution has long tails: the integrals defining its moments do not converge

Mean, variance and higher moments are not defined!

Physics: any resonance phenomenon

Sum of BW-distributed variables: again a BW distribution (the CLT does not apply)
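
That averaging does not help for Breit-Wigner variables can be seen in a small simulation. The sketch below draws values by inverting the cumulative distribution, E = M + (Γ/2) tan(π(u − 1/2)) with u uniform, and compares the interquartile range (a width measure that exists even without a variance) of single values with that of means of 20 values. The function names, the seed and the sample sizes are illustrative assumptions.

```python
import math
import random
import statistics

def sample_breit_wigner(mass, width, rng):
    """Draw E from a Breit-Wigner with peak `mass` and FWHM `width` by inverting its CDF."""
    u = rng.random()
    return mass + 0.5 * width * math.tan(math.pi * (u - 0.5))

def interquartile_range(values):
    """Distance between the 25% and 75% quantiles."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    return q3 - q1

rng = random.Random(5)
mass, width = 0.0, 1.0
n_samples = 10000

single = [sample_breit_wigner(mass, width, rng) for _ in range(n_samples)]
averaged = [
    statistics.fmean(sample_breit_wigner(mass, width, rng) for _ in range(20))
    for _ in range(n_samples)
]

# For a Breit-Wigner the quartiles sit at mass -/+ width/2, so the IQR equals the FWHM.
print(f"IQR of single values: {interquartile_range(single):.3f}")
print(f"IQR of means of 20:   {interquartile_range(averaged):.3f}")
```

For a distribution with finite variance the second number would be smaller by about √20; here both stay near Γ.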

“Relativistic BW”:


3. Distributions 3.9 χ² distribution

Let $x_1, \dots, x_n$ be independent Gaussian-distributed random variables with $\mu_i = 0$, $\sigma_i = 1$. Then the sum of squares, $u = \chi^2 = \sum_{i=1}^{n} x_i^2$, follows a χ² distribution with n degrees of freedom:

$$f_n(u) = \frac{\frac{1}{2} \left(\frac{u}{2}\right)^{n/2 - 1} e^{-u/2}}{\Gamma(n/2)}$$

The χ² distribution plays an important role in statistical tests

$f_n(u)$ has its maximum at u = n − 2 (for n ≥ 2)

Mean E[u] = n

Variance V[u] = 2n

Approaches a Gaussian for large n (CLT applied to the $x_i^2$)

Γ(x): Euler gamma function (interpolation of the factorial function)

For $u = \chi^2 = \sum_{i=1}^{n} x_i^2$:

$$\mathrm{Prob}(\chi^2) = 1 - F_n(\chi^2) = 1 - \int_0^{\chi^2} f_n(u)\,du$$
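
The probability Prob(χ²) can be evaluated directly from this integral and compared with a simulation built on the definition of u. The sketch below uses only the Python standard library; the midpoint-rule integration, the restriction to n ≥ 2 (for n = 1 the density diverges at u = 0), the function names and the seed are assumptions made for illustration, not the lecture's method.

```python
import math
import random

def chi2_pdf(u, n):
    """f_n(u) = (1/2) (u/2)^(n/2 - 1) exp(-u/2) / Gamma(n/2)."""
    return 0.5 * (u / 2.0) ** (n / 2.0 - 1.0) * math.exp(-u / 2.0) / math.gamma(n / 2.0)

def chi2_prob(chi2, n, steps=20000):
    """Prob(chi2) = 1 - integral_0^chi2 f_n(u) du, via the midpoint rule (assumes n >= 2)."""
    h = chi2 / steps
    integral = sum(chi2_pdf((j + 0.5) * h, n) for j in range(steps)) * h
    return 1.0 - integral

def chi2_prob_mc(chi2, n, n_samples, rng):
    """Same probability estimated from u = sum of n squared unit Gaussians."""
    hits = 0
    for _ in range(n_samples):
        u = sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(n))
        hits += u > chi2
    return hits / n_samples

rng = random.Random(2)
n, chi2 = 4, 6.0
print(f"numerical integral: {chi2_prob(chi2, n):.4f}")
print(f"Monte Carlo:        {chi2_prob_mc(chi2, n, 50000, rng):.4f}")
```

For even n the integral has the closed form exp(−χ²/2) Σ_{j<n/2} (χ²/2)^j / j!, which gives a convenient independent check of the numerical result.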


4. MC Methods 4.1 Random generators

• Monte Carlo methods are important statistical tools
• Simulation of random processes

MC simulation is a method for iteratively evaluating a deterministic model using sets of random numbers as input (D.E. Knuth, ‘The art of computer programming’)

Random generators: quick, machine independent

Example: linear congruential generator (pseudo-random algorithm)

$I_j = (a \cdot I_{j-1} + c) \bmod m$, $u_j = I_j / m$

a, c, m: constants; $I_0$ is the 'seed'

The period is at most m; this maximum is achieved when:
1. c ≠ 0
2. c and m are relatively prime
3. (a − 1) is divisible by all prime factors of m
4. (a − 1) is a multiple of 4 if m is a multiple of 4

Example: a = 205, c = 29573, m = 139968
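
The recurrence and the period statement can be tried out directly with the example constants. This is a minimal sketch; the function names `lcg` and `lcg_period` and the seed value are illustrative and not taken from the lecture.

```python
def lcg(seed, a=205, c=29573, m=139968):
    """Yield u_j = I_j / m with I_j = (a * I_{j-1} + c) mod m, starting from I_0 = seed."""
    state = seed
    while True:
        state = (a * state + c) % m
        yield state / m

def lcg_period(seed, a=205, c=29573, m=139968):
    """Count the steps until the integer state returns to the seed (None if it never does)."""
    state = seed
    for steps in range(1, m + 1):
        state = (a * state + c) % m
        if state == seed:
            return steps
    return None

gen = lcg(seed=12345)
print([round(next(gen), 4) for _ in range(5)])
print("period:", lcg_period(12345))
```

With m = 2^6 · 3^7, a − 1 = 204 = 2^2 · 3 · 17 and an odd c that is not divisible by 3, all four conditions hold, so the counted period equals m.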


'Fibonacci generators' are based on a generalization of the Fibonacci sequence; for example: $U_n = (U_{n-24} + U_{n-55}) \bmod m$

They need to be initialized!

Generators in ROOT:
TRandom  P ≈ 10^9 (linear congruential: don't use!!)
TRandom1 P ≈ 10^171
TRandom2 P ≈ 10^26
TRandom3 P ≈ 10^6000
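
The lagged recurrence and its need for initialization can be illustrated as follows. This is a sketch under stated assumptions: the modulus m = 2^32, the helper `simple_seed_values` (which fills the 55-entry lag table with a linear congruential generator) and all function names are hypothetical choices, and the code is not the algorithm used by any of the ROOT classes listed above.

```python
from collections import deque

def lagged_fibonacci(seed_values, m=2**32, short_lag=24, long_lag=55):
    """Yield U_n = (U_{n-24} + U_{n-55}) mod m; seed_values supplies the first 55 integers."""
    if len(seed_values) != long_lag:
        raise ValueError(f"need exactly {long_lag} initial values")
    if all(v % 2 == 0 for v in seed_values):
        raise ValueError("at least one initial value must be odd")
    history = deque((v % m for v in seed_values), maxlen=long_lag)
    while True:
        # history[0] is U_{n-55}; history[-short_lag] is U_{n-24}
        new = (history[-short_lag] + history[0]) % m
        history.append(new)      # maxlen drops the oldest entry automatically
        yield new

def simple_seed_values(seed, count=55, m=2**32):
    """Hypothetical initialisation: fill the lag table with a linear congruential generator."""
    values, state = [], seed
    for _ in range(count):
        state = (1664525 * state + 1013904223) % m
        values.append(state)
    return values

gen = lagged_fibonacci(simple_seed_values(2010))
print([next(gen) / 2**32 for _ in range(3)])
```

The generator keeps only the last 55 values, so its state is much larger than that of a linear congruential generator, which is what makes the long periods possible.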


Tests for random generators

1. Uniform distribution: are the numbers uniformly distributed in [0, 1]?

2. Correlation test:

Fill successive pairs in a 2D-histogram (see Blobel)

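For N numbers filled into k bins of equal size, with $N_i$ entries in bin i:

$$\chi^2 = \sum_{i=1}^{k} \frac{(N_i - N/k)^2}{N/k}$$

follows a χ² distribution!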
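
Both tests can be sketched with the same statistic. The code below is an illustration under assumptions: it tests Python's built-in generator rather than any generator from the lecture, uses non-overlapping successive pairs for the 2D histogram, and the function names, bin counts and seed are arbitrary choices.

```python
import random

def chi2_uniformity(values, k):
    """chi^2 = sum_i (N_i - N/k)^2 / (N/k) for values in [0, 1) filled into k equal bins."""
    counts = [0] * k
    for u in values:
        counts[min(int(u * k), k - 1)] += 1
    expected = len(values) / k
    return sum((n_i - expected) ** 2 / expected for n_i in counts)

def chi2_pairs(values, k):
    """Same statistic for successive non-overlapping pairs in a k x k 2D histogram."""
    counts = [[0] * k for _ in range(k)]
    n_pairs = len(values) // 2
    for j in range(n_pairs):
        ix = min(int(values[2 * j] * k), k - 1)
        iy = min(int(values[2 * j + 1] * k), k - 1)
        counts[ix][iy] += 1
    expected = n_pairs / (k * k)
    return sum((c - expected) ** 2 / expected for row in counts for c in row)

rng = random.Random(42)
numbers = [rng.random() for _ in range(100000)]

# For a good generator each value should be comparable to its number of degrees of freedom
print(f"1D test, 100 bins:     chi2 = {chi2_uniformity(numbers, 100):.1f} (99 d.o.f.)")
print(f"2D test, 10 x 10 bins: chi2 = {chi2_pairs(numbers, 10):.1f} (99 d.o.f.)")
```

A sequence can pass the 1D test and still fail the pair test: a strictly alternating sequence of small and large values fills the 1D bins evenly but puts every pair into the same 2D cell.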