
880.P20 Winter 2006 Richard Kass

Binomial Probability Distribution

$$P(m,N,p) = \frac{N!}{m!\,(N-m)!}\,p^m q^{N-m}$$

For the binomial distribution, P is the probability of m successes out of N trials. Here p is the probability of a success and q = 1 − p is the probability of a failure: there are only two choices in a binomial process. Tossing a coin N times and asking for m heads is a binomial process. The binomial coefficient keeps track of the number of ways ("combinations") we can get the desired outcome. 2 heads in 4 tosses: HHTT, HTHT, HTTH, THHT, THTH, TTHH.

Does this formula make sense, e.g. if we sum over all possibilities do we get 1? To show that this distribution is normalized properly, first remember the Binomial Theorem:

$$(a+b)^k = \sum_{l=0}^{k} \binom{k}{l} a^{k-l} b^l$$

For this example a = q = 1 − p and b = p, and (by definition) a + b = 1.

$$\sum_{m=0}^{N} P(m,N,p) = \sum_{m=0}^{N} \binom{N}{m} p^m q^{N-m} = (p+q)^N = 1$$

Thus the distribution is normalized properly.

What is the mean of this distribution?

$$\mu = \frac{\sum_{m=0}^{N} m\,P(m,N,p)}{\sum_{m=0}^{N} P(m,N,p)} = \sum_{m=0}^{N} m\,P(m,N,p) = \sum_{m=0}^{N} m \binom{N}{m} p^m q^{N-m}$$

where $\binom{N}{m} = {}_{N}C_{m}$ is the binomial coefficient. A cute way of evaluating the above sum is to take the derivative. Since $\sum_{m=0}^{N} \binom{N}{m} p^m (1-p)^{N-m} = 1$ for all p,

$$\frac{\partial}{\partial p}\left[\sum_{m=0}^{N} \binom{N}{m} p^m (1-p)^{N-m}\right] = 0$$

$$\sum_{m=0}^{N} \binom{N}{m} m\,p^{m-1} (1-p)^{N-m} - \sum_{m=0}^{N} \binom{N}{m} p^m (N-m)(1-p)^{N-m-1} = 0$$

$$p^{-1} \sum_{m=0}^{N} \binom{N}{m} m\,p^{m} (1-p)^{N-m} = (1-p)^{-1} \left[ N \sum_{m=0}^{N} \binom{N}{m} p^m (1-p)^{N-m} - \sum_{m=0}^{N} \binom{N}{m} m\,p^m (1-p)^{N-m} \right]$$

$$p^{-1}\mu = (1-p)^{-1}\left[N(1) - \mu\right]$$

Solving for $\mu$ gives

$$\mu = Np$$
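As a quick numerical sanity check (a sketch, not part of the original notes), the normalization and the mean $\mu = Np$ can be verified by direct summation for an illustrative choice of N = 10, p = 0.3:

```python
# Direct check of the binomial normalization and mean mu = N*p.
from math import comb

def binomial_p(m, N, p):
    """P(m,N,p) = N!/(m!(N-m)!) * p^m * q^(N-m), with q = 1-p."""
    return comb(N, m) * p**m * (1 - p)**(N - m)

N, p = 10, 0.3                      # illustrative values, not from the notes
total = sum(binomial_p(m, N, p) for m in range(N + 1))
mean  = sum(m * binomial_p(m, N, p) for m in range(N + 1))

print(round(total, 12))             # 1.0 -> the distribution is normalized
print(round(mean, 12))              # 3.0 -> equals N*p
```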


Suppose you observed m special events (or successes) in a sample of N events. The measured probability (sometimes called the "efficiency") for a special event to occur is $\epsilon = m/N$. What is the error (standard deviation $\sigma_\epsilon$) in $\epsilon$? Since N is a fixed quantity, it is plausible (we will show it soon) that the error in $\epsilon$ is related to the error (standard deviation $\sigma_m$) in m by $\sigma_\epsilon = \sigma_m/N$. This leads to:

$$\sigma_\epsilon = \frac{\sigma_m}{N} = \frac{\sqrt{Npq}}{N} = \frac{\sqrt{N\epsilon(1-\epsilon)}}{N} = \sqrt{\frac{\epsilon(1-\epsilon)}{N}}$$

This is sometimes called the "error on the efficiency". Thus you want to have a sample (N) as large as possible to reduce the uncertainty in the probability measurement!

What's the variance of a binomial distribution? Using a trick similar to the one used for the average, we find:

$$\sigma^2 = \frac{\sum_{m=0}^{N} (m-\mu)^2\,P(m,N,p)}{\sum_{m=0}^{N} P(m,N,p)} = Npq$$

Detection efficiency and its "error": note that $\sigma_\epsilon$, the "error in the efficiency", $\to 0$ as $\epsilon \to 0$ or $\epsilon \to 1$. (This is NOT a gaussian, so don't stick it into a Gaussian pdf to calculate probability.)
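The error-on-the-efficiency formula can be sketched as follows (the counts 45/100 and 180/400 are made up for illustration):

```python
# sigma_eps = sqrt(eps*(1-eps)/N): the "error on the efficiency".
from math import sqrt

def efficiency_error(m, N):
    """Measured efficiency eps = m/N and its binomial error."""
    eps = m / N
    return eps, sqrt(eps * (1 - eps) / N)

eps, err = efficiency_error(45, 100)    # 45 successes in 100 trials
print(f"eps = {eps:.3f} +/- {err:.3f}")

# Quadrupling the sample size at the same efficiency halves the error:
eps2, err2 = efficiency_error(180, 400)
print(f"eps = {eps2:.3f} +/- {err2:.3f}")
```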


When a γ-ray goes through material there is a chance that it will convert into an electron-positron pair, e⁺e⁻. Let's assume the probability for conversion is 10%. If 100 γ's go through this material, on average how many will convert to e⁺e⁻? $\mu = Np = 100(0.1) = 10$ conversions. Consider the case where the γ's come from π⁰'s, which decay to two γ's most (98.8%) of the time. We can ask the following:

What is the probability that both γ's will convert? P(2) = probability of 2/2 = (0.1)² = 0.01 = 1%.
What is the probability that exactly one will convert? P(1) = probability of 1/2 = [2!/(1!1!)](0.1)¹(0.9)¹ = 18%.
What is the probability that both γ's will not convert? P(0) = probability of 0/2 = [2!/(0!2!)](0.1)⁰(0.9)² = 81%.
Note: P(2) + P(1) + P(0) = 100%. Finally, the probability of at least one conversion is P(≥1) = 1 − P(0) = 19%.
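The conversion numbers above are just the binomial distribution with N = 2 and p = 0.1, which a few lines of code reproduce:

```python
# pi0 -> gamma gamma conversion probabilities: binomial with N=2, p=0.1.
from math import comb

p = 0.1
P = {m: comb(2, m) * p**m * (1 - p)**(2 - m) for m in range(3)}

print(round(P[2], 4))       # 0.01 -> both photons convert (1%)
print(round(P[1], 4))       # 0.18 -> exactly one converts (18%)
print(round(P[0], 4))       # 0.81 -> neither converts (81%)
print(round(1 - P[0], 4))   # 0.19 -> at least one conversion (19%)
```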


Poisson Probability Distribution

Another important discrete distribution is the Poisson distribution. Consider the following conditions:

a) p is very small and approaches 0. For example, suppose we had a 100-sided die instead of a 6-sided die; here p = 1/100 instead of 1/6. Suppose we had a 1000-sided die, p = 1/1000... etc.

b) N is very large; it approaches ∞. For example, instead of throwing 2 dice we could throw 100 or 1000 dice.

c) The product Np is finite.

A good example of the above conditions occurs when one considers radioactive decay. Suppose we have 25 mg of an element; this is ~10²⁰ atoms. Suppose the lifetime (τ) of this element is τ = 10¹² years ≈ 5×10¹⁹ seconds. The probability of a given nucleus decaying in one second is 1/τ = 2×10⁻²⁰/sec. For this example: N = 10²⁰ (very large), p = 2×10⁻²⁰ (very small), Np = 2 (finite!). We can derive an expression for the Poisson distribution by taking the appropriate limits of the binomial distribution.

$$P(m,N,p) = \frac{N!}{m!\,(N-m)!}\,p^m q^{N-m}$$

Using condition b), with $N \gg m$, we obtain:

$$\frac{N!}{(N-m)!} = \frac{N(N-1)\cdots(N-m+1)\,(N-m)!}{(N-m)!} \approx N^m$$

$$q^{N-m} = (1-p)^{N-m} = 1 - p(N-m) + \frac{p^2(N-m)(N-m-1)}{2!} - \cdots \approx 1 - pN + \frac{(pN)^2}{2!} - \cdots \approx e^{-pN}$$

Putting this all together we obtain:

$$P(m,N,p) \approx \frac{N^m p^m e^{-pN}}{m!} = \frac{e^{-\mu}\mu^m}{m!}$$

Here we've let $\mu = pN$. It is easy to show that:

$\mu = Np$ = mean of a Poisson distribution
$\sigma^2 = Np = \mu$ = variance of a Poisson distribution

Note: m is always an integer $\ge 0$; $\mu$, however, does not have to be an integer.

The Poisson distribution describes radioactive decay, the number of Prussian soldiers kicked to death by horses per year per army corps(!), quality control, and failure rate predictions. In a counting experiment, if you observe m events the error is $\sqrt{m}$.
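The limit taken above can be checked numerically (a sketch, not from the notes): holding $\mu = Np = 2$ fixed, the binomial probabilities converge to the Poisson ones as N grows.

```python
# Numerical check that the binomial -> Poisson as N grows with mu = N*p fixed.
from math import comb, exp, factorial

def binom(m, N, p):
    return comb(N, m) * p**m * (1 - p)**(N - m)

def poisson(m, mu):
    return exp(-mu) * mu**m / factorial(m)

mu = 2.0
diffs = []
for N in (10, 100, 10000):
    p = mu / N
    d = max(abs(binom(m, N, p) - poisson(m, mu)) for m in range(10))
    diffs.append(d)
    print(N, round(d, 6))   # the largest difference shrinks as N grows
```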


Radioactivity Example:

a) What's the probability of zero decays in one second if the average $\mu = 2$ decays/sec?

$$P(0,2) = \frac{e^{-2}\,2^0}{0!} = \frac{e^{-2}\cdot 1}{1} = e^{-2} \approx 0.135 = 13.5\%$$

b) What's the probability of more than one decay in one second if the average $\mu = 2$ decays/sec?

$$P(>\!1,2) = 1 - P(0,2) - P(1,2) = 1 - \frac{e^{-2}\,2^0}{0!} - \frac{e^{-2}\,2^1}{1!} = 1 - e^{-2} - 2e^{-2} \approx 0.594 = 59.4\%$$
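Parts a) and b) take only a few lines to reproduce:

```python
# Radioactivity example worked numerically with mu = 2 decays/sec.
from math import exp, factorial

def poisson(m, mu):
    return exp(-mu) * mu**m / factorial(m)

mu = 2.0
p0 = poisson(0, mu)                         # a) zero decays
p_gt1 = 1 - poisson(0, mu) - poisson(1, mu) # b) more than one decay
print(round(p0, 3))      # 0.135 -> 13.5%
print(round(p_gt1, 3))   # 0.594 -> 59.4%
```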

c) Estimate the most probable number of decays/sec.

We want the value $m^*$ that maximizes the probability:

$$\left.\frac{\partial}{\partial m} P(m,\mu)\right|_{m^*} = 0$$

To solve this problem it's convenient to maximize $\ln P(m,\mu)$ instead of $P(m,\mu)$:

$$\ln P(m,\mu) = \ln\!\left(\frac{e^{-\mu}\mu^m}{m!}\right) = -\mu + m\ln\mu - \ln m!$$

In order to handle the factorial when we take the derivative we use Stirling's Approximation: $\ln(m!) \approx m\ln(m) - m$.

$$\frac{\partial}{\partial m}\ln P(m,\mu) = \frac{\partial}{\partial m}\left(-\mu + m\ln\mu - \ln m!\right) \approx \frac{\partial}{\partial m}\left(-\mu + m\ln\mu - m\ln m + m\right) = \ln\mu - \ln m^* - 1 + 1 = 0$$

$m^* = \mu$. In this example the most probable value for m is just the average of the distribution. Therefore, if you observed m events in an experiment, the error on m is $\sqrt{m}$. Caution: the above derivation is only approximate, since we used Stirling's Approximation, which is only valid for large m. Another subtle point is that strictly speaking m can only take on integer values, while $\mu$ is not restricted to be an integer.

[Figure: comparison of binomial and Poisson distributions with mean = 1. Left panel: probability vs. m for a binomial with N = 3, p = 1/3 and the corresponding Poisson. Right panel: the same for N = 10, p = 0.1. Not much difference between them here!]

Check of Stirling's approximation: $\ln 10! = 15.10$ while $10\ln 10 - 10 = 13.03$ (off by 14%); $\ln 50! = 148.48$ while $50\ln 50 - 50 = 145.60$ (off by 1.9%).
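The two Stirling check values quoted above can be reproduced directly (using `lgamma(m+1)` for the exact $\ln m!$):

```python
# Stirling's approximation ln(m!) ~ m*ln(m) - m for m = 10 and m = 50.
from math import lgamma, log

rows = []
for m in (10, 50):
    exact = lgamma(m + 1)          # ln(m!) computed exactly
    approx = m * log(m) - m        # Stirling's approximation
    rows.append((exact, approx))
    print(m, round(exact, 2), round(approx, 2),
          f"{100 * (exact - approx) / exact:.1f}% off")
```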


[Figure: histogram of the number of occurrences vs. the number of cosmic rays in a 15 sec interval, compared with a Poisson with $\mu = 5.4$.]

Counting the number of cosmic rays that pass through a detector in 15 sec intervals:

  counts   occurrences
     0          0
     1          2
     2          9
     3         11
     4          8
     5         10
     6         17
     7          6
     8          8
     9          6
    10          3
    11          0
    12          0
    13          1

The data are compared with a Poisson using the measured average number of cosmic rays passing through the detector in eighty-one 15 sec intervals ($\mu = 5.4$).

Error bars are (usually) calculated using $\sqrt{n_i}$ ($n_i$ = number in bin i). Why? Assume we have N total counts and the probability to fall in bin i is $p_i$. For a given bin we have a binomial distribution (you're either in or out). The expected average number in a given bin is $Np_i$ and the variance is $Np_i(1-p_i) = n_i(1-p_i)$. If we have a lot of bins then the probability of an event falling into any given bin is small, so $(1-p_i) \approx 1$.

In our example the largest $p_i = 17/81 = 0.21$, for which the correction would be $(1-0.21)^{1/2} = 0.88$.
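The comparison described above can be sketched in a few lines: recompute $\mu$ from the table, then print each bin with its $\sqrt{n_i}$ error bar next to the Poisson expectation.

```python
# Cosmic-ray counts table vs. a Poisson with the measured mean,
# with sqrt(n_i) error bars attached to each bin.
from math import exp, factorial, sqrt

occurrences = [0, 2, 9, 11, 8, 10, 17, 6, 8, 6, 3, 0, 0, 1]  # counts 0..13
N = sum(occurrences)                                          # 81 intervals
mu = sum(k * n for k, n in enumerate(occurrences)) / N
print(round(mu, 1))                                           # 5.4

for k, n in enumerate(occurrences):
    expected = N * exp(-mu) * mu**k / factorial(k)
    print(f"{k:2d}  {n:3d} +/- {sqrt(n):4.1f}   poisson: {expected:5.1f}")
```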


Gaussian Probability Distribution

The Gaussian probability distribution (or "bell-shaped curve" or Normal distribution) is perhaps the most used distribution in all of science. Unlike the binomial and Poisson distributions, the Gaussian is a continuous distribution. It is given by:

$$p(y) = \frac{1}{\sigma\sqrt{2\pi}}\,e^{-\frac{(y-\mu)^2}{2\sigma^2}}$$

with $\mu$ = mean of the distribution (also at the same place as the mode and median), $\sigma^2$ = variance of the distribution, and y a continuous variable ($-\infty < y < \infty$).

The probability (P) of y being in the range [a, b] is given by an integral:

$$P(a \le y \le b) = \frac{1}{\sigma\sqrt{2\pi}}\int_a^b e^{-\frac{(y-\mu)^2}{2\sigma^2}}\,dy$$

Since this integral cannot be evaluated in closed form for arbitrary a and b (at least no one's figured out how to do it in the last couple of hundred years), the values of the integral have to be looked up in a table.

It is very unlikely (<0.3%) that a measurement taken at random from a gaussian pdf will be more than $\pm 3\sigma$ from the true mean of the distribution.

The total area under the curve is normalized to one. In terms of the probability integral we have:

$$P(-\infty \le y \le \infty) = \frac{1}{\sigma\sqrt{2\pi}}\int_{-\infty}^{\infty} e^{-\frac{(y-\mu)^2}{2\sigma^2}}\,dy = 1$$

Quite often we talk about a measurement being a certain number of standard deviations ($\sigma$) away from the mean ($\mu$) of the Gaussian. We can associate a probability for a measurement to be more than $n\sigma$ from the mean just by calculating the area outside of this region:

  n      Prob. of exceeding ±nσ
  0.67   0.5
  1      0.32
  2      0.05
  3      0.003
  4      0.00006
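Rather than looking the values up in a table, the two-sided tail probability is $\mathrm{erfc}(n/\sqrt{2})$; a short sketch (not from the notes) reproduces the table above:

```python
# Probability for a gaussian measurement to be more than n*sigma from
# the mean: the two-sided tail is erfc(n/sqrt(2)).
from math import erfc, sqrt

tails = {n: erfc(n / sqrt(2)) for n in (0.67, 1, 2, 3, 4)}
for n, p in tails.items():
    print(f"{n:4}  {p:.2g}")   # ~0.5, 0.32, 0.05, 0.003, 0.00006
```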


Central Limit Theorem

Why is the gaussian pdf so important? "Things that are the result of the addition of lots of small effects tend to become Gaussian." The above is a crude statement of the Central Limit Theorem. A more exact statement is: Let $Y_1, Y_2, \ldots, Y_n$ be an infinite sequence of independent random variables, each with the same probability distribution. Suppose that the mean ($\mu$) and variance ($\sigma^2$) of this distribution are both finite. Then for any numbers a and b:

$$\lim_{n\to\infty} P\!\left(a \le \frac{Y_1 + Y_2 + \cdots + Y_n - n\mu}{\sigma\sqrt{n}} \le b\right) = \frac{1}{\sqrt{2\pi}}\int_a^b e^{-y^2/2}\,dy$$

Thus the C.L.T. tells us that, under a wide range of circumstances, the probability distribution that describes the sum of random variables tends towards a Gaussian distribution as the number of terms in the sum $\to \infty$.

Alternatively,

$$\lim_{n\to\infty} P\!\left(a \le \frac{\bar{Y} - \mu}{\sigma/\sqrt{n}} \le b\right) = \lim_{n\to\infty} P\!\left(a \le \frac{\bar{Y} - \mu}{\sigma_m} \le b\right) = \frac{1}{\sqrt{2\pi}}\int_a^b e^{-y^2/2}\,dy$$

Note: $\sigma_m = \sigma/\sqrt{n}$ is sometimes called "the error in the mean" (more on that later).

For the CLT to be valid: the $\mu$ and $\sigma$ of the pdf must be finite, and no one term in the sum should dominate the sum. (Actually, the Y's can even be from different pdf's!)


Best illustration of the CLT:

a) Take 12 numbers ($r_i$) from your computer's random number generator
b) Add them together
c) Subtract 6
d) You get a number that is from a gaussian pdf!

A computer's random number generator gives numbers distributed uniformly in the interval [0,1]. A uniform pdf in the interval [0,1] has $\mu = 1/2$ and $\sigma^2 = 1/12$. Applying the CLT with $n = 12$, $\mu = 1/2$, and $\sigma = \sqrt{1/12}$, the denominator is $\sigma\sqrt{n} = \sqrt{1/12}\,\sqrt{12} = 1$:

$$P\!\left(-6 \le \frac{\sum_{i=1}^{12} r_i - 12\cdot\tfrac{1}{2}}{\sqrt{1/12}\,\sqrt{12}} \le 6\right) = P\!\left(-6 \le \sum_{i=1}^{12} r_i - 6 \le 6\right) \approx \frac{1}{\sqrt{2\pi}}\int_{-6}^{6} e^{-y^2/2}\,dy$$

Thus the sum of 12 uniform random numbers minus 6 is distributed as if it came from a gaussian pdf with $\mu = 0$ and $\sigma = 1$.

[Figure: histograms of A) 5000 random numbers; B) 5000 pairs ($r_1+r_2$) of random numbers; C) 5000 triplets ($r_1+r_2+r_3$) of random numbers; D) 5000 12-plets ($r_1+\cdots+r_{12}$) of random numbers; E) 5000 12-plets ($r_1+\cdots+r_{12}-6$) of random numbers compared with a Gaussian with $\mu = 0$ and $\sigma = 1$. In this case, 12 is close to $\infty$.]
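The recipe above is a one-liner in code (a standalone sketch, not from the notes): 5000 samples of "sum 12 uniforms, subtract 6" should have a sample mean near 0 and a sample standard deviation near 1.

```python
# The 12-uniform-numbers trick: an approximately standard-normal sample.
import random
from statistics import mean, stdev

random.seed(1)   # fixed seed so the result is reproducible
samples = [sum(random.random() for _ in range(12)) - 6 for _ in range(5000)]
print(round(mean(samples), 2), round(stdev(samples), 2))  # close to 0 and 1
```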


Central Limit Theorem

Example: An electromagnetic calorimeter is being made out of a sandwich of lead and plastic scintillator. There are 25 pieces of lead and 25 pieces of plastic; each piece is nominally 1 cm thick. The spec on the thickness is ±0.5 mm, uniform in [−0.5, 0.5] mm. The calorimeter has to fit inside an opening of 50.1 cm. What is the probability that it won't fit? Since the machining errors come from a uniform distribution with a well-defined mean and variance, the Central Limit Theorem is applicable:

$$\lim_{n\to\infty} P\!\left(a \le \frac{Y_1 + Y_2 + \cdots + Y_n - n\mu}{\sigma\sqrt{n}} \le b\right) = \frac{1}{\sqrt{2\pi}}\int_a^b e^{-y^2/2}\,dy$$

Here $n = 50$, $\mu = 0$, and $\sigma = \sqrt{1/12}$ mm. The upper limit corresponds to many large machining errors, all +0.5 mm:

$$b = \frac{Y_1 + Y_2 + \cdots + Y_n - n\mu}{\sigma\sqrt{n}} = \frac{50(0.5) - 50(0)}{\sqrt{1/12}\,\sqrt{50}} = 12.2$$

The lower limit corresponds to a sum of machining errors of 1 mm, the point at which the stack just fails to fit:

$$a = \frac{Y_1 + Y_2 + \cdots + Y_n - n\mu}{\sigma\sqrt{n}} = \frac{1 - 50(0)}{\sqrt{1/12}\,\sqrt{50}} = 0.49$$

The probability for the stack to be greater than 50.1 cm is:

$$P = \frac{1}{\sqrt{2\pi}}\int_{0.49}^{12.2} e^{-y^2/2}\,dy \approx 0.31$$

There's a 31% chance the calorimeter won't fit inside the opening (and a 100% chance someone will get fired if it doesn't fit inside the box...)!
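This example can also be checked by brute-force Monte Carlo (a sketch, assuming as above that the stack fails to fit once the summed machining error exceeds 1 mm, which is what the quoted limit a = 0.49 and the 31% answer imply):

```python
# Monte Carlo check of the calorimeter example: 50 pieces, each with a
# uniform machining error in [-0.5, 0.5] mm; the stack fails to fit if
# the summed error exceeds 1 mm. The CLT predicts ~31%.
import random

random.seed(2)   # fixed seed for reproducibility
trials = 200_000
too_big = sum(
    sum(random.uniform(-0.5, 0.5) for _ in range(50)) > 1.0
    for _ in range(trials)
)
print(round(too_big / trials, 2))   # ~0.31, in agreement with the CLT
```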


When Doesn't the Central Limit Theorem Apply?

Case I) The pdf does not have a well-defined mean or variance. The Breit-Wigner distribution does not have a well-defined variance!

$$BW(m) = \frac{1}{2\pi}\,\frac{\Gamma}{(m - m_0)^2 + (\Gamma/2)^2}$$

It is normalized: $\int_{-\infty}^{\infty} BW(m)\,dm = 1$, and it has a well-defined average: $\int_{-\infty}^{\infty} m\,BW(m)\,dm = m_0$. The variance, however, is undefined, since $\int_{-\infty}^{\infty} m^2\,BW(m)\,dm$ diverges.

Case II) A physical process where one term in the sum dominates the sum.

i) Multiple scattering: as a charged particle moves through material it undergoes many elastic ("Rutherford") scatterings. Most scatterings produce small angular deflections ($d\sigma/d\Omega \sim \theta^{-4}$), but every once in a while a scattering produces a very large deflection. If we neglect the large scatterings, the plane angle $\theta_{plane}$ is gaussian distributed, with a width that depends on the material thickness and the particle's charge and momentum.

ii) The spread in range of a stopping particle (straggling): a small number of collisions where the particle loses a lot of its energy dominates the sum.

iii) Energy loss of a charged particle going through a gas: described by a "Landau" distribution (with a very long "tail").

(The Breit-Wigner describes the shape of a resonance, e.g. the K*.)
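The failure of the CLT for the Breit-Wigner can be seen numerically (a sketch, not from the notes; the $m_0$ and $\Gamma$ values are roughly the K* mass and width in GeV, used purely for illustration): sample averages of more and more Breit-Wigner draws do not settle toward $m_0$ the way they would for a finite-variance pdf.

```python
# The CLT fails for the Breit-Wigner (Cauchy) shape: averages of larger
# and larger samples keep fluctuating instead of converging to m0.
import random
from math import tan, pi
from statistics import mean

random.seed(3)
m0, gamma = 0.896, 0.051   # ~K* mass and width in GeV (illustrative values)

def bw_sample():
    # Inverse-CDF sampling of the Breit-Wigner/Cauchy shape.
    return m0 + (gamma / 2) * tan(pi * (random.random() - 0.5))

for n in (100, 10_000, 1_000_000):
    print(n, round(mean(bw_sample() for _ in range(n)), 3))
```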