Winter 2014 Chem 350: Statistical Mechanics and Chemical Kinetics

Chapter 2: Probability and Statistics
  Introduction to statistics
  Continuous Distributions
  Gaussian Distribution (1D)
  Counting events to determine probabilities
  Binomial Coefficients (Distribution)
  Stirling’s Approximation
  Gaussian Approximation to binomial distribution for large n
  Derivation of the Gaussian Distribution

Chapter 2: Probability and Statistics

The essential argument in statistical mechanics depends on probabilities. A particular configuration is found with a certain probability, and we find properties of a sample by an averaging procedure. Because the number of molecules is so large ($N \sim 10^{20}$), averages are near certainties: deviations from the average are exceedingly small (e.g. relative error $1/\sqrt{N} \sim 10^{-10}$).

In this set of lectures I will discuss:
- Introduction to statistics: averages and standard deviations
- Continuous distributions, the normal or Gaussian distribution
- Counting possibilities to arrive at probabilities
- A relevant stat-mech example: the binomial distribution

Introduction to statistics

To introduce the basic issues let us discuss a simple example: throwing dice.

If you throw a die, you will get one of the results 1, 2, 3, 4, 5, 6 in each throw.

Possible results: $a_i = 1, 2, 3, 4, 5, 6$

For a fair die, each number has an equal chance, so the probability to throw a ‘4’ is 1/6.

Suppose we throw the die many times (6000). You get

  a_i     n_i
  1       1010
  2        980
  3        995
  4       1025
  5       1030
  6        980
  Total   6000



We can define the fractional occurrence as

$$f_i = \frac{n_i}{n_{tot}} \approx \frac{1000}{6000}$$

i.e. $f_1 = \frac{1010}{6000}$, $f_2 = \frac{980}{6000}$, etc.

In the limit of a large number of throws, this number will approach the probability $P_i$ of $\frac{1}{6}$. Hence

$$\lim_{n_{tot} \to \infty} f_i = P_i = \frac{1}{6}$$

We would expect $n_i \approx n_{tot} P_i$, but the actual numbers fluctuate around the expected $n_i$.

Average:

If the possible outcomes for an individual experiment are $a_i$, and the number of events is $n_i$, then

$$\bar{a} = \frac{1}{n_{tot}} \sum_i n_i a_i = \sum_i \frac{n_i}{n_{tot}} a_i = \sum_i f_i a_i$$

$$\lim_{n_{tot} \to \infty} \sum_i f_i a_i = \sum_i P_i a_i = \bar{a}$$

$\bar{a} = A = \langle a \rangle$ are all notations for the average.

For a die:

$$\text{average} = \frac{1}{6}\,(1 + 2 + 3 + 4 + 5 + 6) = \frac{21}{6} = 3\tfrac{1}{2}$$
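The convergence of the fractional occurrences toward $P_i = 1/6$, and of the sample average toward 3.5, can be illustrated with a small simulation. This is a minimal sketch and not part of the original notes: the function name `throw_dice`, the seed and the numbers of throws are arbitrary choices made for illustration.

```python
import random
from collections import Counter

def throw_dice(n_tot, seed=0):
    """Throw a fair die n_tot times; return the fractional occurrences f_i and the average."""
    rng = random.Random(seed)                          # fixed seed so the run is reproducible
    counts = Counter(rng.randint(1, 6) for _ in range(n_tot))
    f = {a: counts[a] / n_tot for a in range(1, 7)}    # f_i = n_i / n_tot
    average = sum(f[a] * a for a in f)                 # a_bar = sum_i f_i a_i
    return f, average

for n_tot in (60, 6000, 200000):
    f, average = throw_dice(n_tot)
    worst = max(abs(f[a] - 1 / 6) for a in f)
    print(f"n_tot={n_tot:>6}  max|f_i - 1/6|={worst:.4f}  average={average:.4f}")
```

The deviations from 1/6 should shrink as the number of throws grows, which is the fluctuation behaviour described above.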

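Note that the average value may not be a possible result $a_i$!

Variance:

We are also interested in the spread around the mean. Let us define the variance

$$\sigma_a^2 = \sum_i \frac{n_i}{n_{tot}} (a_i - \bar{a})^2$$

$$\sigma_a^2 = \lim_{n_{tot} \to \infty} \sum_i f_i (a_i - \bar{a})^2 \to \sum_i P_i (a_i - \bar{a})^2, \qquad \sigma_a = \sqrt{\sigma_a^2}$$

Expanding the square:

$$\sigma_a^2 = \sum_i P_i (a_i - \bar{a})^2$$

$$= \sum_i P_i \left(a_i^2 - 2 a_i \bar{a} + \bar{a}^2\right)$$

$$= \sum_i P_i a_i^2 - 2\bar{a} \sum_i P_i a_i + \bar{a}^2 \sum_i P_i$$

Use

$$\sum_i P_i a_i = \bar{a}, \qquad \sum_i P_i = \sum_i \frac{n_i}{n_{tot}} = \frac{1}{n_{tot}} \sum_i n_i = \frac{n_{tot}}{n_{tot}} = 1$$

to obtain

$$\sigma_a^2 = \sum_i P_i a_i^2 - 2\bar{a}^2 + \bar{a}^2$$

$$= \overline{a^2} - 2\bar{a}^2 + \bar{a}^2$$

$$= \overline{a^2} - (\bar{a})^2 \ge 0$$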



This average of $a^2$ minus the square of the average of $a$ is always greater than or equal to 0. This is easily seen from $\sum_i P_i (a_i - \bar{a})^2$, since $P_i \ge 0$ and $(a_i - \bar{a})^2 \ge 0$. The spread is 0 only if every experiment yields the same value $a_i$.

You can easily verify (for the example given) that both ways of calculating, $\overline{a^2} - (\bar{a})^2$ and $\sum_i P_i (a_i - \bar{a})^2$, yield the same result. Here we have proven that they are always the same.
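The verification suggested above can be done in a few lines, using the counts from the dice table as fractional occurrences. This is only a sketch; the variable names are chosen here for illustration. Note that the counts as listed add up to 6020 rather than 6000, so the sketch normalizes by their actual sum.

```python
# Counts n_i from the dice table above (outcome a_i -> n_i).
counts = {1: 1010, 2: 980, 3: 995, 4: 1025, 5: 1030, 6: 980}

n_tot = sum(counts.values())                     # normalize by the actual sum of the counts
f = {a: n / n_tot for a, n in counts.items()}    # fractional occurrences f_i

a_bar = sum(f[a] * a for a in f)                 # average of a
a2_bar = sum(f[a] * a**2 for a in f)             # average of a^2

var_direct = sum(f[a] * (a - a_bar) ** 2 for a in f)   # sum_i f_i (a_i - a_bar)^2
var_moments = a2_bar - a_bar**2                        # avg(a^2) - (avg a)^2

print(a_bar, var_direct, var_moments, var_direct ** 0.5)
```

Both variance expressions agree up to floating-point rounding, as the derivation requires; the last printed number is the standard deviation.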

Standard Deviation:

$$\sigma_a = \sqrt{\overline{a^2} - \bar{a}^2}$$

The above results considered discrete outcomes of a certain experiment, $a_i$, $i = 1, 2, 3, \ldots$, but the analysis can be generalized to continuous distributions.

Continuous Distributions

Let’s consider a continuous distribution $p(x)\,dx$. This might represent, for example, a mass distribution along a 1D line.

Mass density of “leaves”: $\rho(x)\,dx$ = mass between $x$ and $x + dx$

$\int_{-\infty}^{\infty} \rho(x)\,dx = M$ is the total mass and $\int_a^b \rho(x)\,dx$ is the mass between $a$ and $b$.

For our current purposes it is more convenient to normalize

$$\frac{1}{M} \int \rho(x)\,dx = \int P(x)\,dx = 1$$

such that $\int_a^b P(x)\,dx$ is the fraction of the total mass lying in the interval $[a, b]$.

$P(x)\,dx$ is a dimensionless quantity and is called the distribution function over the variable $x$; it is analogous to $P_i = P(a_i)$. Here $x$ serves as our variable $a_i$.

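The discrete and continuous expressions correspond as follows:

$$\int_{-\infty}^{\infty} P(x)\,dx = 1 \quad \text{analogous to} \quad \sum_i P_i = 1$$

$$\bar{a}: \quad \int_{-\infty}^{\infty} x\,P(x)\,dx \quad \text{analogous to} \quad \sum_i P_i a_i$$

$$\overline{a^2}: \quad \int_{-\infty}^{\infty} x^2 P(x)\,dx \quad \text{analogous to} \quad \sum_i P_i a_i^2$$

$$\sigma_a^2 = \overline{a^2} - \bar{a}^2 \;\to\; \overline{x^2} - \bar{x}^2$$

$$\sigma_a \;\to\; \sigma_x = \sqrt{\overline{x^2} - \bar{x}^2}$$

Example: $a(x) = x \to$ position

$$\bar{x} = \int_{-\infty}^{\infty} x\,p(x)\,dx$$

$$\overline{x^2} = \int_{-\infty}^{\infty} x^2 p(x)\,dx$$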
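As an illustration of these continuous averages, the sketch below normalizes a mass density and evaluates $\bar{x}$, $\overline{x^2}$ and $\sigma_x$ numerically. The density `rho` (linear on the interval [0, 2]) and the midpoint-rule helper `integrate` are hypothetical choices made for this example; they do not come from the notes.

```python
def integrate(func, lo, hi, steps=100_000):
    """Midpoint-rule approximation of the integral of func from lo to hi."""
    dx = (hi - lo) / steps
    return sum(func(lo + (k + 0.5) * dx) for k in range(steps)) * dx

def rho(x):
    """Hypothetical mass density: grows linearly along the line, zero outside [0, 2]."""
    return x if 0.0 <= x <= 2.0 else 0.0

LO, HI = 0.0, 2.0
M = integrate(rho, LO, HI)                       # total mass

def P(x):
    """Normalized distribution function P(x) = rho(x) / M."""
    return rho(x) / M

norm = integrate(P, LO, HI)                            # should be 1
x_bar = integrate(lambda x: x * P(x), LO, HI)          # average position
x2_bar = integrate(lambda x: x**2 * P(x), LO, HI)      # average of x^2
sigma_x = (x2_bar - x_bar**2) ** 0.5
print(M, norm, x_bar, x2_bar, sigma_x)
```

For this particular density the integrals can also be done by hand ($M = 2$, $\bar{x} = 4/3$, $\overline{x^2} = 2$), which gives a quick check on the numerics.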

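Gaussian Distribution (1D)

A famous distribution that we will encounter often is the Gaussian or normal distribution

$$G(x) = C e^{-x^2/2a^2}$$

$a$: the width of the distribution (will be shown to be $\sigma_x$)
$C$: normalization constant

$$\int_{-\infty}^{\infty} G(x)\,dx = 1 = C \int_{-\infty}^{\infty} e^{-x^2/2a^2}\,dx$$

$$\to C = \frac{1}{\sqrt{2\pi a^2}} = \frac{1}{a\sqrt{2\pi}} \quad \text{(see below for derivation)}$$

Also

$$\bar{x} = \frac{1}{a\sqrt{2\pi}} \int_{-\infty}^{\infty} x\,e^{-x^2/2a^2}\,dx = 0$$

because $g(x) = g(-x)$, so $x \cdot g(x)$ is an odd function.

$$\overline{x^2} = \frac{1}{a\sqrt{2\pi}} \int_{-\infty}^{\infty} x^2 e^{-x^2/2a^2}\,dx$$

$$= \frac{1}{a\sqrt{2\pi}} \cdot 2 \cdot \int_0^{\infty} x^2 e^{-x^2/2a^2}\,dx$$

$$= \frac{1}{\sqrt{2\pi a^2}} \cdot 2 \cdot \frac{1}{2^2} \cdot 2a^2 \cdot \sqrt{2\pi a^2} = a^2 \quad \text{(shown below)}$$

$$\sigma_x = \sqrt{\overline{x^2} - \bar{x}^2} = \sqrt{a^2} = a \quad \text{as claimed before}$$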

Useful Gaussian integral formula (was used above):

$$\int_0^{\infty} y^{2k} e^{-\alpha y^2}\,dy = \frac{1 \cdot 3 \cdot 5 \cdots (2k-1)}{2^{k+1} \alpha^k} \sqrt{\frac{\pi}{\alpha}}, \qquad k = 1, 2, 3, \ldots$$

In this case $k = 1$, $\alpha = \frac{1}{2a^2}$.

(Please note the integration range. For integrands that are even about 0, the integral over the full range $(-\infty, \infty)$ is twice this result.)

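Let us evaluate carefully:

$$\overline{x^2} = \frac{1}{\sqrt{2\pi a^2}} \int_{-\infty}^{\infty} x^2 e^{-x^2/2a^2}\,dx = 2 \cdot \frac{1}{\sqrt{2\pi a^2}} \int_0^{\infty} x^2 e^{-x^2/2a^2}\,dx = \frac{2}{\sqrt{2\pi a^2}} \cdot \frac{1}{2^2} \cdot 2a^2 \cdot \sqrt{2\pi a^2} = a^2$$

Nothing essential changes by shifting the maximum of the distribution away from 0:

$$\int_{-\infty}^{\infty} (y - x)^{2k} e^{-\alpha (y - x)^2}\,dy = 2 \cdot \frac{1 \cdot 3 \cdot 5 \cdots (2k-1)}{2^{k+1} \alpha^k} \sqrt{\frac{\pi}{\alpha}}$$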
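The normalization, the vanishing mean, the result $\overline{x^2} = a^2$ and the integral formula itself can all be checked numerically. The following is a minimal sketch: the width `a = 1.5`, the cutoff at 12 widths and the helper names are assumptions made here for illustration.

```python
import math

def gaussian(x, a):
    """Normalized Gaussian G(x) = exp(-x^2 / 2a^2) / sqrt(2 pi a^2)."""
    return math.exp(-x**2 / (2 * a**2)) / math.sqrt(2 * math.pi * a**2)

def integrate(func, lo, hi, steps=50_000):
    """Midpoint-rule approximation of the integral of func from lo to hi."""
    dx = (hi - lo) / steps
    return sum(func(lo + (k + 0.5) * dx) for k in range(steps)) * dx

def half_range_formula(k, alpha):
    """Closed form of the integral of y^(2k) exp(-alpha y^2) from 0 to infinity."""
    odd_product = math.prod(range(1, 2 * k, 2))          # 1 * 3 * 5 ... (2k - 1)
    return odd_product / (2 ** (k + 1) * alpha**k) * math.sqrt(math.pi / alpha)

a = 1.5                      # arbitrary width chosen for the check
cut = 12 * a                 # the tails beyond 12 widths are negligible
norm = integrate(lambda x: gaussian(x, a), -cut, cut)
x_bar = integrate(lambda x: x * gaussian(x, a), -cut, cut)
x2_bar = integrate(lambda x: x**2 * gaussian(x, a), -cut, cut)

alpha = 1 / (2 * a**2)
numeric = integrate(lambda y: y**2 * math.exp(-alpha * y**2), 0.0, cut)
print(norm, x_bar, x2_bar, a**2, numeric, half_range_formula(1, alpha))
```

The last two printed numbers compare the numerical half-range integral with the closed formula for $k = 1$.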

Counting events to determine probabilities

A basic strategy to determine probabilities is as follows:

$$P_i = \frac{\#\text{ of events of interest}}{\text{total }\#\text{ of possible events}}$$

Here we assume each individual event to be equally likely, e.g. tossing a coin, throwing a die, or drawing a card from a deck.

To illustrate I will use examples with a pack of cards:
- 4 suits: clubs, diamonds, hearts, spades
- 13 cards per suit: 2, 3, 4, 5, 6, 7, 8, 9, 10, J, Q, K, A
- 52 cards in total

1) Draw a sequence of 5 cards (a poker hand) where the order does matter

  # of 1st card possibilities → 52
  # of 2nd card possibilities → 51
  3rd → 50
  ⋮

Hence the number of possible poker hands where the order does matter is

$$52 \cdot 51 \cdot 50 \cdot 49 \cdot 48 = \frac{52!}{47!}$$

2) What about the number of permutations of the 5 cards, i.e. the number of different orders in which the same 5 cards can be drawn?

  # of 1st card possibilities → 5
  # of 2nd card possibilities → 4
  3rd → 3
  4th → 2
  5th → 1

# of permutations (different orderings) $= 5 \cdot 4 \cdot 3 \cdot 2 \cdot 1 = 5!$

3) From the previous 2 results: drawing 5 cards, where the order does not matter

$$\#\text{ of hands} = \frac{52 \cdot 51 \cdot 50 \cdot 49 \cdot 48}{5 \cdot 4 \cdot 3 \cdot 2 \cdot 1} = \frac{52!}{47!} \cdot \frac{1}{5!} = \frac{52!}{47!\,5!}$$

This can also be written as $\binom{52}{5}$ or $C(52, 5)$ → “52 choose 5”.

So in general, if we are to choose $m$ objects from $N$ total objects, and the order of the combinations does matter:

$$\text{Number of combinations} = \frac{N!}{(N - m)!}$$

The more important case for us: if the order of the combination does not matter, then

$$\text{Number of combinations} = \binom{N}{m} = \frac{N!}{m!\,(N - m)!}$$
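The three counts for the 5-card example can be reproduced with Python's standard library, which provides `math.perm` (order matters) and `math.comb` (order does not matter) in Python 3.8 and later. This is just a quick check of the formulas above; the variable names are chosen for illustration.

```python
import math

ordered_hands = math.perm(52, 5)        # 52!/47!: sequences of 5 cards, order matters
orderings = math.factorial(5)           # 5!: permutations of the 5 cards in one hand
unordered_hands = math.comb(52, 5)      # 52!/(47! 5!): hands, order does not matter

print(ordered_hands, orderings, unordered_hands)
```

Dividing the number of ordered sequences by the number of orderings per hand gives the number of unordered hands, exactly as in step 3.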

Some more advanced examples

a) How many combinations of 3 queens + 2 non-queens are there? There are four queens: $Q_s, Q_h, Q_c, Q_d$.

4 possibilities, choose 3:

$$\binom{4}{3} = \frac{4!}{3!\,1!} = \frac{4}{1} = 4$$

So there are 4 ways to draw 3 queens out of 4.

And the non-queens? There are 48 other cards (if you omit the last queen) and we pick 2, so we get $\binom{48}{2}$.

- Now what’s the probability to draw 3 queens and 2 non-queens?

$$\text{Prob} = \frac{\binom{4}{3} \cdot \binom{48}{2}}{\binom{52}{5}} = \frac{(\#\text{ of 3 Q's}) \cdot (\#\text{ of 2 other})}{\#\text{ of draws of 5}}$$

b) What about any triple (xxx + z + y)? Note that z = y is included, but x = y and x = z are excluded.

$$\text{Prob} = \frac{13 \cdot \binom{4}{3} \cdot \binom{48}{2}}{\binom{52}{5}}$$

- A full house? (xxx + yy)

$$\text{Prob} = \frac{13 \cdot \binom{4}{3} \cdot 12 \cdot \binom{4}{2}}{\binom{52}{5}}$$

You can check these results on a good poker wiki page!

Binomial Coefficients (Distribution)

$$(a + b)^n = (a + b)(a + b)(a + b) \cdots (a + b)$$

$$(a + b)^n = \sum_{m=0}^{n} \binom{n}{m} a^m b^{n-m}$$

The binomial / “choose” coefficients form a so-called Pascal triangle:

  1                    $(a + b)^0$
  1 1                  $(a + b)^1$
  1 2 1                $(a + b)^2 = a^2 + 2ab + b^2$
  1 3 3 1              $(a + b)^3 = a^3 + 3a^2 b + 3ab^2 + b^3$
  1 4 6 4 1
  1 5 10 10 5 1

You can discern the pattern starting from the top row.
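The pattern in the triangle is that each inner entry is the sum of the two entries directly above it. A short sketch (the function name `pascal_rows` is made up for this example) builds the rows that way and confirms that they coincide with the "choose" coefficients:

```python
from math import comb

def pascal_rows(n_max):
    """Build rows 0..n_max of Pascal's triangle by summing neighbours of the row above."""
    rows = [[1]]
    for _ in range(n_max):
        prev = rows[-1]
        inner = [prev[k] + prev[k + 1] for k in range(len(prev) - 1)]
        rows.append([1] + inner + [1])
    return rows

for n, row in enumerate(pascal_rows(5)):
    # Each entry should equal the binomial coefficient "n choose m".
    assert row == [comb(n, m) for m in range(n + 1)]
    print(" ".join(str(c) for c in row))
```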

Rationalization:

$$(a + b)^n = \sum_{m=0}^{n} C_m\, a^m b^{n-m}$$

Draw ‘a’ $m$ times out of $n$; the order does not matter, so

$$C_m = \binom{n}{m}$$

Special case: $a = p$, $0 \le p \le 1$, $b = q = 1 - p$

$$(p + q)^n = 1^n = 1 = \sum_{m=0}^{n} \binom{n}{m} p^m q^{n-m}$$

$$P_n(m) = \binom{n}{m} p^m q^{n-m}$$

$P_n(m)$ = the probability to draw ‘p’ $m$ times when drawing $n$ times in total
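A short numerical sketch of $P_n(m)$ makes the normalization concrete. The values `n = 20` and `p = 0.3` are arbitrary illustrative choices, and `binomial_pmf` is a name introduced here, not one used in the notes.

```python
from math import comb

def binomial_pmf(n, m, p):
    """P_n(m) = C(n, m) p^m q^(n-m) with q = 1 - p."""
    q = 1.0 - p
    return comb(n, m) * p**m * q ** (n - m)

n, p = 20, 0.3                      # arbitrary illustrative values
probs = [binomial_pmf(n, m, p) for m in range(n + 1)]

total = sum(probs)                                   # should be (p + q)^n = 1
mean = sum(m * P for m, P in enumerate(probs))       # average number of 'p' draws
most_likely = max(range(n + 1), key=lambda m: probs[m])
print(total, mean, most_likely)
```

For these values the average and the most probable $m$ both come out at $np = 6$.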



Application of the binomial

$m$ = # of particles on the “$p$” side ($V_L$)
$N - m$ = # of particles on the “$q$” side ($V_R$)
$\bar{m} = Np$, which happens to also have the highest probability

Stirling’s Approximation

$n!$ quickly becomes a very large number. In stat-mech $n$ might be $10^{25}$! For large enough numbers Stirling’s approximation is accurate:

$\ln n! \approx n \ln n - n$ for discrete values, where $n$ is an integer
$\ln x! \approx x \ln x - x$ for continuous values

$$\frac{d}{dx} \ln x! \approx \frac{d}{dx} (x \ln x - x) = \ln x + x \cdot \frac{1}{x} - 1 = \ln x$$

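“Deriving” Stirling’s Approximation

$$\ln n! = \ln\bigl(n (n-1)(n-2) \cdots 1\bigr)$$

$$= \ln n + \ln(n-1) + \ln(n-2) + \cdots + \ln 1$$

$$= \sum_{m=1}^{n} \ln m \quad \text{for discrete } n$$

If we go to a continuous $n$ we can replace the sum with an integral:

$$\sum_{m=1}^{n} \ln m = \sum_{m=1}^{n} 1 \cdot \ln m = \sum_{m=1}^{n} \Delta m \cdot \ln m \approx \int_1^n \ln x\,dx = \bigl[x \ln x - x\bigr]_1^n$$

where we used

$$\frac{d}{dx} (x \ln x - x) = \ln x$$

so that

$$\ln n! \approx n \ln n - n - 1 \cdot \ln 1 + 1 \approx n \ln n - n$$

(if $n$ is big, this approximation works very well)

In summary:

$$\ln n! \approx n \ln n - n, \qquad \frac{d}{dn} \ln n! \approx \ln n$$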
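How well the approximation works can be checked against the exact value of $\ln n!$, which Python exposes through the log-gamma function (`math.lgamma(n + 1)` equals $\ln n!$). The following is a minimal sketch; the helper names and the sample values of $n$ are chosen here for illustration.

```python
import math

def ln_factorial(n):
    """ln n! via the log-gamma function: ln n! = lgamma(n + 1)."""
    return math.lgamma(n + 1)

def stirling(n):
    """Leading-order Stirling approximation n ln n - n."""
    return n * math.log(n) - n

def relative_error(n):
    """Relative deviation of the Stirling value from ln n!."""
    return abs(ln_factorial(n) - stirling(n)) / ln_factorial(n)

for n in (10, 100, 10**4, 10**6):
    print(f"n={n:>8}  ln n!={ln_factorial(n):.4f}  n ln n - n={stirling(n):.4f}  "
          f"rel. error={relative_error(n):.2e}")
```

The relative error shrinks steadily with $n$, so for stat-mech particle numbers the leading-order form is more than adequate.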

Gaussian Approximation to binomial distribution for large n

(not so important, tedious. Will show using Matlab).

For $p + q = 1$ and $0 \le p \le 1$, we had the binomial distribution

$$P_n(m) = \binom{n}{m} p^m q^{n-m}$$

It is fairly easy to show (see the derivation at the end of this chapter) that for large $N$ this approaches the Gaussian distribution with $\bar{m} = Np$ and $\sigma = \sqrt{Npq}$:

$$P_N(m) \approx \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-(m - \bar{m})^2/2\sigma^2} = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-(m - Np)^2/2\sigma^2}$$
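The quality of this approximation can be examined numerically. The notes refer to a Matlab demonstration; the sketch below is a Python stand-in written for this text, with helper names and the choice $p = 1/2$ as assumptions, and is not the course's own script.

```python
import math

def binomial_pmf(N, m, p):
    """Exact binomial probability P_N(m)."""
    return math.comb(N, m) * p**m * (1 - p) ** (N - m)

def gaussian_approx(N, m, p):
    """Gaussian with mean N p and variance N p q evaluated at m."""
    var = N * p * (1 - p)
    return math.exp(-(m - N * p) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def max_abs_difference(N, p):
    """Largest pointwise gap between the binomial and its Gaussian approximation."""
    return max(abs(binomial_pmf(N, m, p) - gaussian_approx(N, m, p))
               for m in range(N + 1))

p = 0.5
for N in (10, 100, 1000):
    peak = binomial_pmf(N, N // 2, p)
    print(f"N={N:>5}  peak P_N={peak:.5f}  max|binomial - Gaussian|={max_abs_difference(N, p):.2e}")
```

The gap between the two distributions drops quickly as $N$ grows.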

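This distribution becomes increasingly narrow, or highly peaked, as $n$ increases. By this we mean that

$$\frac{\sigma_N}{\bar{m}} = \frac{\sqrt{Npq}}{Np} = \sqrt{\frac{q}{p}}\, \frac{1}{\sqrt{N}}$$

For example, if $p = q = \frac{1}{2}$ and we have $n = 10^{20}$ particles, then when distributing the particles over 2 boxes

→ average number of particles in the left box: $\frac{1}{2} \cdot 10^{20}$

Standard deviation:

$$\sqrt{Npq} = \frac{1}{2} \sqrt{10^{20}} = \frac{1}{2} \cdot 10^{10}$$

So we expect the number of particles to be $\frac{1}{2} \cdot 10^{20} \pm \frac{1}{2} \cdot 10^{10}$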
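A system with $10^{20}$ particles cannot be simulated directly, but the same $\sqrt{Npq}$ scaling already shows up for small systems. The sketch below places $N$ particles at random in two equal boxes and measures the spread of the left-box count; the function name, seed, particle numbers and number of trials are all arbitrary choices for illustration.

```python
import random
import statistics

def left_box_counts(N, trials, seed=1):
    """Place N particles independently in the left box with p = 1/2; repeat `trials` times."""
    rng = random.Random(seed)                  # fixed seed so the run is reproducible
    return [sum(rng.random() < 0.5 for _ in range(N)) for _ in range(trials)]

for N in (100, 2000):
    m_values = left_box_counts(N, trials=300)
    m_bar = statistics.mean(m_values)
    sigma = statistics.pstdev(m_values)
    predicted = (N * 0.5 * 0.5) ** 0.5               # sqrt(N p q)
    print(f"N={N:>5}  mean={m_bar:.1f}  sigma={sigma:.2f}  "
          f"sqrt(Npq)={predicted:.2f}  sigma/mean={sigma / m_bar:.4f}")
```

The measured spread should track $\sqrt{Npq}$, while the relative spread $\sigma/\bar{m}$ falls as $N$ increases.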



To be precise, for a Gaussian distribution the probability to find a result within the mean $\pm\sigma$ is 68.3 %, while there is a 99.7 % probability to find the result within the mean $\pm 3\sigma$. The important point is that $\sigma/\bar{m} = 10^{-10}$ !!!

We would make very small errors assuming exactly $\frac{1}{2} \cdot 10^{20}$ particles in each box. Fluctuations are of order $10^{10} \sim \sqrt{n}$.

Derivation of the Gaussian Distribution from the binomial distribution

Let us assume a Taylor series expansion of $\ln P_N(m)$ around its maximum

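$$\ln P_N(m) = \ln P_N(\bar{m}) + \left.\frac{d}{dm} \ln P_N(m)\right|_{\bar{m}} (m - \bar{m}) + \frac{1}{2} \left.\frac{d^2}{dm^2} \ln P_N(m)\right|_{\bar{m}} (m - \bar{m})^2 + \ldots$$

Since

$$P_N(m) = \binom{N}{m} p^m q^{N-m} = \frac{N!}{m!\,(N-m)!}\, p^m q^{N-m}$$

$$\ln P_N(m) = \ln N! - \ln m! - \ln (N-m)! + m \ln p + (N-m) \ln q$$

We know from Stirling’s approximation

$$\ln m! \approx m \ln m - m, \qquad \frac{d}{dm} \ln m! \approx \ln m$$

$$\frac{d}{dm} \ln (N-m)! = \frac{d(N-m)}{dm} \cdot \frac{d}{d(N-m)} \ln (N-m)! = (-1) \cdot \ln (N-m) = -\ln (N-m)$$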

So the 1st and 2nd derivatives are

$$\frac{d}{dm} \ln P_N(m) = -\ln m + \ln (N-m) + \ln p - \ln q = 0 \quad \text{at the maximum}$$

Note: $\frac{d}{dm} \ln N! = 0$

$$\frac{d^2}{dm^2} \ln P_N(m) = -\frac{1}{m} - \frac{1}{N-m} = -\frac{(N-m) + m}{m (N-m)} = -\frac{N}{m (N-m)}$$

At the maximum of the distribution the first derivative goes to 0:

$$\frac{d}{dm} \ln P_N(m) = -\ln m + \ln (N-m) + \ln p - \ln q = 0$$

$$\ln \left( \frac{N-m}{m} \cdot \frac{p}{q} \right) = 0$$

$$\frac{N-m}{m} \cdot \frac{p}{q} = 1$$

$$Np - mp = mq$$

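$$Np = m (p + q) = m \;\to\; m = \bar{m} = Np$$

So the maximum is found at $\bar{m} = Np$. This is the expected average.

Looking at the second derivative and using $m = \bar{m} = Np$:

$$-\frac{N}{Np\,(N - Np)} = -\frac{1}{Np\,(1 - p)} = -\frac{1}{Npq}$$

$$\left.\frac{d^2}{dm^2} \ln P_N(m)\right|_{m = \bar{m}} = -\frac{1}{Npq}$$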

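Going back to the Taylor expansion of $\ln P_N(m)$:

$$\ln P_N(m) = \ln P_N(\bar{m}) + \left.\frac{d}{dm} \ln P_N(m)\right|_{\bar{m}} (m - \bar{m}) + \frac{1}{2} \left.\frac{d^2}{dm^2} \ln P_N(m)\right|_{\bar{m}} (m - \bar{m})^2 + \ldots$$

The first-derivative term vanishes at the maximum, so

$$\ln P_N(m) = \ln P_N(\bar{m}) - \frac{1}{2} \cdot \frac{1}{Npq}\, (m - \bar{m})^2 + \ldots$$

$$P_N(m) \approx e^{\ln P_N(\bar{m})}\, e^{-(m - \bar{m})^2/2Npq}$$

$$P_N(m) = C e^{-(m - \bar{m})^2/2Npq} \quad \text{where} \quad C = \frac{1}{\sqrt{2\pi Npq}} \quad \text{(normalization constant)}$$

Average $= \bar{m} = Np$, variance $\sigma^2 = Npq$ (standard deviation $\sigma = \sqrt{Npq}$)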

Note added: This proof (widely quoted in textbooks) is pretty bad. In the same way (going to second order only) you could show that any distribution with a maximum is a Gaussian distribution, which is nonsense. One really has to show that the higher derivatives are negligible. We didn’t. In the computer lab, we will make the comparison on a computer, and you will see that this approximation is excellent. As so often, the result is correct but a correct derivation is lacking. I didn’t fix it in my notes. Perhaps, if you understand why this proof is bad, this itself is a useful thing to learn!
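One way to see what the missing step would look like is to estimate the first neglected (cubic) term of the expansion at a distance of one standard deviation, $m - \bar{m} = \sqrt{Npq}$, from the maximum. This is offered as a sketch under the same leading-order Stirling approximation, not as a complete proof and not as part of the original notes:

```latex
\frac{d^3}{dm^3}\ln P_N(m) = \frac{1}{m^2} - \frac{1}{(N-m)^2}
\;\;\xrightarrow{\;m=\bar m=Np\;}\;\;
\frac{q^2-p^2}{N^2p^2q^2} = \frac{q-p}{N^2p^2q^2},
\qquad
\frac{1}{3!}\,\frac{q-p}{N^2p^2q^2}\,\bigl(\sqrt{Npq}\bigr)^3 = \frac{q-p}{6\sqrt{Npq}}
```

At that same distance the quadratic term equals $-1/2$ independent of $N$, so the cubic correction is smaller by a factor of order $1/\sqrt{N}$ (and the cubic term vanishes for $p = q$). A complete argument would also have to control all higher-order terms and the corrections to Stirling’s formula.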
