Winter 2014 Chem 350: Statistical Mechanics and Chemical Kinetics
Chapter 2: Probability and Statistics
Introduction to statistics
Continuous Distributions
Gaussian Distribution (1D)
Counting events to determine probabilities
Binomial Coefficients (Distribution)
Stirling's Approximation
Gaussian Approximation to binomial distribution for large n
Derivation of the Gaussian Distribution
Chapter 2: Probability and Statistics

The essential argument in statistical mechanics depends on probabilities. A particular configuration is found with a certain probability, and we find the properties of a sample by an averaging procedure. Because the number of molecules is so large ($N \sim 10^{20}$), averages are near certainties: deviations from the average are exceedingly small (e.g. relative error $1/\sqrt{N} \sim 10^{-10}$).

In this set of lectures I will discuss:
- Introduction to statistics: averages and standard deviations
- Continuous distributions, the normal or Gaussian distribution
- Counting possibilities to arrive at probabilities
- A relevant stat-mech example: the binomial distribution

Introduction to statistics

To introduce the basic issues, let us discuss a simple example: throwing dice.
If you throw a die, each throw gives one of the results 1, 2, 3, 4, 5, 6. Suppose we throw the die many times (6000).

Possible results: $a_i$, with $a_i = 1, 2, 3, 4, 5, 6$.

For a fair die each number has an equal chance, so the probability of throwing a '4' is 1/6.

You get:

$a_i$    $n_i$
1        1010
2        980
3        995
4        1025
5        1030
6        980
Total    6000
We can define the fractional occurrence as

$$f_i = \frac{n_i}{n_{tot}} \approx \frac{1000}{6000},$$

i.e. $f_1 = \frac{1010}{6000}$, $f_2 = \frac{980}{6000}$, etc.

In the limit of a large number of throws, this number approaches the probability $P_i$ of $\frac{1}{6}$. Hence

$$\lim_{n_{tot} \to \infty} f_i = P_i = \frac{1}{6}$$

We would expect $n_i \approx n_{tot} P_i$, but the actual numbers fluctuate around the expected $n_i$.

Average:

If the possible outcomes of an individual experiment are $a_i$, and the number of events with outcome $a_i$ is $n_i$, then

$$\bar{a} = \frac{1}{n_{tot}} \sum_i n_i a_i = \sum_i \frac{n_i}{n_{tot}} a_i = \sum_i f_i a_i$$

$$\lim_{n_{tot} \to \infty} \sum_i f_i a_i = \sum_i P_i a_i = \langle a \rangle$$

$\bar{a} = A = \langle a \rangle$ are all notations for the average.

For a die: average $= \frac{1}{6}(1 + 2 + 3 + 4 + 5 + 6) = \frac{21}{6} = 3\frac{1}{2}$
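The approach of the fractional occurrences $f_i$ to $P_i = 1/6$, and of the sample average to $3\frac{1}{2}$, can be tried out numerically. The following is only a minimal sketch, assuming Python's standard pseudo-random generator is a fair stand-in for a die; the function name `throw_dice` and the fixed seed are illustrative choices, not part of the notes.

```python
import random
from collections import Counter

def throw_dice(n_tot, seed=0):
    """Throw a simulated fair die n_tot times.

    Returns the fractional occurrences f_i = n_i / n_tot and the
    sample average sum_i f_i * a_i.
    """
    rng = random.Random(seed)  # fixed seed so that a run is reproducible
    counts = Counter(rng.randint(1, 6) for _ in range(n_tot))
    f = {a: counts[a] / n_tot for a in range(1, 7)}
    average = sum(f_i * a for a, f_i in f.items())
    return f, average

for n_tot in (60, 6000, 600000):
    f, average = throw_dice(n_tot)
    worst = max(abs(f_i - 1 / 6) for f_i in f.values())
    print(f"n_tot={n_tot:>6}  max|f_i - 1/6|={worst:.4f}  average={average:.4f}")
```

With more throws the largest deviation of any $f_i$ from 1/6 should shrink, roughly like $1/\sqrt{n_{tot}}$.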
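Note that the average value need not be a possible result $a_i$!

Variance: We are also interested in the spread around the mean. Let us define

$$\sigma_a^2 = \sum_i \frac{n_i}{n_{tot}} (a_i - \langle a \rangle)^2$$

$$\sigma_a^2 = \lim_{n_{tot} \to \infty} \sum_i f_i (a_i - \langle a \rangle)^2 \to \sum_i P_i (a_i - \langle a \rangle)^2, \qquad \sigma_a = \sqrt{\sigma_a^2}$$

Expanding the square:

$$\sigma_a^2 = \sum_i P_i (a_i - \langle a \rangle)^2$$

$$= \sum_i P_i (a_i^2 - 2 a_i \langle a \rangle + \langle a \rangle^2)$$

$$= \sum_i P_i a_i^2 - 2 \langle a \rangle \sum_i P_i a_i + \langle a \rangle^2 \sum_i P_i$$

Use $\sum_i P_i a_i = \langle a \rangle$ and

$$\sum_i P_i = 1 \qquad \left( \sum_i P_i = \sum_i \frac{n_i}{n_{tot}} = \frac{1}{n_{tot}} \sum_i n_i = \frac{n_{tot}}{n_{tot}} = 1 \right)$$

to obtain

$$\sigma_a^2 = \langle a^2 \rangle - 2 \langle a \rangle^2 + \langle a \rangle^2 = \langle a^2 \rangle - \langle a \rangle^2 = \langle (a - \langle a \rangle)^2 \rangle \geq 0$$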
The average of $a^2$ minus the square of the average of $a$ is always greater than or equal to 0. This is easily seen from $\sum_i P_i (a_i - \langle a \rangle)^2$, since $P_i \geq 0$ and $(a_i - \langle a \rangle)^2 \geq 0$. The spread is 0 only if every experiment yields the same value $a_i$.

You can easily verify (for the example given) that both ways of calculating, $\langle a^2 \rangle - \langle a \rangle^2$ and $\sum_i P_i (a_i - \langle a \rangle)^2$, yield the same result. Here we have proven that they are always the same.

Standard Deviation:

$$\sigma_a = \sqrt{\langle a^2 \rangle - \langle a \rangle^2}$$
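The equality of the two variance formulas can be checked for the fair die with exact rational arithmetic. This is a small sketch, assuming ideal probabilities $P_i = 1/6$ rather than the measured counts; the variable names are illustrative.

```python
from fractions import Fraction

outcomes = [1, 2, 3, 4, 5, 6]
P = [Fraction(1, 6)] * 6  # ideal fair-die probabilities (assumption)

mean = sum(p * a for p, a in zip(P, outcomes))        # <a>
mean_sq = sum(p * a**2 for p, a in zip(P, outcomes))  # <a^2>

# Two ways of computing the variance
var_direct = sum(p * (a - mean) ** 2 for p, a in zip(P, outcomes))
var_shortcut = mean_sq - mean**2

sigma = float(var_direct) ** 0.5  # standard deviation

print("mean          =", mean)
print("variance (1)  =", var_direct)
print("variance (2)  =", var_shortcut)
print("std deviation =", sigma)
```

Both routes give the same fraction, 35/12, for the variance of a single fair die.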
The above results considered discrete outcomes $a_i$, $i = 1, 2, 3, \ldots$ of a certain experiment, but the analysis can be generalized to continuous distributions.

Continuous Distributions

Let's consider a continuous distribution $P(x)\,dx$. This might represent, for example, a mass distribution along a 1D line.
Mass density of "leaves": $\rho(x)\,dx$ = mass between $x$ and $x + dx$

$\int_{-\infty}^{\infty} \rho(x)\,dx = M$ is the total mass, and $\int_a^b \rho(x)\,dx$ is the mass between $a$ and $b$.

For our current purposes it is more convenient to normalize:

$$\frac{1}{M} \int \rho(x)\,dx = \int P(x)\,dx = 1$$

such that $\int_a^b P(x)\,dx$ is the fraction of the total mass lying in the interval $[a, b]$.

$P(x)\,dx$ is a dimensionless quantity and is called the distribution function over the variable $x$. It is analogous to $P_i = P(a_i)$; here $x$ serves as our variable $a_i$.

$\int_{-\infty}^{\infty} P(x)\,dx = 1$ is analogous to $\sum_i P_i = 1$

$\langle x \rangle = \int_{-\infty}^{\infty} x P(x)\,dx$ is analogous to $\langle a \rangle = \sum_i P_i a_i$

$\langle x^2 \rangle = \int_{-\infty}^{\infty} x^2 P(x)\,dx$ is analogous to $\langle a^2 \rangle = \sum_i P_i a_i^2$

$\sigma_a^2 = \langle a^2 \rangle - \langle a \rangle^2$ → $\sigma_x^2 = \langle x^2 \rangle - \langle x \rangle^2$

$\sigma_x = \sqrt{\sigma_x^2} = \sqrt{\langle x^2 \rangle - \langle x \rangle^2}$

Example: $a(x) = x$ → position

$$\langle x \rangle = \int_{-\infty}^{\infty} x P(x)\,dx$$

$$\langle x^2 \rangle = \int_{-\infty}^{\infty} x^2 P(x)\,dx$$
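Gaussian Distribution (1D)

A famous distribution that we will encounter often is the Gaussian or normal distribution

$$G(x) = C e^{-x^2/2a^2}$$

$a$: the width of the distribution (will be shown to be $\sigma_x$)
$C$: normalization constant

$$1 = \int_{-\infty}^{\infty} G(x)\,dx = C \int_{-\infty}^{\infty} e^{-x^2/2a^2}\,dx \quad \to \quad C = \frac{1}{a\sqrt{2\pi}} = \frac{1}{\sqrt{2\pi a^2}}$$

(see below for the derivation)

Also

$$\langle x \rangle = \frac{1}{a\sqrt{2\pi}} \int_{-\infty}^{\infty} x e^{-x^2/2a^2}\,dx = 0$$

because $G(x) = G(-x)$, so $x \cdot G(x)$ is an odd function.

$$\langle x^2 \rangle = \frac{1}{a\sqrt{2\pi}} \int_{-\infty}^{\infty} x^2 e^{-x^2/2a^2}\,dx = 2 \cdot \frac{1}{a\sqrt{2\pi}} \int_0^{\infty} x^2 e^{-x^2/2a^2}\,dx = 2 \cdot \frac{1}{a\sqrt{2\pi}} \cdot \frac{1}{2^2} \cdot 2a^2 \cdot a\sqrt{2\pi} = a^2$$

(shown below).

$$\sigma_x = \sqrt{\langle x^2 \rangle - \langle x \rangle^2} = \sqrt{a^2} = a$$

as claimed before.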
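The normalization and the first two moments of the Gaussian can also be checked by brute-force numerical integration. The sketch below assumes a simple midpoint rule on a finite interval of $\pm 12a$ (the tails beyond that are negligible); the helper names and the value `a = 1.7` are arbitrary illustrative choices.

```python
import math

def gaussian(x, a):
    """Normalized 1D Gaussian G(x) = exp(-x^2 / 2a^2) / sqrt(2 pi a^2)."""
    return math.exp(-x * x / (2 * a * a)) / math.sqrt(2 * math.pi * a * a)

def integrate(f, lo, hi, n=20000):
    """Midpoint-rule estimate of the integral of f from lo to hi."""
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))

a = 1.7                    # width parameter (illustrative value)
lo, hi = -12 * a, 12 * a   # finite range standing in for (-inf, inf)

norm = integrate(lambda x: gaussian(x, a), lo, hi)
mean = integrate(lambda x: x * gaussian(x, a), lo, hi)
mean_sq = integrate(lambda x: x * x * gaussian(x, a), lo, hi)
sigma = math.sqrt(mean_sq - mean**2)

print("normalization:", norm)
print("<x>          :", mean)
print("<x^2>, a^2   :", mean_sq, a * a)
print("sigma_x, a   :", sigma, a)
```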
Useful Gaussian integral formula (used above)

$$\int_0^{\infty} y^{2k} e^{-\alpha y^2}\,dy = \frac{1 \cdot 3 \cdot 5 \cdots (2k-1)}{2^{k+1} \alpha^k} \sqrt{\frac{\pi}{\alpha}}, \qquad k = 1, 2, 3, \ldots$$

In this case $k = 1$, $\alpha = \frac{1}{2a^2}$.
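(Please note the integration range. For integrands that are even about 0, the integral over the full range is twice the result for the range from 0 to $\infty$.)

Let us evaluate carefully:

$$\langle x^2 \rangle = \frac{1}{\sqrt{2\pi a^2}} \int_{-\infty}^{\infty} x^2 e^{-x^2/2a^2}\,dx = 2 \cdot \frac{1}{\sqrt{2\pi a^2}} \int_0^{\infty} x^2 e^{-x^2/2a^2}\,dx = \frac{2}{\sqrt{2\pi a^2}} \cdot \frac{1}{2^2} \cdot 2a^2 \cdot \sqrt{2\pi a^2} = a^2$$

Nothing essential changes by shifting the maximum of the distribution away from 0:

$$\int_{-\infty}^{\infty} (y - x)^{2k} e^{-\alpha (y-x)^2}\,dy = 2 \cdot \frac{1 \cdot 3 \cdot 5 \cdots (2k-1)}{2^{k+1} \alpha^k} \sqrt{\frac{\pi}{\alpha}}$$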
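The half-range Gaussian integral formula can be compared against a direct numerical integration for the first few $k$. This is a hedged sketch: the function names `formula` and `numeric`, the cutoff at $12/\sqrt{\alpha}$, and the test value $\alpha = 0.7$ are assumptions made for illustration.

```python
import math

def formula(k, alpha):
    """Closed form: 1*3*5*...*(2k-1) / (2^(k+1) alpha^k) * sqrt(pi/alpha)."""
    odd_product = math.prod(range(1, 2 * k, 2))  # 1 * 3 * ... * (2k - 1)
    return odd_product / (2 ** (k + 1) * alpha**k) * math.sqrt(math.pi / alpha)

def numeric(k, alpha, n=20000):
    """Midpoint-rule estimate of the integral of y^(2k) exp(-alpha y^2) from 0 to a large cutoff."""
    hi = 12 / math.sqrt(alpha)  # beyond this the integrand is negligible
    h = hi / n
    total = 0.0
    for i in range(n):
        y = (i + 0.5) * h
        total += y ** (2 * k) * math.exp(-alpha * y * y)
    return h * total

alpha = 0.7
for k in (1, 2, 3):
    print(k, formula(k, alpha), numeric(k, alpha))
```

Setting `alpha = 1 / (2 * a**2)` and `k = 1` reproduces the integral used for $\langle x^2 \rangle$ above.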
Counting events to determine probabilities

A basic strategy to determine probabilities is as follows:

$$P_i = \frac{\text{\# of events of interest}}{\text{total \# of possible events}}$$

Here we assume each individual event to be equally likely, e.g. throwing a coin or a die, or drawing a card from a deck.

To illustrate, I will use examples based on a pack of cards:
4 suits: clubs, diamonds, hearts, spades
13 cards per suit: 2, 3, 4, 5, 6, 7, 8, 9, 10, J, Q, K, A
52 cards in total
1) Draw a sequence of 5 cards (a poker hand) where the order does matter

# of 1st card possibilities → 52
# of 2nd card possibilities → 51
3rd → 50
...

Hence the number of possible poker hands where the order does matter is

$$52 \cdot 51 \cdot 50 \cdot 49 \cdot 48 = \frac{52!}{47!}$$

2) What about the number of permutations of the 5 cards, where the order does matter?

# of 1st card possibilities → 5
# of 2nd card possibilities → 4
3rd → 3
4th → 2
5th → 1

# of permutations (different orderings) $= 5 \cdot 4 \cdot 3 \cdot 2 \cdot 1 = 5!$

3) From the previous 2 results: drawing 5 cards, where the order does not matter

$$\text{\# of hands} = \frac{52 \cdot 51 \cdot 50 \cdot 49 \cdot 48}{5 \cdot 4 \cdot 3 \cdot 2 \cdot 1} = \frac{52!}{47!} \cdot \frac{1}{5!} = \frac{52!}{47!\,5!}$$

This can also be written as $\binom{52}{5}$ or $C(52, 5)$ → "52 choose 5".

So in general, if we choose $m$ objects from $N$ total objects and the order does matter:

$$\text{Number of ordered selections} = \frac{N!}{(N-m)!}$$

The more important case for us: if the order does not matter, then

$$\text{Number of combinations} = \binom{N}{m} = \frac{N!}{m!\,(N-m)!}$$
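The three counting results for the 5-card hand can be reproduced with the standard-library functions `math.perm`, `math.factorial` and `math.comb` (available in Python 3.8 and later). This is just an illustrative check; the variable names are not from the notes.

```python
import math

ordered_hands = math.perm(52, 5)  # 52!/47!: sequences of 5 cards, order matters
orderings = math.factorial(5)     # 5!: ways to order the same 5 cards
hands = math.comb(52, 5)          # 52!/(47! 5!): hands, order does not matter

print("ordered sequences :", ordered_hands)
print("orderings per hand:", orderings)
print("unordered hands   :", hands)

# Each unordered hand corresponds to 5! ordered sequences.
assert ordered_hands == hands * orderings
```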
Some more advanced examples

a) How many combinations of 3 queens + 2 non-queens are there? There are four queens: $Q_s, Q_h, Q_c, Q_d$.

Choose 3 out of 4 possibilities:

$$\binom{4}{3} = \frac{4!}{3!\,1!} = 4 = \binom{4}{1}$$

So there are 4 ways to draw 3 queens out of 4.

And the non-queens? There are 48 other cards (omitting the last queen) and we pick 2, so we get $\binom{48}{2}$.

- Now what is the probability of drawing 3 queens and 2 non-queens?

$$\text{Prob} = \frac{\binom{4}{3} \binom{48}{2}}{\binom{52}{5}} = \frac{(\text{\# of 3 Q's}) \cdot (\text{\# of 2 other})}{\text{\# of ways to draw 5}}$$

b) What about any triple (xxx + z + y)? Note that z = y is included, but x = y and x = z are excluded.

$$\text{Prob} = \frac{13 \binom{4}{3} \binom{48}{2}}{\binom{52}{5}}$$

- A full house? (xxx + yy)

$$\text{Prob} = \frac{13 \binom{4}{3} \cdot 12 \binom{4}{2}}{\binom{52}{5}}$$
You can check these results on a good poker wiki page!

Binomial Coefficients (Distribution)

$$(a + b)^n = (a + b)(a + b)(a + b) \cdots (a + b)$$

$$(a + b)^n = \sum_{m=0}^{n} \binom{n}{m} a^m b^{n-m}$$

The binomial / "choose" coefficients form a so-called Pascal triangle:

1                   $(a + b)^0$
1 1                 $(a + b)^1$
1 2 1               $(a + b)^2 = a^2 + 2ab + b^2$
1 3 3 1             $(a + b)^3 = a^3 + 3a^2 b + 3ab^2 + b^3$
1 4 6 4 1
1 5 10 10 5 1

You can discern the pattern starting from the top row.

Rationalization:

$$(a + b)^n = \sum_{m=0}^{n} C_m a^m b^{n-m}$$

Draw 'a' $m$ times out of $n$; the order does not matter, so $C_m = \binom{n}{m}$.

Special case: $a = p$, $0 \leq p \leq 1$, $b = q = 1 - p$

$$(p + q)^n = 1^n = 1 = \sum_{m=0}^{n} \binom{n}{m} p^m q^{n-m}$$

$$P_n(m) = \binom{n}{m} p^m q^{n-m}$$

$P_n(m)$ = the probability to draw 'p' $m$ times when drawing $n$ times in total.
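Pascal's triangle and the normalization of $P_n(m)$ are easy to generate. The sketch below assumes illustrative values $n = 10$, $p = 0.3$; the function names `pascal_row` and `binomial_pmf` are hypothetical helpers, not from the notes.

```python
from math import comb

def pascal_row(n):
    """Row n of Pascal's triangle: the coefficients of (a + b)^n."""
    return [comb(n, m) for m in range(n + 1)]

def binomial_pmf(n, p):
    """List of P_n(m) = C(n, m) p^m q^(n-m) for m = 0..n, with q = 1 - p."""
    q = 1 - p
    return [comb(n, m) * p**m * q ** (n - m) for m in range(n + 1)]

for n in range(6):
    print(pascal_row(n))

pmf = binomial_pmf(10, 0.3)
print("sum over m of P_n(m):", sum(pmf))  # (p + q)^n = 1 up to rounding
```

Each interior entry of a row is the sum of the two entries above it, which is the pattern visible in the triangle.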
Application of the binomial

$m$ = # of particles on the "$p$" side ($V_L$)
$N - m$ = # of particles on the "$q$" side ($V_R$)
$\langle m \rangle = Np$, which happens to also have the highest probability
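That the average $\langle m \rangle = Np$ coincides with the most probable $m$ can be illustrated for a small system. A sketch under assumed values $N = 100$ particles and $p = 0.3$ (chosen only so that $Np$ is an integer):

```python
from math import comb

N, p = 100, 0.3  # illustrative values, not from the notes
q = 1 - p

# P_N(m): probability of finding m of the N particles on the "p" side
pmf = [comb(N, m) * p**m * q ** (N - m) for m in range(N + 1)]

mean_m = sum(m * P for m, P in enumerate(pmf))            # <m> = sum_m m P_N(m)
most_probable_m = max(range(N + 1), key=lambda m: pmf[m])  # location of the maximum

print("<m>             =", mean_m)
print("N * p           =", N * p)
print("most probable m =", most_probable_m)
```

When $Np$ is not an integer the most probable $m$ is a neighbouring integer, so the agreement is only approximate in general.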
Stirling's Approximation

$n!$ quickly becomes a very large number. In stat-mech $n$ might be as large as $10^{25}$. For large enough numbers Stirling's approximation is accurate:

$\ln n! \approx n \ln n - n$ for discrete values, where $n$ is an integer
$\ln x! \approx x \ln x - x$ for continuous values

$$\frac{d}{dx} \ln x! \approx \frac{d}{dx}(x \ln x - x) = \ln x + x \cdot \frac{1}{x} - 1 = \ln x$$

"Deriving" Stirling's Approximation

$$\ln n! = \ln\left( n (n-1)(n-2) \cdots 1 \right) = \ln n + \ln(n-1) + \ln(n-2) + \cdots + \ln 1 = \sum_{m=1}^{n} \ln m$$

for discrete $n$.

If we go to a continuous $n$ we can replace the sum with an integral:

$$\sum_{m=1}^{n} \ln m = \sum_{m=1}^{n} 1 \cdot \ln m = \sum_{m=1}^{n} \Delta m \cdot \ln m \approx \int_1^n \ln x\,dx = \Big[ x \ln x - x \Big]_1^n$$

using $\frac{d}{dx}(x \ln x - x) = \ln x$. Hence

$$\ln n! \approx n \ln n - n - 1 \cdot \ln 1 + 1 \approx n \ln n - n$$

(if $n$ is big, this approximation works very well).

In summary:

$$\ln n! \approx n \ln n - n, \qquad \frac{d}{dn} \ln n! \approx \ln n$$
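How quickly Stirling's approximation becomes accurate can be seen by comparing it with $\ln n!$ evaluated through the log-gamma function, $\ln n! = \ln \Gamma(n+1)$. A minimal sketch; the function names are illustrative, and the use of `math.lgamma` as the reference value is a choice made here, not something taken from the notes.

```python
import math

def ln_factorial_exact(n):
    """ln n! via the log-gamma function: ln n! = lgamma(n + 1)."""
    return math.lgamma(n + 1)

def ln_factorial_stirling(n):
    """Stirling's approximation as used in the notes: n ln n - n."""
    return n * math.log(n) - n

def relative_error(n):
    exact = ln_factorial_exact(n)
    return (exact - ln_factorial_stirling(n)) / exact

for n in (10, 100, 10**4, 10**6, 10**25):
    # For n = 10**25 the difference is below double-precision resolution.
    print(f"n={n:.0e}  exact={ln_factorial_exact(n):.6e}  "
          f"stirling={ln_factorial_stirling(n):.6e}  rel.err={relative_error(n):.2e}")
```

The relative error falls steadily with $n$, so for particle numbers typical of stat-mech the simple form $n \ln n - n$ is more than adequate.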
Gaussian Approximation to binomial distribution for large n

(Not so important, tedious. Will show using Matlab.)

For $p + q = 1$ and $0 \leq p \leq 1$, we had the binomial distribution

$$P_N(m) = \binom{N}{m} p^m q^{N-m}$$

It is fairly easy to show (see the derivation below) that for large $N$ this approaches the Gaussian distribution with $\langle m \rangle = Np$ and $\sigma = \sqrt{Npq}$:

$$P_N(m) \approx \frac{1}{\sqrt{2\pi\sigma^2}} e^{-(m - \langle m \rangle)^2/2\sigma^2} = \frac{1}{\sqrt{2\pi\sigma^2}} e^{-(m - Np)^2/2\sigma^2}$$

This distribution becomes increasingly narrow, or highly peaked, as $N$ increases. By this we mean that

$$\frac{\sigma_N}{\langle m \rangle} = \frac{\sqrt{Npq}}{Np} = \sqrt{\frac{q}{p}} \cdot \frac{1}{\sqrt{N}}$$

For example, if $p = q = \frac{1}{2}$ and we distribute $N = 10^{20}$ particles over 2 boxes:

→ average number of particles in the left box: $\frac{1}{2} \cdot 10^{20}$

Standard deviation:

$$\sqrt{Npq} = \frac{1}{2} \sqrt{10^{20}} = \frac{1}{2} \cdot 10^{10}$$

So we expect the number of particles to be $\frac{1}{2} \cdot 10^{20} \pm \frac{1}{2} \cdot 10^{10}$.
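The $1/\sqrt{N}$ narrowing of the relative width can be tabulated for a few system sizes. A short sketch, with `relative_width` as a hypothetical helper name:

```python
import math

def relative_width(N, p):
    """sigma / <m> = sqrt(N p q) / (N p) = sqrt(q / p) / sqrt(N) for a binomial distribution."""
    q = 1 - p
    sigma = math.sqrt(N * p * q)
    return sigma / (N * p)

for N in (100, 10**4, 10**10, 10**20):
    print(f"N={N:.0e}  sigma/<m>={relative_width(N, 0.5):.1e}")
```

Each factor of 100 in $N$ reduces the relative width by a factor of 10.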
To be precise, for a Gaussian distribution the probability of finding a result within the mean $\pm\sigma$ is 68.3 %, while there is a 99.7 % probability of finding the result within the mean $\pm 3\sigma$. The important point is that

$$\frac{\sigma}{\langle m \rangle} = 10^{-10}$$

!!!

We would make very small errors by assuming exactly $\frac{1}{2} \cdot 10^{20}$ particles in each box. Fluctuations are of order $10^{10} \sim \sqrt{N}$.

Derivation of the Gaussian Distribution from the binomial distribution

Let us assume a Taylor series expansion of $\ln P_N(m)$ around its maximum $\bar{m}$:

$$\ln P_N(m) = \ln P_N(\bar{m}) + \left. \frac{d}{dm} \ln P_N(m) \right|_{m=\bar{m}} (m - \bar{m}) + \frac{1}{2} \left. \frac{d^2}{dm^2} \ln P_N(m) \right|_{m=\bar{m}} (m - \bar{m})^2 + \ldots$$
since

$$P_N(m) = \binom{N}{m} p^m q^{N-m} = \frac{N!}{m!\,(N-m)!} p^m q^{N-m}$$

$$\ln P_N(m) = \ln N! - \ln m! - \ln (N-m)! + m \ln p + (N-m) \ln q$$

We know from Stirling's approximation that $\ln m! \approx m \ln m - m$, so

$$\frac{d}{dm} \ln m! \approx \ln m$$

$$\frac{d}{dm} \ln (N-m)! = \frac{d(N-m)}{dm} \cdot \frac{d}{d(N-m)} \ln (N-m)! = (-1) \cdot \ln(N-m) = -\ln(N-m)$$

So the 1st and 2nd derivatives are

$$\frac{d}{dm} \ln P_N(m) = -\ln m + \ln(N-m) + \ln p - \ln q \quad (= 0 \text{ at the maximum})$$

Note: $\frac{d}{dm} \ln N! = 0$

$$\frac{d^2}{dm^2} \ln P_N(m) = -\frac{1}{m} - \frac{1}{N-m} = \frac{-(N-m) - m}{m(N-m)} = \frac{-N}{m(N-m)}$$

At the maximum of the distribution the first derivative goes to 0:

$$\frac{d}{dm} \ln P_N(m) = -\ln m + \ln(N-m) + \ln p - \ln q = 0$$

$$\ln\left( \frac{N-m}{m} \cdot \frac{p}{q} \right) = 0$$

$$\frac{N-m}{m} \cdot \frac{p}{q} = 1$$

$$Np - mp = mq$$
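$$Np = m(p + q) = m \quad \to \quad \bar{m} = \langle m \rangle = Np$$

So the maximum is found at $\bar{m} = Np$. This is the expected average.

Looking at the second derivative $\frac{-N}{m(N-m)}$ and using $\bar{m} = \langle m \rangle = Np$:

$$\frac{-N}{Np(N - Np)} = \frac{-1}{Np(1-p)} = \frac{-1}{Npq}$$

$$\left. \frac{d^2}{dm^2} \ln P_N(m) \right|_{m=\bar{m}} = -\frac{1}{Npq}$$

Going back to the Taylor expansion of $\ln P_N(m)$:

$$\ln P_N(m) = \ln P_N(\bar{m}) + \left. \frac{d}{dm} \ln P_N(m) \right|_{m=\bar{m}} (m - \bar{m}) + \frac{1}{2} \left. \frac{d^2}{dm^2} \ln P_N(m) \right|_{m=\bar{m}} (m - \bar{m})^2 + \ldots$$

The first-derivative term vanishes at the maximum, so

$$\ln P_N(m) = \ln P_N(\bar{m}) - \frac{1}{2} \frac{1}{Npq} (m - \bar{m})^2 + \ldots$$

$$P_N(m) \approx e^{\ln P_N(\bar{m})} e^{-(m - \bar{m})^2/2Npq}$$

$$P_N(m) = C e^{-(m - \bar{m})^2/2Npq}, \quad \text{where } C = \frac{1}{\sqrt{2\pi Npq}} \text{ (normalization constant)}$$

Average $= \langle m \rangle = Np$, variance $= \sigma^2 = Npq$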
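The two ingredients of this derivation, the position of the maximum and the curvature of $\ln P_N(m)$ there, can be checked numerically without Stirling's approximation. The sketch below is an assumption-laden illustration: it evaluates $\ln P_N(m)$ through `math.lgamma`, uses a finite-difference second derivative with step 1, and picks $N = 10000$, $p = 0.3$ arbitrarily.

```python
import math

def ln_binomial(m, N, p):
    """ln P_N(m) = ln N! - ln m! - ln (N-m)! + m ln p + (N-m) ln q, via lgamma."""
    q = 1 - p
    return (math.lgamma(N + 1) - math.lgamma(m + 1) - math.lgamma(N - m + 1)
            + m * math.log(p) + (N - m) * math.log(q))

N, p = 10_000, 0.3  # illustrative values
q = 1 - p
m_bar = N * p       # predicted position of the maximum

# Position of the maximum found by brute force over all integer m
m_max = max(range(N + 1), key=lambda m: ln_binomial(m, N, p))

# Central finite-difference estimate of the second derivative at m_bar
h = 1.0
second_derivative = (ln_binomial(m_bar + h, N, p)
                     - 2 * ln_binomial(m_bar, N, p)
                     + ln_binomial(m_bar - h, N, p)) / h**2

print("maximum at m =", m_max, " predicted Np =", m_bar)
print("second derivative =", second_derivative, " predicted -1/(Npq) =", -1 / (N * p * q))
```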
Note added: This proof (widely quoted in textbooks) is pretty bad. In this way (going to second order only) you could show that any distribution with a maximum is a Gaussian distribution, which is nonsense. One really has to show that the higher derivatives are negligible. We didn't. In the computer lab we will make the comparison on a computer, and you will see that this approximation is excellent. As so often, the result is correct but the correct derivation is lacking. I didn't fix it in my notes. Perhaps, if you understand why this proof is bad, this itself is a useful thing to learn!
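A comparison of the exact binomial distribution with its Gaussian approximation can be sketched in a few lines. This is not the lab exercise itself, only an illustration in Python under assumed parameters ($p = 1/2$, $N$ up to 1000); the function names are hypothetical, and the direct evaluation of $\binom{N}{m} p^m q^{N-m}$ would overflow for much larger $N$, where logarithms would be needed.

```python
import math

def binomial_pmf(m, N, p):
    """Exact P_N(m) = C(N, m) p^m q^(N-m)."""
    return math.comb(N, m) * p**m * (1 - p) ** (N - m)

def gaussian_approx(m, N, p):
    """Gaussian with mean Np and variance Npq."""
    var = N * p * (1 - p)
    return math.exp(-((m - N * p) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)

def max_relative_deviation(N, p):
    """Largest |binomial - Gaussian| over all m, relative to the binomial peak height."""
    peak = max(binomial_pmf(m, N, p) for m in range(N + 1))
    worst = max(abs(binomial_pmf(m, N, p) - gaussian_approx(m, N, p))
                for m in range(N + 1))
    return worst / peak

for N in (10, 100, 1000):
    print(f"N={N:5d}  max deviation / peak = {max_relative_deviation(N, 0.5):.2e}")
```

The deviation relative to the peak height shrinks as $N$ grows, which is the sense in which the approximation is excellent for large $N$ even though the derivation above does not prove it.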