10: The Normal (Gaussian) Distributionweb.stanford.edu/class/cs109/lectures/10_normal_gaussian...Did...

29
10: The Normal (Gaussian) Distribution Lisa Yan and Jerry Cain October 5, 2020 1

Transcript of 10: The Normal (Gaussian) Distributionweb.stanford.edu/class/cs109/lectures/10_normal_gaussian...Did...

Page 1: 10: The Normal (Gaussian) Distributionweb.stanford.edu/class/cs109/lectures/10_normal_gaussian...Did not invent Normal distribution but rather popularized it 6 Lisa Yan, CS109, 2020

10: The Normal (Gaussian) DistributionLisa Yan and Jerry Cain

October 5, 2020

1

Page 2: 10: The Normal (Gaussian) Distributionweb.stanford.edu/class/cs109/lectures/10_normal_gaussian...Did not invent Normal distribution but rather popularized it 6 Lisa Yan, CS109, 2020

Lisa Yan and Jerry Cain, CS109, 2020

Quick slide reference

2

3 Normal RV 10a_normal

15 Normal RV: Properties 10b_normal_props

21 Normal RV: Computing probability 10c_normal_prob

30 Exercises LIVE

Page 3: 10: The Normal (Gaussian) Distributionweb.stanford.edu/class/cs109/lectures/10_normal_gaussian...Did not invent Normal distribution but rather popularized it 6 Lisa Yan, CS109, 2020

Normal RV

3

10a_normal

Page 4: 10: The Normal (Gaussian) Distributionweb.stanford.edu/class/cs109/lectures/10_normal_gaussian...Did not invent Normal distribution but rather popularized it 6 Lisa Yan, CS109, 2020

Lisa Yan and Jerry Cain, CS109, 2020

Today’s the Big Day

4

Today

Page 5: 10: The Normal (Gaussian) Distributionweb.stanford.edu/class/cs109/lectures/10_normal_gaussian...Did not invent Normal distribution but rather popularized it 6 Lisa Yan, CS109, 2020

Lisa Yan and Jerry Cain, CS109, 2020

def An Normal random variable 𝑋 is defined as follows:

Other names: Gaussian random variable

Normal Random Variable

5

𝑓 π‘₯ =1

𝜎 2πœ‹π‘’βˆ’ π‘₯βˆ’πœ‡ 2/2𝜎2

𝑋~𝒩(πœ‡, 𝜎2)Support: βˆ’βˆž,∞

Variance

Expectation

PDF

𝐸 𝑋 = πœ‡

Var 𝑋 = 𝜎2

𝑋~𝒩(πœ‡, 𝜎2)mean

variance

π‘₯

𝑓π‘₯

Page 6: 10: The Normal (Gaussian) Distributionweb.stanford.edu/class/cs109/lectures/10_normal_gaussian...Did not invent Normal distribution but rather popularized it 6 Lisa Yan, CS109, 2020

Lisa Yan and Jerry Cain, CS109, 2020

Carl Friedrich Gauss

Carl Friedrich Gauss (1777-1855) was a remarkably influentialGerman mathematician.

Did not invent Normal distribution but rather popularized it6

Page 7: 10: The Normal (Gaussian) Distributionweb.stanford.edu/class/cs109/lectures/10_normal_gaussian...Did not invent Normal distribution but rather popularized it 6 Lisa Yan, CS109, 2020

Lisa Yan and Jerry Cain, CS109, 2020

Why the Normal?

β€’ Common for natural phenomena: height, weight, etc.

β€’ Most noise in the world is Normal

β€’ Often results from the sum of many random variables

β€’ Sample means are distributed normally

7

That’s what they

want you to believe…

Page 8: 10: The Normal (Gaussian) Distributionweb.stanford.edu/class/cs109/lectures/10_normal_gaussian...Did not invent Normal distribution but rather popularized it 6 Lisa Yan, CS109, 2020

Lisa Yan and Jerry Cain, CS109, 2020

Why the Normal?

β€’ Common for natural phenomena: height, weight, etc.

β€’ Most noise in the world is Normal

β€’ Often results from the sum of many random variables

β€’ Sample means are distributed normally

8

Actually log-normal

Just an assumption

Only if equally weighted

(okay this one is true, we’ll see

this in 3 weeks)

Page 9: 10: The Normal (Gaussian) Distributionweb.stanford.edu/class/cs109/lectures/10_normal_gaussian...Did not invent Normal distribution but rather popularized it 6 Lisa Yan, CS109, 2020

Lisa Yan and Jerry Cain, CS109, 2020

0

0.05

0.1

0.15

0.2

0.25

0 … 44 48 52 56 60 64 … 900 … 44 48 52 56 60 64 … 90

Okay, so why the Normal?

Part of CS109 learning goals:

β€’ Translate a problem statement into a random variable

In other words: model real life situations with probability distributions

9

value

How do you model student heights?

β€’ Suppose you have data from one classroom.

Fits perfectly!

But what about in

another classroom?

Page 10: 10: The Normal (Gaussian) Distributionweb.stanford.edu/class/cs109/lectures/10_normal_gaussian...Did not invent Normal distribution but rather popularized it 6 Lisa Yan, CS109, 2020

Lisa Yan and Jerry Cain, CS109, 2020

A Gaussian maximizes entropy for a

given mean and variance.

Part of CS109 learning goals:

β€’ Translate a problem statement into a random variable

In other words: model real life situations with probability distributions

0

0.05

0.1

0.15

0.2

0.25

0 … 44 48 52 56 60 64 … 900 … 44 48 52 56 60 64 … 90

Okay, so why the Normal?

10

Occam’s Razor:

β€œNon sunt multiplicanda

entia sine necessitate.”

Entities should not be multiplied

without necessity.

value

How do you model student heights?

β€’ Suppose you have data from one classroom.

β€’ Same mean/var

β€’ Generalizes well

Page 11: 10: The Normal (Gaussian) Distributionweb.stanford.edu/class/cs109/lectures/10_normal_gaussian...Did not invent Normal distribution but rather popularized it 6 Lisa Yan, CS109, 2020

Lisa Yan and Jerry Cain, CS109, 2020

I encourage you to stay critical of how

to model real-world phenomena.

Why the Normal?

β€’ Common for natural phenomena: height, weight, etc.

β€’ Most noise in the world is Normal

β€’ Often results from the sum of many random variables

β€’ Sample means are distributed normally

11

Actually log-normal

Just an assumption

Only if equally weighted

(okay this one is true, we’ll see

this in 3 weeks)

Page 12: 10: The Normal (Gaussian) Distributionweb.stanford.edu/class/cs109/lectures/10_normal_gaussian...Did not invent Normal distribution but rather popularized it 6 Lisa Yan, CS109, 2020

Lisa Yan and Jerry Cain, CS109, 2020

Anatomy of a beautiful equation

Let 𝑋~𝒩 πœ‡, 𝜎2 .

The PDF of 𝑋 is defined as:

12

𝑓 π‘₯ =1

𝜎 2πœ‹π‘’βˆ’

π‘₯ βˆ’ πœ‡ 2

2𝜎2

normalizing constantexponential

tail

symmetric

around πœ‡

variance 𝜎2

manages spread

π‘₯

𝑓π‘₯

Page 13: 10: The Normal (Gaussian) Distributionweb.stanford.edu/class/cs109/lectures/10_normal_gaussian...Did not invent Normal distribution but rather popularized it 6 Lisa Yan, CS109, 2020

Lisa Yan and Jerry Cain, CS109, 2020

Campus bikes

You spend some minutes, 𝑋, travelingbetween classes.

β€’ Average time spent: πœ‡ = 4 minutes

β€’ Variance of time spent: 𝜎2 = 2 minutes2

Suppose 𝑋 is normally distributed. What is the probability you spend β‰₯ 6 minutes traveling?

13

𝑋~𝒩(πœ‡ = 4, 𝜎2 = 2)

𝑃 𝑋 β‰₯ 6 = ΰΆ±6

∞

𝑓(π‘₯)𝑑π‘₯ = ΰΆ±6

∞ 1

𝜎 2πœ‹π‘’βˆ’

π‘₯ βˆ’ πœ‡ 2

2𝜎2 𝑑π‘₯

(call me if you analytically solve this)Loving, not scary…except this time

Page 14: 10: The Normal (Gaussian) Distributionweb.stanford.edu/class/cs109/lectures/10_normal_gaussian...Did not invent Normal distribution but rather popularized it 6 Lisa Yan, CS109, 2020

Lisa Yan and Jerry Cain, CS109, 2020

Computing probabilities with Normal RVs

For a Normal RV 𝑋~𝒩 πœ‡, 𝜎2 , its CDF has no closed form.

𝑃 𝑋 ≀ π‘₯ = 𝐹 π‘₯ = ΰΆ±βˆ’βˆž

π‘₯ 1

𝜎 2πœ‹π‘’βˆ’

𝑦 βˆ’ πœ‡ 2

2𝜎2 𝑑𝑦

However, we can solve for probabilities numerically using a function Ξ¦:

𝐹 π‘₯ = Ξ¦π‘₯ βˆ’ πœ‡

𝜎

14

Cannot be

solved

analytically

⚠️

CDF of

𝑋~𝒩 πœ‡, 𝜎2A function that has been

solved for numerically

To get here, we’ll first

need to know some

properties of Normal RVs.

Page 15: 10: The Normal (Gaussian) Distributionweb.stanford.edu/class/cs109/lectures/10_normal_gaussian...Did not invent Normal distribution but rather popularized it 6 Lisa Yan, CS109, 2020

Normal RV: Properties

15

10b_normal_props

Page 16: 10: The Normal (Gaussian) Distributionweb.stanford.edu/class/cs109/lectures/10_normal_gaussian...Did not invent Normal distribution but rather popularized it 6 Lisa Yan, CS109, 2020

Lisa Yan and Jerry Cain, CS109, 2020

Properties of Normal RVs

Let 𝑋~𝒩 πœ‡, 𝜎2 with CDF 𝑃 𝑋 ≀ π‘₯ = 𝐹 π‘₯ .

1. Linear transformations of Normal RVs are also Normal RVs.

If π‘Œ = π‘Žπ‘‹ + 𝑏, then π‘Œ~𝒩(π‘Žπœ‡ + 𝑏, π‘Ž2𝜎2).

2. The PDF of a Normal RV is symmetric about the mean πœ‡.

𝐹 πœ‡ βˆ’ π‘₯ = 1 βˆ’ 𝐹 πœ‡ + π‘₯

16

Page 17: 10: The Normal (Gaussian) Distributionweb.stanford.edu/class/cs109/lectures/10_normal_gaussian...Did not invent Normal distribution but rather popularized it 6 Lisa Yan, CS109, 2020

Lisa Yan and Jerry Cain, CS109, 2020

1. Linear transformations of Normal RVs

Let 𝑋~𝒩 πœ‡, 𝜎2 with CDF 𝑃 𝑋 ≀ π‘₯ = 𝐹 π‘₯ .

Linear transformations of X are also Normal.

If π‘Œ = π‘Žπ‘‹ + 𝑏, then π‘Œ~𝒩 π‘Žπœ‡ + 𝑏, π‘Ž2𝜎2

Proof:

β€’ 𝐸 π‘Œ = 𝐸 π‘Žπ‘‹ + 𝑏 = π‘ŽπΈ 𝑋 + 𝑏 = π‘Žπœ‡ + 𝑏

β€’ Var π‘Œ = Var π‘Žπ‘‹ + 𝑏 = π‘Ž2Var 𝑋 = π‘Ž2𝜎2

β€’ π‘Œ is also Normal

17

Proof in Ross,

10th ed (Section 5.4)

Linearity of Expectation

Var π‘Žπ‘‹ + 𝑏 = π‘Ž2Var 𝑋

Page 18: 10: The Normal (Gaussian) Distributionweb.stanford.edu/class/cs109/lectures/10_normal_gaussian...Did not invent Normal distribution but rather popularized it 6 Lisa Yan, CS109, 2020

Lisa Yan and Jerry Cain, CS109, 2020

2. Symmetry of Normal RVs

Let 𝑋~𝒩 πœ‡, 𝜎2 with CDF 𝑃 𝑋 ≀ π‘₯ = 𝐹 π‘₯ .

The PDF of a Normal RV is symmetric about the mean πœ‡.

𝐹 πœ‡ βˆ’ π‘₯ = 1 βˆ’ 𝐹 πœ‡ + π‘₯

18

𝑓(π‘₯)

π‘₯πœ‡

Page 19: 10: The Normal (Gaussian) Distributionweb.stanford.edu/class/cs109/lectures/10_normal_gaussian...Did not invent Normal distribution but rather popularized it 6 Lisa Yan, CS109, 2020

Lisa Yan and Jerry Cain, CS109, 2020

Using symmetry of the Normal RV

19

1. 𝑃 𝑍 ≀ 𝑧

2. 𝑃 𝑍 < 𝑧

3. 𝑃 𝑍 β‰₯ 𝑧

4. 𝑃 𝑍 ≀ βˆ’π‘§

5. 𝑃 𝑍 β‰₯ βˆ’π‘§

6. 𝑃(𝑦 < 𝑍 < 𝑧) πŸ€”

A. 𝐹 𝑧

B. 1 βˆ’ 𝐹(𝑧)

C. 𝐹 𝑧 βˆ’ 𝐹(𝑦)

= 𝐹 𝑧

𝑧

𝑓(𝑧)

𝐹 πœ‡ βˆ’ π‘₯ = 1 βˆ’ 𝐹 πœ‡ + π‘₯

πœ‡ = 0

Let 𝑍~𝒩 0,1 with CDF 𝑃 𝑍 ≀ 𝑧 = 𝐹 𝑧 .

Suppose we only knew numeric valuesfor 𝐹 𝑧 and 𝐹 𝑦 , for some 𝑧, 𝑦 β‰₯ 0.

How do we compute the following probabilities?

Page 20: 10: The Normal (Gaussian) Distributionweb.stanford.edu/class/cs109/lectures/10_normal_gaussian...Did not invent Normal distribution but rather popularized it 6 Lisa Yan, CS109, 2020

Lisa Yan and Jerry Cain, CS109, 2020

Using symmetry of the Normal RV

20

1. 𝑃 𝑍 ≀ 𝑧

2. 𝑃 𝑍 < 𝑧

3. 𝑃 𝑍 β‰₯ 𝑧

4. 𝑃 𝑍 ≀ βˆ’π‘§

5. 𝑃 𝑍 β‰₯ βˆ’π‘§

6. 𝑃(𝑦 < 𝑍 < 𝑧)

A. 𝐹 𝑧

B. 1 βˆ’ 𝐹(𝑧)

C. 𝐹 𝑧 βˆ’ 𝐹(𝑦)

= 𝐹 𝑧

= 𝐹 𝑧

= 1 βˆ’ 𝐹(𝑧)

= 1 βˆ’ 𝐹(𝑧)

= 𝐹 𝑧

= 𝐹 𝑧 βˆ’ 𝐹(𝑦)

Symmetry is particularly useful when

computing probabilities of zero-mean

Normal RVs.

𝑧

𝑓(𝑧)

𝐹 πœ‡ βˆ’ π‘₯ = 1 βˆ’ 𝐹 πœ‡ + π‘₯

πœ‡ = 0

Let 𝑍~𝒩 0,1 with CDF 𝑃 𝑍 ≀ 𝑧 = 𝐹 𝑧 .

Suppose we only knew numeric valuesfor 𝐹 𝑧 and 𝐹 𝑦 , for some 𝑧, 𝑦 β‰₯ 0.

How do we compute the following probabilities?

Page 21: 10: The Normal (Gaussian) Distributionweb.stanford.edu/class/cs109/lectures/10_normal_gaussian...Did not invent Normal distribution but rather popularized it 6 Lisa Yan, CS109, 2020

Normal RV:Computing probability

21

10c_normal_probs

Page 22: 10: The Normal (Gaussian) Distributionweb.stanford.edu/class/cs109/lectures/10_normal_gaussian...Did not invent Normal distribution but rather popularized it 6 Lisa Yan, CS109, 2020

Lisa Yan and Jerry Cain, CS109, 2020

Computing probabilities with Normal RVs

Let 𝑋~𝒩 πœ‡, 𝜎2 .

To compute the CDF, 𝑃 𝑋 ≀ π‘₯ = 𝐹 π‘₯ :

β€’ We cannot analytically solve the integral (it has no closed form)

β€’ …but we can solve numerically using a function Ξ¦:

𝐹 π‘₯ = Ξ¦π‘₯ βˆ’ πœ‡

𝜎

22

CDF of the

Standard Normal, 𝑍

Page 23: 10: The Normal (Gaussian) Distributionweb.stanford.edu/class/cs109/lectures/10_normal_gaussian...Did not invent Normal distribution but rather popularized it 6 Lisa Yan, CS109, 2020

Lisa Yan and Jerry Cain, CS109, 2020

The Standard Normal random variable 𝑍 is defined as follows:

Other names: Unit Normal

CDF of 𝑍 defined as:

Standard Normal RV, 𝑍

23

𝑍~𝒩(0, 1) Variance

Expectation 𝐸 𝑍 = πœ‡ = 0

Var 𝑍 = 𝜎2 = 1

𝑃 𝑍 ≀ 𝑧 = Ξ¦(𝑧)

Note: not a new distribution; just

a special case of the Normal

Page 24: 10: The Normal (Gaussian) Distributionweb.stanford.edu/class/cs109/lectures/10_normal_gaussian...Did not invent Normal distribution but rather popularized it 6 Lisa Yan, CS109, 2020

Lisa Yan and Jerry Cain, CS109, 2020

Ξ¦ has been numerically computed

24

𝑃 𝑍 ≀ 1.31 = Ξ¦(1.31)

𝑓𝑧

𝑧

Ξ¦(𝑧)

Standard Normal Table only has

probabilities Ξ¦(𝑧) for 𝑧 β‰₯ 0.

Page 25: 10: The Normal (Gaussian) Distributionweb.stanford.edu/class/cs109/lectures/10_normal_gaussian...Did not invent Normal distribution but rather popularized it 6 Lisa Yan, CS109, 2020

Lisa Yan and Jerry Cain, CS109, 2020

History fact: Standard Normal Table

25

The first Standard Normal Table was computed by Christian Kramp, French astronomer (1760–1826), in Analysedes RΓ©fractions Astronomiques et Terrestres, 1799

Used a Taylor series expansion to the third power

Page 26: 10: The Normal (Gaussian) Distributionweb.stanford.edu/class/cs109/lectures/10_normal_gaussian...Did not invent Normal distribution but rather popularized it 6 Lisa Yan, CS109, 2020

Lisa Yan and Jerry Cain, CS109, 2020

Probabilities for a general Normal RV

Let 𝑋~𝒩 πœ‡, 𝜎2 . To compute the CDF 𝑃 𝑋 ≀ π‘₯ = 𝐹 π‘₯ ,we use Ξ¦, the CDF for the Standard Normal 𝑍~𝒩(0, 1):

𝐹 π‘₯ = Ξ¦π‘₯ βˆ’ πœ‡

𝜎Proof:

26

𝐹 π‘₯ = 𝑃 𝑋 ≀ π‘₯

= 𝑃 𝑋 βˆ’ πœ‡ ≀ π‘₯ βˆ’ πœ‡ = 𝑃𝑋 βˆ’ πœ‡

πœŽβ‰€π‘₯ βˆ’ πœ‡

𝜎

= 𝑃 𝑍 ≀π‘₯ βˆ’ πœ‡

𝜎

Algebra + 𝜎 > 0

Definition of CDF

β€’π‘‹βˆ’πœ‡

𝜎=

1

πœŽπ‘‹ βˆ’

πœ‡

𝜎is a linear transform of 𝑋.

β€’ This is distributed as 𝒩1

πœŽπœ‡ βˆ’

πœ‡

𝜎,1

𝜎2𝜎2 =𝒩 0,1

β€’ In other words, π‘‹βˆ’πœ‡

𝜎= 𝑍~𝒩 0,1 with CDF Ξ¦.= Ξ¦

π‘₯ βˆ’ πœ‡

𝜎

Page 27: 10: The Normal (Gaussian) Distributionweb.stanford.edu/class/cs109/lectures/10_normal_gaussian...Did not invent Normal distribution but rather popularized it 6 Lisa Yan, CS109, 2020

Lisa Yan and Jerry Cain, CS109, 2020

Probabilities for a general Normal RV

Let 𝑋~𝒩 πœ‡, 𝜎2 . To compute the CDF 𝑃 𝑋 ≀ π‘₯ = 𝐹 π‘₯ ,we use Ξ¦, the CDF for the Standard Normal 𝑍~𝒩(0, 1):

𝐹 π‘₯ = Ξ¦π‘₯ βˆ’ πœ‡

𝜎Proof:

27

𝐹 π‘₯ = 𝑃 𝑋 ≀ π‘₯

= 𝑃 𝑋 βˆ’ πœ‡ ≀ π‘₯ βˆ’ πœ‡ = 𝑃𝑋 βˆ’ πœ‡

πœŽβ‰€π‘₯ βˆ’ πœ‡

𝜎

= 𝑃 𝑍 ≀π‘₯ βˆ’ πœ‡

𝜎

Algebra + 𝜎 > 0

Definition of CDF

β€’π‘‹βˆ’πœ‡

𝜎=

1

πœŽπ‘‹ βˆ’

πœ‡

𝜎is a linear transform of 𝑋.

β€’ This is distributed as 𝒩1

πœŽπœ‡ βˆ’

πœ‡

𝜎,1

𝜎2𝜎2 =𝒩 0,1

β€’ In other words, π‘‹βˆ’πœ‡

𝜎= 𝑍~𝒩 0,1 with CDF Ξ¦.= Ξ¦

π‘₯ βˆ’ πœ‡

𝜎

1. Compute 𝑧 = π‘₯ βˆ’ πœ‡ /𝜎.

2. Look up Ξ¦ 𝑧 in Standard Normal table.

Page 28: 10: The Normal (Gaussian) Distributionweb.stanford.edu/class/cs109/lectures/10_normal_gaussian...Did not invent Normal distribution but rather popularized it 6 Lisa Yan, CS109, 2020

Lisa Yan and Jerry Cain, CS109, 2020

Campus bikes

You spend some minutes, 𝑋, traveling between classes.β€’ Average time spent: πœ‡ = 4 minutesβ€’ Variance of time spent: 𝜎2 = 2 minutes2

Suppose 𝑋 is normally distributed. What is the probability you spend β‰₯ 6 minutes traveling?

28

𝑋~𝒩(πœ‡ = 4, 𝜎2 = 2) 𝑃 𝑋 β‰₯ 6 = ΰΆ±6

∞

𝑓(π‘₯)𝑑π‘₯ (no analytic solution)

1. Compute 𝑧 =π‘₯βˆ’πœ‡

𝜎2. Look up Ξ¦(𝑧) in table

𝑃 𝑋 β‰₯ 6 = 1 βˆ’ 𝐹π‘₯(6)

= 1 βˆ’ Ξ¦6 βˆ’ 4

2

Γ—

β‰ˆ 1 βˆ’Ξ¦ 1.41

1 βˆ’ Ξ¦ 1.41β‰ˆ 1 βˆ’ 0.9207= 0.0793

Page 29: 10: The Normal (Gaussian) Distributionweb.stanford.edu/class/cs109/lectures/10_normal_gaussian...Did not invent Normal distribution but rather popularized it 6 Lisa Yan, CS109, 2020

Lisa Yan and Jerry Cain, CS109, 2020

Is there an easier way? (yes)

Let 𝑋~𝒩 πœ‡, 𝜎2 . What is 𝑃 𝑋 ≀ π‘₯ = 𝐹 π‘₯ ?

β€’ Use Python

β€’ Use website tool

29

from scipy import statsX = stats.norm(mu, std)X.cdf(x)

SciPy reference:https://docs.scipy.org/doc/scipy/refere

nce/generated/scipy.stats.norm.html

Website tool: https://web.stanford.edu/class/cs109

/handouts/normalCDF.html