
Course 003: Basic Econometrics, 2015-2016

Topic 4: Some Special Distributions

Rohini Somanathan


Parametric Families of Distributions

• Some families of probability distributions are frequently used because they have a small number of parameters and are good approximations for the experiments and events we are interested in analyzing.

• Examples:

– For modeling the distribution of income or consumption expenditure, we want a density which is skewed to the right (gamma, Weibull, lognormal, ...)

– IQs, heights, weights and arm circumference are quite symmetric around a mode (normal or truncated normal)

– the number of successes in a given number of trials (binomial)

– the time to failure for a machine or person (gamma, exponential)

• We refer to these probability density functions by f(x; θ), where θ refers to a parameter vector.

• A given choice of θ therefore leads to a particular probability density function.

• Ω is used to denote the parameter space.


The discrete uniform distribution

• Parameter: N, a positive integer

• Probability function: f(x; N) = (1/N) I_{1,2,...,N}(x)

• Moments (see the check below):

µ = ∑ x f(x) = (1/N) · N(N + 1)/2 = (N + 1)/2

σ² = ∑ x² f(x) − µ² = (1/N) · N(N + 1)(2N + 1)/6 − ((N + 1)/2)² = (N² − 1)/12

• MGF: ∑_{j=1}^{N} e^{jt}/N

• Applications: experiments with equally likely outcomes (dice, coins, ...). Can you think of applications in economics?
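The moment formulas above are easy to check numerically. Here is a minimal sketch in Python (standard library only); the choice N = 10 is illustrative, not from the slides:

```python
# Direct computation of the discrete uniform mean and variance for N = 10,
# compared against the closed forms (N+1)/2 and (N^2 - 1)/12.
N = 10
support = range(1, N + 1)

mean = sum(x / N for x in support)              # sum of x * f(x), with f(x) = 1/N
var = sum(x**2 / N for x in support) - mean**2  # E[X^2] - mu^2

assert abs(mean - (N + 1) / 2) < 1e-9           # 5.5
assert abs(var - (N**2 - 1) / 12) < 1e-9        # 8.25
```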


The Bernoulli distribution

• Parameter: p ∈ Ω, 0 ≤ p ≤ 1

• Probability function: f(x; p) = p^x (1 − p)^{1−x} I_{0,1}(x)

• Moments:

µ = ∑ x f(x) = 1 · p^1 (1 − p)^0 + 0 · p^0 (1 − p)^1 = p

σ² = ∑ x² f(x) − µ² = p(1 − p)

• MGF: e^t p + e^0 (1 − p) = pe^t + (1 − p)

• Applications: experiments with two possible outcomes: success or failure, defective or not defective, male or female, etc.


The Binomial distribution

• Parameters: (n, p) ∈ Ω, 0 ≤ p ≤ 1 and n a positive integer

• Probability function: An observed sequence of n Bernoulli trials can be represented by an n-tuple of zeros and ones. The number of ways to achieve x ones is given by (n choose x) = n!/(x!(n − x)!). The probability of x successes in n trials is therefore:

f(x; n, p) = (n choose x) p^x (1 − p)^{n−x} for x = 0, 1, 2, ..., n, and 0 otherwise

Notice that since ∑_{x=0}^{n} (n choose x) a^x b^{n−x} = (a + b)^n, we have ∑_{x=0}^{n} f(x) = [p + (1 − p)]^n = 1, so this is a valid density function.

• MGF: The MGF is given by: ∑_x e^{tx} f(x) = ∑_{x=0}^{n} e^{tx} (n choose x) p^x (1 − p)^{n−x} = ∑_{x=0}^{n} (n choose x) (pe^t)^x (1 − p)^{n−x} = [(1 − p) + pe^t]^n

• Moments: The MGF can be used to derive µ = np and σ² = np(1 − p) (checked numerically below)

• Result: If X_1, ..., X_k are independent random variables and each X_i has a binomial distribution with parameters n_i and p, then the sum X_1 + ··· + X_k has a binomial distribution with parameters n = n_1 + ··· + n_k and p.
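As a quick check on the binomial formulas, here is a minimal sketch in Python (standard library only; n = 10 and p = 0.3 are illustrative values, not from the slides):

```python
from math import comb

def binom_pmf(x: int, n: int, p: float) -> float:
    """f(x; n, p) = (n choose x) p^x (1-p)^(n-x)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

n, p = 10, 0.3
pmf = [binom_pmf(x, n, p) for x in range(n + 1)]

assert abs(sum(pmf) - 1) < 1e-12                          # probabilities sum to 1
mean = sum(x * f for x, f in enumerate(pmf))
var = sum(x**2 * f for x, f in enumerate(pmf)) - mean**2
assert abs(mean - n * p) < 1e-9                           # mu = np
assert abs(var - n * p * (1 - p)) < 1e-9                  # sigma^2 = np(1-p)
```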


The Multinomial distribution

Suppose there are a small number of different outcomes (methods of public transport, water purification, etc.). The multinomial distribution gives us the probability associated with a particular vector of these outcomes:

• Parameters: (n, p_1, ..., p_m) ∈ Ω, 0 ≤ p_i ≤ 1, ∑_i p_i = 1 and n a positive integer

• Probability function:

f(x_1, ..., x_m; n, p_1, ..., p_m) = (n!/∏_{i=1}^{m} x_i!) ∏_{i=1}^{m} p_i^{x_i} for x_i = 0, 1, 2, ..., n with ∑_{i=1}^{m} x_i = n, and 0 otherwise

• MGF: M_X(t) = (∑_{i=1}^{m} p_i e^{t_i})^n

• Moments: µ_i = np_i, σ_i² = np_i(1 − p_i)
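The probability function is straightforward to evaluate directly. A minimal sketch in Python (standard library only); the three transport modes and their probabilities are invented for illustration:

```python
from math import factorial, prod

def multinomial_pmf(x: tuple, p: tuple, n: int) -> float:
    """f(x_1..x_m; n, p_1..p_m) = n!/(x_1!...x_m!) * p_1^x_1 * ... * p_m^x_m."""
    assert sum(x) == n and abs(sum(p) - 1) < 1e-12
    coef = factorial(n) // prod(factorial(xi) for xi in x)  # exact integer coefficient
    return coef * prod(pi**xi for pi, xi in zip(p, x))

# 6 commuters choosing among three transport modes with probabilities .5, .3, .2:
print(round(multinomial_pmf((3, 2, 1), (0.5, 0.3, 0.2), 6), 3))  # 0.135
```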


Geometric and Negative Binomial distributions

• The Negative Binomial (or Pascal) distribution gives us the probability that x failures will occur before r successes are achieved. This means that the rth success occurs on the (x + r)th trial.

– Parameters: (r, p) ∈ Ω, 0 ≤ p ≤ 1 and r a positive integer

– Density: For the rth success to occur on the (x + r)th trial, we require (r − 1) successes in the first (x + r − 1) trials. Writing q = 1 − p, we therefore obtain the density:

f(x; r, p) = (r + x − 1 choose x) p^r q^x, x = 0, 1, 2, 3, ...

• The geometric distribution is a special case of the negative binomial with r = 1.

– The density in this case takes the form f(x; 1, p) = pq^x over all natural numbers x

– The MGF is given by E(e^{tX}) = p ∑_{x=0}^{∞} (qe^t)^x = p/(1 − qe^t) for t < log(1/q)

– We can use this function to get the mean and variance, µ = q/p and σ² = q/p²

– The negative binomial is just a sum of r independent geometric variables, so its MGF is (p/(1 − qe^t))^r and the corresponding mean and variance are µ = rq/p and σ² = rq/p²

– The geometric distribution is memoryless, so the conditional probability of k + t failures given at least k failures is the unconditional probability of t failures:

P(X = k + t | X ≥ k) = P(X = t)
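The memorylessness property can be verified numerically. A minimal sketch in Python (standard library only; the values of p, k and t are arbitrary choices):

```python
# Check P(X = k+t | X >= k) = P(X = t) for the geometric density f(x) = p q^x.
p = 0.3
q = 1 - p
geom_pmf = lambda x: p * q**x

k, t = 4, 2
tail = q**k                             # P(X >= k) = sum of p q^x for x >= k
conditional = geom_pmf(k + t) / tail    # P(X = k+t | X >= k)
assert abs(conditional - geom_pmf(t)) < 1e-12
```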


Discrete distributions: Poisson

• Parameter: λ ∈ Ω, λ > 0

• Probability function:

f(x; λ) = e^{−λ}λ^x/x! for x = 0, 1, 2, ..., and 0 otherwise

Using the result that the series 1 + λ + λ²/2! + λ³/3! + ... converges to e^λ,

∑_x f(x) = ∑_{x=0}^{∞} e^{−λ}λ^x/x! = e^{−λ} ∑_{x=0}^{∞} λ^x/x! = e^{−λ}e^λ = 1

so we have a valid density.

• Moments: µ = λ = σ²

• MGF: E(e^{tX}) = ∑_{x=0}^{∞} e^{tx} e^{−λ}λ^x/x! = e^{−λ} ∑_{x=0}^{∞} (λe^t)^x/x! = e^{λ(e^t − 1)}

• The MGF can be used to get the first and second moments about the origin, λ and λ² + λ, so the mean and the variance are both λ.

• We can also use the product of k such MGFs to show that the sum of k independently distributed Poisson variables with means λ_1, ..., λ_k has a Poisson distribution with mean λ_1 + ··· + λ_k.


A Poisson process

Suppose that the number of type A outcomes that occur over a fixed interval of time [0, t] follows a process in which:

1. The probability that precisely one type A outcome will occur in a small interval of time ∆t is approximately proportional to the length of the interval:

g(1, ∆t) = γ∆t + o(∆t)

where o(∆t) denotes a function of ∆t having the property that lim_{∆t→0} o(∆t)/∆t = 0.

2. The probability that two or more type A outcomes will occur in a small interval of time ∆t is negligible:

∑_{x=2}^{∞} g(x, ∆t) = o(∆t)

3. The numbers of type A outcomes that occur in nonoverlapping time intervals are independent events.

These conditions imply a process which is stationary over the period of observation, i.e. the probability of an occurrence must be the same over the entire period, with neither busy nor quiet intervals.


Poisson densities representing Poisson processes

RESULT: Consider a Poisson process with rate γ per unit of time. The number of events in a time interval of length t has a Poisson density with mean λ = γt.

Applications:

• the number of weaving defects in a yard of handloom cloth or stitching defects in a shirt

• the number of traffic accidents on a motorway in an hour

• the number of particles of a noxious substance that come out of a chimney in a given period of time

• the number of times a machine breaks down each week

Example:

• Let the probability of exactly one blemish in a foot of wire be 1/1000 and that of two or more blemishes be zero.

• We are interested in the number of blemishes in 3,000 feet of wire.

• If the numbers of blemishes in non-overlapping intervals are assumed to be independently distributed, then our random variable X follows a Poisson distribution with λ = γt = 3 and P(X = 5) = 3^5 e^{−3}/5!

• You can plug this into a computer (as in the sketch below), or alternatively use tables, to compute f(5; 3) = .101
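A minimal sketch of the computer calculation in Python (standard library only):

```python
from math import exp, factorial

def poisson_pmf(x: int, lam: float) -> float:
    """f(x; lambda) = e^(-lambda) * lambda^x / x!"""
    return exp(-lam) * lam**x / factorial(x)

print(round(poisson_pmf(5, 3), 3))  # 0.101
```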


The Poisson as a limiting distribution

We can show that a binomial distribution with large n and small p can be approximated by a Poisson (which is computationally easier).

• Useful result: e^v = lim_{n→∞} (1 + v/n)^n

• We can rewrite the binomial density for non-zero values as

f(x; n, p) = (∏_{i=1}^{x} (n − i + 1)/x!) p^x (1 − p)^{n−x}

If np = λ, we can substitute λ/n for p to get

lim_{n→∞} f(x; n, p) = lim_{n→∞} (∏_{i=1}^{x} (n − i + 1)/x!) (λ/n)^x (1 − λ/n)^{n−x}

= lim_{n→∞} (∏_{i=1}^{x} (n − i + 1)/n^x) (λ^x/x!) (1 − λ/n)^n (1 − λ/n)^{−x}

= lim_{n→∞} [ (n/n) · ((n − 1)/n) ··· ((n − x + 1)/n) · (λ^x/x!) (1 − λ/n)^n (1 − λ/n)^{−x} ]

= e^{−λ}λ^x/x!

(using the above result and the property that the limit of a product is the product of the limits)


Poisson as a limiting distribution: example

• We have a 300-page novel with 1,500 letters on each page.

• Typing errors are as likely to occur for one letter as for another, and the probability of such an error is given by p = 10^{−5}.

• The total number of letters is n = (300)(1500) = 450,000.

• Using λ = np = 4.5, the Poisson distribution function gives us the probability of the number of errors being less than or equal to 10 as:

P(X ≤ 10) ≈ ∑_{x=0}^{10} e^{−4.5}(4.5)^x/x! = .9933

Rules of thumb: the Poisson approximation is close to binomial probabilities when n ≥ 20 and p ≤ .05, and excellent when n ≥ 100 and np ≤ 10 (illustrated below).
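A minimal sketch of the comparison in Python (standard library only), using the example's n, p and λ. Exact binomial arithmetic is feasible here because math.comb returns arbitrary-precision integers:

```python
from math import comb, exp, factorial

n, p, lam = 450_000, 1e-5, 4.5

# P(X <= 10) under the exact binomial and under the Poisson approximation
binom_cdf10 = sum(comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(11))
poisson_cdf10 = sum(exp(-lam) * lam**x / factorial(x) for x in range(11))

print(round(binom_cdf10, 4), round(poisson_cdf10, 4))  # both ~ 0.9933
```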


Discrete distributions: Hypergeometric

• Suppose, as in the case of the binomial, there are two possible outcomes and we are interested in the probability of x values of a particular outcome, but we are drawing randomly without replacement, so our trials are not independent.

• In particular, suppose there are A + B objects from which we pick n; A of the total number available are of one type (red balls) and the rest are of the other (blue balls).

• If the random variable is the total number of red balls selected, then, for appropriate values of x, we have (evaluated in the sketch below):

f(x; A, B, n) = (A choose x)(B choose n − x)/(A + B choose n)

• Over what values of x is this defined? max{0, n − B} ≤ x ≤ min{n, A}

• The multivariate extension is (for x_i ∈ {0, 1, 2, ..., n} with ∑_{i=1}^{m} x_i = n and ∑_{i=1}^{m} K_i = M):

f(x_1, ..., x_m; K_1, ..., K_m, n) = ∏_{j=1}^{m} (K_j choose x_j) / (M choose n)
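A minimal sketch of the univariate hypergeometric probability function in Python (standard library only; the urn with A = 6 red and B = 14 blue balls is an invented example):

```python
from math import comb

def hypergeom_pmf(x: int, A: int, B: int, n: int) -> float:
    """P(x red balls when n are drawn without replacement from A red + B blue)."""
    if not max(0, n - B) <= x <= min(n, A):
        return 0.0  # outside the support derived above
    return comb(A, x) * comb(B, n - x) / comb(A + B, n)

assert abs(sum(hypergeom_pmf(x, 6, 14, 5) for x in range(6)) - 1) < 1e-12
print(round(hypergeom_pmf(2, 6, 14, 5), 3))  # 0.352
```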


Continuous distributions: uniform or rectangular

• Parameters: (a, b) ∈ Ω, −∞ < a < b < ∞

• Density: f(x; a, b) = (1/(b − a)) I_(a,b)(x) (hence the name rectangular)

• Moments: µ = (a + b)/2, σ² = (b − a)²/12

• Applications:

– to construct the probability space of an experiment in which any outcome in the interval [a, b] is equally likely.

– to generate random samples from other distributions (based on the probability integral transformation), as in the sketch below. This is part of your first lab assignment.
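A minimal sketch of the probability integral transformation in Python (standard library only): if U ∼ Uniform(0, 1) and F is a continuous CDF, then X = F⁻¹(U) has CDF F. The exponential target with scale β = 2 is an illustrative choice, not from the slides:

```python
import math
import random

# For the exponential, F(x) = 1 - exp(-x/beta), so F^{-1}(u) = -beta * log(1 - u).
random.seed(0)
beta = 2.0
draws = [-beta * math.log(1 - random.random()) for _ in range(100_000)]

print(round(sum(draws) / len(draws), 2))  # sample mean should be close to beta = 2.0
```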


The gamma function

The gamma function is a special mathematical function that is widely used in statistics. The gamma function of α is defined as

Γ(α) = ∫_0^∞ y^{α−1}e^{−y} dy    (1)

• If α = 1, Γ(α) = ∫_0^∞ e^{−y} dy = −e^{−y} |_0^∞ = 1

• If α > 1, we can integrate (1) by parts, setting u = y^{α−1} and dv = e^{−y} dy and using the formula ∫_0^∞ u dv = uv|_0^∞ − ∫_0^∞ v du to get:

−y^{α−1}e^{−y} |_0^∞ + (α − 1) ∫_0^∞ y^{α−2}e^{−y} dy

• The first term in the above expression is zero because the exponential function goes to zero faster than any polynomial grows, and we obtain

Γ(α) = (α − 1)Γ(α − 1)

and for any integer α > 1, we have

Γ(α) = (α − 1)(α − 2)(α − 3) ··· (3)(2)(1)Γ(1) = (α − 1)!
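Both the recursion and the factorial identity can be checked against the standard library's gamma function; a minimal sketch (the test values are arbitrary):

```python
from math import gamma, factorial, isclose

a = 4.7
assert isclose(gamma(a), (a - 1) * gamma(a - 1))   # Gamma(a) = (a-1) Gamma(a-1)

for k in range(1, 8):
    assert isclose(gamma(k), factorial(k - 1))     # Gamma(k) = (k-1)! for integer k
```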


The gamma distribution

Define the variable x by y = x/β, where β > 0. Then dy = (1/β) dx and we can rewrite Γ(α) as

Γ(α) = ∫_0^∞ (x/β)^{α−1} e^{−x/β} (1/β) dx

or as

1 = ∫_0^∞ (1/(Γ(α)β^α)) x^{α−1} e^{−x/β} dx

This shows that for α, β > 0,

f(x; α, β) = (1/(Γ(α)β^α)) x^{α−1} e^{−x/β} I_(0,∞)(x)

is a valid density and is known as a gamma-type probability density function.


Features of the gamma density

This is a valuable distribution because it can take a variety of shapes depending on the values of the parameters α and β:

• It is skewed to the right

• It is strictly decreasing when α ≤ 1

• If α = 1, we have the exponential density, which is memoryless.

• For α > 1 the density attains its maximum at x = (α − 1)β

[Figure 5.7, reproduced from a textbook chapter on special distributions: graphs of the p.d.f.'s of several different gamma distributions with common mean of 1 (a = 0.1, b = 0.1; a = 1, b = 1; a = 2, b = 2; a = 3, b = 3), plotted for x between 0 and 5.]


Moments of the gamma distribution

• Parameters: (α, β) ∈ Ω, α > 0, β > 0

• Moments: µ = αβ, σ² = αβ²

• MGF: M_X(t) = (1 − βt)^{−α} for t < 1/β, which can be derived as follows:

M_X(t) = ∫_0^∞ e^{tx} (1/(Γ(α)β^α)) x^{α−1} e^{−x/β} dx

= ∫_0^∞ (1/(Γ(α)β^α)) x^{α−1} e^{−(1/β − t)x} dx

= (1/(Γ(α)β^α)) · (1/(1/β − t)^α) ∫_0^∞ ((1/β − t)x)^{α−1} e^{−(1/β − t)x} (1/β − t) dx

= (1/(Γ(α)β^α)) · Γ(α)/(1/β − t)^α   (by setting y = (1/β − t)x in the expression for Γ(α))

= 1/(1 − βt)^α


Gamma applications

• Survival analysis

– The waiting time till the rth event/success: If X is the time that passes until the first success, then X has a gamma distribution with α = 1 and β = 1/γ. This is known as an exponential distribution. If, instead, we are interested in the time taken for the rth success, this has a gamma density with α = r and β = 1/γ.

– Relation to the Poisson distribution: If Y, the number of events in a given time period t, has a Poisson density with parameter λ, the rate of success is given by γ = λ/t.

Example: A bottling plant breaks down, on average, twice every four weeks. We want the probability that the number of breakdowns X ≤ 3 in the next four weeks. We have λ = 2, i.e. a breakdown rate of γ = 1/2 per week.

P(X ≤ 3) = ∑_{i=0}^{3} e^{−2} 2^i/i! = .135 + .271 + .271 + .180 = .857

Suppose we wanted the probability that the machine does not break down in the next four weeks. The time taken until the first breakdown, X, must therefore be more than four weeks. This follows a gamma distribution with α = 1 and β = 1/γ = 2:

P(X ≥ 4) = ∫_4^∞ (1/2)e^{−x/2} dx = −e^{−x/2} |_4^∞ = e^{−2} = .135

Both computations are verified in the sketch below.

• Income distributions that are uni-modal
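A minimal sketch of both bottling-plant computations in Python (standard library only):

```python
from math import exp, factorial

lam = 2.0  # mean number of breakdowns per four-week period

# Poisson: P(X <= 3) breakdowns in the next four weeks
p_at_most_3 = sum(exp(-lam) * lam**i / factorial(i) for i in range(4))
print(round(p_at_most_3, 3))  # 0.857

# Exponential (gamma with alpha = 1, beta = 1/gamma = 2): P(first breakdown after
# week 4) equals the survival function S(t) = e^{-t/beta} evaluated at t = 4.
print(round(exp(-4 / 2), 3))  # 0.135
```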


Gamma distributions: some useful properties

• Gamma additivity: Let X_1, ..., X_n be independently distributed random variables with respective gamma densities Gamma(α_i, β). Then (checked by simulation below)

Y = ∑_{i=1}^{n} X_i ∼ Gamma(∑_{i=1}^{n} α_i, β)

• Scaling gamma random variables: Let X be distributed with gamma density Gamma(α, β) and let c > 0. Then

Y = cX ∼ Gamma(α, βc)

Both of these can be easily proved using the gamma MGF and applying the MGF uniqueness theorem. In the first case the MGF of Y is the product of the individual MGFs, i.e.

M_Y(t) = ∏_{i=1}^{n} M_{X_i}(t) = ∏_{i=1}^{n} (1 − βt)^{−α_i} = (1 − βt)^{−∑_{i=1}^{n} α_i} for t < 1/β

For the second result, M_Y(t) = M_{cX}(t) = M_X(ct) = (1 − βct)^{−α} for t < 1/(βc)
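A simulation sketch of the additivity property (standard library only; the parameter values are arbitrary choices). Note that random.gammavariate uses the same (shape, scale) parameterization as these slides:

```python
import random

# X1 ~ Gamma(2, beta) and X2 ~ Gamma(3, beta), independent, so X1 + X2
# should behave like Gamma(5, beta), with mean 5 * beta.
random.seed(0)
beta, reps = 1.5, 100_000
sums = [random.gammavariate(2, beta) + random.gammavariate(3, beta)
        for _ in range(reps)]

print(round(sum(sums) / reps, 2))  # close to 5 * beta = 7.5
```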


The gamma family: exponential distributions

An exponential distribution is simply a gamma distribution with α = 1.

• Parameters: β ∈ Ω, β > 0

• Density: f(x; β) = (1/β)e^{−x/β} I_(0,∞)(x)

• Moments: µ = β, σ² = β²

• MGF: M_X(t) = (1 − βt)^{−1} for t < 1/β

• Applications: As discussed above, the most important application is the representation of operating lives. The exponential is memoryless, so if failure has not yet occurred, the object (or person, or animal) is as good as new. The risk of failure at any point t is given by the hazard rate

h(t) = f(t)/S(t)

where S(t) is the survival function 1 − F(t). Verify that the hazard rate in this case is a constant: h(t) = ((1/β)e^{−t/β})/e^{−t/β} = 1/β.

If we would like wear-out effects, we should use a gamma with α > 1, and for work-hardening effects, a gamma with α < 1.


The gamma family: chi-square distributions

A chi-square distribution is simply a gamma distribution with α = v/2 and β = 2.

• Parameters: v ∈ Ω, v a positive integer (referred to as the degrees of freedom)

• Density: f(x; v) = (1/(2^{v/2}Γ(v/2))) x^{v/2−1} e^{−x/2} I_(0,∞)(x)

• Moments: µ = v, σ² = 2v

• MGF: M_X(t) = (1 − 2t)^{−v/2} for t < 1/2

• Applications:

– Notice that for v = 2, the chi-square density is equivalent to the exponential density with β = 2. It is therefore decreasing for this value of v and hump-shaped for higher values.

– The χ² is especially useful in problems of statistical inference because if we have v independent random variables X_i ∼ N(0, 1), their sum of squares ∑_{i=1}^{v} X_i² ∼ χ²_v. Many of the estimators we use in our models fit this case (i.e. they can be expressed in terms of sums of independent normal variables).


The Normal (or Gaussian) distribution

This symmetric bell-shaped density is widely used because:

1. Outcomes of certain types of continuous random variables can be shown to follow this type of distribution; this is the motivation we have used for most parametric distributions we have considered so far (heights of humans, animals and plants; weights; the strength of physical materials; the distance from the centre of a target if errors in both directions are independent).

2. It has nice mathematical properties: many functions of a set of normally distributed random variables have distributions that take simple forms.

3. Central limit theorems: the sample mean of a random sample from any distribution with finite variance is approximately normal.


The Normal density

• Parameters: (µ, σ²) ∈ Ω, µ ∈ (−∞, ∞), σ > 0

• Density: f(x; µ, σ²) = (1/(σ√(2π))) e^{−(1/2)((x−µ)/σ)²} I_(−∞,+∞)(x)

• MGF: M_X(t) = e^{µt + σ²t²/2}

• The MGF can be used to derive the moments: E(X) = µ and Var(X) = σ²

• As can be seen from the p.d.f., the distribution is symmetric around µ, where it achieves its maximum value. This is therefore also the median and the mode of the distribution.

• The normal distribution with zero mean and unit variance is known as the standard normal distribution and is of the form f(x; 0, 1) = (1/√(2π)) e^{−x²/2} I_(−∞,+∞)(x)

• The tails of the distribution are thin: 68% of the total probability lies within one σ of the mean, 95.4% within 2σ and 99.7% within 3σ.


The Normal distribution: deriving the MGF

By the definition of the MGF:

M(t) = ∫_{−∞}^{∞} e^{tx} (1/(σ√(2π))) e^{−(x−µ)²/(2σ²)} dx

= ∫_{−∞}^{∞} (1/(σ√(2π))) e^{[tx − (x−µ)²/(2σ²)]} dx

We can rewrite the term inside the square brackets to obtain:

tx − (x−µ)²/(2σ²) = µt + (1/2)σ²t² − [x − (µ + σ²t)]²/(2σ²)

The MGF can now be written as:

M_X(t) = C e^{µt + (1/2)σ²t²}

where C = ∫_{−∞}^{∞} (1/(σ√(2π))) e^{−[x−(µ+σ²t)]²/(2σ²)} dx = 1 because the integrand is a normal p.d.f. with the parameter µ replaced by (µ + σ²t).


The Normal distribution: computing moments

• First, take derivatives of the MGF:

M(t) = e^{µt + σ²t²/2}

M′(t) = M(t)(µ + σ²t)

M″(t) = M(t)σ² + M(t)(µ + σ²t)²

(obtained by differentiating M′(t) with respect to t and substituting for M′(t))

• Evaluating these at t = 0, we get M′(0) = µ and M″(0) = σ² + µ², so the variance is M″(0) − [M′(0)]² = σ².


Transformations of Normally Distributed Variables...1

RESULT 1: Let X ∼ N(µ, σ²). Then Z = (X − µ)/σ ∼ N(0, 1)

Proof: Z is of the form aX + b with a = 1/σ and b = −µ/σ. Therefore

M_Z(t) = e^{bt} M_X(at) = e^{−(µ/σ)t} e^{(µ/σ)t + σ²t²/(2σ²)} = e^{t²/2}

which is the MGF of a standard normal distribution.

An important implication of the above result is that if we are interested in any distribution in this class of normal distributions, we only need to be able to compute integrals for the standard normal; these are the tables you'll see at the back of most textbooks.

Example: The kilometres per litre of fuel achieved by a new Maruti model, X ∼ N(17, .25). What is the probability that a new car will achieve between 16 and 18 kilometres per litre?

Answer: P(16 ≤ X ≤ 18) = P((16 − 17)/.5 ≤ Z ≤ (18 − 17)/.5) = P(−2 ≤ Z ≤ 2) = 1 − 2(.0228) = .9544
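A minimal sketch of the example's computation in Python, using the standard library's error function in place of normal tables (Φ(z) = (1 + erf(z/√2))/2):

```python
from math import erf, sqrt

def std_normal_cdf(z: float) -> float:
    """Standard normal CDF via the error function."""
    return 0.5 * (1 + erf(z / sqrt(2)))

mu, sigma = 17.0, 0.5                  # X ~ N(17, .25), so sigma = 0.5
z_lo, z_hi = (16 - mu) / sigma, (18 - mu) / sigma
print(round(std_normal_cdf(z_hi) - std_normal_cdf(z_lo), 4))
# 0.9545 (the slide's .9544 comes from rounded table values)
```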


Transformations of Normals...2

• RESULT 2: Let X ∼ N(µ, σ²) and Y = aX + b, where a and b are given constants and a ≠ 0. Then Y has a normal distribution with mean aµ + b and variance a²σ².

Proof: The MGF of Y can be expressed as M_Y(t) = e^{bt} e^{µat + (1/2)σ²a²t²} = e^{(aµ+b)t + (1/2)(aσ)²t²}. This is simply the MGF for a normal distribution with mean aµ + b and variance a²σ².

• RESULT 3: If X_1, ..., X_k are independent and X_i has a normal distribution with mean µ_i and variance σ_i², then Y = X_1 + ··· + X_k has a normal distribution with mean µ_1 + ··· + µ_k and variance σ_1² + ··· + σ_k².

Proof: Write the MGF of Y as the product of the MGFs of the X_i's and gather linear and squared terms separately to get the desired result.

• We can combine these two results to derive the distribution of the sample mean:

RESULT 4: Suppose that the random variables X_1, ..., X_n form a random sample from a normal distribution with mean µ and variance σ², and let X̄_n denote the sample mean. Then X̄_n has a normal distribution with mean µ and variance σ²/n.


Transformations of Normals to χ² distributions

RESULT 5: If X ∼ N(0, 1), then Y = X² has a χ² distribution with one degree of freedom.

Proof:

M_Y(t) = ∫_{−∞}^{∞} e^{x²t} (1/√(2π)) e^{−x²/2} dx

= ∫_{−∞}^{∞} (1/√(2π)) e^{−(1/2)x²(1−2t)} dx

= (1/√(1 − 2t)) ∫_{−∞}^{∞} (1/(√(2π) · (1/√(1 − 2t)))) e^{−(1/2)(x√(1 − 2t))²} dx

= 1/√(1 − 2t) for t < 1/2

(the integrand is a normal density with µ = 0 and σ² = 1/(1 − 2t)).

The MGF obtained is that of a χ² random variable with v = 1, since the χ² MGF is given by M_X(t) = (1 − 2t)^{−v/2} for t < 1/2.


Normals and χ² distributions...

RESULT 6: Let X_1, ..., X_n be independent random variables with each X_i ∼ N(0, 1). Then Y = ∑_{i=1}^{n} X_i² has a χ² distribution with n degrees of freedom.

Proof:

M_Y(t) = ∏_{i=1}^{n} M_{X_i²}(t) = ∏_{i=1}^{n} (1 − 2t)^{−1/2} = (1 − 2t)^{−n/2} for t < 1/2

which is the MGF of a χ² random variable with v = n. This is the reason that the parameter v is called the degrees of freedom: there are n freely varying random variables whose sum of squares represents a χ²_v-distributed random variable. This also follows directly from gamma additivity.
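A Monte Carlo sketch of RESULT 6 (standard library only; n and the number of replications are arbitrary choices): the simulated sum of squares should have mean n and variance 2n, matching the χ² moments given earlier.

```python
import random

random.seed(0)
n, reps = 5, 100_000
# Each sample is the sum of squares of n independent standard normals.
samples = [sum(random.gauss(0, 1) ** 2 for _ in range(n)) for _ in range(reps)]

mean = sum(samples) / reps
var = sum((s - mean) ** 2 for s in samples) / reps
print(round(mean, 2), round(var, 2))  # close to n = 5 and 2n = 10
```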


The Bivariate Normal distribution

The bivariate normal has the density:

f(x, y) = (1/(2πσ_1σ_2√(1 − ρ²))) e^{−q/2}

where

q = (1/(1 − ρ²)) [((x − µ_1)/σ_1)² − 2ρ((x − µ_1)/σ_1)((y − µ_2)/σ_2) + ((y − µ_2)/σ_2)²]

• E(X_i) = µ_i, Var(X_i) = σ_i² and the correlation coefficient ρ(X_1, X_2) = ρ

• Verify that in this case X_1 and X_2 are independent iff they are uncorrelated.

• Applications: heights of couples, scores on tests, ...


The Multivariate Normal distribution

• Parameters: (a, B) ∈ Ω, a ∈ ℝ^n, B a symmetric positive definite matrix

• Density: f(x; a, B) = (1/(|B|^{1/2}(2π)^{n/2})) e^{−(1/2)(x−a)′B^{−1}(x−a)}

• Moments: µ = a, Cov(X) = B

• MGF: M_X(t) = e^{a′t + (1/2)t′Bt}. Note: there are n + n(n+1)/2 parameters, n means and n(n+1)/2 unique elements in the variance-covariance matrix B.

• Applications: statistical inference in the classical linear regression model, and, with large samples, in other models.

Additional distributions that we'll use mainly for inference are the Student's t-distribution and the F-distribution. We'll introduce these in the second half of the course.
