Transcript of t4003specialdistributions_16
Course 003: Basic Econometrics, 2015-2016
Topic 4: Some Special Distributions
Rohini Somanathan
Parametric Families of Distributions
• Some families of probability distributions are frequently used because they have a small
number of parameters and are good approximations for the experiments and events we are
interested in analyzing.
• Examples:
– For modeling the distribution of income or consumption expenditure, we want a density
which is skewed to the right (gamma, Weibull, lognormal, ...)
– IQs, heights, weights, arm circumference are quite symmetric around a mode (normal
or truncated normal)
– number of successes in a given number of trials (binomial)
– the time to failure for a machine or person (gamma, exponential)
• We refer to these probability density functions by f(x;θ) where θ refers to a parameter
vector.
• A given choice of θ therefore leads to a given probability density function.
• Ω is used to denote the parameter space.
The discrete uniform distribution
• Parameter: N
• Probability function: f(x;N) = (1/N) I{1,2,...,N}(x)
• Moments:
µ = ∑x f(x) = (1/N) · N(N + 1)/2 = (N + 1)/2
σ² = ∑x² f(x) − µ² = (1/N) · N(N + 1)(2N + 1)/6 − [(N + 1)/2]² = (N² − 1)/12
• MGF: M_X(t) = ∑_{j=1}^{N} e^{jt}/N
• Applications: experiments with equally likely outcomes (dice, coins, ...). Can you think of
applications in economics?
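The two closed forms above are easy to verify by brute force. None of this code is from the slides; it is a quick exact check using only the Python standard library, with illustrative names.

```python
from fractions import Fraction

def discrete_uniform_moments(N):
    # Mean and variance of the uniform distribution on {1, ..., N}, computed
    # directly from the definitions mu = sum x f(x) and var = sum x^2 f(x) - mu^2
    mean = Fraction(sum(range(1, N + 1)), N)
    var = Fraction(sum(x * x for x in range(1, N + 1)), N) - mean**2
    return mean, var

# a fair die: mean should be (N+1)/2 = 7/2, variance (N^2 - 1)/12 = 35/12
mean, var = discrete_uniform_moments(6)
```

Exact fractions sidestep floating-point rounding, so the match with the closed forms is exact rather than approximate.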
The Bernoulli distribution
• Parameter: p ∈ Ω, 0 ≤ p ≤ 1
• Probability function: f(x;p) = p^x (1 − p)^{1−x} I{0,1}(x)
• Moments:
µ = ∑x f(x) = 1·p¹(1 − p)⁰ + 0·p⁰(1 − p)¹ = p
σ² = ∑x² f(x) − µ² = p(1 − p)
• MGF: M_X(t) = e^t·p + e⁰·(1 − p) = pe^t + (1 − p)
• Applications: experiments with two possible outcomes: success or failure, defective or not
defective, male or female, etc.
The Binomial distribution.
• Parameters: (n,p) ∈Ω, 0≤ p≤ 1 and n is a positive integer
• Probability function: An observed sequence of n Bernoulli trials can be represented by an
n-tuple of zeros and ones. The number of ways to achieve x ones is given by
C(n,x) = n!/[x!(n − x)!]. The probability of x successes in n trials is therefore:
f(x;n,p) = C(n,x) p^x (1 − p)^{n−x} for x = 0, 1, 2, ..., n, and 0 otherwise
Notice that since ∑_{x=0}^{n} C(n,x) a^x b^{n−x} = (a + b)^n, we have
∑_{x=0}^{n} f(x) = [p + (1 − p)]^n = 1, so we have a valid density function.
• MGF: The MGF is given by:
∑_x e^{tx} f(x) = ∑_{x=0}^{n} e^{tx} C(n,x) p^x (1 − p)^{n−x} = ∑_{x=0}^{n} C(n,x) (pe^t)^x (1 − p)^{n−x} = [(1 − p) + pe^t]^n
• Moments: The MGF can be used to derive µ = np and σ2 = np(1 −p)
• Result: If X1, . . .Xk are independent random variables and if each Xi has a binomial
distribution with parameters ni and p, then the sum X1 + · · ·+Xk has a binomial
distribution with parameters n = n1 + · · ·+nk and p.
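The pmf, its normalization, and the quoted moments can all be checked numerically. This is an illustrative sketch (not part of the slides), assuming a Python interpreter with the standard library only.

```python
from math import comb

def binom_pmf(x, n, p):
    # f(x; n, p) = C(n, x) p^x (1 - p)^(n - x)
    return comb(n, x) * p**x * (1 - p) ** (n - x)

n, p = 10, 0.3
total = sum(binom_pmf(x, n, p) for x in range(n + 1))                  # should be 1
mean = sum(x * binom_pmf(x, n, p) for x in range(n + 1))               # np = 3
var = sum(x * x * binom_pmf(x, n, p) for x in range(n + 1)) - mean**2  # np(1-p) = 2.1
```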
The Multinomial distribution
Suppose there are a small number of different outcomes (methods of public transport, water
purification, etc.). The multinomial distribution gives us the probability associated with a
particular vector of counts of these outcomes:
• Parameters: (n, p1, ..., pm) ∈ Ω, 0 ≤ pi ≤ 1, ∑_i pi = 1, and n is a positive integer
• Probability function:
f(x1, ..., xm; n, p1, ..., pm) = [n!/(∏_{i=1}^{m} xi!)] ∏_{i=1}^{m} pi^{xi} for xi ∈ {0, 1, 2, ..., n} with ∑_{i=1}^{m} xi = n, and 0 otherwise
• MGF: M_X(t) = (∑_{i=1}^{m} pi e^{ti})^n
• Moments: µi = n pi, σi² = n pi(1 − pi)
Geometric and Negative Binomial distributions
• The Negative Binomial (or Pascal) distribution gives us the probability that x failures will
occur before r successes are achieved. This means that the rth success occurs on the
(x+ r)th trial.
– Parameters: (r,p) ∈ Ω, 0 ≤ p ≤ 1 and r is a positive integer
– Density: For the rth success to occur on the (x + r)th trial, we require (r − 1) successes in
the first (x + r − 1) trials. We therefore obtain the density (writing q = 1 − p):
f(x;r,p) = C(r + x − 1, x) p^r q^x, x = 0, 1, 2, 3, ...
• The geometric distribution is a special case of the negative binomial with r = 1.
– The density in this case takes the form f(x;1,p) = p q^x for x = 0, 1, 2, ...
– The MGF is given by E(e^{tX}) = p ∑_{x=0}^{∞} (qe^t)^x = p/(1 − qe^t) for t < log(1/q)
– We can use this function to get the mean and variance, µ = q/p and σ² = q/p²
– The negative binomial is just a sum of r independent geometric variables, so its MGF is
[p/(1 − qe^t)]^r and the corresponding mean and variance are µ = rq/p and σ² = rq/p²
– The geometric distribution is memoryless: the conditional probability of k + t failures
given at least k failures is the unconditional probability of t failures,
P(X = k + t | X ≥ k) = P(X = t)
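The memoryless identity can be checked numerically from the pmf alone. A minimal sketch, not from the slides; the parameter values p = 0.3, k = 4, t = 2 are arbitrary illustrations.

```python
def geom_pmf(x, p):
    # P(X = x): probability of x failures before the first success
    return p * (1 - p) ** x

def geom_sf(k, p):
    # P(X >= k): the first k trials are all failures, so this is q^k
    return (1 - p) ** k

p, k, t = 0.3, 4, 2
lhs = geom_pmf(k + t, p) / geom_sf(k, p)  # P(X = k + t | X >= k)
rhs = geom_pmf(t, p)                      # P(X = t)
```

Algebraically lhs = p·q^{k+t}/q^k = p·q^t = rhs, which is exactly the memoryless property.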
Discrete Distributions: Poisson
• Parameter: λ ∈ Ω, λ > 0
• Probability function:
f(x;λ) = e^{−λ} λ^x / x! for x = 0, 1, 2, ..., and 0 otherwise
Using the result that the series 1 + λ + λ²/2! + λ³/3! + ... converges to e^λ,
∑_x f(x) = ∑_{x=0}^{∞} e^{−λ} λ^x / x! = e^{−λ} ∑_{x=0}^{∞} λ^x / x! = e^{−λ} e^λ = 1, so we have a valid density.
• Moments: µ = λ = σ²
• MGF: E(e^{tX}) = ∑_{x=0}^{∞} e^{tx} e^{−λ} λ^x / x! = e^{−λ} ∑_{x=0}^{∞} (λe^t)^x / x! = e^{λ(e^t − 1)}
• The MGF can be used to get the first and second moments about the origin, λ and λ² + λ,
so the mean and the variance are both λ.
• We can also use the product of k such MGFs to show that the sum of k independently
distributed Poisson variables has a Poisson distribution with mean λ1 + ... + λk.
A Poisson process
Suppose that the number of type A outcomes that occur over a fixed interval of time [0, t]
follows a process in which:
1. The probability that precisely one type A outcome will occur in a small interval of time ∆t
is approximately proportional to the length of the interval:
g(1, ∆t) = γ∆t + o(∆t)
where o(∆t) denotes a function of ∆t having the property that lim_{∆t→0} o(∆t)/∆t = 0.
2. The probability that two or more type A outcomes will occur in a small interval of time ∆t
is negligible:
∑_{x=2}^{∞} g(x, ∆t) = o(∆t)
3. The numbers of type A outcomes that occur in nonoverlapping time intervals are
independent events.
These conditions imply a process which is stationary over the period of observation, i.e. the
probability of an occurrence is the same over the entire period, with neither busy nor quiet
intervals.
Poisson densities representing Poisson processes
RESULT: Consider a Poisson process with rate γ per unit of time. The number of events in
a time interval of length t has a Poisson distribution with mean λ = γt.
Applications:
• the number of weaving defects in a yard of handloom cloth or stitching defects in a shirt
• the number of traffic accidents on a motorway in an hour
• the number of particles of a noxious substance that come out of a chimney in a given period
of time
• the number of times a machine breaks down each week
Example:
• let the probability of exactly one blemish in a foot of wire be 1/1000 and that of two or more
blemishes in a foot be zero.
• we’re interested in the number of blemishes in 3,000 feet of wire.
• if the numbers of blemishes in non-overlapping intervals are assumed to be independently
distributed, then our random variable X follows a Poisson distribution with λ = γt = 3 and
P(X = 5) = 3⁵ e^{−3}/5!
• you can plug this into a computer, or alternatively use tables, to compute f(5;3) = .101
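"Plugging this into a computer" can be as simple as the following sketch (not from the slides; stdlib Python, illustrative names):

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    # f(x; lambda) = e^{-lambda} lambda^x / x!
    return exp(-lam) * lam**x / factorial(x)

p5 = poisson_pmf(5, 3.0)  # blemish example: lambda = gamma * t = 3
```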
The Poisson as a limiting distribution
We can show that a binomial distribution with large n and small p can be approximated by a
Poisson (which is computationally easier).
• Useful result: e^v = lim_{n→∞} (1 + v/n)^n
• We can rewrite the binomial density for non-zero values as
f(x;n,p) = {[∏_{i=1}^{x} (n − i + 1)] / x!} p^x (1 − p)^{n−x}
If np = λ, we can substitute λ/n for p to get
lim_{n→∞} f(x;n,p) = lim_{n→∞} {[∏_{i=1}^{x} (n − i + 1)] / x!} (λ/n)^x (1 − λ/n)^{n−x}
= lim_{n→∞} {[∏_{i=1}^{x} (n − i + 1)] / n^x} (λ^x/x!) (1 − λ/n)^n (1 − λ/n)^{−x}
= lim_{n→∞} [(n/n) · ((n − 1)/n) · ... · ((n − x + 1)/n) · (λ^x/x!) (1 − λ/n)^n (1 − λ/n)^{−x}]
= e^{−λ} λ^x / x!
(using the above result and the property that the limit of a product is the product of the
limits: each factor (n − i + 1)/n → 1, (1 − λ/n)^n → e^{−λ}, and (1 − λ/n)^{−x} → 1)
Poisson as a limiting distribution...example
• We have a 300-page novel with 1,500 letters on each page.
• Typing errors are as likely to occur for one letter as for another, and the probability of such
an error is given by p = 10⁻⁵.
• The total number of letters is n = (300)(1500) = 450,000.
• Using λ = np = 4.5, the Poisson distribution function gives us the probability of the number
of errors being less than or equal to 10 as:
P(X ≤ 10) ≈ ∑_{x=0}^{10} e^{−4.5} (4.5)^x / x! = .9933
Rules of Thumb: the Poisson approximation is close to binomial probabilities when n ≥ 20 and
p ≤ .05, and excellent when n ≥ 100 and np ≤ 10.
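The quality of the approximation in this example can be seen by computing both tail probabilities directly. A sketch using only the Python standard library (not part of the slides); exact binomial arithmetic with these n and p is perfectly feasible in Python.

```python
from math import comb, exp, factorial

n, p = 450_000, 1e-5
lam = n * p  # 4.5

# P(X <= 10) under the exact binomial and under the Poisson approximation
binom_cdf = sum(comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(11))
poisson_cdf = sum(exp(-lam) * lam**x / factorial(x) for x in range(11))
```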
Discrete distributions: Hypergeometric
• Suppose, as in the case of the binomial, there are two possible outcomes and we’re
interested in the probability of x values of a particular outcome, but we are drawing
randomly without replacement, so our trials are not independent.
• In particular, suppose there are A + B objects from which we pick n; A of the total number
available are of one type (red balls) and the rest are of the other (blue balls).
• If the random variable is the total number of red balls selected, then, for appropriate values
of x, we have
f(x;A,B,n) = C(A,x) C(B,n−x) / C(A+B,n)
• Over what values of x is this defined? max{0, n − B} ≤ x ≤ min{n, A}
• The multivariate extension is (for xi ∈ {0, 1, 2, ..., n} with ∑_{i=1}^{m} xi = n and
∑_{i=1}^{m} Ki = M):
f(x1, ..., xm; K1, ..., Km, n) = [∏_{j=1}^{m} C(Kj, xj)] / C(M, n)
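The support given above is easy to confirm by summing the pmf over exactly that range. An illustrative sketch (not from the slides), with arbitrary small values A = 5, B = 7, n = 4:

```python
from math import comb

def hypergeom_pmf(x, A, B, n):
    # x red balls when drawing n without replacement from A red and B blue
    return comb(A, x) * comb(B, n - x) / comb(A + B, n)

A, B, n = 5, 7, 4
support = range(max(0, n - B), min(n, A) + 1)
total = sum(hypergeom_pmf(x, A, B, n) for x in support)     # should be 1
mean = sum(x * hypergeom_pmf(x, A, B, n) for x in support)  # n*A/(A+B) = 5/3
```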
Continuous distributions: uniform or rectangular
• Parameters: (a,b) ∈ Ω, −∞ < a < b < ∞
• Density: f(x;a,b) = [1/(b − a)] I(a,b)(x) (hence the name rectangular)
• Moments: µ = (a + b)/2, σ² = (b − a)²/12
• Applications:
– to construct the probability space of an experiment in which any outcome in the
interval [a,b] is equally likely.
– to generate random samples from other distributions (based on the probability integral
transformation). This is part of your first lab assignment.
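The probability integral transformation mentioned here can be sketched in a few lines. This is not the lab assignment itself, just an illustration for the exponential case, assuming stdlib Python; the function name and seed are arbitrary.

```python
import random
from math import log

def exponential_sample(beta, rng):
    # Probability integral transform: if U ~ Uniform(0, 1), then
    # X = F^{-1}(U) = -beta * log(1 - U) has the Exponential(beta) distribution
    u = rng.random()
    return -beta * log(1 - u)

rng = random.Random(0)
draws = [exponential_sample(2.0, rng) for _ in range(100_000)]
sample_mean = sum(draws) / len(draws)  # should be close to beta = 2
```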
The gamma function
The gamma function is a special mathematical function that is widely used in statistics. The
gamma function of α is defined as
Γ(α) = ∫₀^∞ y^{α−1} e^{−y} dy   (1)
• If α = 1, Γ(α) = ∫₀^∞ e^{−y} dy = [−e^{−y}]₀^∞ = 1
• If α > 1, we can integrate (1) by parts, setting u = y^{α−1} and dv = e^{−y} dy, and using the
formula ∫₀^∞ u dv = [uv]₀^∞ − ∫₀^∞ v du to get:
Γ(α) = [−y^{α−1} e^{−y}]₀^∞ + (α − 1) ∫₀^∞ y^{α−2} e^{−y} dy
• The first term in the above expression is zero because the exponential function goes to zero
faster than any polynomial grows, and we obtain
Γ(α) = (α − 1) Γ(α − 1)
and for any integer α > 1, we have
Γ(α) = (α − 1)(α − 2)(α − 3) ... (3)(2)(1) Γ(1) = (α − 1)!
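Both the recursion and the factorial identity can be spot-checked numerically; `math.gamma` implements Γ. A quick sketch, not part of the slides; the check points 2.5, 4.0, 7.3 are arbitrary.

```python
from math import gamma, factorial, isclose

# recursion Gamma(a) = (a - 1) * Gamma(a - 1), including non-integer points
recursion_ok = all(isclose(gamma(a), (a - 1) * gamma(a - 1)) for a in (2.5, 4.0, 7.3))
# and Gamma(n) = (n - 1)! at the integers
factorial_ok = all(isclose(gamma(n), factorial(n - 1)) for n in range(1, 10))
```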
The gamma distribution
Define the variable x by y = x/β, where β > 0. Then dy = (1/β) dx and we can rewrite Γ(α) as
Γ(α) = ∫₀^∞ (x/β)^{α−1} e^{−x/β} (1/β) dx
or as
1 = ∫₀^∞ [1/(Γ(α) β^α)] x^{α−1} e^{−x/β} dx
This shows that for α, β > 0,
f(x;α,β) = [1/(Γ(α) β^α)] x^{α−1} e^{−x/β} I(0,∞)(x)
is a valid density and is known as a gamma-type probability density function.
Features of the gamma density
This is a valuable distribution because it can take a variety of shapes depending on the values of
the parameters α and β
• It is skewed to the right
• It is strictly decreasing when α≤ 1
• If α = 1, we have the exponential density, which is memory-less.
• For α > 1 the density attains its maximum at x = (α − 1)β
[Figure 5.7 from DeGroot and Schervish: graphs of the p.d.f.’s of several different gamma
distributions with a common mean of 1.]
Moments of the gamma distribution
• Parameters: (α,β) ∈ Ω, α > 0, β > 0
• Moments: µ = αβ, σ² = αβ²
• MGF: M_X(t) = (1 − βt)^{−α} for t < 1/β, which can be derived as follows:
M_X(t) = ∫₀^∞ e^{tx} [1/(Γ(α) β^α)] x^{α−1} e^{−x/β} dx
= ∫₀^∞ [1/(Γ(α) β^α)] x^{α−1} e^{−(1/β − t)x} dx
= [1/(Γ(α) β^α)] · Γ(α)/(1/β − t)^α   (by setting y = (1/β − t)x in the expression for Γ(α))
= 1/(1 − βt)^α
Gamma applications
• Survival analysis
– The waiting time till the rth event/success: If X is the time that passes until the first
success, then X has a gamma distribution with α = 1 and β = 1/γ. This is known as an
exponential distribution. If, instead, we are interested in the time taken for the rth
success, this has a gamma density with α = r and β = 1/γ.
– Relation to the Poisson distribution: If Y, the number of events in a given time period t,
has a Poisson density with parameter λ, the rate of success is given by γ = λ/t.
Example: A bottling plant breaks down, on average, twice every four weeks. We want the
probability that the number of breakdowns X ≤ 3 in the next four weeks. We have λ = 2
and the breakdown rate γ = 1/2 per week.
P(X ≤ 3) = ∑_{i=0}^{3} e^{−2} 2^i / i! = .135 + .271 + .271 + .180 = .857
Suppose we wanted the probability that the machine does not break down in the next four
weeks. The time taken until the first breakdown, X, must therefore be more than four
weeks. This follows a gamma (exponential) distribution with α = 1 and β = 1/γ = 2:
P(X ≥ 4) = ∫₄^∞ (1/2) e^{−x/2} dx = [−e^{−x/2}]₄^∞ = e^{−2} = .135
• Income distributions that are uni-modal
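The two numbers in the bottling-plant example are quick to reproduce; a sketch (not from the slides) using only the Python standard library:

```python
from math import exp, factorial

lam = 2.0  # expected breakdowns in a four-week period
# Poisson probability of at most 3 breakdowns in four weeks
p_at_most_3 = sum(exp(-lam) * lam**i / factorial(i) for i in range(4))
# no breakdown in four weeks: exponential survival with beta = 1/gamma = 2 weeks
p_no_breakdown = exp(-4 / 2)
```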
Gamma distributions: some useful properties
• Gamma Additivity: Let X1, ..., Xn be independently distributed random variables with
respective gamma densities Gamma(αi, β). Then
Y = ∑_{i=1}^{n} Xi ∼ Gamma(∑_{i=1}^{n} αi, β)
• Scaling Gamma Random Variables: Let X be distributed with gamma density
Gamma(α,β) and let c > 0. Then
Y = cX ∼ Gamma(α, βc)
Both of these can be easily proved using the gamma MGF and applying the MGF uniqueness
theorem. In the first case the MGF of Y is the product of the individual MGFs, i.e.
M_Y(t) = ∏_{i=1}^{n} M_{Xi}(t) = ∏_{i=1}^{n} (1 − βt)^{−αi} = (1 − βt)^{−∑_{i=1}^{n} αi} for t < 1/β
For the second result, M_Y(t) = M_{cX}(t) = M_X(ct) = (1 − βct)^{−α} for t < 1/(βc)
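Gamma additivity can also be seen in simulation. A sketch, not from the slides, assuming stdlib Python; note that `random.gammavariate(alpha, beta)` uses the same shape/scale convention as these slides, and the shapes (1, 2, 1.5) and β = 2 are arbitrary illustrations.

```python
import random

rng = random.Random(42)
n = 50_000
alphas, beta = (1.0, 2.0, 1.5), 2.0
# Sum of three independent Gamma(alpha_i, beta) draws; by additivity the sum
# should behave like Gamma(4.5, 2): mean alpha*beta = 9, variance alpha*beta^2 = 18
sums = [sum(rng.gammavariate(a, beta) for a in alphas) for _ in range(n)]
sample_mean = sum(sums) / n
sample_var = sum((s - sample_mean) ** 2 for s in sums) / n
```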
The gamma family: exponential distributions
An exponential distribution is simply a gamma distribution with α = 1
• Parameters: β ∈Ω, β > 0
• Density: f(x;β) = (1/β) e^{−x/β} I(0,∞)(x)
• Moments: µ = β, σ² = β²
• MGF: M_X(t) = (1 − βt)^{−1} for t < 1/β
• Applications: As discussed above, the most important application is the representation of
operating lives. The exponential is memoryless, so if failure hasn’t occurred, the object
(or person, or animal) is as good as new. The risk of failure at any point t is given by the
hazard rate,
h(t) = f(t)/S(t)
where S(t) is the survival function, 1 − F(t). Verify that the hazard rate in this case is a
constant, 1/β.
If we would like to model wear-out effects, we should use a gamma with α > 1, and for
work-hardening effects, a gamma with α < 1.
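The constant-hazard claim is immediate once f(t) and S(t) are written out, and easy to confirm numerically. An illustrative sketch (not from the slides), with β = 3 chosen arbitrarily:

```python
from math import exp

def hazard(t, beta):
    # h(t) = f(t) / S(t) with f(t) = (1/beta) e^{-t/beta} and S(t) = e^{-t/beta}
    f = (1 / beta) * exp(-t / beta)
    S = exp(-t / beta)
    return f / S

# the exponential hazard should be flat at 1/beta regardless of t
rates = [hazard(t, 3.0) for t in (0.1, 1.0, 5.0, 20.0)]
```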
The gamma family: chi-square distributions
A chi-square distribution is simply a gamma distribution with α = v/2 and β = 2
• Parameters: v ∈ Ω, v a positive integer (referred to as the degrees of freedom)
• Density: f(x;v) = [1/(2^{v/2} Γ(v/2))] x^{v/2 − 1} e^{−x/2} I(0,∞)(x)
• Moments: µ = v, σ² = 2v
• MGF: M_X(t) = (1 − 2t)^{−v/2} for t < 1/2
• Applications:
• Notice that for v = 2, the chi-square density is equivalent to the exponential density with
β = 2. It is therefore decreasing for this value of v and hump-shaped for higher values of v.
• The χ² is especially useful in problems of statistical inference because if we have v
independent random variables Xi ∼ N(0,1), their sum of squares ∑_{i=1}^{v} Xi² ∼ χ²_v. Many
of the estimators we use in our models fit this case (i.e. they can be expressed in terms of
sums of squares of independent normal variables).
The Normal (or Gaussian) distribution
This symmetric bell-shaped density is widely used because:
1. Outcomes of certain types of continuous random variables can be shown to follow this type
of distribution; this is the motivation we’ve used for most parametric distributions we’ve
considered so far (heights of humans, animals and plants; weights; strength of physical
materials; the distance from the centre of a target when errors in both directions are
independent).
2. It has nice mathematical properties: many functions of a set of normally distributed
random variables have distributions that take simple forms.
3. Central Limit Theorems: The sample mean of a random sample from any distribution with
finite variance is approximately normal.
The Normal density
• Parameters: (µ,σ2) ∈Ω, µ ∈ (−∞,∞), σ > 0
• Density: f(x;µ,σ²) = [1/(σ√(2π))] e^{−(1/2)((x−µ)/σ)²} I(−∞,+∞)(x)
• MGF: M_X(t) = e^{µt + σ²t²/2}
• The MGF can be used to derive the moments: E(X) = µ and the variance is σ².
• As can be seen from the p.d.f., the distribution is symmetric around µ, where it achieves its
maximum value. This is therefore also the median and the mode of the distribution.
• The normal distribution with zero mean and unit variance is known as the standard normal
distribution and has the form f(x;0,1) = [1/√(2π)] e^{−x²/2} I(−∞,+∞)(x)
• The tails of the distribution are thin: 68% of the total probability lies within one σ of the
mean, 95.4% within 2σ and 99.7% within 3σ.
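These tail figures can be reproduced from the standard normal CDF, which is expressible through the error function. A sketch (not from the slides) using only the Python standard library:

```python
from math import erf, sqrt

def std_normal_cdf(z):
    # Phi(z) via the error function: Phi(z) = (1 + erf(z / sqrt(2))) / 2
    return 0.5 * (1 + erf(z / sqrt(2)))

# probability mass within k standard deviations of the mean
within = {k: std_normal_cdf(k) - std_normal_cdf(-k) for k in (1, 2, 3)}
```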
The Normal distribution: deriving the MGF
By the definition of the MGF:
M(t) = ∫_{−∞}^{∞} e^{tx} [1/(σ√(2π))] e^{−(x−µ)²/(2σ²)} dx
= ∫_{−∞}^{∞} [1/(σ√(2π))] e^{[tx − (x−µ)²/(2σ²)]} dx
We can rewrite the term inside the square brackets to obtain:
tx − (x−µ)²/(2σ²) = µt + (1/2)σ²t² − [x − (µ + σ²t)]²/(2σ²)
The MGF can now be written as:
M_X(t) = C e^{µt + (1/2)σ²t²}
where C = ∫_{−∞}^{∞} [1/(σ√(2π))] e^{−[x − (µ+σ²t)]²/(2σ²)} dx = 1, because the integrand is a normal p.d.f. with parameter µ
replaced by (µ + σ²t).
The Normal distribution: computing moments
• First taking derivatives of the MGF:
M(t) = e^{µt + σ²t²/2}
M′(t) = M(t)(µ + σ²t)
M″(t) = M(t)σ² + M(t)(µ + σ²t)²
(obtained by differentiating M(t) with respect to t and substituting for M′(t))
• Evaluating these at t = 0, we get M′(0) = µ and M″(0) = σ² + µ², so the variance is
M″(0) − [M′(0)]² = σ².
Transformations of Normally Distributed Variables...1
RESULT 1: Let X ∼ N(µ,σ²). Then Z = (X − µ)/σ ∼ N(0,1)
Proof: Z is of the form aX + b with a = 1/σ and b = −µ/σ. Therefore
M_Z(t) = e^{bt} M_X(at) = e^{−µt/σ} e^{µt/σ + σ²t²/(2σ²)} = e^{t²/2}
which is the MGF of a standard normal distribution.
An important implication of the above result is that if we are interested in any distribution in
this class of normal distributions, we only need to be able to compute integrals for the standard
normal; these are the tables you’ll see at the back of most textbooks.
Example: The kilometres per litre of fuel achieved by a new Maruti model X ∼ N(17, .25), so
σ = .5. What is the probability that a new car will achieve between 16 and 18 kilometres per litre?
Answer: P(16 ≤ X ≤ 18) = P((16 − 17)/.5 ≤ Z ≤ (18 − 17)/.5) = P(−2 ≤ Z ≤ 2) = 1 − 2(.0228) = .9544
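Instead of tables, the same number comes straight from the error function. A sketch, not from the slides, assuming stdlib Python:

```python
from math import erf, sqrt

def normal_cdf(x, mu, sigma):
    # P(X <= x) for X ~ N(mu, sigma^2), via the standard normal CDF
    return 0.5 * (1 + erf((x - mu) / (sigma * sqrt(2))))

# Maruti example: X ~ N(17, 0.25), so sigma = 0.5
prob = normal_cdf(18, 17, 0.5) - normal_cdf(16, 17, 0.5)
```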
Transformations of Normals...2
• RESULT 2: Let X ∼ N(µ,σ²) and Y = aX + b, where a and b are given constants and a ≠ 0.
Then Y has a normal distribution with mean aµ + b and variance a²σ².
Proof: The MGF of Y can be expressed as M_Y(t) = e^{bt} e^{µat + (1/2)σ²a²t²} = e^{(aµ+b)t + (1/2)(aσ)²t²}.
This is simply the MGF for a normal distribution with mean aµ + b and variance a²σ².
• RESULT 3: If X1, ..., Xk are independent and Xi has a normal distribution with mean µi
and variance σi², then Y = X1 + ··· + Xk has a normal distribution with mean µ1 + ··· + µk
and variance σ1² + ··· + σk².
Proof: Write the MGF of Y as the product of the MGFs of the Xi’s and gather linear and
squared terms separately to get the desired result.
• We can combine these two results to derive the distribution of the sample mean:
RESULT 4: Suppose that the random variables X1, ..., Xn form a random sample from a
normal distribution with mean µ and variance σ², and let X̄n denote the sample mean.
Then X̄n has a normal distribution with mean µ and variance σ²/n.
Transformations of Normals to χ2 distributions
RESULT 5: If X ∼ N(0,1), then Y = X² has a χ² distribution with one degree of freedom.
Proof:
M_Y(t) = ∫_{−∞}^{∞} e^{x²t} [1/√(2π)] e^{−x²/2} dx
= ∫_{−∞}^{∞} [1/√(2π)] e^{−(1/2)x²(1−2t)} dx
= [1/√(1 − 2t)] ∫_{−∞}^{∞} [√(1 − 2t)/√(2π)] e^{−(1/2)x²(1−2t)} dx
= (1 − 2t)^{−1/2} for t < 1/2
(the integrand in the last integral is a normal density with µ = 0 and σ² = 1/(1 − 2t)).
The MGF obtained is that of a χ² random variable with v = 1, since the χ² MGF is given by
M_X(t) = (1 − 2t)^{−v/2} for t < 1/2.
Normals and χ2 distributions...
RESULT 6: Let X1, ..., Xn be independent random variables with each Xi ∼ N(0,1). Then
Y = ∑_{i=1}^{n} Xi² has a χ² distribution with n degrees of freedom.
Proof:
M_Y(t) = ∏_{i=1}^{n} M_{Xi²}(t) = ∏_{i=1}^{n} (1 − 2t)^{−1/2} = (1 − 2t)^{−n/2} for t < 1/2
which is the MGF of a χ² random variable with v = n. This is the reason that the parameter v is
called the degrees of freedom: there are n freely varying random variables whose sum of squares
represents a χ²_v-distributed random variable. This also follows directly from gamma additivity.
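Result 6 is also easy to see in simulation: sums of squares of standard normals should show the χ² moments µ = v and σ² = 2v. A sketch, not from the slides, assuming stdlib Python; v = 5 and the seed are arbitrary.

```python
import random

rng = random.Random(7)
n, v = 40_000, 5
# each sample is a sum of v squared independent N(0,1) draws;
# the sample moments should be near v and 2v
samples = [sum(rng.gauss(0, 1) ** 2 for _ in range(v)) for _ in range(n)]
sample_mean = sum(samples) / n
sample_var = sum((s - sample_mean) ** 2 for s in samples) / n
```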
The Bivariate Normal distribution
The bivariate normal has the density:
f(x,y) = [1/(2πσ1σ2√(1 − ρ²))] e^{−q/2}
where
q = [1/(1 − ρ²)] [((x − µ1)/σ1)² − 2ρ((x − µ1)/σ1)((y − µ2)/σ2) + ((y − µ2)/σ2)²]
• E(Xi) = µi, Var(Xi) = σi², and the correlation coefficient ρ(X1,X2) = ρ
• Verify that in this case, X1 and X2 are independent iff they are uncorrelated.
• Applications: heights of couples, scores on tests...
The Multivariate Normal distribution
• Parameters: (a, B) ∈ Ω, a ∈ ℝⁿ, B a symmetric positive definite matrix.
• Density: f(x; a, B) = [1/(|B|^{1/2} (2π)^{n/2})] e^{−(1/2)(x−a)′B⁻¹(x−a)}
• Moments: µ = a, Cov(X) = B
• MGF: M_X(t) = e^{a′t + (1/2)t′Bt}. Note: there are n + n(n+1)/2 parameters, n means and
n(n+1)/2 unique elements in the variance-covariance matrix B.
• Applications: statistical inference in the classical linear regression model...and with large
samples in other models.
Additional distributions that we’ll use mainly for inference are the Student’s t-distribution and
F-distribution. We’ll introduce these in the second half of the course.