Week 4 Annotated
7/23/2019 Week 4 Annotated
http://slidepdf.com/reader/full/week-4-annotated 1/95
ACTL2002/ACTL5101 Probability and Statistics: Week 4 Video Lecture Notes
ACTL2002/ACTL5101 Probability and Statistics
© Katja Ignatieva
School of Risk and Actuarial Studies, Australian School of Business
University of New South Wales
Week 4 Video Lecture Notes
Probability: Week 1 Week 2 Week 3 Week 4
Estimation: Week 5 Week 6 Review
Hypothesis testing: Week 7 Week 8 Week 9
Linear regression: Week 10 Week 11 Week 12
Video lectures: Week 1 VL Week 2 VL Week 3 VL Week 5 VL
Sampling
- Sampling with and without replacement: Population parameters; Random sampling: general; Sampling with replacement; Sampling without replacement; Example.
- Properties of the sample mean & variance: Background; Sample mean: moments; Sample variance: mean.
Population parameters
Recall:
- Population: the large body of data;
- Sample: a subset of the population.

Population of size N (you have full information about the population you are interested in).

x_1, x_2, . . . , x_N are the characteristics of interest, which can be:
- continuous;
- discrete.
The population mean is given by:

µ = (1/N) · Σ_{i=1}^N x_i.
Recall: the population variance is given by:

σ² = (1/N) · Σ_{i=1}^N (x_i − µ)²
= (1/N) · [Σ_{i=1}^N x_i² − 2 · µ · Σ_{i=1}^N x_i + N · µ²]
= (1/N) · [Σ_{i=1}^N x_i² − 2 · µ · N · µ + N · µ²]
= (Σ_{i=1}^N x_i²)/N − µ².
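The shortcut σ² = (Σ x_i²)/N − µ² in the last line can be verified numerically; a minimal sketch with a small hypothetical population (the six values are illustrative only):

```python
# Hypothetical population of N = 6 values; check the shortcut
# sigma^2 = (sum of x_i^2)/N - mu^2 against the definition.
x = [2.0, 3.0, 5.0, 7.0, 11.0, 13.0]
N = len(x)

mu = sum(x) / N
var_def = sum((xi - mu) ** 2 for xi in x) / N       # (1/N) * sum (x_i - mu)^2
var_short = sum(xi ** 2 for xi in x) / N - mu ** 2  # (sum x_i^2)/N - mu^2

assert abs(var_def - var_short) < 1e-12
```

The two expressions agree to floating-point precision, as the algebra above predicts.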
Random sampling: general
Now, consider a sample (size n) from the whole underlying population (size N);

Each sample of size n has the same probability of occurring;

Select from the population with or without replacement.
- With replacement: the X_i are i.i.d. for i = 1, . . . , n;
- Without replacement: X_i and X_j are dependent, i.e., once x_i is selected, Pr(X_j = x_i) = 0;

The values of the sample, X_1, X_2, . . . , X_n, are random variables.

We have: E[X_i] = µ.
The sample mean is given by:

X̄ = (1/n) · Σ_{i=1}^n X_i.

Recall that the sample mean is a random variable with a sampling distribution and with mean:

E[X̄] = µ.

The above result holds for sampling both with and without replacement.

X̄ is correct "on average"; this is called unbiased (more coverage of this later in the course).
In general (for sampling both with and without replacement), the variance of the sample mean is:

Var(X̄) = Var((1/n) · Σ_{i=1}^n X_i) ∗= Cov((1/n) · Σ_{i=1}^n X_i, (1/n) · Σ_{j=1}^n X_j) ∗= (1/n²) · Σ_{i=1}^n Σ_{j=1}^n Cov(X_i, X_j),

* using the properties of the covariance.
Sampling with replacement
Now, consider sampling with replacement; then we have:

Cov(X_i, X_j) ∗= 0, if i ≠ j; Var(X_i) = σ², if i = j.

* using independence between X_i and X_j. Thus we have:

Var(X̄) ∗∗= (1/n²) · Σ_{i=1}^n Σ_{j=1}^n Cov(X_i, X_j) = (1/n²) · Σ_{i=1}^n Var(X_i) = n · σ²/n² = σ²/n.

** using slide 606.

The standard deviation of X̄, i.e., the standard error, is: σ_X̄ = σ/√n.
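The σ/√n result can be illustrated by simulation; a sketch assuming a small illustrative population and n = 3 draws with replacement:

```python
import random
import statistics

# Draw many samples of size n WITH replacement from a small hypothetical
# population; the sd of the sample means should approach sigma / sqrt(n).
random.seed(1)
pop = [1, 2, 4, 8, 16]
N = len(pop)
mu = sum(pop) / N
sigma2 = sum((x - mu) ** 2 for x in pop) / N   # population variance
n = 3

means = []
for _ in range(200_000):
    sample = [random.choice(pop) for _ in range(n)]   # with replacement
    means.append(sum(sample) / n)

se_emp = statistics.pstdev(means)       # empirical sd of the sample means
se_theory = (sigma2 / n) ** 0.5         # sigma / sqrt(n)
assert abs(se_emp - se_theory) < 0.05
```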
Sampling without replacement

Now, consider sampling without replacement ⇒ this causes dependence in the X_i. How does that affect:

Var(X̄) = (1/n²) · Σ_{i=1}^n Σ_{j=1}^n Cov(X_i, X_j)?

Consider the simple case where all the x_i are different. We have:

Cov(X_i, X_j) = Σ_{i=1}^N Σ_{j=1}^N (x_i − µ) · (x_j − µ) · Pr(X_i = x_i, X_j = x_j),

where, due to sampling without replacement, we have:

Pr(X_i = x_i, X_j = x_j) = (1/N) · (1/(N − 1)), if x_i ≠ x_j; 0, if x_i = x_j.
We have (note: {1, . . . , N}\i = {1, 2, . . . , i − 1, i + 1, . . . , N}):

Cov(X_i, X_j) ∗= Σ_{i=1}^N Σ_{j∈{1,...,N}\i} (x_i − µ) · (x_j − µ) · (1/N) · (1/(N − 1))
= (1/N) · (1/(N − 1)) · Σ_{i=1}^N (x_i − µ) · Σ_{j∈{1,...,N}\i} (x_j − µ)
= (1/N) · (1/(N − 1)) · Σ_{i=1}^N (x_i − µ) · [Σ_{j=1}^N (x_j − µ) − (x_i − µ)]
∗∗= (1/N) · (1/(N − 1)) · Σ_{i=1}^N −(x_i − µ)² = −σ²/(N − 1),

* j ∈ {1, . . . , N}\i because Pr(X_i = x_i, X_j = x_i) = 0; ** using Σ_{ℓ=1}^N (x_ℓ − µ) = 0.
Thus, Cov(X_i, X_j) = −σ²/(N − 1) for i ≠ j.

Given this dependence, in the case of sampling without replacement, what is Var(X̄)?

Var(X̄) ∗= (1/n²) · Σ_{i=1}^n Σ_{j=1}^n Cov(X_i, X_j)
= (1/n²) · Σ_{i=1}^n [Var(X_i) + Σ_{j∈{1,...,n}\i} Cov(X_i, X_j)],

* using slide 606. Continues on the next slide.
Thus:

Var(X̄) = (1/n²) · [n · σ² + n · (n − 1) · (−σ²/(N − 1))] = (σ²/n) · (1 − (n − 1)/(N − 1)),

using Var(X_i) = σ² and Cov(X_i, X_j) = −σ²/(N − 1). This differs from the simple sampling with replacement case above.

Thus, when we sample without replacement we need a finite population correction of:

1 − (n − 1)/(N − 1).
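Because the population is finite, Var(X̄) without replacement can be checked exactly by enumerating every ordered sample; a sketch with an assumed five-value population and n = 3:

```python
import itertools

# Enumerate all ordered samples of size n drawn WITHOUT replacement from a
# small hypothetical population and compute Var(X-bar) directly; compare
# with (sigma^2 / n) * (1 - (n-1)/(N-1)).
pop = [1.0, 2.0, 4.0, 8.0, 16.0]
N = len(pop)
mu = sum(pop) / N
sigma2 = sum((x - mu) ** 2 for x in pop) / N
n = 3

means = [sum(s) / n for s in itertools.permutations(pop, n)]
var_exact = sum((m - mu) ** 2 for m in means) / len(means)
var_formula = sigma2 / n * (1 - (n - 1) / (N - 1))

assert abs(var_exact - var_formula) < 1e-9
```

Every ordered sample is equally likely here, so the enumeration reproduces the finite-population-correction formula exactly.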
If we sample without replacement, this means we would need to build a whole new theory involving these finite population corrections.

This is often not necessary: the approximation for the sampling distribution of X̄ without the correction is usually accurate enough (in case n is small relative to N).

Hence, if N is large relative to n we have 1 − (n − 1)/(N − 1) ≈ 1.

This implies that sampling with or without replacement is approximately the same if N is large relative to n.
Example sampling with and without replacement

Suppose an insurer has 10 observations of terrorism insurance claims.

The claim sizes do not fit a standard distribution. Summary statistics: Σ_{i=1}^{10} x_i = 100 and Σ_{i=1}^{10} x_i² = 1,250.

The insurer is going to forecast the 4-year-ahead variance of the average claim size (i.e., the average of n = 4 future claims).

Using sampling with replacement the variance is:

σ²/4 = (1250/10 − (100/10)²)/4 = 25/4 = 6.25.

Using sampling without replacement the variance is:

σ²/4 · (1 − (4 − 1)/(10 − 1)) = 25/4 · 6/9 = 25/6 ≈ 4.17.
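The example's two variances can be reproduced directly from the summary statistics:

```python
# The slide's numbers: N = 10 past claims with sum 100 and sum of squares
# 1250; forecast the variance of the average of n = 4 future claims.
N, n = 10, 4
sum_x, sum_x2 = 100, 1250

mu = sum_x / N
sigma2 = sum_x2 / N - mu ** 2          # population variance shortcut: 25

var_with = sigma2 / n                               # with replacement
var_without = sigma2 / n * (1 - (n - 1) / (N - 1))  # without replacement

assert abs(var_with - 6.25) < 1e-12
assert abs(var_without - 25 / 6) < 1e-12
```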
Properties of the sample mean and sample variance
Suppose you select randomly from a population.

Assume selection is with replacement or, alternatively, from a large population.

The outcomes (X_1, . . . , X_n) are then random variables, all with the same distribution and independent.

Suppose X_1, X_2, . . . , X_n are n independent r.v. with identical distribution. Define the sample mean by:

X̄ = (1/n) · Σ_{k=1}^n X_k,

and recall that the sample variance is defined by:

S² = (1/(n − 1)) · Σ_{k=1}^n (X_k − X̄)².
Sample mean: moments
Recall: the sample mean is given by:

X̄ = (Σ_{i=1}^n X_i)/n.

Note: X̄ is a random variable!

Using the i.i.d. property, the expected value of the sample mean is:

E[X̄] = E[(Σ_{i=1}^n X_i)/n] = (Σ_{i=1}^n E[X_i])/n = n · µ/n = µ.
The variance of the sample mean is given by:

Var(X̄) = Var((Σ_{i=1}^n X_i)/n) = (Σ_{i=1}^n Var(X_i))/n² = n · σ²/n² = σ²/n.

Hence, the uncertainty in the sample mean decreases as n increases.

The standard deviation of the sample mean is:

√Var(X̄) = σ/√n.
Sample variance: mean
Recall: the sample variance is defined by:

S² = (Σ_{i=1}^n (X_i − X̄)²)/(n − 1) = (Σ_{i=1}^n X_i² − n · X̄²)/(n − 1).

Question: why divide by n − 1 rather than n?

Solution: because then the expectation of the sample variance is equal to the population variance.

Proof: see next slide.
Proof:

E[S²] = (1/(n − 1)) · E[Σ_{i=1}^n X_i² − n · X̄²]
= (1/(n − 1)) · [Σ_{i=1}^n E[X_i²] − n · E[X̄²]]
∗= (1/(n − 1)) · [Σ_{i=1}^n (σ² + µ²) − n · (σ²/n + µ²)]
= (1/(n − 1)) · [n · σ² + n · µ² − (σ² + n · µ²)]
= (1/(n − 1)) · (n − 1) · σ² = σ².

* using E[X̄²] = Var(X̄) + (E[X̄])², i.e., the mean and variance of the r.v. X̄, and E[X_i²] = Var(X_i) + (E[X_i])².
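The unbiasedness E[S²] = σ² can be illustrated by simulation; a sketch assuming an Exponential(1) population (so σ² = 1), using the n − 1 divisor that `statistics.variance` applies:

```python
import random
import statistics

# Average the sample variance (divisor n - 1) over many small samples from
# an assumed Exponential(1) population; the average should be near sigma^2 = 1.
random.seed(42)
n = 5
trials = 200_000

s2_sum = 0.0
for _ in range(trials):
    xs = [random.expovariate(1.0) for _ in range(n)]
    s2_sum += statistics.variance(xs)      # sample variance, divisor n - 1

mean_s2 = s2_sum / trials
assert abs(mean_s2 - 1.0) < 0.02           # E[S^2] = sigma^2 = 1
```

Using the divisor n instead would bias the average down by the factor (n − 1)/n, exactly as the proof above implies.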
ACTL2002/ACTL5101 Probability and Statistics: Week 4
ACTL2002/ACTL5101 Probability and Statistics

© Katja Ignatieva

School of Risk and Actuarial Studies, Australian School of Business, University of New South Wales

Week 4
Probability: Week 1 Week 2 Week 3
Estimation: Week 5 Week 6 Review
Hypothesis testing: Week 7 Week 8 Week 9
Linear regression: Week 10 Week 11 Week 12
Video lectures: Week 1 VL Week 2 VL Week 3 VL Week 4 VL Week 5 VL
Last three weeks
Introduction to probability;
Distribution function;
Moments: (non-)central moments, mean, variance (standard deviation), skewness & kurtosis;

Special univariate distributions (discrete & continuous);
Joint distributions;
Dependence of multivariate distributions.
This week
Functions of random variables:
- Univariate case: order statistics (min/max/range);
- Univariate case: CDF-technique;
- Multivariate case: Jacobian transformation;
- Multivariate case: MGF-technique;
- Multivariate case: Convolutions (sum of independent r.v.).
Distributions of functions of random variables
- Introduction
- Order statistics: Distributions of order statistics; Example, application & exercise order statistics
- The CDF Technique: Preface; Examples; Exercises
- The Jacobian transformation technique: Fundamentals; Example Jacobian technique
- The MGF technique: Motivation; Applications & exercise of the MGF technique
- Sums (Convolutions): Convolutions; Exercise
- Approximate methods: Delta-method
- Distribution characteristics in samples: Exercise
Introduction
Often in practice we need to consider functions of a random variable (the simplest cases being Y = log(X) for security returns or Y = Σ_i X_i for portfolios of risks).

Techniques we will consider:
- CDF/PMF/PDF technique;
- Jacobian transformation technique;
- MGF technique;
- Convolutions (for sums of random variables).
Suppose X_1, X_2, . . . , X_n are n random variables.

We consider techniques that may be used to find the distribution of functions of these random variables, say Y = u(X_1, . . . , X_n).

This will allow us to consider some very important topics for insurance and financial modelling:
- Distributions of order statistics (the maximum claim, the minimum loss, the 95% quantile of a profit distribution);
- Special sampling distributions: chi-squared distribution, t-distribution, F-distribution (application of the CDF technique, MGF technique, and Jacobian transformation; see online lecture week 5).
Distributions of order statistics
Assume X_1, X_2, . . . , X_n are n independent, identically distributed (i.i.d.) random variables with common distribution function F_X(x) and density f_X(x).

Sort these variables and denote by:

X_(1) < X_(2) < . . . < X_(n)

the order statistics (also seen in week 3).

In particular, X_(1) = min{X_1, . . . , X_n} is the minimum and X_(n) = max{X_1, . . . , X_n} is the maximum.

For simplicity, denote U = X_(n) and V = X_(1).
Distribution of the maximum

Interested in the largest possible claim?

Deriving the distribution of the maximum, we have:

F_U(u) = Pr(U ≤ u) ∗= Pr(X_1 ≤ u) · Pr(X_2 ≤ u) · . . . · Pr(X_n ≤ u) = (F_X(u))^n,

and the density function is:

f_U(u) ∗∗∗= n · f_X(u) · (F_X(u))^(n−1).

* Using Pr(U ≤ u) = Pr(max{X_1, . . . , X_n} ≤ u) = Pr(X_1 ≤ u ∩ . . . ∩ X_n ≤ u) ∗∗= Pr(X_1 ≤ u) · . . . · Pr(X_n ≤ u).
** Using independence.
*** Using the chain rule.
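A quick sanity check of the maximum's CDF; a sketch assuming n = 5 i.i.d. U(0, 1) claims, for which F_U(u) = u^n:

```python
import random

# For n i.i.d. U(0,1) variables, F_U(u) = (F_X(u))^n = u^n; compare the
# empirical CDF of the simulated maximum at a few points.
random.seed(0)
n = 5
maxima = [max(random.random() for _ in range(n)) for _ in range(100_000)]

for u in (0.5, 0.8, 0.95):
    emp = sum(m <= u for m in maxima) / len(maxima)
    assert abs(emp - u ** n) < 0.01
```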
Distribution of the minimum

Interested in the maximum loss?

Deriving the distribution of the minimum, we have:

F_V(v) = Pr(V ≤ v) = 1 − Pr(V > v) ∗= 1 − (Pr(X_1 > v) · . . . · Pr(X_n > v)) = 1 − (1 − F_X(v))^n,

and the corresponding density function is:

f_V(v) ∗∗∗= n · f_X(v) · (1 − F_X(v))^(n−1).

* Using Pr(V ≤ v) = 1 − Pr(min{X_1, . . . , X_n} > v) = 1 − Pr(X_1 > v ∩ . . . ∩ X_n > v) and the X_i are independent.
*** Using the chain rule.
Distribution of the kth order statistic

What is the joint density of all the order statistics?

Let x_1, . . . , x_n be a random (non-ordered) outcome of n draws from the random variable X.

Let y_1, . . . , y_n be the ordered values of x_1, . . . , x_n.

Question: How many sequences x_1, . . . , x_n lead to the same sequence y_1, . . . , y_n?

Solution: n!.

Question: What is the probability (density) of each such sequence?
Solution: f_X(y_1) · f_X(y_2) · . . . · f_X(y_n) (using independence).

The joint probability density of the order statistics is given by:

f_{X_(1), X_(2), . . . , X_(n)}(y_1, y_2, . . . , y_n) = n! · f_X(y_1) · f_X(y_2) · . . . · f_X(y_n).
Distribution of the kth order statistic

Question: How many ways can you order k − 1 observations smaller than x, one observation equal to x, and n − k observations larger than x?

Solution: Use the multinomial coefficient, with n_1 = k − 1, n_2 = 1 and n_3 = n − k. Hence, the number of ways is n!/((k − 1)! · 1! · (n − k)!).

In general, the probability density of the kth order statistic is given by:

f_K(x) = [n!/((k − 1)!(n − k)!)] · f_X(x) · (F_X(x))^(k−1) · (1 − F_X(x))^(n−k),

where n!/((k − 1)!(n − k)!) counts the possible orderings, f_X(x) corresponds to the one observation equal to x, (F_X(x))^(k−1) to the k − 1 observations smaller than x, and (1 − F_X(x))^(n−k) to the n − k observations larger than x.
Example order statistics
Distribution of the range.

Let X_1, X_2, . . . , X_n be independent continuous random variables, each with cumulative distribution F_X(x).

Explain in words why the joint cumulative distribution function of the minimum X_(1) and the maximum X_(n) is given by:

Pr(X_(1) ≤ x, X_(n) ≤ y) = F_{X,Y}(x, y) = (F_X(y))^n − (F_X(y) − F_X(x))^n,

with x ≤ y, Y = X_(n), and X = X_(1).

Note: the maximum and minimum are not independent.
First, consider the maximum. In order for the maximum to be at most y we require that ALL of the Xs be less than y. Probability of this: (F_X(y))^n.

Now, if the minimum is less than x then at least one of the observations must be less than x. In fact, we must exclude all the cases where all the observations are between x and y, because in these cases the minimum is NOT less than x.

The probability that a random variable X will be between x and y is F_X(y) − F_X(x), and the probability that they are all between x and y is (F_X(y) − F_X(x))^n. Hence, by subtracting off the probability that they are all between x and y we ensure we have the probability that the minimum is less than x.
Application order statistics
Consider an insurance company with n branches. Assume that the lifetimes of the branches are T_1, T_2, . . . , T_n, which are i.i.d. exponential with parameter λ.

Suppose that the branches in the system are connected in "series", that is, the insurance company will go bankrupt if any one of the branches goes bankrupt. The lifetime V of the insurance company is therefore the minimum of the T_k, i.e.,

V = min{T_1, . . . , T_n}.

Therefore, the density of V is (exponential with parameter nλ):

f_V(v) = n · f_T(v) · (1 − F_T(v))^(n−1) = n · λ · e^(−λ·v) · (e^(−λ·v))^(n−1) = (n · λ) · e^(−(n·λ)·v).
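The series-system result can be checked by simulation; a sketch with assumed values n = 4 and λ = 0.5, so the minimum should be Exponential(nλ = 2) with mean 0.5:

```python
import random
import statistics

# The minimum of n i.i.d. Exponential(lambda) lifetimes is
# Exponential(n * lambda); check its mean 1/(n * lambda) by simulation.
random.seed(7)
lam, n = 0.5, 4
mins = [min(random.expovariate(lam) for _ in range(n)) for _ in range(200_000)]

assert abs(statistics.mean(mins) - 1 / (n * lam)) < 0.01
```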
Application order statistics
Suppose that the branches are connected in "parallel", that is, the insurance company will go bankrupt only if all of the branches go bankrupt. The lifetime U of the system is therefore the maximum of the T_k, i.e.,

U = max{T_1, . . . , T_n}.

Therefore, the density of U is given by:

f_U(u) = n · f_T(u) · (F_T(u))^(n−1) = n · λ · e^(−λ·u) · (1 − e^(−λ·u))^(n−1).
Exercise order statistics
Let claim sizes be uniformly distributed between 0 and 1.

The insurer knows that he will receive 100 claims next year.

The insurer receives 1 from the reinsurer for each claim larger than 0.995.

Question: What is the probability that the reinsurer has to make at least one payment?

Solution: 1 − Pr(no payment) = 1 − (0.995)^100 = 0.3942.

In a proposed new contract the reinsurer would instead pay twice the second largest claim.

Question: What is the price of this contract, when it is set to the expected value plus half a standard deviation?
Solution: Price = 2 · (99/(100 + 1) + (1/2) · √(99 · (100 − 99 + 1)/((100 + 1)² · (100 + 2)))) = 1.9742.

The second largest of n = 100 claims is the k = 99th order statistic, with density:

f_K(x) = [n!/((k − 1)!(n − k)!)] · f_X(x) · (F_X(x))^(k−1) · (1 − F_X(x))^(n−k)
= [n!/((k − 1)!(n − k)!)] · 1 · x^(k−1) · (1 − x)^(n−k)
= [Γ(n + 1)/(Γ(k) · Γ(n − k + 1))] · x^(k−1) · (1 − x)^((n−k+1)−1).

This is the p.d.f. of a Beta(α = k, β = n − k + 1) distribution (with k = 99 and n = 100). Hence, E[K] = α/(α + β) = k/(n + 1) and Var(K) = α · β/((α + β)² · (α + β + 1)) = k · (n − k + 1)/((n + 1)² · (n + 2)).

Alternatively, use simulated quantiles; see the Excel file.
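The Beta moments above reproduce the quoted price; the numbers n = 100 and k = 99 are from the slide:

```python
from math import sqrt

# The second largest of n = 100 U(0,1) claims is the k = 99th order
# statistic, which is Beta(alpha = k, beta = n - k + 1) = Beta(99, 2).
n, k = 100, 99
alpha, beta = k, n - k + 1

mean = alpha / (alpha + beta)                            # k / (n + 1)
var = alpha * beta / ((alpha + beta) ** 2 * (alpha + beta + 1))

# payout is twice the claim, priced at expected value plus half a sd
price = 2 * (mean + 0.5 * sqrt(var))
assert abs(price - 1.9742) < 1e-4
```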
The CDF Technique
Let X be a continuous random variable with cumulative distribution function F_X(·) and density function f_X(·).

Suppose that Y = g(X) is a function of X where g(X) is differentiable and strictly increasing. Thus, its inverse g⁻¹(Y) uniquely exists. Then, we can apply the CDF technique. The CDF of Y can be derived using:

F_Y(y) = Pr(Y ≤ y) = Pr(g(X) ≤ y) = Pr(X ≤ g⁻¹(y)) = F_X(g⁻¹(y)),

and its density is given by:

f_Y(y) = ∂/∂y F_Y(y) = ∂/∂y F_X(g⁻¹(y)) = f_X(g⁻¹(y)) · ∂/∂y g⁻¹(y).
Let X be a continuous random variable with cumulative distribution function F_X(·) and density function f_X(·).

Suppose that Y = g(X) is a function of X where g(X) is differentiable and strictly decreasing. Thus, its inverse g⁻¹(Y) uniquely exists. Then, we can apply the CDF technique.

The CDF of Y can be derived using:

F_Y(y) = Pr(Y ≤ y) = Pr(g(X) ≤ y) = Pr(X ≥ g⁻¹(y)) = 1 − F_X(g⁻¹(y)),

and its density is given by:

f_Y(y) = ∂/∂y F_Y(y) = ∂/∂y [1 − F_X(g⁻¹(y))] = −f_X(g⁻¹(y)) · ∂/∂y g⁻¹(y).
Summarizing: if g(X) is a strictly monotonic function, then:

f_Y(y) = f_X(g⁻¹(y)) · |∂/∂y g⁻¹(y)|.

Common transformations that arise in applications:

1. Affine transformations: Y = m · X + b. Example (strictly increasing, i.e., m > 0):
g⁻¹(y) = (y − b)/m, F_Y(y) = F_X((y − b)/m),
∂/∂y g⁻¹(y) = 1/m, f_Y(y) = f_X((y − b)/m)/|m|.

2. Power transformations: Y = X^n, n > 0 and y > 0:
g⁻¹(y) = y^(1/n), F_Y(y) = F_X(y^(1/n)),
∂/∂y g⁻¹(y) = (1/n) · y^(1/n − 1), f_Y(y) = f_X(y^(1/n)) · (1/n) · y^(1/n − 1).
Recall: if g(X) is strictly monotonic, then:

f_Y(y) = f_X(g⁻¹(y)) · |∂/∂y g⁻¹(y)|.

Common transformations that arise in applications (cont.):

3. Exponential transformation: Y = e^(a·X), a > 0:
g⁻¹(y) = log(y)/a, F_Y(y) = F_X(log(y)/a),
∂/∂y g⁻¹(y) = 1/(a·y), f_Y(y) = f_X(log(y)/a)/(a·y).

4. Inverse transformation: Y = 1/X, x > 0 (strictly decreasing):
g⁻¹(y) = 1/y, F_Y(y) = 1 − F_X(1/y),
|∂/∂y g⁻¹(y)| = 1/y², f_Y(y) = f_X(1/y)/y².
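The CDF of the inverse transformation can be checked by simulation; a sketch assuming X ∼ Exponential(1) (so x > 0, and F_Y(y) = 1 − F_X(1/y) = e^(−1/y)):

```python
import random
import math

# For X ~ Exponential(1) and the strictly decreasing map Y = 1/X,
# F_Y(y) = 1 - F_X(1/y) = exp(-1/y); compare with the empirical CDF.
random.seed(3)
ys = [1 / random.expovariate(1.0) for _ in range(100_000)]

for y in (0.5, 1.0, 3.0):
    emp = sum(v <= y for v in ys) / len(ys)
    assert abs(emp - math.exp(-1 / y)) < 0.01
```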
Example: Affine transformations: Y = mX + b

Example: Let Y = −3·X + 4. Find F_Y(y) and f_Y(y) if X ∼ U(1, 9).

Solution: Apply the special case of the CDF technique?

Note m < 0, so g(X) = −3·X + 4 is strictly decreasing. We have g⁻¹(Y) = −(Y − 4)/3.

We know that:
f_X(x) = 1/8, if 1 ≤ x ≤ 9; 0, otherwise.
F_X(x) = 0, if x < 1; (x − 1)/8, if 1 ≤ x ≤ 9; 1, if x > 9.
Support of Y: g(1) = −3·1 + 4 = 1 and g(9) = −3·9 + 4 = −23.

Distribution function of Y (be careful with Pr(X ≤ g⁻¹(y)), because m < 0):
F_Y(y) = Pr(Y ≤ y) = Pr(−3·X + 4 ≤ y) = Pr(−3·X ≤ y − 4)
= Pr(X ≥ −(y − 4)/3)
= ∫ from (4−y)/3 to 9 of (1/8) dx
= (1/8)·(9 + (y − 4)/3)
= (1/24)·(23 + y), if −23 ≤ y ≤ 1,
and zero if y < −23 and one if y > 1.
Or, alternatively, use g⁻¹(y) = (y − 4)/(−3):
F_Y(y) = 1 − F_X(g⁻¹(y)) = 1 − F_X((y − 4)/(−3))
= 1 − ((y − 4)/(−3) − 1)/8 = 1 − (1 − y)/24
= (23 + y)/24, if −23 ≤ y ≤ 1,
and zero if y < −23 and one if y > 1. Thus:
F_Y(y) = (1/24)·(23 + y) = (1/24)·(y − (−23)), if −23 ≤ y ≤ 1;
f_Y(y) = 1/24, if −23 ≤ y ≤ 1,
and zero otherwise, i.e., Y ∼ U(−23, 1). Equivalently: |∂/∂y g⁻¹(y)| = 1/3, so
f_Y(y) = f_X(g⁻¹(y)) · |∂/∂y g⁻¹(y)| = (1/8)·(1/3) = 1/24, if −23 ≤ y ≤ 1.
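The closed form F_Y(y) = (23 + y)/24 is easy to verify by simulation (a sketch; the seed and sample size are arbitrary choices):

```python
import random

random.seed(0)

def F_Y(y):
    """CDF derived above: F_Y(y) = (23 + y)/24 on [-23, 1]."""
    if y < -23:
        return 0.0
    if y > 1:
        return 1.0
    return (23 + y) / 24

# Monte-Carlo check: empirical CDF of Y = -3*X + 4 for X ~ U(1, 9)
n = 100_000
ys = [-3 * random.uniform(1, 9) + 4 for _ in range(n)]
for y in (-20, -11, -5, 0):
    emp = sum(1 for v in ys if v <= y) / n
    assert abs(emp - F_Y(y)) < 0.01
```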
Example: Power transformations: Y = X^n, n > 0

Example: Let Z ∼ N(0, 1) with p.d.f.:
f_Z(z) = (1/√(2π))·e^(−z²/2), −∞ < z < ∞.

Question: Find f_Y(y), where Y = Z². Can you apply the CDF technique?

Solution: Apply the special case of the CDF technique? No: g(Z) = Z² is not a monotonic function (decreasing for z < 0, increasing for z > 0).

Solution: use the symmetry of the standard normal distribution!
We have:
F_Y(y) = Pr(Y ≤ y) = Pr(Z² ≤ y) = Pr(−√y ≤ Z ≤ √y) = F_Z(√y) − F_Z(−√y).

Using F_Y(y) = F_Z(√y) − F_Z(−√y), for y ≥ 0, we have:
f_Y(y) = f_Z(√y)·(1/2)·y^(−1/2) − f_Z(−√y)·(−1/2)·y^(−1/2)
= (1/2)·f_Z(√y)·y^(−1/2) + (1/2)·f_Z(−√y)·y^(−1/2)
*= f_Z(√y)·y^(−1/2)
= (1/√(2π))·y^(−1/2)·e^(−y/2), if y ≥ 0,
and zero otherwise. * Using symmetry, i.e., f_Z(−a) = f_Z(a).
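This is in fact the chi-squared density with one degree of freedom, so E[Y] = 1 and Var(Y) = 2. A small simulation check (seed and sample size are illustrative):

```python
import random, math

random.seed(1)

def f_Y(y):
    """Density derived above: y^(-1/2) * exp(-y/2) / sqrt(2*pi)."""
    return math.exp(-y / 2) / math.sqrt(2 * math.pi * y)

# For Z ~ N(0,1): E[Z^2] = 1 and Var(Z^2) = 2; check by simulation
n = 200_000
ys = [random.gauss(0, 1) ** 2 for _ in range(n)]
mean = sum(ys) / n
var = sum((v - mean) ** 2 for v in ys) / n
assert abs(mean - 1) < 0.05
assert abs(var - 2) < 0.15
```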
Example: Exponential transformation: Y = e^(aX), a > 0

Example: X ∼ N(µ, σ²) and Y = e^X; then (log-normal):
f_Y(y) = 1/(y·σ·√(2π)) · exp(−(1/2)·((log(y) − µ)/σ)²), if y > 0,
and zero otherwise.

Question: Derive this result.

Solution: Apply the special case of the CDF technique? Yes.

Support of Y: g(−∞) = exp(−∞) = 0, and g(∞) = exp(∞) = ∞.
We have X ∼ N(µ, σ²) and g(X) = exp(X), thus:
f_X(x) = 1/(σ·√(2π)) · exp(−(1/2)·((x − µ)/σ)²), if −∞ < x < ∞.

Now, using g⁻¹(y) = log(y):
F_Y(y) = Pr(Y ≤ y) = Pr(e^X ≤ y) = Pr(X ≤ log(y)) = F_X(log(y)),
if y > 0, and zero otherwise. Using ∂/∂y g⁻¹(y) = 1/y we have:
f_Y(y) = f_X(log(y)) · (1/y) = 1/(y·σ·√(2π)) · exp(−(1/2)·((log(y) − µ)/σ)²), if y > 0.
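A quick check against the known log-normal mean E[Y] = exp(µ + σ²/2) (the parameter values, seed and sample size below are illustrative choices):

```python
import random, math

random.seed(2)
mu, sigma = 0.5, 0.4   # illustrative parameters

def f_Y(y):
    """Log-normal density derived above."""
    return math.exp(-0.5 * ((math.log(y) - mu) / sigma) ** 2) / (y * sigma * math.sqrt(2 * math.pi))

# Simulate Y = exp(X) for X ~ N(mu, sigma^2) and compare with the known mean
n = 200_000
ys = [math.exp(random.gauss(mu, sigma)) for _ in range(n)]
mean = sum(ys) / n
assert abs(mean - math.exp(mu + sigma ** 2 / 2)) < 0.02
```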
The CDF Technique: Exercises
Exercises: Exponential transformation (general case)

Let X be a continuous random variable with probability density function f_X(x) and cumulative distribution function F_X(x).

Let Y = exp(a·X), a > 0.

Question: Find the probability density function and cumulative distribution function of Y.

Can we use the CDF technique?
Solution: Support of Y: g(−∞) = 0, g(∞) = ∞, i.e., y > 0.
F_Y(y) = Pr(Y ≤ y) = Pr(e^(a·X) ≤ y) = Pr(a·X ≤ log(y))
= Pr(X ≤ log(y)/a) = Pr(X ≤ log(y^(1/a))) = F_X(log(y^(1/a))).

So we have: F_Y(y) = 0, if y ≤ 0, and
F_Y(y) = F_X(log(y^(1/a))), if y > 0,
and thus: f_Y(y) = 0, if y ≤ 0, and
f_Y(y) = f_X(log(y^(1/a))) · 1/(a·y), if y > 0.
Exercises: Inverse transformation: Y = 1/X, x > 0

Example: inverse Gaussian (Wald) distribution.

Application (OPTIONAL): first passage time of a Brownian motion at a fixed level α.

Let X ∼ N(µ, σ²) be a normally distributed random variable with probability density function f_X(x) and cumulative distribution function F_X(x).

Let Y = 1/X.

Question: Find the distribution of Y.
Solution: Support of Y: as x → 0⁺, g(x) = 1/x → ∞; as x → ∞, g(x) = 1/x → 0, i.e., y > 0.

In general, we have:
F_Y(y) = Pr(Y ≤ y) = Pr(1/X ≤ y) = Pr(X ≥ 1/y)
= 1 − Pr(X < 1/y) = 1 − F_X(1/y), if y > 0,
and F_Y(y) = 0 if y ≤ 0. So we have: f_Y(y) = 0 if y ≤ 0 and
f_Y(y) = f_X(1/y) · 1/y², if y > 0.
Exercise

Let X be a random variable with p.d.f.:
f_X(x) = e^(−x)/(1 + e^(−x))², for −∞ < x < ∞.

Question: Find the probability density function of:
Y = e^(−X).

Use the special case of the CDF technique?
Solution: Yes, g(X) = e^(−X) is a strictly decreasing function.

Support of Y: g(−∞) = e^∞ = ∞, g(∞) = e^(−∞) = 0, i.e., 0 < y < ∞.

We have:
g⁻¹(y) = −log(y),
so that ∂/∂y g⁻¹(y) = −1/y.

We have:
f_Y(y) = f_X(g⁻¹(y)) · |∂/∂y g⁻¹(y)| = (y/(1 + y)²) · (1/y) = 1/(1 + y)², if 0 < y < ∞,
and zero otherwise.
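The result can be checked by inverse-CDF sampling of X, whose CDF is 1/(1 + e^(−x)) (a sketch; seed and sample size are illustrative):

```python
import random, math

random.seed(3)

# X has density e^{-x}/(1+e^{-x})^2, so its CDF is 1/(1+e^{-x});
# inverse-CDF sampling gives X = -log(1/U - 1) for U ~ U(0,1).
n = 100_000
xs = [-math.log(1 / random.random() - 1) for _ in range(n)]
ys = [math.exp(-x) for x in xs]

# Derived density f_Y(y) = 1/(1+y)^2 integrates to F_Y(y) = y/(1+y)
for y in (0.5, 1.0, 3.0):
    emp = sum(1 for v in ys if v <= y) / n
    assert abs(emp - y / (1 + y)) < 0.01
```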
The Jacobian transformation technique: Fundamentals
Fundamentals (bivariate case)

Consider two continuous random variables X1 and X2, and assume that they are mapped onto U1 and U2 by the transformation:
u1 = g1(x1, x2) and u2 = g2(x1, x2).

Suppose this transformation is one-to-one, so that we can invert it to get:
x1 = h1(u1, u2) and x2 = h2(u1, u2),
where
h(u1, u2) = g⁻¹(u1, u2).

Section 6.6 of W+ (7ed).

Note: this is the multivariate case of the CDF technique.
The Jacobian of this transformation is the determinant:
J(u1, u2) = det [ ∂h1(u1, u2)/∂u1   ∂h1(u1, u2)/∂u2 ; ∂h2(u1, u2)/∂u1   ∂h2(u1, u2)/∂u2 ]
= ∂h1(u1, u2)/∂u1 · ∂h2(u1, u2)/∂u2 − ∂h2(u1, u2)/∂u1 · ∂h1(u1, u2)/∂u2,
provided this is not zero.
Suppose the joint density of X1 and X2 is denoted by f_{X1,X2}(x1, x2).

Then, using the Jacobian transformation technique, the joint density of U1 and U2 is given by:
f_{U1,U2}(u1, u2) = f_{X1,X2}(h1(u1, u2), h2(u1, u2)) · |J(u1, u2)|,
i.e., the joint density of X1 and X2 evaluated at h1(u1, u2) and h2(u1, u2), multiplied by the absolute value of the Jacobian.

The above technique can easily be extended to n > 2 variables.
Jacobian transformation technique: procedure

Procedure to find the joint density of u1 = g1(x1, x2) and u2 = g2(x1, x2):

1. Find u1 = g1(x1, x2) and u2 = g2(x1, x2).
2. Determine h(u1, u2) = g⁻¹(u1, u2).
3. Find the absolute value of the Jacobian of the transformation.
4. Multiply it by the joint density of X1, X2 evaluated at h1(u1, u2), h2(u1, u2).

Note: if interested in the marginal density of U1, take the integral over all possible values of U2 (see last week).
Example Jacobian transformation technique

Let {X1, X2} be the uncertainty in the claim sizes of home insurance and unemployment insurance.

We have the joint p.d.f.:
f_{X1,X2}(x1, x2) = exp(−(x1 + x2)), if x1 ≥ 0 and x2 ≥ 0; 0, otherwise.

Question: Find the covariance between the aggregate claim size and the proportion due to home insurance.

Solution: Find the joint density of Y1 = X1 + X2 and Y2 = X1/(X1 + X2):

1. We have the transformations Y1 = X1 + X2 and Y2 = X1/(X1 + X2).
2. Thus: X1 = Y2 · (X1 + X2) = Y1 · Y2 and X2 = Y1 − X1 = Y1 · (1 − Y2).
3. J(y1, y2) = det [ y2   y1 ; 1 − y2   −y1 ] = −y2·y1 − y1·(1 − y2) = −y1, so |J(y1, y2)| = y1 (since y1 ≥ 0).

4. For y1 ≥ 0 and 0 ≤ y2 ≤ 1 we have:
f_{Y1,Y2}(y1, y2) = exp(−(y1·y2 + y1·(1 − y2))) · y1 = exp(−y1) · y1.

Hence:
f_{Y1,Y2}(y1, y2) = exp(−y1) · y1, for y1 ≥ 0 and 0 ≤ y2 ≤ 1; 0, otherwise.

The joint density factorizes into a function of y1 alone times a constant in y2, so Y1 and Y2 are independent. Hence the covariance equals zero.
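The independence conclusion can be checked by simulation: Y1 ∼ Gamma(2, 1) (mean 2), Y2 ∼ U(0, 1) (mean 1/2), and their sample covariance should be near zero (seed and sample size are illustrative):

```python
import random

random.seed(4)

# Simulate the example: X1, X2 i.i.d. Exp(1); Y1 = X1 + X2, Y2 = X1/(X1 + X2)
n = 200_000
y1s, y2s = [], []
for _ in range(n):
    x1, x2 = random.expovariate(1.0), random.expovariate(1.0)
    y1s.append(x1 + x2)
    y2s.append(x1 / (x1 + x2))

m1 = sum(y1s) / n
m2 = sum(y2s) / n
cov = sum((a - m1) * (b - m2) for a, b in zip(y1s, y2s)) / n

assert abs(m1 - 2) < 0.03     # E[Y1] = 2
assert abs(m2 - 0.5) < 0.01   # E[Y2] = 1/2
assert abs(cov) < 0.01        # independence => zero covariance
```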
The MGF technique: Motivation
Moment generating function

Recall from week 1:

The moment generating function is defined as:
M_X(t) = E[e^(t·X)] = ∫ from −∞ to ∞ of f_X(x) · exp(t·x) dx.

Important properties (m, b ∈ R and X, Y independent):
M_{m·X+b}(t) = E[e^(t·(m·X+b))] = E[exp(t·m·X) · exp(t·b)]
= E[exp(t·m·X)] · exp(t·b) = M_X(m·t) · exp(t·b);
M_{X+Y}(t) = E[e^(t·(X+Y))] = E[exp(t·X) · exp(t·Y)]
= E[exp(t·X)] · E[exp(t·Y)] = M_X(t) · M_Y(t).
The MGF technique

This method can be effective when we recognize the m.g.f., because, when it exists, it is unique and it uniquely determines the distribution.

Suppose we are interested in the distribution of:
U = g(X1, ..., Xn),
where X1, ..., Xn have a joint density f_{X1,...,Xn}(x1, ..., xn). The MGF technique determines the distribution of U by finding the m.g.f. of U:
M_U(t) = E[e^(U·t)] = ∫ from −∞ to ∞ ... ∫ from −∞ to ∞ of e^(g(x1,...,xn)·t) · f_{X1,...,Xn}(x1, ..., xn) dx1 ... dxn,
and then determining the distribution by comparing with known m.g.f.'s.
In the special case where U is a weighted sum of the random variables:
U = b1·X1 + ... + bn·Xn,
and X1, ..., Xn are independent, we have:
M_U(t) = E[e^((b1·X1+...+bn·Xn)·t)] = E[e^(X1·b1·t)] · ... · E[e^(Xn·bn·t)] = M_{X1}(b1·t) · ... · M_{Xn}(bn·t).

The m.g.f. of U is the product of the m.g.f.'s of X1, ..., Xn.

We have also seen this in the week 1 lecture, i.e., recall:
M_{a·X+b}(t) = M_X(a·t) · e^(b·t);
M_{X+Y}(t) = M_X(t) · M_Y(t), if X and Y are independent.
The MGF technique: Applications & exercises
Application: Summing Poisson processes

Let X_i be the i.i.d. numbers of claims arriving from male motor vehicle insureds, with rate λ1, and let Y_i be the i.i.d. numbers of claims arriving from female motor vehicle insureds, with rate λ2. There are n males and m females insured.

Question: Find the distribution of the total number of claims.

Solution: Let X_i ∼ Poisson(λ1) and Y_i ∼ Poisson(λ2), where all X_i, Y_i are independent. The m.g.f. of U = Σ_{i=1}^n X_i + Σ_{i=1}^m Y_i is given by:
M_U(t) = Π_{i=1}^n M_{X_i}(t) · Π_{i=1}^m M_{Y_i}(t)
= (e^(λ1·(e^t − 1)))^n · (e^(λ2·(e^t − 1)))^m = e^((n·λ1 + m·λ2)·(e^t − 1)),
which is the m.g.f. of a Poisson with parameter n·λ1 + m·λ2, hence U ∼ Poisson(n·λ1 + m·λ2).
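A simulation check of this result (a sketch; the portfolio sizes, rates, seed and trial count are illustrative, and the hand-rolled Poisson sampler uses Knuth's standard multiplication method, which is not part of the slides):

```python
import random, math

random.seed(5)

def poisson(lam):
    """Knuth's Poisson sampler (fine for small lam)."""
    L = math.exp(-lam)
    k, p = 0, 1.0
    while p > L:
        k += 1
        p *= random.random()
    return k - 1

n_m, n_f, lam1, lam2 = 3, 4, 1.2, 0.8   # illustrative values
trials = 100_000
totals = [sum(poisson(lam1) for _ in range(n_m)) + sum(poisson(lam2) for _ in range(n_f))
          for _ in range(trials)]

mean = sum(totals) / trials
var = sum((t - mean) ** 2 for t in totals) / trials
# Poisson(n*lam1 + m*lam2) = Poisson(6.8) has mean = variance = 6.8
assert abs(mean - 6.8) < 0.1
assert abs(var - 6.8) < 0.2
```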
Application: summing independent Normals

Let X1 be the yearly return on Australian government bonds and X2 the yearly return on American government bonds.

Assume that the returns on government bonds are normally distributed and that the yearly returns on Australian and American government bonds are independent.

Question: Find the distribution of the asset return when an insurer invests half its wealth in Australian and half in American government bonds.
Solution: Let X1 ∼ N(µ1, σ1²) and X2 ∼ N(µ2, σ2²), where X1, X2 again are independent. The m.g.f. of U = (X1 + X2)/2 is given by:
M_U(t) = M_{X1}(t/2) · M_{X2}(t/2)
= exp(µ1/2 · t + (1/2)·σ1²·(t/2)²) · exp(µ2/2 · t + (1/2)·σ2²·(t/2)²)
= exp(((µ1 + µ2)/2)·t + (1/2)·((σ1² + σ2²)/2²)·t²),
which is the m.g.f. of another Normal with mean (µ1 + µ2)/2 and variance (σ1/2)² + (σ2/2)².

U ∼ N((µ1 + µ2)/2, (σ1/2)² + (σ2/2)²).
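A simulation check of the half-and-half portfolio (the return parameters, seed and sample size are illustrative assumptions, not from the slides):

```python
import random

random.seed(6)
mu1, s1, mu2, s2 = 0.05, 0.10, 0.03, 0.08   # illustrative return parameters

n = 200_000
us = [(random.gauss(mu1, s1) + random.gauss(mu2, s2)) / 2 for _ in range(n)]
mean = sum(us) / n
var = sum((u - mean) ** 2 for u in us) / n

# U ~ N((mu1+mu2)/2, (s1^2 + s2^2)/4) by the MGF argument
assert abs(mean - (mu1 + mu2) / 2) < 0.001
assert abs(var - (s1 ** 2 + s2 ** 2) / 4) < 0.0005
```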
Exercise: dependent Normals

Often independence is not a good assumption.

In week 9 we will see a hypothesis test for independence.

Now consider the asset value when X is bivariate normally distributed, i.e.:
X = (X1, X2)ᵀ ∼ N( (µ1, µ2)ᵀ, [ σ1²  ρσ1σ2 ; ρσ1σ2  σ2² ] ).

Question: What would be a logical value for ρ?
Question: What is the distribution of U = (X1 + X2)/2?

Can we use the MGF technique?
Solution: No, not directly (due to the dependence).

However, recall from last week:
X1 = µ1 + σ1·Z1, X2 = µ2 + ρ·σ2·Z1 + √(1 − ρ²)·σ2·Z2,
where Z1 and Z2 are independent. Then we have:
M_U(t) = M_{(X1+X2)/2}(t) = M_{µ1+σ1·Z1+µ2+ρ·σ2·Z1+√(1−ρ²)·σ2·Z2}(t/2)
= M_{µ1+µ2+(σ1+ρσ2)·Z1+√(1−ρ²)·σ2·Z2}(t/2)
= exp((µ1 + µ2)·t/2) · exp((1/2)·(σ1 + ρσ2)²·(t/2)²) · exp((1/2)·(1 − ρ²)·σ2²·(t/2)²)
= exp(((µ1 + µ2)/2)·t + (1/2)·((σ1² + σ2² + 2ρσ1σ2)/2²)·t²),
which is the m.g.f. of another Normal with mean (µ1 + µ2)/2 and variance (σ1² + σ2² + 2ρσ1σ2)/4. (Important result!)
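The variance formula can be checked by simulating the bivariate normal via the same Z1, Z2 construction (parameter values, seed and sample size are illustrative):

```python
import random, math

random.seed(7)
mu1, s1, mu2, s2, rho = 0.05, 0.10, 0.03, 0.08, 0.5   # illustrative

n = 200_000
us = []
for _ in range(n):
    z1, z2 = random.gauss(0, 1), random.gauss(0, 1)
    x1 = mu1 + s1 * z1
    x2 = mu2 + rho * s2 * z1 + math.sqrt(1 - rho ** 2) * s2 * z2
    us.append((x1 + x2) / 2)

mean = sum(us) / n
var = sum((u - mean) ** 2 for u in us) / n
target_var = (s1 ** 2 + s2 ** 2 + 2 * rho * s1 * s2) / 4
assert abs(mean - (mu1 + mu2) / 2) < 0.001
assert abs(var - target_var) < 0.0005
```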
Sums (Convolutions): the discrete case
Discrete case: Let Z = X + Y, i.e., Y = Z − X:
p_Z(z) = Σ over all x of p_{X,Y}(x, z − x).

If X and Y are independent, then:
p_{X,Y}(x, y) = p_X(x) · p_Y(y),
and
p_Z(z) = Σ over all x of p_X(x) · p_Y(z − x).

This is called the convolution of p_X and p_Y.
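The discrete convolution formula in code, for the classical sum of two fair dice (an illustrative choice; everything here is deterministic):

```python
def convolve(p, q):
    """Convolution of two pmfs given as lists indexed from 0."""
    r = [0.0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            r[i + j] += a * b   # p_Z(i+j) += p_X(i) * p_Y(j)
    return r

# Sum of two fair dice; each die is shifted by 1 so its support is 0..5
die = [1 / 6] * 6
z = convolve(die, die)          # pmf of (X-1) + (Y-1), support 0..10

assert abs(z[5] - 6 / 36) < 1e-12   # P(X + Y = 7) = 6/36
assert abs(sum(z) - 1) < 1e-12      # a pmf sums to one
```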
The continuous case
F_Z(z) = ∫ from −∞ to ∞ ∫ from −∞ to z−x of f_{X,Y}(x, y) dy dx
(change of variables: y = v − x)
= ∫ from −∞ to ∞ ∫ from −∞ to z of f_{X,Y}(x, v − x) dv dx
= ∫ from −∞ to z ∫ from −∞ to ∞ of f_{X,Y}(x, v − x) dx dv.

Differentiate (under the integral) to get:
f_Z(z) = ∫ from −∞ to ∞ of f_{X,Y}(x, z − x) dx.

If X and Y are independent:
f_Z(z) = ∫ from −∞ to ∞ of f_X(x) · f_Y(z − x) dx.

See W+ (7ed), Section 6.3.
Sums (Convolutions): Exercise
Let X_i ∼ EXP(λ) be the size of the semiannual expected discounted value of newly issued long-term disability insurance claims:
f_{X_i}(x_i) = λ · exp(−λ · x_i), if x_i ≥ 0; 0, otherwise.

Question: Find the distribution of the annual claim size.

Solution: Let Z = X1 + X2. If z ≥ 0 we have:
f_Z(z) = ∫ from −∞ to ∞ of f_{X1}(z − x2) · f_{X2}(x2) dx2
= ∫ from 0 to z of λ · exp(−λ·(z − x2)) · λ · exp(−λ·x2) dx2
= ∫ from 0 to z of λ² · exp(−λ·z) dx2 = λ² · z · exp(−λ·z).
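The convolution integral can also be evaluated numerically and compared with the closed form above (midpoint rule; the rate λ = 2 is an illustrative choice):

```python
import math

lam = 2.0   # illustrative rate

def f_exp(x):
    """Exponential density with rate lam."""
    return lam * math.exp(-lam * x) if x >= 0 else 0.0

def f_Z_numeric(z, steps=10_000):
    """Convolution integral f_Z(z) = ∫_0^z f(z-x) f(x) dx via the midpoint rule."""
    h = z / steps
    return sum(f_exp(z - (i + 0.5) * h) * f_exp((i + 0.5) * h) for i in range(steps)) * h

for z in (0.5, 1.0, 2.0):
    closed_form = lam ** 2 * z * math.exp(-lam * z)   # lambda^2 * z * exp(-lambda*z)
    assert abs(f_Z_numeric(z) - closed_form) < 1e-6
```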
[Figure: left panel shows the exponential density f_{X_i}(x_i) for λ = 1, ..., 6; right panel shows the convolution density f_Z(z) = λ²·z·e^(−λ·z) for λ = 1, ..., 6.]
Approximate methods: Delta method
Say you know E[X] = µ_X and Var(X) = σ_X², and are interested in the mean and variance of Y = g(X), where g is non-linear.

Using a first-order Taylor series:
Y = g(X) ≈ g(µ_X) + (X − µ_X) · g′(µ_X),
which implies:
E[Y] ≈ g(µ_X);
Var(Y) *≈ (g′(µ_X))² · σ_X².

* Using E[Y²] ≈ E[g(µ_X)² + (X − µ_X)²·g′(µ_X)² + 2·g(µ_X)·(X − µ_X)·g′(µ_X)].

Very useful when you do not know the exact distribution of Y = g(X)!
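A small worked check of the delta method for g(x) = eˣ with X ∼ N(µ, σ²) (an illustrative choice: the exact log-normal moments are known, so the approximation error is visible; parameter values are assumptions):

```python
import math

mu, sigma = 1.0, 0.05   # illustrative; delta method works best for small sigma

# Delta-method approximations for Y = exp(X): g(mu) and g'(mu)^2 * sigma^2
approx_mean = math.exp(mu)                      # g(mu), with g(x) = exp(x)
approx_var = math.exp(mu) ** 2 * sigma ** 2     # g'(mu)^2 * sigma^2, g'(x) = exp(x)

# Exact log-normal moments for comparison
exact_mean = math.exp(mu + sigma ** 2 / 2)
exact_var = (math.exp(sigma ** 2) - 1) * math.exp(2 * mu + sigma ** 2)

assert abs(approx_mean - exact_mean) / exact_mean < 0.01
assert abs(approx_var - exact_var) / exact_var < 0.01
```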
Distribution characteristics in samples: Exercise
Consider an insurer offering flood insurance.

Some years there are no floods (with probability q = 0.7) and some years there are floods.

The claim size when there are floods is LogNormally distributed with mean E[X] = 150 and variance Var(X) = 700.

The realizations of the claim sizes since 1950 are given in the Excel file (Σ_{i=1}^{60} x_i = 2,700 and Σ_{i=1}^{60} x_i² = 400,000).

Questions:
a. What is the variance of the claim size since 1950?
b. What is the variance of flood insurance claims when the insurer is representative?

Solutions:
a. Population: σ² = Σ_{i=1}^{N} x_i²/N − (Σ_{i=1}^{N} x_i/N)² = 4,641.7.
b. Sample: s² = 1/(n − 1) · (Σ_{i=1}^{n} x_i² − n·(Σ_{i=1}^{n} x_i/n)²) = 4,720.3.
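The two variances can be reproduced directly from the given sums (deterministic arithmetic):

```python
# Population vs sample variance from the given sums (N = n = 60 claims)
N = 60
sum_x = 2_700
sum_x2 = 400_000

pop_var = sum_x2 / N - (sum_x / N) ** 2                  # treats the data as the population
sample_var = (sum_x2 - N * (sum_x / N) ** 2) / (N - 1)   # unbiased sample variance

assert round(pop_var, 1) == 4641.7
assert round(sample_var, 1) == 4720.3
```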
The insurer wants to simulate the 5-year aggregate claim size using only the latest 30 observations, without assumptions on the distribution (Σ_{i=31}^{60} x_i = 1,480 and Σ_{i=31}^{60} x_i² = 225,000).

c. What would be the variance of the aggregate claim size if he simulates with replacement?
d. Same as c., but now without replacement.
e. Which one (with/without replacement) would you use to simulate?

Solutions: Note that:
Var(X̄) = Var(Σ_{i=1}^{n} X_i/n) = (1/n²) · Var(Σ_{i=1}^{n} X_i) ⇒ n² · Var(X̄) = Var(Σ_{i=1}^{n} X_i).

c. σ² = 225,000/30 − (1,480/30)² = 5,066; n² · (σ²/n) = 25,331.
d. n² · (σ²/n) · (1 − (n − 1)/(N − 1)) = 25,331 · 25/29 = 21,837.

Using n = 5, N = 30.
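Parts c. and d. in code, reproducing the numbers above (deterministic arithmetic from the given sums):

```python
# Variance of the simulated 5-year aggregate from the last N = 30 observations
N, n = 30, 5
sum_x = 1_480
sum_x2 = 225_000

sigma2 = sum_x2 / N - (sum_x / N) ** 2             # population variance of the 30 values

var_with = n ** 2 * (sigma2 / n)                   # sampling with replacement
var_without = var_with * (1 - (n - 1) / (N - 1))   # finite population correction

assert round(var_with) == 25331
assert round(var_without) == 21837
```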