Module 1: Statistical Issues in Microsimulation

Paul Sousa

Overview

Numerical solution

Simulation

Random number generation

Transformation

Techniques: Gibbs sampling, Metropolis-Hastings algorithm

Variance reduction techniques

Conclusion

Diagram: Numerical Solution branches into the Monte Carlo Technique and Simulation; Simulation branches into Deterministic Simulation and Stochastic Simulation (Monte Carlo Simulation).

Introduction

Model solution: analytical vs. numerical.

Numerical solution: substitutes numbers for the independent variables and parameters; this requires an iterative technique.

Numerical techniques: the Monte Carlo method and simulation.

Simulation: deterministic simulation and stochastic simulation.

Deterministic simulation: does not necessarily involve the use of random numbers.

Stochastic simulation: uses random numbers; also referred to as Monte Carlo simulation.

Linear Congruential Generators

A sequence of integers $I_1, I_2, \ldots$, each between 0 and $m-1$ (where $m$ is a large number), is generated by the recurrence relation

$I_{j+1} = \mathrm{mod}(a I_j + c,\; m)$

where a and c are positive integers known as the multiplier and the increment, and m is the modulus.

mod(X, m) is the remainder after dividing X by m. To calculate it, divide X by m, take the fractional part of the quotient, and multiply it by m.

e.g. mod(12, 7) = 5: 12/7 = 1.7143, and 0.7143 × 7 = 5.

Finally, dividing $I_j$ by m gives a uniform variable between 0 and 1.

Linear congruential methods are very fast, but they are not completely free of sequential correlation on successive calls.
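The recurrence above can be sketched in a few lines of Python. This is only an illustration: the slides do not give values for a, c, and m, so the constants below (a commonly cited 32-bit parameter set) and the function name `lcg` are assumptions.

```python
def lcg(seed, a=1664525, c=1013904223, m=2**32):
    """Yield uniform draws in [0, 1) from a linear congruential generator.

    The default a, c, m are illustrative assumptions, not values from the slides.
    """
    state = seed
    while True:
        state = (a * state + c) % m  # I_{j+1} = mod(a * I_j + c, m)
        yield state / m              # dividing by m scales the integer to [0, 1)


gen = lcg(seed=12345)
draws = [next(gen) for _ in range(5)]
```

The same seed always reproduces the same sequence, which is convenient for debugging a simulation but also shows that the draws are only pseudo-random.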

Transformation to Other Distributions

Consider a random variable X with density function f(x) and corresponding cumulative distribution function F(x). If the inverse of the cumulative distribution function can be calculated, then draws of X can be obtained from draws of a standard uniform variable U.

By definition, F(x) = k means that the probability of obtaining a draw equal to or below x is k, where k is between 0 and 1. A draw u from the standard uniform provides a number between 0 and 1. We can set F(x) = u and solve for x,

thus $x = F^{-1}(u)$.

This procedure works only for univariate distributions.

Univariate Density Example

Example: extreme value distribution

Density function: $f(x) = \exp(-x)\,\exp(-\exp(-x))$

CDF: $F(x) = \exp(-\exp(-x))$

A draw from this density is obtained as $x = -\ln(-\ln u)$.
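A minimal sketch of the inverse-CDF draw for the extreme value example, using Python's standard `random` module as the uniform generator. The function names and the check against F(0) = exp(-1) are illustrative additions, not part of the slides.

```python
import math
import random


def draw_extreme_value(rng):
    """Draw x = -ln(-ln u) with u standard uniform (inverse-CDF method)."""
    u = rng.random()
    while u == 0.0:  # guard against log(0); random() returns values in [0, 1)
        u = rng.random()
    return -math.log(-math.log(u))


def extreme_value_cdf(x):
    """F(x) = exp(-exp(-x))."""
    return math.exp(-math.exp(-x))


rng = random.Random(0)
draws = [draw_extreme_value(rng) for _ in range(20000)]

# The share of draws at or below 0 should be close to F(0) = exp(-1).
share_below_zero = sum(x <= 0.0 for x in draws) / len(draws)
```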

Draws from more complicated densities:

Accept-Reject Method

Importance Sampling

Gibbs Sampling

Metropolis-Hastings Algorithm

Accept-Reject Method

A more general way of drawing from multivariate distributions.

Suppose we want to draw from a multivariate density g(x) within the range a ≤ x ≤ b, i.e. to draw from

$f(x) = \begin{cases} \frac{1}{k}\, g(x) & a \le x \le b \\ 0 & \text{otherwise} \end{cases}$

where k is a normalizing constant.

We can obtain draws from f by simply drawing from g and retaining (“accepting”) the draws that are within the relevant range and discarding (“rejecting”) the draws that are outside the range.

Accept-Reject Method

Advantage: it can be applied whenever it is possible to draw from the untruncated density.

Disadvantage: it is a crude method. The number of accepted draws is not known in advance, and most draws are wasted when the range a ≤ x ≤ b has low probability under g.

However, it is a useful “last option”.
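The accept-reject idea can be sketched as follows. For brevity the example is univariate and uses a standard normal as the untruncated density g with the range [0.5, 2.0]; these choices, and the helper name `draw_truncated`, are assumptions made for illustration only.

```python
import random


def draw_truncated(draw_g, a, b, n_draws, rng):
    """Draw from g truncated to [a, b] by discarding draws outside the range.

    Returns the accepted draws and the total number of draws from g that
    were needed, which is random and not known in advance.
    """
    accepted = []
    n_tried = 0
    while len(accepted) < n_draws:
        x = draw_g(rng)
        n_tried += 1
        if a <= x <= b:  # "accept" draws inside the range, "reject" the rest
            accepted.append(x)
    return accepted, n_tried


rng = random.Random(1)
sample, n_tried = draw_truncated(
    lambda r: r.gauss(0.0, 1.0), a=0.5, b=2.0, n_draws=1000, rng=rng
)
acceptance_rate = len(sample) / n_tried
```

The acceptance rate makes the disadvantage concrete: the narrower or less probable the range is under g, the more draws are thrown away for each draw kept.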

Importance Sampling

Suppose x has a density f(x) that cannot be easily drawn from by other procedures. Suppose further that there is another density g(x) that can be easily drawn from.

Draws from f(x) can be obtained as follows:

1. Take a draw from g(x) and label it $x_1$.

2. Weight the draw by $f(x_1)/g(x_1)$.

3. Repeat this process many times.

The set of weighted draws is equivalent to a set of draws from f.
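The three steps above can be sketched as below. To make the result checkable, the target f is taken to be a standard normal (for which the second moment is known to be 1) and g a normal with standard deviation 2; in practice f would be a density that is hard to draw from directly. Both densities and the helper `normal_pdf` are assumptions for this example.

```python
import math
import random


def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and standard deviation."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


rng = random.Random(2)
R = 50000

# Step 1: take draws from g (here N(0, 2^2)).
draws = [rng.gauss(0.0, 2.0) for _ in range(R)]

# Step 2: weight each draw by f(x) / g(x), with f the N(0, 1) density.
weights = [normal_pdf(x, 0.0, 1.0) / normal_pdf(x, 0.0, 2.0) for x in draws]

# Step 3: with many repetitions, weighted averages approximate expectations under f.
mean_weight = sum(weights) / R  # close to 1 when f and g are proper densities
second_moment = sum(w * x * x for w, x in zip(weights, draws)) / R  # estimates E_f[x^2]
```

Choosing g with heavier tails than f, as here, keeps the weights bounded; a g with thinner tails than f can produce a few very large weights and an unstable estimate.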

Gibbs Sampling

For multivariate distributions, it is sometimes difficult to draw directly from the joint density and yet easy to draw from the conditional density of each element given the values of the other elements. Gibbs sampling can be used in these situations.

Consider two random variables $x_1$ and $x_2$.

The joint density is $f(x_1, x_2)$, and the conditional densities are $f(x_1 \mid x_2)$ and $f(x_2 \mid x_1)$.

Gibbs sampling proceeds by drawing iteratively from the conditional densities: drawing $x_1$ conditional on a value of $x_2$, drawing $x_2$ conditional on this draw of $x_1$, drawing a new $x_1$ conditional on the new value of $x_2$, and so on.

This process converges to draws from the joint density.
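A small sketch of the iteration, assuming a bivariate standard normal with correlation rho as the joint density. That choice is made only because its conditionals are known in closed form: each element given the other is normal with mean rho times the other element and variance 1 - rho^2. The function name, the burn-in length, and rho = 0.6 are illustrative assumptions.

```python
import math
import random


def gibbs_bivariate_normal(rho, n_draws, burn_in, rng):
    """Gibbs sampler for a bivariate standard normal with correlation rho."""
    cond_sd = math.sqrt(1.0 - rho * rho)
    x1, x2 = 0.0, 0.0  # arbitrary starting values
    draws = []
    for t in range(burn_in + n_draws):
        x1 = rng.gauss(rho * x2, cond_sd)  # draw x1 conditional on current x2
        x2 = rng.gauss(rho * x1, cond_sd)  # draw x2 conditional on the new x1
        if t >= burn_in:                   # discard early draws taken before convergence
            draws.append((x1, x2))
    return draws


rng = random.Random(3)
draws = gibbs_bivariate_normal(rho=0.6, n_draws=20000, burn_in=500, rng=rng)

# With zero means and unit variances, the mean of x1 * x2 estimates rho.
mean_product = sum(a * b for a, b in draws) / len(draws)
```

Successive Gibbs draws are correlated with each other, so more draws are typically needed than with independent sampling to reach the same accuracy.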

Metropolis-Hastings Algorithm

1. Start with a value of the vector x, labeled $x_0$.

2. Choose a trial value of $x_1$ as $x_{1t} = x_0 + \eta$, where $\eta$ is drawn from a distribution $g(\eta)$ that has zero mean. Usually a normal distribution is specified for $g(\eta)$.

3. Calculate the density at the trial value $x_{1t}$, and compare it with the density at the original value $x_0$, i.e. compare $f(x_{1t})$ with $f(x_0)$. If $f(x_{1t}) > f(x_0)$, then accept $x_{1t}$, label it $x_1$, and move to step 4. If $f(x_{1t}) \le f(x_0)$, then accept $x_{1t}$ with probability $f(x_{1t})/f(x_0)$, and reject it with probability $1 - f(x_{1t})/f(x_0)$. To determine whether to accept or reject $x_{1t}$ in this case, draw a standard uniform $\mu$. If $\mu \le f(x_{1t})/f(x_0)$, then keep $x_{1t}$. Otherwise, reject $x_{1t}$. If $x_{1t}$ is accepted, then label it $x_1$. If $x_{1t}$ is rejected, then use $x_0$ as $x_1$.

Metropolis-Hastings Algorithm

4. Choose a trial value of $x_2$ as $x_{2t} = x_1 + \eta$, where $\eta$ is a new draw from $g(\eta)$.

5. Apply the rule in step 3 to either accept $x_{2t}$ as $x_2$ or reject $x_{2t}$ and use $x_1$ as $x_2$.

6. Continue this process for many iterations. The sequence $x_t$ becomes equivalent to draws from f(x) for sufficiently large t.

A general but computationally intensive algorithm.
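Steps 1 to 6 can be sketched for a scalar x as follows. The target density used here (an unnormalized standard normal), the step size, and the burn-in length are assumptions chosen so the output can be checked; the acceptance rule is the one described in step 3, which relies on the increment distribution being symmetric around zero.

```python
import math
import random


def metropolis_hastings(f, x0, step_sd, n_iter, rng):
    """Random-walk sampler with normal increments and the step-3 acceptance rule.

    f may be unnormalized, since only the ratio f(trial) / f(current) is used.
    """
    x = x0
    chain = []
    for _ in range(n_iter):
        trial = x + rng.gauss(0.0, step_sd)  # trial value = current value + eta
        ratio = f(trial) / f(x)
        # Accept outright if the density is higher at the trial value,
        # otherwise accept with probability equal to the density ratio.
        if ratio > 1.0 or rng.random() <= ratio:
            x = trial
        chain.append(x)  # a rejected trial repeats the current value
    return chain


def target(x):
    """Unnormalized standard normal density (illustrative target)."""
    return math.exp(-0.5 * x * x)


rng = random.Random(4)
chain = metropolis_hastings(target, x0=3.0, step_sd=1.0, n_iter=50000, rng=rng)
kept = chain[1000:]  # drop early iterations taken before convergence

chain_mean = sum(kept) / len(kept)
chain_var = sum((x - chain_mean) ** 2 for x in kept) / len(kept)
```

The step size is a tuning choice: very small steps are accepted often but explore slowly, while very large steps are rejected often, and both increase the number of iterations needed.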

Variance Reduction

The use of independent random draws in simulation is appealing because it is conceptually straightforward and the statistical properties of the resulting simulator are easy to derive.

However, there are other ways to take draws that can provide greater accuracy for a given number of draws.

In taking a sequence of draws from the density f( ), two issues are at stake: coverage and covariance.

Coverage: if our objective is to approximate an integral over the entire domain, $F(x) = \int f(x)\,dx$, a more accurate approximation would be obtained by evaluating f(x) throughout the entire domain of f, i.e. with better coverage.

Variance Reduction: Covariance

With independent draws, the covariance over draws is zero. The variance of a simulator based on R independent draws is therefore the variance based on one draw divided by R.

If the draws are negatively correlated instead of independent, then the variance of the simulator is lower.

The issue of covariance is related to coverage. By inducing a negative correlation between draws, better coverage is usually assured.

E.g. with R = 2, if the two draws are taken independently, then both could end up being at the low side of the distribution. If negative correlation is induced, then the second draw will tend to be high if the first draw is low, which provides better coverage.
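The R = 2 case can be written out as a short derivation. Here $\sigma^2$ (the common variance of each draw) and $\rho$ (the correlation between the two draws) are symbols introduced for illustration; they do not appear in the slides.

```latex
\[
\operatorname{Var}\!\left(\frac{x_1 + x_2}{2}\right)
  = \frac{\operatorname{Var}(x_1) + \operatorname{Var}(x_2)
          + 2\operatorname{Cov}(x_1, x_2)}{4}
  = \frac{\sigma^2}{2}\,(1 + \rho)
\]
```

With independent draws, $\rho = 0$ and the variance is $\sigma^2/2$, i.e. the one-draw variance divided by R = 2. Any $\rho < 0$ gives a smaller value.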

Variance Reduction Techniques

Antithetics

Antithetic draws are obtained by creating various types of mirror images of a random draw.

For a symmetric density that is centered on zero, the simplest antithetic variate is created by reversing the sign of all elements of a draw. E.g. $x_{2k} = -x_{2k-1}$, $k = 1, \ldots, n/2$.
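A sketch of sign-reversed antithetic draws, compared with independent draws. The setup is an assumption made for illustration: the draws are standard normal and the quantity simulated is the mean of exp(x), repeated many times so that the variance of the two simulators can be compared.

```python
import math
import random
import statistics


def antithetic_normal_draws(n, rng):
    """Return n standard normal draws in pairs (x, -x); n is assumed even."""
    draws = []
    for _ in range(n // 2):
        x = rng.gauss(0.0, 1.0)
        draws.extend([x, -x])  # x_{2k-1} and its mirror image x_{2k} = -x_{2k-1}
    return draws


def simulate_mean(func, draws):
    """Average of func over the draws."""
    return sum(func(x) for x in draws) / len(draws)


rng = random.Random(5)
n, reps = 100, 2000

# Repeat each simulator many times to estimate its variance.
independent = [
    simulate_mean(math.exp, [rng.gauss(0.0, 1.0) for _ in range(n)])
    for _ in range(reps)
]
antithetic = [
    simulate_mean(math.exp, antithetic_normal_draws(n, rng)) for _ in range(reps)
]

var_independent = statistics.variance(independent)
var_antithetic = statistics.variance(antithetic)
```

The gain depends on the function being averaged: for a function that is symmetric around zero, x and -x give identical values and the mirror draw adds no new information.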

Variance Reduction Techniques

Systematic sampling

Systematic sampling creates a grid of points over the support of the density and randomly shifts the entire grid.

Consider draws from a uniform distribution between 0 and 1. The unit interval is divided into four segments and draws are taken in a way that assures one draw in each segment with equal distance between the draws. Take a draw from a uniform between 0 and 0.25 as $x_1$; then $x_2 = 0.25 + x_1$; $x_3 = 0.5 + x_1$; $x_4 = 0.75 + x_1$.

This implies a tradeoff between randomness and coverage: all four draws are determined by a single random variable.
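The four-segment example can be sketched as below; the function name and the generalization to an arbitrary number of segments are illustrative assumptions.

```python
import random


def systematic_uniform_draws(n_segments, rng):
    """One random shift in the first segment, replicated across all segments."""
    width = 1.0 / n_segments
    x1 = rng.uniform(0.0, width)  # the only random draw
    return [x1 + k * width for k in range(n_segments)]


rng = random.Random(6)
draws = systematic_uniform_draws(4, rng)  # x1, 0.25 + x1, 0.5 + x1, 0.75 + x1
```

Each call gives exactly one point per segment, so coverage is guaranteed, but the four points carry only as much randomness as the single shift x1.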
