Simulation - Generating Continuous Random Variables

39
Chair of Information Systems IV (ERIS) Institute for Enterprise Systems (InES) 16 April 2013, 10.15 am – 11.15 am Martin Kretzer Phone: +49 621 181 3276 E-Mail: [email protected] Generating Continuous Random Variables (IS 802 “Simulation”, Section 3)

description

 

Transcript of Simulation - Generating Continuous Random Variables

Page 1: Simulation - Generating Continuous Random Variables

Chair of Information Systems IV (ERIS) Institute for Enterprise Systems (InES)

16 April 2013, 10.15 am – 11.15 am

Martin Kretzer

Phone: +49 621 181 3276E-Mail: [email protected]

Generating Continuous Random Variables(IS 802 “Simulation”, Section 3)

Page 2: Simulation - Generating Continuous Random Variables

Agenda 2

Agenda

1 Introduction

2 Selected Methods

3 Generating Important Distributions

4 Summary

5 Exercises (Simulations using R)

Generating Continuous Random Variables

Page 3: Simulation - Generating Continuous Random Variables

Why “Generation of Continuous Random Variables”? 3

Simulations are able to address magnifold questions, e.g.: How many customers will we have? How long does it take to handle arriving customers? Will queues develop?

Problem: Simulations require random variables Uniformly distributed random variables are often not enough, but can be used for generation of further distributions Uniformly distributed random variables can be generated, e.g., modulo function (section 2) Example: Customers in a store arrive randomly, between 10 and 20 minutes apart, normally / exponentially

distributed on this interval.

Goal: Generate random variables Discrete random variables (section 2) Continuous random variables (univariate) (section 3)

Each method for generating a discrete random variable has its analogue in the continuous case! Multivariate normal distributed variables (section 4)

Generating Continuous Random Variables

Discrete Variables Continuous Variables Continuous distribution function Example: temperature Univariate

One predicting variable () Multivariate

Multiple predicting variables (,…)

Finite set of values Example: population 0, 1, 2, 3, …

Page 4: Simulation - Generating Continuous Random Variables

Learning Outcomes 4

Generating Continuous Random Variables

Generate random variables using the Inverse-Transform and Acceptance-Rejection Method

Develop algorithms for simulating Exponential, Normal, Poisson and Nonhomogeneous Poisson distributions

Perform simulations using R

Page 5: Simulation - Generating Continuous Random Variables

Agenda 5

Agenda

1 Introduction

2 Selected Methods

2.1 Inverse Transform Method

2.2 Acceptance Rejection Method

3 Generating Important Distributions

4 Summary

5 Exercises (Simulations using R)

Generating Continuous Random Variables

Page 6: Simulation - Generating Continuous Random Variables

Prerequisites The uniform (0,1) random variable The continuous, cumulative distribution function of the targeted random variable

Inverse Transform Method Definition:

Note: The superscript “-1” is not an exponent! It indicates the inverse! Transformation:

is defined to be that value of such that Proof:

is the distribution function of

– * because U is uniform has the same distribution as A random variable can be generated from the continuous, cumulative distribution function be generating a

random number and then setting

Inverse Transform Method: Proof 6

Generating Continuous Random Variables

Inverse Transform Method:1. Find a formula for the function 2. Generate a uniform random number 3. Return the random number

Page 7: Simulation - Generating Continuous Random Variables

Inverse Transform Method: Example 7

Example Task: Generate a random variable with

cumulative distribution function

1. Find a formula for the function

2. Formulate the algorithm for generating the random variable

Generate a random number and Return

Limitation of the Inverse Transform Method Cumulative distribution function needs to be

invertible Thus the inverse transform method is not

suited for the normal distribution!

Generating Continuous Random Variables

3xxF

31 uuF

3n

Page 8: Simulation - Generating Continuous Random Variables

Acceptance-Rejection Method 8

Difference to section 2 (discrete random variables): mass functions are replaced by densities

Prerequisites The continuous probability distribution function of the targeted random variable A method for generating a random variable having density function ; needs to be defined on the same interval as

Acceptance-Rejection Method is a constant such that (for all ) Theorem

has density Number of required iterations is a geometric random variable with mean

Proof See section 2 (discrete random variables)

Efficiency The smallest possible represents the average number of iterations

Generating Continuous Random Variables

Acceptance-Rejection Method:1. Generate having density 2. Accept the generated value with a probability proportional to

2.1 Generate a random number 2.2 If set . Otherwise go to step 1.

Page 9: Simulation - Generating Continuous Random Variables

Acceptance-Rejection Method: Example 9

Example Task: Give the rejection algorithm that generates having

Use

1. Generate a rejection procedure 1. Determine the smallest such that

1.1 Determine maximum of – 1.1.1 Differentiation yields – 1.1.2 Setting this equal to shows that maximum for

1.2 Determine smallest c– 1.2.1 – 1.2.2

2. Formulate the rejection algorithm 1. Generate random numbers and 2. If stop and set ; otherwise go to step 1

Average number of iterations =

Generating Continuous Random Variables

3120 xxxf

Page 10: Simulation - Generating Continuous Random Variables

Agenda 10

Agenda

1 Introduction

2 Selected Methods

3 Generating Important Distributions

3.1 Exponential Distribution

3.2 Normal Distribution

3.3 Poisson Distribution

3.4 Nonhomogeneous Poisson Distribution

4 Summary

5 Exercises (Simulations using R)

Generating Continuous Random Variables

Page 11: Simulation - Generating Continuous Random Variables

Exponential Distribution (1/2) 11

Example task: Generate an exponential random variable Exponential distribution:

Inverse Transform Method

1. Find a formula for the function

2. Formulate the algorithm for generating the random variable 1. Generate a random number 2. Return

Generating Continuous Random Variables

since is uniform(0,1), (1- ) is also uniform(0,1) and has the same distribution

Page 12: Simulation - Generating Continuous Random Variables

Exponential Distribution (2/2) 12

Generating Continuous Random Variables

xexF

uuF ln1

11

𝜆=1

Page 13: Simulation - Generating Continuous Random Variables

Normal Distribution (1/4) 13

Generating Continuous Random Variables

Example 2: Normal Distribution Give the rejection algorithm that generates a sequence of normal random variables with mean and variance Approach: Generate a sequence of half-normal distributed random variables (also called absolute standard

distributed random variables) and determine each variable‘s sign randomly Half-Normal distribution: only positive values!

Optimize the algorithm and assess its efficiency Acceptance-Rejection Method Set 1. Generate a rejection procedure

1. Determine the smallest such that 1.1 Determine maximum of

– 1.1.1 has its maximum if has its maximum– 1.1.1 Differentiation yields – 1.1.2 Setting this equal to shows that maximum for

1.2 Determine smallest c– 1.2.1 – 1.2.2

Page 14: Simulation - Generating Continuous Random Variables

Normal Distribution (2/4) 14

Generating Continuous Random Variables

2. Formulate the rejection algorithm 0. 1. Generate an exponential random variable with rate 1 called 2. Generate a random number 3. If stop and set

Otherwise go to step 1 4. Generate a random number and set

5. Set and go to step 1

5.0;

5.0;*

22

22

UY

UYX i

Page 15: Simulation - Generating Continuous Random Variables

5.0;

5.0;*

1

1

UY

UYX i

Normal Distribution (3/4) 15

Generating Continuous Random Variables

3. Optimization Transformations

– * because is exponential with rate 1 (see exponential distribution) Since and are exponentials with rate 1, is also an exponential with rate 1

Optimized Algorithm 0. 1. Generate an exponential random variables with rate 1 called 2. Generate an exponential random variables with rate 1 called 3. If stop and set

– Otherwise go to step 1 4. Generate a random number U and set

5. Set and set and go to step 2 4. Efficiency assessment

Average number of required exponential random variables Average number of iterations (step 1 and 2) (geometrically distributed) Usage of exponential : Average number of iterations

Average number of required squares (step 3):

Page 16: Simulation - Generating Continuous Random Variables

Normal Distribution (4/4) 16

Graphical illustration

Generating Continuous Random Variables

)(

)(

xg

xfxh

2

2

2

2 x

exf

xexg

Page 17: Simulation - Generating Continuous Random Variables

Example task: Generate the first time units of a Poisson process having rate Approach: Use the exponential distribution to generate event times (so-called interarrival times) and stop when

their sum exceeds

Inverse Transform Method

1. Find a formula for the function We already did that (see exponential distribution)

2. Formulate the algorithm for generating the Poisson process 0. 1. Generate 2.

If , stop! 3. 4. Go to step 1

The solution to the task is the sequence of the event times to

Poisson Process 17

Generating Continuous Random Variables

Legend: = first time units = time = number of events that has occurred by time

( the final values represent the number of events that occurred by time )

= event times in increasing order( represents the most recent event time)

Page 18: Simulation - Generating Continuous Random Variables

Comparison: Homogeneous and Nonhomogeneous Poisson Process 18

Generating Continuous Random Variables

Nonhomogeneous Poisson Process

Events occur randomly in time

Expected arrival rates vary with time

The rate , which represents the expected number of events, is not constant

The intensity function represents the expected number of events around the time

(Homogeneous) Poisson Process

Events are as likely to occur in all intervals of equal size

The rate , which represents the expected number of events, is constant

Previously: Now:

Page 19: Simulation - Generating Continuous Random Variables

Nonhomogeneous Poisson Process (1/2) 19

Example task: Generate the first time units of a nonhomogeneous Poisson process with intensity function Option 1: Thinning (also called random sampling) Option 2: Successive event times ( backup slides)

Inverse Transform Method

Option 1: Thinning 1. Simulate a Poisson process 2. Randomly count its events Remaining events will be nonhomogeneous

Formulate the algorithm for generating the Poisson process 0. 1. Generate a random number 2. Set and stop if 3. Generate another random number 4. If, set 5. Go to step 1

Generating Continuous Random Variables

Legend: = first time units = time = number of events that has occurred

by time = most recent event time = intensity function = expected

number of events around t;

simulation of a Poisson process

randomly counting events

Page 20: Simulation - Generating Continuous Random Variables

Nonhomogeneous Poisson Process (2/2) 20

Why is this option called “thinning”? Not all simulated evens are counted Events are only counted randomly, which “thins” the (homogeneous) Poisson process

Efficiency Rule: The more events are counted, the more efficient the thinning approach Thinning is most efficient, if , because then almost all events are counted Improvement ( backup slides):

1. Break up the interval into subintervals 2. Perform the thinning approach over each subinterval

Generating Continuous Random Variables

Page 21: Simulation - Generating Continuous Random Variables

Agenda 21

Agenda

1 Introduction

2 Selected Methods

3 Generating Important Distributions

4 Summary

5 Exercises (Simulations using R)

Generating Continuous Random Variables

Page 22: Simulation - Generating Continuous Random Variables

If the targeted distribution function is invertible, the Inverse Transform method can and should be used

For the Acceptance-Rejection method you always have to find a suitable function first

Summary 22

Generating Continuous Random Variables

Distribution Inverse Transform Method

Acceptance-Rejection Method

Exponential X

Normal X

(Homogeneous)Poisson X

Nonhomogeneous Poisson X

Page 23: Simulation - Generating Continuous Random Variables

Agenda 23

Agenda

1 Introduction

2 Selected Methods

3 Generating Important Distributions

4 Summary

5 Exercises (Simulations using R)

5.1 Exercise 1 – Inverse Transform Method (Exponential Distr.)

5.2 Exercise 2 – Acceptance Rejection Method (Normal Distr.)

Generating Continuous Random Variables

Page 24: Simulation - Generating Continuous Random Variables

Exercise 1 – Task 24

Required knowledge Inverse Transform Method Exponential Distribution

Task A casualty insurance company has 1,000 policyholders, each of whom will independently present

a claim in the next month with probability .05. The amount of the claims made are independent exponential random variables with mean $800. Use simulation to estimate the probability that the sum of these claims exceeds $50,000.

Why do we have to use simulation? The expected sum of claims will be normally distributed with mean $40,000 (), due to the strong law of large

numbers However, we do not know the mean and standard deviation

Recap: Algorithm for generating an exponential random variable 1. Generate a random number 2. Return

Generating Continuous Random Variables

Page 25: Simulation - Generating Continuous Random Variables

Exercise 1 – Live Demonstration 25

R Version: 3.0.0 (Windows 32-bit) http://cran.rstudio.com/

RStudio Version: 0.97.336 http://www.rstudio.com/ide/download/

Generating Continuous Random Variables

Page 26: Simulation - Generating Continuous Random Variables

Exercise 1 – Code 26

####################################### Author: Martin Kretzer### Date: 11 April 2012### Descr: Simulation of an exponential distribution + inverse transform method####################################### 0. Functions:

### Performs simulations and returns a list of all resulting valuesgetSimulationsVector<- function( numberOfSimulations, numberOfCustomers, mean, probability ){ lambda<- 1/mean simulationsVector<- c() for( i in 1:numberOfSimulations ) { claimsPerSimulationVector<- c() #uniformVars<- runif( numberOfCustomers ) for( j in 1:numberOfCustomers ) { uniformVarProbability <- runif(1) if( uniformVarProbability <= probability ) { uniformVarAlgorithm <- runif(1) claim<- -(1/lambda)*log( uniformVarAlgorithm ) claimsPerSimulationVector<- c( claimsPerSimulationVector, claim ) } } simulationsVector<- c(simulationsVector, sum(claimsPerSimulationVector) ) } return(simulationsVector) };

Generating Continuous Random Variables

### Probability that all values are above the variable limitgetProbability<- function( simulationsVector, limit ){ eventcount<- 0 for( i in 1:length( simulationsVector ) ) { if( simulationsVector[[i]] >= limit ) { eventcount<- eventcount + 1 } } probability<- eventcount / length( simulationsVector ) return(probability)};

####################################### 1. Set constant variables:numberOfSimulations<- 100000;numberOfCustomers<- 1000mean<- 800;limit<- 50000;probability<- 0.05;

### 2. Execute and output variables:simulationsVector <- getSimulationsVector( numberOfSimulations, numberOfCustomers, mean, probability );simulationsVector; #R maximum outputs 10000 values

### 3. Execute and output variables:hist(simulationsVector, 250, xlab="Sum of all claims",ylab="Simulations",main="Exercise: Transform Method and Exp. Distribution");

### 4. Calculate and output probability:probability<- getProbability( simulationsVector, limit );probability;

Page 27: Simulation - Generating Continuous Random Variables

Exercise 1 – Solution 27

Executed simulations: 100,000

Estimated probability that the sum of claims exceeds $50,000:

Generating Continuous Random Variables

Page 28: Simulation - Generating Continuous Random Variables

5.0;

5.0;*

1

1

UY

UYX i

Exercise 2 – Task 28

Required knowledge Acceptance-Rejection Method Normal distribution

Task Write a program that efficiently generates normal random variables using the acceptance-rejection

method with (real) mean 10 and (real) standard deviation 3. What is your estimated mean and what is your estimated standard deviation?

Recap: (Optimized) Algorithm for generating a normal distribution 0. 1. Generate an exponential random variables with rate 1 called 2. Generate an exponential random variables with rate 1 called 3. If stop and set

Otherwise go to step 1 4. Generate a random number U and set

5. Set and set and go to step 2

Generating Continuous Random Variables

Page 29: Simulation - Generating Continuous Random Variables

Exercise 2 – Live Demonstration 29

R Version: 3.0.0 (Windows 32-bit) http://cran.rstudio.com/

RStudio Version: 0.97.336 http://www.rstudio.com/ide/download/

Generating Continuous Random Variables

Page 30: Simulation - Generating Continuous Random Variables

Exercise 2 – Code 30

####################################### Author: Martin Kretzer### Date: 12 April 2012### Descr: Simulation of a half-normal distribution + accept/reject####################################### 0. Function:

### Performs simulations and returns a vector of all simulated valuesgetSimulationsVector<- function( numberOfSimulations, mean, stdDev ){ # Define variables simulations.vector<- rep(0, numberOfSimulations)

# Generate the exponential variable y1 y1<- runif(1) y1<- -log(y1) for( i in 1:numberOfSimulations ) { # Generate the independent exp. variable y2 y2<- runif(1) y2<- -log(y2) # Acceptance-Rejection procedure while( y2 < (y1-1)^2/2 ) { # if rejected, generate two new independent exp. variables y1 and # y2 and repeat procedure y1<- runif(1) y1<- -log(y1) y2<- runif(1) y2<- -log(y2) }

Generating Continuous Random Variables

y3<- y2-(y1-1)^2/2 # Generate a random number U and store the simulated variable y1 # in simulations.vector u<- runif(1) if( u <= 0.5 ) { simulations.vector[i]<- mean + (y1*stdDev) } else { simulations.vector[i]<- mean + (-y1*stdDev) } y1<- y3 } return(simulations.vector) };

####################################### Function End####################################

simulations.vector<- getSimulationsVector( 100000, 10, 3 )summary( simulations.vector )sd( simulations.vector )hist(simulations.vector, 250, xlab="",ylab="Simulations",main="Exercise: Acceptance-Rejection Method and Normal Distribution");

Page 31: Simulation - Generating Continuous Random Variables

Exercise 2 – Solution 31

Executed simulations: 100,000

Estimated mean: 10.010

Estimated standard deviation: 2.993299

Generating Continuous Random Variables

Page 32: Simulation - Generating Continuous Random Variables

Thank you for your attention.

32

Generating Continuous Random Variables

Page 33: Simulation - Generating Continuous Random Variables

Appendix 33

Appendix

1 Backup Slides

2 Bibliography

Generating Continuous Random Variables

Page 34: Simulation - Generating Continuous Random Variables

Poisson Process Introduction (1/2) 34

is the number of events in one period

is the number of subintervals is the probability that there is an event in one subinterval (Binominal distribution) Problem: there might be more than one event per subinterval

Solution: smaller subintervals this increases the amount of subintervals is required

= Poisson process having rate ()

Assumptions Number of events are independent Homogeneity: same probability for intervals of equal length

Generating Continuous Random Variables

Page 35: Simulation - Generating Continuous Random Variables

Poisson Process Introduction (2/2) 35

Interarrival times = [ time of -th event ] – [ time of the ()-th event ] E.g., and means that the first event occurred at time 5 and the second event at time

15 , ,… are independent , ,… are identically distributed exponential random variables with rate

= time of the -th event

Probability of Density function of = Sum of independent exponential random variables, each having parameter = Gamma distribution

Generating Continuous Random Variables

Page 36: Simulation - Generating Continuous Random Variables

Nonhomogeneous Poisson Process: Option 1 Improvement 36

Idea 1. Break up the interval into subintervals 2. Perform the thinning approach over each subinterval Hence “ throughout the interval” is more likely the thinning approach will be more efficient

General Algorithm 1. Determine appropriate values such that

(Equation 1) 2. Generate the nonhomogeneous Poisson process over the interval (, )

2.1 Generate exponential random variables with rate 2.2 Accept the generated event occurring at time , , with probability

Concrete algorithm (Generate the first time units of a nonhomogeneous Poisson process) 1. 2. Generate a random number and set 3. If , go to step 8 4. 5. Generate a random number 5. If , set 7. Go to step 2 8. If , stop 9. 10. Go to step 3

Generating Continuous Random Variables

Legend: = intensity function when Equation 1

is satisfied = present time = present interval = number of events so far = event times

Page 37: Simulation - Generating Continuous Random Variables

Nonhomogeneous Poisson Process: Option 2 37

Option 2: Successive event times

Idea: 1. (Directly) Generate the event time

1.1 Generate the event time from the distribution – The simulated value is (; only differentiated, because “” refers to event times and “” refers to the value

of ) (e.g., simulate by using the Inverse Transform Method)

2. Use to generate 2.1 Generate a value from the distribution 2.2

3. Use to generate 2.1 Generate a value from the distribution 2.2

= distribution of the additional time until the next event = probability, that time from the current event to the next event is less than

Generating Continuous Random Variables

Page 38: Simulation - Generating Continuous Random Variables

Appendix 38

Appendix

1 Backup Slides

2 Bibliography

Generating Continuous Random Variables

Page 39: Simulation - Generating Continuous Random Variables

Bibliography 39

ETH Zürich. 2013. The Uniform Distribution, ETH Zürich, Zürich, CH (available online at http://stat.ethz.ch/R-manual/R-devel/library/stats/html/Uniform.html ).

Haugh, M. 2010. “Generating Random Variables and Stochastic Processes,” Columbia University, US (available online at http://www.columbia.edu/~mh2078/MCS_ Generate_ RVars .pdf ).

The R Core Team 2013. “R: A Language and Environment for Statistical Computing. Reference Index. Version 3.0.0 (2013-04-03),” (part of the R download).

R-Project 2013. “Probability Distributions,” (available online at http://cran.r-project.org/web/views/Distributions.html ).

Ross, S. M. 2013. Simulation, 5th ed., San Diego, CA: Academic Press. Sigman, K. 2009. “Inverse Transform Method,” Columbia University, US (available online at

http://www.columbia.edu/~ks20/4404-13-Spring/4404-Notes-ITM.pdf ). Valdez, E. A. 2008. “Generating Continuous Random Variables,” University of Connecticut,

US (available at http://www.math.uconn.edu/~valdez/math276s08/Math276-Week45.pdf ). Venables, W. N., Smith, D. M., and the R Core Team. 2013. “An Introduction to R. Notes on

R: A Programming Environment for Data Analysis and Graphics Version 3.0.0 (2013-04-03),” (part of the R download).

Generating Continuous Random Variables