non_unif_HO
8/11/2019 non_unif_HO
1 Introduction
2 Inversion method
3 Rejection method
4 R functions
Important families of discrete random variables
Important families of continuous random variables
5 The bootstrap
6 Gaussian vectors
7 Exercises
G. Guillot (DTU) Non uniform random numbers - 02443 2 / 58
Introduction
From today on, we assume that we have a reliable (pseudo-)RNG that produces uniform random variables on $[0, 1]$. We can use runif() in R for this purpose.
How to simulate non-uniform
discrete random variables, e.g. from the geometric distribution (waiting time of the first head):
$P(X = k) = (1 - p)^{k-1} p$, $X \sim \mathrm{Geom}(p)$, $p \in [0, 1]$, $k \in \mathbb{N}$, $k \ge 1$
continuous random variables, e.g. the Gaussian (or normal) distribution with pdf
$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[ -\frac{(x - \mu)^2}{2\sigma^2} \right]$
random vectors, i.e. random variables with values in $\mathbb{R}^p$
We can do that by simulating uniform numbers and applying a suitable transformation to them...
Inversion method
Anamorphosis by the cdf
Theorem: image of a r.v. through its one-to-one c.d.f.
Let $X$ be a continuous r.v. with a one-to-one (bijective) c.d.f. $F$.
Let us define $Y$ as $Y = F(X)$. Then $Y \sim U([0, 1])$.
## Illustration of theorem on inversion
## one draw: x on the horizontal axis is mapped to y = F(x) on the vertical axis
h = seq(0,10,.01) ; lambda = 1/2
plot(h,dexp(h,rate=lambda),type="l",col=2,lwd=2,xlab="",ylab="",ylim=c(0,1)) ; abline(v=0) ; abline(h=0)
lines(h,pexp(h,rate=lambda),lty=1,lwd=2)
x = rexp(n=1,rate=lambda) ; y = pexp(x,rate=lambda)
points(x,0,col=2,lwd=2,pch=16,cex=2)
points(x,y,col=1,lwd=2,pch=16,cex=2) ; arrows(x0=x,y0=0,x1=x,y1=y,lty=2,angle=15)
points(0,y,col=3,lwd=2,pch=16,cex=2) ; arrows(x0=x,y0=y,x1=0,y1=y,lty=2,angle=15)
## same picture with n = 100 draws
n = 100
h = seq(0,10,.01) ; lambda = 1/2
plot(h,dexp(h,rate=lambda),type="l",col=2,lwd=2,xlab="",ylab="",ylim=c(0,1)) ; abline(v=0) ; abline(h=0)
lines(h,pexp(h,rate=lambda),lty=1,lwd=2)
x = rexp(n=n,rate=lambda) ; y = pexp(x,rate=lambda)
points(x,rep(0,n),col=2,lwd=2,pch=1,cex=2)
points(x,y,col=1,lwd=2,pch=1,cex=2)
points(rep(0,n),y,col=3,lwd=2,pch=1,cex=2)
The proof...
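Proof: For $y \in (0, 1)$ we have
$P(Y \le y) = P(F(X) \le y) = P(X \le F^{-1}(y)) = F(F^{-1}(y)) = y$
This property suggests a method to simulate an arbitrary distribution with a one-to-one c.d.f. $F$ from a $U([0, 1])$: if $U \sim U([0, 1])$, then $X = F^{-1}(U)$ has c.d.f. $F$.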
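The inversion idea can be sketched generically in R. This is only an illustration: the helper name rinversion and the choice of the standard logistic distribution (c.d.f. $F(x) = 1/(1 + e^{-x})$, inverse $F^{-1}(u) = \ln(u/(1-u))$) are assumptions made for the example, not part of the course material.

```r
# Hypothetical helper: draw n values by inversion, given the inverse c.d.f. Finv
rinversion <- function(n, Finv) {
  u <- runif(n)   # U ~ U([0, 1])
  Finv(u)         # F^{-1}(U) has c.d.f. F
}

# Illustrative target (an assumption): standard logistic distribution,
# F(x) = 1 / (1 + exp(-x)), hence F^{-1}(u) = log(u / (1 - u))
logistic_inv <- function(u) log(u / (1 - u))

set.seed(1)
x <- rinversion(10000, logistic_inv)

# Sanity check of the theorem: Y = F(X) should look uniform on [0, 1]
y <- 1 / (1 + exp(-x))
c(mean = mean(y), var = var(y))   # compare with 1/2 and 1/12
```

Any distribution whose inverse c.d.f. is available in closed form can be plugged in the same way.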
Simulation of an exponential rv by inversion
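Imagine we want to simulate an exponential r.v.,
i.e. with pdf $f(x) = \lambda \exp(-\lambda x) I_{[0,+\infty)}(x)$; $\lambda > 0$
The corresponding cdf is $F(x) = 1 - \exp(-\lambda x)$
$F$ has an inverse $F^{-1}(y) = -\frac{1}{\lambda} \ln(1 - y)$
Then $X$ defined as $X = -\frac{1}{\lambda} \ln(1 - U)$, with $U \sim U([0, 1])$, satisfies $X \sim \mathrm{Exp}(\lambda)$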
## Simulation of an exponential rv by inversion
lambda = 1/2 ; n = 1000 ; u = runif(n)
h = seq(0,10,.1) ; plot(h,dexp(h,rate=lambda),
     type="l",lwd=3,col=3,xlab="",ylab="")
x = (-1/lambda) * log(1-u)   # inversion: x = F^{-1}(u)
points(x,rep(0,n),col=3)
hist(x,breaks=seq(-0.1,max(x)+.1,.1),add=TRUE,prob=TRUE,col=3)
## the right way to code this in R without using rexp:
qexp(runif(n), rate=lambda)
Rejection method
Points uniform in a subset of $\mathbb{R}^2$
Definition
Let $D$ be a subset of $\mathbb{R}^2$ of finite area. A random vector $U = (U_1, U_2)$ is uniform in $D$ if for any $A \subset D$, $P(U \in A) = |A|/|D|$.
The associated density is $f(u) = f(u_1, u_2) = \frac{1}{|D|} I_D(u_1, u_2)$
Points uniform between a density curve and the x-axis
Theorem
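Let $g$ be a pdf on $\mathbb{R}$ and $S_g$ the subset of $\mathbb{R}^2$ defined as $S_g = \{(x, y);\ 0 \le y \le g(x)\}$
i) If $(X, Y)$ is uniformly distributed in $S_g$ then $X$ has $g$ as pdf.
ii) If $X \sim g$, $U \sim U([0, 1])$ independent of $X$, and $Y = U g(X)$, then $(X, Y)$ is uniformly distributed in $S_g$.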
Proof
i) If $(X, Y) \sim U(S_g)$ then, since $|S_g| = 1$,
$P(X \le x) = \mathrm{Area}(\{(u, v);\ u \le x;\ 0 \le v \le g(u)\}) = \int_{-\infty}^{x} g(t)\,dt$, i.e. $X \sim g$
ii) If $X \sim g$, $U \sim U([0, 1])$ and $Y = U g(X)$, then $(X, Y)$ has a joint density $h(x, y)$ defined as
$h(x, y) = g(x) \frac{1}{g(x)} I_{[0, g(x)]}(y) = I_{[0, g(x)]}(y)$
Hence for $B \subset S_g$,
$P((X, Y) \in B) = \iint_B h(x, y)\,dx\,dy = \iint_B I_{[0, g(x)]}(y)\,dx\,dy = |B|$
i.e. $(X, Y)$ is uniformly distributed in $S_g$.
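Part ii) of the theorem can be checked numerically. The sketch below takes $g$ to be the Exp(1) density, which is an arbitrary choice for illustration, and compares the proportion of points falling in a rectangle $B \subset S_g$ with its area $|B|$.

```r
set.seed(2)
n <- 20000
x <- rexp(n, rate = 1)              # X ~ g, with g the Exp(1) density (illustrative choice)
y <- runif(n) * dexp(x, rate = 1)   # Y = U * g(X), U independent of X

# B = [0, 1] x [0, 0.3] lies entirely under the curve because exp(-1) > 0.3,
# so P((X, Y) in B) should be close to |B| = 0.3
in_B <- (x <= 1) & (y <= 0.3)
mean(in_B)
```

Plotting y against x and overlaying dexp shows the points filling the region under the density curve evenly.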
Figure: Target distribution with density $f$.
Figure: An auxiliary distribution $g$ from which we know how to sample.
Figure: We assume that $g$ can uniformly dominate $f$ after re-scaling, i.e. that there is a constant $C$ with $f(x) \le C g(x)$ for all $x$.
Figure: Generate some points $(X_i, Y_i)$ in the domain under the graph of the re-scaled auxiliary density $C g$.
Figure: Throw away the points $(X_i, Y_i)$ outside $S_f$. The x coordinates of the remaining points follow the target distribution $f$.
Remarks on the rejection method
All $X_i$ values produced initially are drawn from $g$, but after dropping the rejected ones, the remaining values are distributed according to $f$
The method generalizes easily to higher dimensions
Pseudo-code for the rejection algorithm
Init:
  X = x0; Y = f(x0) + 1
While Y > f(X) do:
  X ~ g
  Y ~ U([0, C g(X)])
End do
Deliver X
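The pseudo-code above could be written in R along the following lines. The target (a Beta(2, 5) density), the auxiliary density (uniform on [0, 1]), the constant C = 2.5 and the helper name rreject_one are all assumptions chosen for illustration.

```r
# Target density f (illustrative assumption): Beta(2, 5) on [0, 1]
f <- function(x) dbeta(x, 2, 5)
# Auxiliary density g: U([0, 1]); C is chosen so that f(x) <= C * g(x) everywhere
g <- function(x) dunif(x)
C <- 2.5   # the maximum of f is about 2.46, reached at x = 0.2

# Hypothetical helper: one draw from f, following the loop of the pseudo-code
rreject_one <- function() {
  repeat {
    x <- runif(1)                            # X ~ g
    y <- runif(1, min = 0, max = C * g(x))   # Y ~ U([0, C g(X)])
    if (y <= f(x)) return(x)                 # accept when (X, Y) falls in S_f
  }
}

set.seed(3)
x <- replicate(5000, rreject_one())
c(mean = mean(x), target_mean = 2 / 7)
```

Each proposal is accepted with probability 1/C, so a C close to the smallest valid constant wastes fewer draws.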
R functions: Important families of discrete random variables
Back to initial examples
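$P(X = k) = (1 - p)^{k-1} p$, $X \sim \mathrm{Geom}(p)$, $p \in [0, 1]$, $k \in \mathbb{N}$, $k \ge 1$
$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[ -\frac{(x - \mu)^2}{2\sigma^2} \right]$
These distributions have nice concrete interpretations, which can be used for simulation.
The geometric distribution: waiting time of the first head
The Gaussian (or normal) distribution can be seen as the approximate distribution of the (centred and scaled) sum of $n$ i.i.d. random variables (Central Limit Theorem)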
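Both interpretations translate directly into simulation code. In the sketch below, the helper name rgeom_wait, the value p = 0.3 and the use of m = 12 uniform summands are assumptions for illustration.

```r
set.seed(4)
p <- 0.3

# Geometric: count Bernoulli(p) trials until the first success ("head")
rgeom_wait <- function(p) {
  k <- 1
  while (runif(1) > p) k <- k + 1   # runif(1) <= p counts as a head
  k
}
x_geom <- replicate(10000, rgeom_wait(p))
mean(x_geom)                        # theoretical mean is 1/p

# Approximate N(0, 1) via the CLT: standardized sum of m i.i.d. U([0, 1])
m <- 12
x_norm <- replicate(10000, (sum(runif(m)) - m / 2) / sqrt(m / 12))
c(mean(x_norm), sd(x_norm))         # compare with 0 and 1
```

Note that R's own rgeom counts the failures before the first success, so its values start at 0 rather than 1.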
Important families of continuous random variables
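Continuous uniform: all subsets of equal sizes equally likely
$f_X(x) = \frac{1}{b - a} I_{[a, b]}(x)$, $X \sim U([a, b])$
Normal (or Gaussian) distribution
$f_X(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[ -\frac{1}{2} \left( \frac{x - \mu}{\sigma} \right)^2 \right]$, $X \sim N(\mu, \sigma)$ or $X \sim N(\mu, \sigma^2)$, $\mu \in \mathbb{R}$, $\sigma \in \mathbb{R}_+$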
Common R simulation functions for continuous distributions
runif, rnorm, rexp, rgamma, rbeta, rlnorm, rf, rchisq
See ?Distributions
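Each of these r functions has companions sharing the same suffix: d for the density, p for the c.d.f. and q for the quantile function (the inverse c.d.f.). A short illustration with the exponential family, where the rate of 2 is an arbitrary choice:

```r
set.seed(5)
x <- rexp(5, rate = 2)               # r: random draws
dexp(1, rate = 2)                    # d: density at 1, i.e. 2 * exp(-2)
pexp(1, rate = 2)                    # p: c.d.f. at 1, i.e. 1 - exp(-2)
qexp(pexp(1, rate = 2), rate = 2)    # q: inverse of p, so this gives back 1
```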
The bootstrap
Assessing the value of a sample
Blurb on opinion polls & quality control
Reminder: estimation, estimator, bias, variance, MSE
Now imagine for a while that we do not know that $V(\bar{X}_n) = \sigma^2/n$
or that $\hat{\sigma}^2 = \sum_{i=1}^{n} (X_i - \bar{X}_n)^2 / (n - 1)$
A fairly general situation:
An unknown parameter $\theta$
An estimator $\hat{\theta}$
No known formula for the variance (or MSE) of $\hat{\theta}$
Back to the variance of $\bar{X}_n$
By definition: variance = expected square of the difference with expectation
$V(\bar{X}_n)$ can be thought of as the number obtained as follows
Take a 1st i.i.d. sample of size $n$, compute $\bar{X}_n^{(1)}$
Take a 2nd i.i.d. sample of size $n$, compute $\bar{X}_n^{(2)}$
...
Take a B-th i.i.d. sample of size $n$, compute $\bar{X}_n^{(B)}$
Compute $\hat{V}(\bar{X}_n) = \frac{1}{B} \sum_{j=1}^{B} \left( \bar{X}_n^{(j)} - \frac{1}{B} \sum_{l=1}^{B} \bar{X}_n^{(l)} \right)^2$
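When the sampling distribution is known, this recipe can be run as is. A minimal sketch, assuming $X_i \sim N(0, \sigma^2)$ with $\sigma = 2$, $n = 50$ and $B = 5000$ (all illustrative choices):

```r
set.seed(6)
n <- 50; B <- 5000
sigma <- 2                       # illustrative choice: X_i ~ N(0, sigma^2)

# B independent samples of size n, one sample mean per sample
xbar <- replicate(B, mean(rnorm(n, mean = 0, sd = sigma)))

# (1/B) * sum of squared deviations from the average of the B means
v_hat <- mean((xbar - mean(xbar))^2)
c(v_hat = v_hat, theory = sigma^2 / n)
```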
The previous quantity estimates $V(\bar{X}_n)$ more and more accurately as $B$ increases (law of large numbers)
Little problem: we do not have $B$ samples of size $n$, we have only one!
There is a workaround ... just pull your bootstraps!
Estimating $V(\bar{X}_n)$ with the so-called bootstrap estimator (Efron, 1979)
We have a single dataset $(X_1, \dots, X_n)$
We can pretend to have $B$ different samples of size $n$:
sample $Y_1^{(1)}, \dots, Y_n^{(1)}$ in $(X_1, \dots, X_n)$ uniformly with replacement, compute $\bar{Y}_n^{(1)}$
sample $Y_1^{(2)}, \dots, Y_n^{(2)}$ in $(X_1, \dots, X_n)$ uniformly with replacement, compute $\bar{Y}_n^{(2)}$
...
sample $Y_1^{(B)}, \dots, Y_n^{(B)}$ in $(X_1, \dots, X_n)$ uniformly with replacement, compute $\bar{Y}_n^{(B)}$
Note that $Y_1^{(b)}, \dots, Y_n^{(b)}$ usually have ties
Compute $\hat{V}(\bar{X}_n) = \frac{1}{B} \sum_{j=1}^{B} \left( \bar{Y}_n^{(j)} - \frac{1}{B} \sum_{l=1}^{B} \bar{Y}_n^{(l)} \right)^2$
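The resampling steps above can be sketched in a few lines of R. The helper name boot_var and the exponential dataset are assumptions for illustration; sample() with replace = TRUE performs the uniform resampling with replacement.

```r
set.seed(7)
n <- 200; B <- 2000
x <- rexp(n, rate = 1)           # the single observed dataset (illustrative)

# Hypothetical helper: bootstrap estimate of the variance of a statistic
boot_var <- function(x, stat, B) {
  reps <- replicate(B, stat(sample(x, size = length(x), replace = TRUE)))
  mean((reps - mean(reps))^2)
}

v_boot <- boot_var(x, mean, B)
c(bootstrap = v_boot, plug_in = var(x) / n)
```

For the sample mean the two numbers should be close; the interest of the helper is that stat can be any statistic for which no variance formula is known.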
Figure: Numerical comparison (sd).
Bootstrap-based confidence intervals
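Problem: give a C.I. for a parameter $\theta$ on the basis of a sample $X_1, \dots, X_n$
Solution:
Build an estimator $\hat{\theta}$
Assume that $\hat{\theta}$ is (approximately) normally distributed
Estimate $V(\hat{\theta})$ by the bootstrap estimator $\hat{V}(\hat{\theta})$
Build the normal-based C.I. $= \left[ \hat{\theta} + z_{\alpha/2} \sqrt{\hat{V}(\hat{\theta})} \ ;\ \hat{\theta} + z_{1-\alpha/2} \sqrt{\hat{V}(\hat{\theta})} \right]$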
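A possible R sketch of this construction, taking the standard deviation as the parameter $\theta$ and a simulated Gaussian dataset as the sample (both illustrative assumptions); qnorm supplies the quantiles $z_{\alpha/2}$ and $z_{1-\alpha/2}$.

```r
set.seed(8)
x <- rnorm(100, mean = 10, sd = 3)     # illustrative data
B <- 2000; alpha <- 0.05

theta_hat <- sd(x)                     # estimator of theta = standard deviation
reps <- replicate(B, sd(sample(x, replace = TRUE)))
v_boot <- mean((reps - mean(reps))^2)  # bootstrap estimate of V(theta_hat)

# normal-based interval: theta_hat + z * sqrt(v_boot) for the two quantiles z
ci <- theta_hat + qnorm(c(alpha / 2, 1 - alpha / 2)) * sqrt(v_boot)
ci
```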
Take home message about the bootstrap
A collection of several samples can be mimicked by resampling the data
A theoretical parameter that can be expressed as an expectation can be estimated by an average over the fake samples
The idea is very general (see example for the median below)
Gaussian vectors
Multivariate Gaussian (or normal) random vectors
Examples of bivariate normal densities (n= 2)
Definition
Let $X = (X_1, X_2, \dots, X_n)^T$ be a random vector
$X$ has an n-dimensional multivariate normal distribution if there is
a k-dimensional random vector $Z$ with i.i.d. $N(0, 1)$ entries
an $n \times k$ deterministic matrix $A$
an n-dimensional deterministic vector $b$
such that $X$ and $AZ + b$ have the same distribution
Since $A$ and $b$ are fixed, we have $E[X] = b$ and $\mathrm{Var}[X] = A A^T$.
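The definition is constructive, so it can be tried out directly. In this sketch the matrix A (with n = 2, k = 3) and the vector b are arbitrary values picked for illustration.

```r
set.seed(9)
n_sim <- 20000
A <- matrix(c(1, 0.5, 0,
              0, 1,   2), nrow = 2, byrow = TRUE)   # n = 2, k = 3 (illustrative)
b <- c(1, -1)

Z <- matrix(rnorm(3 * n_sim), nrow = 3)   # each column is one draw of Z
X <- A %*% Z + b                          # b is recycled down each column

rowMeans(X)        # compare with b
cov(t(X))          # compare with A %*% t(A)
A %*% t(A)
```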
Reminder: the expectation (or mean) of a random vector
Definition: expectation of a random vector
The mean vector $\mu = (\mu_i)$ of a random vector $X = (X_1, X_2, \dots, X_n)^T$ is defined as the vector whose entries $\mu_i$ are $\mu_i = E(X_i)$, $i = 1, \dots, n$
The density of a multivariate normal vector
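Let us assume that $Y$ has a multivariate normal distribution in $\mathbb{R}^n$. If the variance-covariance matrix $\Sigma$ of $Y$ is of full rank, then $Y$ has a density on $\mathbb{R}^n$ as follows:
$f_Y(y) = \frac{1}{(2\pi)^{n/2} \sqrt{\det \Sigma}} \exp\left( -\frac{1}{2} (y - \mu)^T \Sigma^{-1} (y - \mu) \right)$
where $\mu$ is the expectation of $Y$
We write $Y \sim N_n(\mu, \Sigma)$.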
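The density formula can be coded term by term. The function name dmvn below is a hypothetical helper written for this illustration, not a base R function; solve(Sigma, d) computes $\Sigma^{-1}(y - \mu)$ without forming the inverse explicitly.

```r
# Hypothetical helper: density of N_n(mu, Sigma) at a point y (Sigma of full rank)
dmvn <- function(y, mu, Sigma) {
  n <- length(mu)
  d <- y - mu
  q <- drop(t(d) %*% solve(Sigma, d))   # (y - mu)^T Sigma^{-1} (y - mu)
  exp(-q / 2) / ((2 * pi)^(n / 2) * sqrt(det(Sigma)))
}

# With Sigma = I_2 the density factorizes into two univariate N(0, 1) densities
dmvn(c(0.3, -1), mu = c(0, 0), Sigma = diag(2))
dnorm(0.3) * dnorm(-1)
```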
Interest of the multivariate Gaussian distribution
Flexible: many results can be derived analytically
Central Limit Theorem (CLT): many processes exhibit a Gaussian-like distribution
No obvious alternative in multivariate statistics
The Choleski factorization
Let $\Sigma$ be the covariance matrix of a random vector.
Being a symmetric positive-definite matrix (we assume $\Sigma$ is of full rank), $\Sigma$ can be written $\Sigma = LU$
where $L$ is lower triangular and $U = L^T$.
This is known as the Choleski factorization, a special case of the LU factorization
Useful everywhere in numerical analysis because computations with triangular matrices are fast
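In R the factorization is available through chol(), which returns the upper triangular factor $U$; the lower triangular $L$ is obtained by transposition. The matrix below is an arbitrary example.

```r
Sigma <- matrix(c(4, 2,
                  2, 3), nrow = 2, byrow = TRUE)   # illustrative covariance matrix

U <- chol(Sigma)   # upper triangular factor: Sigma = t(U) %*% U
L <- t(U)          # lower triangular factor, so that Sigma = L %*% U

L
max(abs(L %*% U - Sigma))   # reconstruction error, numerically zero
```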
The LU algorithm for simulating general random vectors
Algorithm to simulate a random vector with mean $\mu$ and variance matrix $\Sigma$
1 Simulate $X = (X_1, \dots, X_n)^T$ centred ($E[X_i] = 0$) and with $\mathrm{Var}[X] = I_n$
2 Find $L$ such that $\Sigma = LU$
3 Compute $Y = \mu + LX$
The LU algorithm for simulating MVN random vectors
Algorithm to simulate a multivariate Gaussian random vector with mean $\mu$ and variance matrix $\Sigma$
1 Simulate $X = (X_1, \dots, X_n)^T$ i.i.d. $N(0, 1)$
2 Find $L$ such that $\Sigma = LU$
3 Compute $Y = (Y_1, \dots, Y_n)^T = \mu + LX$
R code for simulation of a bivariate Gaussian vector with the LU method
## Define target density (variances sd^2, covariance rho*sd^2)
sd = 1 ; rho = .8
Sigma = matrix(nrow=2,ncol=2,data=c(sd^2,rho*sd^2,rho*sd^2,sd^2))
Sigma
## Choleski factorisation (chol() returns the upper triangular factor)
U = chol(Sigma) ; L = t(U)
L
L %*% U ## just checking
## simulation
n = 1000
x1 = rnorm(n) ; x2 = rnorm(n)
X = rbind(x1,x2)
Y = L %*% X
Y = t(Y)
plot(Y,asp=1)
## checking that the empirical var-covar matrix is close to the target
var(Y)
Sigma
## checking the empirical bivariate density
library(MASS)
image(kde2d(Y[,1],Y[,2]),asp=1) ; points(Y,cex=.1,col=3)
contour(kde2d(Y[,1],Y[,2]),add=TRUE)
Exercises
Exercises II
5 You want to sell your car. You receive a first offer at a price of 12 (say in K Euro). Since it is the first offer, you decide to reject it and wait a little to test the market. You reject offers and wait until you get the first offer strictly larger than 12. Study by simulation the waiting time (counted in number of offers until the sale). You can for instance estimate the expectation of the waiting time. Assume that the $X_i$s are i.i.d. Poisson(10). Consider the solution conditionally on $X_1 = 12$, then unconditionally (i.e. on average over all possible $X_1$ values under the assumption that the $X_i$s are i.i.d. Poisson(10)). The use of rpois is allowed in this exercise.
6 Variance in the estimation of the median of a Poisson distribution
Simulate a single dataset of say $n = 200$ observations from a Poisson distribution with parameter 10
Estimate the theoretical median from your sample
Implement the bootstrap estimation technique to evaluate the variance of your estimator
7 Simulate a sample of size 50000 from a bivariate centred, standardized Gaussian vector with correlation coefficient $\rho = 0.7$. Plot this sample. Extract a sample of a random variable approximately distributed as $Y_2 | Y_1 = 0.5$. Estimate its mean and variance. Plot its empirical histogram and compare it to a normal density with the same parameters as the one simulated above.