Durham 102208

8/14/2019 Durham 102208


SIMULATIONS AND COSMOLOGICAL INFERENCE

Michael D. Schneider

Durham

In collaboration with Lloyd Knox (UC Davis), Salman Habib, Katrin Heitmann, David Higdon (Los Alamos National Laboratory), Charles Nakhleh (Sandia National Laboratories)

October 22, 2008


Overview

Question: How do we estimate cosmological parameters when theoretical models are only known via forward simulation?

Answer: Use a statistical model to interpolate the outputs of select simulation runs.

1. Simulation design

2. Emulator

Simultaneously learn the error distribution for the data.

Applicable to CMB, galaxy, and weak lensing surveys (or really anywhere that uses simulations for parameter inference).

arXiv:0806.1487


Technical motivation: simulations are costly!

Most astrophysical systems can only be modeled with numerical simulations

Even when the physics is easily understood, accurate noise modeling can require large simulations (e.g. the CMB)

Constraining dark energy via BAO and cosmic shear provides formidable computational challenges in predicting both the model and the error distributions


Parameter estimation requires many simulations

Use Monte Carlo algorithms to integrate the joint probability distribution of the data and model:

P(model | data) = P(model, data) / P(data)

Requires many calculations of the model at different parameter settings (~10,000 evaluations for ~5 parameters)

This is computationally prohibitive for many applications
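As a rough illustration of why the evaluation count matters, here is a minimal random-walk Metropolis sketch. The sampler, the step size, and the toy 5-parameter Gaussian posterior are all assumptions made for illustration; in a real application each call to `log_post` would require a forward simulation.

```python
import numpy as np

def metropolis(log_post, theta0, step, n_steps, rng):
    """Random-walk Metropolis; returns the chain and the number of model calls."""
    theta = np.asarray(theta0, dtype=float)
    lp = log_post(theta)
    n_calls = 1
    chain = np.empty((n_steps, theta.size))
    for i in range(n_steps):
        proposal = theta + step * rng.standard_normal(theta.size)
        lp_prop = log_post(proposal)  # one (potentially expensive) model evaluation
        n_calls += 1
        # Accept with probability min(1, posterior ratio).
        if np.log(rng.uniform()) < lp_prop - lp:
            theta, lp = proposal, lp_prop
        chain[i] = theta
    return chain, n_calls

def log_post(theta):
    """Hypothetical stand-in for a simulation-based posterior: a unit Gaussian."""
    return -0.5 * np.sum(theta**2)

rng = np.random.default_rng(0)
chain, n_calls = metropolis(log_post, np.zeros(5), step=0.5, n_steps=10_000, rng=rng)
```

Every proposed step costs one model evaluation whether or not it is accepted, which is what makes direct sampling impractical when the model is a large simulation.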


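Likelihood model

Multivariate Gaussian model for the likelihood (up to an additive constant):

$-2\log P(x \mid \theta) = (x - \bar{x}(\theta))^T C^{-1}(\theta)\,(x - \bar{x}(\theta)) + \log\det C(\theta)$

where $x$ is the data vector, $\bar{x}(\theta)$ its model prediction, and $\theta$ the model parameters

For galaxy surveys or CMB, data = power spectrum

Model dependence of covariance usually neglected

Framework identical for N-point correlations

Gaussian distribution can be extended using mixture models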
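A minimal sketch of evaluating this Gaussian likelihood, including the log-determinant term that matters once the covariance depends on the parameters. The function name and the two-element example values are hypothetical.

```python
import numpy as np

def minus_two_log_like(x, mean, cov):
    """-2 log P(x | theta) for a multivariate Gaussian, up to an additive constant."""
    resid = x - mean
    chol = np.linalg.cholesky(cov)             # cov = chol @ chol.T
    z = np.linalg.solve(chol, resid)           # z @ z = resid^T cov^{-1} resid
    log_det = 2.0 * np.sum(np.log(np.diag(chol)))
    return z @ z + log_det

# Hypothetical two-band example.
x = np.array([1.0, 2.0])
mean = np.array([0.0, 0.0])
cov = np.array([[2.0, 0.0], [0.0, 0.5]])
value = minus_two_log_like(x, mean, cov)
```

Using the Cholesky factor gives both the quadratic form and the determinant from a single factorization, and it fails loudly if the covariance is not positive definite.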


EXAMPLE: NONLINEAR MATTER POWER SPECTRUM


Non-Gaussian errors in the cosmic shear power spectrum

Fisher matrix constraints from Halo Model calculation of power spectrum covariance (Cooray & Hu (2000))

Non-Gaussian effects can dominate at scales < 10 arcmin (even when apparently shape noise dominated) (Semboloni et al. (2006))

Full sky weak lensing survey (limiting mag in R~25)


Clusters + weak lensing

Takada & Bridle (2007)

Consider cross-covariance between cluster number counts and cosmic shear power spectrum


Power spectrum covariance from N-body simulations

32 realizations of N-body cube 450 Mpc/h on a side

Chop into 64 sub-cubes

Window has large impact on covariance

Not explained by simple convolution with the power spectrum
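The summaries of this kind (mean spectrum, normalized variance, correlation coefficients) can be sketched from a stack of measured spectra as follows. The random input is a hypothetical stand-in for real sub-cube measurements, and defining the normalized variance as variance divided by the squared mean is an assumption.

```python
import numpy as np

def covariance_summaries(pk):
    """pk has shape (n_realizations, n_bins): one measured spectrum per row."""
    mean = pk.mean(axis=0)
    cov = np.cov(pk, rowvar=False)             # unbiased sample covariance
    norm_var = np.diag(cov) / mean**2          # assumed normalization: var / mean^2
    sigma = np.sqrt(np.diag(cov))
    corr = cov / np.outer(sigma, sigma)        # correlation coefficients
    return mean, cov, norm_var, corr

# Hypothetical stand-in data: 64 noisy copies of a power-law spectrum.
rng = np.random.default_rng(1)
n_real, n_bins = 64, 8
k = np.logspace(-1.5, 0.3, n_bins)
pk_true = 1e4 * k**-1.5
pk = pk_true * (1.0 + 0.1 * rng.standard_normal((n_real, n_bins)))
mean, cov, norm_var, corr = covariance_summaries(pk)
```

With a number of realizations comparable to the number of bins the sample covariance becomes noisy or singular, which is one reason the realization count per design point matters.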

[Figure: three panels vs. k [h/Mpc]. Normalized variance (Gaussian, 450 Mpc/h periodic box, 112.5 Mpc/h windowed box); mean power spectra (450 Mpc/h periodic box, 112.5 Mpc/h windowed box); correlation coefficients (450 Mpc/h periodic box, 112.5 Mpc/h windowed box)]


Parameter dependence of the power spectrum covariance

[Figure: two panels vs. k [h/Mpc]: normalized variance of the power spectrum, and correlation coefficients (halo model). Curves: Gaussian; HM, PT, and sim., each for $\sigma_8 = 0.6$ and $\sigma_8 = 1$]


Parameterization of the power spectrum error distribution

Multivariate Normal distribution:

$\hat{P}(k) \sim N(\mu(\theta), \Sigma(\theta))$

Consider shell-averaged estimates of power spectrum bands

Central limit theorem guarantees a Gaussian distribution for band powers except for a few k-bins on the largest scales of the survey

Correlations in power spectrum captured in this model


SIMULATION DESIGN


Choosing which simulations to run

Orthogonal Array Latin Hypercube

Specify hypercube parameter bounds (rescaled to unit interval)

Latin square: one point per row and column

Orthogonal array: each quadrant has a sample

Optimize with distance criterion
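The Latin-square and distance-criterion steps can be sketched as below. This is a plain Latin hypercube with a maximin random search, not a full orthogonal-array construction: it does not enforce the "each quadrant has a sample" property. All function names and the choice of 200 trials are assumptions.

```python
import numpy as np

def latin_hypercube(n_points, n_dim, rng):
    """One point per stratum in every dimension, centred in its cell, on [0, 1]^n_dim."""
    cells = np.stack([rng.permutation(n_points) for _ in range(n_dim)], axis=1)
    return (cells + 0.5) / n_points

def min_pairwise_distance(design):
    """Smallest Euclidean distance between any two design points."""
    diff = design[:, None, :] - design[None, :, :]
    dist = np.sqrt((diff**2).sum(axis=-1))
    return dist[np.triu_indices(len(design), k=1)].min()

def maximin_lhs(n_points, n_dim, n_trials, rng):
    """Keep the random Latin hypercube whose closest pair of points is farthest apart."""
    best, best_score = None, -np.inf
    for _ in range(n_trials):
        candidate = latin_hypercube(n_points, n_dim, rng)
        score = min_pairwise_distance(candidate)
        if score > best_score:
            best, best_score = candidate, score
    return best

rng = np.random.default_rng(2)
design = maximin_lhs(n_points=8, n_dim=2, n_trials=200, rng=rng)
```

The unit-interval design is then mapped linearly onto the physical parameter bounds before the simulations are run.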

[Figure: Simulation design (OALH); parameter 1 vs. parameter 2, both on the unit interval]


GAUSSIAN PROCESS MODELS FOR INTERPOLATION


How to do interpolation in high dimensions

We need to interpolate multivariate simulation output as a function of large (~10) numbers of parameters

Power spectrum mean and covariance components modeled as Gaussian processes (GPs) (following Habib et al. 2007)

Interpolation error propagated within Bayesian framework

GP determined by correlation parameters for the interpolated surface

GPs scale well for interpolation in high dimensions


Gaussian process models for spatial phenomena

[Figure: z(s) vs. s]

An example of z(s) of a Gaussian process model on $s_1, \ldots, s_n$:

$z = (z(s_1), \ldots, z(s_n))^T \sim N(0, \Sigma)$, with $\Sigma_{ij} = \exp\{-\|s_i - s_j\|^2\}$,

where $\|s_i - s_j\|$ denotes the distance between locations $s_i$ and $s_j$.

z has density $\pi(z) = (2\pi)^{-n/2}\, |\Sigma|^{-1/2} \exp\{-\tfrac{1}{2} z^T \Sigma^{-1} z\}$.

Higdon, Williams, Gattiker (LANL)


Realizations from $\pi(z) = (2\pi)^{-n/2}\, |\Sigma|^{-1/2} \exp\{-\tfrac{1}{2} z^T \Sigma^{-1} z\}$

[Figure: three panels of realizations, z(s) vs. s]

model for z(s) can be extended to continuous s
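Realizations of this kind can be drawn by multiplying standard normal draws by a Cholesky factor of the covariance. In this sketch the eight unit-spaced locations and the small diagonal jitter are assumptions; the kernel is the squared-exponential form $\exp\{-\|s_i - s_j\|^2\}$.

```python
import numpy as np

def sq_exp_cov(s):
    """Sigma_ij = exp(-|s_i - s_j|^2) for 1-d locations s."""
    d = s[:, None] - s[None, :]
    return np.exp(-d**2)

s = np.arange(8.0)                       # hypothetical unit-spaced locations
Sigma = sq_exp_cov(s)

rng = np.random.default_rng(3)
jitter = 1e-10 * np.eye(len(s))          # assumed small term for numerical stability
chol = np.linalg.cholesky(Sigma + jitter)
z = chol @ rng.standard_normal((len(s), 3))   # three realizations, one per column
```

With unit spacing, neighbouring locations have covariance exp(-1), about 0.3679, and locations three apart have exp(-9), about 0.0001.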


Conditioning on some observations of z(s)

[Figure: z(s) vs. s]

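We observe $z(s_2)$ and $z(s_5)$. What do we now know about $\{z(s_1), z(s_3), z(s_4), z(s_6), z(s_7), z(s_8)\}$?

$\begin{pmatrix} z(s_2) \\ z(s_5) \\ z(s_1) \\ z(s_3) \\ z(s_4) \\ z(s_6) \\ z(s_7) \\ z(s_8) \end{pmatrix} \sim N\left( \begin{pmatrix} 0 \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \end{pmatrix}, \begin{pmatrix} 1 & .0001 & .3679 & \cdots & 0 \\ .0001 & 1 & 0 & \cdots & .0001 \\ .3679 & 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & .0001 & 0 & \cdots & 1 \end{pmatrix} \right)$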



Conditioning on some observations of z(s)

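$\begin{pmatrix} z_1 \\ z_2 \end{pmatrix} \sim N\left( \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{pmatrix} \right), \quad z_2 \mid z_1 \sim N\left(\Sigma_{21}\Sigma_{11}^{-1} z_1,\; \Sigma_{22} - \Sigma_{21}\Sigma_{11}^{-1}\Sigma_{12}\right)$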
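The conditional mean and covariance can be computed directly from the partitioned covariance. In this sketch the observed values and the unit-spaced locations are hypothetical; the indices 1 and 4 correspond to $s_2$ and $s_5$.

```python
import numpy as np

def condition_gaussian(Sigma, obs_idx, z_obs):
    """Mean and covariance of the unobserved components of a zero-mean Gaussian."""
    n = Sigma.shape[0]
    new_idx = np.setdiff1d(np.arange(n), obs_idx)
    S11 = Sigma[np.ix_(obs_idx, obs_idx)]
    S21 = Sigma[np.ix_(new_idx, obs_idx)]
    S22 = Sigma[np.ix_(new_idx, new_idx)]
    A = np.linalg.solve(S11, S21.T).T      # Sigma_21 Sigma_11^{-1} (S11 is symmetric)
    mean = A @ z_obs
    cov = S22 - A @ S21.T                  # Sigma_22 - Sigma_21 Sigma_11^{-1} Sigma_12
    return new_idx, mean, cov

s = np.arange(8.0)                          # hypothetical unit-spaced locations
Sigma = np.exp(-(s[:, None] - s[None, :])**2)
obs_idx = np.array([1, 4])                  # s_2 and s_5 (zero-based indices)
z_obs = np.array([1.0, -0.5])               # hypothetical observed values
new_idx, mean, cov = condition_gaussian(Sigma, obs_idx, z_obs)
```

Conditional realizations then follow by adding draws from N(0, cov) to the conditional mean. Far from the observed points the conditional variance returns to the prior value of 1.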

[Figure: two panels of z(s) vs. s: conditional mean; conditional realizations]



A 2-d example, conditioning on the edge

$\Sigma_{ij} = \exp\{-(\|s_i - s_j\|/5)^2\}$

[Figure: four surface plots of Z over (X, Y): a realization; mean conditional on Y=1 points; two realizations conditional on Y=1 points]



Limitations of Gaussian Processes

[Figure: panels with axes labeled A, alpha, and mode amp.; z(s) vs. s]


EMULATOR


Power spectrum emulator

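Multivariate power spectrum output decomposed into incomplete orthogonal basis (achieves dimension reduction):

$\eta(k, \theta) = \Phi_\eta(k)\, w(\theta) + \epsilon, \quad \epsilon \sim N(0, \lambda_\eta^{-1} I)$

Model basis weights as independent Gaussian Processes

$w(\theta) \sim GP(0, \Sigma_w(\theta; \lambda_w, \rho_w))$

Do MCMC to calibrate GP parameters given the design runs

$P(w_{design} \mid \lambda_\eta, \lambda_w, \rho_w) \propto \left|\lambda_\eta^{-1} I + \Sigma_w\right|^{-1/2} \exp\left\{-\tfrac{1}{2}\, w_{design}^T \left(\lambda_\eta^{-1} I + \Sigma_w\right)^{-1} w_{design}\right\}$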
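The two ingredients above, a truncated orthogonal basis and a marginal likelihood for the design weights, can be sketched as follows. The basis here comes from an SVD of the mean-subtracted outputs, the correlation function is a squared exponential with hypothetical rate parameters `beta_w` standing in for the correlation parameters, and the toy outputs are log power laws; all of these are assumptions.

```python
import numpy as np

def pc_basis(outputs, n_pc):
    """outputs: (n_design, n_k). Returns the mean, an orthonormal basis, and the weights."""
    mean = outputs.mean(axis=0)
    centred = outputs - mean
    _, _, vt = np.linalg.svd(centred, full_matrices=False)
    basis = vt[:n_pc].T                      # (n_k, n_pc), orthonormal columns
    weights = centred @ basis                # (n_design, n_pc)
    return mean, basis, weights

def log_marginal(w_design, theta_design, lam_eta, lam_w, beta_w):
    """log P(w_design | lam_eta, lam_w, beta_w) up to a constant, for one basis weight."""
    diff2 = (theta_design[:, None, :] - theta_design[None, :, :]) ** 2
    Sigma_w = np.exp(-(diff2 * beta_w).sum(axis=-1)) / lam_w
    K = np.eye(len(w_design)) / lam_eta + Sigma_w
    _, logdet = np.linalg.slogdet(K)
    return -0.5 * logdet - 0.5 * w_design @ np.linalg.solve(K, w_design)

# Hypothetical design: 10 points in a 2-parameter unit square, toy log P(k) outputs.
rng = np.random.default_rng(4)
theta_design = rng.uniform(size=(10, 2))
log_k = np.linspace(-3.0, 1.0, 20)
outputs = theta_design[:, :1] + (theta_design[:, 1:] - 2.0) * log_k
mean, basis, weights = pc_basis(outputs, n_pc=2)
ll = log_marginal(weights[:, 0], theta_design, lam_eta=1e4, lam_w=1.0,
                  beta_w=np.array([1.0, 1.0]))
```

An MCMC over the precision and correlation parameters would call `log_marginal` once per basis weight at each step, which is cheap compared with running a new simulation.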


Covariance matrix parameterization

Generalized Cholesky decomposition (Pourahmadi et al. 2007)

$\Sigma_y^{-1}(\theta) = T^T(\theta)\, D^{-1}(\theta)\, T(\theta)$

Components of T are unconstrained:

$\phi_{ij} \equiv -T_{ij}, \quad 2 \le i \le n_y, \; j = 1, \ldots, i-1$

Impose prior structure on covariance with a ($\theta$-independent) conjugate Gaussian prior on $\phi$ (allows shrinking to constant T)

$\phi \sim N(\bar{\phi}, C_\phi)$

Prior mean can be set from sample covariance of design runs

Model $\phi$ as GP just like mean and variance

$\phi_i(\theta) \sim GP(\bar{\phi}_i, \Sigma_\phi(\theta; \lambda_{\phi,i}, \rho_{\phi,i})), \quad i = 1, \ldots, \frac{n_y(n_y - 1)}{2}$

Estimate covariance at each design point simultaneously - fewer realizations needed
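A minimal sketch of this decomposition, assuming T is unit lower triangular and D is diagonal. It shows that any real-valued off-diagonal components, combined with positive diagonal entries of D, give back a valid precision matrix. Parameterizing D through its logarithm is an assumption made here for convenience, and the function names are hypothetical.

```python
import numpy as np

def modified_cholesky(Sigma):
    """Return unit lower-triangular T and vector d with T Sigma T^T = diag(d)."""
    L = np.linalg.cholesky(Sigma)            # Sigma = L L^T
    scale = np.diag(L)
    C = L / scale                            # divide column j by L[j, j]: unit diagonal
    T = np.linalg.inv(C)
    return T, scale**2

def precision_from_unconstrained(phi, log_d):
    """Rebuild Sigma^{-1} = T^T D^{-1} T from unconstrained phi and log-variances."""
    n = len(log_d)
    T = np.eye(n)
    T[np.tril_indices(n, k=-1)] = -phi       # phi_ij = -T_ij below the diagonal
    return T.T @ np.diag(np.exp(-log_d)) @ T

# Hypothetical 3x3 covariance matrix.
Sigma = np.array([[2.0, 0.6, 0.2],
                  [0.6, 1.5, 0.4],
                  [0.2, 0.4, 1.0]])
T, d = modified_cholesky(Sigma)
phi = -T[np.tril_indices(3, k=-1)]
precision = precision_from_unconstrained(phi, np.log(d))
```

Because phi can take any real values, each component can be given a Gaussian prior or modeled as a GP without any positive-definiteness constraint.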


Simplified emulator

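Simulation outputs reduced to mean and covariance estimates at each design point, $\hat{\mu}, \hat{D}$

Approximation: neglect error in sample mean and covariance

Model variance as a GP just like the mean

Sampling model for the data:

$y \mid w(\theta), v(\theta) \sim N\left(\Phi_\eta\, w(\theta),\; \Sigma_y(\Phi_D\, v(\theta))\right)$

The joint likelihood for parameter estimation breaks into:

$L(y, \hat{\mu}, \hat{D} \mid \theta_0, \lambda_\eta, \lambda, \rho) = \int d^{p_D}v \; L(\hat{w}_y, \hat{w} \mid v, \theta_0, \lambda_\eta, \lambda_w, \rho_w)\; \pi(\hat{v}, v \mid \theta_0, \lambda_v, \rho_v)$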


Validation: toy power-law model

$P(k) = A k^{\alpha}$

$\mathrm{var}(P(k)) \propto P^2(k)$

Covariance is diagonal

Assume the same number of modes are used to estimate P(k) in each band

This gives more noticeable differences in posteriors for later validation tests

[Figure: log(P(k)) vs. log(k). Black: N-body; Red: model; Blue: mock data]
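Mock data for a toy model of this kind could be generated as in the sketch below. The proportionality constant 2 / n_modes (the usual Gaussian mode-counting value), the parameter values, and the k range are assumptions; the slide only states that the variance scales as P^2 with the same number of modes in each band.

```python
import numpy as np

def power_law(k, amplitude, slope):
    """Toy model P(k) = A k^alpha."""
    return amplitude * k**slope

def mock_band_powers(k, amplitude, slope, n_modes, rng):
    """Band powers with diagonal covariance, var(P) = 2 P^2 / n_modes (assumed constant)."""
    p = power_law(k, amplitude, slope)
    sigma = p * np.sqrt(2.0 / n_modes)
    return p + sigma * rng.standard_normal(k.size), sigma

rng = np.random.default_rng(5)
k = np.logspace(-2.0, 0.0, 12)              # hypothetical band centres
data, sigma = mock_band_powers(k, amplitude=2.0, slope=-1.5, n_modes=50, rng=rng)
```

With the same number of modes in every band, the fractional error is constant across k, so the variance carries information about the amplitude and slope as well as the mean.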


Emulator correlations

[Figure: marginal posterior samples given design runs; rows PC1 to PC5, columns amplitude and slope, horizontal axis from 0 to 1]


Parameter posteriors

Marginal distributions for the 2 cosmological parameters

[Figure: density vs. scaled model parameters; panels for amplitude and slope with the 7 pt. design, the 30 pt. design, and the 30 pt. design with sample cov.]


Variance parameters

Marginal posterior distributions of PC weights for the power spectrum variance

[Figure: density vs. PC weights of variance; panels PC weight 1 and PC weight 2]


Summary

Our method uses limited numbers of simulations to calibrate a model for the power spectrum sample variance distribution.

Obtaining precise estimates of the power spectrum covariance is a challenge - full formulation may make this feasible

Our framework can be readily applied to general parameter inference problems using simulations

Plan to release an R package implementing these methods

Next: demonstrate covariance matrix emulator using N-body simulations of the matter power spectrum


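Gaussian process model formulation for the mean power spectrum

Principal component weights of mean are modeled as independent Gaussian processes:

$\eta(k, \theta) = \sum_{i=1}^{p} \phi_{\eta,i}(k)\, w_i(\theta) + \epsilon$

$w_i(\theta) \sim GP(0, \Sigma_w(\theta; \lambda_w, \rho_w))$

Design outputs also have Gaussian sampling model (from error term $\epsilon$)

$\eta \mid w, \lambda_\eta \sim N(\Phi_\eta w, \lambda_\eta^{-1} I), \quad \lambda_\eta \sim \Gamma(a, b)$

After marginalization over GP realizations: complicated Normal distribution, modified Gamma prior

Emulator outputs at new design points can be drawn from:

$(\hat{w}, w(\theta^*)) \sim N(0, \Sigma_{\hat{w}, w(\theta^*)}(\lambda_w, \rho_w))$ (draws from posterior)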
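Drawing a weight at a new design point amounts to conditioning this joint Normal on the design weights. The sketch below does this for a single weight with fixed hyperparameters; a full analysis would repeat it for each posterior draw of the hyperparameters. The squared-exponential kernel with rate `beta_w`, the one-parameter design, and the sine-shaped weights are assumptions.

```python
import numpy as np

def kernel(a, b, lam_w, beta_w):
    """Assumed squared-exponential covariance between parameter sets a and b."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2 * beta_w).sum(axis=-1)
    return np.exp(-d2) / lam_w

def predict_weight(theta_new, theta_design, w_design, lam_eta, lam_w, beta_w):
    """Conditional mean and variance of w(theta_new) given noisy design weights."""
    n = len(w_design)
    K = kernel(theta_design, theta_design, lam_w, beta_w) + np.eye(n) / lam_eta
    k_star = kernel(theta_new[None, :], theta_design, lam_w, beta_w)[0]
    alpha = np.linalg.solve(K, k_star)
    mean = alpha @ w_design
    var = 1.0 / lam_w - alpha @ k_star
    return mean, var

# Hypothetical 7-point design in one scaled parameter.
theta_design = np.linspace(0.0, 1.0, 7)[:, None]
w_design = np.sin(2.0 * np.pi * theta_design[:, 0])
hyper = dict(lam_eta=1e6, lam_w=1.0, beta_w=np.array([50.0]))

mean_at_design, var_at_design = predict_weight(theta_design[3], theta_design, w_design, **hyper)
mean_far, var_far = predict_weight(np.array([10.0]), theta_design, w_design, **hyper)
rng = np.random.default_rng(6)
draw = mean_far + np.sqrt(var_far) * rng.standard_normal()
```

At a design point the prediction reproduces the design weight with near-zero variance, while far from the design it reverts to the zero prior mean with the prior variance 1 / lam_w.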
