Durham 102208


  • 8/14/2019 Durham 102208

    1/35

    SIMULATIONS AND COSMOLOGICAL INFERENCE

    Michael D. Schneider

    Durham

    In collaboration with Lloyd Knox (UC Davis), Salman Habib, Katrin Heitmann, David Higdon (Los Alamos National Laboratory), Charles Nakhleh (Sandia National Laboratories)

    October 22, 2008

  • 8/14/2019 Durham 102208

    2/35

    Overview

    Question: How do we estimate cosmological parameters when theoretical models are only known via forward simulation?

    Answer: Use a statistical model to interpolate the outputs of selected simulation runs.

    1. Simulation design

    2. Emulator

    Simultaneously learn the error distribution for the data.

    Applicable to CMB, galaxy, and weak lensing surveys (or really anywhere that uses simulations for parameter inference).

    arXiv:0806.1487

  • 8/14/2019 Durham 102208

    3/35

    Technical motivation: simulations are costly!

    Most astrophysical systems can only be modeled with numerical simulations

    Even when the physics is easily understood, accurate noise modeling can require large simulations (e.g. the CMB)

    Constraining dark energy via BAO and cosmic shear provides formidable computational challenges in predicting both the model and the error distributions

  • 8/14/2019 Durham 102208

    4/35

    Parameter estimation requires many simulations

    Use Monte Carlo algorithms to integrate the joint probability distribution of the data and model:

    P(model | data) = P(model, data) / P(data)

    Requires many calculations of the model at different parameter settings (~10,000 evaluations for ~5 parameters)

    This is computationally prohibitive for many applications
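    To make the evaluation count concrete, here is a minimal random-walk Metropolis sketch. It is not the sampler used in this work; the target `toy_log_post`, the step size, and the chain length are illustrative assumptions. The point is only that every step costs one model evaluation, which would be one simulation if no emulator were available.

```python
import numpy as np

def metropolis(log_post, theta0, step, n_steps, rng):
    """Random-walk Metropolis. Returns the chain and the number of log_post calls."""
    theta = np.asarray(theta0, dtype=float)
    lp = log_post(theta)
    n_calls = 1
    chain = np.empty((n_steps, theta.size))
    for i in range(n_steps):
        proposal = theta + step * rng.standard_normal(theta.size)
        lp_new = log_post(proposal)  # one model evaluation per step
        n_calls += 1
        # Accept with probability min(1, exp(lp_new - lp))
        if np.log(rng.uniform()) < lp_new - lp:
            theta, lp = proposal, lp_new
        chain[i] = theta
    return chain, n_calls

def toy_log_post(theta):
    # Hypothetical stand-in for a simulation-based posterior: unit Gaussian
    return -0.5 * np.sum(theta**2)

rng = np.random.default_rng(0)
# 5 parameters and 10,000 steps, matching the orders of magnitude quoted above
chain, n_calls = metropolis(toy_log_post, np.zeros(5), 0.5, 10_000, rng)
```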

  • 8/14/2019 Durham 102208

    5/35

    Likelihood model

    Multivariate Gaussian model for the likelihood:

    $-2 \log P(x \mid \theta) = (x - \bar{x}(\theta))^T C^{-1}(\theta) (x - \bar{x}(\theta)) + \log \det C(\theta)$

    $x \equiv$ data, $\theta \equiv$ model parameters

    For galaxy surveys or CMB, data = power spectrum

    Model dependence of the covariance is usually neglected

    Framework is identical for N-point correlations

    Gaussian distribution can be extended using mixture models
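    The Gaussian likelihood with a parameter-dependent covariance can be sketched as follows. The callables `mean_fn` and `cov_fn` and the toy numbers are hypothetical placeholders for a model mean and covariance; the function returns -2 log P up to an additive constant.

```python
import numpy as np

def neg2_log_like(x, mean_fn, cov_fn, theta):
    """-2 log P(x | theta) for a multivariate Gaussian, up to a constant.

    Both the mean and the covariance are allowed to depend on theta.
    """
    r = x - mean_fn(theta)
    C = cov_fn(theta)
    # slogdet is numerically safer than log(det(C))
    _, logdet = np.linalg.slogdet(C)
    return r @ np.linalg.solve(C, r) + logdet

# Toy model (assumption): constant mean theta[0], covariance theta[1] * identity
mean_fn = lambda theta: theta[0] * np.ones(3)
cov_fn = lambda theta: theta[1] * np.eye(3)

x = np.array([1.0, 2.0, 3.0])
value = neg2_log_like(x, mean_fn, cov_fn, np.array([2.0, 0.5]))
```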

  • 8/14/2019 Durham 102208

    6/35

    EXAMPLE: NONLINEAR MATTER POWER SPECTRUM

  • 8/14/2019 Durham 102208

    7/35

    Non-Gaussian errors in the cosmic shear power spectrum

    Fisher matrix constraints from Halo Model calculation of power spectrum covariance (Cooray & Hu (2000))

    Non-Gaussian effects can dominate at scales < 10 arcmin (even when apparently shape-noise dominated) (Semboloni et al. (2006))

    Full-sky weak lensing survey (limiting mag in R ~ 25)

  • 8/14/2019 Durham 102208

    8/35

    Clusters + weak lensing

    Takada & Bridle (2007)

    Consider cross-covariance between cluster number counts and cosmic shear power spectrum

  • 8/14/2019 Durham 102208

    9/35

    Power spectrum covariance from N-body simulations

    32 realizations of N-body cube 450 Mpc/h on a side

    Chop into 64 sub-cubes

    Window has large impact on covariance

    Not explained by simple convolution with the power spectrum

    [Figure, three panels vs. k [h/Mpc]: normalized variance (Gaussian, 450 Mpc/h periodic box, 112.5 Mpc/h windowed box); mean power spectra (450 Mpc/h periodic box, 112.5 Mpc/h windowed box); correlation coefficients (450 Mpc/h periodic box, 112.5 Mpc/h windowed box)]
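    The three quantities plotted for this slide (mean power, normalized variance, correlation coefficients) can be estimated from a stack of band-power measurements as sketched below. The synthetic `p_hat` array is an assumption standing in for real sub-cube measurements: 32 x 64 rows with a made-up 10% uncorrelated scatter.

```python
import numpy as np

def band_power_statistics(p_hat):
    """Summary statistics from p_hat with shape (n_realizations, n_bands)."""
    mean = p_hat.mean(axis=0)
    cov = np.cov(p_hat, rowvar=False)      # unbiased sample covariance
    norm_var = np.diag(cov) / mean**2      # variance in units of P(k)^2
    sigma = np.sqrt(np.diag(cov))
    corr = cov / np.outer(sigma, sigma)    # correlation coefficients
    return mean, cov, norm_var, corr

rng = np.random.default_rng(1)
n_real, n_bands = 32 * 64, 8               # 32 boxes, 64 sub-cubes each
true_mean = np.linspace(100.0, 20000.0, n_bands)  # hypothetical band powers
# Synthetic measurements with 10% independent Gaussian scatter (assumption)
p_hat = true_mean * (1.0 + 0.1 * rng.standard_normal((n_real, n_bands)))

mean, cov, norm_var, corr = band_power_statistics(p_hat)
```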

  • 8/14/2019 Durham 102208

    10/35

    Parameter dependence of the power spectrum covariance

    [Figure, two panels vs. k [h/Mpc]: normalized variance of the power spectrum for Gaussian, HM, PT, and sim., each at σ8 = 0.6 and σ8 = 1; correlation coefficients (halo model)]

  • 8/14/2019 Durham 102208

    11/35

    Parameterization of the power spectrum error distribution

    Multivariate Normal distribution:

    $\hat{P}(k) \sim N(\mu(\theta), \Sigma(\theta))$

    Consider shell-averaged estimates of power spectrum bands

    Central limit theorem guarantees a Gaussian distribution for band powers except for a few k-bins on the largest scales of the survey

    Correlations in power spectrum captured in this model

  • 8/14/2019 Durham 102208

    12/35

    SIMULATION DESIGN

  • 8/14/2019 Durham 102208

    13/35

    Choosing which simulations to run

    Orthogonal Array Latin Hypercube

    Specify hypercube parameter bounds (rescaled to unit interval)

    Latin square: one point per row and column

    Orthogonal array: each quadrant has a sample

    Optimize with distance criterion

    [Figure: Simulation design (OALH); axes: parameter 1, parameter 2, each from 0.0 to 1.0]
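    A simplified sketch of the design idea follows. It builds a plain Latin hypercube on the unit cube and keeps the candidate with the largest minimum pairwise distance. This is an assumption-laden stand-in: it omits the orthogonal-array constraint, and the random-candidate search and the function names are illustrative, not the optimizer used for the actual designs.

```python
import numpy as np

def latin_hypercube(n, d, rng):
    """n points in [0, 1)^d with exactly one point per stratum in each dimension."""
    perms = np.column_stack([rng.permutation(n) for _ in range(d)])
    return (perms + rng.uniform(size=(n, d))) / n

def min_pairwise_distance(x):
    """Smallest Euclidean distance between any two design points."""
    diff = x[:, None, :] - x[None, :, :]
    dist = np.sqrt((diff**2).sum(axis=-1))
    return dist[np.triu_indices(len(x), k=1)].min()

def maximin_lhs(n, d, n_candidates, rng):
    """Pick the candidate Latin hypercube with the largest minimum distance."""
    candidates = (latin_hypercube(n, d, rng) for _ in range(n_candidates))
    return max(candidates, key=min_pairwise_distance)

rng = np.random.default_rng(3)
design = maximin_lhs(n=7, d=2, n_candidates=200, rng=rng)
```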

  • 8/14/2019 Durham 102208

    16/35

    GAUSSIAN PROCESS MODELS FOR INTERPOLATION

  • 8/14/2019 Durham 102208

    17/35

    How to do interpolation in high dimensions

    We need to interpolate multivariate simulation output as a function of large (~10) numbers of parameters

    Power spectrum mean and covariance components modeled as Gaussian processes (GPs) (following Habib et al. 2007)

    Interpolation error propagated within Bayesian framework

    GP determined by correlation parameters for the interpolated surface

    GPs scale well for interpolation in high dimensions

  • 8/14/2019 Durham 102208

    18/35

    Gaussian process models for spatial phenomena

    [Figure: an example of z(s) of a Gaussian process model on $s_1, \ldots, s_n$; axes: s, z(s)]

    $z = (z(s_1), \ldots, z(s_n))^T \sim N(0, \Sigma)$, with $\Sigma_{ij} = \exp\{-\|s_i - s_j\|^2\}$,

    where $\|s_i - s_j\|$ denotes the distance between locations $s_i$ and $s_j$.

    z has density $\pi(z) = (2\pi)^{-n/2} |\Sigma|^{-1/2} \exp\{-\frac{1}{2} z^T \Sigma^{-1} z\}$.

    Higdon, Williams, Gattiker (LANL)
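    As a small illustration of this definition, the sketch below builds the covariance matrix for eight locations and draws three realizations of z. The unit spacing of the locations, the jitter added before the Cholesky factorization, and the helper name `sq_exp_cov` are assumptions made for the example.

```python
import numpy as np

def sq_exp_cov(s, length=1.0):
    """Sigma_ij = exp(-(|s_i - s_j| / length)^2) for 1-d locations s."""
    d = s[:, None] - s[None, :]
    return np.exp(-(d / length) ** 2)

s = np.arange(8.0)                 # eight unit-spaced locations (assumption)
Sigma = sq_exp_cov(s)

rng = np.random.default_rng(2)
# A tiny jitter keeps the Cholesky factorization numerically stable
L = np.linalg.cholesky(Sigma + 1e-10 * np.eye(len(s)))
# Each column is one realization z ~ N(0, Sigma)
z = L @ rng.standard_normal((len(s), 3))
```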

  • 8/14/2019 Durham 102208

    19/35

    Realizations from $\pi(z) = (2\pi)^{-n/2} |\Sigma|^{-1/2} \exp\{-\frac{1}{2} z^T \Sigma^{-1} z\}$

    [Figure: three realizations of z(s) vs. s]

    model for z(s) can be extended to continuous s

    Higdon, Williams, Gattiker (LANL)

  • 8/14/2019 Durham 102208

    20/35

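    Conditioning on some observations of z(s)

    [Figure: z(s) vs. s]

    We observe $z(s_2)$ and $z(s_5)$. What do we now know about $\{z(s_1), z(s_3), z(s_4), z(s_6), z(s_7), z(s_8)\}$?

    $(z(s_2), z(s_5), z(s_1), z(s_3), z(s_4), z(s_6), z(s_7), z(s_8))^T \sim N(0, \Sigma)$, with

    $\Sigma = \begin{pmatrix} 1 & .0001 & .3679 & \cdots & 0 \\ .0001 & 1 & 0 & \cdots & .0001 \\ .3679 & 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & .0001 & 0 & \cdots & 1 \end{pmatrix}$

    Higdon, Williams, Gattiker (LANL)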

  • 8/14/2019 Durham 102208

    21/35

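    Conditioning on some observations of z(s)

    $\begin{pmatrix} z_1 \\ z_2 \end{pmatrix} \sim N\left( \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{pmatrix} \right)$, $z_2 \mid z_1 \sim N(\Sigma_{21} \Sigma_{11}^{-1} z_1, \; \Sigma_{22} - \Sigma_{21} \Sigma_{11}^{-1} \Sigma_{12})$

    [Figure, two panels of z(s) vs. s: conditional mean; conditional realizations]

    Higdon, Williams, Gattiker (LANL)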
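    The conditional mean and covariance of a zero-mean Gaussian vector given some of its entries can be computed directly from the partitioned covariance. In the sketch below the unit-spaced locations, the observed values `z_obs`, and the function name are illustrative assumptions; the observed entries correspond to s_2 and s_5 (indices 1 and 4).

```python
import numpy as np

def condition_gaussian(Sigma, obs_idx, z_obs):
    """Mean and covariance of the unobserved entries of z ~ N(0, Sigma),
    given z[obs_idx] = z_obs."""
    n = Sigma.shape[0]
    new_idx = np.setdiff1d(np.arange(n), obs_idx)
    S11 = Sigma[np.ix_(obs_idx, obs_idx)]   # observed block
    S21 = Sigma[np.ix_(new_idx, obs_idx)]   # cross block
    S22 = Sigma[np.ix_(new_idx, new_idx)]   # unobserved block
    A = np.linalg.solve(S11, S21.T).T       # Sigma21 Sigma11^{-1}
    mean = A @ z_obs
    cov = S22 - A @ S21.T                   # Sigma22 - Sigma21 Sigma11^{-1} Sigma12
    return new_idx, mean, cov

s = np.arange(8.0)                          # unit-spaced locations (assumption)
Sigma = np.exp(-(s[:, None] - s[None, :]) ** 2)
obs_idx = np.array([1, 4])                  # s_2 and s_5
z_obs = np.array([1.0, -0.5])               # hypothetical observed values

new_idx, cond_mean, cond_cov = condition_gaussian(Sigma, obs_idx, z_obs)
```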

  • 8/14/2019 Durham 102208

    22/35

    A 2-d example, conditioning on the edge

    $\Sigma_{ij} = \exp\{-(\|s_i - s_j\|/5)^2\}$

    [Figure, four surface plots of Z over (X, Y): a realization; mean conditional on Y=1 points; realization conditional on Y=1 points (two panels)]

    Higdon, Williams, Gattiker (LANL)

  • 8/14/2019 Durham 102208

    23/35

    Limitations of Gaussian Processes

    [Figure: panels with axis labels A, alpha, mode amp.; s, z(s)]

  • 8/14/2019 Durham 102208

    24/35

    EMULATOR

  • 8/14/2019 Durham 102208

    25/35

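    Power spectrum emulator

    Multivariate power spectrum output decomposed into an incomplete orthogonal basis (achieves dimension reduction):

    $\mu(k, \theta) = \Phi_\mu(k) w(\theta) + \epsilon$, $\epsilon \sim N(0, \lambda_\epsilon^{-1})$

    Model basis weights as independent Gaussian processes:

    $w(\theta) \sim GP(0, \Sigma_w(\theta; \lambda_w, \rho_w))$

    Do MCMC to calibrate GP parameters given the design runs:

    $P(w_{design} \mid \lambda_\epsilon, \lambda_w, \rho_w) \propto |\lambda_\epsilon^{-1} I + \Sigma_w|^{-1/2} \exp\{-\frac{1}{2} w_{design}^T (\lambda_\epsilon^{-1} I + \Sigma_w)^{-1} w_{design}\}$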
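    The basis decomposition step can be sketched with a singular value decomposition of the design outputs. This shows only the dimension reduction, not the GP modeling of the weights; the power-law toy outputs, the design ranges, and the name `pc_basis` are assumptions for illustration.

```python
import numpy as np

def pc_basis(outputs, p):
    """Truncated principal-component basis for simulation outputs.

    outputs has shape (n_k, n_design), one column per design point.
    Returns the mean over the design, the basis Phi (n_k, p), and the
    weights W (p, n_design).
    """
    mean = outputs.mean(axis=1, keepdims=True)
    U, _, _ = np.linalg.svd(outputs - mean, full_matrices=False)
    Phi = U[:, :p]                     # orthonormal columns
    W = Phi.T @ (outputs - mean)       # projection onto the basis
    return mean, Phi, W

rng = np.random.default_rng(4)
log_k = np.linspace(-3.0, 1.0, 30)
# Hypothetical 7-point design over log-amplitude and slope
log_amp = rng.uniform(3.0, 5.0, size=7)
slope = rng.uniform(-2.0, -1.0, size=7)
# Toy outputs: log P(k) for a power law, one column per design point
outputs = log_amp[None, :] + slope[None, :] * log_k[:, None]

mean, Phi, W = pc_basis(outputs, p=2)
reconstruction = mean + Phi @ W
```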

  • 8/14/2019 Durham 102208

    27/35

    Covariance matrix parameterization

    Generalized Cholesky decomposition (Pourahmadi et al. 2007):

    $\Sigma_y^{-1}(\theta) = T^T(\theta) D^{-1}(\theta) T(\theta)$

    Components of T are unconstrained:

    $\phi_{ij} \equiv -T_{ij}$, $2 \le i \le n_y$, $j = 1, \ldots, i - 1$

    Impose prior structure on covariance with a ($\theta$-independent) conjugate Gaussian prior on $\phi$ (allows shrinking to constant T):

    $\phi \sim N(\bar{\phi}, C)$

    Prior mean can be set from sample covariance of design runs

    Model as GP just like mean and variance:

    $\phi_i(\theta) \sim GP(\bar{\phi}_i, \Sigma_\phi(\theta; \lambda_{\phi,i}, \rho_{\phi,i}))$, $i = 1, \ldots, n_y(n_y - 1)/2$

    Estimate covariance at each design point simultaneously; fewer realizations needed
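    A sketch of this parameterization, assuming the standard modified Cholesky form in which T is unit lower-triangular and D is diagonal. The function names and the random test matrix are illustrative. The second function shows why the off-diagonal entries are called unconstrained: any real values give a valid covariance as long as the entries of D are positive.

```python
import numpy as np

def modified_cholesky(Sigma):
    """Return unit lower-triangular T and diagonal d with T Sigma T^T = diag(d),
    so that Sigma^{-1} = T^T diag(1/d) T."""
    L = np.linalg.cholesky(Sigma)
    d = np.diag(L) ** 2
    C = L / np.diag(L)                 # scale column j by 1 / L[j, j]
    T = np.linalg.inv(C)
    return T, d

def covariance_from_phi(phi, d):
    """Rebuild Sigma from the n(n-1)/2 free entries phi = -T_ij and positive d."""
    n = len(d)
    T = np.eye(n)
    T[np.tril_indices(n, k=-1)] = -np.asarray(phi)
    T_inv = np.linalg.inv(T)
    return T_inv @ np.diag(d) @ T_inv.T

rng = np.random.default_rng(5)
A = rng.standard_normal((4, 4))
Sigma = A @ A.T + 4.0 * np.eye(4)      # hypothetical positive-definite covariance

T, d = modified_cholesky(Sigma)
phi = -T[np.tril_indices(4, k=-1)]     # the unconstrained components
Sigma_rebuilt = covariance_from_phi(phi, d)
# Arbitrary phi still yields a positive-definite matrix
Sigma_random = covariance_from_phi(rng.standard_normal(6), np.ones(4))
```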

  • 8/14/2019 Durham 102208

    28/35

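    Simplified emulator

    Simulation outputs reduced to mean and covariance estimates at each design point, $\hat{\mu}$, $\hat{D}$

    Approximation: neglect error in sample mean and covariance

    Model variance as a GP just like the mean

    Sampling model for the data:

    $y \mid w(\theta), v(\theta) \sim N(\Phi_\mu w(\theta), \Sigma_y(\Phi_D v(\theta)))$

    The joint likelihood for parameter estimation breaks into:

    $L(y, \hat{\mu}, \hat{D} \mid \theta_0, \lambda_\epsilon, \lambda, \rho) = \int d^{p_D}v \, L(w_y, \hat{w} \mid v, \theta_0, \lambda_\epsilon, \lambda_w, \rho_w) \, \pi(v, \hat{v} \mid \theta_0, \lambda_v, \rho_v)$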

  • 8/14/2019 Durham 102208

    29/35

    Validation: toy power-law model

    $P(k) = A k^\alpha$

    Covariance is diagonal: $\mathrm{var}(P(k)) \propto P^2(k)$

    Assume the same number of modes are used to estimate P(k) in each band

    This gives more noticeable differences in posteriors for later validation tests

    [Figure: log(P(k)) vs. log(k). Black: N-body; Red: model; Blue: mock data]
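    Mock data for a toy model of this kind can be generated as sketched below: a power law in k with independent Gaussian scatter whose variance is proportional to P(k)^2. The amplitude, slope, fractional error, and k range are made-up values for illustration, not the settings used in the validation runs.

```python
import numpy as np

def power_law(k, amplitude, slope):
    """Toy model P(k) = amplitude * k**slope."""
    return amplitude * k ** slope

def mock_band_powers(k, amplitude, slope, frac_err, rng):
    """Draw band powers with diagonal covariance var(P) = (frac_err * P)^2."""
    p = power_law(k, amplitude, slope)
    return p + frac_err * p * rng.standard_normal(k.size)

rng = np.random.default_rng(6)
k = np.logspace(-1.3, 0.4, 30)          # hypothetical band centers
amplitude, slope, frac_err = 2.0, -1.5, 0.1

model = power_law(k, amplitude, slope)
mock = mock_band_powers(k, amplitude, slope, frac_err, rng)
# Many draws, to check that the scatter follows var(P) proportional to P^2
draws = np.array([mock_band_powers(k, amplitude, slope, frac_err, rng)
                  for _ in range(4000)])
```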

  • 8/14/2019 Durham 102208

    30/35

    Emulator correlations

    [Figure: marginal posterior samples given design runs; rows PC1 to PC5, columns amplitude and slope, each on a 0.0 to 1.0 axis]

  • 8/14/2019 Durham 102208

    31/35

    Parameter posteriors

    Marginal distributions for the 2 cosmological parameters

    [Figure: density vs. scaled model parameters for amplitude and slope; panels: 30 pt. design: sample cov.; 7 pt. design; 30 pt. design]

  • 8/14/2019 Durham 102208

    32/35

    Variance parameters

    Marginal posterior distributions of PC weights for the power spectrum variance

    [Figure: density vs. PC weights of variance; panels: PC weight 1, PC weight 2]

  • 8/14/2019 Durham 102208

    33/35

    Summary

    Our method uses limited numbers of simulations to calibrate a model for the power spectrum sample variance distribution.

    Obtaining precise estimates of the power spectrum covariance is a challenge; full formulation may make this feasible

    Our framework can be readily applied to general parameter inference problems using simulations

    Plan to release an R package implementing these methods

    Next: demonstrate covariance matrix emulator using N-body simulations of the matter power spectrum

  • 8/14/2019 Durham 102208

    35/35

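    Gaussian process model formulation for the mean power spectrum

    Principal component weights of the mean are modeled as independent Gaussian processes:

    $\mu(k, \theta) = \sum_{i=1}^{p_\mu} \phi_{\mu,i}(k) w_i(\theta) + \epsilon$

    $w_i(\theta) \sim GP(0, \Sigma_w(\theta; \lambda_w, \rho_w))$

    Design outputs also have Gaussian sampling model (from error term $\epsilon$):

    $\mu \mid w, \lambda_\epsilon \sim N(\Phi_\mu w, \lambda_\epsilon^{-1} I)$, $\lambda_\epsilon \sim \Gamma(a, b)$

    After marginalization over GP realizations: complicated Normal distribution, modified Gamma prior

    Emulator outputs at new design points can be drawn from:

    $(\hat{w}, w(\theta^*)) \sim N(0, \Sigma_{\hat{w}, w(\theta^*)}(\lambda_w, \rho_w))$, with $\lambda_w$, $\rho_w$ taken as draws from the posterior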