Tutorial on Bayesian Techniques for Inference

A. Asensio Ramos, Instituto de Astrofísica de Canarias

Transcript of Tutorial on Bayesian Techniques for Inference

Page 1: Tutorial  on Bayesian Techniques for Inference

Tutorial on Bayesian Techniques for Inference

A. Asensio Ramos, Instituto de Astrofísica de Canarias

Page 2: Tutorial  on Bayesian Techniques for Inference

Outline

• General introduction

• The Bayesian approach to inference

• Examples

• Conclusions

Page 3: Tutorial  on Bayesian Techniques for Inference

The Big Picture

[Diagram] A testable hypothesis (theory) leads, by deductive inference, to predictions; observations/data feed statistical inference (hypothesis testing, parameter estimation), which in turn updates the theory.

Page 4: Tutorial  on Bayesian Techniques for Inference

The Big Picture

Available information is always incomplete

Our knowledge of nature is necessarily probabilistic

Cox & Jaynes demonstrated that probability calculusfulfilling the rules

can be used to do statistical inference
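The rules in question are the sum and product rules (a standard statement, restored here because the slide's equations did not survive the transcript):

$$p(A|I) + p(\bar{A}|I) = 1 \quad \text{(sum rule)}, \qquad p(A,B|I) = p(A|B,I)\,p(B|I) \quad \text{(product rule)}$$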

Page 5: Tutorial  on Bayesian Techniques for Inference

Probabilistic inference

H1, H2, H3, …, Hn are hypotheses that we want to test.

The Bayesian way is to estimate the p(Hi|…) and select among the hypotheses by comparing their probabilities.

But… what are the p(Hi|…)?

Page 6: Tutorial  on Bayesian Techniques for Inference

What is probability? (Frequentist)

In the frequentist approach, probability describes “randomness”.

If we carry out the experiment many times, what is the distribution of events? The frequentist p(x) is the histogram of the random variable x.

Page 7: Tutorial  on Bayesian Techniques for Inference

What is probability? (Bayesian)

In the Bayesian approach, probability describes “uncertainty”.

p(x) describes how the probability is distributed among the possible choices of x.

[Plot: a distribution p(x), with the single value we actually observe marked on it.]

Everything can be a random variable as we will see later

Page 8: Tutorial  on Bayesian Techniques for Inference

Bayes theorem

It is trivially derived from the product rule:

$$p(H_i|D,I) = \frac{p(D|H_i,I)\,p(H_i|I)}{p(D|I)}$$

• Hi: proposition asserting the truth of a hypothesis
• I: proposition representing prior information
• D: proposition representing the data

Page 9: Tutorial  on Bayesian Techniques for Inference

Bayes theorem - Example

• Model M1 predicts a star at d = 100 ly
• Model M2 predicts a star at d = 200 ly
• The measurement uncertainty is Gaussian with σ = 40 ly
• The measured distance is d = 120 ly

Likelihood: $p(D|M_i,I) = \dfrac{1}{\sqrt{2\pi}\,\sigma}\exp\left[-\dfrac{(D-d_i)^2}{2\sigma^2}\right]$

Posteriors: $p(M_i|D,I) \propto p(M_i|I)\,p(D|M_i,I)$
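A minimal numerical sketch of this example in Python (the numbers are from the slide; the equal model priors are an assumption made explicit here):

```python
from math import exp, pi, sqrt

# Slide's numbers: measured distance d = 120 ly, Gaussian uncertainty sigma = 40 ly
d, sigma = 120.0, 40.0

def likelihood(mu):
    """Gaussian likelihood p(D|M) for a model predicting distance mu."""
    return exp(-0.5 * ((d - mu) / sigma) ** 2) / (sqrt(2.0 * pi) * sigma)

L1 = likelihood(100.0)  # model M1: star at 100 ly
L2 = likelihood(200.0)  # model M2: star at 200 ly

# Assuming equal priors p(M1|I) = p(M2|I) = 1/2, the evidence p(D|I)
# cancels and the posteriors are just the normalized likelihoods.
p1 = L1 / (L1 + L2)
print(f"p(M1|D) = {p1:.3f}, p(M2|D) = {1 - p1:.3f}")  # M1 is clearly favoured
```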

Page 10: Tutorial  on Bayesian Techniques for Inference

Bayes theorem – Another example

1.4% false-negative rate (98.6% reliability)

2.3% false-positive rate

Page 11: Tutorial  on Bayesian Techniques for Inference

Bayes theorem – Another example

H: you have the disease
H̄: you don’t have the disease
D1: your test is positive

You take the test and you get it positive. What is the probability that you have the disease if the incidence is 1:10000?

Page 12: Tutorial  on Bayesian Techniques for Inference

Bayes theorem – Another example

$$p(H|D_1,I) = \frac{p(D_1|H,I)\,p(H|I)}{p(D_1|H,I)\,p(H|I) + p(D_1|\bar{H},I)\,p(\bar{H}|I)} = \frac{0.986 \times 10^{-4}}{0.986 \times 10^{-4} + 0.023 \times 0.9999} \approx 0.4\%$$

Even with a positive test, the probability of having the disease stays below half a percent, because the disease is so rare that false positives dominate.

Page 13: Tutorial  on Bayesian Techniques for Inference

What is usually known as inversion

One proposes a model to explain the observations. All inversion methods work by adjusting the parameters of the model with the aim of minimizing a merit function that compares the observations with the synthesis from the model.

The least-squares (maximum-likelihood) solution is taken as the solution to the inversion problem.
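As a concrete illustration, a minimal Python sketch of this scheme (the exponential forward model and synthetic data are hypothetical stand-ins, not the Stokes-profile synthesis of the talk):

```python
import numpy as np
from scipy.optimize import least_squares

rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 50)
sigma_n = 0.05  # noise standard deviation

def synth(theta):
    """Forward model: synthesize observables from the parameters theta."""
    a, b = theta
    return a * np.exp(-b * x)

obs = synth([1.0, 3.0]) + sigma_n * rng.normal(size=x.size)  # noisy synthetic data

def residuals(theta):
    # The merit function is chi^2 = sum(residuals**2); under Gaussian noise,
    # minimizing it is equivalent to maximizing the likelihood.
    return (obs - synth(theta)) / sigma_n

fit = least_squares(residuals, x0=[0.5, 1.0])
print("maximum-likelihood parameters:", fit.x)  # a point estimate, nothing more
```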

Page 14: Tutorial  on Bayesian Techniques for Inference

Defects of standard inversion codes

• Solution is given as a set of model parameters (max. likelihood)

• Not necessarily the optimal solution
• Sensitive to noise

• Error bars or confidence regions are scarce

• Gaussian errors are assumed
• Errors are not easy to propagate
• Ambiguities, degeneracies and correlations are not detected

• Assumptions are not explicit

• Cannot compare models

Page 15: Tutorial  on Bayesian Techniques for Inference

Inversion as a probabilistic inference problem

$$\underbrace{p(\theta|D,I)}_{\text{Posterior}} \;=\; \frac{\overbrace{p(D|\theta,I)}^{\text{Likelihood}}\;\overbrace{p(\theta|I)}^{\text{Prior}}}{\underbrace{p(D|I)}_{\text{Evidence}}}$$

Use Bayes theorem to propagate informationfrom data to our final state of knowledge

Page 16: Tutorial  on Bayesian Techniques for Inference

Priors

Priors contain information about the model parameters that we know before seeing the data.

Typical priors:
• Top-hat function (flat prior): constant between θmin and θmax, zero outside.
• Gaussian prior: we know that some values of θi are more probable than others.

Assuming statistical independence of all the parameters, the total prior can be calculated as the product of the individual priors, as written below.
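In symbols (a reconstruction consistent with the slide's notation, for parameters θ = (θ1, …, θn)):

$$p(\theta|I) = \prod_{i=1}^{n} p(\theta_i|I)$$

with, e.g., $p(\theta_i|I) = (\theta_{\max}-\theta_{\min})^{-1}$ for $\theta_{\min} \le \theta_i \le \theta_{\max}$ (top-hat), or $p(\theta_i|I) \propto \exp\left[-(\theta_i-\mu_i)^2/(2\sigma_i^2)\right]$ (Gaussian).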

Page 17: Tutorial  on Bayesian Techniques for Inference

Likelihood

Assuming normal (Gaussian) noise, the likelihood can be calculated as shown below, where the χ² function is defined as usual.

In this case, the χ² function is specific to the case of Stokes profiles.
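A reconstruction of the two expressions (the Stokes indexing and per-point noise variances σij² follow the standard convention; treat the exact normalization as a sketch):

$$p(D|\theta,I) \propto \exp\left(-\frac{\chi^2}{2}\right), \qquad \chi^2 = \sum_{i=1}^{4}\sum_{j=1}^{N_\lambda} \frac{\left[S_i^{\mathrm{obs}}(\lambda_j) - S_i^{\mathrm{syn}}(\lambda_j;\theta)\right]^2}{\sigma_{ij}^2}$$

where S = (I, Q, U, V) are the four Stokes profiles sampled at Nλ wavelengths.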

Page 18: Tutorial  on Bayesian Techniques for Inference

Visual example of Bayesian inference

Page 19: Tutorial  on Bayesian Techniques for Inference

Advantages of Bayesian approach

• “Best fit” values of the parameters are, e.g., the mode or median of the posterior

• Uncertainties are credible regions of the posterior

• Correlations between the variables of the model are captured

• Generalized error propagation (not only Gaussian, and including correlations)

• Integration over nuisance parameters (marginalization)

Page 20: Tutorial  on Bayesian Techniques for Inference

Bayesian inference – an example

Hinode

Page 21: Tutorial  on Bayesian Techniques for Inference

Beautiful posterior distributions

[Posterior panels] Field strength; field inclination; field azimuth; filling factor.

Page 22: Tutorial  on Bayesian Techniques for Inference

Not so beautiful posterior distributions – degeneracies

Field inclination

Page 23: Tutorial  on Bayesian Techniques for Inference

Inversion with local stray-light – be careful

σi is the variance of the numerator.

But… what happens if we propose a model like that of Orozco Suárez et al. (2007), with a stray-light contamination obtained from a local average of the surrounding pixels (taken from the observations)?

Page 24: Tutorial  on Bayesian Techniques for Inference

Variance becomes dependent on stray-light contamination

It is usual to carry out inversions with a stray-light contamination obtained from a local average of the surrounding pixels.

Page 25: Tutorial  on Bayesian Techniques for Inference

Spatial correlations: use global stray-light

It is usual to carry out inversions with a stray-light contamination obtained from a local average of the surrounding pixels.

If M → ∞, the correlations tend to zero.

Page 26: Tutorial  on Bayesian Techniques for Inference

Spatial correlations

Page 27: Tutorial  on Bayesian Techniques for Inference

Lesson: use global stray-light contamination

Page 28: Tutorial  on Bayesian Techniques for Inference

Recommendation

Use a global stray-light contamination to avoid these problems.

Page 29: Tutorial  on Bayesian Techniques for Inference

But… the most general inversion method is…

Page 30: Tutorial  on Bayesian Techniques for Inference

Model comparison

Choose among the selected models the one that is preferred by the data

Posterior for model Mi

Model likelihood is just the evidence
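In symbols (a reconstruction in the notation used above):

$$p(M_i|D,I) \propto p(M_i|I)\,p(D|M_i,I), \qquad p(D|M_i,I) = \int d\theta\; p(D|\theta,M_i,I)\,p(\theta|M_i,I)$$

The model likelihood p(D|Mi,I) is exactly the evidence that appears as the normalization constant in parameter estimation.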

Page 31: Tutorial  on Bayesian Techniques for Inference

Model comparison (compare evidences)

Page 32: Tutorial  on Bayesian Techniques for Inference

Model comparison – a worked example

H0: a simple Gaussian
H1: two Gaussians of equal width but unknown amplitude ratio

Page 33: Tutorial  on Bayesian Techniques for Inference

H0 : simple Gaussian

H1 : two Gaussians of equal width but unknown amplitude ratio

Model comparison – a worked example

Page 34: Tutorial  on Bayesian Techniques for Inference

Model comparison – a worked example

Page 35: Tutorial  on Bayesian Techniques for Inference

Model comparison – a worked example

Model H1 is 9.2 times more probable
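A brute-force numerical sketch of such an evidence comparison in Python (the synthetic line profile, the fixed centres and widths, and the flat prior ranges are illustrative assumptions, not the talk's dataset; here the data are generated from the simple model, so the evidence penalizes the extra parameter):

```python
import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(-5.0, 5.0, 200)
sigma_n = 0.1  # known Gaussian noise level

def model(A, r=0.0):
    # A Gaussian line at x = 0 plus an optional second component of equal
    # width at x = 2 with amplitude ratio r; r = 0 recovers H0.
    return A * (np.exp(-0.5 * x**2) + r * np.exp(-0.5 * (x - 2.0) ** 2))

data = model(1.0) + sigma_n * rng.normal(size=x.size)  # truth: H0 is enough

def loglike(m):
    return -0.5 * np.sum((data - m) ** 2) / sigma_n**2

# Flat priors: A in [0, 2] (density 1/2) and r in [0, 1] (density 1)
A_grid = np.linspace(0.0, 2.0, 201)
r_grid = np.linspace(0.0, 1.0, 101)
dA, dr = A_grid[1] - A_grid[0], r_grid[1] - r_grid[0]

ll0 = np.array([loglike(model(A)) for A in A_grid])
ll1 = np.array([loglike(model(A, r)) for A in A_grid for r in r_grid])
ref = max(ll0.max(), ll1.max())  # common offset; it cancels in the ratio

Z0 = np.exp(ll0 - ref).sum() * dA * 0.5        # evidence p(D|H0)
Z1 = np.exp(ll1 - ref).sum() * dA * dr * 0.5   # evidence p(D|H1)
print("odds p(D|H0)/p(D|H1) =", Z0 / Z1)  # > 1: the extra parameter is penalized
```

In the talk's example the data instead contained two components, so the same kind of computation came out 9.2 to 1 in favour of H1.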

Page 36: Tutorial  on Bayesian Techniques for Inference

Model comparison – an example

Model 1: 1 magnetic component

Model 2: 1 magnetic + 1 non-magnetic component

Model 3: 2 magnetic components

Model 4: 2 magnetic components with (v2 = 0, a2 = 0)

Page 37: Tutorial  on Bayesian Techniques for Inference

Model comparison – an example

Model 1: 1 magnetic component (9 free parameters)

Model 2: 1 magnetic + 1 non-magnetic component (17 free parameters)

Model 3: 2 magnetic components (20 free parameters)

Model 4: 2 magnetic components with (v2 = 0, a2 = 0) (18 free parameters)

Model 2 is preferred by the data: “the best fit with the smallest number of parameters”.

Page 38: Tutorial  on Bayesian Techniques for Inference

Model averaging. One step further

The models {Mi, i = 1..N} have a common subset of parameters ψ of interest, but each model depends on a different set of parameters θ, or has different priors over these parameters.

What do all the models have to say about the parameters ψ? All of them give a “weighted vote”.

Posterior for ψ including all models:
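A sketch of the model-averaged posterior in this notation (each model's posterior for ψ is weighted by that model's posterior probability):

$$p(\psi|D,I) = \sum_{i=1}^{N} p(M_i|D,I)\; p(\psi|D,M_i,I)$$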

Page 39: Tutorial  on Bayesian Techniques for Inference

Model averaging – an example

Page 40: Tutorial  on Bayesian Techniques for Inference

Hierarchical models

In the Bayesian approach, everything can be considered a random variable

[Diagram: model + data → likelihood; prior (with its own prior parameters) → marginalization of nuisance parameters → inference.]

Page 41: Tutorial  on Bayesian Techniques for Inference

Hierarchical models

In the Bayesian approach, everything can be considered a random variable

[Same diagram, one level deeper: the prior parameters themselves receive priors, which can in turn be marginalized.]

Page 42: Tutorial  on Bayesian Techniques for Inference

Bayesian Weak-field

Bayes theorem

Advantage: everything is close to analytic
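A sketch of why (the standard weak-field relation, with C a known constant and ḡ the effective Landé factor; the slide's exact expressions did not survive the transcript):

$$V(\lambda) = -C\,\bar{g}\,\lambda_0^2\, B_\parallel\, \frac{\partial I(\lambda)}{\partial \lambda}$$

Since V is linear in B∥, a Gaussian noise model gives a likelihood that is Gaussian in B∥, so with flat or Gaussian priors the posterior for B∥ is analytic.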

Page 43: Tutorial  on Bayesian Techniques for Inference

Bayesian Weak-field – Hierarchical priors

Priors depend on some hyperparameters, over which we can again set priors and then marginalize.
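Writing the hyperparameters as α, the marginalization reads (a sketch in the notation above):

$$p(\theta|D,I) \propto p(D|\theta,I) \int d\alpha\; p(\theta|\alpha,I)\,p(\alpha|I)$$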

Page 44: Tutorial  on Bayesian Techniques for Inference

Bayesian Weak-field - Data

IMaX data

Page 45: Tutorial  on Bayesian Techniques for Inference

Bayesian Weak-field - Posteriors

Joint posteriors

Page 46: Tutorial  on Bayesian Techniques for Inference

Bayesian Weak-field - Posteriors

Marginal posteriors

Page 47: Tutorial  on Bayesian Techniques for Inference

Hierarchical priors – Distribution of longitudinal B

Page 48: Tutorial  on Bayesian Techniques for Inference

Hierarchical priors – Distribution of longitudinal B

We want to infer the distribution of longitudinal B from many observed pixels, taking the uncertainties into account.

Parameterize the distribution in terms of a vector a:
• mean and variance, if Gaussian;
• the heights of the bins, if general.
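A sketch of the resulting hierarchical posterior, assuming the pixels are conditionally independent (Bi is the longitudinal field of pixel i and Di its data):

$$p(\mathbf{a}|D,I) \propto p(\mathbf{a}|I) \prod_{i=1}^{N} \int dB_i\; p(D_i|B_i,I)\,p(B_i|\mathbf{a},I)$$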

Page 49: Tutorial  on Bayesian Techniques for Inference

Hierarchical priors – Distribution of longitudinal B

Page 50: Tutorial  on Bayesian Techniques for Inference

Hierarchical priors – Distribution of longitudinal B

We generate N synthetic profiles with noise, with the longitudinal field sampled from a Gaussian distribution with standard deviation 25 Mx cm⁻².

Page 51: Tutorial  on Bayesian Techniques for Inference

Hierarchical priors – Distribution of any quantity

Page 52: Tutorial  on Bayesian Techniques for Inference

Bayesian image deconvolution

Page 53: Tutorial  on Bayesian Techniques for Inference

Bayesian image deconvolution

PSF blurring using a linear expansion

The image is sparse in any basis

Maximum-likelihood solution (phase-diversity, MOMFBD,…)
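A sketch of the Bayesian version (notation assumed here: y the observed image, x the true image, P the PSF-blurring operator, W the sparsifying transform, λ a regularization weight): the maximum-likelihood solution keeps only the first factor, while the Bayesian (MAP) solution adds the sparsity prior.

$$p(x|y,I) \propto \exp\left(-\frac{\|y - P x\|^2}{2\sigma^2}\right)\,\exp\left(-\lambda \|W x\|_1\right)$$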

Page 54: Tutorial  on Bayesian Techniques for Inference

Inference in a Bayesian framework

• Solution is given as a probability over model parameters

• Error bars or confidence regions can be easily obtained, including correlations, degeneracies, etc.

• Assumptions are explicit on prior distributions

• Model comparison and model averaging are easily accomplished

• Hierarchical models are powerful for extracting information from data

Page 55: Tutorial  on Bayesian Techniques for Inference

Hinode data

Continuum Total polarization

Asensio Ramos (2009)

Observations of Lites et al. (2008)

Page 56: Tutorial  on Bayesian Techniques for Inference

How much information? – Kullback-Leibler divergence

Measures the “distance” between the posterior and the prior distributions.

[Maps] Field strength (37% larger than 1); field inclination (34% larger than 1).
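The standard definition, with p the posterior and q the prior (added here for reference; a divergence of 0 means the data added no information):

$$D_{\mathrm{KL}}(p\,\|\,q) = \int d\theta\; p(\theta|D,I)\,\ln\frac{p(\theta|D,I)}{q(\theta|I)}$$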

Page 57: Tutorial  on Bayesian Techniques for Inference

Posteriors

[Posterior maps] Field strength; field inclination; field azimuth; stray-light.

Page 58: Tutorial  on Bayesian Techniques for Inference

Field inclination – Obvious conclusion

Linear polarization is fundamental for obtaining reliable inclinations.

Page 59: Tutorial  on Bayesian Techniques for Inference

Field inclination – Quasi-isotropic

[Plot: the inclination prior compared with the distribution expected for an isotropic field.]

Page 60: Tutorial  on Bayesian Techniques for Inference

Field inclination – Quasi-isotropic

Page 61: Tutorial  on Bayesian Techniques for Inference

Representation

Marginal distribution for each parameter.

Sample N values from the posterior: all of the sampled values are compatible with the observations.

Page 62: Tutorial  on Bayesian Techniques for Inference

Field strength – Representation

All maps compatible with observations!!!

Page 63: Tutorial  on Bayesian Techniques for Inference

Field inclination

All maps compatible with observations!!!

Page 64: Tutorial  on Bayesian Techniques for Inference

In a galaxy far far away… (the future)

[Diagram: raw data + instruments with systematics (with their priors) + model priors → posterior, with marginalization of the non-important parameters → inference.]

Page 65: Tutorial  on Bayesian Techniques for Inference

Conclusions

• Inversion is not an easy task and has to be considered as a probabilistic inference problem

• Bayesian theory gives us the tools for inference

• Expand our view of inversion as a model comparison/averaging problem (no model is the absolute truth!)

Page 66: Tutorial  on Bayesian Techniques for Inference

Thank you, and be Bayesian, my friend!