July 9, 2007Bayesian Inference and Maximum Entropy 20071 Lessons about Likelihood Functions from...

25
July 9, 2007 Bayesian Inference and Maximum Entropy 20 07 1 Lessons about Likelihood Functions from Nuclear Physics Kenneth M. Hanson T-16, Nuclear Physics; Theoretical Division Los Alamos National Laboratory This presentation available at http://www.lanl.gov/home/k mh/ LA-UR-07- 2971 Bayesian Inference and Maximum Entropy Workshop, Saratoga Springs, NY, July 8-13, 2007

Transcript of July 9, 2007Bayesian Inference and Maximum Entropy 20071 Lessons about Likelihood Functions from...

July 9, 2007 Bayesian Inference and Maximum Entropy 2007 1

Lessons about Likelihood Functions from Nuclear Physics

Kenneth M. Hanson

T-16, Nuclear Physics; Theoretical DivisionLos Alamos National Laboratory

This presentation available at http://www.lanl.gov/home/kmh/

LA-UR-07-2971

Bayesian Inference and Maximum Entropy Workshop, Saratoga Springs, NY, July 8-13, 2007

July 9, 2007 Bayesian Inference and Maximum Entropy 2007 2

Overview• Uncertainties in physics experiments

• Particle Data Group (PDG)

• Lifetime data

• Outliers

• Uncertainty in the uncertainty

• Student t distributions vs. Normal distribution

• Analysis of lifetime data using t distributions

July 9, 2007 Bayesian Inference and Maximum Entropy 2007 3

July 9, 2007 Bayesian Inference and Maximum Entropy 2007 4

Physics experiments• Experimenters state their measurement of physical quantity y as

measurement ± standard error or d ± σd

• Experimenter’s degree of belief in measurement described by a normal distribution (Gaussian) and σd its standard deviation

• Experimental uncertainty often composed of two components:► statistical uncertainty

• from noise in signal or event counting (Poisson)

• Type A – determined by repeated meas., frequentist methods► systematic uncertainty

• from equipment calibration, experimental procedure, corrections

• Type B – determined by nonfrequentist methods

• based on experimenter’s judgment, hence subjective; difficult to assess ► these usually added in quadrature (rms sum)

July 9, 2007 Bayesian Inference and Maximum Entropy 2007 5

Physics experiments – likelihood functions• In probabilistic terms, experimentalist’s statement y ± σd

is interpreted as a likelihood functionp( d | y σd I)

where I is background information about situation, including how experiment is performed

• Inference about the physical quantity y is obtained by Bayes lawp( y | d σy I) ~ p( y | d σd I) p( y | I)

where p( y | I) is the prior information about y

July 9, 2007 Bayesian Inference and Maximum Entropy 2007 7

Exploratory data analysis• John Tukey (1977) suggested each set of

measurements of a quantity be scrutinized

► find quantile positions, Q1, Q2, Q3► calculate the inter-quantile range

IQR = Q3 – Q2 (for normal distr., IQR = 1.35 σ)

► determine fraction of data outside interval, SO Q1 – 1.5 IQR < y < Q3 + 1.5 IQRlabeling these as suspected outliers(for normal distr., 0.7%)

• IQR measures width of core• SO measures extent of tail

July 9, 2007 Bayesian Inference and Maximum Entropy 2007 8

July 9, 2007 Bayesian Inference and Maximum Entropy 2007 9

Particle Data Group (PDG)• Particle Data Group formed in 1957

► annually summarizes state-of-knowledge of properties of elementary particles

• For each particle property► list all relevant experimental data► committee decides which data to include in final analysis► state best current value (usually least-squares average)

and its standard error • often magnified by sqrt[χ2/(N – 1)] (avg of 2.0, 50% of time)

• PDG reports are excellent source of information about measurements of unambiguous physical quantities

► available online, free► provide insight into how physicists interpret data

July 9, 2007 Bayesian Inference and Maximum Entropy 2007 11

Lambda lifetime measurement in the 60s• Hydrogen bubble chambers were used in

1950s and 60s to record elementary particles

• Picture shows reaction sequence: K- + p → Ξ- + K+

Ξ-→ Λ0 + π-

Λ0 → p + π-

• Track lengths and particle momenta determined from curvature in magnetic field yield survival time of Ξ- and Λ

• Hubbard et al. observed 828 such events to obtain lifetimes: τΞ = 1.69 ±0.06×10-10 s τΛ = 2.59 ±0.09×10-10 s

Hydrogen bubble- chamber photo

From J.R. Hubbard et al., Phys.Rev. 135B (1964)

July 9, 2007 Bayesian Inference and Maximum Entropy 2007 12

Measurements of neutron lifetime• Because n lifetime is so long,

it is difficult to measure without slowing neutrons or trapping them

• Plot shows all measurements of neutron lifetime

• Red line is PDG value, which includes 7 data sets,but excludes older ones and #2 because it is discrepant

• χ2 for red line = 149/21 pts.

• Evidence of outliers

• Large systematic uncertainties

Neutron lifetime measurements

July 9, 2007 Bayesian Inference and Maximum Entropy 2007 13

Measurements of lifetimes of other particles

July 9, 2007 Bayesian Inference and Maximum Entropy 2007 14

Collection of lifetime measurements• Goal: determine distribution of

measurements relative to their estimated uncertainties

• Upper graph shows deviations of 99 lifetime meas. for 5 particles from PDG values, divided by their standard errors, i.e. Δx/σ

• Lower graph shows histogram

• Objective: characterize the distribution of Δx/σ for these expts

• χ2 = 367/99 DOF

• IQR = 1.83 (1.35 for normal)

• Suspected outliers = 6.6 %

July 9, 2007 Bayesian Inference and Maximum Entropy 2007 15

July 9, 2007 Bayesian Inference and Maximum Entropy 2007 16

Uncertainty in the uncertainty• Suppose there is uncertainty in the stated standard error σ0 for

measurement d

• Dose and von der Linden (2000) gave plausible derivation:► assume likelihood has underlying Normal distr

► assume uncertainty distr for ω, where σ is scaled by

► marginalizing over ω, the likelihood is Student t distr., (2a = ν)

• Many have contributed to outlier story: Box and Tiao, O’Hagan, Fröhner, Press, Sivia, Hanson and Wolf

21

2( | ) expy

d yp d y I

1( | ) ( ) expaap I a

0 /

21

21( | ) 1 ( )

d y d yp d y I t

July 9, 2007 Bayesian Inference and Maximum Entropy 2007 17

Student t distribution• Student* t distribution

► long tail for ν < 9 (SO > 1%)► outlier-tolerant likelihood

function► ν = 1 is Cauchy distr (solid red)► ν = ∞ is Normal distr (solid blue)

2

121

( ) 1t z z

ν = 1, 5, ∞

* Student (1908) was pseudonym for W.S Gossett, who was not allowed to publish by his employer, Guiness brewery

July 9, 2007 Bayesian Inference and Maximum Entropy 2007 18

Physical analogy of probability• Drawing analogy between φ(Δx) = minus-log-posterior and a physical potential

• is a force with which each datum pulls on fit model

• Outlier-tolerant likelihoods► generally have long tails► restoring force eventually decreases for large residuals

G

2GG+C

G+E

2G

G

G+C

G+E

July 9, 2007 Bayesian Inference and Maximum Entropy 2007 19

Analysis of a collection of data• To calculate “average” value of a data set, use the Student t

distribution for the likelihood of each datum:

where s is scaling factor of standard error for whole data set

• Select ν based on data using model selection

• Scale factor s marginalized out of posterior

21

21

( | ) 1 ( )i i

i i

i ii i

i

d y

s

d yp d y s I t

s

( | ) ( | ) ( )p y p y s p s dsdσ dσ

July 9, 2007 Bayesian Inference and Maximum Entropy 2007 21

Model selection• Odds ratios of t distr (t) to Normal (N)

is

= 1.32x10-85 /2.2x10-90 = 6x104

► for prior ratio on models = 1► evidence is integral over x (lifetime)

and s; includes prior on s proportional to 1/s

• Thus, t distr is strongly preferred by data to Normal distr

► ν ≈ 2.6 (maximizes evidence)

( | ) ( | ) ( | )

( | ) ( | ) ( | )

p t I p t I p t I

p N I p N I p N I

dσ d σ

dσ d σ

t distr

Normal distr

July 9, 2007 Bayesian Inference and Maximum Entropy 2007 22

Measurements of neutron lifetime• Upper plot shows all

measurements of neutron lifetime

• Lower plot shows results based on all 21 data points:

► posterior for t-distr analysis (ν = 2.6, margin. over s)

► least-squares result (with and w/o χ2 scaling)

► PDG results (using 7 selected data points, Serebrov rejected)

Neutron lifetime measurements

July 9, 2007 Bayesian Inference and Maximum Entropy 2007 23

Measurements of π0 lifetime• Upper plot shows all

measurements of π0 lifetime

• Lower plot shows results based on all 13 data points:

► posterior for t-distr analysis (ν = 2.6, margin. over s)

► least-squares result (with χ2 scaling)

► PDG results (using 4 selected data points, excl. latest one)

π0 lifetime measurements

July 9, 2007 Bayesian Inference and Maximum Entropy 2007 24

Measurements of lambda lifetime• Upper plot shows all

measurements of lambda lifetime

• Lower plot shows results based on all 27 data points:

► posterior for t-distr analysis (ν = 2.6, margin. over s)

► least-squares result (with χ2 scaling)

► PDG results (using 3 latest data points)

Lambda lifetime measurements

July 9, 2007 Bayesian Inference and Maximum Entropy 2007 25

Tests • Draw 20 data points from various

t distrs. and analyze them using likelihoods: a) t distr with ν = 3 b) Normal distr

► scale uncertainties according to data variance

► results from 10,000 random trials

• Conclude► t distr results well behaved► normal distr results unstable

when data have significant outliers

t distr

Normal

dashed line = estimated σ

July 9, 2007 Bayesian Inference and Maximum Entropy 2007 26

Summary• Technique presented for dealing gracefully with outliers

► is based on using for likelihood function the Student t distr. instead of the Normal distr.

► copes with outliers, while treating every datum identically

• Particle lifetime data distribution matched by t distr. with ν ≈ 2.6 to 3.0

► using likelihood functions based on t distr. produce stable results when outliers exist in data sets, whereas Normal distr. does not

July 9, 2007 Bayesian Inference and Maximum Entropy 2007 27

Bibliography ► “A further look at robustness via Bayes;s theorem,” G.E.P. Box and G.C. Tiao,

Biometrica 49, pp. 419-432 (1962)► “On outlier rejection phenomena in Bayes inference,” A. O’Hagan, J. Roy. Statist. Soc. B

41, 358–367 (1979)► “Bayesian evaluation of discrepant experimental data,” F.H. Fröhner, Maximum Entropy

and Bayesian Methods, pp. 467–474 (Kluwer Academic, Dordrecht, 1989)► “Estimators for the Cauchy distribution,” K.M. Hanson and D.R. Wolf, Maximum Entropy

and Bayesian Methods, pp. 157-164 (Kluwer Academic, Dordrecht, 1993)► “Dealing with duff data,” D. Sivia, Maximum Entropy and Bayesian Methods, pp. 157-

164 (1996)► “Understanding data better with Bayesian and global statistical methods,” in W.H. Press,

Unsolved Problems in Astrophysics, pp. 49-60 (1997)► “Outlier-tolerant parameter estimation,” V. Dose and W. von der Linden, Maximum

Entropy and Bayesian Methods, pp. 157-164 (AIP, 2000)

This presentation available at http://www.lanl.gov/home/kmh/

July 9, 2007 Bayesian Inference and Maximum Entropy 2007 28