Preparatory Statistics


Transcript of Preparatory Statistics

Page 1: Preparatory Statistics

Preparatory Statistics

• Beta Handbook: all (most) distributions

• Wikibook in statistics: http://en.wikibooks.org/wiki/Statistics

• MIT open course: EE&CS: Introductory Probability and Statistics

• Mainly as lookup literature

Page 2: Preparatory Statistics

Kolmogorov Axioms
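The axioms themselves appear only as a formula image on the slide; the standard statement, for a probability measure P on a sample space Ω with event algebra F, is:

• P(A) ≥ 0 for every event A in F

• P(Ω) = 1

• P(A1 ∪ A2 ∪ …) = P(A1) + P(A2) + … for any sequence of pairwise disjoint events A1, A2, …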

Page 3: Preparatory Statistics

Fundamentals of Bayes’ method

f(θ|D) ∝ P(D|θ) f(θ)

• D: observations, from a simple or complex observation space

• θ: state or parameter, from the state or parameter space

• P(D|θ): likelihood function, the probability that D is generated in state θ

• f(θ): prior, what we know about the state before seeing D

• f(θ|D): posterior, what we know after seeing D

• The sign between the right and left parts is proportionality: multiply the right part by the constant c that normalises the left part so it integrates to 1 over the state space.

Page 4: Preparatory Statistics

State space alternatives

• Discrete, two or more states (e.g., diseased, non-diseased)

• Continuous, e.g., an interval of reals

• High-dimensional, a vector of reals (e.g., target position and velocity)

• Composite, a vector plus a label (e.g., missile type x at point p with velocity v)

• Product space: vector of non-fixed dimension (e.g., one or more targets approaching, one or more disease-causing genes)

• Voxel set (medical imaging, atmosphere mapping)

Page 5: Preparatory Statistics

Testing for disease

• State space (d,n), disease or not

• Observation space (P,N) positive or negative

• Prior: 0.1% of population has disease, prior is (0.001, 0.999)

• Likelihood: Test gives 5% false negatives, 10% false positives: P: (0.95, 0.1), N: (0.05, 0.90)

Page 6: Preparatory Statistics

Testing for disease

• State space (d,n), disease or not

• Observation space (P,N), positive or negative

• Prior: 0.1% of population has disease, prior is Prior=[0.001,0.999]

• Likelihood: 5% false negatives, 10% false positives: P: [0.95, 0.10], N: [0.05, 0.90]

• Combining prior and likelihood:
Prior.*P = [0.00095, 0.0999] -> [0.01, 0.99]
Prior.*N = [0.00005, 0.8991] -> [0.0001, 0.9999]
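A runnable Matlab version of this slide's computation (Prior, P, N are as above; postP and postN are names introduced here for the normalised posteriors):

>> Prior=[0.001,0.999];                      % prior over (d,n)
>> P=[0.95,0.10]; N=[0.05,0.90];             % likelihoods of a positive / negative test
>> postP=Prior.*P; postP=postP/sum(postP)    % posterior after a positive test
postP = 0.0094 0.9906
>> postN=Prior.*N; postN=postN/sum(postN)    % posterior after a negative test
postN = 0.0001 0.9999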

Page 7: Preparatory Statistics

Deciding target type

• Attack aircraft: small, dynamic

• Bomber aircraft: large, dynamic

• Civilian: large, slow dynamics

• Prior: (0.5, 0.4, 0.1)

• Observer 1: probably small, likelihood (0.8, 0.1, 0.1)

• Observer 2: probably fast, likelihood (0.4, 0.4, 0.2)

Page 8: Preparatory Statistics

Target classification, Matlab

>> prior=[0.5,0.4,0.1];
>> lik1=[0.8,0.1,0.1];
>> lik2=[0.4,0.4,0.2];
>> post1=prior.*lik1; post1=post1/sum(post1)
post1 = 0.8889 0.0889 0.0222
>> post2=prior.*lik2; post2=post2/sum(post2)
post2 = 0.5263 0.4211 0.0526
>> post12=post1.*lik2; post12=post12/sum(post12)
post12 = 0.8989 0.0899 0.0112

Page 9: Preparatory Statistics

Odds and Bayes’ Factor

>> OddsCiv=post12(3)/(1-post12(3))   % Civilian vs not
OddsCiv = 0.0114

>> OddsAtt=post12(1)/(1-post12(1))   % Attack vs not
OddsAtt = 8.8889

In the first case the odds are conveniently low, so not C (civilian); in the second case they are high, so probably A (attack aircraft).

Odds for A against B is P(A|D)/P(B|D):

Page 10: Preparatory Statistics

Inference on a probability

• Bayes’ original problem: estimating the success probability p from an experiment with f failures and s successes, n=s+f;

• Prior is uniform probability for p;

• In particular s=9; n=12;

• Likelihood: proportional to p^s (1-p)^f

Page 11: Preparatory Statistics

Estimating a probability

>> p=[0:0.01:1];
>> likh=p.^9.*(1-p).^3;
>> posterior=likh/sum(likh);
>> plot(p,posterior)
>> print -depsc beta

NOTE: In the Lecture notes example, s and f are swapped and the computation is analytic instead of numeric!

Page 12: Preparatory Statistics

Estimating a probability

>> postcum=cumsum(posterior);
>> plot(p,postcum,'b-',[0,1],[0.025 0.025], ...
   'r-',[0,1],[0.975 0.975],'r-');

95% credible interval for p: [0.46, 0.91]; in other words, fairness is not rejected. Estimate p by the posterior mean:

>> sum(posterior.*p)
ans = 0.7143
>> postcum([50:51])
ans = 0.0427 0.0497

(Figure: cumulative posterior with the 2.5% and 97.5% levels marking the 95% credible interval)

Page 13: Preparatory Statistics

Is the coin balanced (LN 2.1.10)?

• Use outcome D:(s,f) in flipping n=s+f times

• Evaluate using two models: one H_r where the probability is 0.5, one H_u where it is uniformly distributed over [0,1]

• P(D:(s,f)|H_r) = 2^(-n)

• P(D:(s,f)|H_u) = s!f!/(n+1)! (normalization in the Beta distribution)

• For s=3, f=9, the Bayes factor P(D|H_u)/P(D|H_r) ≈ 1.4, or P(H_r|D) ≈ 0.42, P(H_u|D) ≈ 0.58

HW 1

Page 14: Preparatory Statistics

Is the coin balanced (LN 2.1.10)?

>> s=3; f=9;
>> gamma(s+1)*gamma(f+1)/gamma(s+f+2)*2^(s+f)
ans = 1.4322

>> s=6; f=18;
>> gamma(s+1)*gamma(f+1)/gamma(s+f+2)*2^(s+f)
ans = 4.9859

>> s=30; f=90;
>> gamma(s+1)*gamma(f+1)/gamma(s+f+2)*2^(s+f)
ans = 6.4717e+05
% in logs:
>> exp(gammaln(s+1)+gammaln(f+1)-gammaln(s+f+2)+log(2)*(s+f))
ans = 6.4717e+05

Page 15: Preparatory Statistics

Dissecting Master Bayes’ formula

• Parametrized and composite models:
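The formula on this slide is an image that did not transcribe; presumably it is the marginal (integrated) likelihood that also underlies the coin example on page 13: for a model H with parameter θ,

P(D|H) = ∫ P(D|θ,H) f(θ|H) dθ,   and   P(H|D) ∝ P(D|H) P(H).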

Page 16: Preparatory Statistics

Recursive & Dynamic inference

• Repeated measurements improve accuracy:

• Chapman-Kolmogorov, tracking in time:
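The formulas for these two bullets are images on the slide; in standard notation they read roughly as follows (D1, D2 are successive observation sets, x_t is the state at time t):

• Recursive update: f(θ | D1, D2) ∝ P(D2 | θ) f(θ | D1), assuming D1 and D2 are conditionally independent given θ.

• Chapman-Kolmogorov prediction: f(x_{t+1} | D1,…,Dt) = ∫ f(x_{t+1} | x_t) f(x_t | D1,…,Dt) dx_t, followed by a Bayes update with the next measurement.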

Page 17: Preparatory Statistics

Retrodiction: what happened?

Retrodiction (smoothing) gives additional precision, but it arrives later.

Page 18: Preparatory Statistics

MCMC: PET camera

D: film, count by detector j
X: radioactivity in voxel i
a_ij: camera geometry, the fraction of emission from voxel i reaching detector j

(The likelihood and prior are given as formulas on the slide.)

Inference about X gives a posterior; its mean is often a good picture of the patient.
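For reference, a common choice in emission tomography (and plausibly what the slide's formulas show) is a Poisson likelihood with a smoothing prior on the voxel image:

D_j ~ Poisson( Σ_i a_ij X_i ) independently for each detector j, so P(D|X) = Π_j Poisson(D_j; Σ_i a_ij X_i), combined with a prior f(X) over the image X (e.g., a Markov random field smoothness prior).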

Page 19: Preparatory Statistics

MCMC: PET camera


MCMC: stochastic solution of probability problems. Generate a sequence of states with the same distribution as the posterior, in this case (X1, X2, …); each member is a full 3D image. Estimate X by taking the mean over the trace.

Page 20: Preparatory Statistics

MCMC: PET camera


Page 21: Preparatory Statistics

MCMC: PET camera

Main MCMC loop: we have (X1, X2, …, Xk) and want to compute X(k+1).

Propose a new image Z by changing the value in one voxel

Compute the acceptance probability a = π(Z)/π(Xk), where π is the (unnormalised) posterior. Accept X(k+1)=Z if a>1, or with probability a; if not accepted, X(k+1)=Xk.
Matlab: if a>rand, X(k+1)=Z; else X(k+1)=X(k); end

In practice: compute in logarithms to avoid underflow. Differential computation: most of the terms in π(Z) are the same as in π(Xk).
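A minimal Matlab sketch of this loop for a generic problem; logpost(X) is an assumed user-supplied function returning log π(X) (the PET-specific likelihood and prior are not spelled out here), and X0, nIter and the proposal step are illustrative:

X = X0;                               % current state, e.g. vector of voxel values
trace = zeros(nIter, numel(X));
for k = 1:nIter
    i = randi(numel(X));              % pick one voxel at random
    Z = X;
    Z(i) = Z(i) + 0.1*randn;          % propose a change in that voxel
    loga = logpost(Z) - logpost(X);   % log acceptance ratio (computed differentially in practice)
    if log(rand) < loga               % accept with probability min(1, a)
        X = Z;
    end
    trace(k,:) = X;                   % store the chain
end
Xhat = mean(trace,1);                 % estimate: posterior mean over the trace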

Page 22: Preparatory Statistics

Sinogram and reconstruction

Tumour

Fruit fly, Drosophila family (X-ray)

Page 23: Preparatory Statistics

Does Bayes give the right answer?

• Output is a posterior. How accurate? That depends on the prior and the likelihood assessed.

• If data is generated by distribution g(x) and inference is for parameter θ of f(x|θ), then asymptotically the posterior for θ will concentrate on argmin_θ KL(g(·), f(·|θ))

KL: Kullback-Leibler distance
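For densities g and f, KL(g, f) = ∫ g(x) log( g(x)/f(x) ) dx (a sum in the discrete case); it is nonnegative and zero only when g = f, so the concentration point above is the model closest to the data-generating distribution.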

Page 24: Preparatory Statistics

Does Bayes give right answer?

• Coherence: If you evaluate bets on uncertain events, anyone who does not use Bayes’ rule to evaluate them can potentially lose an unlimited amount to you, who use Bayes’ rule. (Freedman & Purves, 1969)

• Consistency: Observing the properties of Boolean (set) algebra, the calculus of plausibility has to be embeddable in an ordered field where + and × correspond to the combination functions for deriving plausibilities of disjunction and conjunction. (Jaynes Ch. 2; Arnborg & Sjödin, MaxEnt 2000, ECCAI 2000)