Introduction to Probability and Statistics From a Bayesian Viewpoint, Part 2

Transcript of "Introduction to Bayesian statistics"
Three approaches to probability:

- Axiomatic: probability by definition and properties
- Relative frequency: repeated trials
- Degree of belief (subjective): a personal measure of uncertainty

Problems: "The chance that a meteor strikes Earth is 1%." "The probability of rain today is 30%." "The chance of getting an A on the exam is 50%."
Problems of statistical inference: testing H0: θ = 1 versus Ha: θ > 1.

Classical approach:

- P-value = P(Data | θ = 1). The P-value is NOT P(null hypothesis is true).
- Confidence interval [a, b]: what does it mean?

But the scientist wants to know P(θ = 1 | Data) and P(H0 is true). The problem: in the classical framework, θ is "not random."
Bayesian statistics: a fundamental change in philosophy. θ is assumed to be a random variable, which allows us to assign a probability distribution for θ based on prior information. A 95% "confidence" interval [1.34 < θ < 2.97] then means what we "want" it to mean: P(1.34 < θ < 2.97) = 95%. Likewise, P-values mean what we want them to mean: P(null hypothesis is false).
Estimating P(Heads) for a biased coin. Parameter: p. Data: 0, 0, 0, 1, 0, 1, 0, 0, 1, 0, giving the estimate p-hat = 3/10 = 0.3. But what if we believe the coin is biased in favor of low probabilities? How do we incorporate prior beliefs into the model? We'll see that p-hat = 0.22.

[Figure: density curve over p; horizontal axis 0.1–0.4, vertical axis 0–1]
Bayes Theorem

P(A | B) = P(A and B) / P(B)
         = P(B | A) P(A) / P(B)
         = P(B | A) P(A) / [ P(B | A) P(A) + P(B | A^c) P(A^c) ]
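The two-event form above is easy to check numerically. A minimal sketch (the probabilities below are hypothetical, chosen only to exercise the formula):

```python
def bayes(p_a, p_b_given_a, p_b_given_not_a):
    """P(A | B) via Bayes' theorem with the partition {A, A^c}."""
    numerator = p_b_given_a * p_a
    denominator = numerator + p_b_given_not_a * (1 - p_a)
    return numerator / denominator

# Hypothetical numbers: P(A) = 0.3, P(B|A) = 0.8, P(B|A^c) = 0.4
print(bayes(0.3, 0.8, 0.4))  # 0.24 / 0.52 ≈ 0.4615
```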
Example: a population has 10% liars, and a lie detector gets it "right" 90% of the time. Let A = {actual liar} and R = {lie detector reports you are a liar}. The lie detector reports that a suspect is a liar. What is the probability that the suspect actually is a liar?

P(A | R) = P(R | A) P(A) / [ P(R | A) P(A) + P(R | A^c) P(A^c) ]
         = (.90)(.10) / [ (.90)(.10) + (.10)(.90) ]
         = 1/2 !!!!!
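Plugging the slide's numbers into the formula confirms the surprising 50% answer (a quick check):

```python
# Lie-detector example: 10% of the population are liars,
# and the detector is right 90% of the time.
p_liar = 0.10
p_report_given_liar = 0.90     # true positive rate
p_report_given_honest = 0.10   # false positive rate

numerator = p_report_given_liar * p_liar
posterior = numerator / (numerator + p_report_given_honest * (1 - p_liar))
print(posterior)  # 0.5
```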
More general form of Bayes Theorem

If S = A_1 ∪ A_2 ∪ … ∪ A_n, where the A_i partition the sample space, then

P(A_i | B) = P(A_i and B) / P(B)
           = P(B | A_i) P(A_i) / P(B)
           = P(B | A_i) P(A_i) / Σ_j P(B | A_j) P(A_j)
Example: three urns. Urn A: 1 red, 1 blue. Urn B: 2 reds, 1 blue. Urn C: 2 reds, 3 blues. Roll a fair die: if it's 1, pick Urn A; if 2 or 3, pick Urn B; if 4, 5, or 6, pick Urn C. Then choose one ball. A ball was chosen and it's red. What's the probability it came from Urn C?

P(C | red) = P(red | C) P(C) / [ P(red | A) P(A) + P(red | B) P(B) + P(red | C) P(C) ]
           = (2/5)(3/6) / [ (1/2)(1/6) + (2/3)(2/6) + (2/5)(3/6) ]
           ≈ 0.3956
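The urn calculation can be verified in a few lines (a quick check of the slide's arithmetic):

```python
# Die roll: 1 -> Urn A, 2 or 3 -> Urn B, 4-6 -> Urn C
priors = {"A": 1/6, "B": 2/6, "C": 3/6}
p_red = {"A": 1/2, "B": 2/3, "C": 2/5}   # P(red | urn)

p_red_total = sum(p_red[u] * priors[u] for u in priors)
posterior_c = p_red["C"] * priors["C"] / p_red_total
print(round(posterior_c, 4))  # 0.3956
```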
Bayes Theorem for Statistics. Let θ represent the parameter(s) and let X represent the data. Then

f(θ | X) = f(X | θ) f(θ) / f(X)

- The left-hand side is a function of θ
- The denominator on the right-hand side does not depend on θ, so

f(θ | X) ∝ f(X | θ) f(θ)

- Posterior distribution ∝ Likelihood × Prior distribution; Posterior dist'n = Constant × Likelihood × Prior dist'n
- The equation can be understood at the level of densities
- Goal: explore the posterior distribution of θ
A simple estimation example

Biased coin estimation: P(Heads) = p = ? We observe X_1, …, X_n, i.i.d. 0–1 Bernoulli(p) trials. Let X = Σ X_i be the number of heads in n trials. The likelihood is

f(X | p) ∝ p^X (1 - p)^(n - X)

For the prior distribution, use an uninformative prior: the uniform distribution on (0, 1), f(p) = 1. So the posterior distribution f(p | X) ∝ f(X | p) f(p) is proportional to

p^X (1 - p)^(n - X)
Coin estimation (cont'd)

The posterior density has the form f(p) = C p^x (1 - p)^(n - x): a beta distribution with parameters x + 1 and n - x + 1 (see http://mathworld.wolfram.com/BetaDistribution.html). Data: 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, so n = 10 and x = 3, and the posterior distribution is Beta(3+1, 7+1) = Beta(4, 8).
Coin estimation (cont'd). Posterior dist'n: Beta(4, 8). Mean: 0.33. Mode: 0.30. Median: 0.3238. [ qbeta(.025,4,8), qbeta(.975,4,8) ] = [ .11, .61 ] gives a 95% credible interval for p: P(.11 < p < .61 | X) = .95.
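The slide uses R's qbeta; the same summaries can be reproduced in pure Python by numerically integrating the Beta(4, 8) density and bisecting on the CDF (a sketch for checking the numbers, not a substitute for a proper quantile function):

```python
import math

a, b = 4, 8  # posterior Beta(4, 8)
beta_const = math.gamma(a) * math.gamma(b) / math.gamma(a + b)

def pdf(p):
    return p**(a - 1) * (1 - p)**(b - 1) / beta_const

def cdf(x, n=2000):
    # trapezoidal integration of the density on [0, x]
    h = x / n
    total = 0.5 * (pdf(0.0) + pdf(x)) + sum(pdf(i * h) for i in range(1, n))
    return total * h

def quantile(q):
    lo, hi = 0.0, 1.0  # bisection on the CDF
    for _ in range(50):
        mid = (lo + hi) / 2
        if cdf(mid) < q:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

print(a / (a + b))               # mean 0.333...
print((a - 1) / (a + b - 2))     # mode 0.3
print(round(quantile(0.5), 3))   # median ≈ 0.324
print(round(quantile(0.025), 2), round(quantile(0.975), 2))  # ≈ 0.11 0.61
```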
Prior distribution: choice of a beta distribution for the prior.
With a Beta(a, b) prior, f(p) ∝ p^(a - 1) (1 - p)^(b - 1), so

Posterior ∝ Likelihood × Prior
= [ p^x (1 - p)^(n - x) ] [ p^(a - 1) (1 - p)^(b - 1) ]
= p^(x + a - 1) (1 - p)^(n - x + b - 1)

Posterior distribution is Beta(x + a, n - x + b)
Prior distributions

Posterior summaries: Mean = (x + a)/(n + a + b); Mode = (x + a - 1)/(n + a + b - 2). Quantiles can be computed by integrating the beta density. For this example, the prior and posterior distributions have the same general form. Priors which have the same form as the posteriors are called conjugate priors.
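The conjugate update and its summaries fit in a few lines (a sketch; the Beta(1, 1) check reproduces the earlier Beta(4, 8) result):

```python
def beta_binomial_update(a, b, x, n):
    """Prior Beta(a, b) plus x heads in n trials -> posterior Beta(x+a, n-x+b)."""
    post_a, post_b = x + a, n - x + b
    mean = post_a / (post_a + post_b)             # (x+a)/(n+a+b)
    mode = (post_a - 1) / (post_a + post_b - 2)   # (x+a-1)/(n+a+b-2)
    return post_a, post_b, mean, mode

# Uniform prior Beta(1, 1) with the slide's data x = 3, n = 10:
print(beta_binomial_update(1, 1, 3, 10))  # (4, 8, 0.333..., 0.3)
```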
Data example: the maternal condition placenta previa, an unusual condition of pregnancy in which the placenta is implanted very low in the uterus, preventing normal delivery. Is this related to the sex of the baby? The proportion of female births in the general population is 0.485. An early study in Germany found that in 980 placenta previa births, 437 were female (0.4459). Test H0: p = 0.485 versus Ha: p < 0.485.
Placenta previa births. Assume a uniform prior, Beta(1, 1); the posterior is then Beta(438, 544). Posterior summaries: Mean = 0.446, Standard deviation = 0.016. 95% credible interval: [ qbeta(.025,438,544), qbeta(.975,438,544) ] = [ .415, .477 ].
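The posterior mean and standard deviation follow from the standard Beta(a, b) moment formulas (a quick check of the slide's numbers):

```python
import math

a, b = 438, 544   # posterior for the placenta previa data
mean = a / (a + b)
sd = math.sqrt(a * b / ((a + b) ** 2 * (a + b + 1)))
print(round(mean, 3), round(sd, 3))  # 0.446 0.016
```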
Sensitivity to the prior. Suppose we took a prior more concentrated about the null-hypothesis value, e.g. prior ~ Normal(.485, .01). The posterior is then proportional to

p^437 (1 - p)^543 e^( -(p - .485)^2 / (2 (.01)^2) )

The constant of integration is about 10^(-294), so the mean, summary statistics, credible intervals, etc., require numerical methods. See the S script: http://www.people.carleton.edu/~rdobrow/courses/275w05/Scripts/Bayes.ssc