Outline - math.ntnu.edu.twmath.ntnu.edu.tw/~rtsai/104/bayes/slides/lecture1.pdf · Slide 13— PhD...

Faculty of Life Sciences

Frequentist and Bayesian statistics

Claus Ekstrøm
E-mail: [email protected]

Outline

1 Frequentists and Bayesians
  • What is a probability?
  • Interpretation of results / inference

2 Comparisons

3 Markov chain Monte Carlo

Slide 2— PhD (Aug 23rd 2011) — Frequentist and Bayesian statistics

What is a probability?

Two schools in statistics: frequentists and Bayesians.


Frequentist school

School of Jerzy Neyman, Egon Pearson and Ronald Fisher.


Bayesian school

“School” of Thomas Bayes

$$P(H \mid D) = \frac{P(D \mid H)\, P(H)}{\int P(D \mid H)\, P(H)\, dH}$$


Frequentists

Frequentists talk about probabilities in relation to experiments with a random component. The relative frequency of an event, A, is defined as

$$P(A) = \frac{\text{number of outcomes consistent with } A}{\text{number of experiments}}$$

The probability of event A is the limiting relative frequency.

[Figure: relative frequency (vertical axis, 0.0 to 1.0) against the number of experiments n (horizontal axis, 0 to 100).]
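The limiting behaviour of the relative frequency can be illustrated with a small simulation. This is a minimal sketch, not part of the original slides: the event probability `p_event = 0.5`, the seed, and the function name `relative_frequencies` are assumptions chosen for illustration.

```python
import random


def relative_frequencies(n_experiments, p_event=0.5, seed=1):
    """Return the running relative frequency of event A after each experiment."""
    rng = random.Random(seed)
    hits = 0
    running = []
    for n in range(1, n_experiments + 1):
        # One experiment: event A occurs with probability p_event (assumed).
        if rng.random() < p_event:
            hits += 1
        # Relative frequency = outcomes consistent with A / number of experiments.
        running.append(hits / n)
    return running


freqs = relative_frequencies(100)
for n in (1, 10, 50, 100):
    print(n, round(freqs[n - 1], 3))
```

The relative frequency fluctuates for small n and settles closer to `p_event` as n grows, which is the sense in which the probability is a limiting relative frequency.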


Frequentists (2)

The definition restricts the things we can assign probabilities to:

What is the probability of there being life on Mars 100 billion years ago?

We assume that there is an unknown but fixed underlying parameter, θ, for a population (e.g., the mean height of Danish men). Random variation (environmental factors, measurement errors, ...) means that each observation does not result in the true value.


The meta-experiment idea

Frequentists think of meta-experiments and consider the current dataset as a single realization from all possible datasets.



167.2 cm, 175.5 cm, 187.7 cm, 182.0 cm


Confidence intervals

Thus a frequentist believes that a population mean is real, but unknown, and unknowable, and can only be estimated from the data. Knowing the distribution for the sample mean, he constructs a confidence interval, centered at the sample mean.

• Either the true mean is in the interval or it is not. We can’t say there’s a 95% probability (long-run fraction having this characteristic) that the true mean is in this interval, because it’s either already in, or it’s not.

• Reason: the true mean is a fixed value, which doesn’t have a distribution.

• The sample mean does have a distribution! Thus we must use statements like “95% of similar intervals would contain the true mean, if each interval were constructed from a different random sample like this one.”
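The "95% of similar intervals" statement can be checked by simulating the meta-experiment. The sketch below is an illustration under stated assumptions, not the lecturer's code: heights are drawn from a normal distribution with a hypothetical true mean of 180 cm and a standard deviation of 7 cm that is treated as known, so the interval uses the normal quantile 1.96.

```python
import math
import random


def coverage(true_mean=180.0, sigma=7.0, n=25, n_datasets=10000, seed=1):
    """Fraction of 95% confidence intervals that contain the true mean."""
    rng = random.Random(seed)
    # Half-width of the interval; sigma is treated as known (assumption).
    half_width = 1.96 * sigma / math.sqrt(n)
    covered = 0
    for _ in range(n_datasets):
        # One realization of the dataset from all possible datasets.
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        sample_mean = sum(sample) / n
        # The interval is random; the true mean is fixed.
        if sample_mean - half_width <= true_mean <= sample_mean + half_width:
            covered += 1
    return covered / n_datasets


print(coverage())
```

Each single interval either contains 180 or it does not; only the long-run fraction over many datasets is close to 0.95.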


Maximum likelihood

How will the frequentist estimate the parameter?
Answer: maximum likelihood.

Basic idea

Our best estimates of the parameter(s) are the values that make our observed data most likely. We know what we have observed so far (our data). Our best “guess” would therefore be to select parameters that make our observations most likely.

Binomial distribution:

$$P(Y = y) = \binom{n}{y}\, p^{y} (1 - p)^{n - y}$$
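For the binomial distribution the idea can be sketched numerically: evaluate the log-likelihood over a grid of candidate values of p and keep the one that makes the observed data most likely. The data (y = 7 successes in n = 10 trials) and the function names are hypothetical and chosen only for illustration.

```python
import math


def binomial_log_likelihood(p, y, n):
    """Log of P(Y = y) for a binomial with n trials and success probability p."""
    return (math.log(math.comb(n, y))
            + y * math.log(p)
            + (n - y) * math.log(1 - p))


def grid_mle(y, n, steps=999):
    """Maximise the log-likelihood over p = 0.001, 0.002, ..., 0.999."""
    grid = [(i + 1) / (steps + 1) for i in range(steps)]
    return max(grid, key=lambda p: binomial_log_likelihood(p, y, n))


y, n = 7, 10  # hypothetical data: 7 successes in 10 trials
p_hat = grid_mle(y, n)
print(p_hat, y / n)
```

The grid maximum agrees with the closed-form maximum likelihood estimate y / n, which follows from setting the derivative of the log-likelihood with respect to p to zero.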


Bayesians

Each investigator is entitled to his/her personal belief ... the prior information. No fixed values for parameters but a distribution.

All distributions are subjective. Yours is as good as mine.

We can still talk about the mean, but it is the mean of my distribution.

In many cases one tries to circumvent this by using vague priors.

Thumb tack, pin pointing down:

[Figure: prior distribution (vertical axis, 0.0 to 3.0) against Theta (horizontal axis, 0.0 to 1.0).]


Credibility intervals

Bayesians have an altogether different world-view.

They say that only the data are real. The population mean is an abstraction, and as such some values are more believable than others based on the data and their prior beliefs.

The Bayesian constructs a credibility interval, centered near the sample mean, but tempered by “prior” beliefs concerning the mean.

Now the Bayesian can say what the frequentist cannot: “There is a 95% probability (degree of believability) that this interval contains the mean.”
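A credibility interval can be sketched with a grid approximation of Bayes' theorem for a thumb tack experiment. Everything specific here is an assumption for illustration: the data (7 pin-down outcomes in 10 tosses), the vague prior proportional to θ(1 − θ), and the function names. The sum over the grid stands in for the integral in the denominator.

```python
def grid_posterior(y, n, prior, steps=1000):
    """Posterior P(theta | data) on a grid, normalised as in Bayes' theorem."""
    # Midpoints of `steps` equal-width cells covering (0, 1).
    thetas = [(i + 0.5) / steps for i in range(steps)]
    # Numerator P(D | theta) * P(theta); the binomial coefficient cancels.
    unnorm = [t**y * (1 - t)**(n - y) * prior(t) for t in thetas]
    total = sum(unnorm)  # stands in for the integral in the denominator
    return thetas, [u / total for u in unnorm]


def credible_interval(thetas, probs, level=0.95):
    """Equal-tailed interval: cut (1 - level) / 2 of the mass from each tail."""
    tail = (1 - level) / 2
    cumulative = 0.0
    lower = upper = None
    for t, p in zip(thetas, probs):
        cumulative += p
        if lower is None and cumulative >= tail:
            lower = t
        if upper is None and cumulative >= 1 - tail:
            upper = t
    return lower, upper


def vague_prior(theta):
    """Hypothetical vague prior, proportional to theta * (1 - theta)."""
    return theta * (1 - theta)


thetas, probs = grid_posterior(7, 10, vague_prior)  # hypothetical data
posterior_mean = sum(t * p for t, p in zip(thetas, probs))
lower, upper = credible_interval(thetas, probs)
print(round(posterior_mean, 3), (lower, upper))
```

The posterior mean falls between the prior mean (0.5) and the sample proportion (0.7), which is the "tempering" by prior beliefs, and the interval is a direct probability statement about θ given the data and this prior.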


Comparison

             Advantages                Disadvantages

Frequentist  Objective                 Confidence intervals
                                       (not quite the desired)
             Calculations

Bayesian     Credibility intervals     Subjective
             (usually the desired)
             Complex models            Calculations


In summary

• A frequentist is a person whose long-run ambition is to be wrong 5% of the time.

• A Bayesian is one who, vaguely expecting a horse, and catching a glimpse of a donkey, strongly believes he has seen a mule.

A frequentist uses impeccable logic to answer the wrong question, while a Bayesian answers the right question by making assumptions that nobody can fully believe in.

P. G. Hamer


Jury duty


Example: speed of light

What is the speed of light in vacuum “really”?

Results (m/s)

299792459.2

299792460.0

299792456.3

299792458.1

299792459.5
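One way a frequentist might summarise these five results is a sample mean with a 95% confidence interval. This is a sketch under assumptions the slide does not state: the measurements are treated as independent with normally distributed errors, and the critical value 2.776 is the two-sided 95% quantile of the t distribution with 4 degrees of freedom, taken from a standard table.

```python
import math
import statistics

# Measurements from the slide (m/s).
results = [299792459.2, 299792460.0, 299792456.3, 299792458.1, 299792459.5]

n = len(results)
mean = statistics.mean(results)
# Standard error of the mean, using the sample standard deviation.
std_err = statistics.stdev(results) / math.sqrt(n)

# Two-sided 95% critical value of the t distribution with n - 1 = 4
# degrees of freedom (assumes independent, normally distributed errors).
t_crit = 2.776

ci = (mean - t_crit * std_err, mean + t_crit * std_err)
print(f"mean = {mean:.2f} m/s")
print(f"95% CI = ({ci[0]:.2f}, {ci[1]:.2f}) m/s")
```

The interval describes the procedure's long-run behaviour over repeated sets of five measurements; a Bayesian analysis would instead combine the same data with a prior for the speed of light.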

