Mathematical Ideas that Shaped the World Bayesian Statistics.
Mathematical Ideas that Shaped the World
Bayesian Statistics
Plan for this class
Why is our intuition about probability so bad?
What is the chance that two people in this room were born a few days apart?
What is conditional probability?
If someone’s DNA is found at a crime scene, what is the chance they are guilty?
How can we spot bad statistics in the media?
An unfortunate truth
Humans have an extraordinarily bad intuition about probability.
Winning the lottery
What do you think your chances of winning the lottery are?
Say whether winning the lottery is more or less likely than each of the following events…
Is winning the lottery more or less likely?
Chance of getting 12 heads in a row when flipping a fair coin.
LESS MORE
1 in 4,096
Is winning the lottery more or less likely?
Dying from a road accident in 1 year
LESS MORE
1 in 24,000
Is winning the lottery more or less likely?
Dying in the next flight you take
LESS MORE
1 in 25 million
Is winning the lottery more or less likely?
Being struck by lightning
LESS MORE
1 in 1 million
Is winning the lottery more or less likely?
Dying from a shark attack
LESS MORE
1 in 300 million
Is winning the lottery more or less likely?
Dying in the next hour from any causes whatsoever
LESS MORE
1 in 2 million
Conclusion
Winning the lottery has surprisingly bad odds: 1 in 13,983,816.
Yet many people are convinced that it is likely to happen to them one day.
We mix up the probability of someone winning the lottery (which is quite likely) with the probability of us winning the lottery.
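The two exact figures in this quiz can be checked with a few lines of Python. This is a minimal sketch: the slides do not state the lottery format, so the 6-from-49 draw below is an assumption, chosen because it reproduces the 1 in 13,983,816 figure.

```python
from math import comb

# Assumed lottery format: pick 6 numbers out of 49, order does not matter.
tickets = comb(49, 6)
print(f"Lottery jackpot: 1 in {tickets:,}")

# Twelve heads in a row with a fair coin: each flip halves the chance.
heads_run = 2 ** 12
print(f"12 heads in a row: 1 in {heads_run:,}")
```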
The birthday problem
How many people need to be in a room together so that there is a more than 50% chance of two people having the same birthday?
A) 300
B) 183
C) 91
D) 23
Number of people    Probability that 2 people share a birthday
10                  11.7%
20                  41.1%
23                  50.7%
30                  70.6%
50                  97%
57                  99%
100                 99.99997%
200                 99.9999999999999999999999999998%
366                 100%
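The entries in the table can be reproduced by multiplying together the chances that each new person misses all the birthdays already taken. The sketch below assumes 365 equally likely birthdays and ignores leap years; the function name is just illustrative.

```python
def shared_birthday_probability(people: int, days: int = 365) -> float:
    """Chance that at least two of `people` share a birthday."""
    if people > days:
        return 1.0  # more people than days: a shared birthday is certain
    p_all_different = 1.0
    for already_taken in range(people):
        # The next person must avoid every birthday taken so far.
        p_all_different *= (days - already_taken) / days
    return 1.0 - p_all_different

for n in (10, 20, 23, 30, 50, 57):
    print(n, f"{shared_birthday_probability(n):.1%}")
```

Ordinary floating-point numbers cannot show the long string of nines for 200 people; exact fractions would be needed for that row.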
The birthday graph
In this room?
What is the chance that two people in this room have birthdays less than 3 days apart (ignoring the year)?
Answer: more than 50%
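The answer depends on how many people are in the room, which the slides do not say. As a rough illustration, the simulation below estimates the chance for a hypothetical group of 11 people, assuming 365 equally likely birthdays and treating 31 December and 1 January as neighbours. "Less than 3 days apart" is taken to mean a gap of at most 2 days.

```python
import random

def near_birthday_probability(people, max_gap=2, days=365, trials=20_000, seed=1):
    """Estimate the chance that some pair has birthdays at most `max_gap` days apart."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        birthdays = sorted(rng.randrange(days) for _ in range(people))
        # Gaps between neighbours in sorted order, plus the wrap-around gap
        # from the last birthday of the year back to the first.
        gaps = [b - a for a, b in zip(birthdays, birthdays[1:])]
        gaps.append(birthdays[0] + days - birthdays[-1])
        if min(gaps) <= max_gap:
            hits += 1
    return hits / trials

# 11 is a hypothetical group size, not a figure from the slides.
print(near_birthday_probability(11))
```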
Monty Hall
Behind 1 door is a sheep. Behind the other 2 doors are other, non-sheepy, animals.
You choose a door. I open a different door showing a non-sheep.
Given the choice now of sticking with your choice or switching, what should you do?
Suppose you choose Door 1…
Door 1        Door 2        Door 3        Stick       Switch
Sheep!        Not a sheep   Not a sheep   Sheep!      No sheep
Not a sheep   Sheep!        Not a sheep   No sheep    Sheep!
Not a sheep   Not a sheep   Sheep!        No sheep    Sheep!
If you stick with your choice, you only win 1 time out of 3.
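The same three cases can be enumerated in code. This is a sketch under the usual assumptions: the sheep is equally likely to be behind each door, the host always opens a non-sheep door you did not pick, and chooses at random when two such doors are available.

```python
from fractions import Fraction

doors = (1, 2, 3)
choice = 1  # as on the slide: you pick Door 1

p_stick = Fraction(0)
p_switch = Fraction(0)
for sheep in doors:
    # Doors the host is allowed to open: not your pick, not the sheep.
    openable = [d for d in doors if d != choice and d != sheep]
    for opened in openable:
        weight = Fraction(1, len(doors)) / len(openable)
        # Switching means taking the one door that is neither picked nor opened.
        switched = next(d for d in doors if d not in (choice, opened))
        p_stick += weight * (choice == sheep)
        p_switch += weight * (switched == sheep)

print("stick:", p_stick, "switch:", p_switch)
```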
Conditional probability
Conditional probability is the chance of something happening given that another event has already happened.
For example: you throw two dice. What is the probability of the first die being a 6 given that the sum of the two dice is 8?
What if the sum of the two dice was 6 or 7?
How to think about conditional probability
Conditional probability is all about updating your odds in light of new evidence.
There are a priori odds – the initial probability of an event. E.g. the probability of rolling a 6 is a priori 1 in 6.
After new evidence, you have a posteriori odds. E.g. the probability of the first die being a 6, given that the sum of two dice is 8, is 1 in 5.
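One way to check a conditional probability like this is to list every outcome, keep only those consistent with the evidence, and count. The sketch below does that for two fair dice; the helper name is just illustrative.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely (first die, second die) pairs.
rolls = list(product(range(1, 7), repeat=2))

def p_first_is_six_given_sum(total):
    """P(first die is 6 | the two dice sum to `total`)."""
    matching = [r for r in rolls if sum(r) == total]
    favourable = [r for r in matching if r[0] == 6]
    return Fraction(len(favourable), len(matching))

for total in (8, 7, 6):
    print("sum", total, "->", p_first_is_six_given_sum(total))
```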
Boy or girl?
I know a friend who has 2 children. At least one of the children is a boy.
What is the chance that the other child is also a boy?
Answer: 1 in 3
Explanation
A priori, there are 4 possible combinations of children:
Boy – Boy
Boy – Girl
Girl – Boy
Girl – Girl
From our new evidence, we know that Girl-Girl is not possible, leaving only 3 options.
Of these 3 options, only one of them is Boy-Boy.
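The counting argument can be written out directly. This sketch assumes each child is equally likely to be a boy or a girl, independently of the other.

```python
from fractions import Fraction
from itertools import product

# Each family is an (older, younger) pair; all 4 are equally likely.
families = list(product("BG", repeat=2))

# New evidence: at least one child is a boy, so Girl-Girl is ruled out.
at_least_one_boy = [f for f in families if "B" in f]
both_boys = [f for f in at_least_one_boy if f == ("B", "B")]

p_other_is_boy = Fraction(len(both_boys), len(at_least_one_boy))
print(p_other_is_boy)
```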
A paradox?
If you know that the older child is a boy, the probability of the other child being a boy is 50%.
If you know that the younger child is a boy, the probability of the other child being a boy is 50%.
Surely the first boy must be either the youngest or the oldest?!
Homework
I know a friend who has two children. At least one of the children is a boy who was born on a Tuesday.
What is the chance that the other child is also a boy?
Confusion of the inverse
People have a tendency to assume that a conditional probability and its inverse are similar. For example:
If sheep enjoy eating grass, then an animal who likes grass is likely to be a sheep.
If most accidents happen within 20 miles of home, then you are safest when you are far from home.
Manipulating statistics
A. Taillandier (1828) found that 67% of prisoners were illiterate.
“What stronger proof could there be that ignorance, like idleness, is the mother of all vices?”
But what proportion of illiterate people were criminals?
Bayesian statistics
The first person we know of who looked seriously into conditional probabilities was Thomas Bayes.
He was the first person to write down a formula connecting the two inverse conditional probabilities.
Bayesian statistics is all about updating the odds of an event after receiving new evidence.
Thomas Bayes (1702 – 1761)
Son of a London Presbyterian minister.
Studied logic and theology at the University of Edinburgh.
In 1722 returned to London to assist his father before becoming a minister of his own church in Tunbridge Wells, Kent, in 1733.
Thomas Bayes (1702 – 1761)
During his lifetime, Bayes only published two papers.
One was on “Divine Benevolence”.
The other was a defence of “The Doctrine of Fluxions” against the attack of George Berkeley.
His most famous paper was published posthumously in 1764, called “An Essay towards solving a problem in the Doctrine of Chances”.
Bayes’ Theorem
P(A|B) = P(B|A) × P(A) / P(B)
P(A) is the prior probability of A.
P(B) is the prior probability of B.
P(A|B) is the probability of A happening, given that B has happened.
P(B|A) is the probability of B happening, given that A has happened.
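As a quick sanity check, the theorem in its standard form, P(A|B) = P(B|A) × P(A) / P(B), can be wrapped in a small function and tried on the two-dice example from earlier. The function name and argument names are illustrative only.

```python
def bayes(p_b_given_a: float, p_a: float, p_b: float) -> float:
    """Return P(A|B) computed from P(B|A), P(A) and P(B)."""
    if p_b <= 0:
        raise ValueError("P(B) must be positive")
    return p_b_given_a * p_a / p_b

# Two dice: A = "first die is a 6", B = "the sum is 8".
p_a = 1 / 6          # prior chance of a 6 on the first die
p_b = 5 / 36         # five of the 36 rolls sum to 8
p_b_given_a = 1 / 6  # given a 6 first, the second die must be a 2

posterior = bayes(p_b_given_a, p_a, p_b)
print(posterior)  # approximately 0.2, i.e. 1 in 5
```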
Importance of Bayes’ Theorem
Bayes’ Theorem is especially useful in medicine and in law.
Most doctors get the following question wrong. Let’s see what you think!
A test for breast cancer
1% of women aged 40 will get breast cancer.
Out of the women who have breast cancer, 80% of them will have a positive test result.
Out of the women who don’t have breast cancer, 10% of them will get a positive result.
If a woman tests positive for breast cancer, what is the chance she actually has it?
Doing the numbers
Consider 10,000 women.
100 of them will have breast cancer.
  80 of them test positive
  20 of them test negative
9,900 of them don’t have breast cancer.
  990 of them test positive
  8,910 of them test negative
In total there are (80 + 990) = 1,070 positive results, of which only 80 have cancer.
That’s about 7.5%.
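The same calculation can be done with probabilities instead of head counts. This sketch uses only the three figures given for the test; the variable names are illustrative.

```python
prevalence = 0.01       # P(cancer)
sensitivity = 0.80      # P(positive | cancer)
false_positive = 0.10   # P(positive | no cancer)

# Total chance of a positive result, from both groups of women.
p_positive = sensitivity * prevalence + false_positive * (1 - prevalence)

# Bayes' Theorem: P(cancer | positive).
p_cancer_given_positive = sensitivity * prevalence / p_positive
print(f"{p_cancer_given_positive:.1%}")
```

The result equals 80 out of 1,070, which is roughly 7.5%.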
The prosecutor’s fallacy
Suppose a prosecutor in a court case finds a piece of evidence – e.g. a DNA sample.
They argue that the probability of finding this evidence if the defendant were innocent is tiny.
Therefore the defendant is very unlikely to be innocent.
Where is the fallacy in this argument?
The prosecutor’s fallacy
If the a priori chance of the defendant’s guilt is very low, then it can still be low after presentation of this evidence.
Just like with the cancer example, a false positive may be much more likely than a true positive in the absence of other evidence.
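To see how this can play out, here is a sketch with entirely hypothetical numbers: 10 million people who could have left the sample, a 1 in 1 million chance that an innocent person matches, and no other evidence pointing at the defendant. None of these figures come from a real case.

```python
from fractions import Fraction

# Hypothetical figures, chosen only for illustration.
population = 10_000_000                     # people who could have left the sample
match_if_innocent = Fraction(1, 1_000_000)  # chance an innocent person matches
match_if_guilty = Fraction(1)               # the guilty person always matches

# With no other evidence, everyone is equally likely to be guilty a priori.
prior_guilt = Fraction(1, population)

p_match = match_if_guilty * prior_guilt + match_if_innocent * (1 - prior_guilt)
posterior_guilt = match_if_guilty * prior_guilt / p_match
print(float(posterior_guilt))  # roughly 0.09
```

Under these assumptions about 10 innocent people would also match, so the match on its own leaves the defendant's guilt at about 1 in 11, even though the chance of an innocent match is one in a million.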
Exhibit 1: Sally Clark, 1999
Convicted of murdering both her sons.
Paediatrician Roy Meadow argued that the chance of both children dying naturally was 1 in 73 million.
This didn’t take into account that a double murder would have been even more unlikely.
Conviction overturned in 2003.
Exhibit 2: Denis Adams, 1996
Convicted of rape based on DNA found at the scene of the crime.
Probability of a match said to be 1 in 20 million.
There was no other evidence to convict: the victim did not identify Adams in a line-up and Adams had an alibi.
The defence team instructed the jury in the use of Bayes’ Theorem. The judge questioned its appropriateness.
After 2 appeals, Adams’ conviction still stands.
A rule against Bayes
In 2010 a convicted killer known as “T” appealed against his conviction.
Part of the evidence was based on the special markings on his Nike trainers.
The data on how many pairs of such trainers existed was unreliable.
It has now been ruled that Bayes’ Theorem is not allowed in court unless the underlying statistics are “firm”.
Quotes about statistics
“98% of all statistics are made up.”
“The average human has one breast and one testicle.”
“Statistics are like bikinis. What they reveal is suggestive, but what they conceal is vital.”
“There are three kinds of lies: lies, damned lies, and statistics.”
Misuse of statistics
We are going to look at some examples of bad statistics in the media.
What things should we look out for to spot bad maths and stats?
Strange patterns
Matt Parker, of Queen Mary University of London, looked at 800 ancient sites.
3 sites, around Birmingham, formed a perfect equilateral triangle.
Extending the base of this triangle links up 2 more sites, more than 150 miles apart, with an accuracy of 0.05%.
Ancient sites?
What to watch out for
Events assumed to be independent (e.g. ‘6 double yolks’ article).
Patterns found using large amounts of data (e.g. ‘ancient sat-nav’ article)
Other factors not taken into account (e.g. ‘perfect whist deal’ article)
Confusion of the inverse
Omission of relevant data
Misleading labelling of graphs
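To get a feel for why patterns turn up in large data sets, it helps to count how many candidate patterns there are. The sketch below counts the possible triangles and straight lines among the 800 sites from the "strange patterns" example; it says nothing about how many of them would look striking, only how many chances there are for a coincidence.

```python
from math import comb

sites = 800
triangles = comb(sites, 3)  # every set of 3 sites is a candidate triangle
lines = comb(sites, 2)      # every pair of sites is a candidate base line

print(f"{triangles:,} possible triangles, {lines:,} possible lines")
```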
Lessons to take home
Don’t play the lottery.
Think very carefully when you are asked a question about probability.
Don’t confuse conditional probabilities with their inverses.
Ask questions whenever you see statistics in the media! (And write in to report bad journalism!)