Lecture 6
Bootstraps
Maximum Likelihood Methods
Bootstrapping
A way to generate empirical probability distributions
Very handy for making estimates of uncertainty

100 realizations of a normal distribution p(y) with ȳ = 50, σ_y = 100
What is the distribution of ȳ_est = N⁻¹ Σᵢ yᵢ ?
We know this should be a normal distribution with expectation ȳ = 50 and variance σ_y²/N, i.e. a standard deviation of σ_y/√N = 10
[Figure: p(y) vs. y, and the much narrower p(ȳ_est) vs. ȳ_est]
Here’s an empirical way of determining the distribution
called
bootstrapping
N original data: y₁, y₂, y₃, y₄, y₅, y₆, y₇, …, y_N
Random integers in the range 1–N: 4, 3, 7, 11, 4, 1, 9, …, 6
N resampled data: y′₁, y′₂, y′₃, y′₄, y′₅, y′₆, y′₇, …, y′_N
Compute the estimate N⁻¹ Σᵢ y′ᵢ
Now repeat a gazillion times and examine the resulting distribution of estimates
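The procedure above can be sketched in a few lines of code. The lecture's own code is MatLab; this is an illustrative Python/NumPy version, with the data size, distribution parameters, and number of realizations chosen for illustration rather than taken from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 100
y = rng.normal(50.0, 100.0, N)        # original data (mean 50, sigma 100, as in the slides)

nreal = 10**4                         # "a gazillion" resampling realizations
estimates = np.empty(nreal)
for k in range(nreal):
    idx = rng.integers(0, N, N)       # N random integers (NumPy indexes 0..N-1, not 1..N)
    yprime = y[idx]                   # N resampled data: sampling with replacement
    estimates[k] = yprime.mean()      # compute the estimate (1/N) * sum of y'_i

# the spread of `estimates` approximates the sampling distribution of the mean
print(estimates.mean(), estimates.std())
```

The bootstrap spread of the mean should come out near σ_y/√N.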
Note that we are doing
random sampling with replacement
of the original dataset y
to create a new dataset y’
Note: the same datum, yi, may appear several times in the new dataset, y’
Pot of an infinite number of y's with distribution p(y); cup of N y's drawn from the pot. Does a cup drawn from the pot capture the statistical behavior of what's in the pot?
More or less the same thing in the 2 pots? Take 1 cup from the pot with distribution p(y), duplicate the cup an infinite number of times, and pour into a new pot.
Random sampling is easy to code in MatLab:

yprime = y(unidrnd(N,N,1));

Here unidrnd(N,N,1) is a vector of N random integers between 1 and N; y is the original data and yprime the resampled data.
The theoretical and bootstrap results match pretty well!
[Figure: theoretical curve overlaid on a bootstrap with 10⁵ realizations]
Obviously, bootstrapping is of limited utility when we know the theoretical distribution (as in the previous example)
but it can be very useful when we don't
for example, what's the distribution of σ_est
where (σ_est)² = 1/(N−1) Σᵢ (yᵢ − ȳ_est)²
and ȳ_est = (1/N) Σᵢ yᵢ
(Yes, I know a statistician would know it follows Student's t-distribution …)
To do the bootstrap we calculate
ȳ′_est = (1/N) Σᵢ y′ᵢ
(σ′_est)² = 1/(N−1) Σᵢ (y′ᵢ − ȳ′_est)²
and σ′_est = √( (σ′_est)² )
many times – say 10⁵ times
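A hedged Python sketch of this recipe (the lecture's code is MatLab; the seed and synthetic sample below are illustrative): resample, compute ȳ′_est and σ′_est, and repeat many times.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 100
y = rng.normal(50.0, 100.0, N)       # synthetic original data; true sigma = 100

nreal = 10**5                         # "many times -- say 10^5 times"
sig = np.empty(nreal)
for k in range(nreal):
    yp = y[rng.integers(0, N, N)]                        # resample with replacement
    ybar = yp.mean()                                     # ybar'_est = (1/N) sum y'_i
    sig[k] = np.sqrt(np.sum((yp - ybar)**2) / (N - 1))   # sigma'_est

print(sig.mean(), sig.std())             # center and spread of the empirical distribution
print(np.percentile(sig, [2.5, 97.5]))   # one common way to read off a 95% interval
```

The percentile step at the end is one standard way to turn the empirical distribution into a confidence interval.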
Here's the bootstrap result (10⁵ realizations): [Figure: p(σ_est) vs. σ_est, with σ_true marked]
I numerically calculate an expected value of 92.8 and a variance of 6.2.
Note that the distribution is not quite centered about the true value of 100. This is random variation: the original N = 100 data are not quite representative of an infinite ensemble of normally distributed values.
So we would be justified in saying σ ≈ 92.6 ± 12.4, that is, ±2×6.2, the 95% confidence interval
The Maximum Likelihood Method
A way to fit parameterized probability distributions to data
Very handy when you have good reason to believe the data follow a particular distribution
Likelihood Function, L
The logarithm of the probable-ness of a given dataset
N data y are all drawn from the same distribution p(y);
the probable-ness of a single measurement yᵢ is p(yᵢ).
So the probable-ness of the whole dataset is
p(y₁) p(y₂) … p(y_N) = Πᵢ p(yᵢ)
L = ln Πᵢ p(yᵢ) = Σᵢ ln p(yᵢ)
Now imagine that the distribution p(y) is known up to a vector m of unknown parameters.
Write p(y; m), with the semicolon as a reminder that it's not a joint probability.
Then L is a function of m:
L(m) = Σᵢ ln p(yᵢ; m)
The Principle of Maximum Likelihood
Choose m so that it maximizes L(m):
∂L/∂mᵢ = 0
The dataset that was in fact observed is the most probable one that could have been observed.
Example – normal distribution of unknown mean ȳ and variance σ²
p(yᵢ) = (2π)^(−1/2) σ⁻¹ exp{ −½ σ⁻² (yᵢ − ȳ)² }
L = Σᵢ ln p(yᵢ) = −½ N ln(2π) − N ln(σ) − ½ σ⁻² Σᵢ (yᵢ − ȳ)²
∂L/∂ȳ = 0 = σ⁻² Σᵢ (yᵢ − ȳ)
∂L/∂σ = 0 = −N σ⁻¹ + σ⁻³ Σᵢ (yᵢ − ȳ)²
(the N's arise because the sum is from 1 to N)
Solving for ȳ and σ:
0 = σ⁻² Σᵢ (yᵢ − ȳ) ⇒ ȳ = N⁻¹ Σᵢ yᵢ
0 = −N σ⁻¹ + σ⁻³ Σᵢ (yᵢ − ȳ)² ⇒ σ² = N⁻¹ Σᵢ (yᵢ − ȳ)²
Interpreting the results:
ȳ = N⁻¹ Σᵢ yᵢ
σ² = N⁻¹ Σᵢ (yᵢ − ȳ)²
The sample mean is the maximum likelihood estimate of the expected value of the normal distribution.
The sample variance (more-or-less*) is the maximum likelihood estimate of the variance of the normal distribution.
*issue of N vs. N−1 in the formula
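As a cross-check on these formulas (not part of the lecture), one can evaluate L(ȳ, σ) for normal data on a grid and confirm that the maximum lands at the sample mean and the N-denominator sample standard deviation. The data, seed, and grid bounds below are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(2)
y = rng.normal(50.0, 100.0, 100)    # synthetic data
N = len(y)

def L(ybar, sigma):
    # L = -N/2 ln(2 pi) - N ln(sigma) - (1/2) sigma^-2 sum_i (y_i - ybar)^2
    return (-0.5 * N * np.log(2 * np.pi) - N * np.log(sigma)
            - 0.5 * np.sum((y - ybar)**2) / sigma**2)

ybars = np.linspace(y.mean() - 5.0, y.mean() + 5.0, 201)
sigmas = np.linspace(70.0, 130.0, 241)
grid = np.array([[L(m, s) for s in sigmas] for m in ybars])
i, j = np.unravel_index(grid.argmax(), grid.shape)

ybar_ml, sigma_ml = ybars[i], sigmas[j]
print(ybar_ml, y.mean())                              # grid maximum vs. closed form
print(sigma_ml, np.sqrt(np.mean((y - y.mean())**2)))  # note N, not N-1, in the denominator
```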
Example – 100 data drawn from a normal distribution with ȳ_true = 50, σ_true = 100
[Figure: the likelihood surface L(ȳ, σ), with its maximum at ȳ = 62, σ = 107]
Another example – the exponential distribution
p(yᵢ) = ½ σ⁻¹ exp{ −σ⁻¹ |yᵢ − ȳ| }
Check the normalization, using z = yᵢ − ȳ:
∫ p(yᵢ) dyᵢ = ½ σ⁻¹ ∫_{−∞}^{+∞} exp{ −σ⁻¹ |yᵢ − ȳ| } dyᵢ
 = ½ σ⁻¹ · 2 ∫₀^{+∞} exp{ −σ⁻¹ z } dz
 = σ⁻¹ (−σ) exp{ −σ⁻¹ z } |₀^{+∞} = 1
Is the parameter ȳ really the expectation? Is the parameter σ² really the variance?
Is ȳ the expectation?
E(yᵢ) = ∫_{−∞}^{+∞} yᵢ ½ σ⁻¹ exp{ −σ⁻¹ |yᵢ − ȳ| } dyᵢ
Use z = yᵢ − ȳ:
E(yᵢ) = ½ σ⁻¹ ∫_{−∞}^{+∞} (z + ȳ) exp{ −σ⁻¹ |z| } dz
 = ½ σ⁻¹ · 2 ȳ ∫₀^{+∞} exp{ −σ⁻¹ z } dz    (z exp{−σ⁻¹|z|} is an odd function times an even function, so its integral is zero)
 = −ȳ exp{ −σ⁻¹ z } |₀^{+∞} = ȳ
YES!
Is σ² the variance?
var(yᵢ) = ∫_{−∞}^{+∞} (yᵢ − ȳ)² ½ σ⁻¹ exp{ −σ⁻¹ |yᵢ − ȳ| } dyᵢ
Use z = σ⁻¹ (yᵢ − ȳ):
var(yᵢ) = ½ σ⁻¹ ∫_{−∞}^{+∞} σ³ z² exp{ −|z| } dz
 = σ² ∫₀^{+∞} z² exp{ −z } dz
 = 2 σ²    (the CRC Math Handbook gives this integral as equal to 2)
Not quite … the variance is 2σ², not σ².
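The "not quite" is easy to confirm numerically. NumPy's `laplace` sampler draws from exactly this two-sided exponential density, with its `scale` parameter playing the role of σ, so the sample variance should come out near 2σ² rather than σ² (σ = 2 below is an arbitrary choice).

```python
import numpy as np

rng = np.random.default_rng(3)
sigma = 2.0
# p(y) = (1/2) sigma^-1 exp{ -sigma^-1 |y| }; NumPy calls sigma the "scale"
z = rng.laplace(loc=0.0, scale=sigma, size=10**6)

print(z.var())   # near 2 * sigma**2 = 8.0, not sigma**2 = 4.0
```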
Maximum likelihood estimate:
L = N ln(½) − N ln(σ) − σ⁻¹ Σᵢ |yᵢ − ȳ|
∂L/∂ȳ = 0 = σ⁻¹ Σᵢ sgn(yᵢ − ȳ)
∂L/∂σ = 0 = −N σ⁻¹ + σ⁻² Σᵢ |yᵢ − ȳ|
So ȳ is such that Σᵢ sgn(yᵢ − ȳ) = 0.
[Figure: |x| vs. x, and its derivative d|x|/dx, which is +1 for x > 0 and −1 for x < 0]
The sum is zero when half the yᵢ's are bigger than ȳ and half of them smaller: ȳ is the median of the yᵢ's.
Once ȳ is known, then
∂L/∂σ = 0 = −N σ⁻¹ + σ⁻² Σᵢ |yᵢ − ȳ|
σ = N⁻¹ Σᵢ |yᵢ − ȳ| with ȳ = median(y)
Note that when N is even, y is not unique,
but can be anything between the two middle values in a sorted list of yi’s
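Put together, the two estimators are one line each; a minimal Python sketch on synthetic two-sided exponential data (the true values 10 and 3, the seed, and the sample size are arbitrary choices for illustration):

```python
import numpy as np

rng = np.random.default_rng(4)
ybar_true, sigma_true = 10.0, 3.0
y = rng.laplace(ybar_true, sigma_true, 10**5)   # synthetic two-sided exponential data

ybar_ml = np.median(y)                     # solves sum_i sgn(y_i - ybar) = 0
sigma_ml = np.mean(np.abs(y - ybar_ml))    # sigma = (1/N) sum_i |y_i - ybar|

print(ybar_ml, sigma_ml)   # near 10 and 3
```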
Comparison
Normal distribution:
best estimate of expected value is sample mean
Exponential distribution
best estimate of expected value is sample median
Comparison
Normal distribution: short-tailed; outliers are extremely uncommon; the expected value should be chosen to make outliers have as small a deviation as possible.
Exponential distribution: relatively long-tailed; outliers are relatively common; the expected value should ignore the actual value of outliers.
[Figure: two datasets yᵢ with their median and mean marked; in one, an outlier drags the mean but not the median]
Another important distribution: the Gutenberg–Richter distribution (e.g. earthquake magnitudes)
For earthquakes greater than some threshold magnitude m₀, the probability that an earthquake will have a magnitude greater than m is
P(m) = 10^{−b(m−m₀)}
or P(m) = exp{ −log(10) b (m−m₀) } = exp{ −b′ (m−m₀) } with b′ = log(10) b
This is a cumulative distribution; thus the probability that the magnitude is greater than m₀ is unity:
P(m₀) = exp{ −b′ (m₀−m₀) } = exp{0} = 1
The probability density distribution is (minus) its derivative:
p(m) = b′ exp{ −b′ (m−m₀) }
The maximum likelihood estimate of b′ is
L(m) = N ln(b′) − b′ Σᵢ (mᵢ − m₀)
∂L/∂b′ = 0 = N/b′ − Σᵢ (mᵢ − m₀)
b′ = N / Σᵢ (mᵢ − m₀)
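Because p(m) is an exponential density in (m − m₀), the formula is easy to exercise on synthetic magnitudes; b = 1 (so b′ = ln 10 ≈ 2.30) and m₀ = 4 are illustrative choices, not from the slides.

```python
import numpy as np

rng = np.random.default_rng(5)
m0, b = 4.0, 1.0
bprime_true = np.log(10.0) * b                     # b' = log(10) b
# draw magnitudes from p(m) = b' exp{ -b'(m - m0) } for m >= m0
m = m0 + rng.exponential(1.0 / bprime_true, 10**5)

bprime_ml = len(m) / np.sum(m - m0)                # b' = N / sum_i (m_i - m0)
print(bprime_ml, bprime_ml / np.log(10.0))         # recovered b' and b
```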
Originally Gutenberg & Richter made a mistake … by estimating the slope b using least squares, and not the maximum likelihood formula.
[Figure: log₁₀ P(m) vs. magnitude m; the least-squares fit has slope −b]
Yet another important distribution: the Fisher distribution on a sphere (e.g. paleomagnetic directions)
Given unit vectors xᵢ that scatter around some mean direction x̄, the probability distribution for the angle θ between xᵢ and x̄ (that is, cos(θ) = xᵢ·x̄) is
p(θ) = κ sin(θ) exp{ κ cos(θ) } / (2 sinh(κ))
where κ is called the "precision parameter"
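A quick numerical check (Python; κ = 10 is an arbitrary choice) that this p(θ) integrates to 1 over 0 ≤ θ ≤ π:

```python
import numpy as np

kappa = 10.0                      # precision parameter (arbitrary for this check)
theta = np.linspace(0.0, np.pi, 20001)
p = kappa * np.sin(theta) * np.exp(kappa * np.cos(theta)) / (2.0 * np.sinh(kappa))

# trapezoid rule; the exact integral is 1 for any kappa > 0
total = np.sum((p[:-1] + p[1:]) * np.diff(theta) / 2.0)
print(total)   # near 1.0
```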
Rationale for the functional form: p(θ) ∝ exp{ κ cos(θ) }. For θ close to zero, cos(θ) ≈ 1 − ½θ², so
p(θ) ∝ exp{ κ cos(θ) } ≈ exp{κ} exp{ −½ κ θ² }
which is a Gaussian in θ.
I'll let you figure out the maximum likelihood estimates of the central direction, x̄, and the precision parameter, κ.