BIO5312 Biostatistics Lecture 5: Estimations

Yujin Chung

September 27th, 2016

Fall 2016


Recap


Today’s lecture and some following lectures

How to infer the properties of the underlying distribution in a dataset.

Two types of statistical inferences:

Estimation: concerned with estimating the values of specific population parameters. These specific values are referred to as point estimates. Sometimes, interval estimation is carried out to specify an interval which likely includes the parameter values.

Hypothesis testing: concerned with testing whether the value of a population parameter is equal to some specific value.


Point estimations

Let $X_1, \dots, X_n$ be a random sample from a probability distribution. That is, $X_1, \dots, X_n$ are independent and identically distributed (iid).

If $X_1, \dots, X_n \sim N(\mu, \sigma^2)$, what are the point estimates of $\mu$ and $\sigma^2$, respectively?

If $X_1, \dots, X_n \sim \text{Bernoulli}(p)$, what is the point estimate of $p$?

If $X_1, \dots, X_n \sim \text{Poisson}(\lambda)$, what is the point estimate of $\lambda$?


A point estimation of the population mean

Consider a random sample $X_1, \dots, X_n$ drawn from a distribution with mean $\mu = E(X)$ (unknown).

A natural estimator for the population mean $\mu$ is the sample mean:

$$\hat{\mu} = \widehat{E(X)} = \bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$$

If $X_1, \dots, X_n \sim N(\mu, \sigma^2)$, $\bar{X}$ is a point estimator of $E(X) = \mu$.

If $X_1, \dots, X_n \sim \text{Bernoulli}(p)$, $\bar{X}$ (the proportion of successes) is a point estimator of $E(X) = p$.

If $Y \sim B(n, p)$, $\widehat{E(Y)} = Y$ (the number of successes) is a point estimator of $E(Y) = np$. That is, $\hat{p} = Y/n = \bar{X}$, where $X_1, \dots, X_n \sim \text{Bernoulli}(p)$.

If $X_1, \dots, X_n \sim \text{Poisson}(\lambda)$, $\bar{X}$ is a point estimator of $E(X) = \lambda$.


Examples

Suppose a random sample of 5000 women is selected from this age group, of whom 28 are found to have malignant melanoma. What is the probability of having the disease (prevalence)?

Let $p$ be the probability of having the disease. Let the random variable $X_i$ represent the disease status for the $i$th woman, where $X_i = 1$ if the $i$th woman has the disease and 0 if she does not, for $i = 1, \dots, 5000$. Each $X_i$ is a Bernoulli trial. That is, $X_1, \dots, X_{5000} \sim \text{Bernoulli}(p)$. Then a point estimate of $E(X) = p$ is $\hat{p} = \bar{x} = 28/5000 = 0.0056$.


Properties of $\bar{X}$ (1)

An unbiased estimator of $\mu$: $E(\bar{X}) = \mu$

Proof:

$$E(\bar{X}) = E\left(\frac{1}{n}\sum_{i=1}^{n} X_i\right) = \frac{1}{n}\sum_{i=1}^{n} E(X_i) = \frac{1}{n}\sum_{i=1}^{n} \mu = \mu$$

We consider infinitely many sets of random samples of size $n$. From each sample, the sample mean $\bar{X}$ is computed. Then the average value of $\bar{X}$ over infinitely many sets is $\mu = E(X)$.

[Figure: average of sample means (y-axis, about 9.96 to 10.00) against the number of sets of random samples (x-axis, 0 to 10000); samples drawn from N(10, 1) with sample size 100.]
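The unbiasedness claim can be checked numerically. The following is a minimal sketch, assuming samples from N(10, 1) with sample size 100; the function name, the seed, and the numbers of sets are illustrative choices, not part of the lecture.

```python
import random
import statistics


def average_of_sample_means(num_sets, n, mu=10.0, sigma=1.0, seed=0):
    """Average the sample mean over `num_sets` independent samples of size n."""
    rng = random.Random(seed)
    sample_means = [
        statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))
        for _ in range(num_sets)
    ]
    return statistics.fmean(sample_means)


# The average of the sample means should settle near mu = 10
# as the number of sets grows.
for num_sets in (10, 100, 2000):
    print(num_sets, round(average_of_sample_means(num_sets, n=100), 4))
```

A finite simulation can only approximate the "infinitely many sets" in the argument, so the printed averages fluctuate around 10 rather than equal it.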


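Properties of $\bar{X}$ (2)

The minimum variance unbiased estimator of $\mu$: If the underlying distribution is normal, then it can be shown that the unbiased estimator with the smallest variance is given by $\bar{X}$.

For example, from a random sample $X_1, \dots, X_n \sim N(\mu, \sigma^2)$, we consider two estimators for $\mu$: one is $\bar{X}$ and the other is the first observation $X_1$. Both are unbiased estimators: $E(\bar{X}) = \mu$ and $E(X_1) = \mu$. However, $\bar{X}$ has a smaller variance than $X_1$ (the second equality below uses the independence of the $X_i$):

$$\mathrm{Var}(\bar{X}) = \mathrm{Var}\left(\frac{1}{n}\sum_{i=1}^{n} X_i\right) = \frac{1}{n^2}\sum_{i=1}^{n} \mathrm{Var}(X_i) = \frac{1}{n^2}\, n\sigma^2 = \frac{\sigma^2}{n},$$

$$\mathrm{Var}(X_1) = \sigma^2.$$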
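The variance comparison can also be seen empirically. This is a rough sketch, assuming N(10, 1) data and a sample size of 25 (both illustrative): it draws many samples and compares the spread of the sample means with the spread of the first observations.

```python
import random
import statistics


def compare_variances(n, num_sets=5000, mu=10.0, sigma=1.0, seed=1):
    """Empirical variances of the two unbiased estimators: X-bar and X_1."""
    rng = random.Random(seed)
    sample_means, first_observations = [], []
    for _ in range(num_sets):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        sample_means.append(statistics.fmean(sample))
        first_observations.append(sample[0])
    return statistics.variance(sample_means), statistics.variance(first_observations)


var_mean, var_first = compare_variances(n=25)
# Theory for this setup: sigma^2 / n = 0.04 for X-bar and sigma^2 = 1 for X_1.
print(var_mean, var_first)
```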


Properties of $\bar{X}$ (3)

A consistent estimator of $\mu$: The estimator $\bar{X}$ converges to the population mean $\mu$ as the sample size $n$ goes to infinity.

[Figure: sample mean (y-axis, about 9.90 to 10.10) against sample size n (x-axis, 0 to 1e+05); samples drawn from N(10, 1).]


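A point estimation for $\mathrm{Var}(X)$

Let $X_1, \dots, X_n \sim N(\mu, \sigma^2)$ with $\mu$ known.

A point estimator of $\mathrm{Var}(X) = \sigma^2$ is $\frac{1}{n}\sum_{i=1}^{n} (X_i - \mu)^2$ (unbiased).

Proof:

$$E\left(\frac{1}{n}\sum_{i=1}^{n} (X_i - \mu)^2\right) = \frac{1}{n}\sum_{i=1}^{n} E[(X_i - \mu)^2] = \frac{1}{n}\sum_{i=1}^{n} \sigma^2 = \sigma^2$$

Let $X_1, \dots, X_n \sim N(\mu, \sigma^2)$ but $\mu$ is unknown.

$$\hat{\sigma}^2 = S^2 = \frac{1}{n-1}\sum_{i=1}^{n} (X_i - \bar{X})^2 \ \text{(unbiased)}, \qquad \hat{\sigma} = \sqrt{\hat{\sigma}^2} = S \ \text{(biased)}$$

Let $X_1, \dots, X_n \sim \text{Bernoulli}(p)$. A point estimator of $\mathrm{Var}(X) = p(1-p)$ is $\widehat{\mathrm{Var}}(X) = \hat{p}(1-\hat{p}) = \bar{X}(1-\bar{X})$ (biased).

Let $X_1, \dots, X_n \sim \text{Poisson}(\lambda)$. A point estimator of $\mathrm{Var}(X) = \lambda$ is $\hat{\lambda} = \bar{X}$, the same as the estimator of $E(X) = \lambda$ (unbiased).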
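To see why the divisor $n-1$ matters when $\mu$ is unknown, the sketch below averages two variance estimators over many small samples. It assumes normal data with $\sigma^2 = 4$ and $n = 5$ (illustrative values); `statistics.variance` divides by $n-1$ and `statistics.pvariance` divides by $n$.

```python
import random
import statistics


def mean_of_variance_estimates(n, num_sets=10000, sigma=2.0, seed=2):
    """Average S^2 (divisor n - 1) and the divisor-n estimator over many samples."""
    rng = random.Random(seed)
    s2_values, divide_by_n_values = [], []
    for _ in range(num_sets):
        sample = [rng.gauss(0.0, sigma) for _ in range(n)]
        s2_values.append(statistics.variance(sample))            # divides by n - 1
        divide_by_n_values.append(statistics.pvariance(sample))  # divides by n
    return statistics.fmean(s2_values), statistics.fmean(divide_by_n_values)


avg_s2, avg_div_n = mean_of_variance_estimates(n=5)
# Theory for this setup: E(S^2) = sigma^2 = 4, while the divisor-n estimator
# has expectation (n - 1) / n * sigma^2 = 3.2.
print(avg_s2, avg_div_n)
```

For 0/1 data the divisor-$n$ estimator equals $\bar{X}(1-\bar{X})$ exactly, which is why that Bernoulli estimator is labeled biased.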


Consistency of a variance estimator

The estimators of variances on the previous slide are consistent! Let $X_1, \dots, X_n \sim \text{Bernoulli}(p)$. As the sample size goes to infinity, $\widehat{\mathrm{Var}}(X) = \hat{p}(1-\hat{p}) = \bar{X}(1-\bar{X})$ converges to $p(1-p)$.

[Figure: variance estimate (y-axis, 0.00 to 0.20) against sample size (x-axis, 0 to 1e+05); random sample from Bernoulli(0.7).]


Examples

LEAD data example: What are the point estimates of the mean and standard deviation of the full IQ of children in the exposed group?

The point estimate of the mean is $\bar{x} = 88.02$ and the estimate of the standard deviation is $s = 12.207$.


Interval estimation of µ = E(X): σ known

Interval estimation: specify an interval which likely includes a parameter value of interest.

Point estimates do not reflect our uncertainty when estimating a parameter. We always remain uncertain regarding the true value of the parameter when we estimate it using a sample from the population. To address this issue, we can present our estimates in terms of an interval of possible values (as opposed to a single value).


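Interval estimation: the uncertainty of $\bar{X}$

Let $\mathbf{X} = (X_1, \dots, X_n)$, where $X_i \sim N(\mu, \sigma^2)$ for $i = 1, \dots, n$. The sample mean $\bar{X}$ is a point estimator of $\mu$. How is $\bar{X}$ distributed?

$$\bar{X} \sim N(\mu, \sigma^2/n)$$

What is the interval $(u(\mathbf{X}), v(\mathbf{X}))$ such that $\Pr[(u(\mathbf{X}), v(\mathbf{X})) \ni \mu] = 0.95$?

Standardization:

$$\frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0, 1)$$

Consider a constant $a = z_{0.975} = 1.96$ such that

$$\Pr\left(-a < \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} < a\right) = 0.95$$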


Interval estimation for µ

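$$\begin{aligned}
\Pr\left(-a < \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} < a\right)
&= \Pr\left(-a\frac{\sigma}{\sqrt{n}} < \bar{X} - \mu < a\frac{\sigma}{\sqrt{n}}\right) \\
&= \Pr\left(\bar{X} + a\frac{\sigma}{\sqrt{n}} > \mu \ \&\ \bar{X} - a\frac{\sigma}{\sqrt{n}} < \mu\right) \\
&= \Pr\left[\left(\bar{X} - a\frac{\sigma}{\sqrt{n}},\ \bar{X} + a\frac{\sigma}{\sqrt{n}}\right) \ni \mu\right] \\
&= 0.95
\end{aligned}$$

Therefore, $u(\mathbf{X}) = \bar{X} - 1.96\,\sigma/\sqrt{n}$ and $v(\mathbf{X}) = \bar{X} + 1.96\,\sigma/\sqrt{n}$.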


Confidence interval for µ

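Let $\mathbf{X} = (X_1, \dots, X_n)$, where $X_i \sim N(\mu, \sigma^2)$ for $i = 1, \dots, n$. Assume $\sigma$ is known. The $100(1-\alpha)\%$ confidence interval for $\mu$ is

$$\left(\bar{X} - z_{1-\alpha/2}\frac{\sigma}{\sqrt{n}},\ \bar{X} + z_{1-\alpha/2}\frac{\sigma}{\sqrt{n}}\right)$$

For shorthand, $\bar{X} \pm z_{1-\alpha/2}\,\sigma/\sqrt{n}$.

$1-\alpha$: confidence level

$z_{1-\alpha/2}$: critical value for confidence level $1-\alpha$

For example, the 95% confidence interval for $\mu$ is

$$\left(\bar{X} - 1.96\frac{\sigma}{\sqrt{n}},\ \bar{X} + 1.96\frac{\sigma}{\sqrt{n}}\right)$$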


Examples

LEAD data example: The point estimate of the mean of the full IQ of children in the exposed group is $\bar{x} = 88.02$.

Assume the full IQ follows a normal distribution with standard deviation $\sigma = 12.207$. Compute the 95% confidence interval for the mean IQ.

There are 46 children in the exposed group. The standard error is $\sigma/\sqrt{n} = 1.799$. Therefore, the 95% CI is

$$(88.02 - 1.96 \times 1.799,\ 88.02 + 1.96 \times 1.799) = (84.494, 91.549)$$
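The same computation can be sketched in a few lines. The function name `z_confidence_interval` is a hypothetical helper; the inputs are the rounded summary values quoted above, so the result may differ from the slide in the last digit.

```python
from math import sqrt
from statistics import NormalDist


def z_confidence_interval(xbar, sigma, n, conf_level=0.95):
    """CI for the mean of normal data when sigma is known: xbar +/- z * sigma / sqrt(n)."""
    alpha = 1.0 - conf_level
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # critical value z_{1 - alpha/2}
    standard_error = sigma / sqrt(n)
    return xbar - z * standard_error, xbar + z * standard_error


lower, upper = z_confidence_interval(xbar=88.02, sigma=12.207, n=46)
print(round(lower, 2), round(upper, 2))
```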


Confidence interval (CI)

Interpretations of 95% CI:

Before the sample is drawn, the probability that the random interval will contain the true value (parameter) is 0.95.

Consider infinitely many sets of random samples of size $n$ and compute the CI from each. 95% of the infinitely many CIs will contain the true value.
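The repeated-sampling interpretation can be illustrated by simulation. This is a minimal sketch, assuming N(10, 1) data with known $\sigma$ and $n = 30$ (illustrative values): it builds a 95% CI from each of many samples and reports the fraction that cover the true mean.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean


def coverage(num_sets=10000, n=30, mu=10.0, sigma=1.0, conf_level=0.95, seed=3):
    """Fraction of known-sigma CIs, over repeated samples, that contain mu."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1.0 - (1.0 - conf_level) / 2.0)
    half_width = z * sigma / sqrt(n)
    hits = 0
    for _ in range(num_sets):
        xbar = fmean(rng.gauss(mu, sigma) for _ in range(n))
        if xbar - half_width < mu < xbar + half_width:
            hits += 1
    return hits / num_sets


print(coverage())  # expected to be close to 0.95
```

Any single computed interval either contains $\mu$ or does not; the 0.95 describes the long-run behavior of the procedure.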


Factors Affecting the Length of a CI

The 95% confidence interval (CI) for $\mu$ is

$$\left(\bar{X} - 1.96\frac{\sigma}{\sqrt{n}},\ \bar{X} + 1.96\frac{\sigma}{\sqrt{n}}\right)$$

The length of the CI indicates the precision of the point estimate $\bar{X}$. The length of a $100\% \times (1-\alpha)$ CI for $\mu$ equals $2 z_{1-\alpha/2}\,\sigma/\sqrt{n}$ and is determined by $\alpha$ and the standard error $\sigma/\sqrt{n}$.

$\alpha$: as the desired confidence level $1-\alpha$ increases (that is, $\alpha$ decreases), the length of the CI increases.

$n$: as the sample size $n$ increases, the standard error decreases and the length of the CI decreases.

$\sigma$: as the variability of the distribution increases, the length of the CI increases.
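The three effects can be read off the length formula $2 z_{1-\alpha/2}\,\sigma/\sqrt{n}$ directly. The sketch below uses a hypothetical helper `ci_length` and reuses the LEAD values $\sigma = 12.207$ and $n = 46$ as a baseline; the alternative settings are illustrative.

```python
from math import sqrt
from statistics import NormalDist


def ci_length(sigma, n, conf_level=0.95):
    """Length of the known-sigma CI for the mean: 2 * z_{1 - alpha/2} * sigma / sqrt(n)."""
    z = NormalDist().inv_cdf(1.0 - (1.0 - conf_level) / 2.0)
    return 2.0 * z * sigma / sqrt(n)


base = ci_length(sigma=12.207, n=46)
print("baseline:", base)
print("99% confidence:", ci_length(12.207, 46, conf_level=0.99))  # longer
print("4x sample size:", ci_length(12.207, 4 * 46))               # half as long
print("2x sigma:", ci_length(2 * 12.207, 46))                     # twice as long
```

Because the length scales with $1/\sqrt{n}$, halving the length requires four times as many observations.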


Interval estimation of µ = E(X): σ unknown

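If $X_1, \dots, X_n \sim N(\mu, \sigma^2)$ and $\sigma$ is known, the 95% confidence interval (CI) for $\mu$ is

$$\left(\bar{X} - 1.96\frac{\sigma}{\sqrt{n}},\ \bar{X} + 1.96\frac{\sigma}{\sqrt{n}}\right)$$

What if we don't know $\sigma$?

$$\frac{\bar{X} - \mu}{S/\sqrt{n}}$$

is distributed as a $t$ distribution with $(n-1)$ df.

A $100\% \times (1-\alpha)$ CI is given by

$$\left(\bar{X} - t_{n-1,1-\alpha/2}\, S/\sqrt{n},\ \bar{X} + t_{n-1,1-\alpha/2}\, S/\sqrt{n}\right),$$

where $t_{n-1,1-\alpha/2}$ is the $100(1-\alpha/2)$th percentile of the $t_{n-1}$ distribution.

If $n > 200$, use the standard normal distribution instead of $t_{n-1}$:

$$\left(\bar{X} - z_{1-\alpha/2}\, S/\sqrt{n},\ \bar{X} + z_{1-\alpha/2}\, S/\sqrt{n}\right)$$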
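A small simulation shows why the $t$ critical value is needed when $S$ replaces $\sigma$ in small samples. This sketch assumes N(10, 1) data with $n = 5$ (illustrative) and hard-codes $t_{4,0.975} = 2.776$ from a standard $t$ table, since the Python standard library has no $t$ quantile function.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean, stdev

T_4_0975 = 2.776  # t critical value for 4 df, taken from a standard t table


def coverage_with_estimated_sigma(critical_value, n=5, num_sets=10000,
                                  mu=10.0, sigma=1.0, seed=4):
    """Coverage of xbar +/- critical_value * S / sqrt(n) over repeated samples."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(num_sets):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        half_width = critical_value * stdev(sample) / sqrt(n)  # S uses divisor n - 1
        if abs(fmean(sample) - mu) < half_width:
            hits += 1
    return hits / num_sets


cov_z = coverage_with_estimated_sigma(NormalDist().inv_cdf(0.975))
cov_t = coverage_with_estimated_sigma(T_4_0975)
print("z critical value:", cov_z)  # falls short of 0.95 for n = 5
print("t critical value:", cov_t)  # close to 0.95
```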


Examples

LEAD data example: The point estimate of the mean of the full IQ of children in the exposed group is $\bar{x} = 88.02$. There are 46 children in the exposed group.

Assume the full IQ follows a normal distribution. Compute the 95% confidence interval for the mean IQ.

The standard error is estimated by $s/\sqrt{n} = 1.799$. The critical value is $t_{45,0.975} = 2.014$. Therefore, the 95% CI is

$$(88.02 - 2.014 \times 1.799,\ 88.02 + 2.014 \times 1.799) = (84.396, 91.643)$$

Note: The CI with unknown $\sigma$ is wider than the CI (84.494, 91.549) with known $\sigma$.
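The comparison in the note can be reproduced as below. This is a sketch under two assumptions: the $t$ critical value 2.014 is copied from the example rather than computed, and the rounded summary values are used, so the last digit may differ from the slide. The helper name `interval` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def interval(xbar, s, n, critical_value):
    """xbar +/- critical_value * s / sqrt(n)."""
    standard_error = s / sqrt(n)
    return xbar - critical_value * standard_error, xbar + critical_value * standard_error


T_45_0975 = 2.014  # t critical value quoted in the example (45 df)

t_ci = interval(88.02, 12.207, 46, T_45_0975)
z_ci = interval(88.02, 12.207, 46, NormalDist().inv_cdf(0.975))
print("t interval:", t_ci)
print("z interval:", z_ci)
print("t interval is wider:", (t_ci[1] - t_ci[0]) > (z_ci[1] - z_ci[0]))
```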


Interval Estimation of the Variance of a Distribution

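Let $X_1, \dots, X_n \sim N(\mu, \sigma^2)$. A point estimator of $\sigma^2$ is $S^2$.

Using

$$\frac{(n-1)S^2}{\sigma^2} \sim \chi^2_{n-1},$$

a 95% CI is $(u(\mathbf{X}), v(\mathbf{X}))$ such that

$$\Pr\left(u(\mathbf{X}) < \sigma^2 \ \&\ v(\mathbf{X}) > \sigma^2\right) = 0.95$$

Find $(u(\mathbf{X}), v(\mathbf{X}))$:

$$\begin{aligned}
0.95 &= \Pr\left(\chi^2_{n-1,0.025} < \frac{(n-1)S^2}{\sigma^2} < \chi^2_{n-1,0.975}\right) \\
&= \Pr\left(\frac{(n-1)S^2}{\chi^2_{n-1,0.025}} > \sigma^2 \ \&\ \frac{(n-1)S^2}{\chi^2_{n-1,0.975}} < \sigma^2\right)
\end{aligned}$$

A $100\% \times (1-\alpha)$ CI for $\sigma^2$ is given by

$$\left(\frac{(n-1)S^2}{\chi^2_{n-1,1-\alpha/2}},\ \frac{(n-1)S^2}{\chi^2_{n-1,\alpha/2}}\right)$$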

Examples

LEAD data example: The point estimate of the variance of the full IQ of children in the exposed group is $s^2 = 149.99$. There are 46 children in the exposed group.

Assume the full IQ follows a normal distribution. Compute the 95% confidence interval for the variance of IQ.

The critical values are $\chi^2_{45,0.025} = 28.366$ and $\chi^2_{45,0.975} = 65.41$. Therefore, the 95% CI is

$$(45 \times 149.99/65.41,\ 45 \times 149.99/28.366) = (103.188, 237.945).$$
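The arithmetic can be checked with a short sketch. The chi-square critical values are copied from the example (the Python standard library has no chi-square quantile function), and `variance_confidence_interval` is a hypothetical helper name.

```python
from math import sqrt


def variance_confidence_interval(s2, n, chi2_lower, chi2_upper):
    """CI for sigma^2: ((n-1) s^2 / chi2_upper, (n-1) s^2 / chi2_lower)."""
    df = n - 1
    return df * s2 / chi2_upper, df * s2 / chi2_lower


CHI2_45_0025 = 28.366  # chi-square 0.025 quantile, 45 df (from the example)
CHI2_45_0975 = 65.41   # chi-square 0.975 quantile, 45 df (from the example)

lower, upper = variance_confidence_interval(149.99, 46, CHI2_45_0025, CHI2_45_0975)
print(round(lower, 3), round(upper, 3))

# Taking square roots of the endpoints gives a CI for sigma itself.
print(round(sqrt(lower), 2), round(sqrt(upper), 2))
```

Note that the larger critical value produces the lower endpoint, and the interval is not symmetric around $s^2$.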


CI for a binomial parameter p

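Let $X$ be a binomial random variable with parameters $n$ and $p$. An unbiased estimator of $p$ is given by the sample proportion of events $\hat{p} = X/n$. Its standard error is estimated by $\sqrt{\hat{p}(1-\hat{p})/n}$.

By the central limit theorem,

$$\frac{\hat{p} - p}{\sqrt{p(1-p)/n}} \to Z, \quad \text{where } Z \sim N(0, 1), \text{ as } n \to \infty.$$

We replace $p$ by $\hat{p}$ in the standard error, so that approximately

$$\frac{\hat{p} - p}{\sqrt{\hat{p}(1-\hat{p})/n}} \sim N(0, 1).$$

When $np(1-p) \ge 5$ (checked in practice with $n\hat{p}(1-\hat{p}) \ge 5$), an approximate $100\% \times (1-\alpha)$ CI for the binomial parameter $p$ is

$$\hat{p} \pm z_{1-\alpha/2}\sqrt{\hat{p}(1-\hat{p})/n}$$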


‘Exact’ CI for a binomial parameter p

When $n\hat{p}(1-\hat{p}) < 5$ and $X = x$ is observed, the 'exact' binomial distribution is used to build a CI. As $p$ increases, $\Pr(X \ge x \mid p)$ increases, while $\Pr(X \le x \mid p)$ decreases. The CI for the binomial parameter $p$ is $(p_1, p_2)$ such that

$$p_1 = \min\{p \mid \Pr(X \ge x \mid p) > \alpha/2\} \quad \& \quad p_2 = \max\{p \mid \Pr(X \le x \mid p) > \alpha/2\}$$

[Figure: $\Pr(X \le 2 \mid p)$ and $\Pr(X \ge 2 \mid p)$ (y-axis, probability) plotted against the parameter $p$ (x-axis, 0 to 1); panel title: p=0.2, n=5, X=2, $\hat{p}$=0.4, Exact CI (0.15, 0.85).]
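The two limits can be found numerically by solving $\Pr(X \ge x \mid p_1) = \alpha/2$ and $\Pr(X \le x \mid p_2) = \alpha/2$. The following is a sketch only: it uses bisection (one possible root finder, not necessarily the lecture's), the function names are hypothetical, and it is applied to the melanoma counts from the earlier example (28 cases among 5000 women).

```python
from math import exp, lgamma, log


def binom_cdf(x, n, p):
    """Pr(X <= x | p) for X ~ Binomial(n, p), with 0 < p < 1 (log scale for stability)."""
    total = 0.0
    for k in range(x + 1):
        log_coef = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
        total += exp(log_coef + k * log(p) + (n - k) * log(1.0 - p))
    return total


def solve_increasing(f, target, lo=0.0, hi=1.0, iterations=60):
    """Bisection for f(p) = target, assuming f is increasing on (lo, hi)."""
    for _ in range(iterations):
        mid = (lo + hi) / 2.0
        if f(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def exact_binomial_ci(x, n, conf_level=0.95):
    alpha = 1.0 - conf_level
    # Lower limit: Pr(X >= x | p) = 1 - Pr(X <= x - 1 | p) increases with p.
    p1 = 0.0 if x == 0 else solve_increasing(
        lambda p: 1.0 - binom_cdf(x - 1, n, p), alpha / 2.0)
    # Upper limit: Pr(X <= x | p) = alpha/2, i.e. Pr(X > x | p) = 1 - alpha/2.
    p2 = 1.0 if x == n else solve_increasing(
        lambda p: 1.0 - binom_cdf(x, n, p), 1.0 - alpha / 2.0)
    return p1, p2


p1, p2 = exact_binomial_ci(28, 5000)
print(p1, p2)
```

At the returned limits the two tail probabilities equal $\alpha/2$ up to numerical tolerance, which matches the min/max definition above.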


Examples

Suppose a random sample of 5000 women is selected from this age group, of whom 28 are found to have malignant melanoma. What is the probability of having the disease (prevalence) and the 95% CI?

Let $p$ be the probability of having the disease. Let $X_i = 1$ if the $i$th woman has the disease and 0 otherwise, for $i = 1, \dots, 5000$. Then a point estimate of $E(X) = p$ is $\hat{p} = \bar{x} = 28/5000 = 0.0056$.

Since $n\hat{p}(1-\hat{p}) = 27.8432 \ge 5$, we use the normal approximation to compute a CI for $p$. The standard error is estimated by $\sqrt{\hat{p}(1-\hat{p})/n} = 0.00106$ and hence the 95% CI for $p$ is

$$(0.0056 - 1.96 \times 0.00106,\ 0.0056 + 1.96 \times 0.00106) = (0.0035, 0.0077).$$
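The normal-approximation interval for this example can be sketched as follows. The helper name `normal_approx_proportion_ci` is hypothetical, and raising an error when $n\hat{p}(1-\hat{p}) < 5$ is a design choice of this sketch rather than part of the lecture.

```python
from math import sqrt
from statistics import NormalDist


def normal_approx_proportion_ci(x, n, conf_level=0.95):
    """Approximate CI for a binomial p: p_hat +/- z * sqrt(p_hat (1 - p_hat) / n)."""
    p_hat = x / n
    if n * p_hat * (1.0 - p_hat) < 5:
        raise ValueError("n * p_hat * (1 - p_hat) < 5: use an exact method instead")
    z = NormalDist().inv_cdf(1.0 - (1.0 - conf_level) / 2.0)
    standard_error = sqrt(p_hat * (1.0 - p_hat) / n)
    return p_hat, standard_error, (p_hat - z * standard_error, p_hat + z * standard_error)


p_hat, se, (lower, upper) = normal_approx_proportion_ci(28, 5000)
print(p_hat, round(se, 5), round(lower, 4), round(upper, 4))
```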


CI for Poisson Distribution

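Let $X_1, \dots, X_n \sim \text{Poi}(\lambda)$. A point estimator of $\lambda$ is $\hat{\lambda} = \bar{X}$ and its standard error is $\sqrt{\lambda/n}$.

By the central limit theorem, an approximate $100\%(1-\alpha)$ CI for $\lambda$ is

$$\hat{\lambda} \pm z_{1-\alpha/2}\sqrt{\hat{\lambda}/n}$$

Let $S = \sum_{i=1}^{n} X_i$. Then $S \sim \text{Poi}(n\lambda)$. An "exact" CI for $\lambda$ is $(\lambda_1, \lambda_2)$ such that

$$\lambda_1 = \min\{\lambda \mid \Pr(S \ge s \mid \lambda) > \alpha/2\} \quad \& \quad \lambda_2 = \max\{\lambda \mid \Pr(S \le s \mid \lambda) > \alpha/2\}$$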
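Both Poisson intervals can be sketched on a small data set. The counts below are hypothetical (they do not come from the lecture), the function names are illustrative, and the exact limits are found by bisection on the mean $n\lambda$ of $S$, which is one possible way to solve the tail equations.

```python
from math import exp, lgamma, log, sqrt
from statistics import NormalDist


def poisson_cdf(s, mean):
    """Pr(S <= s) for S ~ Poisson(mean), mean > 0."""
    return sum(exp(k * log(mean) - mean - lgamma(k + 1)) for k in range(s + 1))


def approx_poisson_ci(counts, conf_level=0.95):
    n, lam_hat = len(counts), sum(counts) / len(counts)
    z = NormalDist().inv_cdf(1.0 - (1.0 - conf_level) / 2.0)
    half_width = z * sqrt(lam_hat / n)
    return lam_hat - half_width, lam_hat + half_width


def solve_increasing(f, target, lo, hi, iterations=80):
    """Bisection for f(m) = target, assuming f is increasing on (lo, hi)."""
    for _ in range(iterations):
        mid = (lo + hi) / 2.0
        if f(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def exact_poisson_ci(counts, conf_level=0.95):
    n, s = len(counts), sum(counts)
    alpha = 1.0 - conf_level
    hi = float(s + 1)
    while 1.0 - poisson_cdf(s, hi) < 1.0 - alpha / 2.0:  # widen until the root is bracketed
        hi *= 2.0
    # Upper limit: Pr(S <= s) = alpha/2, i.e. Pr(S > s) = 1 - alpha/2 (increasing in the mean).
    m2 = solve_increasing(lambda m: 1.0 - poisson_cdf(s, m), 1.0 - alpha / 2.0, 0.0, hi)
    # Lower limit: Pr(S >= s) = 1 - Pr(S <= s - 1) = alpha/2.
    m1 = 0.0 if s == 0 else solve_increasing(
        lambda m: 1.0 - poisson_cdf(s - 1, m), alpha / 2.0, 0.0, hi)
    return m1 / n, m2 / n  # convert limits for n * lambda into limits for lambda


counts = [3, 1, 4, 2, 0, 5, 2, 3, 1, 2]  # hypothetical counts, n = 10, s = 23
print("approximate:", approx_poisson_ci(counts))
print("exact:", exact_poisson_ci(counts))
```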


Bootstrap confidence interval

Real data has a complex structure and we may be interested in more complex parameters or quantities.

We have a large sample (e.g., $n = 1000$), but its distribution is very skewed. We'd like to compute a CI for the population mean, but the normal approximation may not be good enough.

[Figure: histogram of a data set with n = 1000 (x-axis, values from 0 to 400; y-axis, frequency).]

If we are interested in the median of the data, how do we compute a CI for the median?


Bootstrap confidence interval

Let $X_1, \dots, X_n$ be randomly sampled from an unknown distribution. We are interested in estimating a parameter $\theta$ and the estimator of $\theta$ is $\hat{\theta} = S(\mathbf{X})$. How can we build a CI for $\theta$?

A confidence interval for $\theta$ is in the form of

(point estimate) $\pm\ z_{1-\alpha/2} \times$ (standard error of the estimate).

That is, $\hat{\theta} \pm z_{1-\alpha/2}\, SE(\hat{\theta})$!

However, we do NOT know the standard error of $\hat{\theta}$.

We use the bootstrap method to estimate the standard error.


Bootstrap method: resampling method

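Goal: estimating the distribution of $\hat{\theta}$ and hence the s.d. of $\hat{\theta}$

1 Estimate the distribution from which the data was sampled: the population is estimated by the sampled data $\mathbf{x} = (x_1, \dots, x_n)$, which gives the estimated distribution $\hat{P}$.

2 Sample many data sets from the estimated distribution $\hat{P}$ (that is, resample $\mathbf{x}$ with replacement): $\mathbf{x}^*_1, \dots, \mathbf{x}^*_B$

3 Compute the estimate of $\theta$ from each: $\hat{\theta}(\mathbf{x}^*_1), \dots, \hat{\theta}(\mathbf{x}^*_B)$ (this forms the distribution of $\hat{\theta}$)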


Bootstrap confidence intervals

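Now we have estimated the distribution of $\hat{\theta}$: $\hat{\theta}(\mathbf{x}^*_1), \dots, \hat{\theta}(\mathbf{x}^*_B)$.

A $100\%(1-\alpha)$ CI for $\theta$ (normal approximation) is

$$\left(\hat{\theta} + (\hat{\theta} - \bar{\theta}^*)\right) \pm z_{1-\alpha/2}\, se^*(\hat{\theta}),$$

where

$$\bar{\theta}^* = \frac{1}{B}\sum_{i=1}^{B} \hat{\theta}(\mathbf{x}^*_i) \quad \text{and} \quad se^*(\hat{\theta}) = \sqrt{\frac{1}{B-1}\sum_{i=1}^{B} \left(\hat{\theta}(\mathbf{x}^*_i) - \bar{\theta}^*\right)^2}.$$

A percentile CI uses the $100(\alpha/2)$th and $100(1-\alpha/2)$th percentiles of $\hat{\theta}(\mathbf{x}^*_1), \dots, \hat{\theta}(\mathbf{x}^*_B)$:

$$(q^*_{\alpha/2},\ q^*_{1-\alpha/2})$$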
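The resampling steps and both intervals can be sketched together. Everything specific here is an assumption for illustration: the data are simulated from a skewed (exponential) distribution, the estimator is the sample median, $B = 2000$, and the percentiles are taken by simple indexing into the sorted replicates.

```python
import random
import statistics
from statistics import NormalDist


def bootstrap_cis(data, estimator, num_replicates=2000, conf_level=0.95, seed=0):
    """Normal-approximation and percentile bootstrap CIs for estimator(data)."""
    rng = random.Random(seed)
    n = len(data)
    alpha = 1.0 - conf_level
    theta_hat = estimator(data)

    # Steps 2 and 3: resample the data with replacement and re-estimate each time.
    replicates = sorted(
        estimator(rng.choices(data, k=n)) for _ in range(num_replicates)
    )

    theta_bar = statistics.fmean(replicates)
    se_boot = statistics.stdev(replicates)  # divisor B - 1, as in se*
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    center = theta_hat + (theta_hat - theta_bar)  # bias-adjusted center
    normal_ci = (center - z * se_boot, center + z * se_boot)

    # Approximate empirical percentiles of the sorted bootstrap estimates.
    lo_index = int((alpha / 2.0) * num_replicates)
    hi_index = int((1.0 - alpha / 2.0) * num_replicates) - 1
    percentile_ci = (replicates[lo_index], replicates[hi_index])
    return theta_hat, normal_ci, percentile_ci


# Hypothetical skewed data: 200 draws from an exponential distribution with mean 50.
data_rng = random.Random(42)
data = [data_rng.expovariate(1.0 / 50.0) for _ in range(200)]

theta_hat, normal_ci, percentile_ci = bootstrap_cis(data, statistics.median)
print("median:", theta_hat)
print("normal CI:", normal_ci)
print("percentile CI:", percentile_ci)
```

Passing `statistics.fmean` instead of `statistics.median` gives bootstrap CIs for the mean with the same function.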


Examples

LEAD data example: The point estimate of the mean of the full IQ of children in the exposed group is $\bar{x} = 88.02$. There are 46 children in the exposed group.

Compute the 95% normal and percentile CIs for the mean IQ.

Resample the data and generate 1,000 replicates. The normal CI is (84.64, 91.35) and the percentile CI is (84.78, 91.54).

Previously, under the normal assumption, the CIs were (84.494, 91.549) with known $\sigma = s$ and (84.396, 91.643) with unknown $\sigma$.


Summary

A point or interval estimation of a parameter of interest from a random sample.

1 Identify the data type: continuous? Normal or non-normal? Bernoulli? Poisson? Unknown?

2 What parameter or quantity to estimate? Are there any other unknown parameters?

3 What is a point estimate of the parameter of interest? Is the estimator good enough?

4 What is the standard error of the estimate?

5 What is a CI for the parameter? What is the distribution of the point estimate?
  - Normal, chi-square, t, binomial, etc.
  - Normal approximation?
  - Difficult to discover or unknown: bootstrap


Next week

Statistical hypothesis testing: concerned with testing whether the value of a population parameter is equal to some specific value.

