How sure we really are Confidence intervals for means and proportions FETP India.


Competency to be gained from this lecture

Calculate 95% confidence intervals for means and proportions

Key issues

• Concept of confidence interval

• Confidence interval for means

• Confidence interval for proportions

What we learnt so far (1/3)

• Population parameters are fixed

• We can take samples from the population

• Several samples of size ‘n’ are possible

• Each sample gives estimates (e.g., means) called “statistics”

• Statistics vary from sample to sample
  - This is called “Sampling fluctuation”

Concept of confidence interval


• The distribution of a statistic for all possible samples of given size ‘n’ is called “sampling distribution”

• For large ‘n’, the sampling distribution of the mean is approximately ‘normal’ even if the original distribution is not

• If the original distribution is normal, the result is true even for small ‘n’



• The mean of the sampling distribution is the ‘population mean’

• The standard deviation of the sampling distribution is known as the standard error: SE = population SD / √n
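The properties listed above can be checked with a small simulation. The sketch below is illustrative only: the exponential population (mean 1, SD 1), the sample size of 50, the number of samples and the seed are all assumptions chosen for the example, not values from the lecture.

```python
import math
import random
import statistics


def simulate_sample_means(n, n_samples=5000, seed=1):
    """Draw n_samples samples of size n from a skewed (exponential)
    population with mean 1 and SD 1, and return the sample means."""
    rng = random.Random(seed)
    means = []
    for _ in range(n_samples):
        sample = [rng.expovariate(1.0) for _ in range(n)]
        means.append(statistics.fmean(sample))
    return means


n = 50
means = simulate_sample_means(n)

# Mean of the sampling distribution: should be close to the population mean (1).
observed_mean = statistics.fmean(means)

# SD of the sampling distribution: should be close to population SD / sqrt(n).
observed_se = statistics.stdev(means)
theoretical_se = 1.0 / math.sqrt(n)

print(f"mean of sample means: {observed_mean:.3f}")
print(f"SD of sample means:   {observed_se:.3f}")
print(f"population SD / √n:   {theoretical_se:.3f}")
```

Even though the population here is far from normal, the sample means cluster around the population mean with a spread close to SD / √n.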


Easy to estimate the standard deviation, difficult to estimate the mean

• Samples generate sample means and standard errors

• The usefulness of these estimates varies:
  - The standard deviation from a single sample is a fair estimate of the population SD for large ‘n’
  - The mean from a single sample may not be a fair estimate of the population mean


How can the population mean be estimated?

• It is desirable to give a range of values with a specific level of confidence that the true population mean is one of the values in the range

• We can obtain this using the sampling distribution, which is ‘normal’, and the properties of the ‘normal’ distribution:
  - Mean
  - Standard deviation


From the standard error (SE) to the confidence interval

• The point estimate x̄ (mean in the sample) is a point in the sampling distribution and there is a 95% chance that it lies in the µ ± 1.96 SE interval

• But µ is not known

• Interchanging µ and x̄ we can infer that there is a 95% chance that µ lies in the interval x̄ ± 1.96 SE


Inference using various levels of confidence

• Using the properties of the normal distribution, we can infer what proportion of the values lies between given values

• Considering the distribution of the means:
  - 68% of sample means will lie within 1 standard error above or below the population mean
  - 95% of sample means will lie within 1.96 standard errors above or below the population mean

• “1.96” comes from the standard z table for alpha = 0.05 (two-sided)
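The multipliers quoted above can be reproduced without a printed z table. This is a minimal sketch using Python's standard library; the helper names `z_critical` and `central_coverage` are made up for this illustration.

```python
from statistics import NormalDist


def z_critical(confidence):
    """Two-sided critical value of the standard normal distribution,
    e.g. confidence=0.95 corresponds to alpha=0.05."""
    alpha = 1.0 - confidence
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


def central_coverage(z):
    """Proportion of a normal distribution lying within z standard
    deviations above or below its mean."""
    std_normal = NormalDist()
    return std_normal.cdf(z) - std_normal.cdf(-z)


print(f"z for 95% confidence: {z_critical(0.95):.2f}")
print(f"coverage within 1 SE: {central_coverage(1.0):.1%}")
print(f"coverage within 1.96 SE: {central_coverage(1.96):.1%}")
```

Other confidence levels follow the same way, for example `z_critical(0.90)` or `z_critical(0.99)`.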


Confidence interval for a mean

• The confidence interval of the mean gives the range of plausible values for the true population mean

$$95\%\ \mathrm{CI} = \left(\bar{x} - 1.96\,\frac{\sigma}{\sqrt{n}},\ \bar{x} + 1.96\,\frac{\sigma}{\sqrt{n}}\right)$$


Example of a calculation of a confidence interval for a mean

• Sample of 100 observations, Mean height is 68” SD: 10”

• Standard error of the mean = 10 / √100 = 1

• 95% confidence limits for the population mean are 68 ± 1.96 × 1, approximately 66” to 70”

$$95\%\ \mathrm{CI} = \left(68 - 1.96\,\frac{10}{\sqrt{100}},\ 68 + 1.96\,\frac{10}{\sqrt{100}}\right) \approx (66, 70)$$
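The same calculation can be scripted. This is a minimal sketch of the normal-approximation interval from the formula above; the function name `mean_ci` and its parameters are assumptions made for this illustration.

```python
import math


def mean_ci(mean, sd, n, z=1.96):
    """Return (lower, upper) limits of the confidence interval for a mean,
    using mean ± z × SD / √n."""
    se = sd / math.sqrt(n)  # standard error of the mean
    return mean - z * se, mean + z * se


# Height example: n = 100, mean = 68 inches, SD = 10 inches.
low, high = mean_ci(mean=68, sd=10, n=100)
print(f"95% CI: {low:.2f} to {high:.2f}")  # 95% CI: 66.04 to 69.96
```

Rounded to whole inches, this gives the 66” to 70” interval quoted in the example.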


Interpretation of the calculation of the confidence interval for a mean

• The 95% confidence interval for the mean of 68 is (66, 70)

• This means that with repeated random sampling, 95% of the intervals will contain the true mean (µ)

• Since we have one of these intervals, we can be 95% confident that this interval contains the true mean
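The "repeated random sampling" interpretation can be illustrated by simulation. In the sketch below, the true mean (68) and SD (10) are borrowed from the height example and treated as known population values, which is an assumption of the illustration; the number of repeats and the seed are arbitrary.

```python
import math
import random
import statistics


def coverage(true_mean=68.0, true_sd=10.0, n=100, n_repeats=2000, seed=7):
    """Draw n_repeats samples of size n from a normal population and
    return the fraction of 95% intervals that contain the true mean."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_repeats):
        sample = [rng.gauss(true_mean, true_sd) for _ in range(n)]
        x_bar = statistics.fmean(sample)
        se = statistics.stdev(sample) / math.sqrt(n)
        # Does this sample's interval contain the true mean?
        if x_bar - 1.96 * se <= true_mean <= x_bar + 1.96 * se:
            hits += 1
    return hits / n_repeats


print(f"Share of intervals containing the true mean: {coverage():.1%}")
```

The share should come out close to 95%: each individual interval either contains µ or does not, but the procedure captures µ in about 95 of every 100 repetitions.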


Calculating a 95% confidence interval for a mean in practice

• Epi-Info, “Epitable” module

• Open-Epi calculator (Open source): www.openepi.com

• Excel


Calculating a 95% confidence interval for a mean in OpenEpi: 1/2 (Methods)

1. Choose “Mean, CI”

2. Click “Enter”

3. Enter data

4. Click “calculate”


Calculating a 95% confidence interval for a mean in OpenEpi: 2/2 (Results)


Exercise to calculate the 95% confidence interval for a mean

• Study of gestational age at birth in the past month in a sample of health care facilities

• Results of the study:
  - n = 350 births
  - Sample mean = 37.5 weeks
  - s = 12.2

• What is the 95% confidence interval?

$$95\%\ \mathrm{CI} = \left(37.5 - 1.96\,\frac{12.2}{\sqrt{350}},\ 37.5 + 1.96\,\frac{12.2}{\sqrt{350}}\right) \approx (36, 39)$$


Applying the same methods to generate confidence intervals for proportions

• The central limit theorem also applies to the distribution of sample proportions when the sample size is large enough
  - The population proportion replaces the population mean
  - The binomial distribution replaces the normal distribution

Confidence interval for a proportion

Using the binomial distribution

• The binomial distribution is a sampling distribution for p

• Formula of the standard error:

$$SE_{\text{proportion}} = \sqrt{\frac{p(1-p)}{n}}$$

Where n = sample size, p = proportion


Using the central limit theorem

• As the sample size n increases, the binomial distribution becomes very close to a normal distribution (Central limit theorem)

• Thus, we can use the normal distribution to calculate confidence intervals and test hypotheses

• If np and n(1−p) are both equal to 10 or more, then the normal approximation may be used
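The rule of thumb above is easy to check before computing an interval. A minimal sketch follows; the function name `normal_approximation_ok` and the second example (n = 30, p = 0.10) are made up for illustration.

```python
def normal_approximation_ok(n, p, threshold=10):
    """Return True when both np and n(1-p) reach the threshold,
    i.e. when the normal approximation to the binomial may be used."""
    return n * p >= threshold and n * (1 - p) >= threshold


# 363 children with a proportion of 0.17: np ≈ 62 and n(1-p) ≈ 301.
print(normal_approximation_ok(363, 0.17))  # True

# Hypothetical small survey: np is only about 3, so the rule fails.
print(normal_approximation_ok(30, 0.10))   # False
```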


Applying the concept of the confidence interval of the mean to proportions

• For means, the 95% confidence interval was:

$$95\%\ \mathrm{CI} = \left(\bar{x} - 1.96\,\frac{\sigma}{\sqrt{n}},\ \bar{x} + 1.96\,\frac{\sigma}{\sqrt{n}}\right)$$

• For proportions, we just replace the formula of the standard error of the mean by the standard error of the proportion that comes from the binomial distribution:

$$95\%\ \mathrm{CI} = \left(p - 1.96\sqrt{\frac{p(1-p)}{n}},\ p + 1.96\sqrt{\frac{p(1-p)}{n}}\right)$$


Calculation of a confidence interval for a proportion: Prevalence of goiter in Solan, Himachal Pradesh, India, 2005

• Sample of 363 children: 63 (17%) present with goiter

• Standard error of the proportion:

$$SE = \sqrt{\frac{0.17(1-0.17)}{363}} = \sqrt{\frac{0.17 \times 0.83}{363}} \approx 0.020$$

• 95% confidence limits for the proportion are 0.17 ± 1.96 × 0.020, approximately 13% to 21%
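The goiter calculation can be reproduced in a few lines. This is a minimal sketch of the normal-approximation interval built from the binomial standard error; the function name `proportion_ci` is an assumption for this illustration, and it works from the unrounded proportion 63/363 rather than 0.17.

```python
import math


def proportion_ci(successes, n, z=1.96):
    """Return (p, se, lower, upper) for a proportion, using
    p ± z × √(p(1-p)/n)."""
    p = successes / n
    se = math.sqrt(p * (1 - p) / n)  # standard error of the proportion
    return p, se, p - z * se, p + z * se


# Goiter example: 63 of 363 children.
p, se, low, high = proportion_ci(63, 363)
print(f"p = {p:.3f}, SE = {se:.3f}")
print(f"95% CI: {low:.1%} to {high:.1%}")
```

Rounded to whole percentages, the limits agree with the 13% to 21% interval in the example.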

Interpretation of the calculation of the confidence interval for the proportion

• The 95% confidence interval for the proportion of 17% is (13%, 21%)

• This means that with repeated random sampling, 95% of the intervals will contain the true proportion

• Since we have one of these intervals, we can be 95% confident that this interval contains the true proportion


Calculating a 95% confidence interval for a proportion in practice

• Epi-Info, “Epitable” module

• Open-Epi calculator (Open source): www.openepi.com


Calculating a 95% confidence interval for a proportion in OpenEpi: 1/2 (Methods)

1. Choose “Proportion”

2. Click “Enter”

3. Enter data

4. Click “calculate”


Calculating a 95% confidence interval for a proportion in OpenEpi: 2/2 (Results)


Exercise to calculate the 95% confidence interval for a proportion

• In a sample of 250 HIV infected persons with AIDS, 116 are positive for tuberculosis

• What is the 95% confidence interval?

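With p = 116/250 = 0.464:

$$95\%\ \mathrm{CI} = \left(0.464 - 1.96\sqrt{\frac{0.464 \times 0.536}{250}},\ 0.464 + 1.96\sqrt{\frac{0.464 \times 0.536}{250}}\right) \approx (0.40, 0.53)$$

That is, approximately 40% to 53%.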


From estimation to testing

• Confidence interval is about estimating

• The sampling distribution can also be used to test hypotheses
  - Statistical testing

Dealing with non-normal parent population

• If sample size exceeds 30, we are safe because the sampling distribution will approach the normal distribution

• If the sample size is smaller than 30, the distribution is different

• The 1.96 value will be replaced by another value coming from the t-distribution
  - Slightly different from the normal distribution
  - Depends upon the sample size
  - The degrees of freedom will be n-1
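For small samples, the interval described above can be written as follows. This is a sketch: the notation t with subscripts n−1 and 0.975 (the 97.5th percentile of the t-distribution with n−1 degrees of freedom) is introduced here for illustration and does not appear in the slides.

```latex
\[
95\%\ \mathrm{CI} = \left(\bar{x} - t_{n-1,\,0.975}\,\frac{s}{\sqrt{n}},\ \bar{x} + t_{n-1,\,0.975}\,\frac{s}{\sqrt{n}}\right)
\]
```

For example, with a hypothetical sample of n = 10 there are 9 degrees of freedom and the multiplier is about 2.262 instead of 1.96, so the interval is wider. As n grows, the multiplier approaches 1.96.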

Take home messages

• Confidence intervals use the central limit theorem to estimate a range of possible values for the population parameter on the basis of the sample estimate, the standard deviation and the sample size

• The 95% confidence interval lies at ± 1.96 times the standard error, which is calculated using different methods for means (s/√n) and proportions (√[p(1-p)/n])
