How sure we really are Confidence intervals for means and proportions FETP India.
Competency to be gained from this lecture
Calculate 95% confidence intervals for means and proportions
Key issues
• Concept of confidence interval
• Confidence interval for means
• Confidence interval for proportions
What we learnt so far (1/3)
• Population parameters are fixed
• We can take samples from the population
• Several samples of size ‘n’ are possible
• Each sample gives estimates (e.g., means) called “statistics”
• Statistics vary from sample to sample
  - This is called “sampling fluctuation”
Concept of confidence interval
What we learnt so far (2/3)
• The distribution of a statistic for all possible samples of given size ‘n’ is called “sampling distribution”
• For large ‘n’, the sampling distribution of the mean is approximately ‘normal’ even if the original distribution is not
• If the original distribution is normal, the result is true even for small ‘n’
What we learnt so far (3/3)
• The mean of the sampling distribution is the ‘population mean’
• The standard deviation of the sampling distribution is known as the standard error (SE): SE = population SD / √n
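The points recalled above can be checked by simulation. The following is a minimal sketch, assuming a hypothetical skewed population (exponential, with mean 10 and SD 10), samples of size 50 and 5000 repeated samples; none of these values come from the lecture.

```python
import random
import statistics

random.seed(1)  # fixed seed so the run is reproducible

POP_MEAN = 10.0   # hypothetical population mean (exponential distribution)
POP_SD = 10.0     # for an exponential distribution, the SD equals the mean
N = 50            # size of each sample
N_SAMPLES = 5000  # number of repeated samples

def sample_mean(n):
    """Draw one sample of size n from the skewed population and return its mean."""
    sample = [random.expovariate(1 / POP_MEAN) for _ in range(n)]
    return statistics.fmean(sample)

# Empirical sampling distribution of the mean
means = [sample_mean(N) for _ in range(N_SAMPLES)]

mean_of_means = statistics.fmean(means)  # expected to be close to POP_MEAN
sd_of_means = statistics.stdev(means)    # expected to be close to the standard error
standard_error = POP_SD / N ** 0.5       # SE = population SD / sqrt(n)

print(f"mean of sample means:    {mean_of_means:.2f}")
print(f"SD of sample means:      {sd_of_means:.2f}")
print(f"population SD / sqrt(n): {standard_error:.2f}")
```

The mean of the sample means should land near the population mean, and their standard deviation near SD/√n, even though the population itself is far from normal.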
Easy to estimate the standard deviation, difficult to estimate the mean
• Samples generate sample means and standard errors
• The usefulness of these estimates varies:
  - The standard deviation from a single sample is a fair estimate of the population SD for large ‘n’
  - The mean from a single sample may not be a fair estimate of the population mean
How can the population mean be estimated?
• It is desirable to give a range of values with a specific level of confidence that the true population mean is one of the values in the range
• We can obtain this range from the sampling distribution, which is ‘normal’, using two properties of the ‘normal’ distribution
  - Mean
  - Standard deviation
From the standard error (SE) to the confidence interval
• The point estimate x̄ (the mean in the sample) is a point in the sampling distribution, and there is a 95% chance that it lies in the interval µ ± 1.96 SE
• But µ is not known
• Interchanging µ and x̄, we can infer that there is a 95% chance that µ lies in the interval x̄ ± 1.96 SE
Inference using various levels of confidence
• Using the properties of the normal distribution, we can infer what proportion of the values lies between two given values
• Considering the distribution of the sample means:
  - 68% of sample means will lie within 1 standard error above or below the population mean
  - 95% of sample means will lie within 1.96 standard errors above or below the population mean
• “1.96” comes from the standard z table for a two-sided alpha = 0.05
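The 68%, 95% and 1.96 figures can be reproduced without a printed z table. This is a small sketch using `statistics.NormalDist` from the Python standard library; the helper names `central_area` and `z_multiplier` are illustrative, not part of the lecture.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal distribution: mean 0, SD 1

def central_area(k):
    """Proportion of a normal distribution lying within k SDs of its mean."""
    return z.cdf(k) - z.cdf(-k)

def z_multiplier(confidence):
    """z value that leaves (1 - confidence) / 2 in each tail."""
    alpha = 1 - confidence
    return z.inv_cdf(1 - alpha / 2)

print(round(central_area(1), 3))     # area within 1 SE of the mean
print(round(central_area(1.96), 3))  # area within 1.96 SE of the mean
print(round(z_multiplier(0.95), 2))  # multiplier for a 95% interval
```

The three printed values are about 0.683, 0.95 and 1.96. Calling `z_multiplier(0.90)` or `z_multiplier(0.99)` gives the multipliers for other levels of confidence.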
Confidence interval for a mean
• The confidence interval of the mean gives the range of plausible values for the true population mean
$$95\%\ \mathrm{CI} = \left(\bar{x} - 1.96\,\frac{\sigma}{\sqrt{n}},\ \bar{x} + 1.96\,\frac{\sigma}{\sqrt{n}}\right)$$
Example of a calculation of a confidence interval for a mean
• Sample of 100 observations: mean height is 68”, SD is 10”
• Standard error of the mean = 10 / √100 = 1
• 95% confidence limits for the population mean are 68 ± 1.96 × 1, approximately 66” to 70”
$$95\%\ \mathrm{CI} = \left(68 - 1.96\,\frac{10}{\sqrt{100}},\ 68 + 1.96\,\frac{10}{\sqrt{100}}\right) \approx (66, 70)$$
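The height calculation can be written as a short function. This is a sketch only: `ci_mean` is an illustrative helper name, and it assumes the z-based formula with the sample SD standing in for σ, as in the example.

```python
import math

def ci_mean(mean, sd, n, z=1.96):
    """Return (lower, upper) limits of a z-based confidence interval for a mean."""
    se = sd / math.sqrt(n)  # standard error of the mean
    return mean - z * se, mean + z * se

# Height example: n = 100, mean = 68 inches, SD = 10 inches
lower, upper = ci_mean(68, 10, 100)
print(f"95% CI: {lower:.2f} to {upper:.2f} inches")
```

The unrounded limits are 66.04 and 69.96 inches, which the slide reports as approximately 66” to 70”.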
Interpretation of the calculation of the confidence interval for a mean
• The 95% confidence interval for the mean of 68 is (66, 70)
• This means that with repeated random sampling, 95% of the intervals will contain the true mean (µ)
• Since we have one of these intervals, we can be 95% confident that this interval contains the true mean
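The "repeated random sampling" interpretation can be illustrated by simulation. This is a sketch under stated assumptions: a hypothetical normal population with true mean 68 and SD 10, samples of 100, and 2000 repetitions, all chosen for illustration rather than taken from a real study.

```python
import math
import random

random.seed(2)  # fixed seed so the run is reproducible

MU, SIGMA = 68.0, 10.0  # hypothetical "true" population mean and SD
N = 100                 # size of each sample
REPEATS = 2000          # number of repeated random samples

def interval_contains_mu():
    """Draw one sample, build its 95% CI, and report whether it contains MU."""
    sample = [random.gauss(MU, SIGMA) for _ in range(N)]
    x_bar = sum(sample) / N
    se = SIGMA / math.sqrt(N)  # standard error using the population SD
    return x_bar - 1.96 * se <= MU <= x_bar + 1.96 * se

hits = sum(interval_contains_mu() for _ in range(REPEATS))
coverage = hits / REPEATS
print(f"{coverage:.1%} of the {REPEATS} intervals contain the true mean")
```

The printed proportion should be close to 95%. Each individual interval either contains µ or does not; the 95% describes the procedure over many samples.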
Calculating a 95% confidence interval for a mean in practice
• Epi-Info, “Epitable” module
• Open-Epi calculator (open source): www.openepi.com
• Excel
Calculating a 95% confidence interval for a mean in OpenEpi: 1/2 (Methods)
1. Choose “Mean, CI”
2. Click “Enter”
3. Enter data
4. Click “calculate”
Calculating a 95% confidence interval for a mean in OpenEpi: 2/2 (Results)
Exercise to calculate the 95% confidence interval for a mean
• Study of gestational age at birth in the past month in a sample of health care facilities
• Results of the study: n = 350 births, sample mean = 37.5 weeks, s = 12.2 weeks
• What is the 95% confidence interval?
$$95\%\ \mathrm{CI} = \left(37.5 - 1.96\,\frac{12.2}{\sqrt{350}},\ 37.5 + 1.96\,\frac{12.2}{\sqrt{350}}\right) = (36.2, 38.8) \approx (36, 39)$$
Applying the same methods to generate confidence intervals for proportions
• The central limit theorem also applies to the distribution of sample proportions when the sample size is large enough
  - The population proportion replaces the population mean
  - The binomial distribution replaces the normal distribution
Confidence interval for a proportion
Using the binomial distribution
• The binomial distribution is a sampling distribution for p
• Formula of the standard error:
$$SE_{\text{proportion}} = \sqrt{\frac{p(1-p)}{n}}$$
where n = sample size, p = proportion
Using the central limit theorem
• As the sample size n increases, the binomial distribution becomes very close to a normal distribution (central limit theorem)
• Thus, we can use the normal distribution to calculate confidence intervals and test hypotheses
• If np and n(1-p) are both equal to 10 or more, then the normal approximation may be used
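The rule of thumb for the normal approximation is easy to check before computing an interval. This is a minimal sketch: `normal_approximation_ok` is an illustrative helper, the first call uses the sample size and proportion of the goiter example in this lecture, and the second call uses hypothetical values.

```python
def normal_approximation_ok(n, p, threshold=10):
    """True when both np and n(1 - p) reach the threshold."""
    return n * p >= threshold and n * (1 - p) >= threshold

# Goiter example from this lecture: np is about 62, n(1 - p) is about 301
print(normal_approximation_ok(363, 0.17))

# Hypothetical small study: np = 1.5, too few expected cases
print(normal_approximation_ok(30, 0.05))
```

The first check passes and the second fails, so the normal approximation would not be appropriate for the small hypothetical study.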
Applying the concept of the confidence interval of the mean to proportions
• For means, the 95% confidence interval was:
$$95\%\ \mathrm{CI} = \left(\bar{x} - 1.96\,\frac{\sigma}{\sqrt{n}},\ \bar{x} + 1.96\,\frac{\sigma}{\sqrt{n}}\right)$$
• For proportions, we just replace the standard error of the mean by the standard error of the proportion, which comes from the binomial distribution:
$$95\%\ \mathrm{CI} = \left(p - 1.96\sqrt{\frac{p(1-p)}{n}},\ p + 1.96\sqrt{\frac{p(1-p)}{n}}\right)$$
Calculation of a confidence interval for a proportion: Prevalence of goiter in Solan, Himachal Pradesh, India, 2005
• Sample of 363 children: 63 (17%) present with goiter
• Standard error of the proportion:
$$SE = \sqrt{\frac{0.17\,(1-0.17)}{363}} = \sqrt{\frac{0.17 \times 0.83}{363}} \approx 0.020$$
• 95% confidence limits for the proportion are 0.17 ± 1.96 × 0.020, approximately 13% to 21%
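The goiter calculation can be reproduced with a short function. This is a sketch assuming the normal approximation described above; `ci_proportion` is an illustrative helper name, and it works from the unrounded proportion 63/363 rather than 0.17.

```python
import math

def ci_proportion(cases, n, z=1.96):
    """Return (p, se, lower, upper) using the normal approximation."""
    p = cases / n
    se = math.sqrt(p * (1 - p) / n)  # standard error of the proportion
    return p, se, p - z * se, p + z * se

# Goiter example: 63 of 363 children
p, se, lower, upper = ci_proportion(63, 363)
print(f"p = {p:.3f}, SE = {se:.3f}, 95% CI = {lower:.1%} to {upper:.1%}")
```

The unrounded limits are about 13.5% and 21.3%, in line with the approximate 13% to 21% on the slide.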
Interpretation of the calculation of the confidence interval for the proportion
• The 95% confidence interval for the proportion of 17% is (13%, 21%)
• This means that with repeated random sampling, 95% of the intervals will contain the true proportion
• Since we have one of these intervals, we can be 95% confident that this interval contains the true proportion
Calculating a 95% confidence interval for a proportion in practice
• Epi-Info, “Epitable” module
• Open-Epi calculator (open source): www.openepi.com
Calculating a 95% confidence interval for a proportion in OpenEpi: 1/2 (Methods)
1. Choose “Proportion”
2. Click “Enter”
3. Enter data
4. Click “calculate”
Calculating a 95% confidence interval for a proportion in OpenEpi: 2/2 (Results)
Exercise to calculate the 95% confidence interval for a proportion
• In a sample of 250 HIV infected persons with AIDS, 116 are positive for tuberculosis (p = 116/250 = 0.464)
• What is the 95% confidence interval?
$$95\%\ \mathrm{CI} = \left(0.464 - 1.96\sqrt{\frac{0.464 \times 0.536}{250}},\ 0.464 + 1.96\sqrt{\frac{0.464 \times 0.536}{250}}\right) \approx (40\%, 53\%)$$
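The exercise can be checked numerically. This sketch, again assuming the normal approximation, also shows how much the limits move when p is rounded to 0.46 before the calculation instead of using 116/250; `ci_proportion` is an illustrative helper, not a function from any package.

```python
import math

def ci_proportion(p, n, z=1.96):
    """Return (lower, upper) limits for a proportion p observed in n persons."""
    se = math.sqrt(p * (1 - p) / n)
    return p - z * se, p + z * se

exact = ci_proportion(116 / 250, 250)  # p = 0.464
rounded = ci_proportion(0.46, 250)     # p rounded to two decimals first

print("exact p:  ", [f"{limit:.1%}" for limit in exact])
print("rounded p:", [f"{limit:.1%}" for limit in rounded])
```

With the exact proportion the limits are about 40.2% and 52.6%; with the rounded proportion they are about 39.8% and 52.2%. Rounding only at the last step avoids this small drift.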
From estimation to testing
• Confidence interval is about estimating
• The sampling distribution can also be used to test hypotheses
  - Statistical testing
Dealing with non-normal parent population
• If the sample size exceeds 30, we are safe because the sampling distribution will approach the normal distribution
• If the sample size is smaller than 30, the distribution is different
• The 1.96 value will be replaced by another value coming from the t-distribution
  - Slightly different from the normal distribution
  - Depends upon the sample size
  - The degrees of freedom will be n-1
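To see how the multiplier depends on the sample size, the t value for n-1 degrees of freedom can be computed numerically. In practice a t table or statistical software supplies this value; the sketch below uses only the Python standard library, and the numerical integration (Simpson's rule) and bisection search are implementation choices made for this illustration, not part of the lecture.

```python
import math

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    log_c = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    c = math.exp(log_c) / math.sqrt(df * math.pi)
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x, df, steps=2000):
    """P(T <= x) for x >= 0, using Simpson's rule on [0, x] and symmetry."""
    h = x / steps
    total = t_pdf(0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3

def t_critical(n, confidence=0.95):
    """Two-sided multiplier that replaces 1.96 for a sample of size n."""
    df = n - 1                           # degrees of freedom
    target = 1 - (1 - confidence) / 2    # e.g. 0.975 for a 95% interval
    lo, hi = 0.0, 100.0
    for _ in range(60):                  # bisection on the cumulative probability
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

for n in (5, 10, 30, 100):
    print(n, round(t_critical(n), 3))
```

The multiplier is noticeably larger than 1.96 for small samples and approaches 1.96 as n grows.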
Take home messages
• Confidence intervals use the central limit theorem to estimate a range of possible values for the population parameter on the basis of the sample estimate, the standard deviation and the sample size
• The 95% confidence interval lies at ± 1.96 times the standard error, which is calculated using different methods for means (s/√n) and proportions (√[p(1-p)/n])