Confidence intervals. Estimation and uncertainty Theoretical distributions require input parameters....


Confidence intervals


Estimation and uncertainty

Theoretical distributions require input parameters.

For example, the weight of male students in NUS follows a Normal($\mu$, $\sigma^2$) distribution. How do we know what $\mu$ and $\sigma^2$ should be?

We can model the hourly number of admissions to the A&E department at NUH using a Poisson(2.8) distribution. How is the figure of 2.8 obtained?

In comparing the heights of male and female students in NUS, one strategy is to compare the mean heights of the two groups of students. What does this mean, and how do we tell a genuine biological difference apart from an artefactual one?
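As a rough illustration of where such input parameters come from, the sketch below estimates them from sample data. The numbers are made up for illustration (the admission counts were simply chosen so that their mean is 2.8); they are not the data behind the slides. It assumes the usual estimators: the sample mean for a Poisson rate, and the sample mean and sample variance for a Normal distribution.

```python
from statistics import mean, variance

# Hypothetical hourly admission counts (made-up numbers, for illustration only).
admissions_per_hour = [3, 2, 4, 1, 3, 5, 2, 3, 2, 3]

# For a Poisson model, the usual estimate of the rate is the sample mean.
rate_estimate = mean(admissions_per_hour)

# Hypothetical weights in kg (made-up numbers, for illustration only).
weights_kg = [62.5, 70.1, 68.4, 75.0, 66.2, 71.3, 64.8, 69.7]

# For a Normal model, estimate the mean and variance from the sample.
mu_estimate = mean(weights_kg)
sigma2_estimate = variance(weights_kg)  # sample variance (divides by n - 1)

print(f"Estimated Poisson rate: {rate_estimate:.2f} admissions per hour")
print(f"Estimated Normal mean: {mu_estimate:.2f} kg")
print(f"Estimated Normal variance: {sigma2_estimate:.2f} kg^2")
```

Each estimate is computed from one particular sample, so a different sample would give a slightly different value. That is the uncertainty the rest of the lecture sets out to quantify.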


Data exploration and statistical analysis

1. Data checking: identifying problems and characteristics

2. Understanding chance and uncertainty

3. How will the data for one attribute behave, in a theoretical framework?

4. The theoretical framework assumes complete information; we need to address the uncertainties in real data


Data

Data exploration, categorical / numerical outcomes

Estimation of parameters, quantifying uncertainty

Model each outcome with a theoretical distribution


Estimation

Generally, before any statistical comparisons can be made, there are always parameters that need to be estimated.

Recall the bridge between the sample and the population.

In most situations in applied research, especially in biomedical sciences, the key interest is what happens in the population. The sample is really a way of estimating what will happen in the population.


Example 1: Let's suppose the Science Faculty is interested in comparing the weights of male and female students in NUS. How would this study be designed?

- The key interest is to summarise the weight of all the male students in NUS, and the weight of all the female students in NUS.
- It is a reasonable assumption that the weights of the students of each gender are normally distributed.
- Randomly sample 200 male students and 200 female students and measure their weights.
- Calculate the mean weight of these 200 male students, and use this quantity to estimate the mean weight of all the male students in NUS.
- Similarly, calculate the mean weight of these 200 female students and use this to estimate the mean weight of all the female students in NUS.
- We can compare the estimated mean weights of the male and female students, but how do we know that any difference is not simply an artefact of which students happened to be sampled?
- Can we quantify the uncertainty in the estimation, when we use the calculated sample mean weight to estimate the population mean weight?
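To see why the sample mean carries uncertainty, the following sketch simulates the sampling step of Example 1. The population is entirely hypothetical: 10,000 male students with weights drawn from a Normal distribution with mean 68 kg and standard deviation 10 kg. These figures are assumptions for illustration, not NUS data.

```python
import random
from statistics import mean

random.seed(1)  # fixed seed so the illustration is reproducible

# Hypothetical population of male students (assumed Normal(68, 10^2) weights in kg).
population = [random.gauss(68, 10) for _ in range(10_000)]
population_mean = mean(population)

# Draw five independent random samples of 200 students and record each sample mean.
sample_means = [mean(random.sample(population, 200)) for _ in range(5)]

print(f"Population mean: {population_mean:.2f} kg")
for i, m in enumerate(sample_means, start=1):
    print(f"Sample {i} mean: {m:.2f} kg")
```

Every sample gives a slightly different mean even though the population has not changed. A difference between two sample means may therefore partly reflect this sample-to-sample variation.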


Confidence intervals

• It is not sufficient to provide only an estimated quantity; we also need to quantify the extent of uncertainty involved in the estimation.

• The confidence intervals here assume the data have a bell-shaped / symmetric distribution, and are calculated about the mean.

[Figure: plot of AGE (axis from 20 to 80) with the mean age (54.6 years) marked]


Remarks on Confidence Intervals

• The interval is random; the parameter being estimated is not.

• The width of the interval is a measure of precision. The confidence level is a measure of accuracy.

• The width of a CI depends on the magnitude of the uncertainty (the standard error) and on the level of confidence required.

• Assumptions must be satisfied before constructing CIs.

[Figure: two plots of AGE (axis from 20 to 80), one with a mean age of 54.6 years marked and one with a mean age of 55.2 years marked]
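The remark about width can be made concrete with a small sketch. It assumes a normal-based interval for a mean, whose width is 2 × multiplier × SD / √n. The function name `ci_width` and the standard deviation of 10 are illustrative choices, not taken from the slides.

```python
from math import sqrt
from statistics import NormalDist


def ci_width(sd, n, level):
    """Width of a normal-based CI for a mean: 2 * z * SD / sqrt(n)."""
    z = NormalDist().inv_cdf(0.5 + level / 2)  # two-sided multiplier
    return 2 * z * sd / sqrt(n)


# Assumed standard deviation of 10, for illustration only.
for n in (100, 400):
    for level in (0.90, 0.95, 0.99):
        print(f"n = {n:3d}, level = {level:.0%}: width = {ci_width(10, n, level):.2f}")
```

Under these assumptions, a larger sample (smaller standard error) gives a narrower interval, while asking for more confidence gives a wider one.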


Calculating confidence intervals

• Confidence intervals can be calculated for any estimated quantity.

• They are fundamentally related to the concept of quantifying the degree of uncertainty in the estimation.

• Calculate the quantity of interest (sample mean, sample proportion, etc.; to be covered over the remaining sessions).

• Calculate the standard error associated with estimating the quantity.

[Figure: diagram labelled "Quantity of interest"]
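As one possible illustration of the two steps (estimate the quantity, then its standard error), here is a sketch for a sample proportion. It assumes the common normal-approximation interval, estimate ± multiplier × SE with SE = √(p(1 − p)/n). The function name `proportion_ci` and the counts (60 out of 200) are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def proportion_ci(successes, n, level=0.95):
    """Normal-approximation CI for a proportion (a sketch, not the only method)."""
    p_hat = successes / n                      # step 1: quantity of interest
    se = sqrt(p_hat * (1 - p_hat) / n)         # step 2: its standard error
    z = NormalDist().inv_cdf(0.5 + level / 2)  # multiplier for the chosen level
    return p_hat - z * se, p_hat + z * se


# Hypothetical data: 60 of 200 sampled students have some characteristic.
lower, upper = proportion_ci(60, 200)
print(f"Estimated proportion: {60 / 200:.3f}")
print(f"Approximate 95% CI: ({lower:.3f}, {upper:.3f})")
```

The same pattern applies to other quantities; only the formula for the standard error changes.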


Standard deviation and standard error

• Two extremely different concepts!

Standard deviation (SD): Used to quantify the variability or dispersion (spread) in a collection of numbers. It quantifies the 'distance' of the data from their average/mean, and is used to summarize the distribution of a collection of numbers.

A large SD means the collection of numbers is widely dispersed about the mean, while a small SD means the numbers are concentrated about the mean value.

Standard error (SE): Used to quantify the degree of uncertainty in estimating the population mean with the sample mean.

A large SE indicates that there is considerable uncertainty about whether the sample mean is a good estimate of the population mean.
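The difference between the two can be seen numerically. The sketch below assumes the standard formula for the standard error of a sample mean, SE = SD / √n, and uses simulated weights from a hypothetical Normal(68, 10²) population.

```python
import random
from math import sqrt
from statistics import stdev

random.seed(2)  # fixed seed so the illustration is reproducible

results = {}
for n in (25, 100, 400):
    # Simulated sample from a hypothetical Normal(68, 10^2) population.
    sample = [random.gauss(68, 10) for _ in range(n)]
    sd = stdev(sample)  # spread of the individual weights
    se = sd / sqrt(n)   # uncertainty in the sample mean
    results[n] = (sd, se)
    print(f"n = {n:3d}: SD = {sd:5.2f}, SE = {se:4.2f}")
```

The SD stays roughly the same as the sample grows, because it describes the data themselves. The SE shrinks, because a larger sample estimates the population mean more precisely.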


95% Confidence Intervals

• A 95% confidence interval extends approximately 2 standard errors either side of the sample mean (more precisely, 1.96 SE).

• This is the most common form of CI reported in research.

• We will explore CIs further in subsequent lectures.

[Figure: diagram with labels "Sample mean", "Standard deviation", "90% CI" and "99% CI"]
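The figure of 1.96 comes from the standard Normal distribution. A short check using Python's standard library gives the multipliers for the 90%, 95% and 99% levels, assuming normal-based intervals:

```python
from statistics import NormalDist

# Two-sided multipliers: the point that leaves (1 - level) / 2 in each tail.
multipliers = {
    level: NormalDist().inv_cdf(0.5 + level / 2)
    for level in (0.90, 0.95, 0.99)
}

for level, z in multipliers.items():
    print(f"{level:.0%} CI: estimate +/- {z:.3f} x SE")
```

The "2 standard errors" rule of thumb is a rounding of the 95% multiplier.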


Interpreting Confidence Intervals

• If we were to:

• repeat the experiment 100 times

• construct a 95% CI each time

• then we would expect about 95 of the 100 CIs to cover (include) the true population value.
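This repeated-experiment interpretation can be checked by simulation. The sketch below assumes a hypothetical population with true mean 68 and SD 10, samples of 200, and normal-based 95% intervals. It uses 1,000 repetitions rather than 100 so that the observed proportion is more stable.

```python
import random
from math import sqrt
from statistics import NormalDist, mean, stdev

random.seed(3)  # fixed seed so the illustration is reproducible

TRUE_MEAN, TRUE_SD, N = 68, 10, 200  # hypothetical population and sample size
z = NormalDist().inv_cdf(0.975)      # multiplier for a 95% interval


def one_interval():
    """Run one 'experiment': draw a sample and build its 95% CI for the mean."""
    sample = [random.gauss(TRUE_MEAN, TRUE_SD) for _ in range(N)]
    m = mean(sample)
    se = stdev(sample) / sqrt(N)
    return m - z * se, m + z * se


n_repeats = 1000
covered = 0
for _ in range(n_repeats):
    low, high = one_interval()
    if low <= TRUE_MEAN <= high:
        covered += 1

coverage = covered / n_repeats
print(f"Proportion of intervals covering the true mean: {coverage:.3f}")
```

The true mean never changes; it is the interval that moves from one experiment to the next.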


Confidence intervals and RExcel / SPSS

Consider the mathematics and omega3 consumption dataset that can be downloaded from

http://www.statistics.nus.edu.sg/~statyy/ST1232/bin/mathematics.xls

Calculate the confidence interval for the mean of the marks before the start of omega3 consumption.


Confidence interval for the mean of the marks before the start of omega3 consumption = (67.95, 70.08)


What about the confidence interval for the mean omega 3 consumption?


Students should be able to:

• understand the concept of estimation and how it leads to uncertainty in statistics

• differentiate between a standard deviation and a standard error

• understand how a confidence interval is constructed

• understand and interpret a confidence interval

• calculate the confidence interval in RExcel and SPSS when given the data

• know the assumption required for the use of a confidence interval