MBP1010 - Lecture 2: January 14, 2009 1. Density curves and standard normal distribution 2. Sampling...

MBP1010 - Lecture 2: January 14, 2009

1. Density curves and standard normal distribution

2. Sampling distribution of the mean

4. Confidence Interval for the mean

5. Hypothesis testing (1 sample t test)

Reading: Introduction to the Practice of Statistics: 1.3, 3.4, 5.2, 6.1-6.4 and 7.1

Standard deviation vs standard error for describing data

Table 1. Characteristics of study subjects (n=35)

Importance of Normal Distribution*

1. Distributions of real data are often close to normal.

2. Mathematically easy to work with, so many statistical tests are designed for normal (or close to normal) distributions.

3. If the mean and SD of a normal distribution are known, you can make quantitative predictions about the population.

* also called Gaussian curve

Red bars = scores ≤ 6. Proportion = 0.303

Red area under the density curve = scores ≤ 6. Proportion = 0.293

Cumulative proportion for value x is the proportion of all observations that are ≤ x; this is the area under the curve to the left of x.

Mean = 64.5 inches; SD = 2.5 inches

“The 68-95-99.7 Rule”
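The rule can be checked numerically. The following is a minimal sketch in R, assuming the mean of 64.5 inches and SD of 2.5 inches quoted above describe a normal distribution; the function and variable names are illustrative only.

```r
# Parameters quoted on the slide (assumed to describe a normal distribution)
mu <- 64.5     # mean, inches
sigma <- 2.5   # standard deviation, inches

# Proportion of a normal distribution lying within k SDs of the mean
within_k_sd <- function(k) {
  pnorm(mu + k * sigma, mean = mu, sd = sigma) -
    pnorm(mu - k * sigma, mean = mu, sd = sigma)
}

within_1 <- within_k_sd(1)   # about 0.68
within_2 <- within_k_sd(2)   # about 0.95
within_3 <- within_k_sd(3)   # about 0.997
print(round(c(within_1, within_2, within_3), 3))
```

The same proportions hold for any mean and SD, because the rule depends only on the number of SDs from the mean.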

The standard normal distribution is a normal distribution with a mean of 0 and a SD of 1. Normal distributions can be transformed to standard normal distributions by the formula:

z = (X - μ)/σ

where X is a score from the original normal distribution, μ is the mean of the original normal distribution, and σ is the standard deviation of the original normal distribution.

The standard normal distribution is sometimes called the z distribution.

Standardized Normal Distribution

A z score always reflects the number of standard deviations above or below the mean a particular score is.

Ex. If a person scored 70 on a test with a mean of 50 and SD of 10, then they scored 2 standard deviations above the mean. Converting the test scores to z scores, an X of 70 would be:

z = (70 - 50)/10 = 2

So, a z score of 2 means the original score was 2 SD above the mean.
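The test-score example can be reproduced in a few lines. This is a minimal sketch in R; `z_score` is a hypothetical helper name, not a built-in function.

```r
# Convert a raw score to a z score: the number of SDs above or below the mean
z_score <- function(x, mu, sigma) {
  (x - mu) / sigma
}

z <- z_score(70, mu = 50, sigma = 10)   # the test-score example
print(z)

# Cumulative proportion: area under the standard normal curve to the left of z
print(pnorm(z))
```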

Z-score

Z Scores

- Provide a meaningful way to compare individuals from different normal distributions on the same scale, ie. how many SD above or below the mean?

Eg,
- bone density measures
- growth charts: height of children at different ages
- “normalized” data

A Q-Q plot shows the theoretical quantiles versus the empirical quantiles. If the distribution is “normal”, the points should fall approximately along a straight line.

Quantile-Quantile (Q-Q) Plot
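A minimal sketch of a Q-Q plot in R, using simulated data (the two data sets are hypothetical and only illustrate the contrast between a normal and a skewed sample):

```r
set.seed(1)   # arbitrary seed so the sketch is reproducible
normal_data <- rnorm(100, mean = 50, sd = 10)   # hypothetical normal data
skewed_data <- rexp(100, rate = 1)              # hypothetical right-skewed data

# qqnorm() plots empirical quantiles against theoretical normal quantiles;
# qqline() adds a reference line through the first and third quartiles.
qqnorm(normal_data, main = "Normal data"); qqline(normal_data)
qqnorm(skewed_data, main = "Skewed data"); qqline(skewed_data)

# Correlation between theoretical and empirical quantiles:
# the closer to 1, the straighter the Q-Q plot
qq_normal <- qqnorm(normal_data, plot.it = FALSE)
qq_skewed <- qqnorm(skewed_data, plot.it = FALSE)
r_normal <- cor(qq_normal$x, qq_normal$y)
r_skewed <- cor(qq_skewed$x, qq_skewed$y)
print(c(normal = r_normal, skewed = r_skewed))
```

The correlation is only a rough numerical summary; the plot itself is usually more informative about where the departure from normality occurs.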

Rice Virtual Lab in Statistics

http://onlinestatbook.com/rvls/

Hyperstat Online

Section 5. Normal Distribution - theory

Sampling and Estimation

Populations and Samples

Population: entire group of individuals that we want information about

Sample: a part of the population that we actually examine in order to gather information

Goal: to try to draw conclusions about the population from the sample

Whole Population: Mean = μ, SD = σ

Sample: Mean = x̄, SD = s

Sample Inference

Parameter:

- a number that describes the population
- number is fixed but in practice we do not know its value (eg, μ)

Statistic:

- a number that describes a sample (eg, x̄)
- its value is known when we take a sample, but it can change from sample to sample
- often used to estimate an unknown parameter

Statistical inference is the process by which we draw conclusions about the population from the results observed in a sample.

Two main methods used in inferential statistics: estimation and hypothesis testing.

In estimation, the sample is used to estimate a parameter and a confidence interval about the estimate is constructed.

Random Sampling is Key!

- every individual in the population sampled must have a chance of being included in the sample

- the choice of one subject does not influence the chance of other subjects being chosen

- use a method of sampling in which chance alone operates
  - toss of a coin, draw from a hat
  - random number generators

- random assignment in clinical trials results in randomly selected groups

- the chance of being selected is equal for each individual in the population

- every possible sample has an equal chance to be chosen

Simple Random Sampling (SRS)

Stratified Sampling

- divide the population into strata
- choose SRS in each stratum
- combine these SRS to form full sample

eg. Strata: prognostic factors in cancer patients; male/female, age

- consult a statistician for more complex sampling
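Both sampling schemes can be sketched with base R's `sample()`. The population below (200 patients, stratified by sex) is entirely hypothetical, and the sample sizes are chosen only for illustration.

```r
set.seed(2)   # arbitrary seed for reproducibility

# Hypothetical population of 200 patients with one stratifying factor (sex)
population <- data.frame(
  id  = 1:200,
  sex = rep(c("female", "male"), times = c(120, 80))
)

# Simple random sample: every possible set of 20 patients is equally likely
srs <- population[sample(nrow(population), size = 20), ]

# Stratified sample: an SRS of 10 within each stratum, then combined
strata <- split(population, population$sex)
stratified <- do.call(rbind, lapply(strata, function(stratum) {
  stratum[sample(nrow(stratum), size = 10), ]
}))

print(table(srs$sex))          # composition varies from sample to sample
print(table(stratified$sex))   # exactly 10 per stratum by design
```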

Sample mean (x̄) as an estimator of the population mean (μ)

What would happen if we repeated the sample several times?

Sampling variability:

- repeated samples from the same population will not have the same mean

- depends partly on how variable the underlying population is and on the size of the sample selected

Sampling Distribution of x̄

- the distribution of values taken by the mean (x̄) in all possible samples of the same size from the same population

1. Mean of sampling distribution of x̄ = μ

2. SD of sampling distribution = σ/√n - called standard error of the mean

3. Shape of the sampling distribution is approximately a normal curve, regardless of the shape of the population distribution, provided n is large enough (Central Limit Theorem)

Simulation of Sampling Distribution: Central Limit Theorem

Rice Virtual Lab in Statistics

http://onlinestatbook.com/rvls/

Population: All MBP1010 students (n=37): μ = 1.00 cup, σ = 1.07 cups

One randomly selected sample (n=12): x̄ = 0.875, s = 0.78, SEM = s/√n = 0.23

Sampling distribution (1000 repeats of n=12): Mean = 1.00, SD = 0.26
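A simulation along the same lines can be sketched in R. The class data are not available here, so the population below is a hypothetical right-skewed one; only the sample size (n = 12) and the number of repeats (1000) are taken from the slides.

```r
set.seed(3)   # arbitrary seed for reproducibility

# Hypothetical right-skewed population (NOT the class data)
population <- rexp(10000, rate = 1)
n <- 12          # sample size, as in the slides
repeats <- 1000  # number of repeated samples, as in the slides

# Draw repeated samples and record the mean of each one
sample_means <- replicate(repeats, mean(sample(population, size = n)))

pop_mean <- mean(population)
pop_sd <- sd(population)
print(c(population_mean = pop_mean, mean_of_sample_means = mean(sample_means)))
print(c(sigma_over_root_n = pop_sd / sqrt(n), sd_of_sample_means = sd(sample_means)))

# The histogram of sample means is much closer to a normal curve than the population
hist(sample_means, main = "Sampling distribution of the mean (n = 12)")
```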

Confidence Interval of the Mean

[Figure: Standard Normal Distribution. The central area between z = -1.96 (2.5th percentile) and z = 1.96 (97.5th percentile) is 0.95 (the 95% confidence interval); each tail area is 0.025.]

95% Confidence Interval for a population mean

If population σ known (not realistic)

Express x̄ in standardized form (z statistic): z = (x̄ - μ)/(σ/√n)

Pr(-1.96 ≤ z ≤ 1.96) = 0.95

Pr(-1.96 ≤ (x̄ - μ)/(σ/√n) ≤ 1.96) = 0.95

Pr(x̄ - 1.96σ/√n ≤ μ ≤ x̄ + 1.96σ/√n) = 0.95

x̄ - 1.96(σ/√n) and x̄ + 1.96(σ/√n) are the 95 percent confidence limits on the population mean

In the long run, 95% of all samples will have an interval that includes μ.

24 out of 25 samples included μ (96%)
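The long-run coverage can be illustrated by simulation. This is a sketch only: it assumes a normal population with the μ and σ quoted earlier in the slides, and the number of repeats is arbitrary.

```r
set.seed(4)   # arbitrary seed for reproducibility

mu <- 1.00      # population mean (value quoted in the slides)
sigma <- 1.07   # population SD (value quoted in the slides)
n <- 12
repeats <- 1000

# For each repeated sample, check whether x_bar +/- 1.96 * sigma / sqrt(n) contains mu
covered <- replicate(repeats, {
  x_bar <- mean(rnorm(n, mean = mu, sd = sigma))   # normal population is an assumption
  half_width <- 1.96 * sigma / sqrt(n)
  (x_bar - half_width <= mu) && (mu <= x_bar + half_width)
})

coverage <- mean(covered)
print(coverage)   # should be close to 0.95
```

Any single run will not give exactly 95%, just as 24 of the 25 intervals on the slide (96%) included μ.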

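[Figure: Standard Normal Distribution. The central area between z = -1.645 (5th percentile) and z = 1.645 (95th percentile) is 0.90 (the 90% confidence interval); each tail area is 0.05.]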
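The cutoffs in these figures are percentiles of the standard normal distribution and can be looked up with `qnorm()`; a brief sketch:

```r
# z cutoffs that leave the stated area in the centre of the standard normal curve
z_95 <- qnorm(0.975)   # 97.5th percentile, used for a 95% interval
z_90 <- qnorm(0.95)    # 95th percentile, used for a 90% interval
print(round(c(z_95, z_90), 3))
```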

- use sample standard deviation (s) as an estimate of σ
- therefore, σ/√n is estimated from the sample using s/√n (standard error of the mean; SE)

- SE of the sample is the estimate of the SD that would be obtained from the means of a large number of samples drawn from that population

Confidence Interval for a population mean: population σ NOT known (usual)

(x̄ - μ)/(s/√n)

- need to consider reliability of both x̄ and s as estimators of μ and σ, respectively
- shape of the distribution depends on the sample size n

Problem:

Critical Ratio = (x̄ - μ)/(s/√n) is not normally distributed

Therefore (x̄ - μ)/(s/√n) follows the t distribution

t - distribution

- degrees of freedom refer to number of independent quantities among a series of numerical quantities

- a family of distributions indexed by the degrees of freedom (n-1)
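How the t family depends on the degrees of freedom can be seen from its critical values. A short sketch in R; the particular df values are chosen only for illustration.

```r
df <- c(5, 24, 100, 1000)
t_crit <- qt(0.975, df = df)   # 97.5th percentile of each t distribution
z_crit <- qnorm(0.975)         # 97.5th percentile of the standard normal

print(data.frame(df = df, t_crit = round(t_crit, 3)))
print(round(z_crit, 3))
```

The t cutoffs are always larger than the z cutoff and shrink towards it as the degrees of freedom increase.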

Degrees of Freedom

For SD:

- there are n deviations around the mean
- there is one restriction: sum of deviations = 0
- therefore once we have calculated n-1 deviations around the mean, the last one is already determined, as the sum must be 0 (ie. not independent)

- for n deviations around the mean there are n-1 degrees of freedom (DF)
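The restriction on the deviations can be checked directly. A minimal sketch with four hypothetical observations:

```r
x <- c(3, 7, 8, 12)             # hypothetical observations
deviations <- x - mean(x)
print(sum(deviations))          # zero, up to floating-point error

# The last deviation is fixed once the first n - 1 are known
last_actual <- deviations[length(deviations)]
last_from_others <- -sum(deviations[-length(deviations)])
print(c(last_actual, last_from_others))
```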

95% Confidence Interval for a population mean: population σ NOT known (usual)

A sample consists of 25 mice with a mean tumor size of 2.1 cm and SD = 1.9 cm.

x̄ - t(24, 0.975) × s/√n, x̄ + t(24, 0.975) × s/√n

t(24, 0.975) = 2.064 (from tables of t dist)

2.1 - (2.064 × 1.9/√25), 2.1 + (2.064 × 1.9/√25)

= 1.32, 2.88 cm

Confidence interval for a Mean

Interpretation: - 95% of the intervals that could be constructed from repeated random samples of size 25 contain the true population mean

- we are 95% confident that the mean tumor size is between 1.32 and 2.88 cm.

Estimate of mean tumor size = 2.1 cm; n=25.

95% CI = 1.32 , 2.88 cm
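The interval for the mouse example can be recomputed from the summary statistics, with `qt()` replacing the t table. A minimal sketch; variable names are illustrative.

```r
n <- 25        # number of mice
x_bar <- 2.1   # sample mean tumor size, cm
s <- 1.9       # sample SD, cm

se <- s / sqrt(n)                  # standard error of the mean
t_crit <- qt(0.975, df = n - 1)    # t(24, 0.975), about 2.064
ci <- c(lower = x_bar - t_crit * se, upper = x_bar + t_crit * se)
print(round(ci, 2))
```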

Factors affecting the length of the confidence interval

Sample size: as n increases, length of the CI decreases

Variation: as s, which reflects variability of the distribution of observations, increases, the length of the CI increases

Level of confidence: as the confidence desired increases (ie 90, 95, 99% CI), the length of the CI increases.

x̄ ± t(n-1, 0.975) × s/√n, where s/√n = SE
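Each of the three factors can be varied in turn while holding the others fixed. In this sketch `ci_length` is a hypothetical helper, and the values of n, s and confidence level are chosen only for illustration around the mouse example (n = 25, s = 1.9).

```r
# Length of the CI: 2 * t(n-1, 1 - alpha/2) * s / sqrt(n)
ci_length <- function(n, s, conf = 0.95) {
  alpha <- 1 - conf
  2 * qt(1 - alpha / 2, df = n - 1) * s / sqrt(n)
}

by_n    <- sapply(c(10, 25, 100), function(n) ci_length(n, s = 1.9))
by_s    <- sapply(c(1.0, 1.9, 3.0), function(s) ci_length(25, s = s))
by_conf <- sapply(c(0.90, 0.95, 0.99), function(conf) ci_length(25, s = 1.9, conf = conf))

# Rows: increasing n, increasing s, increasing confidence level
print(round(rbind(by_n, by_s, by_conf), 2))
```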


Table 1. Characteristics of study subjects (n=35)


If the purpose is to describe the data (eg. to see if subjects are typical): standard deviation

- variability of the observations

If the purpose is to describe the results (outcome) of the study: standard error, confidence interval

- precision of the estimate of a population parameter

Note:
- can calculate one from the other (SE = SD/√n)
- indicate clearly whether reporting SD or SE

What Formal Statistical Inference Cannot Do

- tell you what population you should be interested in

- ensure that you sampled properly from the population

- determine whether measurements made are biased (systematically wrong)

DOES:
- give a quantitative indication of how much random variation may have affected your results

What/who are we trying to study?

Target Population     Patients with rheumatoid arthritis            All voters
Population Sampled    Patients admitted to a particular hospital    Telephone listings
Sample Studied        Sample of records of above patients           Sample of above listings

Hypothesis Testing

Schematic Plots

[Box plot: Dietary fat intake in the low fat and control groups (n=151 intervention and 187 control). Group 1 = Low Fat, Group 2 = Control.]

[Box plot: Blood HDL-cholesterol levels in the low fat and control groups (n=163 intervention and 199 control). Group 1 = Low Fat, Group 2 = Control.]

mean = 1684 kcal/day; SD = 380.5 kcal/day

Examples of conclusions of hypothesis tests

The mean intake of dietary fat is significantly lower in the low-fat group as compared to the control group (17.5 vs 28.3 percent energy from fat; p ≤ 0.001). (2 sample t test)

Does the energy intake of women in a sample differ from the “recommended” level of 1850 kcal? (1 sample t test)

Hypotheses

- hypotheses stated in terms of the population parameters (true means)

- null hypothesis: Ho

  - statement of no effect or no difference
  - assess the strength of evidence against the null hypothesis

- alternative hypothesis: Ha

  - what we expect/hope to see

  - usually a 2 sided test

Control vs Intervention: μc = μT (true means); x̄c vs x̄T (sample means)

Compute the probability of obtaining a difference as large or larger than the observed difference assuming that, in fact, there is no difference in the true means.

If the probability is not very small, we conclude that observing such a difference is plausible even when the true means are equal, i.e. the data do not provide evidence that the true means are different.

If the probability is very small, we conclude there is a difference between the means.

Overview of hypothesis testing

A significance test answers the question:

Is chance or sampling variation a likely explanation of the discrepancy between a sample result and the null hypothesis population value?

Yes: sample result is compatible with the idea that the sample is from a population in which the null hypothesis is true

No: discrepancy unlikely due to chance variation - sample result is not compatible with the idea that the sample is from a population in which the null hypothesis is true

Steps in Hypothesis Testing

1. State hypothesis.

2. Specify the significance level.

3. Calculate the test statistic.

4. Determine p value.

5. State conclusion.

One Sample T test

One Sample T test: Energy intake in women

For a sample of 29 randomly selected women:

Mean energy intake = 1,684 kcal/day
Standard deviation (s) = 380.5 kcal/day

Does the energy intake of women in this study differ from the “recommended” level of 1850 kcal?

1. State hypotheses:

Example of energy intakes

Ho: the true mean energy intake of women in the trial is not different from 1,850 kcal/day

Ha: the true mean energy intake of women in the trial is different from 1,850 kcal/day

Specific Notation:

Ho: μ = 1,850
Ha: μ ≠ 1,850 (2 sided)

2. Significance Level

- how much evidence against Ho we require to reject Ho (determine in advance)

- compare the p value with a fixed value that is considered decisive

- this value is called the significance level
- denoted as α

- commonly use α = 0.05

Significance Level

α = 0.05
- require that the data give evidence against Ho so strong that it would happen not more than 5% of the time (1 in 20), when Ho is true.

α = 0.01
- require that the data give evidence against Ho so strong that it would happen not more than 1% of the time (1 in 100), when Ho is true.

3. Calculate the test statistic

- test statistic measures compatibility between null hypothesis and the data

- to assess how far the estimate is from the parameter: standardize the estimate

- z statistic (when σ known)

- t statistic (when σ not known)

One Sample t test

- use t distribution when population standard deviation (σ) not known

- degrees of freedom = n-1

To test hypothesis Ho: μ = μo based on a SRS of size n, compute the t statistic:

t = (x̄ - μo)/(s/√n)

Step 3. Calculate test statistic.

Based on sample of 29 women:

x̄ = 1684 kcal/day; standard deviation (s) = 380.5 kcal/day

t = (1684 - 1850)/(380.5/√29) = -2.35
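The test statistic for the energy intake example can be computed from the summary statistics alone. A minimal sketch in R; variable names are illustrative.

```r
n <- 29          # number of women
x_bar <- 1684    # sample mean energy intake, kcal/day
s <- 380.5       # sample SD, kcal/day
mu_0 <- 1850     # hypothesized ("recommended") value, kcal/day

se <- s / sqrt(n)               # standard error of the mean
t_stat <- (x_bar - mu_0) / se   # one-sample t statistic, df = n - 1
print(round(t_stat, 2))
```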

Step 4. Determine p value.

- probability of getting an outcome as extreme or more extreme than the actually observed outcome

- extreme: far from what would be expected if the null hypothesis is true

- the smaller the p value, the stronger the evidence against the null hypothesis

Energy Intake in Women

t = (1684 - 1850)/(380.5/√29) = -2.35

2 sided test: P(t ≤ -2.35 or t ≥ 2.35)

P(t ≤ -2.35) = 0.0130

P(t ≥ 2.35) = 1 - 0.9870 = 0.0130

P value = 2P(t ≤ -2.35) = 0.026

[Figure: t distribution with 28 degrees of freedom; the tail areas below t = -2.35 and above t = 2.35 are each p = 0.0130, giving a 2 sided p = 0.026.]
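The tail areas can be obtained from the t distribution function `pt()` instead of tables. A minimal sketch using the summary statistics of the energy intake example:

```r
t_stat <- (1684 - 1850) / (380.5 / sqrt(29))   # about -2.35
df <- 29 - 1

p_lower <- pt(t_stat, df = df)                            # P(t <= -2.35)
p_upper <- pt(abs(t_stat), df = df, lower.tail = FALSE)   # P(t >= 2.35)
p_two_sided <- p_lower + p_upper                          # both tails

print(round(c(lower = p_lower, upper = p_upper, two_sided = p_two_sided), 4))
```

Because the t distribution is symmetric, the two tail areas are equal and the 2 sided p value is simply twice either one.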

What does a “small” p value mean?

1. An unlikely event occurred (getting a large value for the test statistic by chance).

2. The null hypothesis is false.

P value for a 2 sided test:

Probability of getting an outcome as extreme or more extreme than the actually observed outcome in either direction, if the null hypothesis is true.

Statistical Significance

In the example: p value = 0.026

2.6% chance of observing a sample mean at least as far from 1850 kcal/day as 1684 kcal/day if the true mean is equal to the recommended level of 1850 kcal/day. What do we conclude?


p value = 0.026

We reject the null hypothesis, Ho.

The mean energy intake of women is significantly lower than the recommended intake (p < 0.05).

The mean energy intake of women is significantly lower than the recommended intake (p = 0.03).

(Significant at the 5% but not the 1% level)

Using R – One Sample t-test

R code: t.test(energy.intake, mu=1850)

R Output:

        One Sample t-test

data:  energy.intake
t = -2.3493, df = 28, p-value = 0.02610
alternative hypothesis: true mean is not equal to 1850
95 percent confidence interval:
 1539.260 1828.741
sample estimates:
mean of x
 1684.001
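The original `energy.intake` data are not included in these notes. As a sketch only, the call can be tried on hypothetical data that have been rescaled to match the reported summary statistics (n = 29, mean 1684, SD 380.5); the individual values below are simulated, not the study data.

```r
set.seed(5)   # arbitrary seed for reproducibility

# Hypothetical stand-in for the study data: 29 simulated values rescaled so that
# the sample mean is 1684 and the sample SD is 380.5 (NOT the real observations)
raw <- rnorm(29)
energy.intake <- 1684 + 380.5 * (raw - mean(raw)) / sd(raw)

result <- t.test(energy.intake, mu = 1850)
print(result)

# Components of the result can be extracted by name
t_value <- unname(result$statistic)
df_value <- unname(result$parameter)
p_value <- result$p.value
```

Since the t test depends on the data only through n, the mean and the SD, the statistic and p value agree with the output shown above up to rounding.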


If recommended level is 1750 kcal/day, then p = 0.36.

36% chance of observing a sample mean at least as far from 1750 kcal/day as 1684 kcal/day if the true mean is equal to the recommended level of 1750 kcal/day. What do we conclude?


p value = 0.36

We do not reject the null hypothesis, Ho.

The data do not provide evidence that the mean energy intake of women is different from the recommended level.

The mean energy intake of women in the study is not significantly different from recommended level of 1750 kcal/day (p = 0.36).

One sided test

Ho: μ = 1850
Ha: μ < 1850

p = 0.0130

Probability values for one-tailed tests are one half the value for two-tailed tests as long as the effect is in the specified direction.
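The halving relationship, and the condition attached to it, can be checked with `pt()`. A minimal sketch using the energy intake summary statistics; the "wrong direction" case is hypothetical.

```r
t_stat <- (1684 - 1850) / (380.5 / sqrt(29))   # about -2.35
df <- 28

p_one_sided <- pt(t_stat, df = df)             # Ha: mu < 1850 (lower tail only)
p_two_sided <- 2 * pt(-abs(t_stat), df = df)   # Ha: mu not equal to 1850
print(round(c(one_sided = p_one_sided, two_sided = p_two_sided), 4))

# Hypothetical: had the sample mean been above 1850 by the same amount,
# the one sided p value for Ha: mu < 1850 would be large, not half the 2 sided value
p_wrong_direction <- pt(-t_stat, df = df)
print(round(p_wrong_direction, 4))
```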

One-sided vs two-sided tests

- one sided tests are rarely justified

- decide on appropriate test prior to experiment

- Do not decide on a one-sided test after looking at the data

eg. p value for 2 sided is 0.09
    p value for 1 sided is 0.045

If any doubt: choose 2 sided test!

General guidelines for stating significance

If:                    results are:

0.01 ≤ p < 0.05        significant

0.001 ≤ p < 0.01       highly significant

p < 0.001              very highly significant

p > 0.05               not statistically significant (NS)

0.05 ≤ p < 0.10        trend towards statistical significance

Reporting actual p values

A. p value = 0.0512
Conclude: result is NS, p > 0.05

If the effect is interesting and potentially important, would probably want to:
- repeat study
- check power of study

B. p value = 0.75
Conclude: result is NS, p > 0.05
- likely no effect

Comments/Cautions about hypothesis testing

Statistical vs clinical significance

- look at the size of effect not just p value

- look at confidence interval for parameter of interest

- with a large sample size, a very small effect may be statistically significant
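The effect of sample size on the p value can be illustrated with the one-sample t test formula. In this sketch the observed mean (10 kcal/day below the reference value) and the larger sample size are hypothetical; only s = 380.5 and n = 29 come from the energy intake example.

```r
# Two-sided one-sample t test p value computed from summary statistics
p_from_summary <- function(x_bar, mu_0, s, n) {
  t_stat <- (x_bar - mu_0) / (s / sqrt(n))
  2 * pt(-abs(t_stat), df = n - 1)
}

# Same small difference (10 kcal/day) and same SD, two different sample sizes
p_small_n <- p_from_summary(x_bar = 1840, mu_0 = 1850, s = 380.5, n = 29)
p_large_n <- p_from_summary(x_bar = 1840, mu_0 = 1850, s = 380.5, n = 10000)
print(c(n_29 = p_small_n, n_10000 = p_large_n))
```

The difference is identical in both cases, so whether it matters clinically has to be judged from its size and confidence interval, not from the p value.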

Exploratory data analysis vs hypothesis testing

- exploratory data analysis is important

- but cannot test a hypothesis on the same data that first suggested it

- if reporting such findings, clearly state that they are post hoc

- need to design a new study to test the hypothesis

Relationship between confidence interval and p value

95% Confidence interval for a population mean

A sample consists of 25 mice with a mean tumor size of 2.1 cm and SD = 1.9 cm.

x̄ - t(24, 0.975) × s/√n, x̄ + t(24, 0.975) × s/√n

t(24, 0.975) = 2.064 (from tables of t dist)

2.1 - (2.064 × 1.9/√25), 2.1 + (2.064 × 1.9/√25)

= 1.32, 2.88 cm

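CI and Hypothesis Test

Ho: μ = 2.9
Ha: μ ≠ 2.9

x̄ = 2.1 cm, s = 1.9 cm

t = (x̄ - μ)/(s/√n) = (2.1 - 2.9)/(1.9/√25) = -2.105

p = 0.0459

95% CI for mean tumor size = 1.32, 2.88 cm

The hypothesized value of 2.9 cm lies outside the 95% CI, which agrees with the 2 sided p value being less than 0.05.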
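The agreement between the confidence interval and the test can be verified from the summary statistics. A minimal sketch in R; variable names are illustrative.

```r
n <- 25; x_bar <- 2.1; s <- 1.9   # mouse tumor example
mu_0 <- 2.9                       # hypothesized mean, cm

se <- s / sqrt(n)
t_stat <- (x_bar - mu_0) / se
p_value <- 2 * pt(-abs(t_stat), df = n - 1)             # 2 sided p value
ci <- x_bar + c(-1, 1) * qt(0.975, df = n - 1) * se     # 95% CI

# The 2 sided test at the 5% level rejects Ho exactly when mu_0 is outside the 95% CI
mu_0_outside_ci <- mu_0 < ci[1] || mu_0 > ci[2]
print(round(c(t = t_stat, p = p_value, lower = ci[1], upper = ci[2]), 4))
print(mu_0_outside_ci)
```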
