MBP1010 - Lecture 2: January 14, 2009 1. Density curves and standard normal distribution 2. Sampling...

MBP1010 - Lecture 2: January 14, 2009
- Density curves and standard normal distribution
- Sampling distribution of the mean
- Confidence interval for the mean
- Hypothesis testing (1 sample t test)
Reading: Introduction to the Practice of Statistics: 1.3, 3.4, 5.2, 6.1-6.4 and 7.1




Standard deviation vs standard error for describing data
Table 1. Characteristics of study subjects (n=35)

Importance of Normal Distribution*
1. Distributions of real data are often close to normal.
2. Mathematically easy to work with, so many statistical tests are designed for normal (or close to normal) distributions.
3. If the mean and SD of a normal distribution are known, you can make quantitative predictions about the population.
* also called Gaussian curve

[Figure: histogram] Red bars = scores $\le 6$; proportion = 0.303
[Figure: density curve] Red area under the density curve = scores $\le 6$; proportion = 0.293

Cumulative proportion for a value x is the proportion of all observations that are $\le x$; this is the area under the density curve to the left of x.

The 68-95-99.7 Rule
Mean = 64.5 inches, SD = 2.5 inches

Standardized Normal Distribution
The standard normal distribution is a normal distribution with a mean of 0 and an SD of 1.
Normal distributions can be transformed to standard normal distributions by the formula:
$z = \frac{X - \mu}{\sigma}$
where $X$ is a score from the original normal distribution, $\mu$ is the mean of the original normal distribution, and $\sigma$ is the standard deviation of the original normal distribution.
The standard normal distribution is sometimes called the z distribution.

Z-score
A z score always reflects the number of standard deviations a particular score is above or below the mean.
Ex. If a person scored 70 on a test with a mean of 50 and an SD of 10, then they scored 2 standard deviations above the mean. Converting the test scores to z scores, an X of 70 would be:
$z = \frac{70 - 50}{10} = 2$
So, a z score of 2 means the original score was 2 SD above the mean.

Z Scores
- provide a meaningful way to compare individuals from different normal distributions on the same scale
Ie. how many SD above or below the mean?
Eg,
- bone density measures
- growth charts: height of children at different ages
- normalized data

Quantile-Quantile (Q-Q) Plot
A Q-Q plot shows the theoretical quantiles versus the empirical quantiles. If the distribution is normal, we should observe a straight line.

Rice Virtual Lab in Statistics, HyperStat Online: Section 5. Normal Distribution - theory

Sampling and Estimation

Populations and Samples
Population: entire group of individuals that we want information about
Sample: a part of the population that we actually examine in order to gather information
Goal: to try to draw conclusions about the population from the sample

[Diagram: Whole Population (mean = $\mu$, SD = $\sigma$); Sample (mean = $\bar{x}$, SD = $s$); Inference]

Parameter:
- a number that describes the population
- the number is fixed, but in practice we do not know its value (eg, $\mu$)
Statistic:
- a number that describes a sample (eg, $\bar{x}$)
- its value is known when we take a sample, but it can change from sample to sample
- often used to estimate an unknown parameter

Statistical inference is the process by which we draw conclusions about the population from the results observed in a sample.
Two main methods are used in inferential statistics: estimation and hypothesis testing.
In estimation, the sample is used to estimate a parameter, and a confidence interval about the estimate is constructed.

Random Sampling is Key!
- every individual in the population sampled must have a chance of being included in the sample
- the choice of one subject does not influence the chance of other subjects being chosen
- use a method of sampling in which chance alone operates
  - toss of a coin, draw from a hat
  - random number generators
- random assignment in clinical trials results in randomly selected groups

Simple Random Sampling (SRS)
- the chance of being selected is equal for each individual in the population
- every possible sample has an equal chance of being chosen

Stratified Sampling
- divide the population into strata
- choose an SRS in each stratum
- combine these SRSs to form the full sample
eg. Strata: prognostic factors in cancer patients; male/female, age
- consult a statistician for more complex sampling

Sample mean ($\bar{x}$) as an estimator of the population mean ($\mu$)
What would happen if we repeated the sample several times?
Sampling variability:
- repeated samples from the same population will not have the same mean
- depends partly on how variable the underlying population is and on the size of the sample selected

Sampling Distribution of $\bar{x}$
- the distribution of values taken by the mean ($\bar{x}$) in all possible samples of the same size from the same population
1. Mean of the sampling distribution of $\bar{x}$ = $\mu$
2. SD of the sampling distribution = $\sigma/\sqrt{n}$ - called the standard error of the mean
3. Shape of the sampling distribution is approximately a normal curve, regardless of the shape of the population distribution, provided n is large enough (Central Limit Theorem)

Simulation of Sampling Distribution: Central Limit Theorem (Rice Virtual Lab in Statistics)

[Figure: population, one sample, and sampling distribution]
Population: all MBP1010 students (n = 37); $\mu$ = 1.00 cup, $\sigma$ = 1.07 cups
One randomly selected sample (n = 12): sample mean $\bar{x}$, s = 0.78; SEM = $s/\sqrt{n}$ = 0.23
Sampling distribution (repeated samples of n = 12): mean = 1.00, SD = 0.26

Confidence Interval of the Mean

Standard Normal Distribution
[Figure: 95% confidence interval; central area = 0.95, between the 2.5th and 97.5th percentiles]

95% Confidence Interval for a population mean
If population $\sigma$ known (not realistic)
Express $\bar{x}$ in standardized form (z statistic):
$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$
$\Pr(-1.96 \le z \le 1.96) = 0.95$
$\Pr\left(-1.96 \le \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} \le 1.96\right) = 0.95$
$\Pr\left(\bar{x} - 1.96\,\sigma/\sqrt{n} \le \mu \le \bar{x} + 1.96\,\sigma/\sqrt{n}\right) = 0.95$
$\bar{x} - 1.96(\sigma/\sqrt{n})$ and $\bar{x} + 1.96(\sigma/\sqrt{n})$ are the 95 percent confidence limits on the population mean.
In the long run, 95% of all samples will have an interval that includes $\mu$.
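The long-run statement above can be checked with a small simulation. The following is a minimal sketch, not part of the lecture: it assumes a normally distributed population with the slide values $\mu$ = 1.00 and $\sigma$ = 1.07 (the real population of 37 students need not be normal), and the function names `z_interval` and `coverage` are made up for illustration.

```python
import random
from statistics import NormalDist, fmean


def z_interval(sample, sigma, level=0.95):
    """Confidence interval for the mean when the population SD (sigma) is known."""
    n = len(sample)
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% interval
    margin = z_crit * sigma / n ** 0.5               # z * sigma / sqrt(n)
    centre = fmean(sample)
    return centre - margin, centre + margin


def coverage(mu, sigma, n, repeats, level=0.95, seed=1):
    """Fraction of repeated samples whose interval contains the true mean mu."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    hits = 0
    for _ in range(repeats):
        # Assumption: the population is normal with mean mu and SD sigma.
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        low, high = z_interval(sample, sigma, level)
        if low <= mu <= high:
            hits += 1
    return hits / repeats


# Slide values: mu = 1.00, sigma = 1.07, samples of n = 12.
print(coverage(mu=1.00, sigma=1.07, n=12, repeats=10_000))
```

With many repeats the printed fraction should settle near 0.95. With only 25 repeats, a count such as 24 out of 25 is ordinary sampling variation around that value.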
24 out of 25 samples included $\mu$ (96%)

[Figure: 90% confidence interval; central area = 0.90, between the 5th and 95th percentiles]

Confidence Interval for a population mean: population $\sigma$ NOT known (usual)
- use the sample standard deviation (s) as an estimate of $\sigma$
- therefore, $\sigma/\sqrt{n}$ is estimated from the sample using $s/\sqrt{n}$ (standard error of the mean; SE)
- SE of the sample is the estimate of the SD that would be obtained from the means of a large number of samples drawn from that population
- need to consider the reliability of both $\bar{x}$ and s as estimators of $\mu$ and $\sigma$ respectively
- shape of the distribution depends on the sample size n
Problem: the critical ratio $\frac{\bar{x} - \mu}{s/\sqrt{n}}$ is not normally distributed
Therefore: $\frac{\bar{x} - \mu}{s/\sqrt{n}}$ follows the t distribution

t distribution
- a family of distributions indexed by the degrees of freedom (n-1)
- degrees of freedom refer to the number of independent quantities among a series of numerical quantities

Degrees of Freedom
For SD:
- there are n deviations around the mean
- there is one restriction: sum of deviations = 0
- therefore, once we have calculated n-1 deviations around the mean, the last one is already determined, as the sum must be 0 (ie. not independent)
- for n deviations around the mean there are n-1 degrees of freedom (DF)

95% Confidence Interval for a population mean: population $\sigma$ NOT known (usual)
A sample consists of 25 mice with a mean tumor size of 2.1 cm and SD = 1.9 cm.
$\bar{x} - t_{24,0.975} \times s/\sqrt{n}$, $\bar{x} + t_{24,0.975} \times s/\sqrt{n}$
$t_{24,0.975}$ = 2.064 (from tables of t dist)
$2.1 - (2.064 \times 1.9/\sqrt{25})$, $2.1 + (2.064 \times 1.9/\sqrt{25})$ = 1.32, 2.88 cm

Confidence interval for a Mean
Interpretation:
- 95% of the intervals that could be constructed from repeated random samples of size 25 contain the true population mean
- we are 95% confident that the mean tumor size is between 1.32 and 2.88 cm
Estimate of mean tumor size = 2.1 cm; n=25.
95% CI = 1.32, 2.88 cm

Factors affecting the length of the confidence interval
$\bar{x} \pm t_{n-1,0.975} \times s/\sqrt{n}$, where $s/\sqrt{n}$ = SE
- sample size: as n increases, the length of the CI decreases
- variation: as s, which reflects the variability of the distribution of observations, increases, the length of the CI increases
- level of confidence: as the confidence desired increases (ie 90, 95, 99% CI), the length of the CI increases

Standard deviation vs standard error for describing data
Table 1. Characteristics of study subjects (n=35)
If the purpose is to describe the data (eg. to see if subjects are typical): standard deviation
- variability of the observations
If the purpose is to describe the results (outcome) of the study: standard error, confidence interval
- precision of the estimate of a population parameter
Note:
- can calculate one from the other
- indicate clearly whether reporting SD or SE

What Formal Statistical Inference Cannot Do
- tell you what population you should be interested in
- ensure that you sampled properly from the population
- determine whether measurements made are biased (systematically wrong)
DOES:
- give a quantitative indication of how much random variation may have affected your results

What/who are we trying to study?
Target Population: patients with rheumatoid arthritis; all voters
Population Sampled: patients admitted to a particular hospital; telephone listings
Sample Studied: sample of records of above patients; sample of above listings

Hypothesis Testing
[Figure: Dietary fat intake in the low fat and control groups (n=151 intervention and 187 control)]
[Figure: Blood HDL-cholesterol levels in the low fat and control groups (n=163 intervention and 199 control)]
[Figure: energy intake; mean = 1684 kcal/day]

Examples of conclusions of hypothesis tests
The mean intake of dietary fat is significantly lower in the low-fat group as compared to the control group (17.5 vs 28.3 percent energy from fat; $p \le 0.001$).
(2 sample t test)
Does the energy intake of women in a sample differ from the recommended level of 1850 kcal? (1 sample t test)

Hypotheses
- hypotheses are stated in terms of the population parameters (true means)
- null hypothesis: $H_0$
  - statement of no effect or no difference
  - assess the strength of evidence against the null hypothesis
- alternative hypothesis: $H_a$
  - what we expect/hope to see
  - usually a 2 sided test

[Diagram: Control vs Intervention] $H_0$: $\mu_C = \mu_T$; compare $\bar{x}_C$ vs $\bar{x}_T$
Compute the probability of obtaining a difference as large as or larger than the observed difference, assuming that, in fact, there is no difference in the true means.
- If the probability is not very small, we conclude that observing such a difference is plausible even when the true means are equal, ie. the data do not provide evidence that the true means are different.
- If the probability is very small, we conclude there is a difference between the means.

Overview of hypothesis testing
Significance tests answer the question: Is chance or sampling variation a likely explanation of the discrepancy between a sample result and the null hypothesis population value?
Yes: sample result is compatible with the idea that the sample is from a population in which the null hypothesis is true
No: discrepancy unlikely to be due to chance variation - sample result is not compatible with the idea that the sample is from a population in which the null hypothesis is true

Steps in Hypothesis Testing
1. State hypotheses.
2. Specify the significance level.
3. Calculate the test statistic.
4. Determine p value.
5. State conclusion.

One Sample T test: Energy intake in women
For a sample of 29 randomly selected women:
Mean energy intake = 1,684 kcal/day
Standard deviation = s kcal/day
Does the energy intake of women in this study differ from the recommended level of 1850 kcal?
Example of energy intakes

1. State hypotheses:
$H_0$: the true mean energy intake of women in the trial is not different from 1,850 kcal/day
$H_a$: the true mean energy intake of women in the trial is different from 1,850 kcal/day
Specific notation: $H_0$: $\mu$ = 1,850; $H_a$: $\mu \ne$ 1,850 (2 sided)

2. Significance Level
- how much evidence against $H_0$ we require to reject $H_0$ (determine in advance)
- compare the p value with a fixed value that is considered decisive
- this value is called the significance level, denoted as $\alpha$
- commonly use $\alpha$ = 0.05
$\alpha$ = 0.05: require that the data give evidence against $H_0$ so strong that it would happen not more than 5% of the time (1 in 20), when $H_0$ is true.
$\alpha$ = 0.01: require that the data give evidence against $H_0$ so strong that it would happen not more than 1% of the time (1 in 100), when $H_0$ is true.

3. Calculate the test statistic
- the test statistic measures compatibility between the null hypothesis and the data
- to assess how far the estimate is from the parameter: standardize the estimate
  - z statistic (when $\sigma$ known)
  - t statistic (when $\sigma$ not known)

One Sample t test
- use the t distribution when the population standard deviation ($\sigma$) is not known
- degrees of freedom = n-1
To test the hypothesis $H_0$: $\mu = \mu_0$ based on an SRS of size n, compute the t statistic:
$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$

Step 3. Calculate test statistic.
Based on sample of 29 women: $\bar{x}$ = 1684 kcal/day; standard deviation = s kcal/day
$t = \frac{1684 - 1850}{s/\sqrt{29}} = -2.35$

4. Determine the p value
- probability of getting an outcome as extreme or more extreme than the actually observed outcome
- extreme: far from what would be expected if the null hypothesis is true
- the smaller the p value, the stronger the evidence against the null hypothesis

Step 4. Determine p value.
Energy Intake in Women: t = -2.35
2 sided test: $P(t \le -2.35 \text{ or } t \ge 2.35)$
$P(t \le -2.35) = P(t \ge 2.35)$ = 0.013
P value = $2P(t \le -2.35)$ = 0.026
2 sided p = 0.026

What does a small p value mean?
1. An unlikely event occurred (getting a large value for the test statistic by chance).
2. The null hypothesis is false.

P value for a 2 sided test:
Probability of getting an outcome as extreme or more extreme than the actually observed outcome in either direction, if the null hypothesis is true.

Statistical Significance
In the example: p value = 0.026
2.6% chance of observing a mean energy intake at least as far from 1850 kcal/day as 1684 kcal/day in a sample of women, even if the true mean is not different from the recommended level of 1850 kcal/day.
What do we conclude?

Statistical Significance
p value = 0.026
We reject the null hypothesis, $H_0$.
The mean energy intake of women is significantly lower than the recommended intake (p < 0.05).
The mean energy intake of women is significantly lower than the recommended intake (p = 0.03).
(Significant at the 5% but not the 1% level)

Using R: One Sample t-test
R code: t.test(energy.intake, mu=1850)
R Output:
One Sample t-test
data: energy.intake
t = , df = 28, p-value =
alternative hypothesis: true mean is not equal to
percent confidence interval:
sample estimates: mean of x

Statistical Significance
If the recommended level is 1750 kcal/day, then p = 0.36
36% chance of observing a mean energy intake at least as far from 1750 kcal/day as 1684 kcal/day in a sample of women, even if the true mean is not different from the recommended level of 1750 kcal/day.
What do we conclude?

Statistical Significance
p value = 0.36
We do not reject the null hypothesis, $H_0$.
The data do not provide evidence that the mean energy intake of women is different from the recommended level.
The mean energy intake of women in the study is not significantly different from the recommended level of 1750 kcal/day (p = 0.36).

One sided test
$H_0$: $\mu$ = 1850; $H_a$: $\mu$ < 1850
p = 0.013 (half of the 2 sided value of 0.026)
Probability values for one-tailed tests are one half the value for two-tailed tests as long as the effect is in the specified direction.

One-sided vs two-sided tests
- one sided tests are rarely justified
- decide on the appropriate test prior to the experiment
- do not decide on a one-sided test after looking at the data
eg.
p value for 2 sided is 0.09
p value for 1 sided is 0.045 (half of the 2 sided value)
If any doubt: choose 2 sided test!

General guidelines for stating significance
If: results are:
- 0.01 $\le$ p < 0.05: significant
- p < 0.01: highly significant
- p < 0.001: very highly significant
- p > 0.05: not statistically significant (NS)
- 0.05 $\le$ p < 0.10: trend towards statistical significance

Reporting actual p values
A. p value =
Conclude: result is NS, p > 0.05
If the effect is interesting and potentially important, would probably want to:
- repeat study
- check power of study
B. p value = 0.75
Conclude: result is NS, p > 0.05; likely no effect

Comments/Cautions about hypothesis testing
Statistical vs clinical significance
- look at the size of the effect, not just the p value
- look at the confidence interval for the parameter of interest
- with a large sample size, a very small effect may be statistically significant
Exploratory data analysis vs hypothesis testing
- exploratory data analysis is important
- but cannot test a hypothesis on the same data that first suggested it
- if reporting findings, clearly state that they are post hoc
- need to design a new study to test the hypothesis

Relationship between confidence interval and p value

95% Confidence interval for a population mean
A sample consists of 25 mice with a mean tumor size of 2.1 cm and SD = 1.9 cm.
$\bar{x} - t_{24,0.975} \times s/\sqrt{n}$, $\bar{x} + t_{24,0.975} \times s/\sqrt{n}$
$t_{24,0.975}$ = 2.064 (from tables of t dist)
$2.1 - (2.064 \times 1.9/\sqrt{25})$, $2.1 + (2.064 \times 1.9/\sqrt{25})$ = 1.32, 2.88 cm

CI and Hypothesis Test
$H_0$: $\mu$ = 2.9; $H_a$: $\mu \ne$ 2.9
$\bar{x}$ = 2.1 cm, s = 1.9 cm
$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{2.1 - 2.9}{1.9/\sqrt{25}} = -2.11$
p < 0.05 (the hypothesized value 2.9 lies outside the 95% CI)
95% CI for mean tumor size = 1.32, 2.88 cm
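The link between the confidence interval and the two-sided test in the mouse example can be reproduced from the summary statistics alone. The sketch below is illustrative rather than the lecture's own method (the lecture uses R's `t.test` on raw data). It uses only the Python standard library: the critical value 2.064 is taken from the slide, and the p value comes from numerically integrating the t density with Simpson's rule. The helper names `t_pdf`, `two_sided_p`, `t_interval` and `one_sample_t` are hypothetical.

```python
import math


def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)


def two_sided_p(t_stat, df, steps=2000):
    """Two-sided p value: 1 minus the area between -|t| and +|t|."""
    upper = abs(t_stat)
    h = upper / steps  # steps is even, as Simpson's rule requires
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = 2 * total * h / 3  # the density is symmetric, so double 0..|t|
    return 1 - central_area


def t_interval(mean, sd, n, t_crit):
    """Confidence interval mean +/- t_crit * s / sqrt(n); t_crit comes from a t table."""
    se = sd / math.sqrt(n)
    return mean - t_crit * se, mean + t_crit * se


def one_sample_t(mean, sd, n, mu0):
    """t statistic and two-sided p value for H0: mu = mu0, from summary statistics."""
    se = sd / math.sqrt(n)
    t_stat = (mean - mu0) / se
    return t_stat, two_sided_p(t_stat, n - 1)


# Mouse example from the slides: n = 25, mean = 2.1 cm, SD = 1.9 cm, H0: mu = 2.9.
low, high = t_interval(2.1, 1.9, 25, t_crit=2.064)
t_stat, p = one_sample_t(2.1, 1.9, 25, mu0=2.9)
print(f"95% CI: {low:.2f}, {high:.2f} cm")
print(f"t = {t_stat:.2f}, two-sided p = {p:.3f}")
print("reject H0 at 0.05:", p < 0.05, "| 2.9 outside the CI:", not (low <= 2.9 <= high))
```

The last line shows the correspondence: a hypothesized mean is rejected at the 5% level by the two-sided t test exactly when it falls outside the 95% confidence interval. The same `two_sided_p` helper can be called with t = -2.35 and df = 28 to compare against the 0.026 reported for the energy intake example.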