Biostatistics-Sampling distribution for mean difference between two groups

Biostatistics Lecture 7 

description

Understanding the sampling distribution for the mean difference between two groups in biostatistics. For more details, check our website at http://www.helpwithassignment.com/statistics-assignment-help

Transcript of Biostatistics-Sampling distribution for mean difference between two groups

Slide 1

Biostatistics Lecture 7

Lecture 6 Review: Comparing two groups

- Population mean difference: $\mu_1 - \mu_0$ (unknown)
- Sample mean difference: $\bar{x}_1 - \bar{x}_0$
- Sample of $n_1$ from group 1; sample of $n_0$ from group 0
- Lecture 6: 95% confidence interval
- Lecture 7: P-value

Example: Randomised controlled trial of weight loss programmes in the UK

Estimate of the difference in population mean weight loss after 4 weeks between the Atkins and Weight Watchers groups = 4.40 - 2.86 = 1.54 kg

Group | n | Sample mean weight loss after 4 weeks (kg) | Sample standard deviation | Sample standard error
Atkins (group 1) | 57 | 4.40 | 2.45 | 0.32
Weight Watchers (group 0) | 58 | 2.86 | 2.23 | 0.29

Sampling distribution for mean difference between two groups

Suppose we were to:
- Collect many samples of 115 people
- Allocate 57 people to the Atkins diet and 58 people to the Weight Watchers group
- Calculate the mean weight loss in each group after 4 weeks ($\bar{x}_1$ and $\bar{x}_0$)
- Calculate the sample mean difference $\bar{x}_1 - \bar{x}_0$

We could then graph all the sample mean differences (this is the sampling distribution).

Sampling distribution for difference in means between two groups

[Figure: histogram of the sample difference in means from different samples (x-axis) against percent (y-axis), centred on $\mu_1 - \mu_0$.]

The sample differences in means approximately follow a normal distribution centred around the population mean difference $\mu_1 - \mu_0$.

Variability of sample differences in means

The standard deviation of the sampling distribution of the sample differences in means is called the standard error of the differences in means. This measures the variability of the sample differences in means that would arise from repetitions of the same study.

The standard error of the differences in means is:

$$\mathrm{s.e.}(\bar{x}_1 - \bar{x}_0) = \sqrt{\left(\sigma_1/\sqrt{n_1}\right)^2 + \left(\sigma_0/\sqrt{n_0}\right)^2}$$

which we can estimate by

$$\mathrm{s.e.}(\bar{x}_1 - \bar{x}_0) = \sqrt{\left(s_1/\sqrt{n_1}\right)^2 + \left(s_0/\sqrt{n_0}\right)^2}$$

where $s_1/\sqrt{n_1}$ is the standard error of the mean in group 1 and $s_0/\sqrt{n_0}$ is the standard error of the mean in group 0.

Sampling distribution for difference in means between two groups

[Figure: the same histogram, with the x-axis marked at $\mu_1 - \mu_0 - 1.96\,\mathrm{s.e.}(\bar{x}_1 - \bar{x}_0)$, $\mu_1 - \mu_0$ and $\mu_1 - \mu_0 + 1.96\,\mathrm{s.e.}(\bar{x}_1 - \bar{x}_0)$.]

95% of the sample differences in means lie within a distance of $1.96 \times \mathrm{s.e.}(\bar{x}_1 - \bar{x}_0)$ of the population mean difference $\mu_1 - \mu_0$.

Calculating a 95% confidence interval for the differences in means

So in 95% of the possible samples that we could have collected, the interval

from $(\bar{x}_1 - \bar{x}_0) - 1.96\,\mathrm{s.e.}(\bar{x}_1 - \bar{x}_0)$ to $(\bar{x}_1 - \bar{x}_0) + 1.96\,\mathrm{s.e.}(\bar{x}_1 - \bar{x}_0)$

contains the (unknown) difference in population means $\mu_1 - \mu_0$.

Interpreting a 95% confidence interval for the differences in means

The estimated difference in population mean weight loss after 4 weeks between the Atkins and Weight Watchers groups is 1.54 kg (95% CI: 0.70 kg to 2.38 kg).

95% confidence interval for the difference in means:
- A range of values that we are 95% confident contains the true population difference in means
- A range of plausible values for the population difference in means

Difference between groups: process of analysis

We follow a standard three-step process:
1. Estimate: estimate the size of the difference in outcome between exposure groups
2. Confidence interval: calculate a confidence interval for this difference
3. P-value: derive a P-value to test the hypothesis that there is no association between the exposure and the outcome in the population

Lecture 7: Using confidence intervals and P-values to interpret the results of statistical analyses

- Hypothesis tests (P-values)
  - Explanation of null hypothesis
  - What is a P-value?
  - Interpreting a P-value
  - Calculating a P-value
- Type I & II error
- Interpreting results of statistical analyses using P-values and confidence intervals
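The Lecture 6 review figures can be checked with a few lines of Python. This is only a minimal sketch: the function names `se_of_difference` and `ci95` are illustrative and not part of the lecture, and the inputs are the summary statistics from the weight loss table. Because the lecture rounds its intermediate values (0.32, 0.29, then 0.43), the unrounded standard error gives an interval that differs slightly in the second decimal place; using the lecture's rounded value of 0.43 reproduces 0.70 kg to 2.38 kg.

```python
import math

def se_of_difference(s1, n1, s0, n0):
    """Estimated s.e. of (mean1 - mean0): sqrt((s1/sqrt(n1))^2 + (s0/sqrt(n0))^2)."""
    return math.sqrt((s1 / math.sqrt(n1)) ** 2 + (s0 / math.sqrt(n0)) ** 2)

def ci95(diff, se):
    """Normal-approximation 95% confidence interval: diff -/+ 1.96 * se."""
    return diff - 1.96 * se, diff + 1.96 * se

# Summary statistics from the weight loss table
# (group 1 = Atkins, group 0 = Weight Watchers).
diff = 4.40 - 2.86
se = se_of_difference(2.45, 57, 2.23, 58)
low, high = ci95(diff, se)
print(f"difference = {diff:.2f} kg, s.e. = {se:.3f} kg")
print(f"95% CI with unrounded s.e.: {low:.2f} to {high:.2f} kg")

# Same interval using the lecture's rounded standard error of 0.43.
low_r, high_r = ci95(diff, 0.43)
print(f"95% CI with s.e. rounded to 0.43: {low_r:.2f} to {high_r:.2f} kg")
```

Taking the standard deviations and sample sizes as inputs, rather than the already rounded standard errors, keeps the rounding in one visible place and makes it easy to see how much it moves the interval.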

Disproving hypotheses

Hypothesis: All swans are white

We could:
- Prove the hypothesis by finding every single swan in the world and checking that they are all white.
- Disprove the hypothesis by finding just one black swan.

It is easier to find evidence against a hypothesis than to prove it to be correct.

Null hypothesis

A null hypothesis is one that proposes there is no difference in outcome between exposed and unexposed in the population.

Examples of null hypotheses:
- The new treatment has no effect on mean rotation of the neck (for people with restricted neck movement)
- MMR vaccination does not affect a child's subsequent risk of autism
- Birthweight is not associated with subsequent IQ

We commonly design research to disprove a null hypothesis.

Weight loss example

In the weight loss trial, people using the Atkins diet lost, on average, 1.54 kg more weight than people in a Weight Watchers group.

Could this be a chance finding? That is, could it be consistent with the null hypothesis of no true (population) difference in weight loss? Or does it constitute evidence against the null hypothesis?

Null hypothesis: The population mean weight loss under the Atkins diet is the same as the population mean weight loss under the Weight Watchers program.

Weight loss example: P-value

We address this question by calculating a P-value: the probability of getting a sample difference at least this big if the mean weight loss in the two populations is really the same.

This P-value quantifies the evidence against the null hypothesis.

P-value: Comparing two groups

- What is the probability (P-value) of finding the observed difference IF the null hypothesis is true?
- How likely is it we would see a difference this big IF there was NO real difference between the populations?

Weight loss example: The P-value

[Figure: sampling distribution of the sample difference in means under the null hypothesis; x-axis "Sample difference in means from different samples" marked at -0.84, 0 and 0.84; y-axis "Percent".]

- Standard error of the difference = 0.43
- Population mean difference is 0 if the null hypothesis is true
- 95% of sample differences in means lie within 1.96 x 0.43 = 0.84 of 0, i.e. between -0.84 and 0.84

Weight loss example: The P-value

[Figure: the same distribution with our sample mean difference of 1.54 marked on the x-axis and the tail areas beyond it shaded in red.]

P-value = proportion of samples where we would observe a difference at least as large as our one. Our sample mean difference = 1.54.

0.03% of the area is shaded in red, i.e. P-value = 0.0003.

Interpretation of P-values

The smaller the P-value, the lower the chance of getting a difference as big as the one observed if the null hypothesis is true.

Therefore: the smaller the P-value, the stronger the evidence against the null hypothesis.

[Figure: P-value scale running from 1 through 0.1, 0.01 and 0.001 to 0.0001. Labels: "Weak evidence against the null hypothesis" at the large P-value end, "Strong evidence against the null hypothesis" at the small P-value end, and "Increasing evidence against the null hypothesis with decreasing P-value" along the scale.]

How do we calculate this P-value?

Test statistic: Comparing two groups

To calculate the required tail area, we transform our observed sample mean difference into a standard normal deviate:

$$z = \frac{\bar{x}_1 - \bar{x}_0}{\mathrm{s.e.}(\bar{x}_1 - \bar{x}_0)} = \frac{1.54}{0.43} = 3.58$$

[Figure: the sampling distribution with two x-axis scales: sample difference in means (-0.84, 0, 0.84, 1.54) and the corresponding standard normal deviate (-1.96, 0, 1.96, 3.58).]

This is called a test statistic. We can look up the tail area in standard tables. For z = 3.58 we get a (two-sided) tail area of 0.0003.

The t-test

In small samples we need to use the t distribution instead of the normal distribution (because our estimated standard error may not be a good estimate of the population standard error).

This test is valid in large and small samples (since in large samples the t distribution is almost identical to the normal distribution). This gives rise to the name t-test. The t-test tests the null hypothesis that two population means are equal. The test statistic is sometimes called the t-statistic.

Stata t-test

    . ttesti 57 4.40 2.45 58 2.86 2.23

    Two-sample t test with equal variances
    ------------------------------------------------------------------------------
             |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
    ---------+--------------------------------------------------------------------
           x |      57         4.4    .3245104        2.45    3.749927    5.050073
           y |      58        2.86    .2928133        2.23    2.273651    3.446349
    ---------+--------------------------------------------------------------------
    combined |     115    3.623304    .2290453    2.456237    3.169567    4.077041
    ---------+--------------------------------------------------------------------
        diff |                1.54    .4367293                .6747605     2.40524
    ------------------------------------------------------------------------------
        diff = mean(x) - mean(y)                                      t =   3.5262
    Ho: diff = 0                                     degrees of freedom =      113

        Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0
     Pr(T < t) = 0.9997         Pr(|T| > |t|) = 0.0006          Pr(T > t) = 0.0003

NB: Slight numerical differences are due to rounding error in our lecture calculations!

Group | n | Sample mean weight loss after 4 weeks (kg) | Sample standard deviation | Sample standard error
Atkins (group 1) | 57 | 4.40 | 2.45 | 0.32
Weight Watchers (group 0) | 58 | 2.86 | 2.23 | 0.29

Interpretation of t-test P-value: Weight loss trial

Null hypothesis: the population mean weight loss under the Atkins diet and in the Weight Watchers group are equal.

T-test gives: P = 0.0006

The probability of observing a sample mean difference at least as large as 1.54 kg, if there truly is no difference, is 0.06%.

Therefore, our data provide strong evidence against the null hypothesis.

Confidence intervals and P-values

The information used to estimate a confidence interval is the same as that used for a P-value, so we can show they are related to each other.

For example, we see a 95% CI for the difference of means for the weight loss groups of (0.67, 2.38). As 0 is not contained in this interval we can reject the null hypothesis at the 5% significance level.

In general, for example, if P = 0.0006 (0.06%) then the largest confidence interval that would not contain 0 (and thus we can reject the null hypothesis) is 99.94%.

Type I and II error

Statistical significance

You may have learned that when P<0.05 people reject the null hypothesis, and when P>0.05 they accept the null hypothesis.

We do NOT recommend this approach.

Is P=0.051 very different from P=0.049? Both contain similar amounts of evidence against the null hypothesis.
So why reject one and accept the other?

Type I error
- Investigator concludes from sample: the new treatment improves neck rotation (i.e. reject the null hypothesis)
- WHEN there was NO REAL difference in neck rotation between the new treatment & placebo (i.e. Null Hyp. is true)

Type II error
- Investigator concludes from sample: the new treatment does not improve neck rotation (i.e. do not reject the null hypothesis)
- WHEN there is a REAL difference in neck rotation between the new treatment & placebo (i.e. Null Hyp. is not true)

Interpretation of results from statistical analyses: COMPARISON OF TWO MEANS

Five trials of drugs to reduce serum cholesterol

Trial | Drug | Cost | No. of patients per group | Observed difference in mean cholesterol (mmol/L) | s.e. of difference (mmol/L) | 95% CI for population difference in mean cholesterol | P-value
1 | A | Cheap | 30 | -1.00 | 1.00 | -2.96 to 0.96 | 0.32
2 | A | Cheap | 3000 | -1.00 | 0.10 | -1.20 to -0.80