Malimu statistical significance testing.

STATISTICAL SIGNIFICANCE TESTING, by Malimu, Dept of Epidemiology/Biostatistics, School of Public Health and Social Sciences, MUHAS/KIU


Page 1: Malimu statistical significance testing.

STATISTICAL SIGNIFICANCE TESTING

BY MALIMU
Dept of Epidemiology/Biostatistics, School of Public Health and Social Sciences, MUHAS/KIU

Page 2: Malimu statistical significance testing.

OUTLINE

• Introduction
– why significance tests (with examples)
– how significance tests work
– significance levels
– critical region and critical values
– concept of the P-value
– hypotheses

• Significance test for 1 mean (z-test and t-test)

• Significance test for 1 proportion (z-test)

• Significance test for 2 means (z-test and t-test)

• Significance test for 2 proportions (χ²-test) derived from a 2 x 2 contingency table

Page 3: Malimu statistical significance testing.

WHY A SIGNIFICANCE TEST

Example 1: mean age at first sexual intercourse; do males start earlier than females?

            Male     Female   Overall
mean        16.9     17.2     17.0
(n)         (93)     (99)     (192)
std. dev.   2.1      2.0      2.0

Page 4: Malimu statistical significance testing.

INTRODUCTION: why significance tests

Example 2: Prevalence of HIV infection: comparing in-school youth and the general youth population

• Large studies have indicated that the proportion of youth infected with HIV is 10%. A study done involving 400 school youth showed a prevalence of 7%. Do these results provide evidence that the prevalence of HIV among school youth is lower than that in the general youth population?

Page 5: Malimu statistical significance testing.

INTRODUCTION: why significance tests

Example 3: Utilization of VCT services by marital status

• In a study on VCT utilization, we find that 60% of married people in the sample utilize VCT services compared to 30% of unmarried people. How should we interpret this result?

Page 6: Malimu statistical significance testing.

INTRODUCTION: why significance tests

• We note that in all examples, there are differences

• However, the observed difference in each example might:
– reflect a TRUE DIFFERENCE (i.e. the difference also exists in the total population from which the sample was drawn)
– be due to CHANCE (i.e. in reality there is no difference, but the observed difference is due to sampling variation)
– be due to BIAS (e.g. due to defects in the study methodology)

Page 7: Malimu statistical significance testing.

INTRODUCTION: why significance tests

• With an appropriate study design, we can feel confident that an observed difference between two groups cannot be explained by BIAS

• We would like to find out whether this difference can be considered as a TRUE difference

• We can only conclude that this is the case if we can rule out the CHANCE explanation

• We accomplish this by applying a significance test

Page 8: Malimu statistical significance testing.

INTRODUCTION: why significance tests

• A SIGNIFICANCE TEST estimates how likely it is that a study result at least as extreme as the one observed (e.g. a difference between two groups) would arise by chance alone, i.e. if no real difference existed

• A significance test is used to assess whether a study result observed in a sample can be considered a result that indeed exists in the study population from which the sample was drawn

Page 9: Malimu statistical significance testing.

INTRODUCTION: How significance tests work

• Suppose we observed a difference between two groups in a study sample.

• We want to know whether this observed difference between the two groups represents a real difference in the total study population from which the sample was drawn, or whether it just occurred by chance (due to sampling variation).

Page 10: Malimu statistical significance testing.

INTRODUCTION: How significance tests work

• To find this out, we determine how likely it is that a difference of this size could have occurred by chance.

• We can never be 100% sure that an observed difference is true, but in general, we are happy if we can be 99% or 95% sure (confident).

• Working at the 95% level means that, if there were no real difference, a difference as large as the one observed would occur by chance less than 5% of the time.

Page 11: Malimu statistical significance testing.

Important Terminologies

• Statistical hypothesis
– This is a statement about the parameter(s) of the population(s) from which the sample(s) were taken.
– Null hypothesis, H0: hypothesis of “no difference”. This is the one to be tested.
– Alternative hypothesis, H1: hypothesis that disagrees with the null hypothesis.

Page 12: Malimu statistical significance testing.

Important Terminologies

• Test statistic
– This is a mathematical function (expression) of sample values which provides a basis for testing a statistical hypothesis.
– It has a known sampling distribution with tabulated percentage points (e.g. standard normal deviate (SND), z; chi-squared, χ²; t)

Page 13: Malimu statistical significance testing.

Important Terminologies

• Significance level (α):
– The probability of rejecting H0 when it is true
– Often expressed in percentage form (i.e. the probability, α, is multiplied by 100)
– In social sciences, the commonly accepted level is 0.05 (5%), i.e. we allow a 5% risk that our conclusion arose by chance. In clinical trials involving new drugs, a stricter significance level (e.g. 0.01 = 1%) would be chosen
– Generally, 0.01 and 0.05 are the values most commonly used in scientific studies

Page 14: Malimu statistical significance testing.

Important Terminologies

• Critical (Rejection) Region
– This is the region that encompasses values of the test statistic leading to rejection of the null hypothesis
– Location of the critical region is dependent on the test statistic and the specified significance level

Page 15: Malimu statistical significance testing.

Important Terminologies

• Critical Value
– This is the value of the test statistic corresponding to a given significance level
– The critical value changes as the confidence level alters: e.g. the two-sided critical values of z for confidence levels of 90%, 95% and 99% are 1.64, 1.96 (≈2) and 2.58, respectively
– If the numerical (absolute) value of the test statistic computed from the data is greater than the critical value, H0 is rejected
– It is the boundary value of the critical region
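The link between confidence level and critical value can be checked numerically. The sketch below is only an illustration, not part of the original slides: it assumes a two-sided test on the standard normal distribution and uses Python's standard library; the function name `two_sided_critical_z` is made up for this example.

```python
from statistics import NormalDist

def two_sided_critical_z(confidence):
    """Two-sided critical value of z for a given confidence level (e.g. 0.95)."""
    alpha = 1.0 - confidence              # significance level
    # Half of alpha sits in each tail, so look up the (1 - alpha/2) quantile.
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)

for confidence in (0.90, 0.95, 0.99):
    z_crit = two_sided_critical_z(confidence)
    print(f"{confidence:.0%} confidence: critical z = {z_crit:.3f}")
```

To three decimals the values come out as 1.645, 1.960 and 2.576, which agree with the tabulated 1.64, 1.96 and 2.58 up to rounding.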

Page 16: Malimu statistical significance testing.

CRITICAL REGION & VALUE

Page 17: Malimu statistical significance testing.

Concept of P-value

• In any study looking for differences between groups or associations between variables, the likelihood or PROBABILITY of observing a certain result by chance has to be calculated by a statistical test

• This PROBABILITY of observing a result by chance is usually expressed as a P-VALUE

• If it is unlikely (P < 0.05) that the difference occurred by chance, we reject the chance explanation and accept that there is a real difference. We then say that the difference is statistically significant

Page 18: Malimu statistical significance testing.

Concept of P-value

• If it is likely (P ≥ 0.05) that the difference occurred by chance, we cannot conclude that a real difference exists. We then say that the difference is not statistically significant

• Therefore a difference is considered significant if P < 0.05

Page 19: Malimu statistical significance testing.

Hypotheses

• In statistical terms the assumption that in the total study population no real difference exists between groups (or that no real association exists between variables) is called the NULL HYPOTHESIS (H0)

• The ALTERNATIVE HYPOTHESIS (H1) is that there exists a difference between groups or that a real association exists between variables

• If the result is statistically significant, we reject the NULL HYPOTHESIS (H0) and accept the ALTERNATIVE HYPOTHESIS (H1) that there is a real difference between two groups, or a real association between two variables

Page 20: Malimu statistical significance testing.

One sample significance test for a mean (σ known): the Standard Normal Deviate or z-test

• Problem: Is it reasonable to conclude that a sample of n observations, with mean x̄, could have been from a population with mean µ and standard deviation σ?

Page 21: Malimu statistical significance testing.

One sample significance test for a mean (σ known): the Standard Normal Deviate or z-test

• H0: The difference between µ and x̄ is merely due to sampling error

• Calculate SND, z = (x̄ − µ) / SE(x̄) = (x̄ − µ) / (σ/√n)

• Consider the numerical value of z

Page 22: Malimu statistical significance testing.

INTERPRETATION OF P-VALUE

• If |z| < 1.96 then P > 0.05:
– we have no strong evidence against H0
– suggests that the difference being due to chance is more likely
– hence, the difference is not statistically significant

• If |z| > 1.96 then P < 0.05:
– we have evidence against H0
– it is unlikely that the difference between µ and x̄ is due only to sampling error
– hence, the difference is statistically significant

• If |z| > 2.58 then P < 0.01:
– we have strong evidence against H0

Page 23: Malimu statistical significance testing.

Example

• Results of a study investigating medical risks associated with a certain occupation show that in a random sample of 20 men aged 30-39 years the mean systolic blood pressure is 141.4 mmHg.

• Suppose that the mean systolic blood pressure in the general population of men aged 30-39 years is known to be 133.2 mmHg with a standard deviation of 15.1 mmHg.

• Do the results of the study provide evidence of an increased blood pressure associated with this occupation?

Page 24: Malimu statistical significance testing.

Solution

• Null hypothesis, H0: there is no increase in blood pressure in this occupation, and the sample of 20 men can be regarded as a random sample from the general population of men aged 30-39 years.

• Calculate SND, z = (x̄ − µ) / SE(x̄) = (x̄ − µ) / (σ/√n)

Page 25: Malimu statistical significance testing.

Solution

• n = 20

• x̄ = 141.4 mmHg

• µ = 133.2 mmHg

• σ = 15.1 mmHg

• SE(x̄) = σ/√n = 15.1/√20 = 3.38

• SND, z = (141.4 − 133.2)/3.38 = 2.43; P = 0.015 < 0.05

• Conclusion:
– The difference is statistically significant!
– That is, there is enough evidence of an increase in systolic blood pressure among men in this occupation
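The arithmetic of the blood pressure example can be reproduced with a few lines of code. This is a minimal sketch rather than the author's own working: it assumes a two-sided P-value from the standard normal distribution, and the function name `one_sample_z_test` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def one_sample_z_test(sample_mean, pop_mean, pop_sd, n):
    """One-sample z-test for a mean when the population SD is known."""
    se = pop_sd / sqrt(n)                       # standard error of the mean
    z = (sample_mean - pop_mean) / se           # standard normal deviate
    # Two-sided P-value: area in both tails beyond |z| (an assumption here).
    p_two_sided = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return se, z, p_two_sided

# Figures from the systolic blood pressure example
se, z, p = one_sample_z_test(sample_mean=141.4, pop_mean=133.2, pop_sd=15.1, n=20)
print(f"SE = {se:.2f}, z = {z:.2f}, P = {p:.3f}")
```

With the example's inputs this should agree with the slide's SE of 3.38, z of 2.43 and P of about 0.015.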

Page 26: Malimu statistical significance testing.

PRACTICAL

The mean level of prothrombin in the normal population is known to be 20.0 mg/100 ml of plasma and the standard deviation is 4 mg/100 ml. A sample of 40 patients showing vitamin K deficiency has a mean prothrombin level of 18.5 mg/100 ml.

• (a) How reasonable is it to conclude that the true mean for patients with vitamin K deficiency is the same as that for a normal population?

• (b) Within what limits would the mean prothrombin level be expected to lie for all patients with vitamin K deficiency? (Give the 95% confidence limits)

Page 27: Malimu statistical significance testing.

ONE SAMPLE SIGNIFICANCE TEST FOR A PROPORTION

Problem: Is it reasonable to conclude that a sample of n observations, in which the proportion p have a characteristic, could have been taken from a population in which the proportion with the characteristic is π?

• H0: the difference between p and π is merely due to sampling error (i.e. by chance)

• If n is reasonably large (say, n > 40), then calculate

z = (p − π) / SE(p) = (p − π) / √(π(1 − π)/n)

OR, with p and π expressed as percentages,

z = (p − π) / √(π(100 − π)/n)

• Conclusions follow as before
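As an illustration, the formula can be applied to Example 2 from the introduction (general youth prevalence 10%, observed prevalence 7% among 400 school youth). The sketch below is not from the slides: the function name is hypothetical, the standard error is computed under H0 using π, and the P-value is two-sided by assumption.

```python
from math import sqrt
from statistics import NormalDist

def one_sample_proportion_z(p, pi, n):
    """z-test comparing a sample proportion p with a population proportion pi."""
    se = sqrt(pi * (1.0 - pi) / n)              # standard error of p under H0
    z = (p - pi) / se
    p_two_sided = 2.0 * NormalDist().cdf(-abs(z))
    return se, z, p_two_sided

# Example 2: HIV prevalence among school youth versus the general youth population
se, z, p_value = one_sample_proportion_z(p=0.07, pi=0.10, n=400)
print(f"SE = {se:.3f}, z = {z:.2f}, two-sided P = {p_value:.3f}")
```

Here SE is 0.015 and z is about −2.0, so |z| just exceeds 1.96 and the two-sided P-value falls a little below 0.05.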

Page 28: Malimu statistical significance testing.

PRACTICAL: one sample proportion

• In a clinical trial to compare two systems of TB treatment, A (hospital based DOTS) and B (home based DOTS), 100 patients each tried the two systems on different occasions. Of the 100 patients, 65 say they prefer A and 35 prefer B. Is this reasonably good evidence that more patients prefer A than B?

Page 29: Malimu statistical significance testing.

One sample significance test for a mean (σ unknown): the t-test

• Use of SND, z, applies when the population standard deviation, σ, is known

• If σ is unknown, it can be estimated from the sample by the standard deviation s

• With small samples, replacing σ by s in the formula for SND leads to a new quantity t, given by

• t = (x̄ − µ) / SE(x̄) = (x̄ − µ) / (s/√n)
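The t statistic can be computed directly from raw data. The sketch below is an illustration only: the twelve measurements and the hypothesised mean are invented (hypothetical) values, the function name `one_sample_t` is made up, and the result is compared with 2.201, the standard tabulated two-sided 5% critical value of t on 11 degrees of freedom.

```python
from math import sqrt
from statistics import mean, stdev

# Two-sided 5% critical value of t on 11 df, taken from a standard t table.
T_CRITICAL_5PCT_DF11 = 2.201

def one_sample_t(values, mu):
    """Return (t, df) for a one-sample t-test of the mean of values against mu."""
    n = len(values)
    s = stdev(values)                 # sample standard deviation (n - 1 divisor)
    se = s / sqrt(n)                  # estimated standard error of the mean
    t = (mean(values) - mu) / se
    return t, n - 1

# Hypothetical measurements (n = 12) and hypothesised population mean
values = [52, 47, 58, 49, 61, 55, 50, 63, 57, 48, 54, 60]
t, df = one_sample_t(values, mu=50)
print(f"t = {t:.2f} on {df} df")
print("significant at the 5% level" if abs(t) > T_CRITICAL_5PCT_DF11
      else "not significant at the 5% level")
```

For other sample sizes the critical value has to be read from the t table at n − 1 degrees of freedom.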

Page 30: Malimu statistical significance testing.

END OF ONE SAMPLE SIGNIFICANCE TEST

• s = 5.18

• t = 2.65 on 11 df; P<0.05

• probably G. secundum

Page 31: Malimu statistical significance testing.

DETERMINING SIGNIFICANT DIFFERENCES BETWEEN GROUPS IN CATEGORICAL DATA

• For NOMINAL data the significance test to be used depends on whether the sample is small or large

• Generally, the Chi-square test (χ²) will be used

• However, for small samples (say, N < 40) in a 2 x 2 table, or if any cell of the cross-table has an expected value of less than 5, it is better to use Yates' corrected χ² or the Fisher exact test

Page 32: Malimu statistical significance testing.

COMPARING TWO PROPORTIONS: the chi-square (χ²) test

• The chi-square test is used for CATEGORICAL data to test for independence between two or more variables, basically comparing two or more proportions

• With categorical data the chi-square test is used to find out whether observed differences between proportions of events in two or more groups may be considered statistically significant

Page 33: Malimu statistical significance testing.

THE CHI-SQUARE (χ²) TEST: Example

• Suppose that in a cross-sectional study of the factors affecting the utilization of antenatal clinics we find that 64% of the women who lived within 10 km of the clinic came for antenatal care, compared to only 47% of those who lived more than 10 km away

• This suggests that antenatal care (ANC) is used more often by women who live close to the clinics. The complete results are presented in the following table

Page 34: Malimu statistical significance testing.

UTILIZATION OF ANTENATAL CLINICS BY DISTANCE FROM A CLINIC

Page 35: Malimu statistical significance testing.

THE CHI-SQUARE (χ²) TEST

• We now want to examine if this observed difference is statistically significant or not.

• The chi-square test can be used to give us the answer.

• To perform a χ² test we need to complete the following 3 steps:
– calculate the χ²,
– use a χ² table to obtain the P-value and
– interpret the χ²

Page 36: Malimu statistical significance testing.

TABLE OF χ² VALUES

df    P = 0.05   P = 0.01
 1      3.84       6.63
 2      5.99       9.21
 3      7.81      11.34
 4      9.49      13.28
 5     11.07      15.09
 6     12.59      16.81
 7     14.07      18.48
 8     15.51      20.09
 9     16.92      21.67
10     18.31      23.21
11     19.68      24.72
12     21.03      26.22

Page 37: Malimu statistical significance testing.

CALCULATING χ² VALUE: the 2 X 2 table

• Consider the following general cross-table:

Page 38: Malimu statistical significance testing.

CALCULATING χ² VALUE: the 2 X 2 table

• The quick formula for calculating χ² in a 2 x 2 table is as follows:

• χ² = N(ad − bc)² / (EFGH), where a, b, c and d are the four cell counts, E, F, G and H are the row and column totals, and N is the grand total

• The general formula for larger contingency tables is time consuming to apply by hand, but computers are very useful for this

• For larger contingency tables, we shall therefore learn how to use Epi-Info for statistical significance testing
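The quick formula is easy to check in code. The sketch below is an illustration under stated assumptions: E, F, G and H are taken to be the row and column totals of the 2 x 2 table, and the cell counts (51, 29, 35, 40) are assumed values chosen to be consistent with the 64% and 47% quoted in the antenatal care example, since the original table is not reproduced in this transcript. The P-value helper applies only to 1 degree of freedom, where χ² is the square of a standard normal deviate.

```python
from math import sqrt
from statistics import NormalDist

def chi_square_2x2(a, b, c, d):
    """Quick-formula chi-square for a 2 x 2 table with cells a, b (row 1) and c, d (row 2)."""
    e, f = a + b, c + d               # row totals
    g, h = a + c, b + d               # column totals
    n = a + b + c + d                 # grand total
    return n * (a * d - b * c) ** 2 / (e * f * g * h)

def p_value_1df(chi2):
    """P-value for chi-square on 1 df only, using chi2 = z squared."""
    return 2.0 * (1.0 - NormalDist().cdf(sqrt(chi2)))

# Assumed counts: within 10 km, 51 used ANC and 29 did not;
# more than 10 km away, 35 used ANC and 40 did not.
chi2 = chi_square_2x2(51, 29, 35, 40)
print(f"chi-square = {chi2:.2f}, P = {p_value_1df(chi2):.2f}")
```

With these assumed counts the statistic comes out close to 4.57 with P close to 0.03.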

Page 39: Malimu statistical significance testing.

USING A χ² TABLE

(1) Decide on the significance level you want to use (e.g. 0.05)

(2) Calculate the degrees of freedom (df) as:
• df = (r-1) x (c-1), where r is the number of rows and c is the number of columns
• for a simple two-by-two table the number of degrees of freedom is 1 (df = (2-1) x (2-1) = 1)

(3) If the calculated χ² > the tabulated χ² then P<0.05
• In this case, we reject the null hypothesis and conclude that there is a statistically significant difference between the groups

(4) If the calculated χ² < the tabulated χ² then P>0.05
• In this case, we do not reject the null hypothesis and conclude that the observed difference is not statistically significant
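The lookup steps above can be sketched as a small function. This is an illustrative sketch, not part of the original slides: the dictionary holds the tabulated χ² values for 1 to 12 degrees of freedom at P = 0.05 and P = 0.01, and the names `CHI2_CRITICAL` and `is_significant` are hypothetical.

```python
# Tabulated chi-square critical values: df -> (value at P = 0.05, value at P = 0.01)
CHI2_CRITICAL = {
    1: (3.84, 6.63),   2: (5.99, 9.21),   3: (7.81, 11.34),  4: (9.49, 13.28),
    5: (11.07, 15.09), 6: (12.59, 16.81), 7: (14.07, 18.48), 8: (15.51, 20.09),
    9: (16.92, 21.67), 10: (18.31, 23.21), 11: (19.68, 24.72), 12: (21.03, 26.22),
}

def is_significant(chi2, rows, cols, alpha=0.05):
    """Compare a calculated chi-square with the tabulated value for the table's df."""
    df = (rows - 1) * (cols - 1)              # step (2): degrees of freedom
    column = {0.05: 0, 0.01: 1}[alpha]        # step (1): chosen significance level
    return chi2 > CHI2_CRITICAL[df][column]   # steps (3) and (4): the comparison

# A calculated chi-square of 4.57 from a 2 x 2 table
print(is_significant(4.57, rows=2, cols=2))               # compared with 3.84
print(is_significant(4.57, rows=2, cols=2, alpha=0.01))   # compared with 6.63
```

A value of 4.57 on 1 df exceeds 3.84 but not 6.63, so it is significant at the 5% level and not at the 1% level.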


Page 41: Malimu statistical significance testing.

APPLYING THE χ²

Example: use of χ² test with the data on utilization of antenatal care:

• χ² = 4.57; P = 0.03 < 0.05

• Conclusion:
– women living within a distance of 10 km from the clinic utilize antenatal care more often (64%) than the women living more than 10 km away (47%);
– this difference is statistically significant (χ² = 4.57; P = 0.03)

Page 42: Malimu statistical significance testing.

PRACTICAL: two sample proportions

• From each of 509 vaginal swabs taken at an STI clinic, isolation of Candida albicans and culture of Trichomonas vaginalis were attempted.

• There were 347 swabs negative for both Candida and Trichomonas. Candida was isolated from 7 of the 44 swabs positive for Trichomonas:
– Present this information in a 2 x 2 table.
– Is there evidence of an association between Candidiasis and Trichomoniasis?