HYPOTHESIS TESTING Lulu Eva Rakhmilla Epidemiology and Biostatistics 2012 Basic Concept Approach.
Hypothesis - Biostatistics
Biostatistics
Lecture 8
Lecture 7 Review – Using confidence intervals and p-values to interpret the results of statistical analyses
• Null hypothesis
• P-value
• Interpretation of confidence intervals & p-values
Null hypothesis
• A null hypothesis is one that proposes there is no difference in outcomes
• We commonly design research to disprove a null hypothesis
P-value:- comparing two groups
• What is the probability (P-value) of finding a difference at least as large as the observed difference IF the null hypothesis is true?
• How likely is it we would see a difference this big IF there was NO real difference between the populations?
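Interpretation of p-values
[Figure: P-value scale marked at 1, 0.1, 0.01, 0.001 and 0.0001]
• P-value towards 1: weak evidence against the null hypothesis
• P-value from 0.1 downwards: increasing evidence against the null hypothesis with decreasing P-value
• P-value around 0.001 and below: strong evidence against the null hypothesis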
Objective To assess the effect of combined hormone replacement therapy (HRT) on health related quality of life.
Design Randomised placebo controlled double blind trial.
Table 3 EuroQoL Visual Analogue Scores (EQ-VAS) by treatment group. Figures are means (SE)
          Combined HRT   Placebo       Adjusted difference       P-value
          (n=1043*)      (n=1087*)     at one year (95% CI)
EQ-VAS    77.9 (0.5)     78.5 (0.4)    -0.59 (-1.66 to 0.47)     0.28
Five trials of drugs to reduce serum cholesterol
A reduction of 0.5 mmol/L or more corresponds to a clinically important effect of the drug
Trial  Drug  Cost       No. of      Observed        s.e. of      95% CI for         P-value
                        patients    difference in   difference   population
                        per group   mean            (mmol/L)     difference in
                                    cholesterol                  mean cholesterol
                                    (mmol/L)
1      A     Cheap      30          -1.00           1.00         -2.96 to 0.96      0.32
2      A     Cheap      3000        -1.00           0.10         -1.20 to -0.80     <0.001
3      B     Cheap      40          -0.50           0.83         -2.13 to 1.13      0.55
4      B     Cheap      4000        -0.05           0.083        -0.21 to 0.11      0.55
5      C     Expensive  5000        -0.125          0.05         -0.22 to -0.03     0.012
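The CI and P-value columns of the five-trial table can be reproduced from the observed difference and its standard error. The sketch below is in Python rather than the Stata used in these slides. It assumes the P-values come from a two-sided z-test under a normal approximation, which the slides do not state explicitly. The function name `ci_and_p` is illustrative only.

```python
from math import erf, sqrt

def ci_and_p(diff, se, z_crit=1.96):
    """Return (lower, upper, two-sided P) for an estimate and its s.e.

    Assumption: the estimate is approximately normally distributed.
    """
    lower = diff - z_crit * se
    upper = diff + z_crit * se
    z = diff / se
    # Standard normal CDF evaluated at |z| via the error function
    phi = 0.5 * (1 + erf(abs(z) / sqrt(2)))
    p_value = 2 * (1 - phi)
    return lower, upper, p_value

# (trial, observed difference, s.e. of difference) copied from the table
trials = [(1, -1.00, 1.00), (2, -1.00, 0.10), (3, -0.50, 0.83),
          (4, -0.05, 0.083), (5, -0.125, 0.05)]

for trial, diff, se in trials:
    lower, upper, p = ci_and_p(diff, se)
    print(f"Trial {trial}: 95% CI {lower:.2f} to {upper:.2f}, P = {p:.3f}")
```

Trials 3 and 4 share the same P-value (0.55), yet their confidence intervals are very different in width, which is why the interval and the 0.5 mmol/L threshold should be read together with the P-value.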
Lecture 8 – Proportions and confidence intervals
• Binary variables (RECAP)
• Single proportion
– Standard error, confidence interval
• Incidence & prevalence
• Difference in two proportions
– Standard error, confidence interval
Categorical variables - Binary
Binary variable – two categories only
(also termed – dichotomous variable)
Examples:-
Outcome – Diseased or Healthy; Alive or Dead…
Exposure – Male or Female; Smoker or non-smoker; Treatment or control group…
Inference
Proportion of population diseased – π (unknown)
Proportion of sample diseased, p = d/n
Number of subjects who do experience outcome (diseased) = d
Number of subjects who do not experience outcome (healthy) = h
Total number in sample = n = h + d
Inference - example
Proportion of population with vivax malaria - π
Proportion of sample with vivax malaria, p = d/n = 15/100 = 0.15 (15%)
Number of sample with vivax malaria = d = 15
Number of sample without vivax malaria = h = 85
Total number in sample = n = 15 + 85 = 100
Single proportion - Inference
• Obtain a sample estimate, p, of the population proportion, π
• REMEMBER different samples would give different estimates of π (e.g. sample 1 p1, sample 2 p2,…)
• Derive:
– Standard error
– Confidence interval
Standard error & confidence interval of a single proportion
• Standard error (SE) for single proportion:-
(from the Binomial distribution)
$$ \mathrm{s.e.}(p) = \sqrt{\frac{\pi(1-\pi)}{n}} \approx \sqrt{\frac{p(1-p)}{n}} $$
• 95% CI for single proportion:-
(approximate method based on the normal distribution)
– Lower limit = p - 1.96×s.e.(p)
– Upper limit = p + 1.96×s.e.(p)
Standard error & confidence interval of a single proportion – malaria example
• Estimated proportion of vivax malaria (p) = 15/100 = 0.15
• Standard error of p
$$ \mathrm{s.e.}(p) = \sqrt{\frac{p(1-p)}{n}} = \sqrt{\frac{0.15(1-0.15)}{100}} = 0.036 $$
• 95% Confidence interval for population proportion (π)
– Lower limit = p - 1.96×s.e.(p) = 0.15 – 1.96×0.036 = 0.079
– Upper limit = p + 1.96×s.e.(p) = 0.15 + 1.96×0.036 = 0.221
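The malaria calculation can be checked with a few lines of Python. This is a minimal sketch of the normal-approximation method described on the slide; the function name `proportion_ci` is an illustrative choice, not part of the lecture.

```python
from math import sqrt

def proportion_ci(d, n, z_crit=1.96):
    """Normal-approximation CI for a single proportion p = d/n."""
    p = d / n
    se = sqrt(p * (1 - p) / n)  # estimated standard error of p
    return p, se, p - z_crit * se, p + z_crit * se

# Malaria example: d = 15 cases in a sample of n = 100
p, se, lower, upper = proportion_ci(15, 100)
print(f"p = {p:.2f}, s.e. = {se:.4f}, 95% CI {lower:.3f} to {upper:.3f}")
```

Because the code keeps the unrounded standard error (about 0.0357), its limits come out near 0.080 and 0.220, slightly different from the hand calculation, which rounds the standard error to 0.036 first.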
Interpretation..
“We are 95% confident the population proportion of people with vivax malaria is between 0.079 and 0.221 (or between 7.9% and 22.1%)”
REMEMBER….. Definition of a confidence interval
If we were to draw several independent, random samples (of equal size) from the same population and calculate 95% confidence intervals for each of them, then on average 19 out of every 20 (95%) such confidence intervals would contain the true population proportion (π), and one of every 20 (5%) would not.
[Figure: sample proportion and 95% CI for each of 20 samples; x-axis: Sample (1 to 20); y-axis: Sample proportion and 95% CI (0 to 0.4); population proportion = 0.16 (16%)]
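The repeated-sampling definition can be illustrated by simulation. The sketch below draws many samples from a population with π = 0.16 (the value shown in the figure) and counts how often the 95% CI contains π. The sample size n = 100, the number of simulated samples and the seed are assumptions made for this illustration only.

```python
import random
from math import sqrt

def coverage(pi=0.16, n=100, n_samples=2000, seed=1):
    """Fraction of simulated 95% CIs that contain the true proportion pi."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_samples):
        # Number of diseased subjects in one random sample of size n
        d = sum(rng.random() < pi for _ in range(n))
        p = d / n
        se = sqrt(p * (1 - p) / n)
        if p - 1.96 * se <= pi <= p + 1.96 * se:
            hits += 1
    return hits / n_samples

print(f"Proportion of intervals containing pi: {coverage():.3f}")
```

The printed fraction should be in the neighbourhood of 0.95. It will not be exactly 0.95, partly through simulation noise and partly because the interval relies on the normal approximation.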
WARNING…. Confidence Interval of a single proportion
The normal approximation method breaks down if:
1) Sample size (n) is small
2) Sample proportion (p) is close to 0 or 1
Require: np ≥ 10 and n(1-p) ≥ 10
Stata lets you calculate an ‘exact’ CI
Confidence Interval for a single proportion in Stata
• cii 100 15
                                                         -- Binomial Exact --
    Variable |        Obs        Mean    Std. Err.       [95% Conf. Interval]
-------------+---------------------------------------------------------------
             |        100         .15    .0357071        .0864544    .2353075
• cii 100 1
                                                         -- Binomial Exact --
    Variable |        Obs        Mean    Std. Err.       [95% Conf. Interval]
-------------+---------------------------------------------------------------
             |        100         .01    .0099499        .0002531    .0544594
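For readers without Stata, the ‘exact’ binomial interval can be sketched in Python. The code below assumes the exact interval is the Clopper-Pearson interval and solves for its limits by bisection on the binomial distribution. The helper names (`normal_approx_ok`, `binom_cdf`, `exact_ci`) are illustrative, and this is a sketch rather than a reproduction of Stata's internal algorithm.

```python
from math import comb

def normal_approx_ok(d, n, threshold=10):
    """Rule of thumb: the normal method needs np >= 10 and n(1-p) >= 10."""
    return d >= threshold and (n - d) >= threshold  # np = d, n(1-p) = n - d

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def _bisect(func, iters=200):
    """Root of a function that increases with p on [0, 1], by bisection."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if func(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def exact_ci(d, n, level=0.95):
    """Clopper-Pearson ('exact' binomial) CI for d events among n subjects."""
    alpha = 1 - level
    # Lower limit solves P(X >= d) = alpha/2, which increases with p
    lower = 0.0 if d == 0 else _bisect(
        lambda p: (1 - binom_cdf(d - 1, n, p)) - alpha / 2)
    # Upper limit solves P(X <= d) = alpha/2; negated so it increases with p
    upper = 1.0 if d == n else _bisect(
        lambda p: alpha / 2 - binom_cdf(d, n, p))
    return lower, upper

for d in (15, 1):
    lower, upper = exact_ci(d, 100)
    print(f"d = {d}: normal method ok? {normal_approx_ok(d, 100)}; "
          f"exact 95% CI {lower:.4f} to {upper:.4f}")
```

With 1 case in 100 the rule of thumb fails (np = 1), which is the situation where the exact interval matters most. The normal method would give a negative lower limit there.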
Interpretation of proportions: Incidence versus Prevalence
Prevalence
• Proportion of people in a defined population that have a given disease at a specified point in time
Prevalence = (no. of people with the disease at a particular point in time) / (no. of people in the population at a particular point in time)
Examples:-
• Prevalence of chronic pain among people aged 25+ years and living in the Grampian region, UK.
• Prevalence of typhoid among villagers living in Tak province, Thailand.
• Prevalence of diagnosed asthma in individuals aged 15 to 50 years, registered with a particular general practice in Carlton.
Incidence risk (Cumulative incidence)
• Proportion of new cases in a disease free population in a given time period
Incidence risk = (no. of new cases of disease in a given time period) / (no. of people disease-free at beginning of time period)
Examples:-
• Incidence risk of death in five years following diagnosis with prostate cancer
• Incidence risk of breast cancer over 10 years of follow-up in women 40-69 years of age and free from breast cancer in 1990
Incidence rate (NOT a proportion)
• Number of new cases that occur in a disease free population per person per unit time
Incidence rate = (no. of new cases of disease) / (total person-years of observation)
Examples:-
• Incidence rate of all-cause mortality of men in the Melbourne Collaborative Cohort Study = 9.0 per 1000 men per year
‘9 out of every 1000 men die each year’
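The three measures differ only in what goes into the numerator and denominator. The short Python sketch below makes that explicit. All counts are hypothetical, chosen purely for illustration (the last pair is picked so that the rate equals 9 per 1000 person-years), and the function names are not from the lecture.

```python
def prevalence(cases_now, population_now):
    """Existing cases / population, both at the same point in time."""
    return cases_now / population_now

def incidence_risk(new_cases, disease_free_at_start):
    """New cases in a period / people disease-free at its start."""
    return new_cases / disease_free_at_start

def incidence_rate(new_cases, person_years):
    """New cases / total person-years of observation (not a proportion)."""
    return new_cases / person_years

# Hypothetical counts, for illustration only
print(f"Prevalence:     {prevalence(15, 100):.2f}")
print(f"Incidence risk: {incidence_risk(30, 600):.2f}")
print(f"Incidence rate: {incidence_rate(9, 1000) * 1000:.1f} per 1000 person-years")
```

Prevalence and incidence risk are proportions and must lie between 0 and 1. The incidence rate has units of "per person-time" and has no upper limit of 1.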
Comparing two proportions
Comparing two proportions: 2×2 table
• Proportion of all subjects experiencing outcome, p = d/n
• Proportion in exposed group, p1 = d1/n1
• Proportion in unexposed group, p0 = d0/n0
Be alert (not alarmed): watch for transposing the table and swapping columns or rows
                       With outcome   Without outcome   Total
                       (diseased)     (disease-free)
Exposed (group 1)      d1             h1                n1
Unexposed (group 0)    d0             h0                n0
Total                  d              h                 n
Comparing two proportions
Example:- TBM trial (Thwaites GE et al 2004)
Adults with tuberculous meningitis randomly allocated into 2 treatment groups:
1. Dexamethasone
2. Placebo
Outcome measure: Death during nine months following start of treatment.
Research question:
Can treatment with dexamethasone reduce the risk of death among adults with tuberculous meningitis?
Comparing two proportions
Example – TBM trial
                           Death during 9 months post start of treatment
Treatment group            Yes     No      Total
Dexamethasone (group 1)    87      187     274
Placebo (group 0)          112     159     271
Total                      199     346     545
Difference in two population proportions, π1-π0
Estimate of difference in population proportions = p1 – p0
Example:- TBM trial
Dexamethasone
p1 = d1/n1 = 87/274 = 0.318
Placebo
p0 = d0/n0 = 112/271 = 0.413
p1 – p0 = 0.318 – 0.413 = -0.095 (or -9.5%)
Difference in two proportions - Inference
• Obtain a sample estimate, p1-p0, of the difference in population proportions, π1-π0
• REMEMBER different samples would give different estimates of π1-π0 (e.g. sample 1 p11-p10, sample 2 p21-p20,…)
• Derive:
– Standard error of difference in sample proportions
– Confidence interval of difference in population proportions
Standard error & confidence interval for difference between two proportions
• Standard error (SE) for difference between sample proportions:-
$$ \mathrm{s.e.}(p_1 - p_0) = \sqrt{[\mathrm{s.e.}(p_1)]^2 + [\mathrm{s.e.}(p_0)]^2} $$
• 95% CI for difference between population proportions:-
– Lower limit = (p1-p0) - 1.96×s.e.(p1-p0)
– Upper limit = (p1-p0) + 1.96×s.e.(p1-p0)
Standard error & confidence interval
for difference between two proportions
Example:- TBM trial
Estimate of difference in population proportions
= p1-p0 = -0.095
s.e.(p1-p0) = 0.041
95% CI for difference in population proportions (π1-π0):
-0.095 ± 1.96×0.041
-0.175 up to -0.015 OR -17.5% up to -1.5%
Interpretation:-
“We are 95% confident, that the difference in population proportions is
between -17.5% (dexamethasone reduces the proportion of deaths by a
large amount) and -1.5% (dexamethasone marginally reduces the
proportion of deaths)”.
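The TBM calculation can be reproduced step by step in Python. This is a minimal sketch of the normal-approximation method for a difference in two proportions; the function name `risk_difference_ci` is illustrative, not part of the lecture.

```python
from math import sqrt

def risk_difference_ci(d1, n1, d0, n0, z_crit=1.96):
    """Difference in proportions p1 - p0 with a normal-approximation CI."""
    p1, p0 = d1 / n1, d0 / n0
    se1 = sqrt(p1 * (1 - p1) / n1)    # s.e. of p1
    se0 = sqrt(p0 * (1 - p0) / n0)    # s.e. of p0
    se_diff = sqrt(se1**2 + se0**2)   # s.e. of the difference
    diff = p1 - p0
    return diff, se_diff, diff - z_crit * se_diff, diff + z_crit * se_diff

# TBM trial: 87/274 deaths on dexamethasone, 112/271 deaths on placebo
diff, se, lower, upper = risk_difference_ci(87, 274, 112, 271)
print(f"difference = {diff:.3f}, s.e. = {se:.3f}, "
      f"95% CI {lower:.3f} to {upper:.3f}")
```

Working with unrounded proportions gives a difference of about -0.096 and limits of about -0.176 and -0.015. The hand calculation differs in the third decimal place because it rounds p1 and p0 before subtracting.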
Comparing proportions using Stata
• csi 87 112 187 159
                 |   Exposed   Unexposed  |      Total
-----------------+------------------------+------------
           Cases |        87         112  |        199
        Noncases |       187         159  |        346
-----------------+------------------------+------------
           Total |       274         271  |        545
                 |                        |
            Risk |  .3175182    .4132841  |   .3651376
                 |                        |
                 |      Point estimate    |    [95% Conf. Interval]
                 |------------------------+------------------------
 Risk difference |        -.0957659       |   -.1762352   -.0152966
      Risk ratio |         .7682808       |    .6139856    .9613505
 Prev. frac. ex. |         .2317192       |    .0386495    .3860144
 Prev. frac. pop |         .1164974       |
                 +-------------------------------------------------
                               chi2(1) =     5.39  Pr>chi2 = 0.0202
Remember the warning about how the table is presented: Stata requires presentation with outcome by rows and exposure by columns
Results are close to those obtained by hand
Difference between two proportions:-
Risk difference
Example:- TBM trial
Outcome measure: Death during nine months following start of treatment.
Dexamethasone
p1 (incidence risk) = d1/n1 = 87/274 = 0.318
Placebo
p0 (incidence risk) = d0/n0 = 112/271 = 0.413
p1 – p0 (risk difference) = 0.318 – 0.413 = -0.095 (or -9.5%)
Lecture 8 – Objectives
• Define binary variables, prevalence and incidence risk
• Calculate and interpret a proportion and 95% confidence interval for the population proportion
• Calculate and interpret the difference in sample proportions and 95% confidence interval for difference in population proportions
Thank You
www.HelpWithAssignment.com