Statistics for the Terrified Talk 4: Analysis of Clinical Trial data
30th September 2010
Janet Dunn, Louise Hiller
Data types
What type of data do you have?
• Categorical
  – 2 levels
  – More than 2 levels: ordered or non-ordered
• Continuous
  – Normally distributed
  – Non-normally distributed
• Time to event
2-level categorical (binary) data
Frequency Table (columns: Variable 1; rows: Variable 2)
N (%)          1       2       Row total
1              a (%)   b (%)   a+b
2              c (%)   d (%)   c+d
Column total   a+c     b+d     n
2-level categorical (binary) data - Test of association
Null hypothesis: The 2 factors are independent
Chi-squared test, with continuity correction: χ² = 11.4, p = 0.0007
Treatment and gender are NOT independent
Treatment (columns) by Gender (rows):
N (%)          1          2          Row total
Male           55 (58%)   32 (33%)   87
Female         40 (42%)   66 (67%)   106
Column total   95         98         193
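The continuity-corrected statistic quoted on this slide can be checked by hand. The sketch below assumes the correction is Yates' correction for a 2x2 table; the function name `chi2_yates_2x2` is illustrative and not from the talk.

```python
import math

def chi2_yates_2x2(a, b, c, d):
    """Chi-squared statistic with Yates' continuity correction for the
    2x2 table [[a, b], [c, d]], plus its p-value on 1 degree of freedom."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    # Yates' correction shrinks |ad - bc| by n/2 (never below zero)
    diff = max(abs(a * d - b * c) - n / 2, 0.0)
    chi2 = n * diff ** 2 / (row1 * row2 * col1 * col2)
    # Upper tail of chi-squared on 1 df: P(Z^2 > x) = erfc(sqrt(x / 2))
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p

# Counts from the slide: Male 55/32, Female 40/66
chi2, p = chi2_yates_2x2(55, 32, 40, 66)
print(f"chi2 = {chi2:.1f}, p = {p:.4f}")
```

With the slide's counts this agrees with the quoted χ² and p-value to rounding.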
2-level categorical (binary) data - Test of association
Null hypothesis: The 2 factors are independent
Fisher’s exact test (commonly used with small numbers): p = 0.51
No evidence that treatment and gender are associated
Treatment (columns) by Gender (rows):
N (%)          1          2          Row total
Male           4 (10%)    6 (17%)    10
Female         35 (90%)   30 (83%)   65
Column total   39         36         75
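Fisher's exact test can be sketched directly from the hypergeometric distribution. This version assumes the common two-sided definition (sum the probabilities of all tables no more likely than the observed one); the function name is illustrative.

```python
from math import comb

def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2
    denom = comb(n, col1)

    def prob(k):
        # Hypergeometric probability that the top-left cell equals k,
        # with all row and column totals held fixed
        return comb(row1, k) * comb(row2, col1 - k) / denom

    p_obs = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Add up every table that is no more likely than the observed one
    # (small tolerance guards against floating-point ties)
    return sum(prob(k) for k in range(lowest, highest + 1)
               if prob(k) <= p_obs * (1 + 1e-9))

# Counts from the slide: Male 4/6, Female 35/30
p = fisher_exact_2x2(4, 6, 35, 30)
print(f"p = {p:.2f}")
```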
2-level categorical (binary) data – Measure of agreement
A measure of agreement between reviewers, above that expected by chance
Kappa: κ = 0.71 (95% CI 0.60-0.83)
There is good agreement between reviewers
Reviewer 1 (columns) by Reviewer 2 (rows):
               Response   No response   Row total
Response       74         12            86
No response    8          50            58
Column total   82         62            144
Altman guidelines: <0.20 poor; 0.21 - 0.40 fair; 0.41 - 0.60 moderate; 0.61 - 0.80 good; 0.81 - 1.00 very good
2-level categorical (binary) data – Measure of agreement
A measure of agreement between reviewers, above that expected by chance
Kappa: κ = -0.04 (95% CI -0.24 - 0.15)
There is poor agreement between reviewers
Reviewer 1 (columns) by Reviewer 2 (rows):
               Response   No response   Row total
Response       35         25            60
No response    25         15            40
Column total   60         40            100
Altman guidelines: <0.20 poor; 0.21 - 0.40 fair; 0.41 - 0.60 moderate; 0.61 - 0.80 good; 0.81 - 1.00 very good
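Kappa compares the observed proportion of agreement with the proportion expected by chance from the row and column totals. A minimal sketch of the point estimate only (the confidence intervals on the slides are not reproduced here); the function name is illustrative.

```python
def cohen_kappa(table):
    """Cohen's kappa for a square table of counts
    (rows: one reviewer, columns: the other reviewer)."""
    k = len(table)
    n = sum(sum(row) for row in table)
    row_tot = [sum(row) for row in table]
    col_tot = [sum(table[i][j] for i in range(k)) for j in range(k)]
    # Observed agreement: proportion of cases on the diagonal
    p_obs = sum(table[i][i] for i in range(k)) / n
    # Chance agreement: product of the marginal proportions, summed over categories
    p_exp = sum(row_tot[i] * col_tot[i] for i in range(k)) / n ** 2
    return (p_obs - p_exp) / (1 - p_exp)

good = cohen_kappa([[74, 12], [8, 50]])    # first reviewer table
poor = cohen_kappa([[35, 25], [25, 15]])   # second reviewer table
print(f"kappa = {good:.2f}, kappa = {poor:.2f}")
```

In the second table the reviewers agree on exactly half the cases, slightly less than the 52% expected by chance, which is why kappa is just below zero.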
2-level categorical (binary) data – Exploring patterns in the data
Odds ratio (OR): the ratio of the odds of an event occurring in the 1st group to the odds of it occurring in the 2nd group
OR=1 - event is equally likely to occur in both groups
OR>1 - event is more likely to occur in 1st group
OR<1 - event is less likely to occur in 1st group
OR=4.1 (95% CI 2.2-7.9)
The odds of a male having a response are about 4 times those of a female having a response
Response (columns) by Gender (rows):
               Yes   No   Row total
Male           55    20   75
Female         40    60   100
Column total   95    80   175
2-level categorical (binary) data – Exploring patterns in the data
Relative Risk (RR): the ratio of the risk of an event occurring in the 1st group to the risk of it occurring in the 2nd group
RR=1 - event is equally likely to occur in both groups
RR>1 - event is more likely to occur in 1st group
RR<1 - event is less likely to occur in 1st group
RR=1.7 (95% CI 0.64-4.50)
New trt patients are estimated to be 1.7 times as likely to suffer an SAE as control patients, but the 95% CI includes 1, so the difference is not statistically significant
SAE suffered (columns) by Treatment (rows):
               Yes   No    Row total
New trt        10    88    98
Control        6     94    100
Column total   16    182   198
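Both the odds ratio and the relative risk, with their confidence intervals, can be computed from the four cell counts. The sketch below assumes the usual large-sample intervals on the log scale with z = 1.96; the slides do not say which interval method was used, and the function names are illustrative.

```python
import math

Z95 = 1.96  # normal quantile assumed for a 95% interval

def odds_ratio(a, b, c, d):
    """OR and 95% CI for [[a, b], [c, d]]: rows are groups, columns are event yes/no."""
    est = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return est, est * math.exp(-Z95 * se_log), est * math.exp(Z95 * se_log)

def relative_risk(a, b, c, d):
    """RR and 95% CI for the same table layout."""
    est = (a / (a + b)) / (c / (c + d))
    se_log = math.sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))
    return est, est * math.exp(-Z95 * se_log), est * math.exp(Z95 * se_log)

# Response by gender table (odds ratio slide)
print("OR: %.1f (%.1f to %.1f)" % odds_ratio(55, 20, 40, 60))
# SAE by treatment table (relative risk slide)
print("RR: %.1f (%.2f to %.2f)" % relative_risk(10, 88, 6, 94))
```

Under these assumptions both slides' estimates and intervals are reproduced to rounding.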
Exploring patterns in multivariate data - Logistic Regression
• A statistical modelling method that describes the relationship between a categorical response variable and 1 or more categorical and/or continuous variables
e.g. Association between bearing grudges & medical conditions
                      OR     95% CI        p
Heart attack          2.09   1.51 - 2.89   0.0001
High blood pressure   1.47   1.27 - 1.71   0.0001
Heart disease         1.64   1.24 - 2.18   0.001
Epilepsy              0.86   0.55 - 1.38   0.59
Stroke                0.99   0.66 - 1.49   0.96
Ordered categorical data – Test for trend
Null hypothesis: No linear trend between groups
Chi-squared test for trend: χ² = 10.8, p = 0.001
There is a linear trend between groups
Treatment (columns) by Toxicity (rows):
N (%)          1          2            Row total
Mild           17 (20%)   32 (38.5%)   49
Moderate       29 (35%)   32 (38.5%)   61
Severe         38 (45%)   19 (23%)     57
Column total   84         83           167
Ordered categorical data – Test for trend (>2 rows & columns)
Null hypothesis: No linear trend between rows and columns
Chi-squared test for trend: χ² = 7.1, p = 0.008
There is a linear trend between rows & columns
Treatment dose (columns) by Toxicity (rows):
N (%)          1mg        2mg          3mg        Row total
Mild           30 (36%)   19 (23%)     18 (22%)   67
Moderate       31 (37%)   32 (38.5%)   27 (33%)   90
Severe         22 (27%)   32 (38.5%)   37 (45%)   91
Column total   83         83           82         248
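The slides do not say which trend statistic was used. One common choice is the linear-by-linear (Mantel-Haenszel) chi-squared, (N - 1)r², where r is the correlation between row and column scores. The sketch below assumes that statistic with equally spaced scores 1, 2, 3, ...; under those assumptions it agrees with both trend slides to rounding.

```python
import math

def linear_trend_chi2(table, row_scores=None, col_scores=None):
    """Linear-by-linear association chi-squared (1 df) for an ordered table.
    Scores default to 1, 2, 3, ... for rows and columns (an assumption)."""
    n_rows, n_cols = len(table), len(table[0])
    x = row_scores or list(range(1, n_rows + 1))
    y = col_scores or list(range(1, n_cols + 1))
    n = sum(sum(row) for row in table)
    sum_x = sum_y = sum_xx = sum_yy = sum_xy = 0.0
    for i in range(n_rows):
        for j in range(n_cols):
            count = table[i][j]
            sum_x += count * x[i]
            sum_y += count * y[j]
            sum_xx += count * x[i] ** 2
            sum_yy += count * y[j] ** 2
            sum_xy += count * x[i] * y[j]
    # Corrected sums of squares and cross-products
    sxx = sum_xx - sum_x ** 2 / n
    syy = sum_yy - sum_y ** 2 / n
    sxy = sum_xy - sum_x * sum_y / n
    chi2 = (n - 1) * sxy ** 2 / (sxx * syy)
    # Upper tail of chi-squared on 1 df
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p

two_arm = [[17, 32], [29, 32], [38, 19]]                  # toxicity by treatment
three_dose = [[30, 19, 18], [31, 32, 27], [22, 32, 37]]   # toxicity by dose
for name, tab in (("2 treatments", two_arm), ("3 doses", three_dose)):
    chi2, p = linear_trend_chi2(tab)
    print(f"{name}: chi2 = {chi2:.2f}, p = {p:.3f}")
```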
Ordered categorical data – Measure of agreement
A measure of agreement between reviewers, above that expected by chance
Weighted kappa: κ = 0.38 (95% CI 0.27-0.49)
There is fair agreement between reviewers
Reviewer 1 (columns) by Reviewer 2 (rows):
               CR   PR   SD   Row total
CR             30   12   8    50
PR             17   32   20   69
SD             5    22   37   64
Column total   52   66   65   183
Altman guidelines: <0.20 poor; 0.21 - 0.40 fair; 0.41 - 0.60 moderate; 0.61 - 0.80 good; 0.81 - 1.00 very good
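Weighted kappa gives partial credit when reviewers disagree by only one category. The slide does not state the weighting scheme; the sketch below assumes linear weights, which agree with the quoted 0.38 to rounding. With no partial credit the same function gives ordinary kappa for this table.

```python
def weighted_kappa(table, weights="linear"):
    """Weighted kappa for a square table of counts with ordered categories.
    weights: "linear", "quadratic", or "none" (ordinary kappa)."""
    k = len(table)
    n = sum(sum(row) for row in table)
    row_tot = [sum(row) for row in table]
    col_tot = [sum(table[i][j] for i in range(k)) for j in range(k)]

    def w(i, j):
        # Agreement weight: 1 on the diagonal, shrinking with distance
        dist = abs(i - j) / (k - 1)
        if weights == "linear":
            return 1 - dist
        if weights == "quadratic":
            return 1 - dist ** 2
        return 1.0 if i == j else 0.0

    cells = [(i, j) for i in range(k) for j in range(k)]
    p_obs = sum(w(i, j) * table[i][j] for i, j in cells) / n
    p_exp = sum(w(i, j) * row_tot[i] * col_tot[j] for i, j in cells) / n ** 2
    return (p_obs - p_exp) / (1 - p_exp)

ratings = [[30, 12, 8], [17, 32, 20], [5, 22, 37]]  # CR / PR / SD table
print(f"linear weights: {weighted_kappa(ratings):.2f}")
print(f"unweighted:     {weighted_kappa(ratings, 'none'):.2f}")
```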
Non-ordered categorical data - Test of association
Null hypothesis: The 2 factors are independent
Chi-squared test: χ² = 0.51, p = 0.78
No evidence that treatment and disease site are associated
Treatment (columns) by Disease site (rows):
N (%)          1          2          Row total
Head & Neck    26 (23%)   29 (26%)   55
Limbs          32 (28%)   33 (30%)   65
Body           55 (49%)   49 (44%)   104
Column total   113        111        224
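For a table with more than 2 unordered levels, the chi-squared statistic sums (observed - expected)² / expected over all cells. A minimal sketch follows; the p-value shortcut used here is valid only for 2 degrees of freedom, which is what a 3x2 table has.

```python
import math

def pearson_chi2(table):
    """Pearson chi-squared statistic and degrees of freedom for an r x c table."""
    n_rows, n_cols = len(table), len(table[0])
    n = sum(sum(row) for row in table)
    row_tot = [sum(row) for row in table]
    col_tot = [sum(table[i][j] for i in range(n_rows)) for j in range(n_cols)]
    chi2 = 0.0
    for i in range(n_rows):
        for j in range(n_cols):
            # Expected count if rows and columns were independent
            expected = row_tot[i] * col_tot[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected
    return chi2, (n_rows - 1) * (n_cols - 1)

site_by_treatment = [[26, 29], [32, 33], [55, 49]]
chi2, df = pearson_chi2(site_by_treatment)
# For exactly 2 degrees of freedom the upper tail is exp(-chi2 / 2)
p = math.exp(-chi2 / 2)
print(f"chi2 = {chi2:.2f} on {df} df, p = {p:.2f}")
```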
Non-ordered categorical data – Measure of agreement
A measure of agreement between reviewers, above that expected by chance
Kappa: κ = 0.31 (95% CI 0.20-0.42)
There is fair agreement between reviewers
Reviewer 1 (columns) by Reviewer 2 (rows):
               A    B    C    Row total
A              30   12   8    50
B              17   32   20   69
C              5    22   37   64
Column total   52   66   65   183
Altman guidelines: <0.20 poor; 0.21 - 0.40 fair; 0.41 - 0.60 moderate; 0.61 - 0.80 good; 0.81 - 1.00 very good
Categorical data – RECAP
Levels             Test of association                                       Measure of agreement   Exploring patterns in the data
2                  χ² test with continuity correction; Fisher’s exact test   Kappa                  Odds Ratio & Relative Risk; Logistic regression
>2 (ordered)       χ² test for trend                                         Weighted kappa         Not covered
>2 (non-ordered)   χ² test                                                   Kappa                  Not covered
Normally distributed data
• Data forms a bell-shaped curve
• Non-significant Shapiro-Wilk test result
[Figure: Mean & Standard Deviation graph. x-axis: Treatments; y-axis: Change over time in QOL (%)]
Parametric tests
• Differences between means of 2 groups
  – T-tests
• Differences between means of >2 groups
  – ANOVA
  – Linear regression
• Correlation
  – Pearson’s correlation coefficient, r
Non-normally distributed data
Box and Whisker graphs
• Outliers (observations that lie beyond the whiskers, e.g. outside the central 95% of the data) are sometimes plotted individually
Box and Whisker graphs
• Parallel box plots show the differences between groups
Non-parametric tests
• Differences between medians of 2 groups
  – Wilcoxon rank sum test
• Differences between medians of >2 groups
  – Kruskal-Wallis 1-way analysis of variance test
• Correlation
  – Spearman’s rank order correlation coefficient, r
Transforming data
• Can transform non-normally distributed data (e.g. logarithm, square root, reciprocal) to create approximately normally distributed data
• Then analyse transformed data using parametric methods
Time-to-event data
• Why is this different to other continuous data?
  – Censoring
[Figure: follow-up timelines for six patients. KEY: Randomisation date, Date of event, Censor date]
TNO    1     2    3    4    5    6
Time   20*   8    8*   14   1*   16*
What time? What event?
• Start date?
  – Diagnosis
  – Surgery
  – Randomisation
  – Start/End of treatment
• Event?
  – Onset / worsening of pain
  – Hospital discharge
  – Death (OS)
  – Relapse (RFI/DFI/Plateau)
  – Relapse or death (RFS/DFS)
You need to know what you’re looking at to know how to interpret it / what to compare it to
Time-to-event data analysis (‘Survival Analysis’)
• Can be used to measure time to any event
  – Arthritic joint remaining pain-free post steroid injections
  – Elderly patient with a fractured hip remaining in hospital
• Calculate ‘survival’ time for each patient (some may be censored times)
  – Recruitment takes place over time so varying lengths of follow-up are expected
• Rank these times and calculate proportions alive at certain points, with due allowance for incomplete follow-up
• These proportions and times are plotted and overall distributions of curves compared
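The ranking-and-proportions step described above is the Kaplan-Meier (product-limit) calculation. The sketch below applies it to the six patient times from the censoring slide, assuming that an asterisk marks a censored time and that a patient censored at the same time as an event still counts as at risk for that event.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier estimate: list of (time, proportion surviving) at each
    distinct event time. events[i] is True for an event, False if censored."""
    data = list(zip(times, events))
    surviving = 1.0
    curve = []
    for t in sorted({ti for ti, had_event in data if had_event}):
        # Everyone still under follow-up just before time t is at risk
        at_risk = sum(1 for ti, _ in data if ti >= t)
        n_events = sum(1 for ti, had_event in data if ti == t and had_event)
        # Censored patients leave the risk set but never drop the curve
        surviving *= 1 - n_events / at_risk
        curve.append((t, surviving))
    return curve

# Patients 1 to 6 from the slide: 20*, 8, 8*, 14, 1*, 16* (* assumed censored)
times = [20, 8, 8, 14, 1, 16]
events = [False, True, False, True, False, False]
for t, s in kaplan_meier(times, events):
    print(f"time {t}: {100 * s:.0f}% surviving")
```

The curve steps down only at times 8 and 14. The patient censored at time 1 never enters a risk set, which is how incomplete follow-up is allowed for.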
Kaplan-Meier Curves
Median survival = 1.3 years
Minimum & median FU indicate the maturity of the data
Kaplan-Meier Curves
[Figure: Kaplan-Meier curves for ECMF and CMF. x-axis: Years from Surgery (0 to 10); y-axis: % surviving (0 to 100). Annotated survival rates: 84% and 78%]
Numbers at Risk (years 0 to 10):
ECMF   1189   1171   1120   1073   1020   965   826   606   380   196   53
CMF    1202   1178   1099   1024   957    888   759   564   352   176   55
Undesirable comparisons of survival rates
Statistical tests for time-to-event data
• Log-rank tests compare the overall distributions of the curves (χ² and p-value presented)
  – Null hypothesis: all curves are samples from populations with the same risk of the event
  – Compares the number of deaths observed on each treatment arm with the number expected under the null hypothesis that the 2 survival distributions are identical
• Cox proportional hazards model (Hazard Ratio, 95% CIs and p-value presented)
  – Identifies which variables from a group of several are independently related to survival
  – In what order of importance
  – Gives you a measure of their relation to survival
Forest plots
[Figure: forest plot of Hazard Ratio & CI (ECMF : CMF). x-axis: 0.0 to 2.0; left of 1.0 = ECMF better, right of 1.0 = CMF better. Key: 95% CI, 99% CI. Plot dated 28 Jan 2008 and marked NOT FOR PUBLICATION OR CITATION]
                     Deaths/Patients                       ECMF events
                     ECMF               CMF                (O-E)    Var     HR & CI (ECMF : CMF)
Trial
  NEAT               231/1009 (22.9%)   278/1012 (27.5%)   -29.7    127.1   0.79 (0.63, 0.99)
  BR9601             47/180 (26.1%)     70/190 (36.8%)     -12.4    29.2    0.65 (0.41, 1.05)
  Subtotal           278/1189 (23.4%)   348/1202 (29.0%)   -42.1    156.4   0.76 (0.65, 0.89) (P<.001)
  Interaction between 2 groups: χ² (1 df) = 0.9; P=.35
Age
  <=50               151/713 (21.2%)    197/699 (28.2%)    -30.4    86.8    0.70 (0.53, 0.93)
  >50                127/476 (26.7%)    151/503 (30.0%)    -11.3    69.5    0.85 (0.62, 1.16)
  Subtotal           278/1189 (23.4%)   348/1202 (29.0%)   -41.8    156.3   0.77 (0.65, 0.90) (P<.001)
  Interaction between 2 groups: χ² (1 df) = 1.4; P=.24
Menopausal Status
  Pre/Peri           156/675 (23.1%)    186/679 (27.4%)    -18.5    85.4    0.81 (0.61, 1.06)
  Post               109/444 (24.5%)    149/467 (31.9%)    -21.9    64.4    0.71 (0.52, 0.98)
  Unknown            13/70 (18.6%)      13/56 (23.2%)      -1.5     6.4     0.79 (0.28, 2.18)
  Subtotal           278/1189 (23.4%)   348/1202 (29.0%)   -41.9    156.3   0.76 (0.65, 0.89) (P<.001)
  Heterogeneity between 3 groups: χ² (2 df) = 0.6; P=.76
Performance Status
  0                  189/837 (22.6%)    226/834 (27.1%)    -25.2    103.6   0.78 (0.61, 1.01)
  1/2                54/202 (26.7%)     76/217 (35.0%)     -9.6     32.5    0.74 (0.47, 1.17)
  Unknown            35/150 (23.3%)     46/151 (30.5%)     -7.4     20.1    0.69 (0.39, 1.23)
  Subtotal           278/1189 (23.4%)   348/1202 (29.0%)   -42.2    156.1   0.76 (0.65, 0.89) (P<.001)
  Heterogeneity between 3 groups: χ² (2 df) = 0.3; P=.86
Surgery
  Mastectomy         163/615 (26.5%)    217/634 (34.2%)    -30.5    94.9    0.73 (0.56, 0.94)
  BCS                113/569 (19.9%)    130/563 (23.1%)    -12.0    60.7    0.82 (0.59, 1.14)
  Subtotal           276/1184 (23.3%)   347/1197 (29.0%)   -42.5    155.6   0.76 (0.65, 0.89) (P<.001)
  Interaction between 2 groups: χ² (1 df) = 0.6; P=.45
Unstratified         278/1189 (23.4%)   348/1202 (29.0%)   -42.3    156.4   0.76 (0.65, 0.89) (P<.001)
[Bars=95% confidence interval. Size of boxes can represent sample size]
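The slides do not state how each hazard ratio was derived from the (O-E) and Var columns. A common approach in plots of this style is the Peto one-step estimate, HR = exp((O-E)/V), with interval exp((O-E)/V ± z/√V). The sketch below assumes that method. Under it, the trial subtotal row matches a 95% interval (z = 1.96), while the NEAT row appears to match a 99% interval (z = 2.576).

```python
import math

def peto_hazard_ratio(o_minus_e, var, z=1.96):
    """One-step (Peto) hazard ratio and confidence interval from the
    log-rank observed-minus-expected events and its variance."""
    log_hr = o_minus_e / var
    half_width = z / math.sqrt(var)
    return (math.exp(log_hr),
            math.exp(log_hr - half_width),
            math.exp(log_hr + half_width))

# Trial subtotal row: O-E = -42.1, Var = 156.4, assumed 95% interval
print("Subtotal: %.2f (%.2f, %.2f)" % peto_hazard_ratio(-42.1, 156.4))
# NEAT row: O-E = -29.7, Var = 127.1, assumed 99% interval
print("NEAT:     %.2f (%.2f, %.2f)" % peto_hazard_ratio(-29.7, 127.1, z=2.576))
```

A hazard ratio below 1 with an interval that excludes 1 favours ECMF, which is what the subtotal row shows.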
Longitudinal data analysis
• A variable can be measured on the same patient over time (e.g. Baseline, 3 month, 6 month …)
• Can be any type of data (categorical, continuous)
Longitudinal data analysis – Summary Measures
[Figure: Change from Baseline in Global QOL. y-axis runs from Deterioration to Improvement. Legends: CMF, ECMF; TRT A, TRT B]
Change at 1 year (p=0.01)
Change at 2 years (p=0.06)
Longitudinal data analysis – Modelling
[Figure: Pulmonary function (TLCO score) over time. y-axis: TLCO score (mmol/min/kPa)]
Graphs show each patient as a separate line
Solid line = Trt A pts
Dashed line = Trt B pts
Random effects modelling predicts the average patient score on each treatment arm
Cluster Randomised Trial data
• Patients within 1 cluster are often more likely to respond in a similar manner, and thus cannot be assumed to act independently
• ICC = Intracluster Correlation Coefficient. A statistical measure of this dependence
  – Takes values between 0 and 1
  – Higher values = greater between-cluster variation, e.g. management within sites is consistent but, across different sites, there is wide variation
• Analysis must incorporate the effects of clustering, i.e. the values of the ICC and design effect
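The design effect links the ICC to the loss of information caused by clustering. With equal cluster sizes m, the standard formula is 1 + (m - 1) × ICC. The cluster size, ICC and sample size below are hypothetical values chosen for illustration, not figures from the talk.

```python
def design_effect(cluster_size, icc):
    """Design effect for equal-sized clusters: 1 + (m - 1) * ICC."""
    return 1 + (cluster_size - 1) * icc

def effective_sample_size(n_total, cluster_size, icc):
    """Number of independent patients the clustered sample is worth."""
    return n_total / design_effect(cluster_size, icc)

# Hypothetical trial: 400 patients in clusters of 20, ICC = 0.05
deff = design_effect(20, 0.05)
n_eff = effective_sample_size(400, 20, 0.05)
print(f"design effect = {deff:.2f}, effective sample size = {n_eff:.0f}")
```

Even a small ICC matters when clusters are large: here the 400 clustered patients carry roughly the information of 205 independent ones.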
Useful References
• Gore & Altman – Statistics in Practice
• Bland - An Introduction to Medical Statistics
• Altman - Practical Statistics for Medical Research
• Peto et al - Design and Analysis of Randomized Clinical-Trials Requiring Prolonged Observation of each patient
– 1/ Introduction and Design. British Journal of Cancer 1976. 34(6) 585-612
– 2/ Analysis and Examples. British Journal of Cancer 1977. 35(1) 1-39