An Empirical Study of Bias in
Randomized Controlled Trials and
Non-randomized Studies of Surgical Interventions
by
Lakhbir Sandhu
A thesis submitted in conformity with the requirements for the degree of Doctor of Philosophy in Clinical Epidemiology
Institute of Health Policy, Management & Evaluation University of Toronto
© Copyright by Lakhbir Sandhu, 2013
Abstract
Objectives: The aim of this dissertation was to examine bias in randomized controlled trials
(RCTs) and non-randomized studies (NRS) in surgery using the literature evaluating
laparoscopy and conventional (i.e. open) surgery for the treatment of colon cancer as a case
study. The objectives were 1) to develop a conceptual framework for bias in comparative
NRS; 2) to compare effect estimates from NRS with those from RCTs at low risk of bias; and
3) to explore the impact of NRS-design attributes on estimates of treatment effect.
Methods: The methods included a modified framework synthesis, systematic review of the
literature, random-effects meta-analyses, and frequentist and Bayesian meta-regression. The
Cochrane Risk of Bias Tool was used to classify trials as Strong RCTs (i.e. low risk of bias)
or Typical RCTs (i.e. unclear or high risk of bias).
Results: A conceptual framework for bias in comparative NRS was developed and it
contains 37 individual sources of bias or “items”. These items were organized within 6
overarching “domains”: selection bias, information bias, performance bias, detection bias,
attrition bias, and selective reporting bias. Our analyses revealed that NRS were associated
with more extreme estimates of benefit for laparoscopy than Strong RCTs when examining
subjective outcomes. The odds ratios from NRS were 36% smaller (i.e. demonstrating more
benefit for laparoscopy) than those from Strong RCTs for the outcome post-operative
complications (ratio of odds ratios [ROR] 0.64, [0.42, 0.97], p=0.04). A similarly exaggerated
benefit was seen among NRS when assessing length of stay (difference in mean
differences [DMD] -2.15 days, [-4.08, -0.21], p=0.03). This pattern was not observed with the
objective outcomes peri-operative mortality and number of lymph nodes harvested. Analyses
adjusted for period effects and between-study case-mix yielded similar findings. Finally,
effect estimates in NRS did not consistently vary according to the presence or absence of
nine design characteristics identified from the conceptual framework.
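The "36% smaller" figure follows directly from the ratio of odds ratios. As a rough illustration only, the sketch below shows that arithmetic and checks that the reported interval and p-value are mutually consistent. It assumes the bracketed interval is a 95% interval that is symmetric on the log scale (the abstract does not state this), and the two input odds ratios are hypothetical values chosen only so that their ratio is 0.64; they are not results from the thesis.

```python
import math


def ratio_of_odds_ratios(or_nrs: float, or_rct: float) -> float:
    """ROR < 1 means the NRS odds ratio is smaller than the RCT odds ratio."""
    return or_nrs / or_rct


def percent_smaller(ror: float) -> float:
    """Express an ROR below 1 as a percentage reduction in the odds ratio."""
    return (1.0 - ror) * 100.0


def two_sided_p_from_interval(estimate: float, lower: float, upper: float,
                              z_crit: float = 1.96) -> float:
    """Approximate two-sided p-value for a ratio measure.

    ASSUMPTION: (lower, upper) is a 95% normal-theory interval on the log scale.
    """
    se = (math.log(upper) - math.log(lower)) / (2.0 * z_crit)
    z = math.log(estimate) / se
    # Two-sided normal tail probability: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2))
    return math.erfc(abs(z) / math.sqrt(2.0))


# Hypothetical pooled odds ratios (illustration only, not thesis data).
ror = ratio_of_odds_ratios(or_nrs=0.48, or_rct=0.75)
print(f"ROR = {ror:.2f}, i.e. about {percent_smaller(ror):.0f}% smaller")

# Values reported in the abstract for post-operative complications.
p = two_sided_p_from_interval(0.64, 0.42, 0.97)
print(f"approximate p = {p:.3f}")
```

Under these assumptions the approximate p-value lands close to the reported 0.04; an exact match is not expected, because the thesis may have used a different interval method.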
Conclusions: We have demonstrated that the results of surgical NRS can be significantly
biased compared with those of RCTs at low risk of bias when evaluating subjective
outcomes. However, none of the nine NRS-design characteristics examined was consistently
associated with biased effect estimates.
Acknowledgments
I would like to express my gratitude to members of my thesis committee for their insight,
support, and guidance throughout this endeavor. Drs. Erin Kennedy and Nancy Baxter, your
feedback has been invaluable. Dr. George Tomlinson, you have spent countless hours
teaching me the finer points of quantitative analysis and simultaneously shared your
enthusiasm for science and discovery. My research supervisor, Dr. David Urbach, has
provided especially unwavering support and encouragement — this dissertation work would
not have been possible without your expertise.
I am thankful to the Surgeon Scientist Training Program, the Division of General Surgery
and the Department of Surgery for supporting my dissertation work and career. In particular,
I would like to thank Drs. Najma Ahmed, Andrew Smith, Zane Cohen, Lorne Rotstein and
James Rutka for their direction.
Thank you to my colleagues and friends who have supported me every step of the way:
Dr. Barbara Haas, Dr. Robert Bisken, Dr. Claire Trottier, Aron Klein, Dr. Lorie Kloda,
Dr. Anna Shawyer, Dr. Anna Bendzak, Dr. Marvin Hsiao, Dr. Charles de Mestral,
Dr. Boris Zevin, Dr. Anusha Jegatheeswaran, Dr. Sapna Rawal, Marina Englesakis and
Dr. Anna Gagliardi. A very special thank you to Suna Girn for her encouragement. Finally,
thank you to my family for their love and encouragement.
Funding
This research would not have been possible without the generous support of the Surgeon
Scientist Training Program, the Department of Surgery, Division of General Surgery and
Faculty of Medicine at the University of Toronto. This work was also supported by the
Post-MD Fellowship Program of the National Cancer Institute of Canada/Canadian Cancer
Society (Grant # 20019) and the Johnson & Johnson Medical Products/Surgeon Scientist
Training Program Fellowship.
I would also like to express my sincere gratitude to Val Cabral and Nancy Condo for their
guidance with identifying funding opportunities.
Table of Contents
Acknowledgments
Funding
Table of Contents
List of Abbreviations
List of Tables
List of Figures
List of Appendices
Thesis Overview
Chapter 1 Literature Review
  1.1 The hierarchy of study design
    1.1.1 The limitations of RCTs
  1.2 The infrequency of surgical trials
  1.3 Barriers to the conduct of surgical trials
    1.3.1 Issues with patient and physician accrual
    1.3.2 Funding surgical research
  1.4 Challenges in surgical trials
    1.4.1 Blinding in surgical RCTs
    1.4.2 Standardizing surgical technique
  1.5 The Balliol Collaboration
  1.6 Do NRS and RCTs yield comparable results?
    1.6.1 Empirical comparisons of effect estimates from NRS and RCTs
  1.7 Study characteristics of RCTs associated with bias
    1.7.1 The methodological shortcomings of surgical RCTs
  1.8 Study characteristics of NRS associated with bias
  1.9 Summary of gaps in knowledge
  1.10 Dissertation rationale
  1.11 Research aims
Chapter 2 Laparoscopic colon surgery – an opportunity to study bias
Chapter 3 Development of a conceptual framework for bias in non-randomized studies: results of a modified framework synthesis
  3.1 Summary
  3.2 Introduction
  3.3 Methods
    3.3.1 Search strategy
    3.3.2 Data collection
    3.3.3 Analytic approach
    3.3.4 Framework refinement
  3.4 Results
    3.4.1 Included studies
    3.4.2 Conceptual framework
    3.4.3 Excluded items
  3.5 Discussion
  3.6 Conclusion
Chapter 4 Common Methods for Chapters 5 & 6
  4.1 Overview
  4.2 Literature search
  4.3 Data abstraction and management
  4.4 Categorizing studies as RCTs or NRS
  4.5 Outcome selection and definition
    4.5.1 Subjective versus objective outcomes
    4.5.2 Summary effect measures
  4.6 Handling multiple publications of the same cohort
  4.7 Approach to missing data for continuous outcomes
  4.8 Identifying a referent group – Strong RCTs
    4.8.1 Why categorize RCTs as Typical versus Strong?
    4.8.2 Cochrane Risk of Bias Tool
    4.8.3 Validating risk of bias assessments
  4.9 Statistical analyses
  4.10 Results
    4.10.1 Data cohort
    4.10.2 Strong RCTs
  4.11 Risk of bias assessment summary
Chapter 5 Comparing effect estimates from non-randomized studies and randomized controlled trials
  5.1 Summary
  5.2 Introduction
  5.3 Methods
    5.3.1 Statistical analyses
  5.4 Results
    5.4.1 Included studies
    5.4.2 Binary outcomes
    5.4.3 Continuous outcomes
  5.5 Discussion
  5.6 Conclusion
Chapter 6 Empirically identifying the study attributes of non-randomized studies associated with bias: a meta-epidemiology study
  6.1 Summary
  6.2 Introduction
  6.3 Methods
    6.3.1 Included studies
    6.3.2 NRS study characteristics
    6.3.3 Statistical analyses
  6.4 Results
    6.4.1 Included studies
    6.4.2 Subjective outcomes
    6.4.3 Objective outcomes
  6.5 Discussion
  6.6 Conclusion
Chapter 7 General Discussion and Future Directions
  7.1 Summary of findings
  7.2 Implications
    7.2.1 Implications for the meta-analysis of surgical RCTs
    7.2.2 Implications for the interpretation of surgical NRS
    7.2.3 Implications for future meta-epidemiological studies of NRS study characteristics
  7.3 Limitations
    7.3.1 Limitations of available data
    7.3.2 Limitations of data analysis
    7.3.3 Limitations of generalizability
  7.4 Future Directions
    7.4.1 Outcome Reporting in NRS and RCTs
    7.4.2 Investigating the relationship between reporting and actual RCT quality
    7.4.3 Ongoing evaluations of NRS study characteristics
  7.5 Conclusions
References
Appendix A
Appendix B
Appendix C
Appendix D
Appendix E
Appendix F
Appendix G
List of Abbreviations
ARR – absolute risk reduction
DMD – difference in mean differences
DVT – deep vein thrombosis
IQR – interquartile range
LAP – laparoscopy
LOS – length of stay
MD – mean difference
NRS – non-randomized studies
OPEN – open (i.e. conventional) surgery
OR – odds ratio
RCTs – randomized controlled trials
RoM – ratio of means
ROR – ratio of odds ratios
RD – risk difference
RR – relative risk
SID – study identification number
SMD – standardized mean difference
List of Tables
Table 1.1 Oxford Centre for Evidence-based Medicine – Levels of Evidence for studies of therapy/prevention/aetiology/harm (2009)
Table 1.2 Results of meta-analyses of NRS and RCTs appearing in Concato et al.
Table 1.3 Meta-epidemiological studies of RCT-study attributes
Table 1.4 Meta-analyses of meta-epidemiological studies
Table 3.1 Definitions of key constructs
Table 3.2 Characteristics of included studies
Table 3.3 Bias domains extracted from systematic reviews of quality assessment tools for NRS
Table 3.4 Bias domains in the conceptual framework
Table 3.5 Frequency of included items
Table 3.6 Items abstracted from reviews but not related to bias
Table 4.1 Definitions for abstracted variables
Table 4.2 Cochrane Risk of Bias Tool
Table 4.3 Approach for summary assessments of risk of bias for an item, within a study and within a meta-analysis
Table 4.4 Characteristics of included studies
Table 4.5 Non-randomized studies meeting inclusion criteria
Table 4.6 Randomized controlled trials meeting inclusion criteria
Table 4.7 Summary of risk of bias item responses for RCTs reporting post-operative complications
Table 4.8 Summary of risk of bias item responses for RCTs reporting peri-operative mortality
Table 4.9 Summary of risk of bias item responses for RCTs reporting peri-operative mortality
Table 4.10 Summary of risk of bias item responses for RCTs reporting number of lymph nodes harvested
Table 5.1 Characteristics of included studies
Table 5.2 Random-effects meta-analysis results for studies reporting post-operative complications
Table 5.3 Results of Strong Randomized Controlled Trials
Table 5.4 Meta-regression results comparing effect estimates for post-operative complications from different study designs
Table 5.5 Random-effects meta-analysis results for studies reporting peri-operative mortality
Table 5.6 Meta-regression results comparing effect estimates for peri-operative mortality from different study designs
Table 5.7 Random-effects meta-analysis results for studies reporting length of stay (days)
Table 5.8 Meta-regression results comparing effect estimates for length of stay from different study designs
Table 5.9 Random-effects meta-analysis results for studies reporting number of lymph nodes harvested
Table 5.10 Meta-regression results comparing effect estimates for number of lymph nodes harvested from different study designs
Table 5.11 Median year of publication and baseline event rates in studies reporting the outcomes of interest
Table 5.12 Bayesian meta-regression results comparing effect estimates for post-operative complications from different study designs, adjusted for year of publication and baseline event rate
Table 5.13 Bayesian meta-regression results comparing effect estimates for peri-operative mortality from different study designs, adjusted for year of publication and baseline event rate
Table 5.14 Bayesian meta-regression results comparing effect estimates for length of stay from different study designs, adjusted for year of publication and baseline event rate
Table 5.15 Bayesian meta-regression results comparing effect estimates for number of lymph nodes harvested from different study designs, adjusted for year of publication and baseline event rate
Table 6.1 NRS study characteristics – definitions and relationship to the conceptual framework for bias in NRS
Table 6.2 Measures of inter-rater agreement
Table 6.3 Characteristics of included studies
Table 6.4 Distribution of study attributes among NRS reporting post-operative complications (n=79)
Table 6.5 Study characteristics patterns across NRS reporting post-operative complications (n=79 studies)
Table 6.6 Random-effects meta-analyses results among NRS reporting post-operative complications (n=79)
Table 6.7 Univariable meta-regression results among NRS reporting post-operative complications
Table 6.8 Random-effects meta-analysis results across outcomes of interest and different study designs
Table 6.9 Univariable meta-regression results comparing NRS with or without study characteristics with Strong RCTs
Table 6.10 Distribution of study attributes among NRS reporting length of stay (n=106)
Table 6.11 Study characteristics patterns across NRS reporting length of stay (n=106 studies)
Table 6.12 Random-effects meta-analysis results among NRS reporting length of stay (n=106)
Table 6.13 Univariable meta-regression results among NRS reporting length of stay
Table 6.14 Univariable meta-regression results comparing NRS with or without study characteristics with Strong RCTs
Table 6.15 Distribution of study attributes among NRS reporting peri-operative mortality (n=79)
Table 6.16 Study characteristics patterns across NRS reporting peri-operative mortality (n=79 studies)
Table 6.17 Random-effects meta-analysis results among NRS reporting peri-operative mortality (n=79)
Table 6.18 Univariable meta-regression results among NRS reporting peri-operative mortality
Table 6.19 Univariable meta-regression results comparing NRS with or without study characteristics with Strong RCTs
Table 6.20 Distribution of study attributes among NRS reporting number of lymph nodes harvested (n=59)
Table 6.21 Study characteristics patterns across NRS reporting number of lymph nodes harvested (n=59 studies)
Table 6.22 Random-effects meta-analysis results among NRS reporting number of lymph nodes (n=59)
Table 6.23 Univariable meta-regression results among NRS reporting number of lymph nodes harvested
Table 6.24 Univariable meta-regression results comparing NRS with or without study characteristics with Strong RCTs
List of Figures
Figure 1.1 Results from Wood et al.
Figure 3.1 Flow diagram of included studies
Figure 3.2 Conceptual framework for bias in non-randomized studies
Figure 4.1 Database Structure
Figure 4.2 Flow diagram for the identification of eligible studies
Figure 5.1 Relationship between observed data, true study effects and the common treatment effect in fixed and random-effects meta-analysis
Figure 5.2 Relationship between the overall true effect (µ), the true effect in a given study (θ) and the observed effect (Yi)
Figure 5.3 Forest plot of meta-analysis results for studies reporting post-operative complications
Figure 5.4 Forest plot of ratios of odds ratios (ROR) from meta-regression analysis comparing study designs
Figure 5.5 Forest plot of meta-analysis results for studies reporting peri-operative mortality
Figure 5.6 Forest plot of ratios of odds ratios (ROR) from meta-regression analysis comparing study designs
Figure 5.7 Forest plot of meta-analysis results for studies reporting length of stay
Figure 5.8 Forest plot of difference in mean differences (DMD) from meta-regression analysis comparing study designs
Figure 5.9 Forest plot of meta-analysis results for studies reporting number of lymph nodes harvested
Figure 5.10 Forest plot of difference in mean differences (DMD) from meta-regression analysis comparing study designs
Figure 5.11 Baseline event rates over time
Figure 5.12 Baseline event rates over time
Figure 5.13 Funnel plots
Figure 6.1 Forest plot of meta-analysis results, stratified according to the presence or absence of specific NRS study characteristics for the outcome post-operative complications
Figure 6.2 Forest plot of meta-analysis results, stratified according to the presence or absence of specific NRS study characteristics for the outcome length of stay
Figure 6.3 Forest plot of meta-analysis results, stratified according to the presence or absence of specific NRS study characteristics for the outcome peri-operative mortality
Figure 6.4 Forest plot of meta-analysis results, stratified according to the presence or absence of specific NRS study characteristics for the outcome number of lymph nodes harvested
List of Appendices
Appendix A – Literature Search Strategy for the Development of a Conceptual Framework of Bias in Non-Randomised Studies
Appendix B – Literature Search Strategy for the Identification of Comparative Studies Evaluating Laparoscopy Versus Conventional Surgery For Colon Cancer
Appendix C – The Cochrane Risk of Bias Tool
Appendix D – Studies of Laparoscopy versus Conventional Surgery for Colon Cancer Meeting a priori Exclusion Criteria
Appendix E – Comparative Studies of Laparoscopy versus Conventional Surgery for Colon Cancer Meeting a priori Inclusion Criteria
Appendix F – Bayesian Models
Appendix G – Bayesian Meta-Analysis Results
Thesis Overview
Chapter 1. Literature Review
The merits of randomized controlled trials (RCTs) and non-randomized studies (NRS) are reviewed.
The challenges of conducting surgical RCTs are discussed.
The literature comparing effect estimates from NRS and RCTs is described. The limitations
of these comparisons are explored. Since significant strides have been made in identifying
the attributes of study design associated with bias among RCTs, I propose that similar studies
are required for NRS.
Chapter 2. Laparoscopic colon surgery – an opportunity to study bias
The proposed case study of bias required an abundance of both NRS and RCTs evaluating a
single surgical intervention. Studies examining the surgical treatment of colon cancer met
these criteria. In this Chapter, I review the history of laparoscopic colon surgery and explain
how case-reports of port site metastases led to controversy in the surgical community. This in
turn led to numerous high-quality, multi-national, publicly funded RCTs in the area. There
are few surgical interventions that have been as thoroughly studied. The literature in this area
was used to conduct the study of bias in this dissertation.
Chapter 3. A conceptual framework for bias in non-randomized studies:
results from a modified framework synthesis
As there is no comprehensive framework for bias in NRS, I conducted a modified framework
synthesis to develop one. Sources of bias were extracted from systematic reviews of quality
assessment tools for NRS. These sources of bias were analyzed thematically and organized
into a framework for bias in comparative NRS.
Chapter 4. Common Methods for Chapters 5 & 6
A common data set is used for the analyses in Chapters 5 and 6. Chapter 4 outlines the
literature search strategy, inclusion/exclusion criteria and approaches to data abstraction and
outcome selection. The Cochrane Risk of Bias Tool was employed to categorize RCTs as
“Strong” (i.e. at low risk of bias) versus “Typical” (i.e. unclear and high risk of bias). The
strengths and weaknesses of this approach are described.
Chapter 5. Comparing effect estimates from non-randomized studies
and randomized controlled trials
Combined effect estimates from NRS were compared with those from i) all RCTs, ii) Typical
RCTs and iii) Strong RCTs evaluating laparoscopic and open colon surgery. The impact of
period effects and between-study case-mix were explored using Bayesian meta-regression
methods.
Chapter 6. Empirically identifying the study attributes of non-randomized
studies associated with bias: a meta-epidemiology study
Using the conceptual framework developed in Chapter 3, a meta-epidemiology study was
conducted to examine the relationship between NRS-design characteristics and effect
estimates. Effect estimates were compared across NRS with and without specific design
characteristics. These estimates were in turn compared with the results of Strong RCTs.
Chapter 7. General Discussion and Future Directions
In this chapter, I summarize the main findings of this dissertation. The methodological
strengths and limitations of this work are described. The implications of this dissertation are
described and opportunities for future research are explored.
Chapter 1 Literature Review
1.1 The hierarchy of study design
The evidence from randomized controlled trials (RCTs) is the gold-standard against which
all other study designs are compared because RCTs are considered inherently less biased
than non-randomized studies (NRS) (Sackett and Sackett 1991). The process of
randomization ensures that all patients have an equal probability of receiving the
intervention. In contrast, treatment decisions in NRS are seldom determined by a study
protocol. Instead, patients and their physicians weigh the pros and cons of receiving
treatment. Physicians recommend therapy based on a patient’s likelihood of success.
Numerous patient characteristics that sway treatment decisions may also influence outcome.
Such variables are referred to as confounders; a confounder i) is associated with the exposure,
ii) is an independent determinant of outcome, and iii) is not an intermediate in the causal
pathway (Fletcher and Fletcher 2005). For example, consider a hypothetical NRS
comparing treatment A with treatment B. More deaths are observed in the treatment A group. If
older patients were more likely to get treatment A, did age confound the relationship between
the exposure and the outcome (death)? Randomization balances both known and unknown
confounders in RCTs. In NRS however, treatment assignment is not random and so NRS are
more prone to bias arising from confounding.
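The age example above can be made concrete with a small numerical sketch. The counts below are entirely hypothetical and were chosen only so that treatment A appears harmful in the crude comparison even though mortality is identical for both treatments within each age group. The stratum labels and function names are illustrative assumptions, not data from any study.

```python
# Hypothetical counts, (patients, deaths), by age stratum and treatment.
# Older patients are assumed to receive treatment A more often.
counts = {
    "younger": {"A": (100, 5), "B": (400, 20)},
    "older": {"A": (400, 80), "B": (100, 20)},
}


def risk(patients, deaths):
    """Proportion of patients who died."""
    return deaths / patients


def crude_risk(treatment):
    """Risk of death for one treatment, ignoring age."""
    patients = sum(counts[stratum][treatment][0] for stratum in counts)
    deaths = sum(counts[stratum][treatment][1] for stratum in counts)
    return risk(patients, deaths)


# Age-specific risks: identical for A and B within each stratum.
stratum_risks = {
    stratum: {t: risk(*counts[stratum][t]) for t in ("A", "B")}
    for stratum in counts
}

# Crude risks: A looks worse only because its patients are older.
crude = {t: crude_risk(t) for t in ("A", "B")}

print("Crude risk of death:", crude)
print("Age-specific risk of death:", stratum_risks)
```

In this sketch the crude risk of death is 17% with treatment A and 8% with treatment B, yet within each age stratum the two treatments carry the same risk. The apparent difference is produced entirely by the imbalance in age between the treatment groups.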
NRS are accordingly regarded as an inferior study design in all systems rating quality of
evidence (Hadorn et al. 1996; Evans 2003). The best known of these systems, the Oxford
Centre for Evidence-based Medicine Evidence Hierarchy, was developed in 1998 and later
updated in 2009 (Oxford Centre for Evidence-Based Medicine 2009). In this tool, systematic
reviews of RCTs (level 1a) appear at the top of the hierarchy (Table 1.1). At the other end of
the spectrum, level 5 evidence is based on expert opinion, studies of physiology or “first
principles.”
Table 1.1 Oxford Centre for Evidence-based Medicine – Levels of Evidence for studies of therapy/prevention/ aetiology/harm (2009).
Level   Evidence
1a      Systematic review (with homogeneity§) of RCTs
1b      Individual RCT (with narrow confidence interval)
1c      All or none case-series†
2a      Systematic review (with homogeneity§) of cohort studies
2b      Individual cohort study (including low quality RCT; e.g., <80% follow-up)
2c      “Outcomes” research; ecological studies
3a      Systematic review (with homogeneity§) of case-control studies
3b      Individual case-control study
4       Case-series (and poor-quality cohort and case-control studies)
5       Expert opinion without explicit critical appraisal, or based on physiology, bench research or “first principles”
§ A review that is free of heterogeneity in the directions and degrees of results between individual studies.
† Met when all patients died before the treatment became available, but some now survive on it; or when some patients died before the treatment became available, but none now die on it.
Adapted from www.cebm.net
Whereas evidence hierarchies traditionally focus on study design, strength of evidence
systems (e.g. the GRADE guidelines) also incorporate other considerations such as the
quantity of evidence, the consistency of results and precision (Owens et al. 2010). These
systems also regard high-quality RCTs as the best source of evidence for the evaluation of
interventions (Atkins et al. 2004; Guyatt et al. 2008).
1.1.1 The limitations of RCTs
The aim of any RCT is to generate an estimate of treatment effect that is both accurate and
precise. However, there are a number of limitations of this study design. First, not all clinical
questions can be investigated via RCTs. This is especially true when studying exposure to
harms such as smoking. RCTs also yield limited information about adverse events –
generally, sample sizes are too small and follow-up too short to detect important adverse
outcomes (Ernst and Pittler 2001). Most importantly, the stringent inclusion criteria of RCTs
can often lead to results with limited external validity. Van Spall et al. found that among
RCTs published in high impact-factor journals between 1994 and 2006, common medical
conditions formed the grounds for exclusion in 81.3% of trials (Van Spall et al. 2007). Sex
and age formed the basis for exclusion in 72.1% and 39.2% of RCTs, respectively. Konrat
and colleagues have also demonstrated that patients aged 65 years and older are poorly
represented in RCTs of drugs they are likely to receive (Konrat et al. 2012). Additional
studies in cardiology (Gurwitz, Col, and Avorn 1992; Lee et al. 2001; Masoudi et al. 2003)
and oncology (Hutchins et al. 1999; Lewis et al. 2003) have demonstrated similar exclusions
of the elderly. While such exclusions often augment internal validity, this occurs at the
expense of generalizability. In contrast, NRS often have less strict inclusion criteria.
Interventions in these studies are also delivered in a variety of “real-world” settings.
Accordingly, NRS may produce more pragmatic estimates of treatment effect.
1.2 The infrequency of surgical trials
RCTs are under-represented in the surgical literature. Solomon et al. have shown that among
articles published in three leading surgical journals in 1990, 7% reported the outcomes of
RCTs. Moreover, only 17% of studies were comparative, underscoring how case-series
dominate the literature landscape in surgery (Solomon and McLeod 1993). According to
more recent assessments, the frequency of RCTs and other comparative studies has remained
unchanged (Wente et al. 2003; Chang, Matsen, and Simpkins 2006; Panesar et al. 2006).
Some may argue that surgical RCTs are more likely to appear in higher impact-factor,
general medical journals. However, a broader survey of articles indexed in MEDLINE
(1966-2000) demonstrated that only 15% of published RCTs were surgical (Wente et al. 2003).
Moreover, surgical trials often evaluated medical therapies in surgical patients as opposed to
head-to-head comparisons of surgical technique; 55.9% of RCTs investigated peri-operative
analgesia, antibiotics or neo-/adjuvant chemotherapy (Wente et al. 2003). Why are RCTs
involving surgical interventions so rare?
1.3 Barriers to the conduct of surgical trials
1.3.1 Issues with patient and physician accrual
In a cohort study of RCTs funded by the UK Medical Research Council and the Health
Technology Assessment Programme between 1994 and 2002, only 31% of trials achieved
their original accrual target (McDonald et al. 2006). In a systematic review of studies
examining barriers to participation, clinicians cited time constraints, insufficient training,
lack of research personnel, loss of autonomy, worry about patients and the impact on the
doctor-patient relationship as obstacles (Ross et al. 1999). Difficulty with the consent
procedure was also a hurdle for physicians, as was the lack of rewards or recognition for
recruiting patients. Patients cited the uncertainty associated with treatment as a prominent
reason for not enrolling. Other patient barriers included the demands associated with
participation (e.g. additional procedures/ appointments, travel and cost) and preference for a
particular treatment.
Strong beliefs about treatment options can also impede the recruitment of surgeons (Mills et
al. 2003; Campbell et al. 2010). For example, in a study by Harrison et al., five different
treatments for locally advanced rectal cancer were presented to patients, colorectal surgeons,
and medical and radiation oncologists. The treatment options included pre-operative
radiotherapy, post-operative radiotherapy, chemotherapy, combined chemo-radiotherapy or
surgery (i.e. an abdominoperineal resection). Participants were then asked about their
willingness to enter an RCT evaluating the five treatments. Whereas 31% of patients would
enter a trial of pre- or post-operative radiotherapy, only 19% would agree to participate in the surgical
RCT. Even fewer surgeons would allow their patients to be involved in a surgical trial (16%),
but radiation and medical oncologists were the most enthusiastic (23% and 31%, respectively).
Some argue that perhaps the surgical community does not hold RCTs in high regard.
However, it has been shown that surgeons and other consultant physicians have equally
positive attitudes towards RCTs (McCulloch et al. 2005). In the same study, surgeons were
found to be more intolerant of uncertainty. This discomfort with uncertainty might be the
reason why some surgeons decline to participate in trials.
1.3.2 Funding surgical research
Whereas RCTs in internal medicine are often industry-sponsored, an analogous funding
infrastructure is lacking in surgery. Drug companies must produce phase I-III trial data
before regulatory agencies will approve new medications (McLeod 1999). These private
entities thus use their vast resources, in collaboration with internists, to bring their product to
market. In contrast, surgical techniques do not require regulatory approval (Cook 2009). In
the absence of industry funding, surgeons must rely on operating grants to fund trials. These
funding opportunities, however, are far more readily captured by departments of medicine
(Jackson et al. 2004). In a review of National Institutes of Health funding between 1992 and
1999, funding to departments of medicine increased approximately seven times faster than funding to
departments of surgery (21.2% per year or $73 million US per year for medicine versus 3.1%
or $5.8 million US per year for surgery). The relative lack of both private and public funding
places surgery at a distinct disadvantage.
1.4 Challenges in surgical trials
Investigators conducting RCTs in surgery must also overcome a number of challenges
related to blinding and intervention delivery. Some mistake the challenges outlined below for
barriers; whereas challenges add to the complexity of conducting trials, barriers instead make
it unlikely that RCTs will take place at all (Garas et al. 2012).
1.4.1 Blinding in surgical RCTs
The best trials employ blinding of participants, clinicians, data collectors, outcome
adjudicators and data analysts (Karanicolas, Farrokhyar, and Bhandari 2010). In
pharmacological trials, a placebo resembling active treatment in appearance, taste and
consistency can be delivered to achieve effective blinding. Blinding is unquestionably more
difficult in trials of surgical interventions. Boutron et al. examined 110 RCTs evaluating
pharmacological and non-pharmacological interventions in patients with hip or knee
osteoarthritis (Boutron et al. 2004). They examined these trials not for the occurrence but for
the feasibility of blinding. They found that blinding patients, providers and outcome
assessors could be achieved in 96%, 96% and 98% of pharmacological trials, respectively. In
comparison, only 12% of patients and 34% of health care providers could be blinded in
non-pharmacological RCTs. The blinding of outcome assessors was similarly difficult (42%).
When comparing surgery with non-surgical interventions, blinding can be achieved with
sham surgery; surgeons make the same incisions for both groups of patients but while those
in the active group receive the intervention, those in the control group do not. The first
instances of sham surgery appeared in RCTs of internal mammary artery ligation in the
1950s (Cobb et al. 1959; Dimond, Kittle, and Crockett 1960). Sham surgery raised ethical
concerns because making non-therapeutic incisions was seen as contravening the principle of
"do no harm" (Wolf and Buckwalter 2006). Not surprisingly, trials making use of sham
surgery remain rare (Freed et al. 2001; Moseley et al. 2002; Swank et al. 2003; McRae et al.
2004; Kallmes et al. 2009; Shikora et al. 2009). Notably, all RCTs employing sham surgery
failed to show benefit associated with the active treatment. Therefore, one should not
underestimate the importance of the placebo effect when comparing surgery with non-
surgical interventions.
1.4.2 Standardizing surgical technique
Standardizing interventions is also a necessary step in any RCT. In pharmacological trials,
the dose and timing of a drug can be protocolized so that patients will receive the
intervention in the same way. However, it is more challenging to ensure that complex
interventions, like surgery, are delivered in a uniform manner (Meakins 2002). The surgical
encounter is a multifaceted process involving numerous steps that collectively make up the
surgical intervention. Patients will receive medications in the pre-operative area, undergo a
multi-step surgical procedure and afterwards, be cared for in the post-operative area and the
clinical ward. At what point does the surgical “intervention” begin and end? Some argue that
it encompasses only what occurs in the operating room. Others would broaden this definition
to include the care provided immediately before and after surgery. Moreover, how does a
surgeon’s skill influence study outcomes? Boutron et al. examined RCTs of
pharmacological and non-pharmacological interventions for knee osteoarthritis and found
that the care provider’s skill level could influence treatment effects in 84% of
non-pharmacological RCTs vs 23.3% of pharmacological trials (Boutron et al. 2003). Achieving
standardization in a surgical RCT therefore requires specifying how surgery should be
performed and how patients should be cared for in the peri-operative period.
When a new surgical technique emerges, a learning curve is often observed. For example,
Yamamoto et al. examined surgeons who had differing levels of experience with performing
left-sided, laparoscopic colon surgery (Yamamoto et al. 2013). They found that surgeons
achieved proficiency once 30 procedures had been completed. Up to this point, operations
lasted longer and were associated with more blood loss. Patients of surgeons who had completed 30
procedures resumed a solid diet earlier and returned home sooner. Therefore, effect
estimates obtained in an RCT of laparoscopic surgery might be influenced by whether the
trial takes place early on in the learning curve or later on. This consideration applies to all
surgical RCTs (Farrokhyar et al. 2010). Learning curves have also been demonstrated with
inguinal hernia repair (Neumayer et al. 2005); hernias recur more frequently when surgery is
performed by inexperienced surgeons. In addition to individual learning curves, many
surgical procedures are associated with better outcomes when performed in high-volume
centers (Urbach and Baxter 2004). Moreover, surgical techniques continue to evolve over
time and so period effects can be prominent in surgical RCTs (Barkun et al. 2009; Lassen,
Hvarphiye, and Myrmel 2012).
Therefore, designing a rigorous surgical trial requires i) standardizing operative procedures
and peri-operative care, ii) recruiting surgeons who have achieved proficiency with the
operation in question and iii) involving centers that meet certain volume thresholds. These
hurdles are not insurmountable but do add to the complexity of conducting RCTs in surgery.
1.5 The Balliol Collaboration
Between 2007 and 2009, a group of surgeons and methodologists took part in three
conferences on the topic of surgical innovation and evaluation at Oxford University. This
international group of renowned experts named themselves the Balliol Collaboration. Their
primary goal was to draft a special series of articles for The Lancet that would describe the
relationship between innovation and clinical practice in surgery. The first article in the series
focused on the process of innovation and assessment of novel surgical interventions (Barkun
et al. 2009). The second article described the challenges faced by those designing RCTs and
NRS evaluating surgical interventions (Ergina et al. 2009). In the third article, a paradigm
was proposed that outlines the “timely and appropriate assessment of surgical innovation
along its different stages” (McCulloch et al. 2009). This paradigm, the IDEAL
recommendations, divides the stages of surgical innovation into 5 phases: 1) idea;
2a) development; 2b) exploration; 3) assessment and 4) long-term study. These stages
progress from the proof of concept phase of an innovation through to surveillance once it has
been widely accepted. The authors suggest that “research database” NRS in conjunction with
explanatory or feasibility RCTs are the study designs of choice for Stage 2b evaluation. This
recommendation is predicated on the assumption that the results of surgical NRS
are generally valid and perhaps even comparable to the results of RCTs. However, evidence
in support of this position is currently lacking.
1.6 Do NRS and RCTs yield comparable results?
As a result of all the challenges and barriers outlined in Sections 1.3 and 1.4, the surgical
community has relied heavily on evidence from NRS and is likely to continue to do so. Does
relying on NRS, however, lead to misleading conclusions? This question has been raised by
the medical community on a number of occasions - especially when the results of NRS have
been later contradicted by the findings of RCTs. For example, consider the controversy
surrounding the use of hormone-replacement therapy (HRT) among post-menopausal
women. Two large NRS suggested that HRT could reduce the risk of coronary heart
disease (Grodstein et al. 1996; Varas-Lorenzo et al. 2000). HRT was also associated with
fewer fractures in post-menopausal women. However, the Women’s Health Initiative trial
later demonstrated that HRT may instead increase the risk of cardiac events (Harrison et al.
2007). This RCT was the first large-scale, placebo-controlled study of HRT. The results of
the study were so alarming that the trial was stopped three years early. There was strong
reaction to these findings by patients and health care practitioners. The use of HRT
subsequently declined rapidly (Krieger et al. 2005).
Discordant results between NRS and RCTs have also been encountered in investigations of
activated protein C (Baillie 2007; Marti-Carvajal et al. 2012) and pulmonary-artery
catheterization (National Heart Lung Blood Institute Acute Respiratory Distress Syndrome
Clinical Trials Network et al. 2006; Frazier and Skinner 2008). Comparable controversy has
arisen in surgery as well. For example, consider the literature evaluating arthroscopic knee
interventions. Many patients with osteoarthritis of the knee would undergo surgery when
medical therapy failed to control their pain. Surgery involved making small incisions
(<1 cm) around the knee and using a camera to visualize the movement of specialized
instruments in the joint space. Then surgeons would lavage or “wash” the joint space with
10 liters of fluid to remove loose debris and degenerated joint fragments. Some patients also
received debridement, which entails shaving joint cartilage (i.e. chondroplasty) and trimming
and smoothing the tissue that cushions the knee (i.e. meniscus). Multiple NRS demonstrated
pain relief with lavage and debridement (Baumgaertner et al. 1990; Gross et al. 1991;
McLaren et al. 1991) and another showed it to be superior to medical therapy (Livesley et al.
1991). However, in an RCT by Moseley et al., the authors reached a remarkably different
conclusion (Moseley et al. 2002). Patients were randomized to receive arthroscopic lavage,
arthroscopic debridement or a sham procedure. The efforts taken by investigators to blind
patients in the sham arm are noteworthy; patients had three 1 cm incisions made around the
knee, surgeons called out for instruments and saline was splashed to simulate the sounds of
lavage. Whereas surgeons were aware of the treatment assignment, patients and nurses
providing post-operative care were not. This study followed patients for 2 years and showed
no benefit with active surgery at any time point evaluated. Without a sham surgery arm, the
study would have failed to control for the placebo effect.
These examples underscore the importance of generating reliable and valid evidence.
Without the emergence of RCTs, the results of earlier NRS would not have been cast in
doubt. Empirical comparisons of NRS and RCTs are necessary, however, to determine
whether the aforementioned discrepancies are outliers or truly representative of the average
relationship between NRS and RCTs.
1.6.1 Empirical comparisons of effect estimates from NRS
and RCTs
In 2000, the New England Journal of Medicine published a seminal article by Concato et al.
that questioned the superiority of RCTs (Concato, Shah, and Horwitz 2000). The authors identified
meta-analyses of RCTs or NRS published between 1991 and 1995 and 99 articles addressing
the following five clinical topics: i) Bacille Calmette–Guérin vaccine and active tuberculosis,
ii) screening mammography and mortality from breast cancer, iii) cholesterol levels and death
due to trauma among men, iv) treatment of hypertension and stroke among men, and v)
treatment of hypertension and coronary heart disease among men. Summary estimates for
NRS and RCTs were separately generated using random-effects meta-analysis. A consistent
trend was observed; the combined point estimates and 95% confidence intervals (CI) for
each study design were remarkably similar (Table 1.2).
Table 1.2 Results of meta-analyses of NRS and RCTs appearing in Concato et al.
Clinical Topic                                          Studies            Odds Ratio (95% CI)
Bacille Calmette–Guérin vaccine and tuberculosis        13 RCTs            0.49 (0.34–0.70)
                                                        10 Case–control    0.50 (0.39–0.65)
Mammography and mortality from breast cancer            8 RCTs             0.79 (0.71–0.88)
                                                        4 Case–control     0.61 (0.49–0.77)
Cholesterol levels and death due to trauma              6 RCTs             1.42 (0.94–2.15)
                                                        14 Cohort          1.40 (1.14–1.66)
Treatment of hypertension and stroke                    14 RCTs            0.58 (0.50–0.67)
                                                        7 Cohort           0.62 (0.60–0.65)
Treatment of hypertension and coronary heart disease    14 RCTs            0.86 (0.78–0.96)
                                                        9 Cohort           0.77 (0.75–0.80)
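Summary estimates of the kind reported in Table 1.2 are obtained by pooling study-level log odds ratios. A minimal sketch of one common random-effects approach, the DerSimonian and Laird method, is given below. The text does not state which estimator Concato et al. used, so the choice of estimator, the function name and the input values are assumptions made purely for illustration.

```python
import math


def dersimonian_laird(log_ors, ses):
    """Pool log odds ratios with a DerSimonian-Laird random-effects model.

    Returns (pooled log OR, standard error of the pooled log OR, tau squared).
    """
    # Inverse-variance (fixed-effect) weights and pooled estimate.
    w = [1.0 / se ** 2 for se in ses]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, log_ors)) / sum_w

    # Cochran's Q and the method-of-moments between-study variance.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, log_ors))
    df = len(log_ors) - 1
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # Random-effects weights add tau squared to each within-study variance.
    w_star = [1.0 / (se ** 2 + tau2) for se in ses]
    pooled = sum(wi * yi for wi, yi in zip(w_star, log_ors)) / sum(w_star)
    se_pooled = math.sqrt(1.0 / sum(w_star))
    return pooled, se_pooled, tau2


# Hypothetical study-level odds ratios and standard errors of the log OR.
example_log_ors = [math.log(0.50), math.log(0.80), math.log(0.65), math.log(0.40)]
example_ses = [0.20, 0.15, 0.25, 0.30]

pooled, se_pooled, tau2 = dersimonian_laird(example_log_ors, example_ses)
low = math.exp(pooled - 1.96 * se_pooled)
high = math.exp(pooled + 1.96 * se_pooled)
print(f"Pooled OR {math.exp(pooled):.2f} (95% CI {low:.2f} to {high:.2f}), tau^2 = {tau2:.3f}")
```

When the between-study variance is estimated as zero, the random-effects weights reduce to the fixed-effect weights and the two models give the same pooled estimate.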
The authors concluded that the results of high-quality NRS are generally similar to those of
RCTs. The qualification of “high-quality” is especially important because the results of this
study may only apply to a subset of all RCTs and NRS. First, meta-analyses were identified
from among the highest ranking journals in clinical medicine (Annals of Internal Medicine,
the British Medical Journal, the Journal of the American Medical Association, the Lancet, and
the New England Journal of Medicine). Articles that appear in these journals may be of a
different caliber than the vast majority of RCTs and NRS indexed in MEDLINE. Second,
Concato et al. did not seek out all of the NRS or RCTs published for a specific clinical topic
but instead, relied on the inclusion/exclusion criteria employed by the authors of the meta-
analyses. Conducting a meta-analysis involves selecting articles based on certain criteria.
This process may lead to the inclusion of primary articles of higher quality. The conclusions
in this study were also drawn from comparisons involving few RCTs and NRS and none
examining surgical technique. For these reasons, Concato’s findings may not apply to
comparisons of NRS and RCTs in surgery.
A second study also compared effect estimates from NRS and RCTs (Benson and Hartz 2000).
Benson and colleagues examined 83 RCTs and 53 NRS spanning 19 topics. They examined a
diverse set of outcomes including mortality, stroke, infection, residual stones, pregnancy,
percent change in lumbar bone density, recurrent otitis and so on. They found that combined
effect estimates for NRS were very similar to those for RCTs for all topics. The authors
cautioned, however, that “there were insufficient data to exclude the possibility of clinically
important differences between the results of the two types of study.”
While these studies compared NRS with RCTs for mostly non-surgical interventions, Shikata
et al. focused exclusively on the field of surgery (Shikata et al. 2006). Meta-analyses of
RCTs in digestive surgery were identified from searches of PubMed (1996 to April 2004),
EMBASE (1986 to April 2004) and the Cochrane Database of Systematic Reviews (Issue 2,
2004). Thereafter, data for NRS were identified from published meta-analyses or if none
were available, the authors conducted their own meta-analysis. A total of 276 original
articles (96 RCTs and 180 NRS) were selected for inclusion in this study. These articles
spanned 18 surgical topics and a variety of outcomes, including mortality, morbidity and wound
infection. Shikata and colleagues performed fixed-effect and random-effects meta-analyses
for each topic. The combined effect estimates for RCTs were compared to those
from NRS using Z scores. They found significant discrepancies between RCTs and NRS for
4 of 16 primary outcomes. Heterogeneity was also more frequent in meta-analyses of NRS as
compared with meta-analyses of RCTs.
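A comparison of two pooled estimates by Z score can be sketched as follows. The exact procedure used by Shikata et al. is not described here, so this is an assumed, generic form: the difference in log odds ratios divided by its standard error, with each standard error back-calculated from the reported 95% CI. The inputs are the Bacille Calmette–Guérin estimates from Table 1.2, used purely as a worked illustration, and the function names are hypothetical.

```python
import math


def log_or_and_se(or_point, ci_low, ci_high, z_crit=1.96):
    """Return the log odds ratio and an approximate standard error.

    The standard error is back-calculated from the 95% CI, assuming the
    interval is symmetric on the log scale.
    """
    log_or = math.log(or_point)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    return log_or, se


def z_difference(estimate_a, estimate_b):
    """Z score for the difference between two independent log odds ratios."""
    (log_a, se_a), (log_b, se_b) = estimate_a, estimate_b
    return (log_a - log_b) / math.sqrt(se_a ** 2 + se_b ** 2)


# Pooled estimates taken from Table 1.2 (Bacille Calmette-Guerin vaccine).
rct = log_or_and_se(0.49, 0.34, 0.70)
nrs = log_or_and_se(0.50, 0.39, 0.65)

z = z_difference(nrs, rct)
# Two-sided p-value from the standard normal distribution.
p_two_sided = math.erfc(abs(z) / math.sqrt(2))
print(f"Z = {z:.2f}, two-sided p = {p_two_sided:.2f}")
```

For these inputs the Z score is close to zero and well inside ±1.96, which is consistent with the description of the two estimates as remarkably similar.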
The study by Shikata et al. stirred debate in the surgical community over the comparability
of NRS and RCTs. One of the strengths of the study is the number of surgical topics
assessed. However, because the authors relied on meta-analyses of RCTs and NRS in a manner
similar to Concato et al., selection bias might influence the findings. Moreover, the largest
comparison involved 5 RCTs and 25 NRS, but the average number of studies in any
comparison was 4 RCTs and 6 NRS. Again, it appears that a subset of NRS was compared to
a selected group of RCTs.
While Concato et al. and Benson et al. found NRS and RCTs to be comparable, Shikata and
colleagues found important differences in 25% of comparisons. Studies comparing a single
RCT with an individual NRS have also found a range of results from RCTs showing more
benefit (Reimold et al. 1992; Shapiro and Recht 1994; Nicolaides et al. 1994) to NRS
showing a larger benefit (1984; Jha et al. 1995; Pyorala, Huttunen, and Uhari 1995). Others
have instead found that NRS and RCTs reached opposite conclusions (Antman et al. 1985;
1994; Yamamoto et al. 1992). Two meta-analyses have combined these individual studies
and produced interesting findings. In the meta-analysis by Britton et al., authors compared
the results of RCTs with those of prospective NRS (Britton et al. 1998). Eighteen studies met
their inclusion criteria and seven found no significant difference between RCTs and NRS. In
another seven studies, effect estimates were in the same direction but significantly different
in magnitude. The results of NRS and RCTs reached opposite conclusions in four studies. A
more recent meta-analysis by Kunz et al. described the results of 15 studies comparing RCTs
and NRS for the same intervention (Kunz, Vist, and Oxman 2007). They identified 35
comparisons within these studies. In 22 of 35 comparisons, effect estimates were larger in
NRS. Kunz et al. found that control groups in NRS had a poorer prognosis than the controls
in RCTs. They hypothesized that differences in patient case-mix between studies may have
contributed to the observed differences between NRS and RCTs. In general, this literature
suggests that there are notable differences in effect estimates from NRS and RCTs and that
these differences may be influenced by factors other than study design.
One of the limitations of these meta-analyses is the inclusion of older RCTs, published in the
1980s and 1990s. The bias arising from inadequate random-sequence generation, allocation
concealment and the lack of double-blinding was demonstrated in studies published after
1998. RCTs with these methodological shortcomings are becoming less common (Wang et
al. 2011). Therefore, comparisons involving older studies have likely contrasted the results of
NRS with RCTs of varying methodological quality. Moreover, most comparisons have
focused on non-surgical interventions. None of the analyses have accounted for between-
study heterogeneity stemming from differences in case-mix; NRS and RCTs can vary in the
types of patients studied, with some enrolling older patients or those with more advanced
disease, and so results between NRS and RCTs might have differed for this reason.
Therefore, the comparability of NRS and RCTs in surgery remains unclear.
1.7 Study characteristics of RCTs associated with bias
In 1995, Schulz et al. presented the results of a study that markedly changed the way we
evaluate RCTs. They found empirical evidence of an association between inadequate trial
methodology and biased effect estimates (Schulz et al. 1995). Using meta-analyses indexed
in the Cochrane Pregnancy and Childbirth Database, Schulz et al. conducted an observational
study to determine the association between estimates of treatment effect and various study
attributes. This study included 250 trials from 33 meta-analyses covering a broad range of
interventions during pregnancy, preterm labor and delivery, induction of labor, labor and
delivery, cesarean delivery, puerperium and the early neonatal period. The authors did not focus
on a single outcome measure but instead included any binary outcome reported by all studies
within any one meta-analysis. Data were abstracted so that an odds ratio (OR) < 1 indicated
benefit. Schulz and colleagues chose to focus their analysis on four study attributes:
randomization sequence, allocation concealment, exclusions after randomization and double
blinding. The main study outcome was a ratio of odds ratios (ROR) for each attribute;
\[
\mathrm{ROR} = \frac{\text{combined OR}_{\text{studies \textbf{without} characteristic}}}{\text{combined OR}_{\text{studies \textbf{with} characteristic}}}
\]
For example, the combined effect estimate from studies without adequate allocation
concealment was compared to the combined effect estimate from studies with adequate
concealment. An ROR < 1.0 indicates that trials that were methodologically inferior
(i.e. lacking the study attribute) yielded, on average, larger estimates of treatment effect
than the referent group (i.e. studies where the attribute is present). Conversely, an
ROR > 1.0 indicates that studies without the study characteristic have, on average, smaller
estimates of treatment effect compared with the referent group (i.e. methodologically
superior studies).
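The ROR definition above can be illustrated with a minimal sketch. The pooled odds ratios and standard errors below are hypothetical values chosen for illustration; they are not taken from Schulz et al., whose actual estimation procedure is not described in this section. The confidence interval assumes the two pooled estimates are independent, and the "percent exaggeration" reading (1 − ROR) follows the way RORs are interpreted elsewhere in this chapter.

```python
import math


def ratio_of_odds_ratios(or_without: float, or_with: float) -> float:
    """ROR = pooled OR of studies WITHOUT the attribute / pooled OR of studies WITH it."""
    return or_without / or_with


def ror_confidence_interval(or_without, se_log_without, or_with, se_log_with, z=1.96):
    """Approximate 95% CI for the ROR, computed on the log scale.

    Assumes the two pooled estimates are independent (an illustrative assumption).
    """
    log_ror = math.log(or_without) - math.log(or_with)
    se = math.sqrt(se_log_without ** 2 + se_log_with ** 2)
    return math.exp(log_ror - z * se), math.exp(log_ror + z * se)


def percent_exaggeration(ror: float) -> float:
    """Relative reduction in the OR when the attribute is absent, in percent."""
    return (1.0 - ror) * 100.0


# Hypothetical pooled estimates (OR < 1 indicates benefit).
or_without_concealment = 0.60
or_with_concealment = 0.90

ror = ratio_of_odds_ratios(or_without_concealment, or_with_concealment)
low, high = ror_confidence_interval(or_without_concealment, 0.10,
                                    or_with_concealment, 0.10)
print(f"ROR = {ror:.2f} (95% CI {low:.2f} to {high:.2f})")
print(f"Exaggeration of benefit: {percent_exaggeration(ror):.0f}%")
```

With these hypothetical inputs the ROR falls below 1.0, which is the pattern described above for methodologically inferior trials.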
While authors did not find a statistically significant trend with appropriate random-sequence
generation, adequate allocation concealment appeared to protect against bias. Schulz et al.
divided studies into the following three categories: adequately concealed allocation,
inadequately concealed allocation and unclearly concealed allocation. Studies with
inadequate (n=21 trials) or unclear allocation concealment (n=150 trials) were in turn
compared with the referent group, studies with adequate allocation concealment (n=79).
Exaggerated estimates of benefit were found with both; inadequate allocation concealment
(ROR 0.59 [95% CI, 0.48 to 0.73]) was worse than having unclear allocation concealment
(ROR 0.67 [95% CI, 0.60 to 0.75]). Studies without double-blinding also appeared to be
susceptible to bias (ROR 0.83 [95% CI, 0.71 to 0.96]). Since 73% of RCTs in this study had
adequate double-blinding, Schulz’s results more readily apply to pharmacological trials than
surgical RCTs where double-blinding is often impossible. The study by Schulz is referred to
as a meta-epidemiological study; a study that uses meta-analyses to explore the impact of
study attributes on bias. Other meta-epidemiological studies have examined a variety of
RCT-design characteristics (Table 1.3).
Table 1.3 Meta-epidemiological studies of RCT study attributes.
Year | Author | Studies | Attributes | Outcome | Findings
---- | ------ | ------- | ---------- | ------- | --------
1995 | Schulz et al. (Schulz et al. 1995) | 33 meta-analyses; 250 RCTs | Random-sequence generation; Allocation concealment; Double-blinding; Exclusions | Binary outcomes | RCTs with inadequate or unclear allocation concealment showed more benefit than trials with adequate allocation concealment, ROR 0.59 [95% CI, 0.48 to 0.73] and ROR 0.67 [95% CI, 0.60 to 0.75], respectively. Trials without double-blinding also showed more benefit, ROR 0.83 [95% CI, 0.71 to 0.96]. No systematic bias detected in RCTs with inadequate random-sequence generation (p=0.58) or exclusions after randomization (p=0.01).
1998 | Moher et al. (Moher et al. 1998) | 11 meta-analyses; 127 RCTs | Random-sequence generation; Allocation concealment; Double-blinding | Binary outcomes | While no effect was seen with inadequate random-sequence generation or double-blinding, studies with inadequate concealment of allocation showed exaggerated benefit, ROR 0.63 [95% CI, 0.45-0.88].
2001 | Kjaergard et al. (Kjaergard, Villumsen, and Gluud 2001) | 14 meta-analyses; 190 RCTs | Random-sequence generation; Allocation concealment; Blinding; Description of dropouts and withdrawals | Binary outcomes | In small RCTs (<1000 participants), more benefit was seen in trials with inadequate random-sequence generation, ROR 0.46 [95% CI, 0.25 to 0.83]; inadequate allocation concealment, ROR 0.49 [95% CI, 0.27 to 0.86]; and without double-blinding, ROR 0.52 [95% CI, 0.28 to 0.96]. A similar effect was not seen in trials that failed to describe dropouts or withdrawals (p=0.2). When comparing trials with adequate random-sequence generation, there was no difference in estimates between smaller and larger RCTs (p>0.2). Similar results were obtained for RCTs with adequate allocation concealment (p>0.2), double-blinding (p>0.2) and follow-up (p>0.2).
2002 | Balk et al. (Balk et al. 2002) | 36 meta-analyses; 276 RCTs | 24 quality measures including random-sequence generation, allocation concealment, double-blinding, intention-to-treat analysis, power calculations, description of drop-outs, etc. | Binary outcomes (mortality in cardiovascular studies. For remaining studies, only outcomes with heterogeneous treatment effects). | Examined impact of study characteristics across 4 medical areas (cardiovascular disease, infectious disease, pediatrics and surgery). No consistent patterns of association across the medical areas.
2003 | Als-Nielsen et al. (Als-Nielsen et al. 2003) | 48 meta-analyses; 523 RCTs | Funding | Binary outcomes | In RCTs funded solely by for-profit organizations, conclusions were more likely to recommend experimental drugs than RCTs funded by non-profit organizations, OR 5.3 [95% CI, 2.0-14.4]. Bias was not detected in RCTs not specifying funding (p=0.10), or those with mixed non-profit and for-profit funding (p=0.09).
2005 | Tierney et al. (Tierney and Stewart 2005) | 14 meta-analyses; 133 RCTs | Exclusions | Survival | Studies that included all patients were less likely to show benefit than RCTs with exclusions (p=0.03).
2007 | Pildal et al. (Pildal et al. 2007) | 70 meta-analyses; 286 RCTs | Allocation concealment; Double-blinding | Binary outcomes | RCTs with unclear or inadequate allocation concealment showed more benefit, ROR 0.90 [95% CI, 0.81 to 1.01]. A similar trend was seen with the absence of double-blinding, ROR 0.94 [95% CI, 0.80 to 1.10].
2009 | Nüesch et al. (Nuesch, Reichenbach, et al. 2009) | 16 meta-analyses; 175 RCTs | Allocation concealment; Double-blinding | Continuous outcome | Effect sizes tended to be more beneficial in RCTs with inadequate or unclear allocation concealment, standardized mean difference -0.15 [95% CI, -0.31 to 0.02]. The exaggerated effects seen with blinding disappeared after accounting for allocation concealment.
2009 | Nüesch et al. (Nuesch, Trelle, et al. 2009) | 14 meta-analyses; 167 RCTs | Exclusions after randomization | Continuous outcome | Restricting meta-analyses to RCTs without exclusions resulted in smaller estimates of treatment effect, large p values and notable decreases in between-trial heterogeneity.
2010 | Bassler et al. (Bassler et al. 2010) | 545 RCTs | Early stopping RCTs vs trial completion | Binary outcomes | RCTs stopped early showed more benefit than matching non-truncated RCTs, ratio of relative risks 0.71 [95% CI, 0.65-0.77].
2010 | Nüesch et al. (Nuesch et al. 2010) | 13 meta-analyses; 153 RCTs | Small RCTs (<100 patients per arm) vs Large RCTs | Continuous outcomes | Small trials showed more benefit than large RCTs, standardized mean difference -0.21 [95% CI, -0.34 to -0.08]. Effect persistent when adjusting for concealment of allocation, blinding of patients and intention-to-treat analysis.
2011 | Dechartres et al. (Dechartres et al. 2011) | 48 meta-analyses; 421 RCTs | Single centre vs multi-centre | Objective binary outcomes (e.g. all-cause mortality or result of a biological test) | Effect estimates are larger in single-centre studies, ROR 0.73 [95% CI, 0.64 to 0.83].
2012 | Bafeta et al. (Bafeta et al. 2012) | 26 meta-analyses; 292 RCTs | Single centre vs multi-centre | Continuous outcomes | Single-centre trials showed more benefit than multi-centre RCTs, standardized mean difference -0.17 [95% CI, -0.17 to -0.01].
2012 | Hrobjartsson et al. (Hrobjartsson et al. 2012) | 8 RCTs | Blinded outcome assessment | Binary outcomes | Non-blinded assessors exaggerated treatment effects by 36%, ROR 0.64 [95% CI, 0.43 to 0.96].
2013 | Hrobjartsson et al. (Hrobjartsson et al. 2013) | 24 RCTs | Blinded outcome assessment | Continuous outcomes | Non-blinded assessors exaggerated the pooled effect size by 68% [95% CI, 14% to 230%].
Some of these individual studies have been combined in meta-analyses – proverbial “meta-
meta-epidemiological studies.” One of the most impactful of these studies was conducted by
Wood et al. (Table 1.4). They too examined the influence of certain study characteristics on
bias but did so separately for objective and subjective outcomes (Wood et al. 2008). They
classified all-cause mortality and standardized laboratory procedure measures
(e.g. hemoglobin concentration) as objective outcomes. Subjective outcomes included
patient-reported outcomes and physician-assessed disease outcomes (e.g. wound infection, pneumonia
and other complications). Fifty-three percent of the extracted outcomes were objective. The
remaining subjective outcomes were divided between patient-reported (11%), physician-
assessed (29%), a combination of both (0.7%), or patient withdrawal (6%). While bias was
detected with each of these study attributes for subjective outcomes (Figure 1.1), the same
could not be said for objective outcomes.
Figure 1.1 Results from Wood et al. Inadequate or unclear allocation concealment compared with adequate allocation concealment among objective outcomes (A) and subjective outcomes (B). Studies without blinding compared with studies with adequate blinding among objective outcomes (C) and subjective outcomes (D). ROR, ratio of odds ratios. A ROR<1.0 implies bias associated with the absence of a characteristic.
Table 1.4 Meta-analyses of meta-epidemiological studies.
Year | Author | Studies | Attributes | Outcomes
---- | ------ | ------- | ---------- | --------
2008 | Wood et al. (Wood et al. 2008) | 146 meta-analyses; 1346 RCTs. Included following individual studies: Schulz et al. (1995), Kjaergard et al. (2001), Egger et al. (2003) | Allocation concealment; Blinding (non-blinded vs any blinding) | Objective binary outcomes examined separately from subjective outcomes
2012 | Savović et al. (Savovic et al. 2012) | 234 meta-analyses; 1973 RCTs. Included following individual studies: Schulz et al. (1995), Kjaergard et al. (2001), Egger et al. (2003), Balk et al. (2002), Als-Nielsen et al. (2003), Contopoulos et al. (2005), Pildal et al. (2007) | Randomization sequence generation; Allocation concealment; Double-blinding | Objective binary outcomes examined separately from subjective outcomes
Similar results were found by Savović et al. (Table 1.4). This literature as a whole has
established that inappropriate random-sequence generation, inadequate allocation
concealment, the absence of double-blinding, stopping trials early, for-profit funding and not
describing exclusions all contribute to bias. This evidence has changed how we both conduct
and evaluate RCTs. Furthermore, this literature has provided the evidence for the
development of a risk of bias tool for RCTs endorsed by the Cochrane Collaboration.
1.7.1 The methodological shortcomings of surgical RCTs
As outlined above, a number of study characteristics have been associated with less biased
estimates of treatment effect for RCTs. Trials in surgery, however, often lack some of these
characteristics or fail to report them. A survey of 364 surgical RCTs published between
1988 and 1994 demonstrated that only 27% provided a description of the randomization
technique (Hall et al. 1996). However, the results of this study may not apply to more recent
surgical trials because Hall et al. examined RCTs published before the emergence of the
Consolidated Standards of Reporting Trials (CONSORT) statement. This consensus
statement outlines what should be included in a published report of an RCT (Moher et al.
2001). While reporting quality has improved moderately (Gray et al. 2012), surgical RCTs
continue to have major methodological shortcomings (Walter et al. 2007). For example, it
has been shown that the proportions of trials with adequate randomization (30.4%), allocation
concealment (35.3%), and blinding of patients (29.2%), health care providers (45.8%) and
outcome assessors (30.6%) remain low (Als-Nielsen et al. 2003). Notably, the authors reviewed 69 RCTs published in
surgical journals (Annals of Surgery, British Journal of Surgery, World Journal of Surgery,
Journal of Surgery, Journal of the American College of Surgeons and the American Journal
of Surgery) and prestigious medical journals (British Medical Journal, Lancet, Journal of the
American Medical Association, and the New England Journal of Medicine). Similar
suboptimal conduct has been documented in the areas of pediatric surgery (Curry, Reeves,
and Stringer 2003), urology (Ross et al. 1999), and orthopaedics (Jacquier et al. 2006;
Chaudhry et al. 2008).
Since RCTs are infrequent in surgery and many of these studies are methodologically flawed,
perhaps RCTs should not be automatically held in higher regard than NRS. Some argue that
a well-designed NRS may provide more accurate and less biased estimates for treatment
effect than poorly conducted RCTs (Grossman and Mackenzie 2005). Ultimately, studies
should be assessed on the basis of conduct and not study design labels (Hannan 2008).
However, while numerous studies have examined the sources of bias in RCTs, there is a
scarcity of similar studies for NRS.
1.8 Study characteristics of NRS associated with bias
While numerous meta-epidemiological studies have evaluated the association between study
design and bias for RCTs, there have been no comparable studies for NRS. However, there
have been two groups that examined NRS-design characteristics and performed subgroup
meta-analyses to determine if the presence or absence of specific characteristics yielded
results that are similar to those of RCTs. In a study by Bhandari et al., the results of NRS
evaluating arthroplasty versus internal fixation for hip fracture were compared with the
results of RCTs (Bhandari et al. 2004). For the outcome mortality, 13 NRS overestimated the
relative risk (RR) associated with arthroplasty by 40% (RR 1.44 versus 1.04) as compared
with 12 RCTs. On the other hand, the RR for risk of revision surgery was 0.38 among NRS
and 0.23 among RCTs. Thus, for the outcome risk of revision surgery, NRS underestimated
the benefit associated with arthroplasty by 15%. Further analysis suggested that there were
four NRS with point estimates of relative risk for mortality similar to the combined effect
estimate for RCTs. All four of these NRS had analyses that controlled for patient age,
gender, and fracture displacement.
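The percentages quoted in the paragraph above appear to correspond to absolute differences between the pooled relative risks (1.44 versus 1.04, and 0.38 versus 0.23). That reading is an assumption, since the calculation is not spelled out in the text. The sketch below reproduces the arithmetic and, for comparison, also expresses each contrast as a ratio of relative risks; the function names are illustrative only.

```python
def absolute_difference(rr_nrs: float, rr_rct: float) -> float:
    """Difference between the pooled RR from NRS and the pooled RR from RCTs."""
    return rr_nrs - rr_rct


def ratio_of_relative_risks(rr_nrs: float, rr_rct: float) -> float:
    """Pooled RR from NRS divided by pooled RR from RCTs."""
    return rr_nrs / rr_rct


# Pooled relative risks as reported in the text (Bhandari et al. 2004).
comparisons = {
    "mortality": (1.44, 1.04),
    "revision surgery": (0.38, 0.23),
}

for outcome, (rr_nrs, rr_rct) in comparisons.items():
    diff = absolute_difference(rr_nrs, rr_rct)
    ratio = ratio_of_relative_risks(rr_nrs, rr_rct)
    print(f"{outcome}: difference = {diff:.2f}, ratio of RRs = {ratio:.2f}")
```

On the ratio scale the two contrasts are approximately 1.38 and 1.65, so the size of the apparent discrepancy depends on which scale is used to express it.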
In a study by Abraham et al., separate meta-analyses were conducted of NRS and RCTs
comparing laparoscopy with open surgery for the treatment of colon cancer (Abraham et al.
2010). They concluded that the OR for three dichotomous outcomes (mortality, morbidity
and reoperation rates) overlapped widely between the NRS and RCTs. For both study
designs, laparoscopy was associated with a statistically significant reduction in post-
operative morbidity. Authors then performed subgroup meta-analyses of NRS, stratifying
their analyses according to the presence or absence of certain design characteristics. NRS
with fewer than 50 patients per study arm, aims that were “not well defined,” end points that
were “not well defined,” non-consecutive patients, non-contemporaneous controls or
“inadequate” controls all failed to detect a statistically significant difference between
laparoscopy and open surgery with regards to morbidity. Authors also cautioned that there
was a possibility of a Type II error driven by the rather small sample size of the groups being
compared.
Along with the study by Shikata et al., these two innovative studies represent some of the
first attempts to assess the comparability of NRS and RCTs in the surgical literature.
Moreover, both groups have attempted to identify which NRS yield results that are
comparable to those of RCTs. However, the analyses in these studies treated the pooled
effect estimates from RCTs as the referent or gold-standard value and overlooked the
important issue of methodological heterogeneity among these surgical trials. It is likely that
the studies by Bhandari et al. and Abraham et al. included some RCTs at
unclear or high risk of bias. Therefore, additional studies that use low risk of bias RCTs as
the referent group are necessary to determine which aspects of NRS design are associated
with biased estimates of treatment effect.
1.9 Summary of gaps in knowledge
In summary, there are important gaps in our understanding of bias in surgical NRS and
RCTs. These gaps include:
(1) The degree to which NRS and RCTs yield similar estimates of treatment effect is
uncertain. Comparisons of NRS and RCTs to date have rarely examined surgical studies and
most analyses have involved studies conducted in the 1980s and 1990s. These comparisons
have also not accounted for variation in RCT study quality, period effects or differences in
case-mix between individual studies.
(2) The relationship between study characteristics and effect estimates in NRS is unknown.
Whereas numerous meta-epidemiological studies have examined the association between
design attributes and bias in RCTs, there are no analogous studies for NRS.
1.10 Dissertation rationale
RCTs are rare in surgery and this trend has been consistent over time. Surgical trials are
uncommon because of the lack of an established funding infrastructure. The uncertainty
associated with randomization, on the part of both patients and physicians, also impedes
recruitment to surgical RCTs. Moreover, investigators must overcome challenges with
blinding and standardizing surgical technique when conducting a surgical trial. For these
reasons, NRS heavily inform the evidence base in surgery and will continue to do so. Should
this reliance be a cause for concern? After all, numerous studies have found that NRS yield
findings similar to those in RCTs. However, other studies have shown the opposite or been
inconclusive. These comparisons have been limited by including selections of NRS and
comparing them with RCTs of varying quality. Comparisons have not accounted for period
effects or patient case-mix. Most comparisons have also evaluated non-surgical interventions
and studies mostly conducted in the 1980s and 1990s. Thus, the comparability of NRS and
RCTs in surgery remains unclear.
Multiple meta-epidemiological studies have identified study characteristics associated with
bias for RCTs. The Cochrane Collaboration has used this empirical data to guide the
development of a risk of bias tool for RCTs. Similar work is needed in the field of NRS.
Identifying the attributes of NRS that are associated with bias will help those conducting,
reviewing and meta-analyzing NRS. Such advancements are necessary to understand how to
best use the evidence from NRS. To study bias in NRS, we have focused on the surgical
treatment of colorectal cancer due to the abundance of both NRS and RCTs in this area.
1.11 Research aims
Specific Aims
(1) To develop a conceptual framework for bias in comparative NRS.
(2) To compare effect estimates from NRS with those from RCTs at low risk of bias.
(3) To explore the impact of NRS-design attributes on estimates of treatment effect.
As there is no comprehensive framework for bias in NRS, a modified framework synthesis
was conducted to develop one. Sources of bias were identified from previously published
systematic reviews of quality assessment tools (e.g. scales and checklists) for NRS and
synthesized thematically into a framework.
In Chapter 5, pooled effect estimates from NRS were compared with those from i) all RCTs,
ii) Typical RCTs and iii) Strong RCTs evaluating laparoscopy and open colon surgery.
Random-effects meta-analysis and meta-regression methods were used to determine the
impact of study design. The Cochrane Risk of Bias Tool was used to categorize RCTs as
Strong or Typical.
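As a rough illustration of random-effects pooling, the sketch below uses the DerSimonian-Laird moment estimator of between-study variance. This choice of estimator is an assumption made for illustration; the text above does not state which estimator was used, and the example inputs are hypothetical log odds ratios rather than data from the included studies.

```python
import math
from typing import List, Tuple


def dersimonian_laird(log_effects: List[float],
                      variances: List[float]) -> Tuple[float, float, float]:
    """Pool study-level log effect estimates under a random-effects model.

    Returns (pooled log effect, standard error of the pooled effect, tau^2).
    """
    k = len(log_effects)
    # Fixed-effect (inverse-variance) weights and pooled estimate.
    weights = [1.0 / v for v in variances]
    sum_w = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, log_effects)) / sum_w
    # Cochran's Q and the moment estimate of between-study variance tau^2.
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, log_effects))
    c = sum_w - sum(w * w for w in weights) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    # Random-effects weights add tau^2 to each within-study variance.
    re_weights = [1.0 / (v + tau2) for v in variances]
    sum_re = sum(re_weights)
    pooled = sum(w * y for w, y in zip(re_weights, log_effects)) / sum_re
    return pooled, math.sqrt(1.0 / sum_re), tau2


# Hypothetical log odds ratios and their variances from four studies.
log_ors = [math.log(0.55), math.log(0.80), math.log(0.70), math.log(1.05)]
variances = [0.04, 0.02, 0.09, 0.03]

pooled, se, tau2 = dersimonian_laird(log_ors, variances)
print(f"Pooled OR = {math.exp(pooled):.2f}, tau^2 = {tau2:.3f}")
print(f"95% CI: {math.exp(pooled - 1.96 * se):.2f} to {math.exp(pooled + 1.96 * se):.2f}")
```

Pooling is done on the log scale and back-transformed, so that an OR < 1 retains its interpretation as benefit.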
In Chapter 6, a meta-epidemiological study was conducted to examine the relationship between
NRS-design characteristics and effect estimates. The conceptual framework developed in
Chapter 3 was used to identify potential study characteristics. Effect estimates were
compared across NRS with and without specific design characteristics. These estimates were
in turn compared with the results of Strong RCTs.
Chapter 2 Laparoscopic colon surgery – an opportunity to study bias
Colorectal cancer is the third most common malignancy in men and women worldwide. In
2008, there were over 1.2 million new cases of colorectal cancer and 609,000 deaths (Ferlay
et al. 2010). In Canada, colorectal cancer is the second leading cause of cancer mortality
(Canadian Cancer Society’s Steering Committee on Cancer Statistics 2012). Surgery is the
cornerstone of treatment; in Ontario, 87% of patients diagnosed with colorectal cancer
undergo surgery (Rahima Nenshi et al. 2008). Prior to the early 1990s, surgical removal of
malignancies in the colon involved making a large abdominal incision. This traditional
approach, the “open” technique, allows surgeons to directly visualize and manipulate intra-
abdominal organs. However, in the late 1980s, surgeons began to consider laparoscopy for
the surgical management of colon cancer (Martel and Boushey 2006). Laparoscopy involves
inserting a fibre-optic laparoscope, a proverbial camera, into the abdominal cavity via a small
incision at the umbilicus. Thus, intra-abdominal organs are visualized indirectly - on
television monitors in the operating room. Additional small incisions, each less than 1.5 cm,
are made in the abdominal wall to allow the passage of specialized instruments. These small
incisions are referred to as “port sites.” At the end of the operation, the tumor is removed
through an incision that is much smaller than the one made with the “open” technique.
Laparoscopy was looked upon favourably because it was less invasive than open surgery and
thus was associated with less post-operative pain and a shorter post-operative hospital stay
(Schwenk et al. 2005).
The goal of any colon cancer operation is to remove i) the tumour, ii) a certain amount of
normal tissue adjacent to the tumour and iii) the lymph nodes draining the tumour. While the
rationale of removing the tumour is obvious, the reasons for removing normal tissue and
lymph nodes are less so. The amount of adjacent normal tissue removed along with the tumor
is referred to as the margin. For colon cancer, surgeons aim to resect 5 cm of normal tissue
on either side of the tumor. With rectal cancer, they aim for a 2 cm distal margin (in the
fresh specimen) (Smith et al. 2010). Removing this normal tissue reduces the risk of local
recurrence.
All patients with colon cancer undergo pre-operative imaging (e.g. computerized tomography
scans of the abdomen and pelvis) to determine the extent to which the cancer has grown or
spread. The American Joint Committee on Cancer (AJCC) TNM pathology classification
scheme is commonly used to describe the extent of disease progression in cancer patients. It
consists of three components: the T-category (the depth of tumor invasion through the colon
wall), the N-category (the number of involved lymph nodes) and the M-category (the presence
or absence of distant metastases) (Edge and American Joint Committee on Cancer. 2010).
Surgeons therefore aim to remove a minimum of 12 lymph nodes to allow for proper nodal
staging (i.e. N-category) of colon cancers (Compton et al. 2000; Nelson et al. 2001).
Combinations of T, N and M categories are in turn used to define disease stage (e.g. stage I
to IV). With increasing levels of stage, the overall prognosis worsens and the probability of
recurrent disease increases. Staging information is used to determine prognosis and identify
which patients will benefit from adjuvant treatment (i.e. chemotherapy) to prevent disease
recurrence.
The first case reports of laparoscopic colon surgery appeared in 1991 (Fowler and White
1991; Jacobs, Verdeja, and Goldstein 1991). Up to this point, laparoscopy had typically been
performed for the treatment of benign conditions including appendicitis and cholecystitis (i.e.
inflammation of the gallbladder). With the advent of laparoscopic colon surgery, many
questioned whether this new technique could be used to remove colon cancer. Multiple early
NRS suggested that laparoscopy was associated with fewer post-operative complications,
shorter hospital length of stay and less pain. Proponents of the “open” technique however
cautioned that despite these benefits, laparoscopy may be an inferior surgical approach for
cancer patients; during conventional, open colon surgery, surgeons could feel tumours and
periodically confirm that sufficient tissue was being removed. Laparoscopy could not allow
for such tactile feedback. Thus, many were concerned that laparoscopy may be associated
with inadequate margins and insufficient removal of lymph nodes.
Concern about laparoscopic colon surgery grew further with the publication of case reports
detailing cancer recurrence at port sites or wounds. In 1993, Alexander et al. described the
clinical course of a 67-year-old woman with right-sided colon cancer (Alexander, Jaques, and
Mitchell 1993). She unfortunately experienced recurrence at one of her wound sites at
3 months following laparoscopic surgery. In the two years following this initial report, an
additional 35 cases of port site recurrences were documented. In 1995, Wexner et al.
conducted a review of all published case series and found a recurrence rate of 6.3%
(CI 1.5 to 21%) (Wexner and Cohen 1995). In contrast, there were only 11 recurrences
among a series of 1,711 patients (0.64%, CI 0.32 to 1.15%) undergoing traditional surgery
between 1986 and 1989 (Reilly et al. 1996). Laparoscopy appeared to be associated with a
nearly 10-fold increase in disease recurrence. Since laparoscopy was also associated with
higher operating room costs, the benefits of using this technology for the treatment of colon
cancer were questioned by the surgical community.
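The "nearly 10-fold" figure can be checked from the rates quoted above. The sketch below recomputes the open-surgery recurrence proportion from the reported counts (11 of 1,711) and divides the reported laparoscopic rate (6.3%) by it. The function names are illustrative, and the comparison is a crude ratio of two uncontrolled case series, not an adjusted estimate.

```python
def proportion_percent(events: int, total: int) -> float:
    """Proportion of patients with the event, expressed as a percentage."""
    return 100.0 * events / total


def fold_increase(rate_a: float, rate_b: float) -> float:
    """How many times larger rate_a is than rate_b."""
    return rate_a / rate_b


# Figures as reported in the text.
open_rate = proportion_percent(11, 1711)   # wound recurrence after open surgery
laparoscopic_rate = 6.3                    # port site recurrence, percent

fold = fold_increase(laparoscopic_rate, open_rate)
print(f"Open surgery: {open_rate:.2f}%")
print(f"Laparoscopy: {laparoscopic_rate:.1f}%")
print(f"Crude fold increase: {fold:.1f}")
```

The wide confidence interval around the laparoscopic rate (1.5 to 21%) means the crude ratio is itself very imprecise.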
Due to mounting concerns over the oncologic safety of the procedure, the American Society
of Colon and Rectal Surgeons recommended that laparoscopic colon cancer surgery should
only be performed within the context of a prospective trial (American Society of Colon and
Rectal Surgeons 1995). Over the following 15 years, multiple NRS and RCTs were
conducted to determine if there were appreciable differences between the two techniques in
terms of peri-operative mortality, post-operative complications, pain, length of stay, and so
on. Of note, a number of high-quality, publicly-funded, rigorous RCTs were conducted – a
relative rarity in surgery. Since 1998, fifty reviews (i.e. systematic reviews or meta-analyses)
have been published comparing laparoscopy with open surgery for the treatment of colon
cancer. This literature includes two separate Cochrane reviews of the short-term and long-
term outcomes following these operations. The first review found that laparoscopy was
associated with longer operative time but less total morbidity and shorter length of stay
(Schwenk et al. 2005). There was also evidence for less post-operative pain with
laparoscopy. An analysis of long-term outcomes found similar rates of port site/wound
recurrence and cancer-related mortality (Kuhry et al. 2008). However, the quality of
included RCTs has been noted to “greatly” vary (Kuhry et al. 2008). Most quality
assessments were undertaken using instruments that have not been validated. For example,
the Cochrane review by Schwenk et al. used the modified Evans and Pollock Questionnaire
to assess quality – this questionnaire has not been evaluated for face or content validity nor
intra or inter-observer reliability (Evans and Pollock 1985). Including studies of varying
quality probably increases heterogeneity but may also lead to aggregate estimates that are
biased. Furthermore, authors did not explore how differences in case-mix between studies
may have impacted the results of the meta-analyses.
Since the early 1990s, laparoscopic colon surgery has evolved from an experimental
technique to the preferred approach for the surgical treatment of colon cancer. The numerous
NRS and RCTs evaluating laparoscopic and open colon surgery facilitated this evolution.
The breadth of studies comparing the two techniques also provides for a unique opportunity
to compare the results of NRS with those of RCTs. There is also an opportunity to examine
the impact of quality, case-mix and period effects on observed results. To achieve these
goals, a case study of bias was undertaken.
Chapter 3 Development of a conceptual framework for bias in non-randomized studies: results of a modified framework synthesis
3.1 Summary
Objective
The evaluation of any non-randomized study (NRS) requires a thorough assessment of bias.
However, there is no consensus on the sources of bias in studies lacking randomization.
Therefore, our objective was to develop a conceptual framework for bias in NRS.
Study Design and Setting
A modified framework synthesis was conducted; i) an a priori framework was developed,
ii) a systematic search identified reviews of quality assessment instruments (e.g. scales and
checklists) for NRS, and iii) sources of bias were extracted, analyzed thematically and
organized into a framework.
Results
Of the 7 reviews identified, 4 were published in peer-reviewed journals and the remaining
were produced by publicly-funded, scientific agencies. The final framework contains 37
sources of bias or “items”. These items were organized within 6 overarching “domains”;
selection bias, information bias, performance bias, detection bias, attrition bias, and selective
reporting bias.
Conclusion
The sources of bias in NRS have been arranged into 6 main domains. This framework can
facilitate the study of bias in NRS and help those designing or reviewing NRS.
3.2 Introduction
Evaluating the merits of any study should involve a thorough consideration of internal and
external validity. Internal validity is the extent to which the findings of a study are
representative of the true association between exposure and outcome. Bias, precision and
confounding are components of internal validity (Grimes and Schulz 2002). Bias is defined
as “systematic error or deviation in results or inferences from the truth” (Agabegi and Stern
2008). Although bias cannot be totally eliminated from studies, the goal is to minimize it. For
randomized controlled trials (RCTs), several empirical studies have shown that bias arises
with inadequate randomization, allocation concealment and blinding (Schulz et al. 1995;
Moher et al. 1998; Kjaergard, Villumsen, and Gluud 2001; Balk et al. 2002; Als-Nielsen et
al. 2003; Tierney and Stewart 2005; Pildal et al. 2007; Nuesch, Reichenbach, et al. 2009;
Nuesch, Trelle, et al. 2009; Nuesch et al. 2010; Bassler et al. 2010; Dechartres et al. 2011;
Bafeta et al. 2012; Hrobjartsson et al. 2012, 2013; Wood et al. 2008; Savovic et al. 2012).
These studies have helped guide researchers, physicians, and policy makers in assessing the
risk of bias in RCTs. However, comparable evidence is not available to guide the appraisal of
NRS. At best, study-design labels (e.g. case-control study) are used as crude markers for the
extent of bias in any study. Bias assessments should instead involve a full consideration of
study methodology.
Those assessing risk of bias in NRS might be tempted to use any one of the over 190 tools
(i.e. scales and checklists) available (Deeks 2002). Unfortunately, many were developed
without adhering to the principles of measurement theory (Sanderson, Tatt, and Higgins
2007). Moreover, many focus on the broader concept “quality” and were not specifically
designed to evaluate bias. Others, such as the Newcastle-Ottawa Scale, have been shown to
have low reliability between individual reviewers (Hartling et al. 2012). The concepts of bias
and quality are often used interchangeably but the terms represent distinct constructs;
quality includes not only bias but also considerations of external validity, ethical conduct and
reporting standards (Table 3.1). While the components of quality are undoubtedly inter-
related, they should nevertheless be evaluated independent of one another. Indeed, the major
limitation of many available instruments for the appraisal of NRS is the emphasis placed on
reported study methods and not actual study conduct (Sanderson, Tatt, and Higgins 2007).
Table 3.1 Definitions of key constructs.
Quality Components | Description*
------------------ | ------------
Internal Validity§ | The extent to which study design, implementation, and data analysis have minimized or eliminated bias and that the findings are representative of the true association between exposure and outcome.
  Bias | A systematic error or deviation in results or inferences from the truth.
  Precision | A measure of the likelihood of random errors in the results of a study, meta-analysis or measurement. The greater the precision, the less random error. Confidence intervals around the estimate of effect from each study are one way of expressing precision, with a narrower confidence interval meaning more precision.
  Confounding | A factor that is associated with both an intervention (or exposure) and the outcome of interest but does not lie in the causal pathway. A confounder distorts the relationship between the intervention (or exposure) and the outcome.
External Validity | The extent to which results provide a correct basis for generalizations to other circumstances.
Ethics† | A branch of philosophy systematizing, defending, and recommending concepts of right and wrong conduct.
Reporting Standards | A prescriptive standard outlining what should be included in the report of a study (e.g. CONSORT, STROBE). These standards do not evaluate study conduct.
* Unless otherwise specified, construct definitions adapted from the Cochrane Glossary, www.cochrane.org/glossary
§ Adapted from “Identifying and avoiding bias in research,” (Pannucci and Wilkins 2010)
† Adapted from “Ethics and science: an introduction,” (Briggle and Mitcham)
To best use the evidence from NRS, an approach to systematically deconstruct and evaluate
bias is required. A conceptual framework for bias in NRS would not only help those
appraising individual studies but could facilitate the study of bias in NRS. Reviewing the
available literature failed to identify such a framework. Therefore, the objective of this study
was to develop a conceptual framework for bias in NRS. A conceptual framework is a visual
representation that “explains, either graphically or in narrative form, the main things to be
studied – the key factors, concepts, or variables – and the presumed relationships among
them” (Miles and Huberman 1994). Conceptual frameworks have been widely used for the
study of complex phenomena including knowledge translation (Graham et al. 2006), shared
decision-making (Legare et al. 2008), and guideline implementation (Gagliardi et al. 2011).
Since our primary aim was to develop a conceptual framework for bias in NRS of
interventions, the framework does not reflect the biases that could arise in studies of
prognosis or diagnosis.
3.3 Methods
To develop a conceptual framework for bias in NRS, we sought to first aggregate a list of
potential sources of bias. Data in the form of comprehensive lists of bias or classification
schemes for bias in NRS were required. Accordingly, we focused on published systematic
reviews of quality assessment tools for NRS. Many of these tools evaluate the
methodological rigor of NRS. By focusing on systematic reviews of NRS assessment tools,
we hoped to gain insight on how others had conceptualized and organized the content of
available tools. These analyses would likely contain descriptions of the potential sources of
bias in NRS. Therefore, our objective was to review these organizational approaches and
extract all potential sources of bias. Extracted biases were then organized into a conceptual
framework. Even though these reviews focused on tools assessing quality and not
specifically bias, this approach was deemed the most broad and inclusive of potential sources
of bias in NRS.
Our aim was to construct a hierarchical framework with “items” or individual sources of bias
organized thematically under “domains” or overarching themes. For example, while
“blinding of outcome assessors” would be considered an item, it would fall under the domain
of “detection bias.” A modified framework synthesis approach was used for this study
(Barnett-Page and Thomas 2009). Framework synthesis is used to synthesize qualitative data
across multiple studies to develop an overarching framework. Instead of using qualitative
studies, we used systematic reviews as the primary data source in this study. The other
principles of framework synthesis were closely observed: i) developing an a priori
framework, ii) extracting and mapping data onto the evolving framework in an iterative
manner, iii) organizing the final synthetic product so that associations between themes
became apparent. Domains in the a priori framework were chosen and defined using
background reading material (Sackett 1979; Kleinbaum, Morgenstern, and Kupper 1981;
Grimes and Schulz 2002; Fletcher and Fletcher 2005; Choi and Noseworthy 1992; Delgado-
Rodriguez and Llorca 2004) and team discussions. This initial framework contained the
following domains: selection bias, information bias, performance bias and attrition bias. A
search strategy was designed with the assistance of an information specialist to identify
pertinent reviews.
3.3.1 Search strategy
A MEDLINE search (2000-2011) was conducted to identify systematic reviews of quality
assessment or risk of bias tools for NRS. The search strategy was structured to include terms
related to four main concepts: i) quality, bias, validity and critical appraisal (with appropriate
truncations), ii) instruments (tools, scales, checklists, and related terms), iii) non-randomized
study design (including cohort, case-control, observational and appropriate permutations)
and iv) systematic review. Titles and abstracts were assessed for eligibility by a single
reviewer.
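The four-concept structure described above can be sketched as a Boolean query in which synonyms within a concept are joined with OR and the concepts themselves are joined with AND. The term lists, the `*` truncation symbol, and the function names below are assumptions made for illustration; they paraphrase the concepts named in the text and are not the search strategy actually run in MEDLINE.

```python
# Illustrative only: these term lists paraphrase the four concepts named in the
# text. They are NOT the actual search strategy used in the study, and the "*"
# truncation symbol is an assumed convention.
CONCEPTS = {
    "appraisal": ["quality", "bias", "validity", "critical appraisal"],
    "instrument": ["tool*", "scale*", "checklist*"],
    "design": ["non-randomized", "cohort", "case-control", "observational"],
    "review": ["systematic review"],
}


def or_block(terms):
    """Join the synonyms for one concept with OR, quoting multi-word phrases."""
    quoted = [f'"{term}"' if " " in term else term for term in terms]
    return "(" + " OR ".join(quoted) + ")"


def build_query(concepts):
    """Join the concept blocks with AND so that every concept must be present."""
    return " AND ".join(or_block(terms) for terms in concepts.values())


query = build_query(CONCEPTS)
print(query)
```

Structuring the query this way keeps each concept broad (any synonym matches) while requiring a record to touch all four concepts, which is one common way of balancing sensitivity and specificity in a search for methodological articles.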
Given the challenges of searching for methodological articles, we supplemented our search
strategy in four ways. First, we reviewed the references of eligible articles. Second, the
“Related Citations” feature of PubMed® was used in turn for each of the eligible articles to
identify additional reviews. Third, Web of Knowledge® was used to identify articles citing
the eligible studies and these titles and abstracts were assessed for eligibility. Fourth, the
“Related Articles” feature of Google Scholar® was used in conjunction with each eligible
study to find any additional reviews meeting the eligibility criteria.
Studies meeting the following inclusion criteria were analyzed: i) review of quality
assessment tools for NRS, ii) content of tools analyzed and grouped thematically,
iii) publication in English, iv) publication in a peer-reviewed journal or report of an
independent scientific association or government agency. Exclusion criteria included the
following: i) review of tools for NRS evaluating diagnostic/prognostic questions,
ii) reporting guidelines or reviews of reporting guidelines and iii) reviews published before
2000. Reporting guidelines were excluded because these focus on which elements should be
reported and do not evaluate actual study conduct. Systematic reviews published before
2000 were excluded so that the framework would be based on contemporaneous data
sources. However, it is likely that the reviews included many instruments themselves
developed prior to 2000.
3.3.2 Data collection
The following information was abstracted from each review: year of publication, number of
tools reviewed, number of content domains constructed, and the number of items or sources
of bias identified. In these reviews, investigators had organized the content of NRS tools and
summarized this content. We examined these organizational approaches and abstracted all
domains and specific sources of bias listed. These data were then analyzed thematically.
3.3.3 Analytic approach
In this study, abstracted items and domains were tabulated and coded using the a priori
framework. Open coding involved line by line analysis of abstracted sources of bias. Codes
were then combined into inter-related categories or domains using axial coding. This
iterative process involved a constant comparative approach where domains were compared
with existing ones and consensus was achieved across three study members (LS, DRU, GT).
Thus, data analysis was initially a deductive process. Deductive reasoning involves moving
from the broad to the more specific; a theory is devised (i.e. an a priori framework) and is
supported or refuted using the available data (Miles and Huberman 1994). Since the a priori
framework informed initial coding, a broad structure was used to inform data analysis.
Whenever the emerging framework could not accommodate a new domain related to bias,
the framework was expanded (Barnett-Page and Thomas 2009). Therefore, inductive analysis
occurred whenever domains emerged solely from the data; inductive reasoning involves
using collected data to form the basis for theory or broad conclusions. Thematic
analysis thus included both a deductive and an inductive phase, as is customary in framework
synthesis (Barnett-Page and Thomas 2009).
Data relevant to bias were included in the framework whereas excluded items, those relating
to other facets of quality (i.e. ethics, external validity, reporting standards and/or precision),
were organized separately. We synthesized these latter elements in a separate table so that
the final framework could be compared and contrasted with these extraneous elements.
3.3.4 Framework refinement
Five scientists with extensive experience in clinical epidemiology and health services
research were approached to informally review the clarity and face validity of the
framework. They were asked: i) does this framework reflect the biases in NRS, ii) are any
sources of bias missing, and iii) is the wording of any items ambiguous or unclear?
3.4 Results
3.4.1 Included studies
Seven reviews met the inclusion criteria (Figure 3.1). Four were published in peer-reviewed
journals (Saunders et al. 2003; Katrak et al. 2004; Sanderson, Tatt, and Higgins 2007; Crowe
and Sheppard 2011) whereas three were produced by publicly-funded agencies (the Agency for
Healthcare Research and Quality (West et al. 2002; Viswanathan and Berkman 2011) and the
National Institute for Health Research-Health Technology Assessment Programme (Deeks et
al. 2003)). Five reviews focused solely on NRS and two evaluated instruments for both RCTs
and NRS (West et al. 2002; Crowe and Sheppard 2011) but presented data separately for
each design. The characteristics of eligible studies are summarized in Table 3.2.
Reviews differed in the number of instruments evaluated (range, 19-193). Each study
presented a scheme organizing the content of assessment tools. The number of domains
(range, 5-12), sub-domains (range, 0-22) and items (range, 11-54) varied across the reviews.
Table 3.3 outlines the constructs identified as domains. Authors clearly varied in the number
of domains constructed and in many instances, domains that appear in one review were not
mentioned in others. For example, West et al. included funding as a bias domain but others
did not. Crowe and Sheppard broached the concept in their review, but nested the finding
under “Ethical matters.” In many instances, the domains identified by authors were not
specific
Figure 3.1 Flow diagram of included studies.
* Deeks et al. (2003), Sanderson et al. (2007) ** West et al. (2002) § Saunders et al. (2003), Katrak et al. (2004), Viswanathan & Berkman (2011) † Crowe et al. (2011)
Table 3.2 Characteristics of included studies.
Author | Year | Number of Tools Reviewed | Number of Domains | Number of Sub-domains | Number of Items
West et al. | 2002 | 19* | 9 | - | 29
Deeks et al. | 2003 | 193 | 12 | - | 45
Saunders et al. | 2003 | 18 | 4 | - | 27
Katrak et al. | 2004 | 19§ | - | - | 11
Sanderson et al. | 2007 | 86 | 6 | - | 10
Crowe and Sheppard | 2011 | 44 | 8 | 22 | 54
Viswanathan and Berkman | 2011 | NA† | 5 | 12 | 29
* 106 instruments identified and 19 specific to non-randomized studies
§ 121 critical appraisal tools identified and 19 specific to non-randomized studies
† Authors reviewed 1429 questions collated from i) published instruments and ii) 84 Agency for Healthcare Research and Quality (AHRQ)-sponsored systematic reviews. These questions were analyzed and reduced to 29 final items by an expert panel
Table 3.3 Bias domains extracted from systematic reviews of quality assessment tools for NRS.
West et al.: Study question; Study population; Comparability of subjects; Exposures/Interventions; Outcome measures; Statistical analyses; Results; Discussion; Funding
Deeks et al.: Background/context; Sample definition and selection; Interventions; Outcomes; Creation of treatment groups; Blinding; Soundness of information; Follow-up; Analysis: comparability; Analysis: outcome; Interpretation; Presentation and reporting
Saunders et al.: Subjects; Interventions; Outcomes; Statistical Analyses
Katrak et al.: Nil
Sanderson et al.: Methods for selecting study population; Methods for measuring exposure and outcome variables; Design-specific sources of bias; Methods to control confounding; Statistical Methods; Conflicts of interest
Crowe and Sheppard: Preamble; Introduction; Research and design; Sampling; Ethical Matters; Data collection; Results; Discussion
Viswanathan and Berkman: Selection bias and confounding; Performance bias; Attrition bias; Detection bias; Reporting bias
types of bias but instead broad labels such as “outcomes” (West et al. 2002; Deeks et al.
2003; Saunders et al. 2003). Our analyses therefore progressed in a classical qualitative
manner wherein inter-related concepts were collapsed onto underlying constructs. The
conceptual framework that emerged is described below.
3.4.2 Conceptual framework
The final framework contains six overarching bias domains: selection, information,
performance, detection, attrition and selective reporting bias (Figure 3.2, Table 3.4).
Figure 3.2 Conceptual framework for bias in non-randomized studies.
Table 3.4 Bias domains in the conceptual framework.
Domain | Definition*
Selection Bias | Bias arising when members of the intervention/exposure group differ from the comparator (i.e. control) group in ways aside from the exposure of interest
Attrition Bias | Systematic differences between groups in withdrawals from a study
Information Bias | Bias arising from measurement errors of exposure, covariate or outcome variables
Detection Bias | Systematic differences between groups in how outcomes are determined and thus, a type of information bias
Performance Bias | Systematic differences between the groups in the care that is provided, or in exposure to factors other than the interventions of interest
Selective Reporting Bias | Systematic differences between reported and unreported findings
* Adapted from the Cochrane Glossary, www.cochrane.org/glossary
Compared with the a priori framework, there were two significant expansions. First, selective
reporting bias was added. Second, the domain detection bias was added and nested under
information bias. The framework contains 37 specific items or sources of bias (Table 3.5). A
description of the framework domains and items is provided below.
Selection Bias
Selection bias is often cited as the most significant shortcoming of NRS. Within NRS,
selection bias refers to any process that leads groups to differ from one another, aside from
the exposure of interest. Randomization determines group allocation in RCTs, whereas in
many NRS, groups are often formed by virtue of the intervention received during the course
of routine care. When the characteristics of patients are related to both receipt of treatment
and the development of the outcome, confounding occurs. Investigators anticipating such
bias can employ a number of strategies throughout the course of a study, from assembling the
cohort (e.g. matching) through to analysis (e.g. stratification and regression analyses), to
diminish selection bias.
Table 3.5 Frequency of included items.
Item # | Item | Bias | West, Deeks, Saunders, Katrak, Sanderson, Crowe & Sheppard, Viswanathan & Berkman
1 | Outcomes specified a priori | Selective Reporting | X
2 | Analyses specified a priori | Selective Reporting | X
3 | Attempt to balance groups on known confounders by design | Selection | X X X
Inclusion/Exclusion
4 | Explicitly defined | Selection | X X X X X
5 | Identical criteria across groups | Selection | X X X
6 | Measured using valid and reliable instruments/approach | Information | X
7 | Source of data | Information | X X
8 | Participant recruitment consistent across groups | Selection | X
9 | Comparability of groups at baseline | Selection | X X X X X
Confounders
10 | Explicitly identified/defined | Selection | X X
11 | Measured using valid and reliable instruments/approach | Information | X
12 | Source of data | Information | X X
13 | Systematic determination of confounders across groups | Information | X
14 | Intervention/Exposure
15 | Explicitly identified/defined | Performance | X X X X X X
16 | Measured using valid and reliable instruments/approach | Information | X X X X
17 | Source of data | Information | X X
18 | Systematic determination of intervention/exposure | Information | X X X
19 | Intervention delivery and adherence | Performance | X X X
20 | Contamination | Performance | X X X
21 | Concurrent treatment/co-interventions consistent across groups | Performance | X X X
22 | Blinding of participants/personnel | Performance | X X X X
23 | Blinding of outcome assessors | Detection | X X X X X X
Outcomes
24 | Explicitly defined | Detection | X X X X X
25 | Objective definition | Detection | X
26 | Measured using valid and reliable instruments/approach | Information | X X X X X
27 | Source of data | Information | X X
28 | Systematic determination of outcome across groups | Information | X
Follow-up
29 | Adequately long follow-up (to capture outcome events) | Detection | X X X X
30 | Equal duration across groups | Attrition | X X
31 | Equal intensity across groups | Attrition | X
32 | Losses to follow-up | Attrition | X X X X X
Analysis methods
33 | Intention to treat analysis | Attrition | X X X
34 | Missing data (e.g. addressed via imputation or sensitivity analyses) | Attrition | X X
35 | Handling of known confounding | Selection | X X X X X X
36 | Multiple comparisons | Selective Reporting |
37 | Funding | Selective Reporting | X X X X
When inclusion/exclusion criteria are not applied in a systematic way, bias may
arise. Ultimately, these criteria are used to create homogeneous groups of participants:
groups should be as similar as possible, aside from receipt of the intervention under study.
Placing limits on age, severity of illness and the existence of comorbid illness can help to
construct groups that are more comparable to one another. Inclusion/exclusion criteria need
to be clearly defined (e.g. stage IV cancer as opposed to “advanced cancer”) and applied
uniformly across groups to prevent bias. Participants destined to be classified as the
“exposed” or “controls” should also be drawn from the same source population. Consider a
hypothetical study of robotic surgery for the treatment of prostate cancer. Since high-volume
centers are more likely to acquire robot technology, any study assessing this new technique
should draw control subjects from the same institution, or an institution with similar
characteristics. Otherwise, increased hospital volume could confound the relationship
between the exposure (i.e. robotic surgery) and the outcome (e.g. mortality). In this example,
hospital surgical volume is a confounder; it i) is associated with the exposure, ii) is an
independent determinant of outcome, and iii) is not an intermediate in the causal pathway
(Fletcher and Fletcher 2005). Given the importance of confounding in NRS, the framework
contains multiple items related to this source of bias. Confounding was considered at the
outset of a study (items 3, 9 and 10) as well as during the analysis phase of a study (item 35).
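The robotic surgery example can be made concrete with a small worked calculation. The sketch below uses invented counts (not data from any study) in which mortality is identical for robotic and open surgery within each hospital-volume stratum, yet robotic surgery is concentrated in high-volume centers. It compares the crude odds ratio with a volume-adjusted odds ratio; the Mantel-Haenszel estimator is used here as one standard stratified approach and is an assumption of this illustration, not a method described in this chapter.

```python
# Hypothetical counts, invented purely to illustrate confounding by hospital
# volume; they do not come from any study.
# Each stratum holds (deaths_robotic, n_robotic, deaths_open, n_open).
STRATA = {
    "high_volume": (16, 800, 4, 200),  # 2% mortality in both arms
    "low_volume": (12, 200, 48, 800),  # 6% mortality in both arms
}


def odds_ratio(d1, n1, d0, n0):
    """Odds ratio for death, robotic (1) versus open (0) surgery."""
    return (d1 * (n0 - d0)) / ((n1 - d1) * d0)


def crude_odds_ratio(strata):
    """Pool all patients, ignoring hospital volume."""
    d1, n1, d0, n0 = (sum(column) for column in zip(*strata.values()))
    return odds_ratio(d1, n1, d0, n0)


def mantel_haenszel_odds_ratio(strata):
    """Volume-adjusted odds ratio using the Mantel-Haenszel estimator."""
    numerator = denominator = 0.0
    for d1, n1, d0, n0 in strata.values():
        total = n1 + n0
        numerator += d1 * (n0 - d0) / total
        denominator += (n1 - d1) * d0 / total
    return numerator / denominator


crude = crude_odds_ratio(STRATA)
adjusted = mantel_haenszel_odds_ratio(STRATA)
print(f"crude OR = {crude:.2f}, volume-adjusted OR = {adjusted:.2f}")
```

With these counts the crude odds ratio is roughly 0.53, suggesting a large benefit for robotic surgery, whereas the volume-adjusted odds ratio is 1.00. The apparent benefit is produced entirely by where the procedure was performed.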
Information Bias
Conducting RCTs ideally necessitates the development and registration of trial protocols
including detailed information about the measurement of patient characteristics and
outcomes. These data are thereafter collected on standardized forms, by trained study
personnel and entered into databases with mechanisms to detect inaccurate data entry. In
contrast, many NRS make use of information gathered from hospital charts or administrative
processes (e.g. billing). As this information was not collected in the course of a study,
questions can arise about the accuracy, completeness and other purposes of such information.
Consequently, there are 11 items in the framework related to information bias – bias
stemming from the errors in the measurement of exposures, confounders and outcomes.
Detection bias (or ascertainment bias) applies specifically to outcomes and is therefore
classified as a subset of information bias.
Bias can arise in a NRS if investigators rely on source data of questionable quality (items 7,
12, 17 and 27). For example, consider a case-control study of antibiotic use; asking patients
to recall their drug history over the past 10 years is far less reliable than using hospital charts
or provincial drug-plan databases to obtain this information. In turn, those studying anti-
retroviral therapy may refer to hospital records to obtain CD4 blood levels, but could have
acquired more data points if a research protocol had dictated the frequency of sample
collection. Therefore, the adequacy of source data in NRS is context dependent,
but such data will generally be poorer in quality than data collected in the course of a trial.
Effect measures can thus deviate from the truth if the source data is itself flawed or
incomplete in a systematic way.
Bias may also arise if investigators rely on instruments that have not been validated or tested
for reliability (items 6, 11, 16 and 26). Even when valid and reliable instruments are used,
unless they are used in a systematic manner, bias may arise. Accordingly, the framework
contains items related to the systematic determination of confounders (item 13),
interventions/exposures (item 18) and outcomes (item 28). Moreover, blinding outcome
assessors (item 23) specifically serves to diminish detection bias and in its absence, bias may
arise because the conscious or subconscious beliefs of investigators may influence how
outcomes are recorded. For example, consider a trial comparing routine care versus incentive
spirometry (i.e. the use of a breathing device) for the prevention of post-operative pneumonia.
An assessor who believes strongly in the benefit of incentive spirometry may review chest
radiographs of patients using the device and be less likely to diagnose a pneumonia.
Diagnosing a pneumonia on a chest radiograph can require judgment and is thus subject to
interpretation.
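The hierarchical arrangement of items under domains can be sketched as a nested mapping. The example below includes only the item numbers cited in this section for information bias and the blinding item for detection bias; the dictionary layout, key labels, and function name are assumptions made for illustration rather than part of the framework itself.

```python
# Item numbers are those cited in the text of this section; the dictionary
# layout and the labels used as keys are assumptions made for illustration.
FRAMEWORK = {
    "information bias": {
        "source of data": [7, 12, 17, 27],
        "valid and reliable instruments": [6, 11, 16, 26],
        "systematic determination across groups": [13, 18, 28],
    },
    # Detection bias is treated as a subset of information bias in the framework.
    "detection bias": {
        "blinding of outcome assessors": [23],
    },
}


def domain_of(item_number, framework=FRAMEWORK):
    """Return (domain, theme) for an item number, or None if it is not listed."""
    for domain, themes in framework.items():
        for theme, items in themes.items():
            if item_number in items:
                return domain, theme
    return None


n_information = sum(len(items) for items in FRAMEWORK["information bias"].values())
print(n_information)   # number of information bias items listed here
print(domain_of(23))   # the blinding item maps to detection bias
```

The count of items listed under information bias in this sketch is 11, which agrees with the number stated in the text.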
Performance Bias
In an ideal study, participants in the active and control arms of a study are similar except for
the exposure of interest. Blinding both study subjects and investigators ensures that all
participants are treated alike. However, such blinding is not always possible (Boutron et al.)
and in its absence, performance bias arises. Performance bias has five main components:
i) explicit definition of the intervention/exposure (item 15), ii) consistent delivery and
adherence to the intervention (item 19), iii) similar co-interventions across groups (item 21),
iv) contamination (item 20) and v) blinding of participants and study personnel (item 22).
Bias can arise in a study if there is variation in the definition of an intervention or in its
standardization. For example, in a hypothetical study of surgery versus surgery and
chemotherapy for gastric cancer, surgeons can differ in the extent of lymphadenectomy
(i.e. removal of lymph nodes) performed intra-operatively. Some surgeons remove less
surrounding tissue, and thus fewer nodes, whereas others may do a more extensive
lymphadenectomy. If the extent of surgery differs between patients, will the summary effect
estimate truly reflect the difference between the two treatment strategies? Whereas in
pharmacological studies, standardizing the intervention can mean defining a dose, frequency
of delivery and using a specific pharmaceutical source, achieving this type of standardization
is more difficult in studies of non-pharmacological interventions. Within RCTs, investigators
will often insist on surgeons having achieved a certain degree of proficiency with a technique
before involving them in a study. For example, in the trial by Nelson and colleagues on
laparoscopic colon surgery, procedure-volume thresholds were established and
videotapes of operations were reviewed to ensure participating surgeons were delivering the
intervention consistently (Nelson et al. 2004). A similar approach could be adopted in NRS
but would add to the complexity of studies. Without explicitly defining and standardizing
interventions, the results of a NRS may not reflect the impact of an intervention as it was
intended.
Patients can also fail to adhere to their intended intervention via non-compliance or by
switching over to the alternative therapy/intervention (i.e. contamination). This can be
common in pharmacological studies. As there are often meaningful reasons why people
switch, investigators should explore the underlying reasons why these changes occur. Non-
random switching can introduce bias that affects estimates from studies with significant non-
adherence or contamination. Some have argued that these effect estimates instead represent
“real-world” conditions.
Co-interventions can also introduce bias by similar mechanisms in NRS and RCTs. Consider
the following example which applies equally to both study types; investigators are evaluating
the rate of wound infection following two surgical procedures A and B. At the end of the
study, lower rates of post-operative infection are observed in patients receiving A. If
antibiotic prophylaxis was provided more frequently to patients undergoing A than those
undergoing B, the observed effect might be attributable to antibiotic prophylaxis rather than
to the surgical procedure. In RCTs, blinding of study personnel and participants helps to
minimize this bias but this is rarely encountered in NRS. Therefore, differential application
of co-interventions can introduce bias in NRS.
Attrition Bias
Attrition bias arises when there is incomplete data collection (item 34) or differential follow-
up (item 32). These processes lead to imbalances at the end of a study. Attrition bias
therefore diminishes the comparability of groups and was categorized as a type of selection
bias. Five of the reviews identified losses to follow up as a source of bias in NRS (Saunders
et al. 2003; Katrak et al. 2004; Sanderson, Tatt, and Higgins 2007; Crowe and Sheppard
2011; Viswanathan and Berkman 2011). Since NRS are often observational studies making
use of data collected during routine care, attrition poses a particular challenge. Bias can arise
if one group is followed more intensely than the other. There is more opportunity to record
information from patients who are seen more often or undergo tests more frequently. For
example, it is possible that disadvantaged populations are less likely to comply with follow-
up. If socioeconomic status is related to the outcome, seeing such patients infrequently will
not allow for complete capture of outcomes. Bias therefore arises if follow-up is of unequal
intensity (item 31) across groups.
Moreover, when new techniques or interventions are compared with more established ones,
there may only be a few years of follow-up available with the newer approach. If the patients
who received the conventional technique have been followed for over a decade, the
inequality in duration of follow-up (item 30) could make the newer technique falsely appear
superior to the older approach. Both groups may be followed with equal intensity (e.g. same
number of blood tests or computed tomography scans) but by virtue of being followed for a
shorter amount of time, bias may be introduced in the NRS.
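The effect of unequal follow-up duration can be illustrated with simple arithmetic. The sketch below assumes, purely for illustration, a constant annual risk of the outcome that is identical for both techniques (the 3% figure and the 3-year and 10-year follow-up periods are hypothetical). It shows that comparing the proportion of patients with the outcome over different follow-up periods makes the newer technique look better even though the underlying risk is the same.

```python
ANNUAL_RISK = 0.03  # hypothetical; assumed identical for both techniques


def cumulative_incidence(annual_risk, years):
    """Proportion with the outcome after `years`, assuming a constant annual risk."""
    return 1 - (1 - annual_risk) ** years


new_technique = cumulative_incidence(ANNUAL_RISK, years=3)       # short follow-up
conventional = cumulative_incidence(ANNUAL_RISK, years=10)       # long follow-up
conventional_at_3 = cumulative_incidence(ANNUAL_RISK, years=3)   # common horizon

print(f"new technique, 3 years:  {new_technique:.1%}")
print(f"conventional, 10 years:  {conventional:.1%}")
print(f"conventional, 3 years:   {conventional_at_3:.1%}")
```

Under these assumptions roughly 9% of patients in the newer group have the outcome compared with roughly 26% in the conventional group, and the difference disappears once both groups are compared over the same 3-year horizon.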
The concept of intention-to-treat analysis is often classified within attrition bias for RCTs. A
similar approach was adopted in this framework for bias in NRS. In RCTs, intention-to-treat
(ITT) analysis implies analyzing participants according to the group they were randomly
assigned to, irrespective of the treatment received. Understanding this concept within the
context of NRS is more difficult. Since patients are assigned to the treatment or exposure arm
of a study based on the treatment received, how does intention-to-treat analysis retain its
meaning? Taking a closer look at studies comparing laparoscopy with conventional surgery
for colon cancer can help to illustrate the meaning of ITT analysis in NRS. Laparoscopy or
“key-hole” surgery involves operating through small incisions (e.g. <1 cm) instead of the
much larger incision (>20 cm) used in conventional surgery. Minimally-invasive instruments
are inserted through the small incisions while the operative field is displayed on monitors in
the operating room. If surgeons encounter bleeding or adhesions that impede the progress of
the procedure, they may abandon laparoscopy in favor of the conventional approach. Later
analyzing these “converted” patients in the conventional surgery group would be a violation
of the principles of ITT analysis. More generally, ITT analysis in NRS requires using
strategies (e.g. last observation carried forward) to make sure that participants with some
missing data are not wholly removed from the analysis. Failure to do so could introduce bias
by removing patients with missing data, because the reasons for missingness may be related to the
outcome of interest.
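The handling of converted patients can be illustrated with hypothetical counts. In the sketch below, all numbers are invented: 100 patients are planned for laparoscopy, 20 of whom are converted to open surgery and have a higher complication rate, and 100 patients are planned for open surgery. Grouping by the intended approach is compared with grouping by the approach actually completed.

```python
# Hypothetical counts. Keys are (planned approach, approach completed).
PATIENTS = {
    ("laparoscopic", "laparoscopic"): {"n": 80, "complications": 8},
    ("laparoscopic", "open"): {"n": 20, "complications": 8},  # converted cases
    ("open", "open"): {"n": 100, "complications": 15},
}


def complication_rates(patients, group_by):
    """Complication rate per group.

    group_by=0 groups by the planned approach (intention-to-treat style);
    group_by=1 groups by the approach completed (converted cases move to open).
    """
    totals = {}
    for key, counts in patients.items():
        n, events = totals.get(key[group_by], (0, 0))
        totals[key[group_by]] = (n + counts["n"], events + counts["complications"])
    return {group: events / n for group, (n, events) in totals.items()}


by_intention = complication_rates(PATIENTS, group_by=0)
as_completed = complication_rates(PATIENTS, group_by=1)
print("by intended approach: ", by_intention)
print("by completed approach:", as_completed)
```

With these counts the complication rates are 16% versus 15% when patients are grouped by intended approach, but 10% versus about 19% when converted patients are moved to the open group. Reassigning the difficult cases creates an apparent advantage for laparoscopy that the intention-to-treat grouping does not show.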
Selective Reporting Bias
There are 5 items in the conceptual framework related to selective reporting bias. Whenever
investigators systematically choose to include certain findings but not others in the published
report of a study, selective reporting bias occurs. This bias is further divided into two types:
selective outcome reporting and selective analysis reporting (Norris et al. 2012). The former
arises when a subset of outcomes is reported based on: i) their direction or statistical
significance, ii) whether they are consistent with the hypotheses of the investigator or
funding source, or iii) if the findings support a paradigm shift (Chan et al. 2004). Selective
outcome reporting is mediated through a number of mechanisms which include omitting an
entire outcome, reporting only favorable aspects of an outcome (e.g. at 6 months of follow-
up but not 1-year), or providing insufficient detail (e.g. p>0.05) (Kirkham et al. 2010).
Selective analysis reporting involves including only a portion of the analyses performed,
using multiple approaches to missing data but reporting a subset, or turning continuously
measured variables into categorical ones (Norris et al. 2012).
Selective reporting bias can arise in NRS if there is selective reporting of outcomes and
analyses. Many have suggested that NRS should require protocol registration in a manner
akin to RCTs (Williams et al. 2010; Swaen, Carmichael, and Doe 2011) — doing so would
encourage investigators to refrain from “cherry-picking” outcomes and analyses (Mathieu et
al. 2009). Assessing the potential for selective reporting bias in NRS remains a challenge,
however, in the absence of such protocols.
3.4.3 Excluded items
There were 24 items not included in the final conceptual framework (Table 3.6). The
majority were related to reporting standards (n=9), and the remainder to external validity
(n=8), ethics (n=4) and precision (n=3) (Table 3.6). While “precision” is considered a
component of internal validity (Higgins et al. 2011), the other three domains were
categorized as facets of “quality” (Sanderson, Tatt, and Higgins 2007).
A distinction should be drawn between the items related to selective outcome reporting
(items 1, 2, 33, 34, 36 and 38) of the framework and the “selection of outcomes for relevance
and importance.” Investigators should measure outcomes that are important to patients,
health care providers, administrators and policy makers alike. Which outcomes one chooses
to study is a different consideration from choosing which ones to publish among those
studied. Accordingly, the “selection of outcomes for relevance and importance” is a
reflection of the relevance and external validity of a study.
Table 3.6 Items abstracted from reviews but not related to bias.
Excluded Items
Ethics: Ethics approval; Informed consent; Privacy/confidentiality maintained throughout the study; Appropriate comparison group (given standard of care)
External Validity: Question relevant to practice; Participants representative of those seen in clinical practice; Recruitment/participation rate; Participants comparable to non-participants; Feasibility of intervention; Selection of outcomes for relevance and importance; Clinical importance of findings
Reporting Standards: Background information provided; Clearly stated question; Hypotheses described; Study design adequately described; Description of study population; Statistical presentation/reporting of findings; Conclusions supported by results/limitations; Description of implications/applications
Internal Validity – Precision: Sample size calculation; Power calculation
3.5 Discussion
A comprehensive framework for bias in comparative NRS has been developed and includes
37 individual sources of bias, organized within 6 domains. No single review included all
identified items and many included items related to the larger construct of quality. The
appearance of items related to reporting standards, generalizability and ethics within reviews
underscores how many available instruments were not specifically designed to evaluate bias.
The domains of bias in the final framework are similar to those in the Cochrane Risk of Bias
Tool for RCTs. Whereas the Cochrane Risk of Bias Tool has five domains (selection,
performance, detection, attrition and reporting bias), the framework for NRS places detection
bias within the larger construct of information bias. The inclusion of information bias as a
stand-alone domain in the current framework highlights how NRS and RCTs fundamentally
differ in data acquisition; RCTs are by definition prospective studies carried out by dedicated
personnel following explicit protocols, collecting data in a formalized way. Whether
explanatory or pragmatic, RCTs are planned and structured experiments where specific
attempts are made to answer a clinical question. Most NRS, however, make use of
information collected during routine care and this has notable implications for data quality.
Patients answer questions and undergo tests at intervals that are dictated by care goals, not
research protocols. For these reasons, many items that emerged in the framework related
directly to information bias – namely the quality of the data available, and the measures used
to determine exposure, confounders and outcome.
Some reviewing the framework might question why publication bias, a type of reporting
bias, does not appear in the framework. The biases addressed in the framework all function
within a study. In contrast, publication bias, a metabias (Goodman and Dickersin 2011),
occurs at the level of the entire study; a study reporting statistically significant results is more
likely to be published, to be published more quickly and more than once, and ultimately to be cited more
often (Dwan et al. 2008). This bias is of particular concern to those performing systematic
reviews and meta-analyses where identifying all pertinent studies is critically important.
Since publication bias becomes apparent when aggregating studies, it does not apply to
within-study processes. The tendency to publish some outcomes in lieu of others is more
formally captured by the domain selective outcome reporting, in part, to highlight this very
distinction.
Considering the indexing of methodological articles is highly variable, an augmented search
strategy was used to identify relevant reviews of quality assessment tools for NRS. This is
one of the strengths of the current study, as is the use of a formal approach for the synthesis
and development of the framework. In addition, by dividing the concept “follow-up” into
four components (adequacy of duration, equality of duration and intensity, and losses to
follow-up), the framework highlights facets of follow-up that have been previously
overlooked. While the informal face validity exercise did not lead to additions to the
framework, it did help refine the language used. The excluded items also help to illustrate
how issues of reporting have historically been intertwined with assessments of conduct.
As the framework applies to NRS more broadly and has not been tailored for specific study
designs, such as retrospective cohort studies or case-control studies, some may argue this
represents a limitation of the current work. Indeed, various instruments for the evaluation of
NRS, including the Newcastle-Ottawa Scale, have variations that apply to specific study
designs (Stang 2010). A design-specific approach was not adopted because most of the
source material did not make an analogous distinction when reviewing the instruments
evaluating bias. Introducing such a distinction during the course of framework development
would have required an inferential leap. Moreover, since the current framework can be
applied to any NRS, end-users do not have to identify the type of NRS being evaluated with
the framework. Previous studies have demonstrated that there can be a great deal of
ambiguity with study design labels for NRS (Furlan 2006; Hartling et al. 2011).
Conceptual frameworks facilitate research by providing a structure for understanding a
phenomenon. The current framework focuses on sources of bias but it is important to note
that any bias has an associated direction and magnitude that is context dependent. Will a lack
of blinded outcome assessors always bias effect estimates away from the null? That depends
on the hypotheses being tested and the subconscious motivations of the researchers or study
personnel. For those using the framework to design or evaluate studies, it is important to
recognize how any given source of bias may arise in a particular study and, if possible,
to speculate on the direction of this bias. For RCTs, the Cochrane Risk of Bias Tool does not
focus solely on what was done but requires reviewers to make a judgment about the
associated risk of bias. For example, if outcome assessors were not blinded, is bias
introduced when evaluating an objective outcome such as mortality? There is empirical
evidence to support that objective outcomes are insulated from the effects of unblinded
assessment in RCTs (Wood et al. 2008; Hrobjartsson et al. 2012, 2013). Until there are
empirical studies of bias for NRS, theoretical considerations will be used in determining the
risk of bias associated with any given item or domain in the framework.
Many of the items and domains in the framework are well-known to methodologists,
researchers and the end-users of NRS. Some of these include the blinding of outcome
assessors, minimizing losses to follow-up and adjusting analyses for known confounding.
This framework helps to shed light on many of the important sources of bias that are
infrequently discussed but are nonetheless important to consider. For example, the
framework highlights not only the duration of follow-up as a source of bias but also the
intensity of follow-up. Both of these concepts are set aside from the concept “losses to
follow-up.” The comprehensiveness of the current framework is indeed one of its strengths.
Moreover, the sources of bias identified in this current framework include many of those
identified in the textbook “Clinical Epidemiology: The Essentials, 4th edition” (Fletcher and
Fletcher 2005). In fact, we identified an additional 12 sources of bias.
Ultimately, the framework for bias in comparative NRS may be used by individuals
designing a NRS, or by those reviewing a proposed protocol or a published study. Additional
research and guidance are necessary, however, to fully operationalize this framework for these
purposes.
3.6 Conclusion
In conclusion, a conceptual framework for bias in NRS containing six overarching domains
was developed. It contains 37 distinct items or sources of bias that were synthesized from
seven reviews of quality assessment tools for NRS.
Chapter 4 Common Methods for Chapters 5 & 6
4.1 Overview
The overarching objective of this thesis was to study bias in surgical NRS and RCTs. The
advent of laparoscopy provided us with such an opportunity; unlike most surgical
procedures, laparoscopic colon surgery has been thoroughly evaluated using both
randomized and non-randomized study designs (Howes et al. 1997; Wente et al. 2003). In the
1990s, clinical equipoise existed between laparoscopy and conventional surgery for the
treatment of colon cancer. NRS comparing the two surgical approaches first appeared in
1993 (Falk et al. 1993). Numerous early case reports, however, detailed the development of
port-site metastases following laparoscopic resections (Nduka et al. 1994; Wexner and
Cohen 1995; Montorsi et al. 1995; Kazemier et al. 1995; Zmora, Gervaz, and Wexner 2001).
These reports led many to believe that laparoscopy was inferior to “open” or conventional
surgery. Others, however, disagreed. By 1995, the results of the first RCT comparing
laparoscopy with open surgery were published (Lacy et al. 1995). Numerous NRS and RCTs
have since established the safety of laparoscopy for the treatment of colon cancer (Abraham
et al. 2007; Schwenk et al. 2005; Kuhry et al. 2008). However, our aim was not to derive
clinical conclusions about the adequacy of laparoscopy but instead, to study bias. Therefore,
this surgical procedure was chosen for a case study of bias because many NRS and RCTs
specifically comparing laparoscopy with conventional surgery are available.
Specific Aims #2 and #3 have been reiterated below to place the following methods in context:
Specific Aim #2 (Chapter 5)
To compare effect estimates from NRS with those from RCTs at low risk of bias.
Specific Aim #3 (Chapter 6)
To explore the impact of NRS-design attributes on estimates of treatment effect.
4.2 Literature search
All studies comparing laparoscopic resections with conventional surgery for colon cancer
were identified in Medline (1950-2010) and EMBASE (1980-2010). Comprehensive search
strategies were separately devised for each database with the assistance of an information
specialist (Appendix B). Search terms included “colon cancer,” “laparotomy,” “open
surgery” and “laparoscopy.” The reference lists of articles retrieved were also manually
searched to identify relevant studies. Titles and abstracts were reviewed for eligibility using
EndNote™ bibliographic management software (Thomson Reuters, New York, NY, USA).
Studies were included if they fulfilled the following a priori inclusion criteria: i) surgery
undertaken in an elective setting (versus emergency operations); ii) publication in a peer-
reviewed journal and iii) publication in English. Patients who undergo emergency operations
often have a lower baseline functional status and higher comorbidity burden than elective
surgery patients (Ingraham et al. 2012). Not surprisingly, emergency colon surgery is
associated with higher rates of post-operative complications (Kirchhoff, Clavien, and
Hahnloser 2010), mortality (Leung et al. 2009) and longer hospital admission (Kelly et al.
2012). To minimize between-study clinical heterogeneity, only articles describing the
outcomes of elective procedures were included. Publications in English were reviewed for
pragmatic reasons; notably, Juni et al. have shown that excluding non-English studies from
meta-analyses can have minimal impact on summary effect estimates (Juni et al. 2002).
Studies that met the following exclusion criteria were not considered further: i) no clinical
outcomes reported (e.g. studies reporting biochemical outcomes only); ii) animal studies and
iii) systematic reviews or meta-analyses. For all abstracts that met inclusion/exclusion
criteria or were potentially eligible, full articles were retrieved.
4.3 Data abstraction and management
Data were abstracted using a pretested and standardized data collection form. DRU and LS
pilot tested the form using articles comparing laparoscopy with open surgery for the
treatment of diverticulitis, another disease of the colon. The form was revised in an iterative
manner to eliminate ambiguity.
Data abstracted from each article included the following (Table 4.1): i) study design (NRS or
RCT); ii) article attributes (year of publication, journal, length); iii) author characteristics
(number, involvement of a consortium or methodological expert); iv) study attributes
(country where the study took place, academic versus community setting); v) outcomes
reported; and vi) unadjusted outcome data (number of events and group size, mean and
standard deviation).
For RCTs, seven risk of bias items (random sequence generation, allocation concealment,
blinding of participants and personnel, blinding of outcome assessment, incomplete outcome
data, selective outcome reporting and other sources of bias) were abstracted. These seven
items collectively form the Cochrane Risk of Bias Tool (Higgins et al. 2011) (see Section
4.8.2 for additional information). For NRS, nine study characteristics (primary outcome,
prospective data collection, sample size calculation, concurrent controls, matched controls,
standardized concurrent therapy, systematic outcome assessment, blinded outcome
assessment, intention to treat analysis) were abstracted. Additional information about the selection, definition and
validation of study characteristics is available in Section 6.3.2.
Table 4.1 Definitions for abstracted variables.
Columns: Category; Variable; Definition/Guidance
Study Design
- RCT: A clinical trial in which participants were randomly assigned to either the experimental group (i.e. laparoscopy) or the control group (i.e. open surgery)
- NRS: A study in which patients received the experimental intervention (i.e. laparoscopy) or conventional treatment (i.e. open surgery) in a non-random manner
Article Attributes
- Year of publication: Calendar year in which the article was published
- Journal: Name of the peer-reviewed periodical in which the article appears
- Length: Total number of pages
Author Characteristics
- Number: Number of named authors
- Consortium: The involvement of an organization or association as a named author
- Methodological expert: An author was identified as a methodological expert if he/she had an affiliation with a department of biostatistics, clinical epidemiology, health policy, public health or statistics
Study Attributes
- Country: If the country where the study took place was not specified in the “Methods,” the address of the corresponding author was used. If more than one country was specified in the “Methods,” the study was categorized as “Multinational”
- Academic versus community: A study was classified as taking place in an “Academic” setting if:
  - for single institution studies, the hospital involved was affiliated with a university
  - for multi-institutional studies, the primary investigator was affiliated with a university
- Outcomes reported: A list of all outcomes reported within the body of an article was abstracted
Outcome data
- Binary outcomes
  - Peri-operative mortality
    - Number of deaths respectively in the laparoscopy and open surgery groups
    - Number of participants in the laparoscopy and open groups
  - Post-operative complications
    - Number of patients experiencing a complication in the laparoscopy and open surgery groups
    - Number of participants in the laparoscopy and open surgery groups
- Continuous outcomes
  - Length of stay
    - Mean number of days a patient remained hospitalized following surgery in the laparoscopy and open groups
    - Standard deviation of the mean for each group
  - Number of lymph nodes harvested
    - Mean number of lymph nodes found within the surgical specimen in the laparoscopy and open surgery groups
    - Standard deviation of the mean for each group
All data were directly entered into a hierarchical, relational database constructed using
Microsoft Access™ (Microsoft, Seattle, Washington, USA) (Figure 4.1). Each study meeting
the inclusion/exclusion criteria was assigned a unique “study identification number” (SID)
that was used to link all nested, second-order forms and tables. Logic rules and data checks
were used to ensure data accuracy.
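The relational structure described above can be sketched with Python's built-in sqlite3 module. This is only an illustration under stated assumptions: the thesis database was built in Microsoft Access, and the table names, column names and constraints below are hypothetical stand-ins for the SID-linked forms, logic rules and data checks.

```python
import sqlite3

# Hypothetical labels for the four outcomes of interest.
OUTCOMES_OF_INTEREST = (
    "post_operative_complications",
    "peri_operative_mortality",
    "length_of_stay",
    "lymph_nodes_harvested",
)


def build_database(path=":memory:"):
    """Create a small SID-keyed schema with basic logic rules (assumed layout)."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA foreign_keys = ON")  # enforce SID links between tables
    allowed = ", ".join(f"'{name}'" for name in OUTCOMES_OF_INTEREST)
    conn.executescript(f"""
        CREATE TABLE study (
            sid    INTEGER PRIMARY KEY,  -- study identification number
            design TEXT NOT NULL CHECK (design IN ('RCT', 'NRS')),
            year   INTEGER
        );
        CREATE TABLE outcome_of_interest (
            sid     INTEGER NOT NULL REFERENCES study(sid),
            outcome TEXT NOT NULL CHECK (outcome IN ({allowed})),
            PRIMARY KEY (sid, outcome)   -- at most one row per outcome per study
        );
        CREATE TABLE other_outcome (
            id   INTEGER PRIMARY KEY,
            sid  INTEGER NOT NULL REFERENCES study(sid),
            name TEXT NOT NULL           -- any number of rows per study
        );
    """)
    return conn


conn = build_database()
conn.execute("INSERT INTO study (sid, design, year) VALUES (1, 'RCT', 1995)")
conn.execute(
    "INSERT INTO outcome_of_interest (sid, outcome) VALUES (1, 'length_of_stay')"
)
```

The composite primary key limits each study to one row per outcome of interest, and the foreign keys reject records whose SID does not exist in the study table.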
Figure 4.1 Database Structure. SID, study identification number. *Post-operative complications, peri-operative mortality, length of stay and number of lymph nodes harvested. See Section 4.5 for more detail about outcome selection.
[Figure: the SID links the Study-level Characteristics, Outcomes of Interest* and Other Outcomes tables; the relationship labels shown are 1:1, 1:4 and 1:∞.]
4.4 Categorizing studies as RCTs or NRS
Many study design classification tools are limited by fair inter-rater reliability and low
accuracy (Furlan 2006; Hartling et al. 2010; Hartling et al. 2011). For example, in a study by
Furlan and colleagues, the kappa statistic was 0.53 (95% CI, 0.49-0.67) among senior
reviewers classifying a given study as a RCT, controlled-clinical trial, prospective cohort
study, retrospective cohort study, case-control study, cross-sectional study, case series, or
case report (Furlan 2006). The kappa statistic was 0.46 (95% CI, 0.37-0.47) among junior
reviewers. Hartling and colleagues similarly found fair inter-rater reliability (kappa=0.45)
among six reviewers using a modified version of the Cochrane Non-Randomized Studies
Methods Group (NRSMG) design algorithm (Hartling et al. 2010). When reviewers’
classifications were compared with a reference standard, the assessments had low accuracy
(Hartling et al. 2011). It is possible that the low inter-rater reliability and low accuracy of these
approaches are due to the
In light of these findings, we chose to classify the design of comparative studies in our data
set as follows: i) RCTs or ii) NRS. A RCT was defined as a clinical trial in which
participants were randomly assigned to either the experimental group (i.e. laparoscopy) or
the control group (i.e. open surgery). In contrast, a NRS was defined as a study in which
patients received the experimental intervention (i.e. laparoscopy) or conventional treatment
(i.e. open surgery) in a non-random manner (Reeves et al. 2011). We believe this
classification scheme is far simpler than the approach identified by Furlan (Furlan 2006)
or any of the 10 classification tools evaluated by Hartling et al. (Hartling et al. 2010; Hartling
et al. 2011). Moreover, Furlan and colleagues have found that for RCTs, it is “relatively
simple to assign a label. It is based on direct observation, i.e. if the word ‘randomized’ (or its
variations) appears in the study suggesting that subjects were assigned at random to the
intervention and control groups.” Therefore, we believe our approach to study classification
is likely to be simple and accurate.
4.5 Outcome selection and definition
All outcomes reported in NRS and RCTs were abstracted and tabulated. For the current case
study of bias, we chose to focus our attention on outcomes that were both of clinical
significance and frequently reported. Of the more than 152 outcomes reported, we selected four
outcomes for our analyses: i) post-operative complications; ii) peri-operative mortality;
iii) length of stay and iv) number of lymph nodes harvested. “Post-operative complications”
was defined as the number of patients in either the control (i.e. open surgery) or intervention
(i.e. laparoscopic surgery) arm experiencing a complication within the first 30 days following
surgery. Similarly, the number of patients who died in each group within the first 30 post-
operative days contributed to the definition of peri-operative mortality. “Length of stay” was
considered the amount of time the patient remained hospitalized following surgery. The
“number of lymph nodes harvested” was defined as the number of nodes found within the
surgical specimen when examined by a pathologist.
4.5.1 Subjective versus objective outcomes
In a previous study examining the association between study design attributes and bias in
RCTs, Wood et al. categorized outcomes as objective or subjective (Wood et al. 2008).
They classified mortality and standardized laboratory procedure measures (e.g. hemoglobin
concentration) as objective outcomes whereas subjective outcomes included patient-reported
outcomes and physician-assessed disease outcomes (e.g. wound infection, pneumonia and
other complications). They found a number of design attributes (e.g. allocation concealment)
were associated with biased effect estimates among subjective but not objective outcomes. In
a larger meta-epidemiological study, Savović et al. reached the same conclusion.
Accordingly, we chose to categorize the outcomes of interest (i.e. post-operative
complications, peri-operative mortality, length of stay and number of lymph nodes
harvested) as subjective or objective as well, using the same criteria employed by
Wood et al. and Savović et al. Consensus was also achieved among our research group
regarding these categorizations.
For our analyses, an outcome can be considered “subjective” for three distinct reasons.
First, if multiple definitions of the outcome exist, subjectivity is introduced when choosing
one definition over another. For example, the Centers for Disease Control definition of
pneumonia (NHSN Patient Safety Component Manual. National Healthcare Safety Network.
Centers for Disease Control and Prevention) differs from the ACS National Surgical Quality
Improvement Program definition (ACS NSQIP - Classic Variables and Definitions, Chapter
4). In a hypothetical study of intensive care (ICU) patients, the physicians at hospital X
might use the former definition whereas the physicians at hospital Y, the latter.
Consequently, a judgment is being made as to which definition to use, making “pneumonia”
a subjective outcome. Height (in cm) on the other hand is an objective outcome since the
definition of a centimeter is an internationally standardized measure (Wandmacher and
Johnson 1995).
Second, an outcome may be deemed subjective due to variation in the processes used to
eventually assess its occurrence. For example, consider a study of post-operative patients
where the outcome of interest is the development of a clot in an extremity (i.e. a deep vein
thrombosis, DVT). Some patients with a DVT have associated redness and swelling in their
extremity which prompts the treating physician to order a test confirming the presence or
absence of a DVT. However, many patients have DVTs that are subclinical and have no
associated symptoms (Kelly et al. 2001). This does not mean these patients do not have
DVTs; if the appropriate test were ordered, the results would be positive. Ordering the test to
verify the presence of a DVT, however, is subject to the judgment or discretion of the treating
physician. The opportunity to assess the outcome is dependent on judgment.
Third, once the opportunity to assess the outcome arises, if the assessment itself can be
influenced by personal interpretation or opinion, then the outcome is subjective. Consider the
following scenario: in a study of peri-operative antibiotics, the definition of wound infection,
the primary outcome, might be standardized to include the presence of skin erythema (i.e.
redness). Physician A might regard the redness around patient X’s wound as evidence of an
infection. Alternatively, physician B might look at the same redness and believe that an
allergic reaction to the overlying tape has occurred. Even though both physicians are using
the identical definition of a wound infection and are looking at the same phenomenon (i.e. the
redness), they are interpreting the situation in two different ways. Consequently, physician A
might record a positive outcome whereas physician B would not. Again, judgment plays a
central role in determining whether the outcome has occurred.
In summary, an outcome is subjective if i) there are multiple definitions available;
ii) judgment determines the opportunity to assess the outcome or iii) discretion/interpretation
is implicit in the application of the definition. Therefore, post-operative complications and
length of stay were considered subjective outcomes since both can be influenced by
physicians’ discretion. On the other hand, peri-operative mortality was classified as an
objective outcome, as was the number of lymph nodes harvested.
4.5.2 Summary effect measures
A priori decisions were made to express binary outcomes as odds ratios (OR) and continuous
outcomes as mean differences (MD). When combining binary outcomes across multiple
studies, one can choose from among the following three summary effect measures: risk
difference (RD, also referred to as the absolute risk reduction, ARR), the risk ratio (RR, also
called the relative risk) and the OR (Walter 2000). The first is considered an absolute
measure of effect whereas the latter two are relative measures.
Risk is the probability with which an outcome will occur (Altman, Egger, and Smith 2001).
If the risk of an event occurring is 0.1 or 10%, then 10 out of 100 people will experience the
outcome. The RR is the risk of the outcome in the exposed or treatment group divided by the
risk in the unexposed or control group. A RR of 0.2 implies that the risk of the outcome in
the treatment group is 20% of the risk in the control group. The RD is simply the difference
in risk between the treatment group and the control group. The odds of an event, however, are
the probability an event will occur divided by the probability that it will not occur (Walter
2000). If the odds of an event are 0.33, then for every one person experiencing an outcome,
three will not.
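The definitions above can be illustrated with a minimal Python sketch. The function names and the example counts are hypothetical and are not drawn from the studies analysed in this thesis.

```python
def risk(events, total):
    """Risk: probability of the outcome (events divided by group size)."""
    return events / total


def odds(events, total):
    """Odds: events divided by non-events."""
    return events / (total - events)


def risk_difference(events_t, n_t, events_c, n_c):
    """RD: risk in the treatment group minus risk in the control group."""
    return risk(events_t, n_t) - risk(events_c, n_c)


def risk_ratio(events_t, n_t, events_c, n_c):
    """RR: risk in the treatment group divided by risk in the control group."""
    return risk(events_t, n_t) / risk(events_c, n_c)


def odds_ratio(events_t, n_t, events_c, n_c):
    """OR: odds in the treatment group divided by odds in the control group."""
    return odds(events_t, n_t) / odds(events_c, n_c)


# Hypothetical trial: 2 of 100 treated and 10 of 100 control patients have the event.
example = (2, 100, 10, 100)
print("RR:", risk_ratio(*example))       # treated risk is one fifth of control risk
print("RD:", risk_difference(*example))  # absolute difference in risk
print("OR:", odds_ratio(*example))
print("odds when 1 of 4 has the event:", odds(1, 4))  # one event per three non-events
```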
When choosing a summary statistic, it is important to weigh the consistency and the
mathematical properties of the measure with the ease of its interpretation (Altman, Egger,
and Smith 2001). If the heterogeneity in a meta-analysis increases because of the summary
measure chosen, the measure has low consistency. Empirical work by Deeks et al. has
demonstrated that meta-analyses using the RD have higher heterogeneity than when the RR
is computed for the same meta-analyses (Deeks 2002). When comparing RR with OR,
comparable levels of heterogeneity were found. Accordingly, the RR and OR are generally
regarded as more consistent than the RD.
The OR or log(OR) have superior mathematical properties as compared with the RR. In
particular, the OR and its log are considered symmetrical summary effect measures (Walter
2000). For example, if the odds of death are 0.2 in one group and 0.4 in the other, the OR for
death is 0.2/0.4=0.50 and the OR for survival is 2.0. The latter OR is simply the reciprocal of
the former. Moreover, if the log(OR) is used,
only a change in sign is required.
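The symmetry property can be checked numerically. The sketch below reuses the odds of 0.2 and 0.4 from the example above; the conversion from odds to risk and the variable names are illustrative assumptions added to contrast the OR with the RR.

```python
import math


def odds_to_risk(o):
    """Convert odds to the corresponding probability."""
    return o / (1 + o)


odds_death_a, odds_death_b = 0.2, 0.4

# Odds ratio for death, and for survival (odds of survival = 1 / odds of death).
or_death = odds_death_a / odds_death_b
or_survival = (1 / odds_death_a) / (1 / odds_death_b)

# Risk ratios for the same two groups.
risk_a, risk_b = odds_to_risk(odds_death_a), odds_to_risk(odds_death_b)
rr_death = risk_a / risk_b
rr_survival = (1 - risk_a) / (1 - risk_b)

print("OR death, OR survival:", or_death, or_survival)    # reciprocals of each other
print("log(OR):", math.log(or_death), math.log(or_survival))  # same size, opposite sign
print("RR death, RR survival:", rr_death, rr_survival)    # not reciprocals
```

Switching the outcome from death to survival inverts the OR exactly, whereas the RR for survival is not the reciprocal of the RR for death.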
Several studies have demonstrated that OR are often misinterpreted as RR and that of the
measures discussed, the OR is the most difficult to intuit (Montreuil, Bendavid, and Brophy
2005; Davies, Crombie, and Tavakoli 1998; Grimes and Schulz 2008; Sinclair and Bracken
1994). With rare events, the difference between the OR and the RR is small but as events
become more common, this gap widens (Montreuil, Bendavid, and Brophy 2005). For
example, consider the following experimental scenario where the outcome was rare
(i.e. ≤10%):
Group          Outcome: Yes   Outcome: No   Total   Odds
Experimental   1              20            21      1:20 or 0.05
Control        1              25            26      1:25 or 0.04
The OR in this instance is 0.05/0.04 = 1.25. The RR is (1/21)/(1/26)=1.24. Alternatively, one
could use the following formula (Sinclair and Bracken 1994) to calculate the RR:
\[ RR = \frac{OR}{1 + I_c\,(OR - 1)} \tag{3.1} \]
Ic is the incidence of the event in the control group. In another experiment, events (e.g. death)
were common:
Group          Outcome: Yes   Outcome: No   Total   Odds
Experimental   5              1             6       5:1 or 5.0
Control        10             1             11      10:1 or 10.0
In this example, the OR is 5.0/10.0=0.5. However, the RR is equal to (5/6)/(10/11)=0.92.
When the event became more common, interpreting the OR as a RR would have
considerably overestimated the benefit associated with treatment.
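The two scenarios above, and the conversion from OR to RR in formula (3.1), can be reproduced with a short Python sketch. The function names are hypothetical; the counts are those shown in the two tables.

```python
def odds_ratio(a, b, c, d):
    """OR from a 2x2 table: a/b are events/non-events in the experimental
    group, c/d are events/non-events in the control group."""
    return (a / b) / (c / d)


def risk_ratio(a, b, c, d):
    """RR from the same 2x2 table."""
    return (a / (a + b)) / (c / (c + d))


def rr_from_or(or_value, control_incidence):
    """Formula (3.1): convert an OR to a RR given the control-group incidence."""
    return or_value / (1 + control_incidence * (or_value - 1))


rare = (1, 20, 1, 25)    # rare outcome scenario
common = (5, 1, 10, 1)   # common outcome scenario

for label, (a, b, c, d) in (("rare", rare), ("common", common)):
    or_value = odds_ratio(a, b, c, d)
    rr_value = risk_ratio(a, b, c, d)
    converted = rr_from_or(or_value, c / (c + d))
    print(label, round(or_value, 2), round(rr_value, 2), round(converted, 2))
```

In the rare-outcome scenario the OR and RR nearly coincide, whereas in the common-outcome scenario they diverge; in both cases the converted value agrees with the RR computed directly from the counts.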
The analyses in this thesis were designed to provide insight into the relationship between
study design and bias. The superior clinical intuitiveness of the RR was therefore
non-contributory. Binary outcomes (i.e. post-operative complications and peri-operative
mortality) were thus expressed as OR because this effect measure has superior consistency
compared with the RD and superior symmetry compared with the RR.
For continuous outcomes, the MD was chosen as the summary effect measure instead of the
standardized mean difference (SMD) or the ratio of means (RoM). The MD is the difference
in means between two groups whereas the SMD is equal to this difference divided by the
pooled standard deviation. The RoM is equal to the mean in the experimental group divided
by the mean in the control group. SMD is the measure of choice when the outcome is likely
to be measured in different ways (e.g. using different psychometric scales) across included
studies (Higgins, Green, and Cochrane Collaboration 2011). The SMD, MD and RoM have
similar consistency (Friedrich, Adhikari, and Beyene 2011) but the MD is the most intuitive
of the three measures. Since the outcomes length of stay and number of lymph nodes
harvested were likely to be measured in days and integers respectively, there was no
advantage in using the SMD. Ultimately, the MD was chosen because the statistical software
used (R, version 2.15.0, R Foundation for Statistical Computing, Vienna, Austria) has the
graphing capacity to generate funnel plots with MD but not RoM (Viechtbauer 2010).
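The three candidate measures for continuous outcomes can be compared with a minimal Python sketch. The function names and the length-of-stay values are hypothetical, and the pooled standard deviation uses the usual equal-variance form, which is an assumption rather than a detail stated in the text.

```python
import math


def mean_difference(mean_e, mean_c):
    """MD: experimental mean minus control mean, in the outcome's own units."""
    return mean_e - mean_c


def pooled_sd(sd_e, n_e, sd_c, n_c):
    """Pooled standard deviation of two groups (equal-variance form, assumed)."""
    return math.sqrt(((n_e - 1) * sd_e**2 + (n_c - 1) * sd_c**2) / (n_e + n_c - 2))


def standardized_mean_difference(mean_e, sd_e, n_e, mean_c, sd_c, n_c):
    """SMD: the MD divided by the pooled standard deviation (unitless)."""
    return mean_difference(mean_e, mean_c) / pooled_sd(sd_e, n_e, sd_c, n_c)


def ratio_of_means(mean_e, mean_c):
    """RoM: experimental mean divided by control mean."""
    return mean_e / mean_c


# Hypothetical length of stay (days): laparoscopy 5 (SD 2), open surgery 7 (SD 2).
print("MD :", mean_difference(5.0, 7.0))  # two fewer days in hospital
print("SMD:", standardized_mean_difference(5.0, 2.0, 50, 7.0, 2.0, 50))
print("RoM:", ratio_of_means(5.0, 7.0))
```

Only the MD retains the original unit (days), which is why it is the most directly interpretable of the three.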
4.6 Handling multiple publications of the same cohort
We anticipated encountering multiple publications providing results for the same cohort of
study subjects. In these instances, articles were combined into a data group. Article
attributes, author characteristics and study attributes were abstracted from the earliest
publication in a data group. Outcome data for all-cause mortality, post-operative
complications, length of stay (LOS) and number of lymph nodes (LN) harvested were separately abstracted from the
publication that provided the most complete information (i.e. for the largest number of
patients).
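The rule for handling a data group can be sketched as follows. The record layout, field names and example publications are hypothetical; the sketch only assumes that each publication records its year and, for each outcome it reports, the number of patients contributing data. Ties are resolved in favour of the first publication listed, which is also an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class Publication:
    pub_id: str
    year: int
    # Hypothetical field: outcome name -> number of patients with data.
    patients_by_outcome: dict = field(default_factory=dict)


def summarize_data_group(publications):
    """Return the publication supplying study-level attributes (the earliest)
    and, for each outcome, the publication with the most patients."""
    earliest = min(publications, key=lambda pub: pub.year)
    best = {}  # outcome -> (pub_id, n_patients)
    for pub in publications:
        for outcome, n_patients in pub.patients_by_outcome.items():
            if outcome not in best or n_patients > best[outcome][1]:
                best[outcome] = (pub.pub_id, n_patients)
    outcome_sources = {outcome: pub_id for outcome, (pub_id, _) in best.items()}
    return earliest.pub_id, outcome_sources


group = [
    Publication("A", 2002, {"mortality": 200, "length_of_stay": 200}),
    Publication("B", 2004, {"mortality": 450, "complications": 450}),
]
attribute_source, outcome_sources = summarize_data_group(group)
print(attribute_source)
print(outcome_sources)
```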
4.7 Approach to missing data for continuous outcomes
Measures of central tendency for continuous variables include the mean, median and mode.
Authors typically report either the mean and standard deviation when data are normally
distributed or the median and IQR if data are skewed (Fletcher and Fletcher 2005). During
data abstraction, it was often noted that either the mean or the median was reported for LOS and
number of LN harvested. To combine results across studies, means and associated standard
deviations were required (Egger et al. 2003). In instances where only medians and IQRs or ranges
were reported, medians were treated as means and standard deviations (σ) were calculated
using the following formulae (Hozo, Djulbegovic, and Hozo 2005):
\[ \sigma = \frac{\mathrm{IQR}}{1.35} \tag{3.2} \]
\[ \sigma = \frac{\mathrm{range}}{4} \tag{3.3} \]
IQR is the inter-quartile range. The range is the highest value minus the lowest value.
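Formulae (3.2) and (3.3) amount to two one-line conversions, sketched below in Python. The function names and the example length-of-stay values are hypothetical.

```python
def sd_from_iqr(q1, q3):
    """Formula (3.2): approximate the standard deviation from the IQR."""
    return (q3 - q1) / 1.35


def sd_from_range(lowest, highest):
    """Formula (3.3): approximate the standard deviation from the range."""
    return (highest - lowest) / 4


# Hypothetical report: median length of stay 6 days, IQR 4 to 8, range 2 to 14.
approx_mean = 6.0  # the median is treated as the mean
print(approx_mean, sd_from_iqr(4, 8), sd_from_range(2, 14))
```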
Occasionally the means for both groups (laparoscopy and open surgery) were provided with
no measure of dispersion. If the p-value for the difference of means was reported, the
missing standard deviation (σ) was calculated using the following steps, assuming that the p-
value arose from a two-tailed, equal variances, Student’s t-test comparison; first, the absolute
value of the t-statistic (t), that corresponds to the two-sided p-value (p) from groups of size n1
and n2, was obtained:
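\[ t = \left| t^{-1}_{df = n_1 + n_2 - 2}\left( \frac{p}{2} \right) \right| \tag{3.4} \]
When the t-statistic, sample sizes and group means (x̄1 and x̄2) are known, then only the σ is
unknown in the following formula:
\[ t = \frac{|\bar{x}_1 - \bar{x}_2|}{\sigma \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}} \tag{3.5} \]
Solving for the standard deviation (σ):
\[ \sigma = \frac{|\bar{x}_1 - \bar{x}_2|}{t \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}} \tag{3.6} \]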
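The steps in formulae (3.4) to (3.6) can be sketched in Python using only the standard library. This is an illustration, not the procedure used in the thesis (which relied on R): the inverse of the t distribution is obtained here by numerical integration and bisection, and all function names and example values are hypothetical. Very small p-values are outside the scope of this simple sketch.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_const = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_const - (df + 1) / 2 * math.log1p(x * x / df))


def two_sided_p(t, df, steps_per_unit=400):
    """Two-sided p-value for |t|, integrating the density from 0 to t
    with Simpson's rule."""
    n = max(2, 2 * math.ceil(t * steps_per_unit / 2))  # even number of intervals
    h = t / n
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 1.0 - 2.0 * (total * h / 3.0)


def t_from_p(p, df):
    """Formula (3.4): absolute t-statistic matching a two-sided p-value."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    lo, hi = 0.0, 1.0
    while two_sided_p(hi, df) > p:  # widen the bracket until it contains t
        hi *= 2.0
        if hi > 1024.0:
            raise ValueError("p is too small for this simple numerical sketch")
    for _ in range(60):  # bisection
        mid = (lo + hi) / 2.0
        if two_sided_p(mid, df) > p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def sd_from_p(mean1, mean2, n1, n2, p):
    """Formula (3.6): standard deviation implied by the reported p-value,
    assuming a two-tailed, equal-variance t-test."""
    t = t_from_p(p, n1 + n2 - 2)
    return abs(mean1 - mean2) / (t * math.sqrt(1.0 / n1 + 1.0 / n2))


# Hypothetical study: means of 5 and 7 days, 6 patients per group, p = 0.05.
print(t_from_p(0.05, 10))
print(sd_from_p(5.0, 7.0, 6, 6, 0.05))
```

The calculation is undefined when the two means are equal, since the implied standard deviation would be zero.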
For studies that did not adhere to the intention to treat principle, laparoscopy patients whose
operations were converted to open procedures occasionally had their outcomes reported
separately. Alternatively, outcomes were reported for the laparoscopy and open surgery
arms of a study across two time periods. In instances where means were combined across
two or more groups, the following calculations were employed to generate the weighted
mean (µ):
𝜇 = ∑ 𝑛𝑖�̅�𝑖𝑘𝑖∑ 𝑛𝑖𝑘𝑖
(3.7)
69
Here $n_i$ is the number of patients in a given group and $\bar{x}_i$ is the mean in that group. To
calculate the weighted standard deviation (σcalculated) (3.11), the following formulae were
used to calculate, in turn, the sum of squares between groups (SSbetween) (3.8), the sum of
squares within groups (SSwithin) (3.9) and the combined sum of squares (SSTotal)
(3.10) (Pagano and Gauvreau 2000):
$$SS_{between} = \sum_{i=1}^{k} n_i (\bar{x}_i - \mu)^2 \tag{3.8}$$
$$SS_{within} = \sum_{i=1}^{k} (n_i - 1)\,\sigma_i^2 \tag{3.9}$$
$$SS_{Total} = SS_{between} + SS_{within} \tag{3.10}$$
$$\sigma_{calculated} = \sqrt{\frac{SS_{Total}}{\sum_{i=1}^{k} (n_i - 1)}} \tag{3.11}$$
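A minimal sketch of formulae (3.7) to (3.11) follows. The function name and the example groups are hypothetical. The denominator in the last step is the sum of (n_i - 1), as written in (3.11); a standard deviation of the pooled sample would instead divide by the total sample size minus one.

```python
import math


def combine_groups(groups):
    """Combine (n, mean, sd) triples into one weighted mean and standard deviation."""
    total_n = sum(n for n, _, _ in groups)
    mu = sum(n * mean for n, mean, _ in groups) / total_n            # (3.7)
    ss_between = sum(n * (mean - mu) ** 2 for n, mean, _ in groups)  # (3.8)
    ss_within = sum((n - 1) * sd ** 2 for n, _, sd in groups)        # (3.9)
    ss_total = ss_between + ss_within                                # (3.10)
    sd = math.sqrt(ss_total / sum(n - 1 for n, _, _ in groups))      # (3.11)
    return mu, sd


# Hypothetical example: completed laparoscopic cases (n=20) and converted cases (n=10).
mu, sd = combine_groups([(10, 5.0, 2.0), (20, 8.0, 3.0)])
```

For these inputs the weighted mean is 7.0, and the combined standard deviation is the square root of 267/28.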
Finally, in instances where none of the aforementioned methods could be used to calculate
missing standard deviations, simple imputation was employed (Allison 2002). Using formula
(3.12), the value for the missing standard deviation (σmissing) was calculated from the standard
deviations (σi) and sample sizes (ni) available:
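| Study | n | σ |
| --- | --- | --- |
| 1 | n1 | σ1 |
| 2 | n2 | σ2 |
| 3 | n3 | σmissing |
| … | … | … |
| k | nk | σk |

$$\sigma_{missing} = \sqrt{\frac{\sum_{i=1}^{k} (n_i - 1)\,\sigma_i^2}{\sum_{i=1}^{k} (n_i - 1)}} \tag{3.12}$$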
The ni in this instance is the number of patients in a given arm of a study – either the
laparoscopy or open surgery arm.
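The imputation step can be sketched as below, under the assumption that formula (3.12) is the usual pooled standard deviation, in which each available variance is weighted by (n_i - 1). The function name and the example arms are hypothetical.

```python
import math


def impute_missing_sd(arms):
    """Pooled standard deviation from (n, sd) pairs of arms that reported one.

    Assumes the pooled form: sqrt(sum((n_i - 1) * sd_i**2) / sum(n_i - 1)).
    """
    numerator = sum((n - 1) * sd ** 2 for n, sd in arms)
    denominator = sum(n - 1 for n, _ in arms)
    return math.sqrt(numerator / denominator)


# Hypothetical arms with reported standard deviations.
imputed_sd = impute_missing_sd([(11, 2.0), (21, 3.0)])
```

Arms with a missing standard deviation are simply left out of the input list, so the imputed value depends only on the arms that reported one.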
4.8 Identifying a referent group – Strong RCTs
4.8.1 Why categorize RCTs as Typical versus Strong?
The analyses of Chapters 5 and 6 required the identification of an appropriate referent group.
The objective of Specific Aim #2 (Chapter 5) was to compare the results of NRS with RCTs.
However, it has been shown that RCTs without appropriate allocation concealment (Schulz
et al. 1995; Moher et al. 1998; Kjaergard, Villumsen, and Gluud 2001; Pildal et al. 2007;
Nuesch, Reichenbach, et al. 2009) or blinding (Schulz et al. 1995; Moher et al. 1998;
Kjaergard, Villumsen, and Gluud 2001; Pildal et al. 2007; Nuesch, Reichenbach, et al. 2009;
Hrobjartsson et al. 2012, 2013), and RCTs with selective outcome reporting (Chan et al. 2004),
are likely biased. If we compared the results of NRS with those from all the RCTs, we would have
been using a heterogeneous referent group of studies at varying risks of bias. Instead, we
compared summary effect estimates from NRS with those from the low risk of bias RCTs
(i.e. Strong RCTs). The Cochrane Risk of Bias Tool was used to identify Strong RCTs
according to the methods outlined in Section 4.8.2.
The goal of Specific Aim #3 (Chapter 6) was to model the bias associated with NRS design
attributes. Comparing effect estimates from NRS without a design attribute (e.g. concurrent
controls) to those from NRS with the attribute would assume that the effect estimate from the
latter is the most accurate one available – the one closest to the proverbial “truth.” The
estimates of treatment effect from Strong RCTs were instead chosen as the referent group for
two reasons. First, there is empirical evidence to support the selection of certain RCTs as less
biased than others (Schulz et al. 1995; Moher et al. 1998; Kjaergard, Villumsen, and Gluud 2001;
Egger et al. 2003; Als-Nielsen et al. 2003; Tierney and Stewart 2005; Pildal et al. 2007;
Nuesch, Reichenbach, et al. 2009; Nuesch, Trelle, et al. 2009; Nuesch et al. 2010; Bassler et
al. 2010; Dechartres et al. 2011; Bafeta et al. 2012; Hrobjartsson et al. 2012, 2013). Second,
an RCT at low risk of bias is likely to provide an estimate of treatment effect that is
closest to the “truth.”
4.8.2 Cochrane Risk of Bias Tool
Strong RCTs were identified using the Cochrane Risk of Bias Tool. This tool is endorsed by
the Cochrane Collaboration for the evaluation of internal validity of RCTs (Higgins, Green,
and Cochrane Collaboration. 2011). In 2005, members of the Cochrane Bias Methods Group
and the Cochrane Statistical Methods Group developed the first version of the tool. The
development process involved the compilation of potential sources of bias, a review of the
available empirical evidence, informal consensus for the selection and operationalization of
bias domains, the incorporation of feedback in an iterative manner and pilot testing. An
additional three-stage project involving focus groups, online surveys and a consensus
meeting was undertaken in 2009 to evaluate the original tool. An updated version of the
instrument was released in 2011 (Higgins et al. 2011).
This instrument is composed of seven items classified under six domains of bias (Table 4.2).
The domains include selection bias, performance bias, detection bias, attrition bias, reporting
bias and other bias. Items appear in the first column of Table 4.2. Of note, some items are
assessed at the study-level whereas others require a separate assessment for each outcome.
Table 4.2 Cochrane Risk of Bias Tool

| Item | Support for judgement | Domain |
| --- | --- | --- |
| Random sequence generation | Describe the method used to generate the allocation sequence in sufficient detail to allow an assessment of whether it should produce comparable groups. | Selection bias (biased allocation to interventions) due to inadequate generation of a randomised sequence. |
| Allocation concealment | Describe the method used to conceal the allocation sequence in sufficient detail to determine whether intervention allocations could have been foreseen in advance of, or during, enrolment. | Selection bias (biased allocation to interventions) due to inadequate concealment of allocations prior to assignment. |
| Blinding of participants and personnel. Assessments should be made for each main outcome (or class of outcomes). | Describe all measures used, if any, to blind study participants and personnel from knowledge of which intervention a participant received. Provide any information relating to whether the intended blinding was effective. | Performance bias due to knowledge of the allocated interventions by participants and personnel during the study. |
| Blinding of outcome assessment. Assessments should be made for each main outcome (or class of outcomes). | Describe all measures used, if any, to blind outcome assessors from knowledge of which intervention a participant received. Provide any information relating to whether the intended blinding was effective. | Detection bias due to knowledge of the allocated interventions by outcome assessors. |
| Incomplete outcome data. Assessments should be made for each main outcome (or class of outcomes). | Describe the completeness of outcome data for each main outcome, including attrition and exclusions from the analysis. State whether attrition and exclusions were reported, the numbers in each intervention group (compared with total randomized participants), reasons for attrition/exclusions where reported, and any re-inclusions in analyses performed by the review authors. | Attrition bias due to amount, nature or handling of incomplete outcome data. |
| Selective reporting | State how the possibility of selective outcome reporting was examined by the review authors, and what was found. | Reporting bias due to selective outcome reporting. |
| Other sources of bias | State any important concerns about bias not addressed in the other domains in the tool. If particular questions/entries were pre-specified in the review’s protocol, responses should be provided for each question/entry. | Bias due to problems not covered elsewhere in the table. |

Adapted from Table 8.5.a, Higgins JPT, Altman DG, Sterne JAC (editors). Chapter 8: Assessing risk of bias in included studies. In: Higgins JPT, Green S (editors). Cochrane Handbook for Systematic Reviews of Interventions Version 5.1.0 (updated March 2011). The Cochrane Collaboration, 2011. Available from www.cochrane-handbook.org.

Each item in the tool is assessed in two steps: a summary of known facts followed by a
judgment. First, free text descriptions or pertinent quotes about what took place are
compiled from published reports, protocols or correspondence with authors. Providing this
supporting information augments the transparency of the assessment process. Second,
specific and detailed criteria are used to assign a judgment of low/unclear/high risk of bias
for each item. These criteria are outlined in the Cochrane Handbook for Systematic Reviews
of Interventions (Appendix C) (Higgins, Green, and Cochrane Collaboration. 2011). The
judgments for each item are used to determine if the study on the whole is at
low/unclear/high risk of bias according to the approach outlined in Table 4.3. Study
assessments are in turn used to determine if a meta-analysis of the studies is at
low/unclear/high risk of bias. The role of judgment is clearly central to the Cochrane Risk of
Bias Tool. It is for this reason the Collaboration suggests that judgments be made
independently by at least two people, with discrepancies resolved via discussion.
Table 4.3 Approach for summary assessments of risk of bias for an item, within a study and within a meta-analysis

| Risk of bias | Interpretation | Judgement within a study | Judgement in a meta-analysis (across studies) |
| --- | --- | --- | --- |
| Low risk | Plausible bias unlikely to seriously alter the results. | Low risk of bias for all key domains. | Most information is from studies at low risk of bias. |
| Unclear risk | Plausible bias that raises some doubt about the results. | Unclear risk of bias for one or more key domains. | Most information is from studies at low or unclear risk of bias. |
| High risk | Plausible bias that seriously weakens confidence in the results. | High risk of bias for one or more key domains. | The proportion of information from studies at high risk of bias is sufficient to affect the interpretation of results. |

Adapted from Table 8.7.a, Higgins JPT, Altman DG, Sterne JAC (editors). Chapter 8: Assessing risk of bias in included studies. In: Higgins JPT, Green S (editors). Cochrane Handbook for Systematic Reviews of Interventions Version 5.1.0 (updated March 2011). The Cochrane Collaboration, 2011. Available from www.cochrane-handbook.org.
The Cochrane Risk of Bias Tool was used to identify Strong RCTs instead of other
instruments for three reasons. First, the Cochrane tool contains items used to evaluate the
internal validity of a study whereas other tools often have elements relating to external
generalizability, precision, ethics, or reporting (Katrak et al. 2004). For example, the Jadad
scale awards one point to trials that are “described as randomized” (Jadad et al. 1996). This
item does not focus on whether appropriate methods of randomization were used but instead
on whether the authors employed the terms randomly, random or randomization. In other
words, while the Cochrane Risk of Bias Tool focuses specifically on bias, other instruments
occasionally evaluate other aspects of quality. Second, the Cochrane Risk of Bias Tool was
developed using rigorous methods and pilot tested using studies drawn from multiple areas
of medicine (Higgins et al. 2011). In contrast, the Jadad Scale was developed using trials
reporting pain outcomes or analgesic interventions for outcomes other than pain (i.e. adverse
event profile). Third, the Cochrane Risk of Bias Tool does not produce a summary numerical
score in contrast to other available instruments. It has been demonstrated that the use of
summary scores often leads to inconsistent identification of low risk of bias RCTs (Juni et al.
1999; Herbison, Hay-Smith, and Gillespie 2006). This may be attributable to the fact that
many scales assign weights to included dimensions in an ad hoc manner (Juni et al. 1999).
The main limitation of the Cochrane Risk of Bias Tool is variable inter-rater reliability;
weighted kappas range from 0.13 (95% CI, -0.05 to 0.31) for selective outcome reporting to
0.74 (95% CI, 0.64 to 0.85) for sequence generation (Hartling et al. 2012). However,
Hartling and colleagues examined an earlier version of the Cochrane Risk of Bias Tool. In
the interim, the instrument has been revised to diminish ambiguity and now includes more
detailed guidance for making individual domain assessments (Higgins et al. 2011). This more recent
version of the tool was used in this study.
Strong RCTs were defined a priori as trials that were rated at a low risk of bias for five of the
seven bias items within the Cochrane Risk of Bias Tool: random sequence generation,
allocation concealment, incomplete outcome data, selective outcome reporting, and other
sources of bias. Because blinding is rare in studies of non-pharmacological interventions
(Boutron et al. 2004), the two blinding items were not included in the definition of “Strong”
RCTs for our surgical data set. The tool was used to discriminate between Typical and Strong
RCTs, and an item that is consistently unmet across studies would not help separate the two
groups. “Other” sources of bias were defined a priori to include violations of the intention
to treat principle; studies that erroneously included converted laparoscopic patients with
those in the open surgery group were considered at high risk of bias. Typical RCTs were
defined as those not meeting the criteria for Strong RCTs. Assessments were made using
published reports and protocols, and by contacting authors when information was unavailable.
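The classification rule described above can be expressed as a short sketch. The item keys, the rating labels and the example trials are hypothetical names chosen for illustration; only the rule itself (low risk on the five non-blinding items) comes from the text.

```python
# The five items that define a Strong RCT; the two blinding items are ignored.
REQUIRED_ITEMS = (
    "random_sequence_generation",
    "allocation_concealment",
    "incomplete_outcome_data",
    "selective_outcome_reporting",
    "other_bias",
)


def classify_rct(ratings):
    """Return 'Strong' if every required item is rated 'low', otherwise 'Typical'."""
    missing = [item for item in REQUIRED_ITEMS if item not in ratings]
    if missing:
        raise KeyError(f"missing ratings for: {missing}")
    if all(ratings[item] == "low" for item in REQUIRED_ITEMS):
        return "Strong"
    return "Typical"


# Hypothetical trial: unblinded, but low risk on all five required items.
unblinded_trial = {item: "low" for item in REQUIRED_ITEMS}
unblinded_trial["blinding_participants_personnel"] = "high"
unblinded_trial["blinding_outcome_assessment"] = "high"

# Hypothetical trial: no published protocol, so selective reporting is unclear.
no_protocol_trial = dict(unblinded_trial, selective_outcome_reporting="unclear")
```

Here `classify_rct(unblinded_trial)` yields a Strong rating despite the high-risk blinding items, while `classify_rct(no_protocol_trial)` yields Typical.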
4.8.3 Validating risk of bias assessments
All risk of bias assessments for RCTs reporting the outcomes of interest were performed by
LS. A second individual (DRU) independently assessed the risk of bias for
RCTs reporting post-operative complications (n=20 trials). Cohen’s kappa statistic was
generated for the agreement between the two raters in classifying studies as Typical or
Strong. There was perfect agreement (κ=1.00).
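For readers unfamiliar with the statistic, the sketch below computes an unweighted Cohen's kappa for two raters. The function name and the label lists are illustrative only; the individual trial-level ratings are not reproduced here.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two equal-length lists of category labels."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must rate the same non-empty set of studies")
    n = len(rater_a)
    # Observed proportion of studies on which the raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's marginal frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / n ** 2
    if expected == 1.0:
        raise ValueError("kappa is undefined when both raters use a single category")
    return (observed - expected) / (1 - expected)


# Illustrative labels: two raters who agree on every one of 20 trials.
ratings_ls = ["Strong"] * 4 + ["Typical"] * 16
ratings_dru = ["Strong"] * 4 + ["Typical"] * 16
kappa = cohens_kappa(ratings_ls, ratings_dru)
```

Identical label lists give a kappa of 1, which is the perfect-agreement case.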
4.9 Statistical analyses
Descriptive statistics were calculated to compare NRS and RCTs in terms of year of
publication, number of participants, academic versus community setting, presence of a
consortium among authors, number of named authors, methodological expertise among
authors, length of articles and baseline event rate (or mean) in control groups. Absolute and
relative frequencies were calculated for discrete variables and, where appropriate, medians
and IQRs were calculated for continuous variables with a non-normal distribution. Medians
were compared using the Mann-Whitney U test and categorical variables using the
Chi-square or Fisher’s exact test, as appropriate (Pagano and Gauvreau 2000). A p-value <0.05
was considered statistically significant. All data were analyzed using R, version 2.15.0 (R Foundation for
Statistical Computing, Vienna, Austria).
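As an illustration of one of the tests named above, the following standard-library Python sketch computes a two-sided Fisher's exact test for a 2x2 table. It is not the R code used for the thesis analyses; the function name and the example counts are hypothetical. The two-sided p-value is taken as the sum of the probabilities of all tables that are no more likely than the observed one, which is one common convention.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denominator = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denominator

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # Sum over tables at least as extreme (small tolerance for rounding).
    p_value = sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )
    return min(1.0, p_value)


# Hypothetical counts: 8 of 10 studies in one group and 1 of 6 in the other
# share a characteristic.
p_value = fisher_exact_two_sided(8, 2, 1, 5)
```

Other software may use a different definition of the two-sided p-value, so small discrepancies with published values are possible.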
4.10 Results
4.10.1 Data cohort
Once duplicates were removed, 7528 distinct abstracts remained (Figure 4.2). A further 7203
abstracts were excluded. Of the 325 abstracts focusing on laparoscopy versus open surgery
for colon cancer, 133 were excluded: 50 were systematic reviews or meta-analyses, 43 were
studies that provided only biochemical outcomes (e.g. pre- and post-operative IL-6 levels),
and 40 were written in a language other than English (Appendix D). The remaining 192
studies met the a priori inclusion criteria (Table 4.4).
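The counts in this paragraph can be checked with simple arithmetic. The sketch below only re-derives the numbers stated in the text; the variable names are made up for illustration.

```python
# Counts taken from the paragraph above.
distinct_abstracts = 7528
excluded_at_screening = 7203
excluded_reviews = 50
excluded_biochemical_only = 43
excluded_non_english = 40

relevant_abstracts = distinct_abstracts - excluded_at_screening
excluded_after_review = (
    excluded_reviews + excluded_biochemical_only + excluded_non_english
)
included_studies = relevant_abstracts - excluded_after_review
```

The three intermediate totals match the 325 relevant abstracts, 133 exclusions and 192 included studies reported in the text.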
Figure 4.2 Flow diagram for the identification of eligible studies.
A total of 144 NRS and 48 RCTs met the inclusion criteria (Appendix E). Included NRS are
described in Table 4.5. Table 4.6 describes included RCTs. Once multiple articles of the
same cohort were combined, 141 NRS (1,179,792 patients) and 26 RCT (14,843 patients)
data groups remained. The data groups were published across 50 journals between 1993
and 2010. The number of authors varied from 1 to 14 (median, 5), but few studies had ≥1
author affiliated with a department of biostatistics, public health, health policy or
epidemiology (16%). Studies took place in 22 countries; the most frequently represented
nations were the United States (30%), Italy (10%) and the United Kingdom (10%). The majority
of studies were conducted in academic settings (81%). The median number of study subjects
for NRS was 121 (IQR 54-262, range 14-643,700) and RCTs enrolled a median of 116
patients (IQR 60-326, range 29-1082).
Table 4.4 Characteristics of included studies.

| Characteristic | NRS (N=141) | RCTs (N=26) | p-value• |
| --- | --- | --- | --- |
| Year of publication* | 2006 (2000-2009) | 2004 (2001-2007) | 0.27† |
| Participants* | 121 (54-262) | 116 (60-326) | 0.89† |
| Authors: number* | 5 (4-7) | 6 (5-8) | 0.03† |
| Authors: consortium among authors§ | 2 (1.4) | 5 (19.2) | <0.001♦ |
| Authors: methodological expertise§ | 16 (0.7) | 9 (34.6) | <0.001◊ |
| Academic setting§ | 106 (75.2) | 23 (88.5) | 0.22♦ |
| Number of pages* | 6 (5-8) | 7 (6-9) | 0.02† |

* Median (interquartile range, IQR).
§ Number (percentage).
† Medians compared using the Mann-Whitney U test.
♦ Frequencies compared using Fisher’s exact test.
◊ Frequencies compared using Chi-square test.
• Statistically significant p values (<0.05) indicated in bold.
Table 4.5 Non-randomized studies meeting inclusion criteria

| First Author | Year | Country | Journal | LAP* | OPEN* |
| --- | --- | --- | --- | --- | --- |
| Senagore, A | 1993 | United States | American Surgeon | 38 | 102 |
| Tate, J | 1993 | Hong Kong | British Journal of Surgery | 11 | 14 |
| Falk, PM | 1993 | United States | Diseases of the Colon & Rectum | 54 | 42 |
| Peters, W | 1993 | United States | Diseases of the Colon & Rectum | 24 | 33 |
| Gray, D | 1994 | United States | Journal of Surgical Oncology | 22 | 35 |
| Van Ye, T | 1994 | United States | Surgical Laparoscopy & Endoscopy | 14 | 20 |
| Musser, D | 1994 | United States | Surgical Laparoscopy & Endoscopy | 24 | 24 |
| Hoffman, G | 1994 | United States | Annals of Surgery | 80 | 53 |
| Franklin, M | 1995 | United States | Surgical Endoscopy | 84 | 84 |
| Ramos, J | 1995 | United States | Diseases of the Colon & Rectum | 95 | 105 |
| Saba, A | 1995 | United States | Annals of Surgery | 25 | 25 |
| Ou, H | 1995 | United States | Diseases of the Colon & Rectum | 12 | 12 |
| Konishi, F | 1996 | Japan | Japanese Journal of Surgery | 20 | 47 |
| Begos, D | 1996 | United States | Surgical Endoscopy | 50 | 34 |
| Franklin, M | 1996 | United States | Diseases of the Colon & Rectum | 191 | 224 |
| Bokey, E | 1996 | Australia | Diseases of the Colon & Rectum | 28 | 33 |
| Fleshman, J | 1996 | United States | Diseases of the Colon & Rectum | 54 | 35 |
| Gellman, L | 1996 | United States | Surgical Endoscopy | 24 | 33 |
| Hotokezaka, M | 1996 | United States | Surgical Endoscopy | 7 | 7 |
| Goh, Y | 1997 | Singapore | Diseases of the Colon & Rectum | 20 | 20 |
| Leung, K | 1997 | Hong Kong | Archives of Surgery | 50 | 50 |
| Khalili, T | 1998 | United States | Diseases of the Colon & Rectum | 80 | 90 |
| Psaila, J | 1998 | United Kingdom | British Journal of Surgery | 25 | 29 |
| Bouvet, M | 1998 | United States | American Journal of Surgery | 91 | 57 |
| Leung, K | 1999 | Hong Kong | Journal of Surgical Oncology | 28 | 56 |
| Santoro, E | 1999 | Italy | Hepato-Gastroenterology | 36 | 36 |
| Stewart, B. T | 1999 | Australia | British Journal of Surgery | 42 | 35 |
| Schwandner, O | 1999 | Germany | International Journal of Colorectal Disease | 32 | 32 |
| Kakisako, K | 2000 | Japan | Surgical Laparoscopy, Endoscopy & Percutaneous Techniques | 20 | 23 |
| Lezoche, E | 2000 | Italy | Hepato-Gastroenterology | 150 | 160 |
| Chen, W | 2000 | China | Formosan Journal of Surgery | 27 | 42 |
| Marubashi, S | 2000 | Japan | Surgery Today | 40 | 28 |
| Stocchi, L | 2000 | United States | Diseases of the Colon & Rectum | 42 | 42 |
| Hartley, J. E | 2000 | United Kingdom | Annals of Surgery | 53 | 41 |
| Chen, H | 2000 | United States | Diseases of the Colon & Rectum | 83 | 83 |
| Nishiguchi, K | 2001 | Japan | Diseases of the Colon & Rectum | 15 | 12 |
| Lezoche, E | 2001 | Italy | Journal of the Laparoendoscopic & Advanced Surgical Techniques | 207 | 153 |
| Hong, D | 2001 | Canada | Diseases of the Colon & Rectum | 98 | 219 |
| Yamamoto, S | 2001 | Japan | Hepato-Gastroenterology | 43 | 43 |
| Mall, J. W | 2001 | Germany | British Journal of Surgery | 32 | 52 |
| Law, WL | 2002 | Hong Kong | Journal of the American College of Surgeons | 65 | 89 |
| Lezoche, E | 2002 | Italy | Surgical Endoscopy | 140 | 107 |
| Braga, M | 2002 | Italy | Surgical Endoscopy | 26 | 26 |
| Vasilev, K | 2002 | Bulgaria | Acta Chirurgica Iugoslavica | 31 | 36 |
| Lujan, H | 2002 | United States | Diseases of the Colon & Rectum | 102 | 233 |
| Feliciotti, F | 2002 | Italy | Surgical Endoscopy | 74 | 75 |
| Feliciotti, F | 2002 | France | Surgical Laparoscopy & Endoscopy | 74 | 83 |
| Kasparek, MS | 2003 | Germany | Journal of Gastrointestinal Surgery | 11 | 9 |
| Kayser, J | 2003 | Luxembourg | Bulletin de la Societe des Sciences Medicales du Grand-Duche de Luxembourg | 76 | 27 |
| Inoue, Y | 2003 | Japan | Surgical Endoscopy | 32 | 30 |
| Ma, H | 2003 | China | Formosan Journal of Surgery | 58 | 31 |
| Lezoche, E | 2003 | Italy | Minerva Chirurgica | 310 | 159 |
| Sklow, B | 2003 | United States | Surgical Endoscopy | 77 | 77 |
| Patankar, S. K | 2003 | United States | Diseases of the Colon & Rectum | 172 | 172 |
| Senagore, A | 2003 | United States | Archives of Surgery | 231 | 245 |
| Delaney, C | 2003 | United States | Annals of Surgery | 150 | 150 |
| Adahci, Y | 2003 | Japan | Hepato-Gastroenterology | 26 | 87 |
| Kojima, M | 2004 | Japan | Surgery Today | 118 | 163 |
| Baker, RP | 2004 | United Kingdom | Diseases of the Colon & Rectum | 33 | 66 |
| Neri, V | 2004 | Italy | Annali Italiani di Chirurgia | 7 | 10 |
| Capussotti, L | 2004 | Italy | Surgical Endoscopy | 74 | 181 |
| Zheng, M | 2005 | China | World Journal of Gastroenterology | 30 | 34 |
| Delaney, C | 2005 | United States | Diseases of the Colon & Rectum | 94 | 94 |
| Vignali, A | 2005 | Italy | Diseases of the Colon & Rectum | 61 | 61 |
| Sahakitrungruang, C | 2005 | Thailand | Journal of the Medical Association of Thailand | 24 | 25 |
| Pokala, N | 2005 | United States | Surgical Endoscopy | 34 | 34 |
| Law, WL | 2006 | Hong Kong | Diseases of the Colon & Rectum | 98 | 167 |
| Lezoche, E | 2006 | Italy | Surgical Endoscopy | 85 | 64 |
| Del Rio | 2006 | Italy | Minerva Chirurgica | 27 | 25 |
| Aboulian, A | 2006 | Italy | Minerva Chirurgica | 147 | 25 |
| Nakamura, T | 2006 | Japan | Hepato-Gastroenterology | 59 | 59 |
| Feng, B | 2006 | China | Aging - Clinical and Experimental Research | 51 | 102 |
| MacKay, G | 2006 | United Kingdom | Colorectal Disease | 22 | 58 |
| Wahl, P | 2006 | Switzerland | ANZ Journal of Surgery | 187 | 215 |
| Sample, CB | 2006 | Canada | Surgical Endoscopy | 21 | 21 |
| Gonzalez, R | 2006 | United States | Diseases of the Colon & Rectum | 238 | 260 |
| Ng, SSM | 2006 | Hong Kong | Surgical Endoscopy | 6 | 12 |
| Salloum, RM | 2006 | United States | Journal of the American College of Surgeons | 14 | 54 |
| Law, WL | 2007 | Hong Kong | Annals of Surgery | 255 | 401 |
| Napolitano, L | 2007 | Italy | Giornale di Chirurgia | 73 | 141 |
| Boni, L | 2007 | Italy | Surgical Oncology | 88 | 75 |
| Choi, Y | 2007 | Korea | Surgery Today | 26 | 41 |
| Osarogiagbon, RU | 2007 | United States | Clinical Colorectal Cancer | 39 | 55 |
| McCloskey, CA | 2007 | United States | Surgery | 23 | 22 |
| Tong, DKH | 2007 | Hong Kong | Journal of the Society of Laparoendoscopic Surgeons | 77 | 105 |
| Salimath, J | 2007 | United States | Journal of the Society of Laparoendoscopic Surgeons | 68 | 179 |
| Noblett, SE | 2007 | United Kingdom | Surgical Endoscopy | 30 | 30 |
| Lordan, JT | 2007 | United Kingdom | Colorectal Disease | 109 | 44 |
| Guo, D | 2007 | Australia | ANZ Journal of Surgery | 50 | 33 |
| Lohsiriwat V | 2007 | Thailand | World Journal of Surgery | 13 | 21 |
| Hinojosa, MW | 2007 | United States | Journal of Gastrointestinal Surgery | 190 | 3185 |
| Park, JS | 2007 | Korea | World Journal of Surgery | 116 | 81 |
| Law, WL | 2008 | Hong Kong | Annals of Surgical Oncology | 77 | 123 |
| Mirza, MS | 2008 | United Kingdom | Journal of Laparoendoscopic Surgery | 116 | 117 |
| Kemp, JA | 2008 | United States | Surgical Innovation | 27930 | 615722 |
| Andersen, LPH | 2008 | Denmark | Surgical Endoscopy | 58 | 143 |
| Bilimoria, KY | 2008 | United States | Journal of Gastrointestinal Surgery | 837 | 2222 |
| Cermak, K | 2008 | Belgium | Hepato-Gastroenterology | 45 | 120 |
| Varela, JE | 2008 | United States | American Surgeon | 3353 | 47090 |
| Seitz, G | 2008 | Germany | Surgical Laparoscopy, Endoscopy & Percutaneous Techniques | 39 | 38 |
| Steele, SR | 2008 | United States | Journal of Gastrointestinal Surgery | 3296 | 95627 |
| Imai, E | 2008 | Japan | American Journal of Infection Control | 231 | 75 |
| Delaney, C | 2008 | United States | Annals of Surgery | 11044 | 21689 |
| Buchanan, GN | 2008 | United Kingdom | British Journal of Surgery | 230 | 135 |
| Ihedioha, U | 2008 | United Kingdom | Surgical Endoscopy | 32 | 61 |
| Bilimoria, KY | 2008 | United States | Archives of Surgery | 11038 | 231381 |
| Nakamura, T | 2008 | Japan | World Journal of Surgery | 101 | 43 |
| Gameiro, M | 2008 | Germany | Surgical Innovation | 45 | 25 |
| Kim, H | 2009 | Korea | Surgical Endoscopy | 37 | 50 |
| Zmora, O | 2009 | Israel | Surgical Endoscopy | 227 | 103 |
| Chikkappa, M | 2009 | United Kingdom | International Journal of Colorectal Disease | 57 | 49 |
| Faiz, O | 2009 | United Kingdom | Colorectal Disease | 191 | 50 |
| Yin, W | 2009 | China | Hepato-Gastroenterology | 32 | 30 |
| Wilks, J | 2009 | United States | American Journal of Surgery | 60 | 60 |
| Tan, W | 2009 | Singapore | International Journal of Colorectal Disease | 37 | 40 |
| Scarpa, M | 2009 | Italy | Surgical Endoscopy | 21 | 21 |
| Ptok, H | 2009 | Germany | European Journal of Surgical Oncology | 346 | 8307 |
| Poon, J | 2009 | Hong Kong | Annals of Surgery | 296 | 715 |
| Faiz, O | 2009 | United Kingdom | Diseases of the Colon & Rectum | 1095 | 60851 |
| Shabbir, A | 2009 | Singapore | ANZ Journal of Surgery | 32 | 32 |
| Kennedy, GD | 2009 | United States | Annals of Surgery | 2869 | 4800 |
| Park, J | 2009 | Korea | Surgical Laparoscopy, Endoscopy & Percutaneous Techniques | 119 | 145 |
| Tei, M | 2009 | Japan | Surgical Laparoscopy, Endoscopy & Percutaneous Techniques | 78 | 51 |
| Nakamura, T | 2009 | Japan | Surgery Today | 100 | 100 |
| Lin, JH | 2009 | United States | Surgical Innovation | 99 | 70 |
| Kiran, R | 2010 | United States | Archives of Surgery | 143 | 143 |
| Marshall, C | 2010 | United States | American Journal of Surgery | 33 | 17 |
| Maeda, T | 2010 | Japan | Surgical Endoscopy | 32 | 43 |
| Abdel-Halim, M | 2010 | United Kingdom | Annals of the Royal College of Surgeons of England | 22 | 34 |
| Akiyoshi, T | 2010 | Japan | Journal of Gastrointestinal Surgery | 253 | 39 |
| Balentine, C | 2010 | United States | Journal of Surgical Research | 42 | 113 |
| da Luz Moreira, A | 2010 | United States | Surgical Endoscopy | 231 | 231 |
| El-Gazzaz, G | 2010 | United States | Surgical Endoscopy | 243 | 486 |
| Fujii, S | 2010 | Japan | International Journal of Colorectal Disease | 258 | 258 |
| Han, K | 2010 | Korea | Journal of the Korean Society of Coloproctology | 35 | 55 |
| Hemandas, A | 2010 | United Kingdom | Annals of Surgery | 224 | 200 |
| Jiang, J | 2010 | China | International Journal of Colorectal Disease | 20 | 19 |
| Kiran, R | 2010 | United States | Journal of the American College of Surgeons | 3414 | 7565 |
| Kurian, A | 2010 | United States | Journal of Surgical Education | 150 | 95 |
| Lian, L | 2010 | United States | Surgical Endoscopy | 97 | 97 |
| Lloyd, G | 2010 | Multinational | Surgical Endoscopy | 97 | 97 |
| Madbouly, K | 2010 | Egypt | British Journal of Surgery | 20 | 10 |
| El-Gazzaz, G | 2010 | United States | Surgical Endoscopy | 1516 | 3528 |
| Morris, E | 2011 | United Kingdom | British Journal of Surgery | 238 | 470 |

* Number of study subjects with colon cancer in each arm of the study
Table 4.6 Randomized controlled trials meeting inclusion criteria

| First Author | Year | Country | Journal | LAP* | OPEN* |
| --- | --- | --- | --- | --- | --- |
| Lacy, A | 1995 | Spain | Surgical Endoscopy | 25 | 26 |
| Ortiz, H | 1996 | Spain | International Journal of Colorectal Disease | 20 | 20 |
| Milsom, J | 1997 | United States | Journal of Surgical Research | 55 | 54 |
| Stage, J | 1997 | Denmark | British Journal of Surgery | 15 | 14 |
| Lacy, A | 1998 | Spain | Surgical Endoscopy | 31 | 40 |
| Schwenck | 1998 | Germany | Surgical Endoscopy | 30 | 30 |
| Schwenk, W | 1998 | Germany | Langenbecks Archives of Surgery | 30 | 30 |
| Schwenk, W | 1999 | Germany | Archives of Surgery | 30 | 30 |
| Delgado, S | 2000 | Spain | Surgical Endoscopy | 129 | 126 |
| Curet, M | 2000 | United States | Surgical Endoscopy | 25 | 18 |
| Lacy, A | 2002 | Spain | Lancet | 111 | 108 |
| Braga, M | 2002 | Italy | Annals of Surgery | 136 | 133 |
| Weeks, J | 2002 | United States | Journal of the American Medical Association | 228 | 221 |
| Winslow, E | 2002 | United States | Surgical Endoscopy | 37 | 46 |
| Hasegawa, H | 2003 | Japan | Surgical Endoscopy | 24 | 26 |
| Basse, L | 2003 | Denmark | Surgical Endoscopy | 16 | 16 |
| Janson, M | 2004 | Sweden | British Journal of Surgery | 98 | 112 |
| Kang, J | 2004 | China | Surgical Endoscopy | 30 | 30 |
| Kaiser, A | 2004 | United States | Journal of the Laparoendoscopic & Advanced Surgical Techniques | 28 | 20 |
| Leung, K | 2004 | Hong Kong | Lancet | 203 | 200 |
| Vignali, A | 2004 | Italy | Diseases of the Colon & Rectum | 190 | 194 |
| Nelson, H | 2004 | Multinational | New England Journal of Medicine | 435 | 428 |
| Braga, M | 2005 | Italy | Diseases of the Colon & Rectum | 190 | 201 |
| Guillou, P | 2005 | Multinational | Lancet | 140 | 273 |
| Basse, L | 2005 | Denmark | Annals of Surgery | 30 | 30 |
| Veldkamp, R | 2005 | Multinational | Lancet | 536 | 546 |
| Braga, M | 2005 | Italy | Annals of Surgery | 258 | 259 |
| King, P | 2006 | United Kingdom | British Journal of Surgery | 41 | 19 |
| Franks, P | 2006 | United Kingdom | British Journal of Cancer | 452 | 230 |
| Liang, J | 2007 | China | Annals of Surgical Oncology | 135 | 134 |
| Jayne, D | 2007 | United Kingdom | Surgical Innovation | 526 | 268 |
| Braga, M | 2007 | Italy | Annals of Surgery | 113 | 113 |
| Chung, C | 2007 | Hong Kong | Annals of Surgery | 41 | 40 |
| Janson, M | 2007 | Sweden | Surgical Endoscopy | 130 | 155 |
| Fleshman, J | 2007 | Multinational | Annals of Surgery | 435 | 428 |
| King, P | 2008 | United Kingdom | International Journal of Colorectal Disease | 41 | 19 |
| Lacy, A | 2008 | Spain | Annals of Surgery | 106 | 102 |
| Frasson, M | 2008 | Italy | Diseases of the Colon & Rectum | 268 | 267 |
| Hewett, P | 2008 | Multinational | Annals of Surgery | 294 | 298 |
| Gonzalez, I | 2008 | Spain | International Journal of Colorectal Disease | 59 | 57 |
| COLOR Study Group | 2009 | Multinational | Lancet | 534 | 542 |
| Ng, S | 2009 | Hong Kong | Diseases of the Colon & Rectum | 76 | 77 |
| Neudecker, J | 2009 | Germany | British Journal of Surgery | 250 | 222 |
| Pascual, M | 2010 | Spain | British Journal of Surgery | 60 | 60 |
| Taylor, G | 2010 | United Kingdom | Formosan Journal of Surgery | 280 | 131 |
| Allardyce, R | 2010 | Multinational | British Journal of Cancer | 294 | 298 |
| Braga, M | 2010 | Italy | British Journal of Surgery | 134 | 134 |
| Jayne, D | 2010 | United Kingdom | British Journal of Surgery | 212 | 549 |

* Number of study subjects with colon cancer in each arm of the study
4.10.2 Strong RCTs
4.10.2.1 Post-operative complications
Twenty RCTs reported post-operative complications. Most of these trials were at low risk of
bias for the items incomplete outcome data and other bias. However, only 75% of RCTs
(n=15) were at low risk of bias for random sequence generation and fewer were at low risk of
bias for allocation concealment (n=13, 65%). A minority of trials were at low risk of bias for
selective outcome reporting (n=4, 20%).
Table 4.7 Summary of risk of bias item responses for RCTs reporting post-operative complications.

| Risk of bias item | Low n (%) | Unclear n (%) | High n (%) |
| --- | --- | --- | --- |
| Randomization sequence generation | 15 (75) | 5 (25) | 0 (0) |
| Allocation concealment | 13 (65) | 10 (50) | 0 (0) |
| Blinding of participants and personnel | 1 (5) | 0 (0) | 19 (95) |
| Blinding of outcome assessment | 1 (5) | 0 (0) | 19 (95) |
| Incomplete outcome data | 19 (95) | 0 (0) | 1 (5) |
| Selective outcome reporting | 4 (20) | 16 (80) | 0 (0) |
| Other bias | 19 (95) | 0 (0) | 1 (5) |

Individual item assessments were used to classify RCTs as either Typical (i.e. unclear or high
risk of bias) or Strong (i.e. low risk of bias) according to the guidance in Table 4.3. Four
trials were categorized as Strong RCTs (Guillou 2005, Hewett 2008, Nelson 2004, Veldkamp
2005). These four studies were at low risk of bias for five of the seven bias items
(randomization sequence generation, allocation concealment, incomplete outcome data,
selective reporting and “other” bias). The remaining 16 RCTs were at an unclear risk of bias
for selective outcome reporting as none had published protocols. A number of these studies
were also at unclear risk of bias for random sequence generation and allocation concealment.
Accordingly, these 16 studies were classified as Typical RCTs.
4.10.2.2 Peri-operative mortality
Seventeen RCTs reported 30-day peri-operative mortality. A third of these trials had unclear
random sequence generation and unclear allocation concealment (Table 4.8). All studies
were at low risk of bias for blinding of outcome assessment since mortality is considered an
objective outcome by the Cochrane Collaboration. A minority of trials were at low risk of
bias for selective outcome reporting (n=4, 24%).
Table 4.8 Summary of risk of bias item responses for RCTs reporting peri-operative mortality

| Risk of bias item | Low n (%) | Unclear n (%) | High n (%) |
| --- | --- | --- | --- |
| Randomization sequence generation | 11 (65) | 6 (35) | 0 (0) |
| Allocation concealment | 11 (65) | 6 (35) | 0 (0) |
| Blinding of participants and personnel | 1 (6) | 16 (94) | 0 (0) |
| Blinding of outcome assessment | 16 (94) | 0 (0) | 1 (6) |
| Incomplete outcome data | 17 (100) | 0 (0) | 0 (0) |
| Selective outcome reporting | 4 (24) | 13 (76) | 0 (0) |
| Other bias | 16 (94) | 0 (0) | 1 (6) |

Four trials were identified as Strong RCTs (Guillou 2005, Hewett 2008, Nelson 2004,
Veldkamp 2005). These four studies were rated at low risk of bias for all seven bias items
of the Cochrane Risk of Bias Tool. The remaining 13 trials were classified as Typical RCTs
because of deficits with regard to randomization, allocation concealment, and the possibility
of selective outcome reporting.
4.10.2.3 Length of stay
A total of twenty-two RCTs reported length of stay. Approximately one-fifth of these studies
were at unclear risk of bias for random sequence generation and allocation concealment.
Only one study employed blinding and the vast majority of trials were at unclear risk of bias
for blinding of participants/personnel and blinding of outcome assessment. Four RCTs were
at low risk of bias for selective outcome reporting since these trials were registered studies
with published protocols.
Table 4.9 Summary of risk of bias item responses for RCTs reporting length of stay

| Risk of bias item | Low n (%) | Unclear n (%) | High n (%) |
| --- | --- | --- | --- |
| Randomization sequence generation | 14 (64) | 7 (22) | 0 (0) |
| Allocation concealment | 14 (64) | 7 (22) | 1 (5) |
| Blinding of participants and personnel | 1 (5) | 21 (95) | 0 (0) |
| Blinding of outcome assessment | 1 (5) | 20 (90) | 1 (5) |
| Incomplete outcome data | 21 (95) | 0 (0) | 1 (5) |
| Selective outcome reporting | 4 (18) | 18 (82) | 0 (0) |
| Other bias | 22 (100) | 0 (0) | 0 (0) |

While 22 RCTs reported length of stay, only four were identified as Strong RCTs. These four
studies were at low risk of bias for five of the seven bias items (randomization sequence
generation, allocation concealment, incomplete outcome data, selective reporting and “other”
bias). These trials were the same four studies identified as Strong for the previous two
outcomes.
4.10.2.4 Number of lymph nodes harvested
A total of seventeen RCTs reported the number of lymph nodes found within the surgical
specimen. A notable proportion of studies were at an unclear risk of bias with regards to
random sequence generation and allocation concealment (n=6, 35%). Blinding was a rarity in
these trials. Four studies had published protocols and were thus at low risk of bias for
selective outcome reporting.
Table 4.10 Summary of risk of bias item responses for RCTs reporting number of lymph nodes harvested
Risk of Bias                             Low n (%)    Unclear n (%)   High n (%)
Randomization sequence generation        11 (65)      6 (35)          0 (0)
Allocation concealment                   11 (65)      6 (35)          0 (0)
Blinding of participants and personnel   0 (0)        0 (0)           17 (100)
Blinding of outcome assessment           1 (6)        16 (94)         0 (0)
Incomplete outcome data                  16 (94)      0 (0)           1 (6)
Selective outcome reporting              4 (24)       13 (76)         0 (0)
Other bias                               16 (94)      0 (0)           1 (6)
Of these trials, four were identified as Strong RCTs. These four studies were at low risk of
bias for five of seven bias domains (randomization sequence generation, allocation
concealment, incomplete outcome data, selective reporting and “other” bias). Again, these
four trials were the same four identified as least biased for the previous three outcomes, post-
operative complications, peri-operative mortality, and length of stay.
4.11 Risk of bias assessment summary
Four studies were consistently identified as Strong RCTs (Nelson 2004, Guillou 2005,
Veldkamp 2005, Hewett 2008) across the four outcomes of interest. The remaining trials, the
Typical RCTs, were often at unclear risk of bias for randomization sequence generation and
allocation concealment. All of the Typical RCTs were at unclear risk of bias for selective
outcome reporting; the absence of published protocols among these trials precluded the
assessment of this item. This finding is not unexpected since guidance within the Cochrane
Handbook suggests that most studies are expected to be rated at an unclear risk for this
domain precisely for this reason. It was truly the presence of published protocols that set the
Strong RCTs apart from the rest of the Typical RCTs. Additionally, the Strong RCTs were
publicly funded, multi-center trials that had sample sizes of over 400 patients. Trials with
these attributes have been shown to be less susceptible to bias (Als-Nielsen et al. 2003;
Dechartres et al. 2011; Bafeta et al. 2012).
Chapter 5 Comparing effect estimates from
non-randomized studies and randomized controlled trials
5.1 Summary
Background
Multiple studies suggest that effect estimates from NRS are comparable to those from RCTs.
However, it has also been shown that biased effect estimates arise in RCTs in the absence of
certain study attributes. Comparisons of NRS and RCTs to date have likely compared NRS
with a heterogeneous group of RCTs.
Objectives
To compare the results of NRS with those of RCTs at low risk of bias. Studies comparing
laparoscopy and conventional (open) surgical treatment of colon cancer were used for this
case study.
Methods
All studies comparing laparoscopy with conventional surgery for the management of colon
cancer were identified. Random-effects meta-analysis was separately performed for two
subjective outcomes (post-operative complications and length of stay [LOS]) and two
objective outcomes (mortality and number of lymph nodes harvested). Meta-analysis was
performed for i) All Studies, ii) NRS, iii) RCTs, iv) Typical RCTs and v) Strong RCTs. The
Cochrane Risk of Bias Tool was used to classify studies as “Strong” (low risk of bias) or
“Typical” (unclear or high risk of bias). Meta-regression was conducted with study design as
a predictor variable. Bayesian meta-regression sensitivity analyses assessed the impact of
period effects and between-study case-mix (i.e. baseline event rate) in addition to study
design.
Results
A total of 144 studies reported the outcomes of interest (NRS=121, RCT=23). For post-
operative complications, the odds ratios from NRS were 36% smaller (i.e. demonstrating
more benefit) than those from Strong RCTs (ROR 0.64, [0.42, 0.97], p=0.04). The same
exaggerated benefit among NRS was seen when assessing LOS (Difference in Mean
Differences [DMD] -2.15, [-4.08, -0.21], p=0.03). This pattern was not observed for the objective
outcomes (mortality, ROR 0.74, [0.38, 1.44], p=0.38, and number of LN harvested, DMD
0.49, [-1.43, 2.42], p=0.62). For both subjective outcomes, Typical RCTs also had more
extreme estimates of benefit as compared with Strong RCTs (post-operative complications,
ROR 0.63, [0.42,0.96], p=0.03 and LOS, DMD -1.40, [-2.76, -0.04], p=0.04). Multivariable
meta-regression results, adjusted for period effects and case-mix between studies, were
similar to the unadjusted meta-regression analyses.
Conclusions
When evaluating subjective outcomes, effect estimates from NRS were associated with
larger estimates of benefit for laparoscopy than Strong RCTs. Typical RCTs also had more
extreme estimates of benefit for laparoscopy as compared with Strong RCTs. Similar trends
were not observed among objective outcomes (mortality and number of lymph nodes
harvested).
5.2 Introduction
Randomized controlled trials (RCTs) are considered the gold standard for assessing the
efficacy of therapeutic interventions. Accordingly, systematic reviews and meta-analyses of
RCTs are placed at the top of the evidence hierarchy. In the absence of RCTs, meta-analyses
of non-randomized studies (NRS) may be conducted and this practice is becoming
increasingly commonplace. However, while some studies suggest that effect estimates from
NRS are comparable to those from RCTs (Concato, Shah, and Horwitz 2000; Benson and
Hartz 2000), others have found important differences (Britton et al. 1998; Shikata et al. 2006;
Kunz, Vist, and Oxman 2007). These comparisons have often included studies performed
over multiple decades, with prominent differences between patients and clinical settings. It
remains unclear how the conclusions of these studies may have been influenced by period
effects and clinical heterogeneity.
Meta-epidemiological studies have shown that RCTs without appropriate random-sequence
generation, allocation concealment and double-blinding yield biased estimates of treatment
effect (Schulz et al. 1995; Moher et al. 1998; Kjaergard, Villumsen, and Gluud 2001; Pildal
et al. 2007; Wood et al. 2008; Nuesch, Reichenbach, et al. 2009; Hrobjartsson et al. 2012;
Savovic et al. 2012; Hrobjartsson et al. 2013). Some studies have also suggested that
objective outcomes, such as mortality, are not influenced by the presence or absence of these
study characteristics (Wood et al. 2008; Savovic et al. 2012) whereas subjective outcomes,
such as pain or complications, may instead be more susceptible to bias. Previous
comparisons of NRS and RCTs have not distinguished between RCTs at high or low risk of
bias. The agreement between RCTs at low risk of bias and NRS remains unknown.
Our primary objective was to compare effect estimates from RCTs at low risk of bias with
those from NRS, across objective and subjective outcomes. Our secondary aim was to
evaluate how comparisons were influenced by period effects and differences in baseline
event rate in the control groups — a measure of underlying risk in enrolled patients.
5.3 Methods
We focused our case study of bias on studies evaluating laparoscopy and conventional
(i.e. open) surgery for colon cancer. These two surgical techniques have been directly
compared via numerous NRS and RCTs. A systematic review was undertaken to identify all
comparative studies (Chapter 4, Section 4.2). Only those studies providing sufficient
information to generate a summary effect estimate for the outcomes of interest (post-
operative complications, peri-operative mortality, length of stay and number of lymph nodes
harvested) were used for the analyses that follow. Post-operative complications and length of
stay were categorized as subjective outcomes whereas peri-operative mortality and number
of lymph nodes harvested were considered objective outcomes (Chapter 4, Section 4.5.1).
These studies were grouped into NRS, Typical RCTs or Strong RCTs according to the
methods outlined in Chapter 4 (Sections 4.4 and 4.8).
5.3.1 Statistical analyses
5.3.1.1 Descriptive statistics
Descriptive statistics were calculated to compare NRS and RCTs in terms of year of
publication, number of participants, academic versus community setting, presence of a
consortium among authors, number of named authors, methodological expertise among
authors, length of articles and baseline event rate (or mean) in control groups. Absolute and
relative frequencies were measured for discrete variables and where appropriate, medians
and IQRs were calculated for continuous variables with a non-normal distribution. Medians
were compared using the Mann-Whitney U test and categorical variables using the Chi-
square or Fisher’s exact test, as appropriate (Pagano and Gauvreau 2000). A p-value <0.05
was considered significant. All data were analyzed using R, version 2.15.0 (R Foundation for
Statistical Computing, Vienna, Austria).
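As an illustration of the categorical comparison described above, the following is a minimal sketch of a two-sided Fisher's exact test written in plain Python. It is not the R code used for the thesis; the function name `fisher_exact_two_sided` is hypothetical, and the counts are taken from the "Consortium among authors" row of Table 5.1 (1 of 121 NRS versus 5 of 23 RCTs).

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2
    denom = comb(total, col1)

    def prob(k):
        # Probability that k of the col1 "positive" studies fall in row 1.
        return comb(row1, k) * comb(row2, col1 - k) / denom

    k_min = max(0, col1 - row2)
    k_max = min(row1, col1)
    p_obs = prob(a)
    # A small relative tolerance guards against floating-point ties.
    return sum(prob(k) for k in range(k_min, k_max + 1)
               if prob(k) <= p_obs * (1 + 1e-7))


# Consortium among authors (Table 5.1): 1 of 121 NRS versus 5 of 23 RCTs.
p_consortium = fisher_exact_two_sided(1, 120, 5, 18)
print(f"Fisher's exact p = {p_consortium:.5f}")
```

Under these assumptions the resulting p-value falls below 0.001, in line with the entry reported for this row in Table 5.1.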
5.3.1.2 Meta-analysis
5.3.1.2.1 Justification for model selection
One of the aims of a meta-analysis is to produce an overall or combined effect estimate.
Either a fixed-effect or random-effects model may be used. Whereas the fixed-effect model
assumes there is one underlying effect shared by all studies, the random-effects model assumes
studies are estimating different underlying effects (Figure 5.1) (Altman, Egger, and Smith
2001).
Figure 5.1 Relationship between observed data, true study effects and the common treatment effect in fixed and random-effects meta-analysis. $\sigma_k^2$ = observed standard error. $\tau_b^2$ = between-study variance in common (true treatment effect).
Sampling error is assumed to be the sole source of variation when estimating the combined
effect in a fixed-effect model. Thus, the observed variation between individual treatment
effects is attributed solely to chance. In contrast, the true study effect could vary from study
to study in a random-effects model due to factors related to the patient population,
intervention delivery, and study methodology (i.e. clinical and methodological
heterogeneity). The observed study effects in random-effects meta-analysis are each
considered to have been sampled from a distribution of possible true effects. The mean of
this distribution is the combined effect in a random-effects model. Therefore, there are two
levels of sampling leading to two sources of variation in random-effects models: individual
patients in studies are sampled from the population of possible study subjects (sampling
error) and studies are each drawn from the distribution of all possible studies (Viechtbauer
2010).
The random-effects modelling approach was chosen for the analyses that follow for two
reasons. First, fixed-effects models generate confidence intervals that are too narrow by
failing to incorporate between-study heterogeneity when it exists (Altman, Egger, and Smith
2001). Second, random-effects modelling is considered standard within the meta-analysis
community (Borenstein 2009).
5.3.1.2.2 Random-effects meta-analysis
When $Y_i$ is the estimate of the effect size in a given study and $\theta_i$ is the true effect in that
study, then $Y_i$ is expressed as:
$Y_i = \theta_i + \varepsilon_i$ (5.1)
$\varepsilon_i$ is the sampling error with which $Y_i$ estimates $\theta_i$ (Sutton et al. 1998). This equation can be
further expanded by replacing $\theta_i$:
$Y_i = \mu + \zeta_i + \varepsilon_i$ (5.2)
where
$\mu$ is the true effect (mean of the distribution of possible effects);
$\zeta_i$ is the difference between $\theta_i$ and $\mu$ (Figure 5.2) and represents systematic error or heterogeneity;
$\varepsilon_i$ represents random error (sampling error).
Figure 5.2 Relationship between the overall true effect (µ), the true effect in a given study (θ) and the observed effect (Yi).
The general equation for estimating the combined effect size from $k$ studies is presented
below where $w_i$ is the weight of an individual study:
$\bar{Y} = \frac{\sum_{i=1}^{k} w_i Y_i}{\sum_{i=1}^{k} w_i}$ (5.3)
Fixed-effect and random-effects models differ computationally in how weights are
calculated. For fixed effects, the following equation is used:
$w_i = \frac{1}{v_i}$ (5.4)
where $v_i$ represents within-study variance. Weights in fixed-effect models are often simply
equal to the inverse of within-study variance. Therefore, large studies that have small
variance (i.e. more precision) are weighted more heavily than smaller studies. In random-
effects models, between-study variation ($\hat{\tau}^2$) is also incorporated:
$w_i = \frac{1}{v_i + \hat{\tau}^2}$ (5.5)
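In random-effects models, the weights are not as dispersed as with fixed-effects modelling
and large studies have less influence on the overall estimate ($\bar{Y}$). Between-study variation
($\hat{\tau}^2$) is calculated using Cochran’s Q statistic and the degrees of freedom ($df = k - 1$),
where the weights $w_i$ and the combined estimate $\bar{Y}$ in equations 5.6 and 5.7 are those of the
fixed-effect model (equation 5.4):
$Q = \sum_{i=1}^{k} w_i (Y_i - \bar{Y})^2$ (5.6)
$\hat{\tau}^2 = \max\left[0, \frac{Q - (k - 1)}{\sum_{i=1}^{k} w_i - \frac{\sum_{i=1}^{k} w_i^2}{\sum_{i=1}^{k} w_i}}\right]$ (5.7)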
Random-effects meta-analysis was separately performed for i) all studies, ii) NRS,
iii) RCTs, iv) Typical RCTs and v) Strong RCTs for each of the outcomes of interest
according to the methods described by DerSimonian and Laird (DerSimonian and Laird
1986). Heterogeneity between studies was assessed by calculating $I^2$ for each group of studies:
$I^2 = 100\% \times \frac{Q - df}{Q}$ (5.8)
$I^2$ is a quantity that describes the percentage of total variation across studies that is due to
heterogeneity instead of chance (Fletcher 2007). The Cochrane Handbook for Systematic
Reviews recommends using the following approach for interpreting $I^2$ values: 0-40%,
heterogeneity “may not be important”; 30-60%, heterogeneity may be moderate; 50-90%,
heterogeneity may be substantial; 75-100%, heterogeneity is “considerable” (Higgins, Green,
and Cochrane Collaboration. 2011). Publication bias was assessed using visual inspection of
funnel plots (Sterne et al. 2011). Axes for plots were chosen according to the principles
outlined by Sterne and Egger (Sterne and Egger 2001). All analyses were performed using R,
version 2.15.0 (R Foundation for Statistical Computing, Vienna, Austria).
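The calculations in equations 5.3 to 5.8 can be sketched in a few lines of Python. This is an illustrative outline of the DerSimonian and Laird approach, not the R code used for the analyses; the function name `dersimonian_laird` and the five input studies are hypothetical.

```python
import math


def dersimonian_laird(effects, variances):
    """Pool study effects with the DerSimonian-Laird random-effects method.

    effects   -- study estimates Y_i (e.g. log odds ratios or mean differences)
    variances -- within-study variances v_i
    """
    k = len(effects)
    # Fixed-effect (inverse-variance) weights, equation 5.4.
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    y_fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w

    # Cochran's Q (equation 5.6) and the moment estimate of tau^2 (equation 5.7).
    q = sum(wi * (yi - y_fixed) ** 2 for wi, yi in zip(w, effects))
    df = k - 1
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c)

    # Random-effects weights (equation 5.5) and pooled estimate (equation 5.3).
    w_re = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))

    # I^2 (equation 5.8), truncated at zero when Q < df.
    i2 = max(0.0, 100.0 * (q - df) / q) if q > 0 else 0.0
    return {"pooled": pooled, "se": se, "tau2": tau2, "q": q, "i2": i2}


# Hypothetical log odds ratios and within-study variances for five studies.
log_or = [-0.60, -0.35, -0.10, 0.05, -0.80]
var = [0.04, 0.09, 0.02, 0.03, 0.12]
res = dersimonian_laird(log_or, var)
lo = math.exp(res["pooled"] - 1.96 * res["se"])
hi = math.exp(res["pooled"] + 1.96 * res["se"])
print(f"OR = {math.exp(res['pooled']):.2f} (95% CI {lo:.2f} to {hi:.2f}), "
      f"tau^2 = {res['tau2']:.3f}, I^2 = {res['i2']:.1f}%")
```

For binary outcomes the inputs would be log odds ratios, so the pooled value is exponentiated before reporting; for continuous outcomes the pooled mean difference is reported directly.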
5.3.1.3 Meta-regression
Meta-regression was used to compare effect estimates across groups of studies. Meta-
regression can be either a linear or a logistic regression model. The unit of analysis in these
models is the study. Predictors in the model are also study-level covariates (e.g. study
design) (Morton et al. 2004).
$Y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \dots + \beta_p x_{ip} + \zeta_i + \varepsilon_i$ (5.10)
where
$Y_i$ represents the effect estimate in a particular study;
$\beta_0$ is the intercept (average true effect when the value of all predictor variables is equal to zero);
$\beta_p$ denotes how the average true effect changes for a one-unit increase in $x_{ip}$;
$p$ represents the number of covariates in the model;
$\zeta_i$ represents remaining heterogeneity between studies (not explained by covariates in the model);
$\varepsilon_i \sim N(0, v_i)$ represents random sampling error.
Thus, the meta-regression approach uses regression analysis to determine the influence of
independent (predictor) variables on the effect size (dependent variable) in a study (Sterne et
al. 2002; Higgins and Thompson 2004). Meta-regression can be considered an extension of
meta-analysis in which part of the between-study heterogeneity is explained by study-level
covariates; when there are no predictor variables in equation 5.10, it reduces to the general
equation 5.2 for $Y_i$ (the effect size in a given study).
Logistic regression models were developed for binary outcomes (post-operative
complications and mortality) and linear models for continuous outcomes (length of stay and
number of lymph nodes harvested). The coefficients in binary models were exponentiated to
generate ratios of odds ratios (ROR):
$\text{ROR} = e^{\beta}$ (5.11)
$\text{ROR} = \frac{\text{combined OR}_{x = \text{study design A}}}{\text{combined OR}_{x = \text{study design B}}}$ (5.12)
An OR less than one indicates that laparoscopy is more beneficial than conventional (open)
surgery. An OR closer to zero denotes more benefit for laparoscopy. A ROR less than one
indicates that the combined OR in the numerator of equation 5.12 was smaller than the
combined OR in the denominator. For example, consider a comparison of NRS with RCTs
for post-operative complications. If the aggregate effect estimate for NRS was 0.75 and 1.25
for RCTs, then the meta-regression results comparing these study designs would generate an
ROR roughly equal to 0.75/1.25 or 0.60. This would imply that NRS estimates showed 40%
more benefit (i.e. smaller OR) than RCTs.
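The link between equations 5.11 and 5.12 can be made explicit for the simplest case of a single binary covariate. As a sketch, assume that $Y_i$ is the log odds ratio and that $x_i = 1$ for studies of design A and $x_i = 0$ for studies of design B (this coding is an assumption made here for illustration):

```latex
\begin{align*}
E[Y_i \mid x_i = 0] &= \beta_0 = \log(\text{combined OR}_{B}) \\
E[Y_i \mid x_i = 1] &= \beta_0 + \beta_1 = \log(\text{combined OR}_{A}) \\
\beta_1 &= \log(\text{combined OR}_{A}) - \log(\text{combined OR}_{B}) \\
e^{\beta_1} &= \frac{\text{combined OR}_{A}}{\text{combined OR}_{B}} = \text{ROR}
\end{align*}
```

With the values from the example above, $e^{\beta_1} = 0.75 / 1.25 = 0.60$.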
Meta-regressions for continuous outcomes (i.e. length of stay and number of lymph nodes) instead
yield differences in mean differences (DMD):
$\text{DMD} = \text{combined mean difference}_{\text{study design A}} - \text{combined mean difference}_{\text{study design B}}$ (5.13)
For example, consider a comparison of NRS and RCTs for length of stay. At the study-level,
outcomes are expressed as a mean difference (MD):
$\text{MD} = \text{mean}_{\text{laparoscopy}} - \text{mean}_{\text{open surgery}}$ (5.14)
Suppose a meta-analysis of NRS yields an aggregate mean difference (MD) of -1.50, indicating that
the length of stay for laparoscopy was 1.5 days shorter than the length of stay for patients
undergoing open surgery, and that the MD for RCTs is -0.25. The result of a meta-regression
comparing NRS with RCTs would thus be equal to -1.50 - (-0.25) or -1.25. This DMD
indicates that the reduction in length of stay with laparoscopy relative to open surgery was
1.25 days larger on average in NRS than in RCTs. Note, however, that a negative MD for length of
stay favors laparoscopy, whereas a negative MD for number of lymph nodes harvested favors open surgery.
All meta-regression analyses were performed according to the methods outlined by
Thompson and Higgins (Thompson and Higgins 2002) using R, version 2.15.0 (R
Foundation for Statistical Computing, Vienna, Austria).
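A meta-regression with study design as the only predictor can be sketched as a weighted least-squares fit. The Python outline below is a simplification under stated assumptions, not the procedure run in R for the thesis: the residual between-study variance `tau2` is treated as known rather than estimated, and the function name `meta_regression_binary` and all input values are hypothetical.

```python
import math


def meta_regression_binary(effects, variances, is_design_a, tau2=0.0):
    """Weighted least-squares meta-regression with one binary covariate.

    effects      -- study estimates Y_i (log odds ratios or mean differences)
    variances    -- within-study variances v_i
    is_design_a  -- 1 if the study has design A, 0 if it has design B
    tau2         -- residual between-study variance, treated as known here
    Returns (beta0, beta1, se_beta1).
    """
    w = [1.0 / (v + tau2) for v in variances]
    sw = sum(w)
    swx = sum(wi * xi for wi, xi in zip(w, is_design_a))
    swxx = sum(wi * xi * xi for wi, xi in zip(w, is_design_a))
    swy = sum(wi * yi for wi, yi in zip(w, effects))
    swxy = sum(wi * xi * yi for wi, xi, yi in zip(w, is_design_a, effects))

    # Closed-form solution of the 2x2 weighted normal equations.
    det = sw * swxx - swx ** 2
    beta1 = (sw * swxy - swx * swy) / det
    beta0 = (swy - beta1 * swx) / sw
    se_beta1 = math.sqrt(sw / det)
    return beta0, beta1, se_beta1


# Hypothetical log odds ratios: three studies of design A, two of design B.
log_or = [math.log(0.70), math.log(0.80), math.log(0.75),
          math.log(1.20), math.log(1.30)]
var = [0.05, 0.08, 0.04, 0.06, 0.07]
design_a = [1, 1, 1, 0, 0]
b0, b1, se1 = meta_regression_binary(log_or, var, design_a, tau2=0.01)
ror = math.exp(b1)
ci = (math.exp(b1 - 1.96 * se1), math.exp(b1 + 1.96 * se1))
print(f"ROR = {ror:.2f} (95% CI {ci[0]:.2f} to {ci[1]:.2f})")
```

For continuous outcomes the same coefficient would be read directly as a DMD, without exponentiation.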
5.3.1.4 Sensitivity analysis
NRS comparing laparoscopy and open surgery were first published in the early 1990s and
high-quality RCTs appeared nearly 15 years later. It is likely that these surgical techniques
and peri-operative processes (e.g. imaging and anaesthesia techniques, prophylactic
antibiotic guidelines and enhanced recovery pathways focusing on early feeding and
mobilization) evolved during this time. It is possible that comparisons of effect estimates
across different groups of studies (i.e. NRS, RCTs, Typical RCTs and Strong RCTs) could be
confounded by period effects.
Additionally, individual studies likely differed in the types of patients included, with some
evaluating patients with more advanced disease or a higher frequency of comorbidities, which
would impact the risk of developing post-operative complications or death. Moreover,
individual institutions have also been shown to differ in their capacity to rescue patients with
complications (i.e. prevent mortality) and this could lead to important differences in
mortality between institutions or individual studies (Ghaferi, Birkmeyer, and Dimick 2009).
These sources of clinical and methodological heterogeneity could confound the relationship
between study design and observed effect estimates. The best method for exploring
such heterogeneity uses individual patient and provider data to assess the
impact of various covariates on the treatment effect. It is uncommon for those conducting
meta-analyses and meta-regression analyses to have access to such data (Sharp and
Thompson 2000). Alternatives include using baseline event rate, which is a covariate
measured at the study level, to adjust analyses for important between-study differences
(Sharp, Thompson, and Altman 1996; Thompson, Smith, and Sharp 1997).
A sensitivity analysis incorporating period effects (i.e. year of publication) and baseline
event rate (i.e. event rate in control groups) was therefore undertaken. Baseline event rates
are considered reflective of differences in the underlying risk of patients (Barza, Trikalinos,
and Lau 2009) and “can be interpreted as a summary of a number of unmeasured patient
characteristics” (Sharp, Thompson, and Altman 1996). Studies with higher rates of post-
operative complications or mortality might differ not only in terms of patient case-mix but
also with regards to institutional processes of care. This single measure was therefore used to
incorporate between-study differences in patient case-mix and institutional practice.
For binary outcomes, the baseline event rate was equal to the proportion of patients in the
control group experiencing the outcome (i.e. either a post-operative complication or death).
For continuous outcomes (i.e. length of stay or number of lymph nodes harvested), the baseline
event rate was taken to be the mean in the control group. In both instances, the baseline event
rate is also used to calculate the overall effect estimate in a study. Therefore, frequentist
regression methods could not be used because one of the covariates (i.e. baseline event rate)
would be correlated with the dependent variable (i.e. effect estimate). The phenomenon of
regression to the mean can occur: “a high baseline event rate, observed entirely by chance,
will, on average, give rise to a higher than expected effect estimate, and vice versa”
(Higgins, Green, and Cochrane Collaboration. 2011). It is recommended that a Bayesian
analysis should be used because this approach allows for a separate posterior probability to
be calculated for the covariate — one that is unrelated to the posterior probability for the
overall effect estimate (McIntosh 1996; Thompson, Smith, and Sharp 1997; Sharp and
Thompson 2000; van Houwelingen, Arends, and Stijnen 2002). Bayesian hierarchical models
were developed according to this guidance (Thompson, Turner, and Warn 2001) and are
available in Appendix F.
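The reason a naive regression on observed baseline risk is problematic can be illustrated with a small simulation. The sketch below is not the Bayesian model in Appendix F; it assumes, hypothetically, that every study shares the same true control-group risk and the same true odds ratio, so any association between the observed baseline log odds and the observed log odds ratio arises purely from shared sampling error. The function name and all parameter values are illustrative.

```python
import math
import random


def simulate_spurious_correlation(n_studies=2000, n_per_arm=200,
                                  p_control=0.30, true_or=0.80, seed=1):
    """Correlation between observed log OR and observed control-group log odds
    when all studies share one true control risk and one true odds ratio."""
    rng = random.Random(seed)
    odds_t = true_or * p_control / (1 - p_control)
    p_treat = odds_t / (1 + odds_t)

    xs, ys = [], []
    for _ in range(n_studies):
        c = sum(rng.random() < p_control for _ in range(n_per_arm))
        t = sum(rng.random() < p_treat for _ in range(n_per_arm))
        # A 0.5 continuity correction keeps the log odds finite.
        logit_c = math.log((c + 0.5) / (n_per_arm - c + 0.5))
        logit_t = math.log((t + 0.5) / (n_per_arm - t + 0.5))
        xs.append(logit_c)            # observed baseline risk (covariate)
        ys.append(logit_t - logit_c)  # observed log odds ratio (outcome)

    mx, my = sum(xs) / n_studies, sum(ys) / n_studies
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


r = simulate_spurious_correlation()
print(f"correlation between observed baseline log odds and log OR: {r:.2f}")
```

Because the control-group result enters both the covariate and the effect estimate, the correlation is clearly negative even though no true relationship exists in the simulated data: studies with a control event rate that is high by chance appear to show more benefit.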
5.3.1.4.1 Model estimation
Bayesian analyses were performed using OpenBUGS, version 3.2.2. OpenBUGS employs
Markov Chain Monte Carlo methods and the Gibbs sampler to estimate posterior probability
distributions for quantities of interest (Lunn et al. 2009). After a burn-in of 20,000 updates,
100,000 iterations were performed. Three simultaneous chains were run and convergence
was assessed by examining Gelman-Rubin convergence plots (Gelman and Rubin 1996).
Initial values for each unknown parameter of each chain were randomly generated from a
normal distribution in R, version 2.10.1 (R Foundation for Statistical Computing, Vienna,
Austria). Non-informative prior distributions were used for all model parameters. Given the
non-informative nature of these priors and our large number of studies, we did not perform
sensitivity analyses on the choice of prior distributions. Results are reported according to the
ROBUST guidelines (Sung et al. 2005).
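The idea behind the convergence check can be sketched with the basic potential scale reduction factor of Gelman and Rubin, which compares between-chain and within-chain variance. The Python outline below is illustrative only: it is not necessarily the exact diagnostic plotted by the software, and the function name `gelman_rubin` and the simulated chains are hypothetical.

```python
import math
import random


def gelman_rubin(chains):
    """Potential scale reduction factor for equal-length chains of one parameter."""
    m = len(chains)
    n = len(chains[0])
    means = [sum(c) / n for c in chains]
    grand = sum(means) / m
    # Between-chain variance B and mean within-chain variance W.
    b = n * sum((mu - grand) ** 2 for mu in means) / (m - 1)
    w = sum(sum((x - mu) ** 2 for x in c) / (n - 1)
            for c, mu in zip(chains, means)) / m
    var_hat = (n - 1) / n * w + b / n
    return math.sqrt(var_hat / w)


rng = random.Random(42)
# Three hypothetical chains sampling the same distribution (well mixed).
mixed = [[rng.gauss(0.0, 1.0) for _ in range(1000)] for _ in range(3)]
# Three hypothetical chains centred on different values (poorly mixed).
stuck = [[rng.gauss(shift, 1.0) for _ in range(1000)] for shift in (0.0, 3.0, 6.0)]
print(f"R-hat (mixed) = {gelman_rubin(mixed):.3f}")
print(f"R-hat (stuck) = {gelman_rubin(stuck):.3f}")
```

Values close to 1 suggest that the chains are sampling the same distribution, whereas values well above 1 indicate that they have not yet converged.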
5.4 Results
5.4.1 Included studies
A subgroup of the data set described in Chapter 4 was used for the following analyses. One-
hundred and forty-four studies reported the outcomes of interest (Table 5.1). These
comparative studies involved a total of 1,177,740 participants (NRS n=1,171,524, RCT
n= 6,216). The earliest comparative studies were NRS, published in 1993. The first RCTs
appeared 3 years later. The majority of studies were affiliated with an academic center;
however, a notable proportion of NRS were conducted in a community setting (20.7%). Both
consortiums and authors with methodological expertise were more common among RCTs.
The reports of NRS were shorter (median 6 versus 7 pages) and authored by fewer
investigators.
Table 5.1 Characteristics of included studies.
                                NRS (N=121)         RCTs (N=23)         p-value•
Year of publication*            2006 (2000-2009)    2004 (2001-2006)    0.27†
Participants                    129 (67-265)        116 (60-397)        0.95†
Authors
  Number*                       5 (4-7)             7 (5-8)             0.03†
  Consortium among authors§     1 (0.8)             5 (21.7)            <0.001♦
  Methodological expertise§     15 (12.4)           8 (34.8)            0.02◊
Academic setting§               96 (79.3)           20 (87.0)           0.56♦
Number of pages*                6 (5-8)             7 (6-9)             0.02†
* Median, (Interquartile Range, IQR).
§ Number (percentage).
† Medians compared using the Mann-Whitney U test.
♦ Frequencies compared using Fisher’s exact test.
◊ Frequencies compared using Chi-square test.
• Statistically significant p values (<0.05) indicated in bold.
5.4.2 Binary outcomes
5.4.2.1 Post-operative complications
Ninety-nine studies (NRS=79, RCTs=20) reported the frequency of post-operative
complications. The results of random-effects meta-analysis are outlined in Table 5.2.
Separate analyses of all studies, NRS, all RCTs and Typical RCTs suggest that laparoscopy
was associated with fewer post-operative complications. However, a meta-analysis of Strong
RCTs did not find laparoscopy to be superior to open surgery (OR 0.96, 95% CI 0.80 to 1.15,
p=0.65) (Figure 5.3). Strong RCTs were the least heterogeneous group of studies (I2=15.4%);
the results of individual Strong RCTs are outlined in Table 5.3. Typical RCTs and all RCTs
were moderately heterogeneous (44.8% and 52.9%, respectively). The NRS were the most
diverse group of studies (I2=79.9%).
Table 5.2 Random-effects meta-analysis results for studies reporting post-operative complications.
                # of Studies   OR*    95% CI       p-value   I2♦    95% CI
All Studies     99             0.65   0.60, 0.71   <0.0001   77.2   72.5, 81.1
NRS             79             0.63   0.57, 0.70   <0.0001   79.9   75.4, 83.6
RCTs            20             0.72   0.58, 0.90   0.0045    52.9   21.7, 71.7
Typical RCTs    16             0.60   0.45, 0.82   0.0012    44.8   0.90, 69.3
Strong RCTs     4              0.96   0.80, 1.15   0.65      15.4   0.00, 87.1
* Odds Ratio. OR<1 indicates that laparoscopy is associated with fewer post-operative complications as compared with open surgery.
♦ I-squared describes the percentage of total variation across studies that is due to heterogeneity instead of chance.
Table 5.3 Results of Strong Randomized Controlled Trials
A. Post-Operative Complications
Trial         Author            Year   N      Events LAP§   Events OPEN†   Event Rate LAP§   Event Rate OPEN†   Odds Ratio*   95% CI
COST          Nelson et al.     2004   863    92/432        85/428         0.21              0.20               1.06          0.67, 1.67
MRC CLASICC   Guillou et al.    2005   794    172/526       85/268         0.33              0.32               1.05          0.76, 1.43
COLOR         Veldkamp et al.   2005   1082   111/536       110/546        0.21              0.20               1.05          0.77, 1.39
ALCCaS        Hewett et al.     2008   592    111/294       135/298        0.38              0.45               0.73          0.53, 1.02
B. Mortality
Trial         Author            Year   N      Events LAP§   Events OPEN†   Event Rate LAP§   Event Rate OPEN†   Odds Ratio*   95% CI
COST          Nelson et al.     2004   863    2/435         4/428          0.005             0.009              0.49          0.09, 2.69
MRC CLASICC   Guillou et al.    2005   794    21/526        13/268         0.040             0.049              0.82          0.40, 1.66
COLOR         Veldkamp et al.   2005   1082   6/536         10/546         0.011             0.018              0.60          0.22, 1.68
ALCCaS        Hewett et al.     2008   592    4/294         2/298          0.014             0.007              2.04          0.37, 11.23
C. Length of Stay
Trial         Author            Year   N*     Mean LAP§   Mean OPEN†   Mean Difference*   95% CI
COST          Nelson et al.     2004   863    5           6            -1.0               -1.20, -0.80
MRC CLASICC   Guillou et al.    2005   794    9           9            0.0                -0.25, 0.25
COLOR         Veldkamp et al.   2005   1082   8.2         9.3          -1.1               -1.93, -0.27
ALCCaS        Hewett et al.     2008   592    9.5         10.6         -1.1               -2.28, 0.08
D. Number of Lymph Nodes
Trial         Author            Year   N*     Mean LAP§   Mean OPEN†   Mean Difference*   95% CI
COST          Nelson et al.     2004   863    12          12           0                  -0.56, 0.56
MRC CLASICC   Guillou et al.    2005   794    12          13.5         -1.5               -1.88, -1.12
COLOR         Veldkamp et al.   2005   1082   10          10           0                  -1.24, 1.24
ALCCaS        Hewett et al.     2008   592    13          13           0                  -2.52, 2.52
* Odds ratios, mean differences and 95% CI calculated according to methods outlined in Section 5.3.1.2.2.
§ LAP = laparoscopic surgery
† OPEN = open surgery
Figure 5.3 Forest plot of meta-analysis results for studies reporting post-operative complications. Squares indicate odds ratios and error bars indicate 95% confidence intervals.
Random-effects meta-regression models were used to compare effect estimates across groups
and the results are summarized in Table 5.4. NRS estimated a benefit for laparoscopy that
was 36% larger on average than in Strong RCTs (ROR 0.64, 95% CI 0.42 to 0.97, p=0.04).
Typical RCTs also estimated a larger benefit with laparoscopy than Strong RCTs (ROR 0.63,
95% CI 0.42 to 0.96, p=0.03). This pattern was not observed when comparing NRS with all
RCTs. The effect estimates from NRS and Typical RCTs were similar (Figure 5.4).
Table 5.4 Meta-regression results comparing effect estimates for post-operative complications from different study designs.
Comparison                  ROR*   95% CI†      p-value
NRS/RCTs                    0.85   0.65, 1.13   0.28
NRS/Typical RCTs            1.01   0.73, 1.41   0.93
NRS/Strong RCTs             0.64   0.42, 0.97   0.04
Typical RCTs/Strong RCTs    0.63   0.42, 0.96   0.03
* Ratio of odds ratios. A ROR < 1 indicates that the study in the numerator showed more benefit than studies in the denominator.
Figure 5.4 Forest plot of ratios of odds ratios (ROR) from meta-regression analysis comparing study designs. Squares indicate ROR and error bars indicate 95% confidence intervals.
5.4.2.2 Peri-operative mortality
Ninety-six studies (NRS=79, RCTs=17) examined the association between surgical approach
(laparoscopic or open colon surgery) and mortality. The effect estimates from all studies and
NRS both suggest that laparoscopy is associated with fewer deaths than open surgery
(p<0.0001) (Table 5.5). Combining all RCTs however, did not demonstrate an advantage
with laparoscopy (Figure 5.5). Typical and Strong RCTs similarly suggest that there is no
benefit associated with laparoscopy. All groups had low between-study heterogeneity.
Table 5.5 Random-effects meta-analysis results for studies reporting peri-operative mortality.
                # of Studies   OR*    95% CI       p-value   I2♦    95% CI
All Studies     96             0.62   0.51, 0.75   <0.0001   25.8   3.8, 42.8
NRS             79             0.59   0.47, 0.74   <0.0001   33.9   12.7, 50.0
RCTs            17             0.83   0.55, 1.26   0.39      0      0.0, 0.0
Typical RCTs    13             0.92   0.46, 1.83   0.82      0      0.0, 0.0
Strong RCTs     4              0.78   0.46, 1.32   0.36      0      0.0, 73.9
* Odds Ratio. OR<1 indicates that laparoscopy is associated with fewer deaths.
♦ I-squared describes the percentage of total variation across studies that is due to heterogeneity instead of chance.
Figure 5.5 Forest plot of meta-analysis results for studies reporting peri-operative mortality. Squares indicate odds ratios and error bars indicate 95% confidence intervals.
Meta-regression results are outlined in Table 5.6. While there was a suggestion that NRS
over-estimate benefit associated with laparoscopy (ROR ranging from 0.63 to 0.74), none of
these comparisons were statistically significant (Figure 5.6).
Table 5.6 Meta-regression results comparing effect estimates for peri-operative mortality from different study designs.
Comparison                  ROR*   95% CI†      p-value
NRS/RCTs                    0.69   0.41, 1.16   0.16
NRS/Typical RCTs            0.63   0.30, 1.34   0.24
NRS/Strong RCTs             0.74   0.38, 1.44   0.38
Typical RCTs/Strong RCTs    1.16   0.45, 3.03   0.76
* Ratio of odds ratios. A ROR < 1 indicates that the study in the numerator showed more benefit than studies in the denominator.
Figure 5.6 Forest plot of ratios of odds ratios (ROR) from meta-regression analysis comparing study designs. Squares indicate ratios of odds ratios (ROR) and error bars indicate 95% confidence intervals.
5.4.3 Continuous outcomes
5.4.3.1 Length of stay
Estimates for length of stay were reported in 128 studies (NRS=106, RCTs=22). All groups
demonstrated a benefit associated with laparoscopy (Table 5.7). While Strong RCTs found
that length of stay was 0.70 days shorter for those treated with laparoscopy (95% CI, -1.23 to
-0.17), NRS demonstrated a benefit of nearly 3 days with laparoscopy (MD -2.95, 95% CI
-3.39 to -2.50) (Figure 5.7). Notably, I2 values for all groups were over 90%.
Table 5.7 Random-effects meta-analysis results for studies reporting length of stay (days).
                # of Studies   MD*     95% CI         p-value   I2♦    95% CI
All Studies     128            -2.74   -3.13, -2.36   <0.0001   97.3   97.0, 97.5
NRS             106            -2.95   -3.39, -2.50   <0.0001   97.3   97.0, 97.6
RCTs            22             -1.82   -2.45, -1.18   <0.0001   95.9   94.8, 96.8
Typical RCTs    18             -2.16   -2.89, -1.44   <0.0001   90.4   86.4, 93.2
Strong RCTs     4              -0.70   -1.23, -0.17   0.01      92.3   83.4, 96.4
* Mean Difference, MD = mean_laparoscopy - mean_open. A MD<0 indicates that laparoscopy is associated with a shorter length of stay.
♦ I-squared describes the percentage of total variation across studies that is due to heterogeneity instead of chance.
Figure 5.7 Forest plot of meta-analysis results for studies reporting length of stay. Squares indicate mean differences and error bars indicate 95% confidence intervals.
Meta-regression results are summarized in Table 5.8. NRS estimates for benefit were larger
by more than one day as compared with all RCTs (DMD -1.27, 95% CI -2.30 to -0.25,
p=0.01) (Figure 5.8). The difference between NRS and Strong RCT estimates was over 2
days (DMD -2.15, 95% CI -4.08 to -0.21, p=0.03). Typical RCTs also ascribed more benefit
to laparoscopy than Strong RCTs (DMD -1.40, 95% CI -2.76 to -0.04, p=0.04).
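As a rough check on the definition of the DMD, the sketch below subtracts the pooled subgroup estimates reported in Table 5.7 (NRS: MD -2.95, 95% CI -3.39 to -2.50; Strong RCTs: MD -0.70, 95% CI -1.23 to -0.17) and derives an approximate interval from their confidence limits. This is a simplified illustration that assumes independent subgroups and normal-theory intervals; it is not the meta-regression model used in the thesis, and the helper names are hypothetical.

```python
import math

Z = 1.96  # normal quantile for a 95% interval


def se_from_ci(lower, upper):
    """Back-calculate a standard error from a 95% confidence interval."""
    return (upper - lower) / (2 * Z)


def naive_dmd(md1, ci1, md2, ci2):
    """Difference in mean differences (design 1 minus design 2).

    Treats the two pooled estimates as independent, which is an assumption.
    """
    dmd = md1 - md2
    se = math.sqrt(se_from_ci(*ci1) ** 2 + se_from_ci(*ci2) ** 2)
    return dmd, (dmd - Z * se, dmd + Z * se)


# Pooled length-of-stay estimates taken from Table 5.7.
dmd, (lo, hi) = naive_dmd(-2.95, (-3.39, -2.50), -0.70, (-1.23, -0.17))
```

The simple subtraction gives a difference of about -2.25 days, close to but not identical to the meta-regression DMD of -2.15, and its interval is narrower than the one in Table 5.8. The two approaches presumably differ in how they weight studies and handle between-study variance, so the sketch illustrates the quantity being estimated rather than reproducing the analysis.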
Table 5.8 Meta-regression results comparing effect estimates for length of stay from different study designs.
Comparison                  DMD*     95% CI†         p-value
NRS:RCTs                    -1.27    -2.30, -0.25    0.01
NRS:Typical RCTs            -0.81    -1.90, 0.29     0.15
NRS:Strong RCTs             -2.15    -4.08, -0.21    0.03
Typical RCTs:Strong RCTs    -1.40    -2.76, -0.04    0.04
* Differences in Mean Differences. DMD = mean difference for study design 1 minus mean difference for study design 2. Studies are ordered in the comparison column as study design 1:study design 2. A negative mean difference indicates that laparoscopy is associated with a shorter length of stay.
Figure 5.8 Forest plot of difference in mean differences (DMD) from meta-regression analysis comparing study designs. Squares indicate DMDs and error bars indicate 95% confidence intervals. DMD = mean difference for study design 1 minus mean difference for study design 2. Studies are ordered in labels as study design 1:study design 2. A negative mean difference indicates that laparoscopy is associated with a shorter length of stay.
5.4.3.2 Number of lymph nodes harvested
Seventy-six studies reported the number of lymph nodes harvested (NRS=59, RCTs=17).
Meta-analyses across all groups revealed that a comparable number of lymph nodes were
found in specimens from laparoscopic and open surgical procedures (Table 5.9). The effect
estimate from Strong RCTs was the least favourable to laparoscopy (MD -0.55, 95% CI
-1.37 to 0.26, p=0.18) (Figure 5.9). Heterogeneity between studies was significant and ranged
from 50.6 to 93.4%. NRS were the most diverse group of studies.
Table 5.9 Random-effects meta-analysis results for studies reporting number of lymph nodes harvested.
Study group     # of Studies   MD*     95% CI         p-value   I²♦    95% CI
All Studies     76             -0.02   -0.50, 0.46    0.93      92.0   90.6, 93.2
NRS             59             0.07    -0.53, 0.67    0.81      93.4   92.2, 94.5
RCTs            17             -0.35   -0.93, 0.23    0.24      68.3   47.7, 80.8
Typical RCTs    13             -0.23   -1.03, 0.57    0.58      50.6   6.7, 73.8
Strong RCTs     4              -0.55   -1.37, 0.26    0.18      86.3   66.7, 94.4
* Mean Difference, MD = mean(laparoscopy) - mean(open). A MD<0 indicates that laparoscopy is associated with finding fewer lymph nodes in the surgical specimen.
♦ I-squared describes the percentage of total variation across studies that is due to heterogeneity instead of chance.
Figure 5.9 Forest plot of meta-analysis results for studies reporting number of lymph nodes harvested. Squares indicate mean differences and error bars indicate 95% confidence intervals.
Table 5.10 outlines the results of the meta-regression modelling with studies reporting
number of lymph nodes harvested. The DMDs for all comparisons were smaller than 0.5
(i.e. half a lymph node) and none were statistically significant.
Table 5.10 Meta-regression results comparing effect estimates for number of lymph nodes harvested from different study designs.
Comparison                  DMD*    95% CI†        p-value
NRS:RCTs                    0.38    -0.76, 1.53    0.51
NRS:Typical RCTs            0.34    -0.98, 1.66    0.61
NRS:Strong RCTs             0.49    -1.43, 2.42    0.62
Typical RCTs:Strong RCTs    0.15    -2.04, 2.35    0.89
* Differences in Mean Differences. DMD = mean difference for study design 1 minus mean difference for study design 2. Studies are ordered in the comparison column as study design 1:study design 2. A negative mean difference indicates that laparoscopy is associated with finding fewer lymph nodes in the surgical specimen.
Figure 5.10 Forest plot of difference in mean differences (DMD) from meta-regression analysis comparing study designs. Squares indicate DMDs and error bars indicate 95% confidence intervals. DMD = mean difference for study design 1 minus mean difference for study design 2. Studies are ordered in labels as study design 1:study design 2. A negative mean difference indicates that laparoscopy is associated with finding fewer lymph nodes in the surgical specimen.
5.4.3.3 Sensitivity analysis
The median year of publication for included studies and baseline event rates among control
groups are reported in Table 5.11. The median year of publication differs by at most two
years between NRS and RCTs. Baseline event rates are also similar. Figure 5.11 presents the
distribution of baseline event rates among NRS and RCTs. For subjective outcomes
(i.e. post-operative complications and length of stay), there were more outlying studies
among NRS; for post-operative complications, 10.1% of NRS had a baseline event rate
>45%, the highest baseline event rate observed among RCTs. For length of stay, 17.9% of
NRS had a baseline mean >14 days, the upper limit for RCTs.
Table 5.11 Median year of publication and baseline event rates in studies reporting the outcomes of interest.
Outcome / Measure                NRS Median (IQR)       RCTs Median (IQR)
Post-Operative Complications
  Studies                        79                     20
  Year                           2006 (2002-2008)       2004 (2000-2006)
  Baseline Event Rate            0.26 (0.20-0.34)       0.25 (0.20-0.30)
Mortality
  Studies                        79                     17
  Year                           2006 (2002-2008)       2004 (2000-2004)
  Baseline Event Rate            0.01 (0.00-0.03)       0.01 (0.00-0.02)
Length of Stay
  Studies                        106                    22
  Year                           2006 (2000-2004)       2004 (2002-2004)
  Mean in Control Group*         9.40 (7.82-10.83)      9.00 (7.32-11.39)
Number of LN harvested
  Studies                        59                     17
  Year                           2005 (1998-2004)       2004 (2002-2004)
  Mean in Control Group*         13.80 (9.68-14.35)     13.00 (10.50-16.00)
* Control Group = Open Surgery Group
Figure 5.11 Event rates in control groups across included studies. Rates in studies reporting A) Post-Operative Complications and B) Mortality expressed as percent rates. Rates in studies reporting C) Length of Stay and D) Number of Lymph Nodes Harvested expressed as means. Black – Randomized controlled trials. Gray – Non-randomized studies.
We examined the relationship between baseline event rate and publication year visually
(Figure 5.12) for each of the outcomes of interest. No common pattern was evident
across outcomes, and there was notable variation between study designs.
Figure 5.12 Baseline event rates over time. Event rates in control groups across included studies. Rates in studies reporting A) Post-operative complications and B) Peri-operative mortality expressed as percent rates. Rates in studies reporting C) Length of stay and D) Number of lymph nodes harvested expressed as means. Black – Randomized controlled trials. Gray – Non-randomized studies.
The results of univariable and multivariable Bayesian meta-regression analyses for studies
reporting post-operative complications are outlined in Table 5.12. The results of the adjusted
and unadjusted analyses remain consistent with those of the primary analysis (Table 5.3);
both NRS and Typical RCTs were associated with more extreme estimates of benefit for
laparoscopy as compared with Strong RCTs. Moreover, as the baseline event rate increased
in a given study, laparoscopy was associated with fewer post-operative complications. For
instance, in the comparison of NRS with Strong RCTs, the ROR for baseline event rate is
0.67. This indicates that for a one logit increase in the baseline event rate, the odds ratio for
post-operative complications decreases by 33%. As a demonstrative example, an increase in
the baseline event rate from 0.25 to 0.35 (equal to 0.21 logits) would result in a decrease of
the odds ratio by 8%. Therefore, the benefit of laparoscopy appears to be more pronounced
in studies where patients in the control group were more likely to experience a complication.
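The arithmetic behind the demonstrative example above can be sketched in a few lines of Python. The only inputs are the values quoted in the text (a ROR of 0.67 per one logit increase, and baseline event rates of 0.25 and 0.35); the variable names are hypothetical. One point worth flagging, as an observation rather than a statement about how the model was specified: the quoted change of 0.21 matches the difference in base-10 log-odds, whereas the difference in natural-log odds between the same two rates is about 0.48.

```python
import math


def odds(p):
    """Convert a probability to odds."""
    return p / (1.0 - p)


ror_per_logit = 0.67          # ROR for baseline event rate, as quoted in the text
p_low, p_high = 0.25, 0.35    # baseline event rates used in the example

# Change in log-odds on the base-10 scale; this reproduces the quoted 0.21.
delta_log10 = math.log10(odds(p_high)) - math.log10(odds(p_low))

# Change in log-odds on the natural-log scale, shown for comparison only.
delta_ln = math.log(odds(p_high)) - math.log(odds(p_low))

# Multiplicative change in the odds ratio for the quoted 0.21-logit increase.
or_multiplier = ror_per_logit ** delta_log10
percent_decrease = (1.0 - or_multiplier) * 100.0
```

Raising the per-logit ROR to the power of the change in logits gives a multiplier of roughly 0.92, that is, a decrease in the odds ratio of about 8%, in agreement with the figure stated above.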
A similar trend for baseline event rate was observed among studies reporting peri-operative
mortality; as the baseline rate of deaths in a study increased, the odds ratio for death
decreased (Table 5.13). As with the primary analysis (Table 5.6), effect estimates did not
appear to differ across study designs in either the unadjusted or adjusted analyses.
Among studies reporting length of stay, NRS were again associated with more extreme
estimates of treatment effect as compared with all RCTs and Strong RCTs (Table 5.14).
Effect estimates in Typical RCTs were also more extreme as compared with those from
Strong RCTs. Multivariable analyses, adjusting for year of publication and baseline event
rate, revealed similar findings. The results for this outcome were again similar to those from
primary analyses (Table 5.8). Moreover, the difference in length of stay between laparoscopy
and open surgery grew as the baseline mean increased. For example, in the
comparison of NRS and Strong RCTs, for each one-day increase in the mean length of stay in
the control group, the difference between laparoscopy and open surgery increased by 0.39 days.
A similar trend was not observed among studies reporting number of lymph nodes harvested.
The results of the Bayesian unadjusted and adjusted analyses for studies reporting number of
lymph nodes harvested (Table 5.15) were similar to primary analyses (Table 5.10); effect
Table 5.12 Bayesian meta-regression results comparing effect estimates for post-operative complications from different study designs, adjusted for year of publication and baseline event rate.
Analysis                    Unadjusted           Multivariable        Multivariable        Multivariable
Estimate (95% CrI†)         Design ROR*          Design ROR*          Year ROR♦            Baseline Event Rate ROR§
NRS/RCTs                    0.85 (0.63, 1.15)    0.87 (0.67, 1.16)    0.99 (0.96, 1.02)    0.65 (0.56, 0.77)
NRS/Typical RCTs            1.00 (0.70, 1.41)    1.06 (0.78, 1.47)    0.98 (0.96, 1.01)    0.65 (0.55, 0.78)
NRS/Strong RCTs             0.58 (0.37, 0.93)    0.60 (0.37, 0.94)    0.98 (0.95, 1.01)    0.67 (0.58, 0.79)
Typical RCTs/Strong RCTs    0.62 (0.38, 0.99)    0.57 (0.36, 0.88)    1.07 (1.00, 1.15)    0.40 (0.32, 0.64)
Statistically significant values in bold (i.e. credible intervals do not include unity). * Ratio of odds ratios. A ROR < 1 indicates that the study in the numerator showed more benefit than studies in the denominator. † Credible Interval. ♦ ROR for year can be interpreted as the change in the overall study effect (i.e. odds ratio) for a one unit increase in year. For example, a ROR of 0.95 indicates that publication of an article one year later would be associated with a decrease in the odds ratio by 5%. § ROR for baseline event rates are expressed for a one logit increase in baseline event rate.
Table 5.13 Bayesian meta-regression results comparing effect estimates for peri-operative mortality from different study designs, adjusted for year of publication and baseline event rate.
Analysis                    Unadjusted           Multivariable        Multivariable        Multivariable
Estimate (95% CrI†)         Design ROR*          Design ROR*          Year ROR♦            Baseline Event Rate ROR§
NRS/RCTs                    0.65 (0.35, 1.10)    0.94 (0.52, 1.62)    0.96 (0.91, 1.02)    0.38 (0.37, 0.39)
NRS/Typical RCTs            0.66 (0.26, 1.42)    1.11 (0.49, 2.21)    0.96 (0.91, 1.02)    0.37 (0.37, 0.37)
NRS/Strong RCTs             0.69 (0.49, 1.41)    0.89 (0.33, 2.03)    0.96 (0.89, 1.02)    0.37 (0.37, 0.37)
Typical RCTs/Strong RCTs    1.26 (0.32, 3.46)    0.93 (0.22, 2.64)    1.01 (0.84, 1.22)    0.37 (0.36, 0.40)
Statistically significant values in bold (i.e. credible intervals do not include unity). * Ratio of odds ratios. A ROR < 1 indicates that the study in the numerator showed more benefit than studies in the denominator. † Credible Interval. ♦ ROR for year can be interpreted as the change in the overall study effect (i.e. odds ratio) for a one unit increase in year. For example, a ROR of 0.95 indicates that publication of an article one year later would be associated with a decrease in the odds ratio by 5%. § ROR for baseline event rates are expressed for a one logit increase in baseline event rate.
Table 5.14 Bayesian meta-regression results comparing effect estimates for length of stay from different study designs, adjusted for year of publication and baseline event rate.
Analysis                    Unadjusted              Multivariable           Multivariable          Multivariable
Estimate (95% CrI†)         Design DMD*             Design DMD*             Year DMD♦              Baseline Event Rate DMD§
NRS/RCTs                    -1.07 (-1.98, -0.02)    -0.76 (-1.38, -0.11)    -0.01 (-0.06, 0.04)    -0.39 (-0.45, -0.33)
NRS/Typical RCTs            -0.85 (-1.94, 0.23)     -0.43 (-1.14, 0.31)     -0.01 (-0.07, 0.04)    -0.39 (-0.45, -0.33)
NRS/Strong RCTs             -1.74 (-3.25, -0.26)    -1.52 (-2.71, -0.32)    -0.02 (-0.07, 0.04)    -0.39 (-0.46, -0.33)
Typical RCTs/Strong RCTs    -1.32 (-2.43, -0.21)    -1.07 (-1.83, -0.33)    0.03 (-0.17, 0.23)     -0.31 (-0.59, -0.02)
Statistically significant values in bold (i.e. credible intervals do not include zero). * DMD = mean difference for study design 1 minus mean difference for study design 2. Studies are ordered in the comparison column as study design 1/study design 2. † Credible Interval. ♦ DMD for year can be interpreted as the change in the overall study effect (i.e. mean difference) for a one unit increase in year. For example, a DMD of -0.50 indicates that if an article was published one year later, the mean difference becomes more negative by 0.50 days. § Baseline Event Rate = mean in the control (open) group. DMD for baseline event rate can be interpreted as the change in the overall study effect (i.e. mean difference) for a one day increase in baseline mean.
Table 5.15 Bayesian meta-regression results comparing effect estimates for number of lymph nodes harvested from different study designs, adjusted for year of publication and baseline event rate.
Analysis                    Unadjusted             Multivariable          Multivariable          Multivariable
Estimate (95% CrI†)         Design DMD*            Design DMD*            Year DMD♦              Baseline Event Rate DMD§
NRS/RCTs                    0.42 (-0.74, 1.58)     0.43 (-0.78, 1.70)     -0.02 (-0.13, 0.09)    -0.05 (-0.16, 0.04)
NRS/Typical RCTs            0.32 (-1.07, 1.70)     0.36 (-0.94, 1.72)     -0.02 (-0.13, 0.10)    -0.06 (-0.17, 0.04)
NRS/Strong RCTs             0.44 (-1.45, 2.46)     0.37 (-1.68, 2.34)     -0.04 (-0.16, 0.08)    -0.03 (-0.14, 0.08)
Typical RCTs/Strong RCTs    0.23 (-1.34, 1.71)     0.92 (-0.30, 2.81)     0.13 (-0.11, 0.25)     0.06 (-0.19, 0.08)
Statistically significant values in bold (i.e. credible intervals do not include zero). * DMD = mean difference for study design 1 minus mean difference for study design 2. Studies are ordered in the comparison column as study design 1/study design 2. † Credible Interval. ♦ DMD for year can be interpreted as the change in the overall study effect (i.e. mean difference) for a one unit increase in year. For example, a DMD of -0.50 indicates that if an article was published one year later, the mean difference becomes more negative by 0.50 lymph nodes. § Baseline Event Rate = mean in the control (open) group. DMD for baseline event rate can be interpreted as the change in the overall study effect (i.e. mean difference) for a one unit increase in baseline mean.
estimates did not statistically differ across different study designs. Baseline event rate also did not
appear to influence estimates of mean differences.
5.4.3.3.1 Publication bias
Due to significant heterogeneity across all four outcomes, formal tests for publication bias
were not undertaken (Ioannidis and Trikalinos 2007). Funnel plots were therefore examined
visually (Figure 5.13). The funnel plot for post-operative complications appeared to have
some minor asymmetry; there were approximately 5 small NRS near the bottom of the funnel
favoring laparoscopy that were not balanced by similarly-sized studies favoring open
surgery. Minor asymmetry was also noted with the funnel plot for length of stay.
5.5 Discussion
This study comparing effect estimates from RCTs with those from NRS has three main
findings. First, among subjective outcomes, NRS had more extreme estimates of benefit for
laparoscopy than Strong RCTs. For the outcome post-operative complications, NRS
attributed 36% more benefit to laparoscopy than Strong RCTs (ROR 0.64, 95% CI 0.42-0.97).
Laparoscopy was also associated with a length of stay that was 2 days shorter in NRS
as compared with Strong RCTs. A similar pattern was not observed with the objective
outcomes mortality and number of lymph nodes harvested. The observed differences
between NRS and Strong RCTs persisted after adjusting for period effects and differences in
baseline event rates between studies. Second, among subjective outcomes, effect estimates
from Typical RCTs were similar to those from NRS. Like NRS, Typical RCTs were also
associated with larger estimates of benefit; combined odds ratios for post-operative
complications were 37% smaller (i.e. more benefit attributed to laparoscopy) in Typical
RCTs than Strong RCTs. Differences in length of stay between laparoscopy and
Figure 5.13 Funnel plots for A) Post-operative complications, 79 non-randomized studies (black) and 20 RCTs (white); B) Mortality, 79 non-randomized studies (black) and 17 RCTs (white); C) Length of stay, 106 non-randomized studies (black) and 22 RCTs (white); D) Number of lymph nodes, 59 non-randomized studies (black) and 17 RCTs (white).
conventional surgery were -0.70 days in Strong RCTs (favouring laparoscopy) but were -2.16
days in Typical RCTs. Third, there was significant between-study heterogeneity across all
four outcomes, and NRS were more heterogeneous than Typical or Strong RCTs.
A previous study has compared the findings of NRS with those of RCTs evaluating
laparoscopy and conventional surgery for the management of colon cancer (Abraham et al.
2010). Unlike our study, this study concluded that the effect estimates across these two study
designs are generally comparable. Our study has a number of advantages including having
identified a larger cohort of studies comparing laparoscopy with open surgery for colon
cancer (n=144 studies versus n=61 studies). Our analyses also assessed comparability across
objective and subjective outcomes; recent studies have demonstrated that bias associated
with RCT design attributes (e.g. allocation concealment) is more pronounced with subjective
outcomes (Wood et al. 2008; Savovic et al. 2012). Our findings are consistent with this:
exaggerated estimates of benefit were evident for subjective but not
objective outcomes. While Abraham and colleagues measured and acknowledged the
variation in methodological quality among RCTs in their study, they nonetheless aggregated
effect estimates across all RCTs. We instead chose to handle the variability in trial quality by
dividing RCTs into Strong and Typical studies. By isolating rigorously performed RCTs, we
combined effect estimates from a population of studies at the lowest risk of bias. Our results
suggest that the findings of Strong RCTs are more conservative (i.e. closer to the null) than
those of Typical RCTs. Hartling et al. have also found similar results in a study comparing
pediatric efficacy trials at low, unclear or high risk of bias (Hartling et al. 2009). In this
study, effect estimates from RCTs at low risk of bias were closest to the null.
Other studies have examined the comparability of NRS and RCTs across other interventions.
While some have found the results of NRS and RCTs to be generally comparable
(Concato, Shah, and Horwitz 2000; Benson and Hartz 2000), others have found important
differences (Britton et al. 1998; Shikata et al. 2006; Kunz, Vist, and Oxman 2007). We found
that NRS attributed 36% more benefit to laparoscopy than Strong RCTs when examining
subjective outcomes. Whereas we found that NRS overestimated benefit with laparoscopy,
others have found that surgical NRS can overestimate harm (Bhandari et al. 2004). In a study
by Bhandari et al., the results of NRS evaluating arthroplasty and internal fixation for hip
fracture were compared with the results of RCTs. They identified 13 NRS and 14 RCTs.
Mortality data were available in 13 NRS and 12 RCTs. The relative risk for mortality with
arthroplasty as compared with internal fixation in NRS was 40% larger than the estimate in
RCTs; the RR was 1.44 in NRS (95% CI 1.13, 1.85) versus 1.04 in RCTs (95% CI
0.84, 1.29). It is interesting that the magnitude of bias in our study is similar to the bias
detected by Bhandari et al., but the direction of bias is not.
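The comparison with Bhandari et al. can be expressed on the same ratio scale used elsewhere in this chapter. The sketch below divides the two relative risks quoted above and approximates an interval for the ratio from their reported confidence limits on the log scale. It assumes the two estimates are independent and that the intervals are symmetric on the log scale; the function names are hypothetical and the calculation is an illustration, not an analysis reported by either study.

```python
import math

Z = 1.96  # normal quantile for a 95% interval


def log_se_from_ci(lower, upper):
    """Approximate the standard error of a log ratio from its 95% CI."""
    return (math.log(upper) - math.log(lower)) / (2 * Z)


def ratio_of_ratios(est1, ci1, est2, ci2):
    """Ratio of two ratio measures (e.g. RR in NRS over RR in RCTs)."""
    log_ratio = math.log(est1) - math.log(est2)
    se = math.sqrt(log_se_from_ci(*ci1) ** 2 + log_se_from_ci(*ci2) ** 2)
    interval = (math.exp(log_ratio - Z * se), math.exp(log_ratio + Z * se))
    return math.exp(log_ratio), interval


# Relative risks for mortality as quoted above (NRS first, then RCTs).
ratio, (lower, upper) = ratio_of_ratios(1.44, (1.13, 1.85), 1.04, (0.84, 1.29))
```

The ratio of relative risks comes out at about 1.38, broadly in line with the roughly 40% larger estimate described above.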
One strength of our study over previous comparisons is that it incorporates RCT
quality into comparisons of NRS and trials. Previous attempts to compare effect estimates
from NRS and RCTs treated the latter as a homogeneous group of high-quality studies.
Important methodological differences between individual RCTs may have been overlooked.
Moreover, many of the studies comparing NRS and RCTs included studies conducted in the
1980s and 1990s. Fewer than 17% of studies in this cohort were published before 2000.
Therefore, our study represents a more contemporary comparison of NRS and RCTs.
The results of this study are limited by a reliance on reported study methods to categorize
RCTs as Typical or Strong. For example, it is possible that even though RCTs did not report
adequate random sequence generation, appropriate methods for randomizing patients may
have been employed. Accordingly, there may have been misclassification of RCTs because
assessments were made using reported methods instead of actual study conduct. To limit the
possibility of such misclassification, study protocols were reviewed and authors were
contacted to collect additional information. Moreover, the Cochrane Risk of Bias Tool,
which was used to classify RCTs, has been criticized for being a subjective instrument
heavily influenced by judgment; Hartling et al. have demonstrated low inter-rater reliability
for the domains blinding, incomplete data and selective reporting (Hartling et al. 2012).
However, Hartling and colleagues examined an earlier version of the Cochrane Risk of Bias
Tool. In the interim, the instrument has been revised to diminish ambiguity and has more
detailed guidance for making individual domain assessments (Higgins et al. 2011). This more
recent version of the tool was used in this study. We also evaluated the subjectivity of our
assessments by examining RCTs reporting post-operative complications in duplicate. There
was perfect agreement between the two assessors. The same four RCTs were identified as being at
the lowest risk of bias or “Strong” across all four outcomes of interest. Notably, these RCTs were
registered, multi-centered, large RCTs that were publicly funded. RCTs with these attributes
have been shown to be less susceptible to bias (Als-Nielsen et al. 2003; Nuesch et al. 2010;
Dechartres et al. 2011).
Our analyses were also limited by using baseline event rate to control for between-study
heterogeneity. Without access to patient and provider-level data, baseline event rate was used
as a measure of aggregate underlying risk. If such data had been available, our analyses
could have been adjusted for differences in age, cancer stage or physician experience with
laparoscopy, variables that may have influenced between-study differences in post-operative
complications, mortality, length of stay and number of lymph nodes harvested.
Baseline event rate is a more indirect measure of these attributes but still
represents an attempt to adjust comparisons of NRS and RCTs for between-study clinical
heterogeneity.
Our analysis focused on a single intervention in surgery. It remains to be seen if our results
are generalizable to other surgical interventions or interventions in other areas of medicine.
Additional studies will be necessary to determine if NRS routinely overestimate the benefit
associated with a novel intervention. Moreover, while we demonstrated a pattern of
exaggerated benefit among NRS for the subjective outcomes post-operative complications
and length of stay, this finding may not apply to other subjective outcomes. We had hoped to
analyze pain as an additional outcome; however, it was reported too inconsistently and
infrequently to do so.
The results of this study raise an important question for the meta-analysis community:
how should effect estimates from RCTs of varying quality be combined? Incorporating
quality scores into meta-regression analyses has been previously discouraged (Juni et al.
1999; Greenland and O'Rourke 2001). Currently, subgroup analyses are used to explore
heterogeneity in a meta-analysis. However, performing a separate analysis of Strong RCTs,
even in the absence of heterogeneity, may reveal important differences among trials that can
nuance the interpretation of random-effects meta-analysis.
The results of this study may fuel the ongoing debate over the utility of NRS for decision
making in health care. While meta-analyses of RCTs will continue to be considered the most
reliable source of evidence for evaluating interventions, NRS may provide important insights
when evaluating objective outcomes. It is important to note though that while this study did
not demonstrate a difference in effect estimates between NRS and Strong RCTs for objective
outcomes, an absence of a difference is not evidence of “no difference.” If our findings are
replicated in other disease areas and with other interventions, perhaps the meta-analyses of
NRS for objective outcomes could be placed higher in most evidence hierarchies.
There is increasing interest in evaluating risk of bias in NRS. Just as Strong RCTs yield
results that differ from other RCTs, perhaps the same is true of Strong NRS. A valid and
reliable tool, however, is required to identify these NRS. Accordingly, empirical evidence is
necessary to determine which aspects of NRS study design are associated with bias.
Numerous meta-epidemiological studies of RCTs helped to establish which aspects of RCT
design are important when assessing risk of bias — we believe that similar studies are
required for NRS. The choice of referent group for meta-epidemiological studies of NRS
is, however, less clear. Should NRS without a characteristic (e.g. matched controls) be
compared with NRS where controls were matched? Or should the effect estimates from this
former group be compared with RCTs? Doing so would overlook the differences in quality
among RCTs. Instead, the results of this study would suggest that aggregate effect estimates
from Strong RCTs should serve as the referent group for future meta-epidemiological studies
of NRS. Care should also be taken to make a distinction between subjective and objective
outcomes when performing these proposed studies.
5.6 Conclusion
When evaluating subjective outcomes, NRS were associated with
larger estimates of benefit for laparoscopy than Strong RCTs. Typical RCTs (i.e. at unclear
and high risk of bias) also had more extreme estimates of benefit for laparoscopy as
compared with Strong RCTs. Similar trends were not observed among objective outcomes
(mortality and number of lymph nodes harvested).
Chapter 6 Empirically identifying the study attributes of non-randomized studies associated with bias:
a meta-epidemiology study
6.1 Summary
Objective
Numerous studies suggest that aspects of RCT study design are associated with biased
intervention effect estimates. Comparable empirical evidence is lacking for NRS. The
objective of this study was to explore the relationship between NRS-design attributes and
estimates of treatment effect.
Methods
A systematic review identified all comparative studies evaluating laparoscopy and
conventional surgery for the management of colon cancer. NRS reporting four outcomes of
interest (post-operative complications, peri-operative mortality, length of stay and number of
lymph nodes harvested) were selected. Nine NRS study characteristics were abstracted as
binary variables: (i) whether the outcome of interest was the primary outcome, (ii) presence
of a sample size calculation, (iii) prospective data collection, (iv) concurrent (versus
historical) controls, (v) matched controls, (vi) standardized concurrent therapy (i.e. post-
operative care), (vii) systematic outcome assessment, (viii) blinded outcome assessment and
(ix) intention to treat analysis. Random-effects meta-analyses were conducted to pool
summary effect estimates across NRS with and without study characteristics. Mixed-effects
meta-regression models were used to compare effect estimates across subgroups. The effect
estimates from NRS without study characteristics were compared with effect estimates from
NRS with study characteristics. Effect estimates from NRS with and without study
characteristics were each compared with the results of RCTs at low risk of bias.
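One way to picture the data abstraction described above is as a record with nine binary fields per NRS, from which "with" and "without" subgroups can be formed for each characteristic. The sketch below is an illustration only: the class names, field names and example studies are hypothetical and do not reflect the data extraction tool actually used.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class NRSCharacteristics:
    """The nine study characteristics, each abstracted as a binary variable."""
    outcome_is_primary: bool
    sample_size_calculation: bool
    prospective_data_collection: bool
    concurrent_controls: bool
    matched_controls: bool
    standardized_concurrent_therapy: bool
    systematic_outcome_assessment: bool
    blinded_outcome_assessment: bool
    intention_to_treat_analysis: bool


@dataclass(frozen=True)
class Study:
    label: str                      # hypothetical study identifier
    traits: NRSCharacteristics


def split_by(studies, characteristic):
    """Return (studies with the characteristic, studies without it)."""
    present = [s for s in studies if getattr(s.traits, characteristic)]
    absent = [s for s in studies if not getattr(s.traits, characteristic)]
    return present, absent


# Two made-up studies, for illustration only.
studies = [
    Study("NRS A", NRSCharacteristics(False, False, True, True, False,
                                      False, False, False, True)),
    Study("NRS B", NRSCharacteristics(False, False, False, True, True,
                                      False, False, False, True)),
]
prospective, retrospective = split_by(studies, "prospective_data_collection")
```

Each pair of subgroups produced this way would then be pooled separately and compared, both with each other and with the Strong RCTs, as described in the Methods above.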
Results
A total of 121 NRS reported the outcomes of interest. Most NRS had retrospective data
collection, concurrent controls and intention to treat analysis, but lacked sample size calculations,
matched controls, standardized concurrent therapy (i.e. standardized post-operative care),
blinded outcome assessors or systematic outcome assessment. Effect estimates generally did
not differ across NRS with or without study characteristics except for the outcome peri-
operative mortality; NRS with retrospective data collection had more extreme estimates of
benefit for laparoscopy than NRS with prospective data collection (ROR 0.62, 95% CI 0.44,
0.87, p-value=0.01). In addition, effect estimates were closer to the null (i.e. less in favour of
laparoscopy) in NRS where the primary outcome was peri-operative mortality as opposed to
NRS in which post-operative death was a secondary outcome (ROR 1.68, 95% CI 1.11,
2.52). However, when effect estimates from NRS subgroups were compared with the results
of Strong RCTs, none of the differences proved to be statistically significant.
Conclusions
Effect estimates did not consistently vary according to the presence or absence of NRS
design characteristics among studies comparing laparoscopy and open surgery for the
treatment of colon cancer. Additional studies are necessary to identify the attributes of NRS-
design associated with bias.
6.2 Introduction
Non-randomized studies are regarded as an important source of information for the efficacy
of interventions, especially in instances where RCTs are not possible or rarely undertaken
(Reeves et al. 2013a). NRS are the main source of evidence for organizational, public health
(Higgins et al. 2013) and surgical interventions (Wente et al. 2003). Moreover, NRS often
provide the sole information about the long-term outcomes of interventions, rare events or
adverse events (Loke et al. 2007). The lack of randomization though renders most NRS at a
heightened risk of selection bias. We have previously shown that for subjective outcomes,
NRS are associated with more extreme estimates of benefit as compared with Strong RCTs
(Chapter 5). It is possible, however, that there may be a subgroup of rigorous NRS that yield
results comparable to these high-quality RCTs.
Previous meta-epidemiological studies have established that certain aspects of RCT design
are associated with biased effect estimates. For example, RCTs lacking appropriate random
sequence generation (Wood et al. 2008; Savovic et al. 2012), allocation concealment (Schulz
et al. 1995; Moher et al. 1998; Kjaergard, Villumsen, and Gluud 2001; Pildal et al. 2007;
Wood et al. 2008), blinding (Schulz et al. 1995; Kjaergard, Villumsen, and Gluud 2001;
Pildal et al. 2007; Wood et al. 2008; Hrobjartsson et al. 2012; Savovic et al. 2012;
Hrobjartsson et al. 2013) or those with exclusions after randomization (Tierney and Stewart
2005; Nuesch, Trelle, et al. 2009) have been associated with bias. Similar empirical evidence
is lacking for NRS. Identifying the attributes of NRS that are associated with bias could help
those reviewing and meta-analyzing NRS to isolate a subgroup of rigorous studies. Such
advancements are necessary to understand how to best use the evidence from NRS,
especially in domains such as surgery where NRS far outnumber RCTs.
The objective of this study was to explore the relationship between NRS-design attributes
and estimates of treatment effect. The literature comparing laparoscopy with open surgery
for colon cancer was used for this case study of bias.
6.3 Methods
6.3.1 Included studies
All NRS and RCTs comparing laparoscopy with conventional (i.e. open) surgery for the
treatment of colon cancer were identified using the search strategy outlined in Chapter 4
(Section 4.2). Only those studies reporting the outcomes of interest (post-operative
complications, peri-operative mortality, length of stay and number of lymph nodes
harvested) were used in the analyses that follow. Post-operative complications and length of
stay were considered subjective outcomes whereas mortality and number of lymph nodes
harvested were classified as objective outcomes (Chapter 4, Section 4.5.1). It has been
demonstrated that bias associated with RCT design attributes (e.g. allocation concealment) is
more pronounced with subjective outcomes (Wood et al. 2008; Savovic et al. 2012). We
therefore chose to analyze both subjective and objective outcomes in this study. Strong RCTs
were identified according to the approach outlined in Chapter 4 (Section 4.8).
6.3.2 NRS study characteristics
For each NRS, the following nine study characteristics were abstracted as binary variables:
(i) whether the outcome of interest was the primary outcome, (ii) presence of a sample size
calculation, (iii) prospective data collection, (iv) concurrent (versus historical) controls,
(v) matched controls, (vi) standardized concurrent therapy, (vii) systematic outcome
assessment, (viii) blinded outcome assessment and (ix) intention to treat analysis (Table 6.1).
Characteristics were primarily chosen from the conceptual framework described in Chapter 3
via informal consensus among investigators. Of note, characteristics i and ii did not stem
from the conceptual framework but were deemed important to analyze. We included
“outcome of interest as primary outcome” since it was hypothesized that NRS designed to
detect a difference in a particular outcome may yield different results from studies where the
Table 6.1 NRS study characteristics – definitions and relationship to the conceptual framework for bias in NRS.
Characteristic: Outcome of interest as primary outcome
Framework domain and item: N/A
Present: Outcome of interest (i.e. post-operative complications, peri-operative mortality, LOS or number of LN harvested) was identified by study investigators as a primary outcome of the study.
Absent: Outcome of interest not specified as a primary outcome or no primary outcomes specified.

Characteristic: Sample size calculation
Framework domain and item: N/A
Present: Investigators state that a sample size calculation (to detect a specified minimum clinically important difference) or power calculation was performed.
Absent: No such specification provided.

Characteristic: Prospective data collection
Framework domain and item: Information Bias - Source of data
Present: Data collection was initiated before the occurrence of outcomes among any members of the cohort under study.
Absent: Data collection was initiated after the occurrence of outcomes among those under study.

Characteristic: Concurrent controls
Framework domain and item: Selection Bias - Comparability of groups at baseline
Present: Controls (i.e. patients in the conventional/open surgery group) treated during the same time period as patients in the intervention (i.e. laparoscopy) group.
Absent: Patients in the open surgery group treated in a time period that pre-dates the treatment of laparoscopy patients.

Characteristic: Matched controls
Framework domain and item: Selection Bias - Comparability of groups at baseline
Present: Investigators state patients in the laparoscopy group were “matched” to those in the control group.
Absent: No indication of matching patients in the intervention (i.e. laparoscopy) group to those in the control (i.e. open) surgery group.

Characteristic: Standardized concurrent therapy
Framework domain and item: Performance Bias - Concurrent treatment/co-interventions
Present: Investigators specify that post-operative care was standardized or mention a specific “enhanced recovery pathway” protocol.
Absent: Post-operative care was not standardized and surgeons each treated patients according to “standard principles of post-operative care,” or no specific mention is made of post-operative care.

Characteristic: Systematic outcome assessment
Framework domain and item: Detection Bias - Systematic determination of outcome
Present: Investigators specify that outcomes were assessed according to a standardized protocol and/or by trained abstractors.
Absent: No mention of a standardized protocol and/or trained abstractors for assessing outcomes.

Characteristic: Blinded outcome assessment
Framework domain and item: Detection Bias - Blinded outcome assessment
Present: Outcomes assessed by an individual blinded to treatment allocation.
Absent: Outcomes assessed by study personnel aware of treatment allocation or no specification of outcome assessor status.

Characteristic: Intention to treat analysis
Framework domain and item: Attrition Bias - Intention to treat analysis
Present: Converted patients, those whose surgeries were initiated as laparoscopic procedures but were completed as open surgery, analyzed as part of the laparoscopy (i.e. intervention) group.
Absent: Converted patients analyzed as part of the open (i.e. control) group.
outcome of interest was a secondary outcome. It was also postulated that the description of a
sample size calculation or post-hoc power calculation may have been a marker for higher
methodological rigor. The remaining characteristics emerge from five of six domains in the
conceptual framework; as there are no registered protocols for NRS, aspects of selective
reporting bias could not be assessed.
6.3.2.1 Validation of study characteristic abstraction
A second reviewer abstracted NRS study characteristics from a random subset of NRS
reporting post-operative complications (38 of 79 studies). Crude agreement was at least 94.7%
for all study characteristics (Table 6.2). Cohen’s kappa coefficients ranged from 0.77 to
1.00, indicating substantial to almost perfect inter-rater agreement (Landis and Koch 1977).
Table 6.2 Measures of inter-rater agreement.
Study Characteristic | Crude Agreement | Cohen’s Kappa
Outcome of interest as primary outcome | 36/38 (94.7%) | 0.89
Sample size calculation | 38/38 (100%) | 1.00
Prospective data collection | 37/38 (97.4%) | 0.95
Concurrent controls | 36/38 (94.7%) | 0.87
Matched controls | 37/38 (97.4%) | 0.91
Standardized concurrent therapy | 38/38 (100%) | 1.00
Systematic outcome assessment | 36/38 (94.7%) | 0.77
Blinded outcome assessment | 38/38 (100%) | 1.00
Intention to treat analysis | 37/38 (97.4%) | 0.84
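The two agreement measures reported in Table 6.2 can be illustrated with a minimal sketch. The ratings below are hypothetical (they are not the abstracted study data) and the function names are assumptions introduced for illustration; the sketch only shows how crude agreement and Cohen's kappa relate for a single binary characteristic rated by two reviewers.

```python
from collections import Counter


def crude_agreement(rater_a, rater_b):
    """Proportion of studies on which both reviewers gave the same rating."""
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(rater_a)
    p_observed = crude_agreement(rater_a, rater_b)
    freq_a = Counter(rater_a)
    freq_b = Counter(rater_b)
    categories = set(freq_a) | set(freq_b)
    # Chance agreement: product of the two reviewers' marginal proportions.
    p_expected = sum(freq_a[c] * freq_b[c] for c in categories) / n ** 2
    if p_expected == 1.0:
        return 1.0  # both reviewers used one identical category throughout
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical ratings (1 = characteristic present, 0 = absent) for 38 studies;
# the reviewers disagree on exactly two studies.
reviewer_1 = [1] * 10 + [0] * 28
reviewer_2 = [1] * 9 + [0] + [1] + [0] * 27

agreement = crude_agreement(reviewer_1, reviewer_2)
kappa = cohens_kappa(reviewer_1, reviewer_2)
print(f"crude agreement = {agreement:.3f}, kappa = {kappa:.2f}")
```

With these made-up ratings the crude agreement is 36/38, yet kappa is noticeably lower because a large share of that agreement is expected by chance when one category dominates.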
6.3.3 Statistical analyses
Descriptive statistics were calculated to compare NRS and Strong RCTs in terms of year of
publication, number of participants, academic versus community setting, presence of a
consortium among authors, number of named authors, methodological expertise among
authors, and length of articles. Medians were compared using the Mann-Whitney U test and
categorical variables using the Pearson’s Chi-square or Fisher’s exact test, as appropriate
(Pagano and Gauvreau 2000). A p-value <0.05 was considered significant.
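As a hedged illustration of the categorical comparison described above, the sketch below implements a two-sided Fisher's exact test for a 2x2 table using only the Python standard library. The function name and the two-sided rule (summing all tables with the same margins that are no more probable than the observed one) are assumptions of this sketch, not a description of the R routines actually used; the counts are taken from Table 6.3.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Rows are study designs (NRS, Strong RCTs); columns are characteristic
    present / absent. The p-value sums the hypergeometric probabilities of
    every table with the same margins that is no more likely than the
    observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2

    def prob(x):
        # Probability that the top-left cell equals x given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    tolerance = 1 + 1e-9  # guards against floating-point ties
    return sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * tolerance
    )


# Counts from Table 6.3: academic setting, 96 of 121 NRS vs 4 of 4 Strong RCTs.
p_academic = fisher_exact_two_sided(96, 25, 4, 0)
# Counts from Table 6.3: consortium among authors, 1 of 121 NRS vs 4 of 4.
p_consortium = fisher_exact_two_sided(1, 120, 4, 0)
print(f"academic setting p = {p_academic:.2f}")
print(f"consortium p = {p_consortium:.2e}")
```

Under these assumptions the academic-setting comparison is far from significant, whereas the consortium comparison falls well below the 0.05 threshold.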
Absolute and relative frequencies were calculated for each of the NRS study characteristics.
Random-effects meta-analyses were conducted to pool summary effect estimates across NRS
with and without study characteristics (Altman, Egger, and Smith 2001). Inverse variance
weighting was used to combine studies (Sutton et al. 1998). Between-study variance (tau
squared, τ²) was estimated using the method of DerSimonian and Laird
(DerSimonian and Laird 1986). I² values were calculated to describe the degree of
between-study heterogeneity, with values of 0-40% considered low, 30-60% moderate,
50-90% substantial and 75-100% considerable (Higgins, Green, and Cochrane Collaboration
2011).
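The pooling steps described in this paragraph can be sketched as follows. This is a minimal illustration, assuming study-level effects on an additive scale (log odds ratios or mean differences) with known within-study variances; the function name and the toy inputs are hypothetical, and the sketch is not the code used for the analyses in this chapter.

```python
import math


def dersimonian_laird(effects, variances):
    """Inverse-variance random-effects pooling with the DerSimonian-Laird
    moment estimator of tau squared, plus Cochran's Q and I squared."""
    weights = [1.0 / v for v in variances]
    sum_w = sum(weights)
    # Fixed-effect (inverse-variance) estimate, needed for Cochran's Q.
    fixed = sum(w * y for w, y in zip(weights, effects)) / sum_w
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    c = sum_w - sum(w ** 2 for w in weights) / sum_w
    # Moment estimator of between-study variance, truncated at zero.
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    # Random-effects weights add tau squared to each within-study variance.
    re_weights = [1.0 / (v + tau2) for v in variances]
    pooled = sum(w * y for w, y in zip(re_weights, effects)) / sum(re_weights)
    se = math.sqrt(1.0 / sum(re_weights))
    # I squared: share of total variation attributable to heterogeneity.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return {"pooled": pooled, "se": se, "tau2": tau2, "i2": i2, "q": q}


# Hypothetical log odds ratios and variances for three studies.
toy = dersimonian_laird([-0.60, -0.30, -0.45], [0.04, 0.09, 0.02])
pooled_or = math.exp(toy["pooled"])
lower = math.exp(toy["pooled"] - 1.96 * toy["se"])
upper = math.exp(toy["pooled"] + 1.96 * toy["se"])
print(f"pooled OR = {pooled_or:.2f} ({lower:.2f}, {upper:.2f}), "
      f"tau2 = {toy['tau2']:.3f}, I2 = {toy['i2']:.1f}%")
```

For binary outcomes the pooling is done on the log odds ratio scale and back-transformed, as in the usage lines; for continuous outcomes the mean differences would be passed in directly.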
Mixed-effects meta-regression models were generated to compare pooled effect estimates
across subgroups (Thompson and Higgins 2002). Between-study variance (tau squared, τ²)
was estimated using the restricted maximum likelihood estimator (Viechtbauer 2010). For
binary outcomes (post-operative complications and peri-operative mortality),
meta-regression modeling yielded ratios of odds ratios (RORs) for predictor variables:
\[
\mathrm{ROR} = \frac{\text{pooled odds ratio, group A}}{\text{pooled odds ratio, group B}} \tag{6.4}
\]
As an example, a ROR<1.0 would suggest that the pooled odds ratio in group A is smaller
than the pooled odds ratio for group B. If group A represents studies with retrospective data
collection and group B, studies with prospective data collection, a ROR of 0.80 indicates
that odds ratios were 20% smaller for group A, on average, than in group B. A ROR of 1.20
would suggest the opposite. For continuous outcomes (length of stay and number of lymph
nodes harvested), meta-regression modeling produced differences in mean differences
(DMDs) for predictor variables:
\[
\mathrm{DMD} = \text{combined mean difference, group A} - \text{combined mean difference, group B} \tag{6.5}
\]
For example, consider the outcome length of stay. In each individual study, a mean
difference (MD) was calculated for the length of stay, with negative values indicating that
laparoscopy was associated with a shorter length of stay than open surgery.
\[
\mathrm{MD} = \text{mean}_{\text{laparoscopy}} - \text{mean}_{\text{open surgery}} \tag{6.3}
\]
A DMD of -1.50 would suggest that the MD was 1.5 days more in favor of laparoscopy in
group A than in group B.
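The interpretation of RORs and DMDs can be made concrete with a short sketch. It contrasts two pooled estimates directly, recovering standard errors from the reported 95% confidence intervals and treating the two groups as independent. This is a simplifying assumption: it is not the mixed-effects meta-regression with restricted maximum likelihood used in this chapter, so its results will generally differ somewhat from the modeled RORs and DMDs. The function names are hypothetical.

```python
import math

Z_95 = 1.959963984540054  # standard normal quantile for a 95% interval


def se_from_ci(lower, upper):
    """Standard error implied by a symmetric 95% confidence interval."""
    return (upper - lower) / (2 * Z_95)


def ratio_of_odds_ratios(or_a, ci_a, or_b, ci_b):
    """ROR = OR_A / OR_B, with the contrast computed on the log scale."""
    log_ror = math.log(or_a) - math.log(or_b)
    se_a = se_from_ci(math.log(ci_a[0]), math.log(ci_a[1]))
    se_b = se_from_ci(math.log(ci_b[0]), math.log(ci_b[1]))
    se = math.sqrt(se_a ** 2 + se_b ** 2)  # assumes independent groups
    ci = (math.exp(log_ror - Z_95 * se), math.exp(log_ror + Z_95 * se))
    return math.exp(log_ror), ci


def difference_in_mean_differences(md_a, ci_a, md_b, ci_b):
    """DMD = MD_A - MD_B on the original (e.g. days) scale."""
    dmd = md_a - md_b
    se = math.sqrt(se_from_ci(*ci_a) ** 2 + se_from_ci(*ci_b) ** 2)
    return dmd, (dmd - Z_95 * se, dmd + Z_95 * se)


# Pooled odds ratios for post-operative complications from Table 6.8:
# group A = NRS, group B = Strong RCTs.
ror, ror_ci = ratio_of_odds_ratios(0.63, (0.57, 0.70), 0.96, (0.80, 1.15))
# Hypothetical mean differences in length of stay (days) for two groups.
dmd, dmd_ci = difference_in_mean_differences(
    -3.0, (-3.5, -2.5), -1.5, (-2.0, -1.0)
)
print(f"ROR = {ror:.2f}, DMD = {dmd:.2f}")
```

The first call gives a point estimate of 0.63/0.96, about 0.66, close to but not identical to the ROR of 0.64 reported in Table 6.8; the discrepancy reflects the simplifications noted above. The second call returns a DMD of -1.5, i.e. a mean difference 1.5 days more in favor of laparoscopy in group A.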
Mixed-effects meta-regression modeling was first used to compare effect estimates from
NRS with and without study characteristics. These models generated RORs for binary
outcomes and DMDs for continuous outcomes:
\[
\mathrm{ROR} = \frac{\text{combined effect estimate, NRS \textbf{without} study characteristic}}{\text{combined effect estimate, NRS \textbf{with} study characteristic}} \tag{6.1}
\]
\[
\mathrm{DMD} = \text{combined mean difference, NRS \textbf{without} study characteristic} - \text{combined mean difference, NRS \textbf{with} study characteristic} \tag{6.2}
\]
Subsequently, effect estimates from NRS with or without study characteristics were each
compared with the results of Strong RCTs.
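i) NRS with the study characteristic versus Strong RCTs:
\[
\mathrm{ROR} = \frac{\text{combined effect estimate, NRS \textbf{with} study characteristic}}{\text{combined effect estimate, Strong RCTs}} \tag{6.3}
\]
\[
\mathrm{DMD} = \text{combined mean difference, NRS \textbf{with} study characteristic} - \text{combined mean difference, Strong RCTs} \tag{6.4}
\]
ii) NRS without the study characteristic versus Strong RCTs:
\[
\mathrm{ROR} = \frac{\text{combined effect estimate, NRS \textbf{without} study characteristic}}{\text{combined effect estimate, Strong RCTs}} \tag{6.3}
\]
\[
\mathrm{DMD} = \text{combined mean difference, NRS \textbf{without} study characteristic} - \text{combined mean difference, Strong RCTs} \tag{6.4}
\]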
Effect estimates from Strong RCTs were considered the “gold-standard” summary estimate
when comparing laparoscopy with open colon surgery. All analyses were performed using R,
version 2.15.0 (R Foundation for Statistical Computing, Vienna, Austria). A p-value <0.05
was considered significant.
6.4 Results
6.4.1 Included studies
A total of 121 NRS reported the outcomes of interest (Table 6.3). Four trials were
categorized as Strong RCTs (Nelson et al. 2004; Guillou et al. 2005; Hewett et al. 2008).
Most NRS and all Strong RCTs were conducted in an academic setting. NRS articles were
shorter and authored by fewer investigators. On average, Strong RCTs enrolled more patients
than NRS. All Strong RCTs had at least one author with statistical expertise. Consortiums
were rarely involved in the execution of NRS.
Table 6.3 Characteristics of included studies.
Characteristic | NRS (N=121) | Strong RCTs (N=4) | p-value•
Year of publication* | 2006 (2000-2009) | 2005 (2005-2006) | 0.98†
Participants | 129 (67-265) | 828.5 (743.5-917.8) | <0.001†
Authors | | |
Number* | 5 (4-7) | 8.5 (5.3-10.5) | 0.22†
Consortium among authors§ | 1 (0.8) | 4 (100.0) | <0.001♦
Methodological expertise§ | 15 (12.4) | 4 (100.0) | <0.001♦
Academic setting§ | 96 (79.3) | 4 (100.0) | 0.58♦
Number of pages* | 6 (5-8) | 8.8 (8.8-10.25) | <0.001†
* Median (Interquartile Range, IQR). § Number (percentage). † Medians compared using the Mann-Whitney U test. ♦ Frequencies compared using Fisher’s exact test. • Statistically significant p values (<0.05) indicated in bold.
6.4.2 Subjective outcomes
6.4.2.1 Post-operative complications
Seventy-nine NRS reported the frequency of post-operative complications. These
comparative studies involved a total of 1,086,216 participants (n=281,385 undergoing
laparoscopy and n=804,831 open surgery). There was no difference in the number of patients
assigned to either laparoscopy or open surgery (median=74.0, IQR=33.5-150.0 versus
median=75.0, IQR=34.0-154.5, p-value=0.66).
Table 6.4 summarizes the frequency of study characteristics across these studies. Post-
operative complications was the primary outcome of 15.2% (n=12) of studies. Retrospective
data collection was more common than prospective data collection. Very few studies used
historical controls (n=7, 8.9%) and nearly a third employed matched controls to overcome
selection bias (n=24, 30.4%). Post-operative care was rarely standardized in these studies
(n=4, 5.1%). None of the NRS utilized blinded outcome assessment and a small minority
standardized the assessment of outcomes (n=10, 12.7%).
Table 6.4 Distribution of study attributes among NRS reporting post-operative complications (n=79).
Attribute | Present N (%) | Absent N (%)
Primary outcome | 12 (15.2) | 67 (84.8)
Sample size calculation performed | 3 (3.8) | 76 (96.2)
Prospective data collection | 34 (43.0) | 45 (56.9)
Concurrent controls | 72 (91.1) | 7 (8.9)
Matched controls | 24 (30.4) | 55 (69.6)
Standardized concurrent therapy | 4 (5.1) | 75 (94.9)
Systematic outcome assessment | 10 (12.7) | 69 (87.3)
Blinded outcome assessment | 0 (0.0) | 79 (100.0)
Intention to treat analysis | 68 (86.1) | 11 (13.9)
There were 2^9, or 512, possible combinations of study characteristics across NRS since we
examined nine binary study characteristics. However, half of the NRS adhered to one
of three patterns (Table 6.5). A total of 22.8% of studies had retrospective data collection,
concurrent controls and an intention to treat analysis but lacked sample size calculations,
matched controls, standardized concurrent therapy (i.e. standardized post-operative care),
blinded outcome assessment or systematic outcome assessment. The frequency of post-
operative complications was not the primary outcome of these studies. The second most
common pattern (n=14 studies, 17.8%) differed from the first in that data collection was
prospective. Pattern 3 (n=8 studies, 10.1%) had retrospective data collection and matched
controls but was otherwise identical to Pattern 1.
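The pattern analysis described above amounts to treating each study as a vector of nine binary characteristics and counting how often each vector occurs. The following is a minimal sketch under that reading; the variable names and the four example studies are hypothetical and are not drawn from the abstracted data.

```python
from collections import Counter
from itertools import product

# Short labels (assumed for illustration) for the nine binary characteristics.
CHARACTERISTICS = (
    "primary_outcome",
    "sample_size_calculation",
    "prospective_data",
    "matched_controls",
    "concurrent_controls",
    "standardized_therapy",
    "systematic_assessment",
    "blinded_assessment",
    "intention_to_treat",
)

# Every possible present/absent combination: 2^9 = 512 patterns.
ALL_PATTERNS = list(product((0, 1), repeat=len(CHARACTERISTICS)))


def make_study(*present):
    """Build a study record with the named characteristics set to 1."""
    return {name: int(name in present) for name in CHARACTERISTICS}


def tabulate_patterns(studies):
    """Return (pattern, count, percentage) rows in decreasing frequency."""
    counts = Counter(
        tuple(study[name] for name in CHARACTERISTICS) for study in studies
    )
    total = sum(counts.values())
    return [(p, n, 100.0 * n / total) for p, n in counts.most_common()]


# Hypothetical studies: two retrospective, one prospective, one matched.
example_studies = [
    make_study("concurrent_controls", "intention_to_treat"),
    make_study("concurrent_controls", "intention_to_treat"),
    make_study("prospective_data", "concurrent_controls", "intention_to_treat"),
    make_study("matched_controls", "concurrent_controls", "intention_to_treat"),
]

pattern_table = tabulate_patterns(example_studies)
for pattern, n, pct in pattern_table:
    signs = " ".join("+" if flag else "-" for flag in pattern)
    print(f"{signs}  n={n} ({pct:.1f}%)")
```

Each printed row uses the same +/- notation as Table 6.5, with columns in the order listed in `CHARACTERISTICS`.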
The results of subgroup random-effects meta-analyses are outlined in Table 6.6 and
Figure 6.1. Laparoscopy was associated with fewer post-operative complications than open
surgery for all subgroup analyses, except in instances where a sample size calculation had
been performed (n=3 studies, OR 0.80, 95% CI 0.31, 2.04), historical controls were employed
(n=7 studies, OR 0.77, 95% CI 0.48, 1.25) and outcomes had been assessed according to a
standardized protocol (n=10 studies, OR 0.75, 95% CI 0.57, 0.70). Only two subgroups had I²
values below 40% (i.e. NRS where a sample size calculation was performed and NRS
without concurrent controls).
Mixed-effects meta-regression models were used to compare effect estimates across
subgroups and the results are summarized in Table 6.7. The pooled effect estimates for NRS
without a characteristic were each compared with the pooled effect estimate for NRS with
the study characteristic. The ratios of odds ratios ranged from 0.82 to 1.29 for these
comparisons and none were statistically significant.
Table 6.8 presents selected results from Chapter 5; combined effect estimates are separately
outlined for NRS and Strong RCTs. These estimates were previously compared with one
another; for subjective outcomes (post-operative complications and length of stay),
NRS attributed more benefit to laparoscopy than Strong RCTs. Summary effect estimates
from NRS with and without study characteristics were compared with the results of Strong
RCTs (Table 6.9). An inconsistent pattern emerged where the absence of a NRS study
characteristic was occasionally associated with more extreme benefits for laparoscopy;
Table 6.5 Study characteristics patterns across NRS reporting post-operative complications (n=79 studies).
Pattern* | N (%) | Primary outcome | Sample size calculation | Prospective data collection | Matched controls | Concurrent controls | Standardized concurrent therapy | Systematic outcome assessment | Blinded outcome assessment | Intention to treat analysis
Pattern 1 | 18/79 (22.8%) | - | - | - | - | + | - | - | - | +
Pattern 2 | 14/79 (17.8%) | - | - | + | - | + | - | - | - | +
Pattern 3 | 8/79 (10.1%) | - | - | - | + | + | - | - | - | +
Pattern 4 | 5/79 (6.3%) | - | - | + | + | + | - | - | - | +
Pattern 5 | 4/79 (5.1%) | - | - | + | - | + | - | - | - | -
Pattern 6 | 2/79 (2.5%) | - | + | + | + | + | - | - | - | +
Pattern 7 | 2/79 (2.5%) | + | - | + | - | + | - | + | - | +
Pattern 8 | 2/79 (2.5%) | - | - | + | - | + | - | + | - | +
Pattern 9 | 2/79 (2.5%) | + | - | - | + | + | - | - | - | +
Pattern 10 | 2/79 (2.5%) | + | - | - | - | + | - | + | - | +
* Patterns are listed in order of decreasing frequency. The ten most frequent patterns are described, and represent 74.6% of NRS reporting post-operative complications. + indicates that the characteristic was present; - indicates that it was absent.
Table 6.6 Random-effects meta-analyses results among NRS reporting post-operative complications (n=79).
Attribute | Present: N | Present: OR* (95% CI) | Present: I²♦ (95% CI) | Absent: N | Absent: OR* (95% CI) | Absent: I²♦ (95% CI)
Primary outcome specified | 12 | 0.65 (0.56, 0.75) | 84.7 (74.9-90.7) | 67 | 0.61 (0.52, 0.71) | 78.6 (73.2-82.9)
Sample size calculation performed | 3 | 0.80 (0.31, 2.04) | 0.0 (0.0-87.2) | 76 | 0.63 (0.57, 0.70) | 80.6 (76.2-84.2)
Prospective data collection | 24 | 0.65 (0.55, 0.77) | 53.4 (31.2-68.5) | 45 | 0.62 (0.55, 0.70) | 84.4 (79.9-87.9)
Concurrent controls | 72 | 0.63 (0.57, 0.69) | 81.4 (77.1-84.9) | 7 | 0.77 (0.48, 1.25) | 6.3 (0.0-72.6)
Matched controls | 24 | 0.56 (0.43, 0.75) | 52.0 (23.4-69.9) | 55 | 0.65 (0.58, 0.72) | 83.8 (79.6-87.1)
Standardized concurrent therapy | 4 | 0.56 (0.23, 1.39) | 72.3 (21.6-90.2) | 75 | 0.64 (0.57, 0.70) | 80.2 (75.6-83.9)
Systematic outcome assessment | 10 | 0.75 (0.63, 0.89) | 95.8 (93.8-97.1) | 69 | 0.59 (0.52, 0.67) | 48.3 (31.5-60.9)
Blinded outcome assessment | 0 | - | - | 79 | 0.63 (0.57, 0.70) | 79.9 (75.4-83.6)
Intention to treat analysis | 68 | 0.63 (0.57, 0.70) | 81.8 (77.4-85.3) | 11 | 0.63 (0.41, 0.97) | 48.8 (0.0-74.4)
* Odds Ratio. OR<1 indicates that laparoscopy is associated with fewer post-operative complications as compared with open surgery. ♦ I-squared describes the percentage of total variation across studies that is due to heterogeneity instead of chance.
Table 6.7 Univariable meta-regression results among NRS reporting post-operative complications.
Characteristic | ROR* (95% CI) | p-value
Primary outcome | 1.01 (0.72, 1.41) | 0.97
Sample size calculation | 0.82 (0.28, 2.49) | 0.74
Prospective data collection | 0.90 (0.69, 1.17) | 0.44
Matched controls | 1.09 (0.81, 1.48) | 0.56
Concurrent controls | 1.29 (0.72, 2.29) | 0.39
Standardized concurrent therapy | 1.11 (0.60, 2.08) | 0.74
Systematic outcome assessment | 0.82 (0.60, 1.11) | 0.20
Blinded outcome assessment§ | - | -
Intention to treat analysis | 1.03 (0.68, 1.56) | 0.90
* Ratios of odds ratios. Summary effect estimates from NRS without characteristics were compared with summary effect estimates from NRS with study characteristics. A ROR<1.0 indicates that NRS without a study characteristic yield combined effect estimates that are more extreme than in NRS with the study characteristic.
Figure 6.1 Forest plot of meta-analysis results, stratified according to the presence or absence of specific NRS study characteristics for the outcome post-operative complications. Squares indicate odds ratios and error bars indicate 95% confidence intervals.
Table 6.8 Random-effects meta-analysis results across outcomes of interest and different study designs.
Binary outcomes:
Outcome | NRS: OR (95% CI) | NRS: I²♦ (95% CI) | Strong RCTs: OR (95% CI) | Strong RCTs: I²♦ (95% CI) | NRS compared with Strong RCTs: RORa (95% CI) | p-value
Post-operative complications* | 0.63 (0.57, 0.70) | 77.2 (72.5, 81.1) | 0.96 (0.80, 1.15) | 15.4 (0.0, 87.1) | 0.64 (0.42, 0.97) | 0.04
Peri-operative mortality◊ | 0.59 (0.47, 0.74) | 33.9 (12.7, 50.0) | 0.78 (0.46, 1.32) | 0.0 (0.0, 73.9) | 0.74 (0.38, 1.44) | 0.38

Continuous outcomes:
Outcome | NRS: MD (95% CI) | NRS: I²♦ (95% CI) | Strong RCTs: MD (95% CI) | Strong RCTs: I²♦ (95% CI) | NRS compared with Strong RCTs: DMDb (95% CI) | p-value
Length of stay† | -2.95 (-3.39, -2.50) | 97.3 (97.0, 97.6) | -0.70 (-1.23, -0.17) | 92.3 (83.4, 96.4) | -2.15 (-4.08, -0.21) | 0.03
Number of lymph nodes harvested§ | 0.07 (-0.53, 0.67) | 93.4 (92.2, 94.5) | -0.55 (-1.37, 0.26) | 86.3 (66.7, 94.4) | 0.49 (-1.43, 2.42) | 0.62

* Odds Ratio. OR<1 indicates that laparoscopy is associated with fewer post-operative complications as compared with open surgery. ◊ Odds Ratio. OR<1 indicates that laparoscopy is associated with fewer deaths. † Mean Difference, MD = mean_laparoscopy - mean_open. A MD<0 indicates that laparoscopy is associated with a shorter length of stay. § Mean Difference, MD = mean_laparoscopy - mean_open. A MD<0 indicates that laparoscopy is associated with finding fewer lymph nodes in the surgical specimen. ♦ I-squared describes the percentage of total variation across studies that is due to heterogeneity instead of chance. a Ratios of odds ratios. b Difference in mean differences.
studies where post-operative complications were not the primary outcome, sample size calculations were
absent, data were retrospectively collected, controls were concurrent, systematic outcome
assessment was absent and where intention to treat analysis was performed had statistically
significant ratios of odds ratios < 1.0. Ratios of odds ratios for the remaining
comparisons were also less than 1.0 and thus, the absence of a study characteristic was not
consistently related to more extreme estimates among NRS as compared with Strong RCTs.
Table 6.9 Univariable meta-regression results comparing NRS with or without study characteristics with Strong RCTs.
Characteristic | Characteristic absent: ROR* (95% CI) | p-value§ | Characteristic present: ROR* (95% CI) | p-value§
Primary outcome | 0.64 (0.41, 0.97) | 0.04 | 0.63 (0.39, 1.04) | 0.07
Sample size calculation | 0.63 (0.42, 0.96) | 0.03 | 0.77 (0.24, 2.41) | 0.65
Prospective data collection | 0.61 (0.39, 0.94) | 0.03 | 0.67 (0.43, 1.05) | 0.08
Matched controls | 0.65 (0.42, 0.99) | 0.047 | 0.59 (0.37, 0.95) | 0.03
Concurrent controls | 0.80 (0.41, 1.59) | 0.53 | 0.63 (0.41, 0.96) | 0.03
Standardized concurrent therapy | 0.64 (0.42, 0.97) | 0.04 | 0.57 (0.28, 1.16) | 0.12
Systematic outcome assessment | 0.61 (0.41, 0.91) | 0.02 | 0.75 (0.47, 1.19) | 0.22
Blinded outcome assessment | 0.64 (0.42, 0.96) | 0.03 | - | -
Intention to treat analysis | 0.65 (0.37, 1.13) | 0.12 | 0.63 (0.42, 0.97) | 0.03
Strong RCTs were the reference group (Ref) for every comparison. * Ratios of odds ratios. § Statistically significant p values (<0.05) indicated in bold.
6.4.2.2 Length of stay
Estimates for length of stay were reported in 106 NRS. These NRS involved 917,990
patients (n=57,900 having laparoscopic surgery and n=860,090 open surgery). Laparoscopy
and open groups did not differ in size (median=55.5, IQR=28.0-103.0 versus median=56.5,
IQR=30.2-142.5, p-value=0.56).
Table 6.10 describes the frequency of study characteristics across these NRS. Length of stay
was rarely the primary outcome (n=5, 4.7%). Sample size calculations (n=2,
1.9%), standardized post-operative care (n=9, 8.5%) and systematic outcome assessment
(n=11, 10.4%) were similarly infrequent. No studies employed blinded outcome assessment.
Just over a quarter of studies (n=28, 26.4%) used matching to mitigate the effects of selection
bias. Retrospective data collection was again common.
Table 6.10 Distribution of study attributes among NRS reporting length of stay (n=106).
Attribute | Present N (%) | Absent N (%)
Primary outcome* | 5 (4.7) | 101 (95.3)
Sample size calculation performed | 2 (1.9) | 104 (98.1)
Prospective data collection | 45 (42.5) | 61 (57.5)
Concurrent controls | 94 (88.7) | 12 (11.3)
Matched controls | 28 (26.4) | 78 (73.6)
Standardized concurrent therapy | 9 (8.5) | 97 (91.5)
Systematic outcome assessment | 11 (10.4) | 95 (89.6)
Blinded outcome assessment | 0 (0.0) | 106 (100.0)
Intention to treat analysis | 94 (88.7) | 12 (11.3)
Patterns of study characteristics across NRS were analyzed and the ten most common
patterns are outlined in Table 6.11. Approximately half of these studies adhered to one of
two patterns; 25.5% of studies had retrospective data collection, concurrent controls and
intention to treat analysis but did not feature sample size calculations, matched controls,
standardized concurrent therapy (i.e. standardized post-operative care), blinded outcome
assessment or systematic outcome assessment. Length of stay was not the primary outcome
of these studies. An additional 21.7% were similar except that data had been collected
prospectively. Pattern 3 (n=12 studies, 11.3%) was identical to Pattern 1 except for the use of
matched controls.
Table 6.12 and Figure 6.2 outline the results of subgroup meta-analyses; effect estimates
were combined for NRS when specific study characteristics were present or absent.
Laparoscopy was associated with a shorter length of stay than open surgery for all subgroup
meta-analyses with two exceptions: NRS lacking concurrent controls did not show a benefit
for laparoscopy (MD -2.84, 95% CI -3.65, 1.85) and NRS with a sample size calculation
(n=2 studies, MD -9.56, 95% CI -20.02, 0.90) similarly did not favor laparoscopy. Notably,
only 2 studies with 149 patients contributed to this last subgroup and confidence intervals are
accordingly wide. In one of these studies, the mean length of stay among patients undergoing
open surgery (n=62 patients) was 35.8 days versus 18.7 days for laparoscopy patients (n=45)
(Marubashi et al. 2000).
Mixed-effects meta-regression modeling was used to compare summary effect estimates
across subgroups. Table 6.13 outlines the results of comparing NRS without a characteristic
to NRS with a specific characteristic. The effect estimate from studies lacking a sample size
calculation (n=104) was statistically different from effect estimates from studies with a stated
sample size calculation (n=2). However, this comparison should be interpreted with caution
given only two small studies contributed to the referent group (i.e. NRS with a sample size
calculation) and one of these studies had very long lengths of stay for enrolled patients. None
of the remaining comparisons were statistically significant and thus, effect estimates did not
differ according to the presence or absence of a specific study characteristic.
Table 6.11 Study characteristics patterns across NRS reporting length of stay (n=106 studies).
Pattern* | N (%) | Primary outcome | Sample size calculation | Prospective data collection | Matched controls | Concurrent controls | Standardized concurrent therapy | Systematic outcome assessment | Blinded outcome assessment | Intention to treat analysis
Pattern 1 | 27/106 (25.5%) | - | - | - | - | + | - | - | - | +
Pattern 2 | 23/106 (21.7%) | - | - | + | - | + | - | - | - | +
Pattern 3 | 12/106 (11.3%) | - | - | - | + | + | - | + | - | +
Pattern 4 | 5/106 (4.7%) | - | - | + | - | + | - | - | - | +
Pattern 5 | 5/106 (4.7%) | - | - | - | - | - | - | - | - | +
Pattern 6 | 4/106 (3.8%) | - | - | + | + | + | - | + | - | +
Pattern 7 | 4/106 (3.8%) | - | - | + | - | + | - | - | - | -
Pattern 8 | 3/106 (2.8%) | - | - | - | - | + | - | - | - | +
Pattern 9 | 3/106 (2.8%) | - | - | - | - | + | - | - | - | -
Pattern 10 | 2/106 (1.9%) | - | - | + | - | + | + | - | - | +
* Patterns are listed in order of decreasing frequency. The ten most frequent patterns are described, and represent 83.0% of NRS reporting length of stay. + indicates that the characteristic was present; - indicates that it was absent.
Table 6.12 Random-effects meta-analysis results among NRS reporting length of stay (n=106).
Attribute | Present: N | Present: MD* (95% CI) | Present: I²♦ (95% CI) | Absent: N | Absent: MD* (95% CI) | Absent: I²♦ (95% CI)
Primary outcome specified | 5 | -2.98 (-3.97, -1.99) | 98.7 (98.1-99.1) | 101 | -2.94 (-3.41, -2.48) | 96.9 (96.5-97.2)
Sample size calculation performed | 2 | -9.56 (-20.02, 0.90) | 98.9* | 104 | -2.79 (-3.14, -2.44) | 97.2 (96.9-97.5)
Prospective data collection | 45 | -2.84 (-3.37, -2.31) | 87.4 (84.1-90.1) | 61 | -2.99 (-3.65, -2.32) | 98.3 (98.1-98.5)
Concurrent controls | 94 | -2.94 (-3.43, -2.46) | 97.6 (97.3-97.8) | 12 | -2.84 (-3.83, -1.85) | 80.3 (66.4-88.4)
Matched controls | 28 | -3.10 (-3.78, -2.42) | 97.2 (96.5-97.7) | 78 | -2.90 (-3.45, -2.35) | 97.3 (97.0-97.6)
Standardized concurrent therapy | 9 | -3.47 (-5.22, -1.72) | 93.7 (90.1-96.0) | 97 | -2.89 (-3.33, -2.44) | 97.4 (97.1-97.6)
Systematic outcome assessment | 11 | -2.42 (-2.88, -1.96) | 98.8 (98.5-99.1) | 95 | -3.01 (-3.52, -2.52) | 96.9 (96.6-97.2)
Blinded outcome assessment | 0 | - | - | 106 | -2.95 (-3.39, -2.50) | 97.3 (97.0-97.6)
Intention to treat analysis | 94 | -2.87 (-3.24, -2.49) | 97.5 (97.2-97.7) | 12 | -3.34 (-5.87, -0.80) | 95.1 (93.0-96.6)
* Mean Difference, MD = mean_laparoscopy - mean_open. A MD<0 indicates that laparoscopy is associated with a shorter length of stay. ♦ I-squared describes the percentage of total variation across studies that is due to heterogeneity instead of chance.
Table 6.13 Univariable meta-regression results among NRS reporting length of stay.
Characteristic | DMD* (95% CI) | p-value
Primary outcome | 0.07 (-2.05, 2.18) | 0.95
Sample size calculation | 6.87 (3.89, 9.84) | <0.001
Prospective data collection | -0.09 (-0.99, 0.81) | 0.84
Matched controls | 0.21 (-0.81, 1.24) | 0.69
Concurrent controls | -0.04 (-1.49, 1.42) | 0.96
Standardized concurrent therapy | 0.62 (-0.96, 2.20) | 0.44
Systematic outcome assessment | -0.55 (-1.88, 0.79) | 0.43
Blinded outcome assessment | - | -
Intention to treat analysis | -0.50 (-1.93, 0.94) | 0.50
* Differences in mean differences. A negative DMD indicates that NRS without a study characteristic yielded combined effect estimates that are more extreme than NRS with the study characteristic.
Figure 6.2 Forest plot of meta-analysis results, stratified according to the presence or absence of specific NRS study characteristics for the outcome length of stay. Squares indicate mean differences and error bars indicate 95% confidence intervals.
An inconsistent pattern again emerged when comparing effect estimates from NRS with or
without study characteristics to the gold-standard (i.e. effect estimates from Strong RCTs)
(Table 6.14). Differences in mean differences were statistically significant when NRS lacked
a study characteristic (length of stay was not the primary outcome, prospective data
collection was absent, concurrent therapy was not standardized, systematic outcome
assessment was lacking) as well as when concurrent controls were used. For the
characteristics sample size calculation, matched controls and intention to treat analysis,
DMDs were statistically significant whether the study characteristic was present or absent.
The negative DMDs observed indicate that NRS generally attributed shorter lengths of stay
to laparoscopy than Strong RCTs; however, there was no clear pattern of more extreme
estimates of benefit attributable to the absence of NRS study characteristics.
6.4.3 Objective outcomes
6.4.3.1 Peri-operative mortality
Seventy-nine NRS examined the association between surgical approach (laparoscopic or
open colon surgery) and mortality. A total of 1,078,369 patients were included in these NRS
(n=52,485 had laparoscopy and n=1,025,884 open surgery). On average, laparoscopy and
open surgery groups were equal in size (median=61.0, IQR=36.0-138.5 versus median=61.0,
IQR 34.0-147.5, p-value=0.64).
Peri-operative mortality was the primary outcome of 11.4% of studies (Table 6.15). Again,
retrospective data collection was common (n=43, 54.4%) as was the use of concurrent
controls (n=71, 89.9%). Post-operative care was rarely standardized (n=8, 10.1%) and outcome
collection was similarly not standardized in most studies (n=70, 88.6%). Matched controls
were used in 29.1% (n=23 studies).
Table 6.14 Univariable meta-regression results comparing NRS with or without study characteristics with Strong RCTs.
Characteristic | Characteristic absent: DMD* (95% CI) | p-value§ | Characteristic present: DMD* (95% CI) | p-value§
Primary outcome | -2.15 (-4.20, -0.10) | 0.04 | -2.22 (-5.06, 0.63) | 0.13
Sample size calculation | -2.02 (-3.82, -0.21) | 0.03 | -8.89 (-12.28, -5.50) | <0.001
Prospective data collection | -2.19 (-4.28, -0.11) | 0.04 | -2.10 (-4.21, 0.01) | 0.05
Matched controls | -2.10 (-4.16, -0.04) | 0.046 | -2.31 (-4.49, -0.13) | 0.04
Concurrent controls | -2.18 (-4.60, 0.24) | 0.08 | -2.15 (-4.20, -0.10) | 0.04
Standardized concurrent therapy | -2.10 (-4.14, -0.06) | 0.04 | -2.73 (-5.21, -0.25) | 0.03
Systematic outcome assessment | -2.22 (-4.27, -0.18) | 0.03 | -1.68 (-4.02, 0.66) | 0.16
Blinded outcome assessment | -2.15 (-4.19, -0.12) | 0.04 | - | -
Intention to treat analysis | -2.60 (-5.00, -0.20) | 0.04 | -2.10 (-4.15, -0.05) | 0.04
Strong RCTs were the reference group (Ref) for every comparison. * Differences in mean differences. § Statistically significant p values (<0.05) indicated in bold.
Table 6.15 Distribution of study attributes among NRS reporting peri-operative mortality (n=79).
Attribute                            Present N (%)    Absent N (%)
Primary outcome*                     9 (11.4)         70 (88.6)
Sample size calculation performed    1 (1.3)          78 (98.7)
Prospective data collection          36 (45.6)        43 (54.4)
Concurrent controls                  71 (89.9)        8 (10.1)
Matched controls                     23 (29.1)        56 (70.9)
Standardized concurrent therapy      8 (10.1)         71 (89.9)
Systematic outcome assessment        9 (11.4)         70 (88.6)
Blinded outcome assessment           0 (0.0)          79 (100.0)
Intention to treat analysis          70 (88.6)        9 (11.4)
Table 6.16 outlines the most commonly observed patterns of study characteristics across
NRS reporting peri-operative mortality. Half of these studies adhered to either Pattern 1, 2 or
3. A total of 24.1% of studies had retrospective data collection, concurrent controls and
intention to treat analysis but did not match controls, standardize concurrent therapy (i.e.
post-operative care), blind outcome assessors, have systematic outcome assessment or
sample size calculations. Another 20.3% of studies had prospective data collection and were
otherwise identical to Pattern 1. Patterns 1 & 3 only differed in that the latter (n=7 studies,
8.9%) employed matched controls.
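The tabulation behind a table such as Table 6.16 can be sketched in a few lines. This is a minimal illustration only: the field names and the toy studies below are hypothetical and are not drawn from the thesis dataset. Each study is reduced to a string of "+" and "-" symbols, one per characteristic, and identical strings are counted.

```python
from collections import Counter

# Hypothetical field names, listed in the column order of Table 6.16.
CHARACTERISTICS = (
    "primary_outcome",
    "sample_size_calculation",
    "prospective_data_collection",
    "matched_controls",
    "concurrent_controls",
    "standardized_concurrent_therapy",
    "systematic_outcome_assessment",
    "blinded_outcome_assessment",
    "intention_to_treat",
)


def pattern_of(study):
    """Encode one study as a '+'/'-' string; a missing key counts as absent."""
    return "".join("+" if study.get(name, False) else "-" for name in CHARACTERISTICS)


def tabulate_patterns(studies):
    """Return (pattern, count, percent) tuples in order of decreasing frequency."""
    counts = Counter(pattern_of(study) for study in studies)
    total = len(studies)
    return [
        (pattern, n, round(100 * n / total, 1))
        for pattern, n in counts.most_common()
    ]


# Toy input (hypothetical): three retrospective studies and two prospective
# studies, all with concurrent controls and intention to treat analysis.
toy_studies = (
    [{"concurrent_controls": True, "intention_to_treat": True}] * 3
    + [{"prospective_data_collection": True,
        "concurrent_controls": True,
        "intention_to_treat": True}] * 2
)
for pattern, n, percent in tabulate_patterns(toy_studies):
    print(pattern, n, percent)
```

Rounding each percentage from the raw count, as done here, avoids small discrepancies between the text and the table.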
Subgroup meta-analyses were performed and studies were divided according to the presence
or absence of NRS study characteristics (Table 6.17 and Figure 6.3). Laparoscopy was
associated with lower peri-operative mortality in most subgroups except when a sample size
calculation had been performed, concurrent therapy standardized, historical controls used and
when intention to treat analysis was absent.
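For readers who want to see how a pooled estimate and an I2 value of the kind reported in Table 6.17 arise within one stratum, the following is a minimal sketch. It assumes the DerSimonian-Laird estimator of between-study variance, which this section does not name, so the estimator choice is an assumption. Inputs are study-level effects on an additive scale (for odds ratios, log odds ratios) with their variances; the numbers in the usage lines are hypothetical.

```python
import math


def dersimonian_laird(effects, variances):
    """Random-effects pooling (DerSimonian-Laird, assumed estimator).

    effects   -- study-level estimates on an additive scale (e.g. log OR, MD)
    variances -- their within-study variances
    Returns (pooled_estimate, standard_error, tau_squared, i_squared_percent).
    """
    weights = [1.0 / v for v in variances]
    sum_w = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, effects)) / sum_w

    # Cochran's Q and its degrees of freedom.
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1

    # Method-of-moments estimate of the between-study variance, floored at 0.
    c = sum_w - sum(w ** 2 for w in weights) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    re_weights = [1.0 / (v + tau2) for v in variances]
    pooled = sum(w * y for w, y in zip(re_weights, effects)) / sum(re_weights)
    se = math.sqrt(1.0 / sum(re_weights))

    # I-squared: share of total variation attributed to heterogeneity.
    i2 = max(0.0, 100.0 * (q - df) / q) if q > 0 else 0.0
    return pooled, se, tau2, i2


# Hypothetical log odds ratios and variances for one stratum.
pooled, se, tau2, i2 = dersimonian_laird([-0.7, -0.3, -0.5], [0.04, 0.09, 0.06])
print(math.exp(pooled), tau2, i2)
```

Running the same function separately on the studies with and without a characteristic gives the two columns of a stratified table.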
Table 6.16 Study characteristics patterns across NRS reporting peri-operative mortality (n=79 studies).
Column key: PO = Primary outcome; SSC = Sample size calculation; PDC = Prospective data collection; MC = Matched controls; CC = Concurrent controls; SCT = Standardized concurrent therapy; SOA = Systematic outcome assessment; BOA = Blinded outcome assessment; ITT = Intention to treat analysis.
Pattern*      N (%)            PO   SSC  PDC  MC   CC   SCT  SOA  BOA  ITT
Pattern 1     19/79 (24.1%)    -    -    -    -    +    -    -    -    +
Pattern 2     16/79 (20.3%)    -    -    +    -    +    -    -    -    +
Pattern 3     7/79 (8.9%)      -    -    -    +    +    -    -    -    +
Pattern 4     4/79 (5.1%)      -    -    +    +    +    -    -    -    +
Pattern 5     3/79 (3.8%)      -    -    +    -    +    -    -    -    +
Pattern 6     3/79 (3.8%)      -    -    +    -    +    -    -    -    +
Pattern 7     2/79 (2.5%)      -    -    +    +    +    +    +    -    +
Pattern 8     2/79 (2.5%)      -    -    +    -    +    +    +    -    +
Pattern 9     2/79 (2.5%)      -    -    -    +    -    -    -    -    +
Pattern 10    2/79 (2.5%)      -    -    -    -    +    -    +    -    +
*Patterns are listed in order of decreasing frequency. The ten most frequent patterns are described, and represent 88.4% of NRS reporting peri-operative mortality.

Table 6.17 Random-effects meta-analysis results among NRS reporting peri-operative mortality (n=79).
Attribute                            N     OR* (95% CI)          I2♦ (95% CI)
Primary outcome
  Present                            9     0.36 (0.32, 0.42)     0.0 (0.0-57.4)
  Absent                             70    0.74 (0.67, 0.83)     0.0 (0.0-2.8)
Sample size calculation performed
  Present                            1     2.16 (0.09, 55.08)    -
  Absent                             78    0.59 (0.47, 0.73)     34.4 (13.3-50.4)
Prospective data collection
  Present                            36    0.78 (0.70, 0.88)     0.0 (0.0-13.5)
  Absent                             43    0.39 (0.34, 0.45)     0.0 (0.0-14.8)
Concurrent controls
  Present                            71    0.59 (0.46, 0.74)     39.0 (18.7-54.3)
  Absent                             8     0.70 (0.24, 2.10)     0.0 (0.0-27.2)
Matched controls
  Present                            23    0.61 (0.43, 0.85)     0.0 (0.0-0.0)
  Absent                             56    0.58 (0.43, 0.76)     49.5 (31.2-63.0)
Standardized concurrent therapy
  Present                            8     0.47 (0.19, 1.15)     0.0 (0.0-55.4)
  Absent                             71    0.60 (0.47, 0.76)     37.9 (17.1-53.5)
Systematic outcome assessment
  Present                            9     0.51 (0.35, 0.76)     57.7 (11.4-79.8)
  Absent                             70    0.76 (0.68, 0.85)     0.0 (0.0-0.0)
Blinded outcome assessment
  Present                            0     -                     -
  Absent                             79    0.59 (0.47, 0.74)     33.9 (12.7-50.0)
Intention to treat analysis
  Present                            70    0.58 (0.45, 0.74)     39.6 (19.3-54.8)
  Absent                             9     0.80 (0.35, 1.79)     0.0 (0.0-11.7)
* Odds Ratio. OR<1 indicates that laparoscopy is associated with fewer deaths.
♦ I-squared describes the percentage of total variation across studies that is due to heterogeneity instead of chance.

Table 6.18 Univariable meta-regression results among NRS reporting peri-operative mortality.
Characteristic                       ROR* (95% CI)         p-value§
Primary Outcome                      1.68 (1.11, 2.52)     0.01
Sample size calculation              0.27 (0.01, 7.30)     0.44
Prospective data collection          0.62 (0.44, 0.87)     0.01
Matched controls                     0.85 (0.50, 1.43)     0.54
Concurrent controls                  1.22 (0.39, 3.81)     0.73
Standardized concurrent therapy      1.23 (0.46, 3.25)     0.68
Systematic outcome assessment        1.33 (0.89, 1.99)     0.17
Blinded Outcome Assessment           -                     -
Intention to treat analysis          1.41 (0.59, 3.38)     0.44
* Ratios of odds ratios. Summary effect estimates from NRS without characteristics were compared with summary effect estimates from NRS with study characteristics. A ROR<1.0 indicates that NRS without a study characteristic yield combined effect estimates that are more extreme than in NRS with the study characteristic.
§ Statistically significant p values (<0.05) indicated in bold.

Figure 6.3 Forest plot of meta-analysis results, stratified according to the presence or absence of specific NRS study characteristics for the outcome peri-operative mortality. Squares indicate odds ratios and error bars indicate 95% confidence intervals. The dashed black line indicates that the confidence interval extends beyond the plot area.

Table 6.18 outlines the results of mixed-effects meta-regression modeling which allowed for
the comparison of effect estimates across NRS subgroups. NRS with retrospective data
collection attributed more benefit to laparoscopy than NRS with prospective data collection
(ROR 0.62, 95% CI 0.44, 0.87, p-value=0.01). NRS whose primary outcome had been
peri-operative mortality (n=9 studies) had effect estimates closer to the null, and thus were less in
favour of laparoscopy than NRS where peri-operative mortality was a secondary outcome
(ROR 1.68; 95% CI 1.11, 2.52). The absence of the remaining study characteristics was not
associated with more extreme estimates of benefit for laparoscopy.
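To make the ratio of odds ratios (ROR) concrete, the sketch below compares two pooled odds ratios directly on the log scale. It is a simplified Wald-type comparison that treats the two subgroups as independent and recovers standard errors from the reported 95% confidence limits. It is not the mixed-effects meta-regression used in this chapter and should not be expected to reproduce Table 6.18. All function names and the input values are hypothetical.

```python
import math

Z95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def se_from_ci(lower, upper):
    """Standard error of a log odds ratio recovered from its 95% CI limits."""
    return (math.log(upper) - math.log(lower)) / (2 * Z95)


def ratio_of_odds_ratios(or_without, ci_without, or_with, ci_with):
    """ROR = OR(without characteristic) / OR(with characteristic).

    Assumes the two pooled odds ratios are independent (an assumption of
    this sketch). Returns (ror, lower_95, upper_95).
    """
    log_ror = math.log(or_without) - math.log(or_with)
    se = math.sqrt(se_from_ci(*ci_without) ** 2 + se_from_ci(*ci_with) ** 2)
    return (
        math.exp(log_ror),
        math.exp(log_ror - Z95 * se),
        math.exp(log_ror + Z95 * se),
    )


def percent_smaller(ror):
    """Express a ROR below 1 as 'odds ratios were X% smaller'."""
    return (1.0 - ror) * 100.0


# Hypothetical pooled odds ratios for two subgroups.
ror, lower, upper = ratio_of_odds_ratios(0.5, (0.25, 1.0), 1.0, (0.5, 2.0))
print(ror, lower, upper)
print(percent_smaller(0.64))  # a ROR of 0.64 reads as "36% smaller"
```

Under the convention of Table 6.18, a ROR below 1.0 means the subgroup without the characteristic produced the more extreme pooled estimate.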
Table 6.19 outlines the results of comparing effect estimates from NRS with or without a
study characteristic with the results of Strong RCTs. The presence or absence of NRS study
characteristics was not associated with more extreme estimates of benefit for laparoscopy.
6.4.3.2 Number of lymph nodes harvested
Fifty-nine NRS reported the number of lymph nodes harvested. These studies involved
252,482 participants (n=15,302 underwent laparoscopy and n= 237,180 open surgery).
Laparoscopy and open surgery groups had roughly an equal number of patients (median 50,
IQR 27.0-94.5 versus median 55, IQR 30.5-132.5, p-value=0.46).
Number of lymph nodes harvested was not the primary outcome of any NRS (Table
6.20). A minority of studies employed blinding of pathologists (n=3, 5.1%). Only one study
mentioned standardizing the processing and assessment of surgical specimens (1.7%) and
thus was considered to have systematic outcome assessment. Data collection was more
often retrospective than prospective and concurrent controls were more common than
historical ones. Matched controls were used in 23.7% of studies.
Table 6.19 Univariable meta-regression results comparing NRS with or without study characteristics with Strong RCTs.
Characteristic                      ROR* (95% CI)          p-value
Primary Outcome
  Characteristic Absent             0.91 (0.42, 2.00)      0.15
  Characteristic Present            0.54 (0.24, 1.24)      0.82
  Strong RCTs                       Referent               -
Sample size calculation
  Characteristic Absent             0.79 (0.35, 1.80)      0.58
  Characteristic Present            2.96 (0.10, 87.60)     0.53
  Strong RCTs                       Referent               -
Prospective data collection
  Characteristic Absent             0.63 (0.29, 1.37)      0.24
  Characteristic Present            1.02 (0.47, 2.23)      0.96
  Strong RCTs                       Referent               -
Matched controls
  Characteristic Absent             0.77 (0.34, 1.77)      0.54
  Characteristic Present            0.91 (0.36, 2.27)      0.83
  Strong RCTs                       Referent               -
Concurrent controls
  Characteristic Absent             0.97 (0.35, 3.81)      0.97
  Characteristic Present            0.79 (0.35, 1.80)      0.58
  Strong RCTs                       Referent               -
Standardized concurrent therapy
  Characteristic Absent             0.81 (0.36, 1.84)      0.61
  Characteristic Present            0.66 (0.19, 2.26)      0.51
  Strong RCTs                       Referent               -
Systematic outcome assessment
  Characteristic Absent             0.89 (0.39, 2.00)      0.78
  Characteristic Present            0.67 (0.29, 1.52)      0.33
  Strong RCTs                       Referent               -
Blinded Outcome Assessment
  Characteristic Absent             0.74 (0.38, 1.44)      0.38
  Strong RCTs                       Referent               -
Intention to treat analysis
  Characteristic Absent             1.10 (0.35, 3.51)      0.87
  Characteristic Present            0.78 (0.34, 1.77)      0.56
  Strong RCTs                       Referent               -
* Ratios of odds ratios.
§ Statistically significant p values (<0.05) indicated in bold.
Table 6.20 Distribution of study attributes among NRS reporting number of lymph nodes harvested (n=59).
Attribute                            Present N (%)    Absent N (%)
Primary outcome*                     0 (0.0)          59 (100.0)
Sample size calculation performed    0 (0.0)          59 (100.0)
Prospective data collection          25 (42.4)        34 (57.6)
Concurrent controls                  51 (86.4)        8 (13.6)
Matched controls                     14 (23.7)        45 (76.3)
Standardized concurrent therapy      5 (8.5)          54 (91.5)
Systematic outcome assessment        1 (1.7)          58 (98.3)
Blinded outcome assessment           3 (5.1)          56 (94.9)
Intention to treat analysis          53 (89.8)        6 (10.2)
Table 6.21 describes the most common patterns of study characteristics across NRS reporting
number of lymph nodes harvested. As with previous outcomes, most studies adhered to one
of three patterns. A total of 28.8% of studies (Pattern 1, n=17) had retrospective data
collection, concurrent controls and intention to treat analysis but had not employed matched
controls, standardized concurrent therapy (i.e. post-operative care), used blinded outcome
assessors or systematic outcome assessment. Pattern 2 (n=16 studies, 27.1%) differed only with
regards to the use of prospective data collection. Pattern 3 (n=5 studies, 8.5%) instead
featured the use of matched controls but was otherwise identical to Pattern 1.
Subgroup meta-analyses were performed with NRS stratified by the presence or absence of
study characteristics (Table 6.22 and Figure 6.4). Mean differences ranged from -0.65 to 1.40
and none were statistically significant, with the exception of the single study reporting
systematic outcome assessment (MD 1.40, 95% CI 0.45, 2.35). Studies in most strata were highly
heterogeneous with I2 values greater than 50% for 13 of 15 strata. Table 6.23 outlines the
results of meta-regression comparing effect estimates between NRS with and without each study
characteristic. The presence or absence of NRS study characteristics was not associated with
more extreme estimates of benefit for laparoscopy.
Mixed-effects meta-regression modeling was used to compare the results of NRS with or
without study characteristics with the summary effect estimate from Strong RCTs (Table
6.24). As with the previous three outcomes of interest, no clear pattern emerged. Differences
Table 6.21 Study characteristics patterns across NRS reporting number of lymph nodes harvested (n=59 studies).
Column key: PO = Primary outcome; SSC = Sample size calculation; PDC = Prospective data collection; MC = Matched controls; CC = Concurrent controls; SCT = Standardized concurrent therapy; SOA = Systematic outcome assessment; BOA = Blinded outcome assessment; ITT = Intention to treat analysis.
Pattern*      N (%)            PO   SSC  PDC  MC   CC   SCT  SOA  BOA  ITT
Pattern 1     17/59 (28.8%)    -    -    -    -    +    -    -    -    +
Pattern 2     16/59 (27.1%)    -    -    +    -    +    -    -    -    +
Pattern 3     5/59 (8.5%)      -    -    -    +    +    -    -    -    +
Pattern 4     3/59 (5.1%)      -    -    +    +    +    -    -    -    +
Pattern 5     3/59 (5.1%)      -    -    -    -    -    -    -    -    -
Pattern 6     2/59 (3.4%)      -    -    +    -    +    -    -    -    +
Pattern 7     2/59 (3.4%)      -    -    -    +    +    +    -    -    +
Pattern 8     2/59 (3.4%)      -    -    -    -    -    -    -    -    +
Pattern 9     1/59 (1.7%)      -    -    +    +    +    +    -    -    +
Pattern 10    1/59 (1.7%)      -    -    +    +    -    -    -    -    +
*Patterns are listed in order of decreasing frequency. The ten most frequent patterns are described, and represent 74.6% of NRS reporting number of lymph nodes harvested.
Table 6.22 Random-effects meta-analysis results among NRS reporting number of lymph nodes (n=59).
Attribute                            N     MD* (95% CI)            I2♦ (95% CI)
Primary outcome
  Present                            0     -                       -
  Absent                             59    0.07 (-0.53, 0.67)      93.4 (92.2-94.5)
Sample size calculation performed
  Present                            0     -                       -
  Absent                             59    0.07 (-0.53, 0.67)      93.4 (92.2-94.5)
Prospective data collection
  Present                            25    -0.11 (-0.92, 0.69)     90.5 (87.3-93.0)
  Absent                             34    0.25 (-0.63, 1.14)      94.7 (93.4-95.7)
Concurrent controls
  Present                            51    -0.10 (-0.72, 0.53)     93.6 (92.4-94.7)
  Absent                             8     1.26 (-0.27, 2.80)      67.3 (31.0-84.5)
Matched controls
  Present                            14    -0.15 (-1.08, 0.78)     49.4 (6.3-72.6)
  Absent                             45    0.18 (-0.53, 0.89)      94.6 (93.5-95.5)
Standardized concurrent therapy
  Present                            5     0.26 (-1.17, 1.69)      61.2 (0.0-85.4)
  Absent                             54    0.07 (-0.56, 0.71)      93.6 (92.4-94.6)
Systematic outcome assessment†
  Present                            1     1.40 (0.45, 2.35)       -
  Absent                             58    0.04 (-0.57, 0.65)      93.4 (92.1-94.4)
Blinded outcome assessment
  Present                            3     -0.65 (-3.54, 2.24)     98.5 (97.3-99.1)
  Absent                             56    0.14 (-0.46, 0.73)      87.6 (84.7-90.0)
Intention to treat analysis
  Present                            53    0.04 (-0.62, 0.71)      94.1 (92.9-95.0)
  Absent                             6     0.22 (-0.72, 1.16)      0.0 (0.0-55.5)
* Mean Difference, MD = mean(laparoscopy) - mean(open). A MD<0 indicates that laparoscopy is associated with finding fewer lymph nodes in the surgical specimen.
♦ I-squared describes the percentage of total variation across studies that is due to heterogeneity instead of chance.
† Since one NRS reported systematic outcome assessment, there is no measure of between-study heterogeneity.
Table 6.23 Univariable meta-regression results among NRS reporting number of lymph nodes harvested.
Characteristic                       DMD* (95% CI)           p-value
Primary Outcome                      -                       -
Sample size calculation              -                       -
Prospective data collection          0.37 (-0.85, 1.60)      0.55
Matched controls                     0.50 (-0.99, 1.99)      0.51
Concurrent controls                  1.31 (-0.47, 3.08)      0.15
Standardized concurrent therapy      -0.02 (-2.22, 2.19)     0.99
Systematic outcome assessment        -1.36 (-5.31, 2.59)     0.50
Blinded Outcome Assessment           0.84 (-1.54, 3.23)      0.49
Intention to treat analysis          0.30 (-1.70, 2.30)      0.77
* Differences in mean differences. A negative DMD indicates that NRS without a study characteristic yielded combined effect estimates that are more extreme than NRS with the study characteristic.
Figure 6.4 Forest plot of meta-analysis results, stratified according to the presence or absence of specific NRS study characteristics for the outcome number of lymph nodes harvested. Squares indicate mean differences and error bars indicate 95% confidence intervals.
Table 6.24 Univariable meta-regression results comparing NRS with or without study characteristics with Strong RCTs.
Characteristic                      DMD* (95% CI)           p-value
Primary Outcome
  Characteristic Absent             0.49 (-1.52, 2.50)      0.63
  Strong RCTs                       Referent                -
Sample size calculation
  Characteristic Absent             0.49 (-1.52, 2.50)      0.63
  Strong RCTs                       Referent                -
Prospective data collection
  Characteristic Absent             0.50 (-1.62, 2.62)      0.64
  Characteristic Present            0.14 (-2.02, 2.28)      0.90
  Strong RCTs                       Referent                -
Matched controls
  Characteristic Absent             0.44 (-1.64, 2.51)      0.68
  Characteristic Present            -0.05 (-2.40, 2.29)     0.96
  Strong RCTs                       Referent                -
Concurrent controls
  Characteristic Absent             1.49 (-0.99, 3.97)      0.24
  Characteristic Present            0.17 (-1.83, 2.15)      0.87
  Strong RCTs                       Referent                -
Standardized concurrent therapy
  Characteristic Absent             0.33 (-1.73, 2.39)      0.75
  Characteristic Present            0.36 (-2.48, 3.19)      0.81
  Strong RCTs                       Referent                -
Systematic outcome assessment
  Characteristic Absent             0.30 (-1.75, 2.34)      0.78
  Characteristic Present            1.66 (-2.57, 5.89)      0.44
  Strong RCTs                       Referent                -
Blinded Outcome Assessment
  Characteristic Absent             0.39 (-1.65, 2.44)      0.71
  Characteristic Present            -0.47 (-3.42, 2.48)     0.76
  Strong RCTs                       Referent                -
Intention to treat analysis
  Characteristic Absent             0.60 (-2.09, 3.29)      0.66
  Characteristic Present            0.30 (-1.76, 2.36)      0.77
  Strong RCTs                       Referent                -
* Differences in mean differences.
§ Statistically significant p values (<0.05) indicated in bold.
in mean differences were not statistically significant when comparing effect estimates from
NRS without study characteristics and Strong RCTs.
6.5 Discussion
The objective of this study was to explore the relationship between study characteristics and
effect estimates in NRS comparing laparoscopy with open surgery for the treatment of colon
cancer. Our overarching aim was to identify specific study characteristics associated with
more extreme estimates of benefit for laparoscopy as compared with Strong RCTs. The
relative frequency of NRS study characteristics is largely unknown and this study sheds light
on how NRS in surgery have been performed; sample size calculations, historical controls,
standardized post-operative care (i.e. standardized concurrent therapy), blinded or systematic
outcome assessment were all rare among NRS. Retrospective data collection was common
and matching was used in approximately a quarter of NRS to overcome selection bias. The
frequency of these characteristics may vary among NRS in other areas of medicine but this
study nonetheless provides some insight into how NRS have been conducted in surgery.
We used mixed-effects meta-regression modeling to evaluate how effect estimates from NRS
with or without certain study characteristics compare. For the outcomes post-operative
complications, length of stay and number of lymph nodes harvested, none of the effect
estimates differed across NRS subgroups. For the outcome peri-operative mortality, NRS
with retrospective data collection had more extreme estimates of benefit for laparoscopy than
NRS with prospective data collection (ROR 0.62, 95% CI 0.44, 0.87, p-value=0.01). In
addition, effect estimates were closer to the null (i.e. less in favour of laparoscopy) in NRS
where the primary outcome was peri-operative mortality as opposed to NRS in which post-
operative death was a secondary outcome (ROR 1.68, 95% CI 1.11, 2.52). However, when
the effect estimates from these subgroups were compared with the results of Strong RCTs,
none proved to be statistically significant. Indeed, across all four outcomes of interest, we did
not observe a consistent pattern of more extreme benefit for laparoscopy with the absence of
a particular NRS study characteristic.
Many of the NRS study characteristics examined appear frequently in popular NRS quality
assessment tools. For example, the Downs and Black checklist assesses whether sample size
calculations were performed, concurrent controls used, and if outcome assessors were
blinded, among other criteria (Downs and Black 1998). The Newcastle-Ottawa scale also
includes a consideration of blinded outcome assessment (GA Wells). Furthermore, Wells et
al. have recently described a checklist that evaluates NRS with regards to the use of matched
controls (Wells et al. 2013). Our results suggest that these study characteristics may not help
distinguish between NRS at low or high risk of bias. Indeed, the absence of sample size
calculations, concurrent controls, blinded outcome assessment and matched controls was not
associated with more extreme estimates of benefit. NRS with or without these study
characteristics yielded summary effect measures that did not statistically differ from those of
Strong RCTs. Moreover, since many of the aforementioned characteristics were rarely
present, they are unlikely to be helpful in discriminating NRS at high and low risk of bias.
While we did not identify a relationship between NRS study characteristics and effect
estimates in this study, it is possible one might exist. There are a number of reasons why a
Type II error may have occurred. First, we relied on reported study methods to determine the
presence or absence of study characteristics. For example, even though many articles did not
mention systematic outcome assessment, it is possible some investigators had standardized
the collection of outcome data. Indeed, it has been shown that there can be significant
discrepancies between reported study methods and actual study conduct for RCTs (Mhaskar
et al. 2012). Mhaskar et al. found that even though 39% of RCT protocols specified adequate
methods for randomization, only 23% of articles included this information. A similar
discrepancy between published reports and NRS methods may exist. Our study was also
limited by the relative infrequency of many study characteristics. NRS with sample size
calculations, historical controls, standardized post-operative care, systematic outcome
assessment and blinded outcome assessment were relatively rare. Moreover, we had limited
power to detect important effects as we used only 121 NRS for the current analysis. The
strengths of this study include the use of mixed-effects meta-regression modeling to
determine the comparability of effect estimates across subgroups. Furthermore, our analyses
used summary effect estimates from Strong RCTs as the “gold-standard” for comparisons;
we chose to handle the known variability in surgical trial quality by pooling effect estimates
from RCTs at the lowest risk of bias.
Multiple meta-epidemiological studies have established that RCTs with inadequate random-
sequence generation, allocation concealment, and double-blinding yield biased estimates of
treatment effect (Schulz et al. 1995; Moher et al. 1998; Kjaergard, Villumsen, and Gluud
2001; Balk et al. 2002; Als-Nielsen et al. 2003; Tierney and Stewart 2005; Pildal et al. 2007;
Nuesch, Reichenbach, et al. 2009; Nuesch, Trelle, et al. 2009; Nuesch et al. 2010; Bassler et
al. 2010; Dechartres et al. 2011; Bafeta et al. 2012; Hrobjartsson et al. 2012, 2013; Wood et
al. 2008; Savovic et al. 2012). The analyses in these studies routinely involved >300 trials,
across multiple interventions. Perhaps a similarly large cohort of NRS, examining multiple
interventions, is required to definitively identify which aspects of NRS-design are associated
with biased treatment effects.
6.6 Conclusion
Effect estimates did not consistently vary according to the presence or absence of NRS
design characteristics among studies comparing laparoscopy and open surgery for the
treatment of colon cancer. Additional studies are necessary to identify the attributes of NRS-
design associated with bias.
Chapter 7 General Discussion and Future Directions
7.1 Summary of findings
This thesis focused on examining bias in RCTs and NRS of surgical interventions. NRS
remain far more common in surgery than RCTs even though the latter are considered the
most reliable source of evidence when evaluating therapeutic interventions (Sackett and
Sackett 1991). There are a number of reasons why surgical RCTs remain rare. First, surgical
devices and interventions do not require regulatory approval in the same manner as novel
drugs (McLeod 1999). Accordingly, there is far less funding available to conduct surgical
RCTs. Second, recruitment to surgical trials is hindered by the uncertainty associated with
randomization — such uncertainty affects both patients and physicians (Mills et al. 2003;
Campbell et al. 2010). Third, investigators must also overcome the logistical challenges of
standardizing surgical technique (Meakins 2002). For these reasons, NRS heavily inform the
evidence base in surgery and are likely to continue to do so. However, an important question
remained unanswered: do surgical NRS and RCTs yield similar estimates of treatment
effect? This question was the starting point for this thesis work.
Before we could study bias in NRS, a conceptual framework of this phenomenon was
required. Because no existing framework was identified to guide our analyses, we proceeded
to develop a novel one. In Chapter 3, we conducted a modified framework synthesis to
develop such a framework. Sources of bias were identified from systematic reviews of NRS
quality assessment tools and analyzed thematically. This process yielded a hierarchical
framework with six overarching domains (selection bias, information bias, performance bias,
detection bias, attrition bias, and selective reporting bias), with 37 individual sources of bias
nested under these domains.
In Chapter 5, we compared effect estimates from NRS with those from i) All RCTs;
ii) Typical RCTs (i.e. at unclear or high risk of bias) and iii) Strong RCTs (i.e. low risk of
bias trials). Studies evaluating laparoscopy and conventional (open) surgery for the treatment
of colon cancer were used for this case study of bias. Among subjective outcomes (post-
operative complications and length of stay), NRS were associated with more extreme
estimates of benefit for laparoscopy than Strong RCTs. For the outcome post-operative
complications, NRS attributed 36% more benefit to laparoscopy than Strong RCTs.
Laparoscopy was associated with reductions in length of stay that were exaggerated
three-fold in NRS as compared with Strong RCTs. A similar pattern was not observed with
the objective outcomes peri-operative mortality and number of lymph nodes harvested. The
observed differences between NRS and Strong RCTs persisted after adjusting for period
effects and differences in baseline event rates between studies. Moreover, Typical RCTs
were also associated with larger estimates of benefit for laparoscopy as compared to Strong
RCTs. Odds ratios for post-operative complications were 37% smaller (i.e. more benefit) in
Typical than Strong RCTs. Laparoscopy was associated with a reduction in length of stay
that was two-fold larger in Typical than Strong RCTs.
Our findings suggest that surgical NRS are associated with more extreme estimates of benefit
as compared with Strong RCTs. Previous meta-epidemiological studies had identified a
number of design characteristics associated with biased effect estimates in RCTs (Schulz et
al. 1995; Moher et al. 1998; Kjaergard, Villumsen, and Gluud 2001; Balk et al. 2002; Als-
Nielsen et al. 2003; Tierney and Stewart 2005; Pildal et al. 2007; Nuesch, Reichenbach, et al.
2009; Nuesch, Trelle, et al. 2009; Nuesch et al. 2010; Bassler et al. 2010; Dechartres et al.
2011; Bafeta et al. 2012; Hrobjartsson et al. 2012, 2013; Wood et al. 2008; Savovic et al.
2012). We hypothesized that study characteristics may be similarly associated with bias
among NRS. Identifying these characteristics could explain part of the variation observed
between NRS and Strong RCTs. We therefore examined the relationship between NRS
design attributes and estimates of treatment effect. In Chapter 6, nine study characteristics
were examined: (i) whether the outcome of interest was the primary outcome; (ii) presence of
a sample size calculation; (iii) prospective data collection; (iv) concurrent (versus historical)
controls; (v) matched controls; (vi) standardized concurrent therapy (i.e. standardized post-
operative care); (vii) systematic outcome assessment; (viii) blinded outcome assessment and
(ix) intention to treat analysis. The majority of NRS were retrospective, had concurrent
controls and applied intention to treat principles. Few studies used standardized concurrent
therapy (i.e. standardized post-operative care), blinded outcome assessment or systematic
outcome assessment. Mixed-effects meta-regression models were used to compare effect
estimates across subgroups of NRS (i.e. those with or without a study characteristic) and
between NRS and Strong RCTs. We did not observe a consistent pattern of more extreme
benefit for laparoscopy with the absence of any particular NRS study characteristic.
The findings of this thesis represent novel contributions to the field of methodological
research in surgery. Many have hypothesized that the results of NRS differ from those of
well-conducted RCTs. Our results provide empirical evidence supporting this assertion; NRS
appear to be associated with notable bias when evaluating subjective outcomes. Moreover,
the results of the most rigorous RCTs appear to differ from those of more Typical RCTs.
This thesis work has broad implications for how we interpret evidence from surgical studies.
7.2 Implications
The work presented in this dissertation has three main implications.
(1) Implications for the meta-analysis of surgical RCTs.
(2) Implications for the interpretation of surgical NRS.
(3) Implications for future meta-epidemiological studies of NRS study characteristics.
7.2.1 Implications for the meta-analysis of surgical RCTs
Physicians and policy-makers rely on the synthesis of the best available evidence to inform
decision-making. Generally, the meta-analysis of RCT data is considered the strongest
source of evidence for the evaluation of interventions. In Chapter 5, however, we observed
that a minority of RCTs comparing laparoscopy with open surgery for the treatment of colon
cancer were low risk of bias trials (i.e. Strong RCTs) — the remaining 80% (i.e. Typical
RCTs) had methodological shortcomings. Our results provide empirical support for a
difference in effect estimates between trials at low risk of bias versus those at high/unclear
risk of bias when examining subjective outcomes. Furthermore, the results of unclear/high
risk of bias trials (i.e. Typical RCTs) were very similar to the results of NRS. It is possible
that the trial conduct in Typical RCTs more closely resembles what is encountered in NRS;
RCTs generally differ from NRS not only in how patients are allocated to interventions but
also in other respects, such as study registration, detailed protocols and data monitoring
safety boards. While Typical RCTs indeed involved randomizing patients to treatment, they
generally lacked these other features. We therefore recommend that the meta-analysis of
surgical RCTs should routinely incorporate both risk of bias assessments and subgroup
analyses of low risk of bias trials. In instances where no low risk of bias RCTs are available,
we advise that authors should be wary of drawing conclusions from Typical RCTs.
Moreover, we recommend that RCTs and NRS should be analyzed separately in the course
of a surgical meta-analysis. Considering there was a notable discrepancy between the results
of NRS and Strong RCTs, including NRS in the meta-analysis of surgical RCTs could lead to
misleading conclusions.
7.2.2 Implications for the interpretation of surgical NRS
In Chapter 5, we demonstrated that the results of NRS were 36% more extreme than those of
Strong RCTs when evaluating subjective, binary outcomes. For subjective continuous
outcomes, NRS overestimated the benefit associated with laparoscopy between two- and
three-fold. These results suggest that NRS may not be a reliable source of evidence for
decision-making in surgery. Therefore, the meta-analysis of NRS may provide surgeons and
policy-makers with misleading results; the accumulation of numerous studies often leads to
narrow confidence intervals but effect estimates may be nonetheless biased. Others have
previously cautioned investigators about the pitfalls of meta-analyzing NRS (Reeves et al.
2013b). Whereas such caution has been previously advised on theoretical grounds, the
findings of this thesis provide empirical support for such apprehension. We therefore
recommend that investigators be cautious when drawing conclusions from the
meta-analysis of surgical NRS, especially when analyzing subjective outcomes. However,
inferences from the meta-analysis of objective outcomes in surgical NRS do not appear to be
biased.
7.2.3 Implications for future meta-epidemiological studies of NRS study characteristics
Unfortunately, we were not able to identify NRS study characteristics associated with biased
estimates of treatment effect. This may be due to our limited power to detect important
effects as we used only 121 NRS for the current analysis. We were also limited by the
relative infrequency of many of the study characteristics assessed. However, it is also
possible that design characteristics of NRS may not be associated with bias in a predictable
way. For example, the absence of blinded outcome assessment may bias results towards the
null in one study and away from the null in another. Alternatively, there may be interactions
between study characteristics that act in equally unpredictable ways. For example, systematic
outcome assessment may be associated with biased effect estimates only in studies where
outcome assessors were unblinded. Accordingly, in instances where blinded outcome
assessment is employed, the presence or absence of systematic outcome assessment may not
be related to bias. Therefore, while it is likely that NRS study characteristics are associated
with bias, such bias may be difficult to quantify and detect. Our work has nonetheless
established that the referent group for future meta-regression analyses trying to isolate NRS-
design attributes associated with bias should be low risk of bias trials and not all RCTs for a
given intervention.
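The interaction described above can be illustrated with a minimal sketch. The data below are entirely hypothetical and are not drawn from the studies in this dissertation. The fixed-effect, inverse-variance pooling within strata is a deliberate simplification of the mixed-effects meta-regression models used in Chapter 6, and the names studies, pooled_log_or and cell are illustrative only. The sketch shows how the absence of systematic outcome assessment could shift the pooled log odds ratio among studies with unblinded assessors while leaving it essentially unchanged among studies with blinded assessors.

```python
# Hypothetical NRS: (log odds ratio, variance, blinded assessor?, systematic assessment?)
# These values are invented for illustration and are not data from the thesis.
studies = [
    (-0.10, 0.04, True, True),
    (-0.12, 0.05, True, False),
    (-0.11, 0.03, True, True),
    (-0.09, 0.06, True, False),
    (-0.15, 0.04, False, True),
    (-0.60, 0.05, False, False),
    (-0.20, 0.03, False, True),
    (-0.55, 0.06, False, False),
]


def pooled_log_or(rows):
    """Fixed-effect, inverse-variance pooled log odds ratio for a set of studies."""
    weights = [1.0 / variance for _, variance, _, _ in rows]
    weighted_sum = sum(w * row[0] for w, row in zip(weights, rows))
    return weighted_sum / sum(weights)


def cell(blinded, systematic):
    """Pooled log odds ratio for one combination of the two study characteristics."""
    rows = [s for s in studies if s[2] == blinded and s[3] == systematic]
    return pooled_log_or(rows)


# Shift in the pooled log odds ratio when systematic assessment is absent,
# estimated separately within each blinding stratum.
effect_blinded = cell(True, False) - cell(True, True)
effect_unblinded = cell(False, False) - cell(False, True)

# Interaction contrast: the difference between the two stratum-specific shifts.
interaction = effect_unblinded - effect_blinded

print(f"shift when blinded:   {effect_blinded:+.3f}")
print(f"shift when unblinded: {effect_unblinded:+.3f}")
print(f"interaction contrast: {interaction:+.3f}")
```

In this contrived example the shift is close to zero in the blinded stratum and clearly negative in the unblinded stratum. A model containing only main effects would tend to average the two strata, which is one way such an association could be diluted and go undetected.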
7.3 Limitations
7.3.1 Limitations of available data
The analyses in Chapter 5 relied heavily on reported methods to determine the risk of bias in
RCTs. It is possible, however, that actual study conduct may have differed from what was
reported in publications. Indeed, it has been shown that adequate random sequence
generation, allocation concealment and blinding are often unreported in trial publications
(Hill, LaValley, and Felson 2002; Mhaskar et al. 2012). This limitation was partially
overcome by contacting authors and referring to study protocols when assessing risk of bias.
It is nonetheless possible that some RCTs may have been rigorously performed but
misclassified as Typical RCTs. Moreover, various NRS were found to lack sample size
calculations, standardized concurrent therapy, blinded outcome assessment and systematic
outcome assessment. Some of these study characteristics may have been present but simply
not mentioned in publications of NRS. Studies examining the concordance between reported
design characteristics and actual study conduct in NRS would be helpful in assessing the
impact of this limitation.
Another limitation of using published literature for analysis is that data abstraction is limited
to those outcomes which authors chose to report. Selective outcome reporting may have
impacted the findings in this dissertation. Study authors may have chosen to report certain
outcomes based on whether effect estimates favoured laparoscopy or open surgery. If effect
estimates favouring open surgery were specifically omitted, it is possible our findings
overestimate the bias associated with NRS and Typical RCTs. An examination of funnel
plots, however, revealed only minor asymmetry for the outcomes post-operative complications
and length of stay. Therefore, the impact of selective outcome reporting was likely small.
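Funnel plot asymmetry can also be quantified rather than judged by eye. The sketch below computes the intercept from Egger's regression test, in which the standardized effect (effect divided by its standard error) is regressed on precision (the reciprocal of the standard error). This is one common approach and is offered only as an illustration; it is not the method reported in this dissertation, where funnel plots were examined. The function name egger_intercept and all input values are hypothetical.

```python
def egger_intercept(effects, std_errors):
    """Intercept of Egger's regression: (effect / SE) regressed on (1 / SE).

    An intercept near zero is consistent with a symmetric funnel plot;
    an intercept far from zero suggests small-study effects.
    """
    if len(effects) != len(std_errors) or len(effects) < 3:
        raise ValueError("need at least three studies with matching inputs")

    standardized = [y / se for y, se in zip(effects, std_errors)]
    precision = [1.0 / se for se in std_errors]

    n = len(effects)
    mean_x = sum(precision) / n
    mean_z = sum(standardized) / n

    # Ordinary least squares with a single predictor.
    sxx = sum((x - mean_x) ** 2 for x in precision)
    sxz = sum((x - mean_x) * (z - mean_z) for x, z in zip(precision, standardized))
    slope = sxz / sxx
    return mean_z - slope * mean_x


# Hypothetical standard errors for five studies (log odds ratio scale).
ses = [0.1, 0.2, 0.3, 0.4, 0.5]

# Symmetric case: every study estimates the same effect regardless of size.
symmetric = egger_intercept([-0.3] * 5, ses)

# Asymmetric case: less precise studies report more extreme benefit.
asymmetric = egger_intercept([-0.3 - 1.0 * se for se in ses], ses)

print(f"symmetric intercept:  {symmetric:.3f}")
print(f"asymmetric intercept: {asymmetric:.3f}")
```

A formal test would also require a standard error for the intercept, which this sketch omits. Tests of this kind have limited power when few studies are available.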
Our analyses may have also been subject to measurement error. Authors of NRS and RCTs
rarely provided definitions for the outcome post-operative complications. It is entirely
possible that these definitions varied between studies. While some authors calculated the
frequency of post-operative complications that delayed discharge, others have calculated the
frequency of all post-operative complications. This variation in outcome definition could
have influenced the effect estimates generated in these studies and thus influenced our
findings.
Our ability to detect an association between NRS-design attributes and effect estimates was
also limited by our sample size — the systematic search strategy identified only 144 NRS
comparing laparoscopy with open surgery for the management of colon cancer. It is possible
that a relationship between specific study characteristics and effect estimates exists, but that a
much larger cohort of NRS is required to detect it.
7.3.2 Limitations of data analysis
One of the limitations of our analyses was the use of the Cochrane Risk of Bias tool to
categorize RCTs as Typical or Strong trials. Other instruments for assessing RCT quality
were considered but could not be used because they lacked rigorous development, focused
heavily on blinding (Jadad et al. 1996) or had unknown validity and reliability (Olivo et al.
2008). In contrast, the Cochrane Risk of Bias tool has been developed using rigorous
methods, pilot tested and recently revised to diminish ambiguity (Higgins et al. 2011).
However, the inter-rater reliability for the most recent iteration of the instrument is unknown.
We partially overcame this limitation by having two assessors evaluate RCTs reporting post-
operative complications. There was perfect agreement between assessors (κ=1.00).
Moreover, Strong RCTs were large, multi-center and publicly funded studies. Trials with
these attributes have been shown to be less susceptible to bias (Als-Nielsen et al. 2003;
Dechartres et al. 2011; Bafeta et al. 2012). It is nonetheless possible that some trials
were misclassified as Typical rather than Strong RCTs. If effect estimates in these
studies were in favour of laparoscopy, the results of NRS would appear less extreme in
comparison. However, there is no reason to assume that effect estimates in misclassified
Typical RCTs systematically favoured laparoscopy.
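For readers unfamiliar with the agreement statistic reported above (κ), the following minimal sketch shows how Cohen's kappa is computed for two assessors. The ratings are hypothetical and do not reproduce the assessments made in this dissertation; the function name cohen_kappa is illustrative.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters assigning one category to each item."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")

    n = len(rater_a)

    # Proportion of items on which the two raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Agreement expected by chance, from each rater's marginal frequencies.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)

    if expected == 1.0:
        raise ValueError("kappa is undefined when both raters use a single category")
    return (observed - expected) / (1.0 - expected)


# Hypothetical classifications of five trials by two assessors.
assessor_1 = ["Strong", "Typical", "Typical", "Strong", "Typical"]
assessor_2 = ["Strong", "Typical", "Typical", "Strong", "Typical"]
assessor_3 = ["Strong", "Typical", "Strong", "Strong", "Typical"]

perfect = cohen_kappa(assessor_1, assessor_2)
partial = cohen_kappa(assessor_1, assessor_3)

print(f"identical ratings: kappa = {perfect:.2f}")
print(f"one disagreement:  kappa = {partial:.2f}")
```

Kappa equals 1 only when the assessors agree on every item. A single disagreement in a small sample lowers it considerably, so a value of 1.00 obtained from a small number of trials should be interpreted with that in mind.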
A fundamental assumption was made when designing the analyses of Chapters 5 and 6: that
effect estimates from Strong RCTs are a reasonable surrogate for the truth. However, it is
possible that these rigorous RCTs may have themselves been biased in ways we were not
able to appreciate.
In Chapter 6, we explored the relationship between NRS study characteristics and effect
estimates. However, it is likely that NRS differed not only with respect to study design but
also patient case-mix. Moreover, there may have been differences in surgeon skill between
studies as well. Unfortunately, we could not adjust our analyses for these sources of clinical
heterogeneity as we did not have access to individual patient or provider data. Specifically,
patient and provider characteristics may have explained part of the between-study
heterogeneity in our regression models. The residual error in these models would have
accordingly decreased and we would have had more power to detect a difference in effect
estimates with the presence or absence of NRS study characteristics. Unfortunately, we not
only lacked individual patient and provider data but also aggregate characteristics; group-
level patient characteristics were inconsistently reported across included NRS. Studies varied
with respect to which patient information was provided and how it was presented. For
example, some authors provided measures of disease severity (e.g. cancer stage) whereas
others did not. Moreover, cancer stage was at times reported according to the Dukes
classification system and in other articles, investigators used the American Joint Committee
on Cancer (AJCC) TNM staging system. Such variation in reporting precluded the use of
aggregate patient-level characteristics in the mixed-effects meta-regression models in
Chapter 6.
7.3.3 Limitations of generalizability
Comparative NRS and RCTs of a single surgical procedure were used for the case study of
bias presented in this dissertation. The overestimation of treatment effect in NRS and Typical
RCTs may be specific to studies examining laparoscopy and conventional surgery for the
treatment of colon cancer. Furthermore, our findings may apply only to surgical studies and
may not reflect the relationship between study design and bias in other areas of medicine. We
also demonstrated that more extreme estimates of benefit are obtained when evaluating the
subjective outcomes post-operative complications and length of stay. Our findings may apply
only to these particular outcomes and not other subjective outcomes such as pain, patient
satisfaction and so on. Additional studies are necessary to determine if the findings of this
dissertation are indeed reproducible. This additional evidence would augment the
generalizability of our findings.
7.4 Future Directions
7.4.1 Outcome reporting in NRS and RCTs
In Chapter 4, we observed that many outcomes were infrequently reported in NRS and
RCTs. Examples of these outcomes included margin status, number of lymph nodes
harvested, and quality of life. Definitions of pertinent outcomes, such as post-operative
complications, were also provided in a minority of NRS. In other instances, outcomes were
reported in a variety of ways; for example, studies reporting overall survival sometimes
provided 2-year overall survival whereas others reported 3- and 5-year survival. Others too
have found variation in outcome reporting in surgical studies. For example, in a study by
Blencowe et al., authors examined which complications were reported in articles of
esophageal cancer surgery. They identified 105 NRS and 17 RCTs published between 2005
and 2009. They found that no single complication was reported in all papers. Five studies
(5.1%) categorized complications with a validated grading system. Anastomotic leak was the
most commonly reported complication and was defined in 28.3% (n=28) of studies but 22
different definitions were used. They concluded that outcome reporting is “heterogeneous
and inconsistent, and it lacks methodological rigor” (Blencowe et al. 2012). We observed
similar variation in outcome reporting.
Efforts are required to establish core outcome sets for various surgical interventions to
facilitate the reporting, comparison and combination of results. Efforts are currently
underway by the COMET (Core Outcome Measures in Effectiveness Trials) Initiative to
standardize outcome reporting in studies of colon cancer surgery. Specific knowledge
translation strategies are necessary, potentially in conjunction with national surgical
societies, to facilitate the uptake of these outcome sets.
7.4.2 Investigating the relationship between reporting and
actual RCT quality
A minority of surgical RCTs were found to be at low risk of bias. It is unclear whether these
assessments reflect deficits in reporting or suboptimal trial execution. Further investigation is
required to determine if there is indeed discordance between reported and actual study
quality in surgical RCTs. If such a discrepancy is identified, it would be interesting to
determine why authors are omitting important methodological detail. Are authors unaware of
the CONSORT statement? Do word limits imposed by journal editors limit their ability to
appropriately describe their trials? Alternatively, RCT reports may reflect the real absence of
adequate random sequence generation, allocation concealment or other methodological
safeguards against bias. If inferior trial quality is confirmed, specific educational strategies
should be devised by national surgical societies to improve the methods of surgical RCTs.
7.4.3 Ongoing evaluations of NRS study characteristics
In Chapter 6, we did not observe a consistent pattern of more extreme benefit for laparoscopy
with the absence of a particular NRS study characteristic. Further research is required,
examining other interventions and other characteristics, to determine which attributes of NRS
study design are associated with biased estimates of treatment effect. The possibility
remains, however, that, unlike for RCTs, it may not be possible to identify which characteristics
of NRS design are associated with bias. The findings of this dissertation suggest that the
referent group for meta-regression analyses should be low risk of bias RCTs. Until empirical
evidence is available, expert consensus will be used to develop a risk of bias tool for NRS.
Indeed, the Cochrane Collaboration is currently developing an extension to the Cochrane
Risk of Bias tool for NRS and we are involved in the Collaboration’s efforts. These efforts
are necessary because NRS can be an important study design to evaluate the effectiveness, as
opposed to the efficacy, of surgical interventions. Once the risk of bias tool for NRS is
complete, additional research will be required to determine the validity and reliability of the
instrument.
7.5 Conclusions
In summary, we have demonstrated that the results of surgical NRS can be significantly
biased as compared with those of low risk of bias RCTs when evaluating subjective
outcomes. However, none of the nine NRS-design characteristics examined was consistently
associated with biased effect estimates. Additional research is necessary to determine which
NRS-design attributes, if any, are associated with bias.
References
Abraham, N. S., C. J. Byrne, J. M. Young, and M. J. Solomon. 2010. "Meta-analysis of well-designed nonrandomized comparative studies of surgical procedures is as good as randomized controlled trials." Journal of Clinical Epidemiology no. 63 (3):238-45.
Abraham, N. S., C. M. Byrne, J. M. Young, and M. J. Solomon. 2007. "Meta-analysis of non-randomized comparative studies of the short-term outcomes of laparoscopic resection for colorectal cancer." ANZ Journal of Surgery no. 77 (7):508-16.
ACS NSQIP - Classic Variables and Definitions, Chapter 4. 2012. [cited November 15 2012]. Available from http://nsqip.healthsoftonline.com/lib/Documents/Ch_4_Variables_Definitions_062810.pdf.
Agabegi, S. S., and P. J. Stern. 2008. "Bias in research." American Journal of Orthopedics (Belle Mead, N.J.) no. 37 (5):242-8.
Alexander, R. J., B. C. Jaques, and K. G. Mitchell. 1993. "Laparoscopically assisted colectomy and wound recurrence." Lancet no. 341 (8839):249-50.
Allison, Paul David. 2002. Missing data, Sage university papers Quantitative applications in the social sciences. Thousand Oaks, Calif.: Sage Publications.
Als-Nielsen, B., W. Chen, C. Gluud, and L. L. Kjaergard. 2003. "Association of funding and conclusions in randomized drug trials: a reflection of treatment effect or adverse events?" JAMA no. 290 (7):921-8.
Altman, Douglas G., Matthias Egger, and George Davey Smith. 2001. Systematic reviews in health care: meta-analysis in context. 2nd ed. London: BMJ.
American Society of Colon and Rectal Surgeons. 1995. "Position statement on laparoscopic colectomy." Diseases of the Colon and Rectum no. 35:5A.
Antman, K., D. Amato, W. Wood, J. Carson, H. Suit, K. Proppe, R. Carey, J. Greenberger, R. Wilson, and E. Frei, 3rd. 1985. "Selection bias in clinical trials." Journal of Clinical Oncology no. 3 (8):1142-7.
Atkins, D., M. Eccles, S. Flottorp, G. H. Guyatt, D. Henry, S. Hill, A. Liberati, D. O'Connell, A. D. Oxman, B. Phillips, H. Schunemann, T. T. Edejer, G. E. Vist, J. W. Williams, Jr., and Grade Working Group. 2004. "Systems for grading the quality of evidence and the strength of recommendations I: critical appraisal of existing approaches The GRADE Working Group." BMC Health Services Research no. 4 (1):38.
Bafeta, A., A. Dechartres, L. Trinquart, A. Yavchitz, I. Boutron, and P. Ravaud. 2012. "Impact of single centre status on estimates of intervention effects in trials with continuous outcomes: meta-epidemiological study." BMJ no. 344:e813.
Baillie, J. K. 2007. "Activated protein C: controversy and hope in the treatment of sepsis." Curr Opin Investig Drugs no. 8 (11):933-8.
Balk, E. M., P. A. Bonis, H. Moskowitz, C. H. Schmid, J. P. Ioannidis, C. Wang, and J. Lau. 2002. "Correlation of quality measures with estimates of treatment effect in meta-analyses of randomized controlled trials." JAMA no. 287 (22):2973-82.
Barkun, J. S., J. K. Aronson, L. S. Feldman, G. J. Maddern, S. M. Strasberg, D. G. Altman, J. M. Blazeby, I. C. Boutron, W. B. Campbell, P. A. Clavien, J. A. Cook, P. L. Ergina, D. R. Flum, P. Glasziou, J. C. Marshall, P. McCulloch, J. Nicholl, B. C. Reeves, C. M. Seiler, J. L. Meakins, D. Ashby, N. Black, J. Bunker, M. Burton, M. Campbell, K. Chalkidou, I. Chalmers, M. de Leval, J. Deeks, A. Grant, M. Gray, R. Greenhalgh, M. Jenicek, S. Kehoe, R. Lilford, P. Littlejohns, Y. Loke, R. Madhock, K. McPherson, P. Rothwell, B. Summerskill, D. Taggart, P. Tekkis, M. Thompson, T. Treasure, U. Trohler, and J. Vandenbroucke. 2009. "Evaluation and stages of surgical innovations." Lancet no. 374 (9695):1089-96.
Barnett-Page, E., and J. Thomas. 2009. "Methods for the synthesis of qualitative research: a critical review." BMC Medical Research Methodology no. 9:59.
Barza, M., T. A. Trikalinos, and J. Lau. 2009. "Statistical considerations in meta-analysis." Infectious Disease Clinics of North America no. 23 (2):195-210, Table of Contents.
Bassler, Dirk, Matthias Briel, Victor M. Montori, Melanie Lane, Paul Glasziou, Qi Zhou, Diane Heels-Ansdell, Stephen D. Walter, Gordon H. Guyatt, Stopit- Study Group, David N. Flynn, Mohamed B. Elamin, Mohammad Hassan Murad, Nisrin O. Abu Elnour, Julianna F. Lampropulos, Amit Sood, Rebecca J. Mullan, Patricia J. Erwin, Clare R. Bankhead, Rafael Perera, Carolina Ruiz Culebro, John J. You, Sohail M. Mulla, Jagdeep Kaur, Kara A. Nerenberg, Holger Schunemann, Deborah J. Cook, Kristina Lutz, Christine M. Ribic, Noah Vale, German Malaga, Elie A. Akl, Ignacio Ferreira-Gonzalez, Pablo Alonso-Coello, Gerard Urrutia, Regina Kunz, Heiner C. Bucher, Alain J. Nordmann, Heike Raatz, Suzana Alves da Silva, Fabio Tuche, Brigitte Strahm, Benjamin Djulbegovic, Neill K. J. Adhikari, Edward J. Mills, Femida Gwadry-Sridhar, Haresh Kirpalani, Heloisa P. Soares, Paul J. Karanicolas, Karen E. A. Burns, Per Olav Vandvik, Fernando Coto-Yglesias, Pedro Paulo M. Chrispim, and Tim Ramsay. 2010. "Stopping randomized trials early for benefit and estimation of treatment effects: systematic review and meta-regression analysis." JAMA no. 303 (12):1180-7.
Baumgaertner, M. R., W. D. Cannon, Jr., J. M. Vittori, E. S. Schmidt, and R. C. Maurer. 1990. "Arthroscopic debridement of the arthritic knee." Clinical Orthopaedics and Related Research (253):197-202.
Benson, K., and A. J. Hartz. 2000. "A comparison of observational studies and randomized, controlled trials." New England Journal of Medicine no. 342 (25):1878-86.
Bhandari, M., P. Tornetta, 3rd, T. Ellis, L. Audige, S. Sprague, J. C. Kuo, and M. F. Swiontkowski. 2004. "Hierarchy of evidence: differences in results between non-randomized studies and randomized trials in patients with femoral neck fractures." Archives of Orthopaedic and Trauma Surgery no. 124 (1):10-6.
Blencowe, N. S., A. G. McNair, C. R. Davis, S. T. Brookes, and J. M. Blazeby. 2012. "Standards of outcome reporting in surgical oncology: a case study in esophageal cancer." Annals of Surgical Oncology no. 19 (13):4012-8.
Borenstein, Michael. 2009. Introduction to meta-analysis. Chichester, U.K.: John Wiley & Sons.
Boutron, I., F. Tubach, B. Giraudeau, and P. Ravaud. 2003. "Methodological differences in clinical trials evaluating nonpharmacological and pharmacological treatments of hip and knee osteoarthritis." JAMA no. 290 (8):1062-70.
———. 2004. "Blinding was judged more difficult to achieve and maintain in nonpharmacologic than pharmacologic trials." Journal of Clinical Epidemiology no. 57 (6):543-50.
Briggle, Adam, and Carl Mitcham. Ethics and science : an introduction, Cambridge applied ethics.
Britton, A., M. McKee, N. Black, K. McPherson, C. Sanderson, and C. Bain. 1998. "Choosing between randomised and non-randomised studies: a systematic review." Health Technology Assessment no. 2 (13):i-iv, 1-124.
Campbell, Angela J., Anita Bagley, Ann Van Heest, and Michelle A. James. 2010. "Challenges of randomized controlled surgical trials." Orthopedic Clinics of North America no. 41 (2):145-55.
Canadian Cancer Society’s Steering Committee on Cancer Statistics. 2012. Canadian Cancer Statistics 2012. Toronto, ON: Canadian Cancer Society.
CASS. 1984. "Coronary artery surgery study (CASS): a randomized trial of coronary artery bypass surgery. Comparability of entry characteristics and survival in randomized patients and nonrandomized patients meeting randomization criteria." Journal of the American College of Cardiology no. 3 (1):114-28.
Chan, A. W., A. Hrobjartsson, M. T. Haahr, P. C. Gotzsche, and D. G. Altman. 2004. "Empirical evidence for selective reporting of outcomes in randomized trials: comparison of protocols to published articles." JAMA no. 291 (20):2457-65.
Chang, D. C., S. L. Matsen, and C. E. Simpkins. 2006. "Why should surgeons care about clinical research methodology?" Journal of the American College of Surgeons no. 203 (6):827-30.
Chaudhry, H., R. Mundi, I. Singh, T. A. Einhorn, and M. Bhandari. 2008. "How good is the orthopaedic literature?" Indian Journal of Orthopaedics no. 42 (2):144-9.
Choi, B. C., and A. L. Noseworthy. 1992. "Classification, direction, and prevention of bias in epidemiologic research." Journal of Occupational Medicine no. 34 (3):265-71.
Cobb, L. A., G. I. Thomas, D. H. Dillard, K. A. Merendino, and R. A. Bruce. 1959. "An evaluation of internal-mammary-artery ligation by a double-blind technic." New England Journal of Medicine no. 260 (22):1115-8.
Compton, C. C., L. P. Fielding, L. J. Burgart, B. Conley, H. S. Cooper, S. R. Hamilton, M. E. Hammond, D. E. Henson, R. V. Hutter, R. B. Nagle, M. L. Nielsen, D. J. Sargent, C. R. Taylor, M. Welton, and C. Willett. 2000. "Prognostic factors in colorectal cancer. College of American Pathologists Consensus Statement 1999." Archives of Pathology and Laboratory Medicine no. 124 (7):979-94.
Concato, J., N. Shah, and R. I. Horwitz. 2000. "Randomized, controlled trials, observational studies, and the hierarchy of research designs." New England Journal of Medicine no. 342 (25):1887-92.
Cook, J. A. 2009. "The challenges faced in the design, conduct and analysis of surgical randomised controlled trials." Trials no. 10:9.
Crowe, M., and L. Sheppard. 2011. "A review of critical appraisal tools show they lack rigor: Alternative tool structure is proposed." Journal of Clinical Epidemiology no. 64 (1):79-89.
Curry, J. I., B. Reeves, and M. D. Stringer. 2003. "Randomized controlled trials in pediatric surgery: could we do better?" Journal of Pediatric Surgery no. 38 (4):556-9.
Davies, H. T., I. K. Crombie, and M. Tavakoli. 1998. "When can odds ratios mislead?" BMJ no. 316 (7136):989-91.
Dechartres, Agnes, Isabelle Boutron, Ludovic Trinquart, Pierre Charles, and Philippe Ravaud. 2011. "Single-center trials show larger treatment effects than multicenter trials: evidence from a meta-epidemiologic study." Annals of Internal Medicine no. 155 (1):39-51.
Deeks, J. J. 2002. "Issues in the selection of a summary statistic for meta-analysis of clinical trials with binary outcomes." Statistics in Medicine no. 21 (11):1575-600.
Deeks, J. J., J. Dinnes, R. D'Amico, A. J. Sowden, C. Sakarovitch, F. Song, M. Petticrew, D. G. Altman, Group International Stroke Trial Collaborative, and Group European Carotid Surgery Trial Collaborative. 2003. "Evaluating non-randomised intervention studies." Health Technology Assessment no. 7 (27):iii-x.
Delgado-Rodriguez, M., and J. Llorca. 2004. "Bias." Journal of Epidemiology and Community Health no. 58 (8):635-41.
DerSimonian, R., and N. Laird. 1986. "Meta-analysis in clinical trials." Controlled Clinical Trials no. 7 (3):177-88.
Dimond, E. G., C. F. Kittle, and J. E. Crockett. 1960. "Comparison of internal mammary artery ligation and sham operation for angina pectoris." American Journal of Cardiology no. 5:483-6.
Downs, S. H., and N. Black. 1998. "The feasibility of creating a checklist for the assessment of the methodological quality both of randomised and non-randomised studies of health care interventions." Journal of Epidemiology and Community Health no. 52 (6):377-84.
Dwan, K., D. G. Altman, J. A. Arnaiz, J. Bloom, A. W. Chan, E. Cronin, E. Decullier, P. J. Easterbrook, E. Von Elm, C. Gamble, D. Ghersi, J. P. Ioannidis, J. Simes, and P. R. Williamson. 2008. "Systematic review of the empirical evidence of study publication bias and outcome reporting bias." PloS One no. 3 (8):e3081.
Edge, Stephen B., and American Joint Committee on Cancer. 2010. AJCC cancer staging manual. 7th ed. New York: Springer.
Egger, M., P. Juni, C. Bartlett, F. Holenstein, and J. Sterne. 2003. "How important are comprehensive literature searches and the assessment of trial quality in systematic reviews? Empirical study." Health Technology Assessment no. 7 (1):1-76.
Ergina, Patrick L., Jonathan A. Cook, Jane M. Blazeby, Isabelle Boutron, Pierre-Alain Clavien, Barnaby C. Reeves, and Christoph M. Seiler. 2009. "Challenges in evaluating surgical innovation." The Lancet no. 374 (9695):1097-1104.
Ernst, E., and M. H. Pittler. 2001. "Assessment of therapeutic safety in systematic reviews: literature review." BMJ no. 323 (7312):546.
Evans, D. 2003. "Hierarchy of evidence: a framework for ranking evidence evaluating healthcare interventions." Journal of Clinical Nursing no. 12 (1):77-84.
Evans, M., and A. V. Pollock. 1985. "A score system for evaluating random control clinical trials of prophylaxis of abdominal surgical wound infection." British Journal of Surgery no. 72 (4):256-60.
Falk, P. M., R. W. Beart Jr, S. D. Wexner, A. G. Thorson, D. G. Jagelman, I. C. Lavery, O. B. Johansen, and R. J. Fitzgibbons Jr. 1993. "Laparoscopic colectomy: A critical appraisal." Diseases of the Colon and Rectum no. 36 (1):28-34.
Farrokhyar, F., P. J. Karanicolas, A. Thoma, M. Simunovic, M. Bhandari, P. J. Devereaux, M. Anvari, A. Adili, and G. Guyatt. 2010. "Randomized controlled trials of surgical interventions." Annals of Surgery no. 251 (3):409-16.
Ferlay, J., H. R. Shin, F. Bray, D. Forman, C. Mathers, and D. M. Parkin. 2010. "Estimates of worldwide burden of cancer in 2008: GLOBOCAN 2008." International Journal of Cancer no. 127 (12):2893-917.
Fletcher, J. 2007. "What is heterogeneity and is it important?" BMJ no. 334 (7584):94-6.
Fletcher, Robert H., and Suzanne W. Fletcher. 2005. Clinical epidemiology: the essentials. 4th ed. Philadelphia: Lippincott Williams & Wilkins.
Fowler, D. L., and S. A. White. 1991. "Laparoscopy-assisted sigmoid resection." Surgical Laparoscopy and Endoscopy no. 1 (3):183-8.
Frazier, S. K., and G. J. Skinner. 2008. "Pulmonary artery catheters: state of the controversy." Journal of Cardiovascular Nursing no. 23 (2):113-21; quiz 122-3.
Freed, C. R., P. E. Greene, R. E. Breeze, W. Y. Tsai, W. DuMouchel, R. Kao, S. Dillon, H. Winfield, S. Culver, J. Q. Trojanowski, D. Eidelberg, and S. Fahn. 2001. "Transplantation of embryonic dopamine neurons for severe Parkinson's disease." New England Journal of Medicine no. 344 (10):710-9.
Friedrich, J. O., N. K. Adhikari, and J. Beyene. 2011. "Ratio of means for analyzing continuous outcomes in meta-analysis performed as well as mean difference methods." Journal of Clinical Epidemiology no. 64 (5):556-64.
Furlan, Andrea D. 2006. Non-randomized studies: an evaluation of search strategies, taxonomy and comparative effectiveness with randomized trials in the field of low-back pain [dissertation], University of Toronto.
GA Wells, B Shea, D O'Connell, J Peterson, V Welch, M Losos, P Tugwell. The Newcastle-Ottawa Scale (NOS) for assessing the quality of nonrandomised studies in meta-analyses. http://www.ohri.ca/programs/clinical_epidemiology/oxford.asp. Accessed July 2, 2011.
Gagliardi, A. R., M. C. Brouwers, V. A. Palda, L. Lemieux-Charles, and J. M. Grimshaw. 2011. "How can we improve guideline use? A conceptual framework of implementability." Implementation science : IS no. 6:26.
Garas, G., A. Ibrahim, H. Ashrafian, K. Ahmed, V. Patel, K. Okabayashi, P. Skapinakis, A. Darzi, and T. Athanasiou. 2012. "Evidence-based surgery: barriers, solutions, and the role of evidence synthesis." World Journal of Surgery no. 36 (8):1723-31.
Gelman, A., and D. B. Rubin. 1996. "Markov chain Monte Carlo methods in biostatistics." Statistical Methods in Medical Research no. 5 (4):339-55.
Ghaferi, A. A., J. D. Birkmeyer, and J. B. Dimick. 2009. "Variation in hospital mortality associated with inpatient surgery." New England Journal of Medicine no. 361 (14):1368-75.
Goodman, S., and K. Dickersin. 2011. "Metabias: a challenge for comparative effectiveness research." Annals of Internal Medicine no. 155 (1):61-2.
Graham, I. D., J. Logan, M. B. Harrison, S. E. Straus, J. Tetroe, W. Caswell, and N. Robinson. 2006. "Lost in knowledge translation: time for a map?" Journal of Continuing Education in the Health Professions no. 26 (1):13-24.
Gray, R., M. Sullivan, D. G. Altman, and A. N. Gordon-Weeks. 2012. "Adherence of trials of operative intervention to the CONSORT statement extension for non-pharmacological treatments: a comparative before and after study." Annals of the Royal College of Surgeons of England no. 94 (6):388-94.
Greenland, S., and K. O'Rourke. 2001. "On the bias produced by quality scores in meta-analysis, and a hierarchical view of proposed solutions." Biostatistics no. 2 (4):463-71.
Grimes, D. A., and K. F. Schulz. 2002. "Bias and causal associations in observational research." Lancet no. 359 (9302):248-52.
———. 2008. "Making sense of odds and odds ratios." Obstetrics and Gynecology no. 111 (2 Pt 1):423-6.
Grodstein, F., M. J. Stampfer, J. E. Manson, G. A. Colditz, W. C. Willett, B. Rosner, F. E. Speizer, and C. H. Hennekens. 1996. "Postmenopausal estrogen and progestin use and the risk of cardiovascular disease." New England Journal of Medicine no. 335 (7):453-61.
Gross, D. E., S. L. Brenner, I. Esformes, and M. L. Gross. 1991. "Arthroscopic treatment of degenerative joint disease of the knee." Orthopedics no. 14 (12):1317-21.
Grossman, J., and F. J. Mackenzie. 2005. "The randomized controlled trial: gold standard, or merely standard?" Perspectives in Biology and Medicine no. 48 (4):516-34.
Guillou, P. J., P. Quirke, H. Thorpe, J. Walker, D. G. Jayne, A. M. H. Smith, R. M. Heath, and J. M. Brown. 2005. "Short-term endpoints of conventional versus laparoscopic-assisted surgery in patients with colorectal cancer (MRC CLASICC trial): Multicentre, randomised controlled trial." Lancet no. 365 (9472):1718-1726.
Gurwitz, J. H., N. F. Col, and J. Avorn. 1992. "The exclusion of the elderly and women from clinical trials in acute myocardial infarction." JAMA no. 268 (11):1417-22.
Guyatt, G. H., A. D. Oxman, R. Kunz, G. E. Vist, Y. Falck-Ytter, H. J. Schunemann, and Grade Working Group. 2008. "What is "quality of evidence" and why is it important to clinicians?" BMJ no. 336 (7651):995-8.
Hadorn, D. C., D. Baker, J. S. Hodges, and N. Hicks. 1996. "Rating the quality of evidence for clinical practice guidelines." Journal of Clinical Epidemiology no. 49 (7):749-54.
Hall, J. C., B. Mills, H. Nguyen, and J. L. Hall. 1996. "Methodologic standards in surgical trials." Surgery no. 119 (4):466-72.
Hannan, E. L. 2008. "Randomized clinical trials and observational studies: guidelines for assessing respective strengths and limitations." JACC: Cardiovascular Interventions no. 1 (3):211-7.
Harrison, J. D., M. J. Solomon, J. M. Young, A. Meagher, G. Hruby, G. Salkeld, and S. Clarke. 2007. "Surgical and oncology trials for rectal cancer: who will participate?" Surgery no. 142 (1):94-101.
Hartling, L., K. Bond, K. Harvey, P. L. Santaguida, M. Viswanathan, and D. M. Dryden. 2010. Developing and Testing a Tool for the Classification of Study Designs in Systematic Reviews of Interventions and Exposures. Edited by AHRQ. Rockville MD.
Hartling, L., M. P. Hamm, A. Milne, B. Vandermeer, P. L. Santaguida, M. Ansari, A. Tsertsvadze, S. Hempel, P. Shekelle, and D. M. Dryden. 2012. "Testing the Risk of Bias tool showed low reliability between individual reviewers and across consensus assessments of reviewer pairs." Journal of Clinical Epidemiology.
Hartling, Lisa, Kenneth Bond, P. Lina Santaguida, Meera Viswanathan, and Donna M. Dryden. 2011. "Testing a tool for the classification of study designs in systematic reviews of interventions and exposures showed moderate reliability and low accuracy." Journal of Clinical Epidemiology no. 64 (8):861-71.
Hartling, Lisa, Maria Ospina, Yuanyuan Liang, Donna M. Dryden, Nicola Hooton, Jennifer Krebs Seida, and Terry P. Klassen. 2009. "Risk of bias versus quality assessment of randomised controlled trials: cross sectional study." BMJ no. 339:b4012.
Herbison, P., J. Hay-Smith, and W. J. Gillespie. 2006. "Adjustment of meta-analyses on the basis of quality scores should be abandoned." Journal of Clinical Epidemiology no. 59 (12):1249-56.
Hewett, Peter J., Randall A. Allardyce, Philip F. Bagshaw, Christopher M. Frampton, Francis A. Frizelle, Nicholas A. Rieger, J. Shona Smith, Michael J. Solomon, Jacqueline H. Stephens, and Andrew R. L. Stevenson. 2008. "Short-term outcomes of the Australasian randomized clinical study comparing laparoscopic and conventional open surgical treatments for colon cancer: the ALCCaS trial." Annals of Surgery no. 248 (5):728-38.
Higgins, J. P., D. G. Altman, P. C. Gotzsche, P. Juni, D. Moher, A. D. Oxman, J. Savovic, K. F. Schulz, L. Weeks, J. A. Sterne, Group Cochrane Bias Methods, and Group Cochrane Statistical Methods. 2011. "The Cochrane Collaboration's tool for assessing risk of bias in randomised trials." BMJ no. 343 (oct18 2):d5928.
Higgins, J. P., and S. G. Thompson. 2004. "Controlling the risk of spurious findings from meta-regression." Statistics in Medicine no. 23 (11):1663-82.
Higgins, Julian P. T., Sally Green, and Cochrane Collaboration. 2011. Cochrane handbook for systematic reviews of interventions, Cochrane book series. Chichester, England ; Hoboken, NJ: Wiley-Blackwell.
Higgins, Julian PT, Craig Ramsay, Barnaby C Reeves, Jonathan J Deeks, Beverley Shea, Jeffrey C Valentine, Peter Tugwell, and George Wells. 2013. "Issues relating to study design and risk of bias when including non-randomized studies in systematic reviews on the effects of interventions." Research Synthesis Methods no. 4 (1):12-25.
Hill, C. L., M. P. LaValley, and D. T. Felson. 2002. "Discrepancy between published report and actual conduct of randomized clinical trials." Journal of Clinical Epidemiology no. 55 (8):783-6.
Howes, N., L. Chagla, M. Thorpe, and P. McCulloch. 1997. "Surgical practice is evidence based." British Journal of Surgery no. 84 (9):1220-3.
Hozo, S. P., B. Djulbegovic, and I. Hozo. 2005. "Estimating the mean and variance from the median, range, and the size of a sample." BMC Medical Research Methodology no. 5:13.
Hrobjartsson, A., A. S. Thomsen, F. Emanuelsson, B. Tendal, J. Hilden, I. Boutron, P. Ravaud, and S. Brorson. 2012. "Observer bias in randomised clinical trials with binary outcomes: systematic review of trials with both blinded and non-blinded outcome assessors." BMJ no. 344:e1119.
———. 2013. "Observer bias in randomized clinical trials with measurement scale outcomes: a systematic review of trials with both blinded and nonblinded assessors." CMAJ: Canadian Medical Association Journal no. 185 (4):E201-11.
Hutchins, L. F., J. M. Unger, J. J. Crowley, C. A. Coltman, Jr., and K. S. Albain. 1999. "Underrepresentation of patients 65 years of age or older in cancer-treatment trials." New England Journal of Medicine no. 341 (27):2061-7.
Ingraham, A. M., B. Haas, M. E. Cohen, C. Y. Ko, and A. B. Nathens. 2012. "Comparison of hospital performance in trauma vs emergency and elective general surgery: implications for acute care surgery quality improvement." Archives of Surgery no. 147 (7):591-8.
Ioannidis, J. P., and T. A. Trikalinos. 2007. "The appropriateness of asymmetry tests for publication bias in meta-analyses: a large survey." CMAJ: Canadian Medical Association Journal no. 176 (8):1091-6.
Jackson, H. H., J. D. Jackson, S. J. Mulvihill, M. A. Firpo, and R. E. Glasgow. 2004. "Trends in research support and productivity in the changing environment of academic surgery." Journal of Surgical Research no. 116 (2):197-201.
Jacobs, M., J. C. Verdeja, and H. S. Goldstein. 1991. "Minimally invasive colon resection (laparoscopic colectomy)." Surgical Laparoscopy and Endoscopy no. 1 (3):144-50.
Jacquier, I., I. Boutron, D. Moher, C. Roy, and P. Ravaud. 2006. "The reporting of randomized clinical trials using a surgical intervention is in need of immediate improvement: a systematic review." Annals of Surgery no. 244 (5):677-83.
Jadad, A. R., R. A. Moore, D. Carroll, C. Jenkinson, D. J. Reynolds, D. J. Gavaghan, and H. J. McQuay. 1996. "Assessing the quality of reports of randomized clinical trials: is blinding necessary?" Controlled Clinical Trials no. 17 (1):1-12.
Jha, P., M. Flather, E. Lonn, M. Farkouh, and S. Yusuf. 1995. "The antioxidant vitamins and cardiovascular disease. A critical review of epidemiologic and clinical trial data." Annals of Internal Medicine no. 123 (11):860-72.
Juni, P., F. Holenstein, J. Sterne, C. Bartlett, and M. Egger. 2002. "Direction and impact of language bias in meta-analyses of controlled trials: empirical study." International Journal of Epidemiology no. 31 (1):115-23.
Juni, P., A. Witschi, R. Bloch, and M. Egger. 1999. "The hazards of scoring the quality of clinical trials for meta-analysis." JAMA no. 282 (11):1054-60.
Kallmes, D. F., B. A. Comstock, P. J. Heagerty, J. A. Turner, D. J. Wilson, T. H. Diamond, R. Edwards, L. A. Gray, L. Stout, S. Owen, W. Hollingworth, B. Ghdoke, D. J. Annesley-Williams, S. H. Ralston, and J. G. Jarvik. 2009. "A randomized trial of vertebroplasty for osteoporotic spinal fractures." New England Journal of Medicine no. 361 (6):569-79.
Karanicolas, P. J., F. Farrokhyar, and M. Bhandari. 2010. "Practical tips for surgical research: blinding: who, what, when, why, how?" Canadian Journal of Surgery no. 53 (5):345-8.
Katrak, P., A. E. Bialocerkowski, N. Massy-Westropp, S. Kumar, and K. A. Grimmer. 2004. "A systematic review of the content of critical appraisal tools." BMC Medical Research Methodology no. 4:22.
Kazemier, G., H. J. Bonjer, F. J. Berends, and J. F. Lange. 1995. "Port site metastases after laparoscopic colorectal surgery for cure of malignancy." British Journal of Surgery no. 82 (8):1141-2.
Kelly, J., A. Rudd, R. R. Lewis, and B. J. Hunt. 2001. "Screening for subclinical deep-vein thrombosis." QJM no. 94 (10):511-9.
Kelly, M., L. Sharp, F. Dwane, T. Kelleher, and H. Comber. 2012. "Factors predicting hospital length-of-stay and readmission after colorectal resection: a population-based study of elective and emergency admissions." BMC Health Services Research no. 12:77.
Kirchhoff, P., P. A. Clavien, and D. Hahnloser. 2010. "Complications in colorectal surgery: risk factors and preventive strategies." Patient Safety in Surgery no. 4 (1):5.
Kirkham, J. J., K. M. Dwan, D. G. Altman, C. Gamble, S. Dodd, R. Smyth, and P. R. Williamson. 2010. "The impact of outcome reporting bias in randomised controlled trials on a cohort of systematic reviews." BMJ no. 340:c365.
Kjaergard, L. L., J. Villumsen, and C. Gluud. 2001. "Reported methodologic quality and discrepancies between large and small randomized trials in meta-analyses." Annals of Internal Medicine no. 135 (11):982-9.
Kleinbaum, D. G., H. Morgenstern, and L. L. Kupper. 1981. "Selection bias in epidemiologic studies." American Journal of Epidemiology no. 113 (4):452-63.
Konrat, C., I. Boutron, L. Trinquart, G. R. Auleley, P. Ricordeau, and P. Ravaud. 2012. "Underrepresentation of elderly people in randomised controlled trials. The example of trials of 4 widely prescribed drugs." PloS One no. 7 (3):e33559.
Krieger, N., I. Lowy, R. Aronowitz, J. Bigby, K. Dickersin, E. Garner, J. P. Gaudilliere, C. Hinestrosa, R. Hubbard, P. A. Johnson, S. A. Missmer, J. Norsigian, C. Pearson, C. E. Rosenberg, L. Rosenberg, B. G. Rosenkrantz, B. Seaman, C. Sonnenschein, A. M. Soto, J. Thornton, and G. Weisz. 2005. "Hormone replacement therapy, cancer, controversies, and women's health: historical, epidemiological, biological, clinical, and advocacy perspectives." Journal of Epidemiology and Community Health no. 59 (9):740-8.
Kuhry, E., W. F. Schwenk, R. Gaupset, U. Romild, and H. J. Bonjer. 2008. "Long-term results of laparoscopic colorectal cancer resection." Cochrane Database of Systematic Reviews (2):CD003432.
Kunz, R., G. Vist, and A. D. Oxman. 2007. "Randomisation to protect against selection bias in healthcare trials." Cochrane Database of Systematic Reviews (2):MR000012.
Lacy, A. M., J. C. Garcia-Valdecasas, J. M. Pique, S. Delgado, E. Campo, J. M. Bordas, P. Taura, L. Grande, J. Fuster, J. L. Pacheco, et al. 1995. "Short-term outcome analysis of a randomized study comparing laparoscopic vs open colectomy for colon cancer." Surgical Endoscopy no. 9 (10):1101-5.
Landis, J. R., and G. G. Koch. 1977. "The measurement of observer agreement for categorical data." Biometrics no. 33 (1):159-74.
Lassen, K., A. Hvarphiye, and T. Myrmel. 2012. "Randomised trials in surgery: the burden of evidence." Reviews on Recent Clinical Trials no. 7 (3):244-8.
Lee, P. Y., K. P. Alexander, B. G. Hammill, S. K. Pasquali, and E. D. Peterson. 2001. "Representation of elderly persons and women in published randomized trials of acute coronary syndromes." JAMA no. 286 (6):708-13.
Legare, F., D. Stacey, I. D. Graham, G. Elwyn, P. Pluye, M. P. Gagnon, D. Frosch, M. B. Harrison, J. Kryworuchko, S. Pouliot, and S. Desroches. 2008. "Advancing theories, models and measurement for an interprofessional approach to shared decision making in primary care: a study protocol." BMC Health Services Research no. 8:2.
Leung, E., A. M. Ferjani, N. Stellard, and L. S. Wong. 2009. "Predicting post-operative mortality in patients undergoing colorectal surgery using P-POSSUM and CR-POSSUM scores: a prospective study." International Journal of Colorectal Disease no. 24 (12):1459-64.
Lewis, J. H., M. L. Kilgore, D. P. Goldman, E. L. Trimble, R. Kaplan, M. J. Montello, M. G. Housman, and J. J. Escarce. 2003. "Participation of patients 65 years of age or older in cancer clinical trials." Journal of Clinical Oncology no. 21 (7):1383-9.
Livesley, P. J., M. Doherty, M. Needoff, and A. Moulton. 1991. "Arthroscopic lavage of osteoarthritic knees." Journal of Bone and Joint Surgery (British Volume) no. 73 (6):922-6.
Loke, Y. K., D. Price, A. Herxheimer, and Cochrane Adverse Effects Methods Group. 2007. "Systematic reviews of adverse effects: framework for a structured approach." BMC Medical Research Methodology no. 7:32.
Lunn, D., D. Spiegelhalter, A. Thomas, and N. Best. 2009. "The BUGS project: Evolution, critique and future directions." Statistics in Medicine no. 28 (25):3049-67.
Martel, G., and R. P. Boushey. 2006. "Laparoscopic colon surgery: past, present and future." Surgical Clinics of North America no. 86 (4):867-97.
Marti-Carvajal, A. J., I. Sola, D. Lathyris, and A. F. Cardona. 2012. "Human recombinant activated protein C for severe sepsis." Cochrane Database of Systematic Reviews no. 3:CD004388.
Marubashi, S., H. Yano, T. Monden, T. Hata, H. Takahashi, S. Fujita, T. Kanoh, T. Iwazawa, S. Matsui, Y. Nakano, H. Tateishi, M. Kinuta, S. Takiguchi, and J. Okamura. 2000. "The usefulness, indications, and complications of laparoscopy-assisted colectomy in comparison with those of open colectomy for colorectal carcinoma." Surgery Today no. 30 (6):491-6.
Masoudi, F. A., E. P. Havranek, P. Wolfe, C. P. Gross, S. S. Rathore, J. F. Steiner, D. L. Ordin, and H. M. Krumholz. 2003. "Most hospitalized older persons do not meet the enrollment criteria for clinical trials in heart failure." American Heart Journal no. 146 (2):250-7.
Mathieu, S., I. Boutron, D. Moher, D. G. Altman, and P. Ravaud. 2009. "Comparison of registered and published primary outcomes in randomized controlled trials." JAMA no. 302 (9):977-84.
McCulloch, P., D. G. Altman, W. B. Campbell, D. R. Flum, P. Glasziou, J. C. Marshall, J. Nicholl, J. K. Aronson, J. S. Barkun, J. M. Blazeby, I. C. Boutron, P. A. Clavien, J. A. Cook, P. L. Ergina, L. S. Feldman, G. J. Maddern, B. C. Reeves, C. M. Seiler, S. M. Strasberg, J. L. Meakins, D. Ashby, N. Black, J. Bunker, M. Burton, M. Campbell, K. Chalkidou, I. Chalmers, M. de Leval, J. Deeks, A. Grant, M. Gray, R. Greenhalgh, M. Jenicek, S. Kehoe, R. Lilford, P. Littlejohns, Y. Loke, R. Madhock, K. McPherson, J. Meakins, P. Rothwell, B. Summerskill, D. Taggart, P. Tekkis, M. Thompson, T. Treasure, U. Trohler, and J. Vandenbroucke. 2009. "No surgical innovation without evaluation: the IDEAL recommendations." Lancet no. 374 (9695):1105-12.
McCulloch, P., A. Kaul, G. F. Wagstaff, and J. Wheatcroft. 2005. "Tolerance of uncertainty, extroversion, neuroticism and attitudes to randomized controlled trials among surgeons and physicians." British Journal of Surgery no. 92 (10):1293-7.
McDonald, A. M., R. C. Knight, M. K. Campbell, V. A. Entwistle, A. M. Grant, J. A. Cook, D. R. Elbourne, D. Francis, J. Garcia, I. Roberts, and C. Snowdon. 2006. "What influences recruitment to randomised controlled trials? A review of trials funded by two UK funding agencies." Trials no. 7:9.
McIntosh, M. W. 1996. "The population risk as an explanatory variable in research synthesis of clinical trials." Statistics in Medicine no. 15 (16):1713-28.
McLaren, A. C., C. P. Blokker, P. J. Fowler, J. N. Roth, and M. G. Rock. 1991. "Arthroscopic debridement of the knee for osteoarthrosis." Canadian Journal of Surgery no. 34 (6):595-8.
McLeod, R. S. 1999. "Issues in surgical randomized controlled trials." World Journal of Surgery no. 23 (12):1210-4.
McRae, C., E. Cherin, T. G. Yamazaki, G. Diem, A. H. Vo, D. Russell, J. H. Ellgring, S. Fahn, P. Greene, S. Dillon, H. Winfield, K. B. Bjugstad, and C. R. Freed. 2004. "Effects of perceived treatment on quality of life and medical outcomes in a double-blind placebo surgery trial." Archives of General Psychiatry no. 61 (4):412-20.
Meakins, J. L. 2002. "Innovation in surgery: the rules of evidence." American Journal of Surgery no. 183 (4):399-405.
Mhaskar, R., B. Djulbegovic, A. Magazin, H. P. Soares, and A. Kumar. 2012. "Published methodological quality of randomized controlled trials does not reflect the actual quality assessed in protocols." Journal of Clinical Epidemiology no. 65 (6):602-9.
Miles, Matthew B., and A. M. Huberman. 1994. Qualitative data analysis : an expanded sourcebook. 2nd ed. Thousand Oaks: Sage Publications.
Mills, N., J. L. Donovan, M. Smith, A. Jacoby, D. E. Neal, and F. C. Hamdy. 2003. "Perceptions of equipoise are crucial to trial participation: a qualitative study of men in the ProtecT study." Controlled Clinical Trials no. 24 (3):272-82.
Moher, D., B. Pham, A. Jones, D. J. Cook, A. R. Jadad, M. Moher, P. Tugwell, and T. P. Klassen. 1998. "Does quality of reports of randomised trials affect estimates of intervention efficacy reported in meta-analyses?" Lancet no. 352 (9128):609-13.
Moher, D., K. F. Schulz, D. Altman, and CONSORT Group. 2001. "The CONSORT statement: revised recommendations for improving the quality of reports of parallel-group randomized trials." JAMA no. 285 (15):1987-91.
Montorsi, M., U. Fumagalli, R. Rosati, S. Bona, B. Chella, and C. Huscher. 1995. "Early parietal recurrence of adenocarcinoma of the colon after laparoscopic colectomy." British Journal of Surgery no. 82 (8):1036-7.
Montreuil, B., Y. Bendavid, and J. Brophy. 2005. "What is so odd about odds?" Canadian Journal of Surgery no. 48 (5):400-8.
Morton, S. C., J. L. Adams, M. J. Suttorp, and P. G. Shekelle. 2004. Meta-regression Approaches: What, Why, When, and How? Rockville MD.
Moseley, J. B., K. O'Malley, N. J. Petersen, T. J. Menke, B. A. Brody, D. H. Kuykendall, J. C. Hollingsworth, C. M. Ashton, and N. P. Wray. 2002. "A controlled trial of arthroscopic surgery for osteoarthritis of the knee." New England Journal of Medicine no. 347 (2):81-8.
National Heart, Lung, and Blood Institute Acute Respiratory Distress Syndrome Clinical Trials Network, A. P. Wheeler, G. R. Bernard, B. T. Thompson, D. Schoenfeld, H. P. Wiedemann, B. deBoisblanc, A. F. Connors, Jr., R. D. Hite, and A. L. Harabin. 2006. "Pulmonary-artery versus central venous catheter to guide treatment of acute lung injury." New England Journal of Medicine no. 354 (21):2213-24.
Nduka, C. C., J. R. Monson, N. Menzies-Gow, and A. Darzi. 1994. "Abdominal wall metastases following laparoscopy." British Journal of Surgery no. 81 (5):648-52.
Nelson, H., N. Petrelli, A. Carlin, J. Couture, J. Fleshman, J. Guillem, B. Miedema, D. Ota, D. Sargent, and National Cancer Institute Expert Panel. 2001. "Guidelines 2000 for colon and rectal cancer surgery." Journal of the National Cancer Institute no. 93 (8):583-96.
Nelson, H., D. J. Sargent, H. S. Wieand, J. Fleshman, M. Anvari, S. J. Stryker, R. W. Beart Jr, M. Hellinger, R. Flanagan Jr, W. Peters, and D. Ota. 2004. "A Comparison of Laparoscopically Assisted and Open Colectomy for Colon Cancer." New England Journal of Medicine no. 350 (20):2050-9.
Neumayer, L. A., A. A. Gawande, J. Wang, A. Giobbie-Hurder, K. M. Itani, R. J. Fitzgibbons, Jr., D. Reda, O. Jonasson, and C. S. P. Investigators. 2005. "Proficiency of surgeons in inguinal hernia repair: effect of experience and age." Annals of Surgery no. 242 (3):344-8; discussion 348-52.
NHSN Patient Safety Component Manual. National Healthcare Safety Network. Centers for Disease Control and Prevention. 2012. [cited November 15 2012]. Available from http://www.cdc.gov/nhsn/toc_pscmanual.html.
Nicolaides, K., L. Brizot Mde, F. Patel, and R. Snijders. 1994. "Comparison of chorionic villus sampling and amniocentesis for fetal karyotyping at 10-13 weeks' gestation." Lancet no. 344 (8920):435-9.
Norris, S. L., H. K. Holmer, L. A. Ogden, R. Fu, A. M. Abou-Setta, M. S. Viswanathan, and M. L. McPheeters. 2012. Selective Outcome Reporting as a Source of Bias in Reviews of Comparative Effectiveness. Rockville MD.
Nuesch, Eveline, Stephan Reichenbach, Sven Trelle, Anne W. S. Rutjes, Katharina Liewald, Rebekka Sterchi, Douglas G. Altman, and Peter Juni. 2009. "The importance of allocation concealment and patient blinding in osteoarthritis trials: a meta-epidemiologic study." Arthritis and Rheumatism no. 61 (12):1633-41.
Nuesch, Eveline, Sven Trelle, Stephan Reichenbach, Anne W. S. Rutjes, Elizabeth Burgi, Martin Scherer, Douglas G. Altman, and Peter Juni. 2009. "The effects of excluding patients from the analysis in randomised controlled trials: meta-epidemiological study." BMJ no. 339:b3244.
Nuesch, Eveline, Sven Trelle, Stephan Reichenbach, Anne W. S. Rutjes, Beatrice Tschannen, Douglas G. Altman, Matthias Egger, and Peter Juni. 2010. "Small study effects in meta-analyses of osteoarthritis trials: meta-epidemiological study." BMJ no. 341:c3515.
Olivo, S. A., L. G. Macedo, I. C. Gadotti, J. Fuentes, T. Stanton, and D. J. Magee. 2008. "Scales to assess the quality of randomized controlled trials: a systematic review." Physical Therapy no. 88 (2):156-75.
Owens, D. K., K. N. Lohr, D. Atkins, J. R. Treadwell, J. T. Reston, E. B. Bass, S. Chang, and M. Helfand. 2010. "AHRQ series paper 5: grading the strength of a body of evidence when comparing medical interventions--agency for healthcare research and quality and the effective health-care program." Journal of Clinical Epidemiology no. 63 (5):513-23.
Oxford Centre for Evidence-Based Medicine. 2011. Oxford Centre for Evidence-Based Medicine - Levels of evidence (March 2009) 2009 [cited March 1 2011]. Available from http://www.cebm.net/?o=1025
Pagano, Marcello, and Kimberlee Gauvreau. 2000. Principles of biostatistics. 2nd ed. Pacific Grove, CA: Duxbury.
Panesar, S. S., R. Thakrar, T. Athanasiou, and A. Sheikh. 2006. "Comparison of reports of randomized controlled trials and systematic reviews in surgical journals: literature review." Journal of the Royal Society of Medicine no. 99 (9):470-2.
Pannucci, C. J., and E. G. Wilkins. 2010. "Identifying and avoiding bias in research." Plastic and Reconstructive Surgery no. 126 (2):619-25.
Pildal, J., A. Hrobjartsson, K. J. Jorgensen, J. Hilden, D. G. Altman, and P. C. Gotzsche. 2007. "Impact of allocation concealment on conclusions drawn from meta-analyses of randomized trials.[Erratum appears in Int J Epidemiol. 2008 Apr;37(2):422]." International Journal of Epidemiology no. 36 (4):847-57.
Pyorala, S., N. P. Huttunen, and M. Uhari. 1995. "A review and meta-analysis of hormonal treatment of cryptorchidism." Journal of Clinical Endocrinology and Metabolism no. 80 (9):2795-9.
Nenshi, Rahima, Nancy Baxter, Erin Kennedy, Susan E. Schultz, Nadia Gunraj, Andrew S. Wilton, David R. Urbach, and Marko Simunovic. 2008. "Surgery for Colon Cancer." In Cancer Surgery in Ontario: ICES Atlas, edited by D. R. Urbach, M. Simunovic, and S. E. Schultz. Toronto: Institute for Clinical Evaluative Sciences.
Reeves, B. C., J. J. Deeks, J. P. T. Higgins, and G. A. Wells. 2011. "Chapter 13: Including non-randomized studies." In Cochrane Handbook for Systematic Reviews of Interventions Version 5.1.0 (updated March 2011), edited by J. P. T. Higgins and S. Green. The Cochrane Collaboration. www.cochrane-handbook.org.
Reeves, B. C., J. P. T. Higgins, C. Ramsay, B. Shea, P. Tugwell, and G. A. Wells. 2013a. "An introduction to methodological issues when including non-randomised studies in systematic reviews on the effects of interventions." Research Synthesis Methods no. 4 (1):1-11.
Reeves, Barnaby C, Julian PT Higgins, Craig Ramsay, Beverley Shea, Peter Tugwell, and George A. Wells. 2013b. "An introduction to methodological issues when including non-randomised studies in systematic reviews on the effects of interventions." Research Synthesis Methods no. 4 (1):1-11.
Reilly, W. T., H. Nelson, G. Schroeder, H. S. Wieand, J. Bolton, and M. J. O'Connell. 1996. "Wound recurrence following conventional treatment of colorectal cancer. A rare but perhaps underestimated problem." Diseases of the Colon and Rectum no. 39 (2):200-7.
Reimold, S. C., T. C. Chalmers, J. A. Berlin, and E. M. Antman. 1992. "Assessment of the efficacy and safety of antiarrhythmic therapy for chronic atrial fibrillation: observations on the role of trial design and implications of drug-related mortality." American Heart Journal no. 124 (4):924-32.
RMITG. 1994. "Worldwide collaborative observational study and meta-analysis on allogenic leukocyte immunotherapy for recurrent spontaneous abortion. Recurrent Miscarriage Immunotherapy Trialists Group." American Journal of Reproductive Immunology no. 32 (2):55-72.
Ross, S., A. Grant, C. Counsell, W. Gillespie, I. Russell, and R. Prescott. 1999. "Barriers to participation in randomised controlled trials: a systematic review." Journal of Clinical Epidemiology no. 52 (12):1143-56.
Sackett, D. L. 1979. "Bias in analytic research." Journal of Chronic Diseases no. 32 (1-2):51-63.
Sackett, David L., R. Brian Haynes, Gordon H. Guyatt, and Peter Tugwell. 1991. Clinical epidemiology : a basic science for clinical medicine. 2nd ed. Boston: Little, Brown.
Sanderson, S., I. D. Tatt, and J. P. Higgins. 2007. "Tools for assessing quality and susceptibility to bias in observational studies in epidemiology: a systematic review and annotated bibliography." International Journal of Epidemiology no. 36 (3):666-76.
Saunders, L. D., G. M. Soomro, J. Buckingham, G. Jamtvedt, and P. Raina. 2003. "Assessing the methodological quality of nonrandomized intervention studies." Western Journal of Nursing Research no. 25 (2):223-37.
Savovic, J., H. E. Jones, D. G. Altman, R. J. Harris, P. Juni, J. Pildal, B. Als-Nielsen, E. M. Balk, C. Gluud, L. L. Gluud, J. P. Ioannidis, K. F. Schulz, R. Beynon, N. J. Welton, L. Wood, D. Moher, J. J. Deeks, and J. A. Sterne. 2012. "Influence of reported study design characteristics on intervention effect estimates from randomized, controlled trials." Annals of Internal Medicine no. 157 (6):429-38.
Schulz, K. F., I. Chalmers, R. J. Hayes, and D. G. Altman. 1995. "Empirical evidence of bias. Dimensions of methodological quality associated with estimates of treatment effects in controlled trials." JAMA no. 273 (5):408-12.
Schwenk, W., O. Haase, J. Neudecker, and J. M. Muller. 2005. "Short term benefits for laparoscopic colorectal resection." Cochrane Database of Systematic Reviews (3):CD003145.
Shapiro, C. L., and A. Recht. 1994. "Late effects of adjuvant therapy for breast cancer." Journal of the National Cancer Institute. Monographs (16):101-12.
Sharp, S. J., and S. G. Thompson. 2000. "Analysing the relationship between treatment effect and underlying risk in meta-analysis: comparison and development of approaches." Statistics in Medicine no. 19 (23):3251-74.
Sharp, S. J., S. G. Thompson, and D. G. Altman. 1996. "The relation between treatment benefit and underlying risk in meta-analysis." BMJ no. 313 (7059):735-8.
Shikata, S., T. Nakayama, Y. Noguchi, Y. Taji, and H. Yamagishi. 2006. "Comparison of effects in randomized controlled trials with observational studies in digestive surgery." Annals of Surgery no. 244 (5):668-76.
Shikora, S. A., R. Bergenstal, M. Bessler, F. Brody, G. Foster, A. Frank, M. Gold, S. Klein, R. Kushner, and D. B. Sarwer. 2009. "Implantable gastric stimulation for the treatment of clinically severe obesity: results of the SHAPE trial." Surgery for Obesity and Related Diseases no. 5 (1):31-7.
Sinclair, J. C., and M. B. Bracken. 1994. "Clinically useful measures of effect in binary analyses of randomized trials." Journal of Clinical Epidemiology no. 47 (8):881-9.
Smith, A. J., D. K. Driman, K. Spithoff, A. Hunter, R. S. McLeod, M. Simunovic, B. Langer, and Expert Panel on Colon and Rectal Cancer Surgery and Pathology. 2010. "Guideline for optimization of colorectal cancer surgery and pathology." Journal of Surgical Oncology no. 101 (1):5-12.
Solomon, M. J., and R. S. McLeod. 1993. "Clinical studies in surgical journals--have we improved?" Diseases of the Colon and Rectum no. 36 (1):43-8.
Stang, A. 2010. "Critical evaluation of the Newcastle-Ottawa scale for the assessment of the quality of nonrandomized studies in meta-analyses." European Journal of Epidemiology no. 25 (9):603-5.
Sterne, J. A., and M. Egger. 2001. "Funnel plots for detecting bias in meta-analysis: guidelines on choice of axis." Journal of Clinical Epidemiology no. 54 (10):1046-55.
Sterne, J. A., P. Juni, K. F. Schulz, D. G. Altman, C. Bartlett, and M. Egger. 2002. "Statistical methods for assessing the influence of study characteristics on treatment effects in 'meta-epidemiological' research." Statistics in Medicine no. 21 (11):1513-24.
Sterne, J. A., A. J. Sutton, J. P. Ioannidis, N. Terrin, D. R. Jones, J. Lau, J. Carpenter, G. Rucker, R. M. Harbord, C. H. Schmid, J. Tetzlaff, J. J. Deeks, J. Peters, P. Macaskill, G. Schwarzer, S. Duval, D. G. Altman, D. Moher, and J. P. Higgins. 2011. "Recommendations for examining and interpreting funnel plot asymmetry in meta-analyses of randomised controlled trials." BMJ no. 343:d4002.
Sung, L., J. Hayden, M. L. Greenberg, G. Koren, B. M. Feldman, and G. A. Tomlinson. 2005. "Seven items were identified for inclusion when reporting a Bayesian analysis of a clinical study." Journal of Clinical Epidemiology no. 58 (3):261-8.
Sutton, A. J., K. R. Abrams, D. R. Jones, T. A. Sheldon, and F. Song. 1998. "Systematic reviews of trials and other studies." Health Technology Assessment no. 2 (19):1-276.
Swaen, G. M., N. Carmichael, and J. Doe. 2011. "Strengthening the reliability and credibility of observational epidemiology studies by creating an Observational Studies Register." Journal of Clinical Epidemiology no. 64 (5):481-6.
Swank, D. J., S. C. Swank-Bordewijk, W. C. Hop, W. F. van Erp, I. M. Janssen, H. J. Bonjer, and J. Jeekel. 2003. "Laparoscopic adhesiolysis in patients with chronic abdominal pain: a blinded randomised controlled multi-centre trial." Lancet no. 361 (9365):1247-51.
Thompson, S. G., T. C. Smith, and S. J. Sharp. 1997. "Investigating underlying risk as a source of heterogeneity in meta-analysis." Statistics in Medicine no. 16 (23):2741-58.
Thompson, S. G., R. M. Turner, and D. E. Warn. 2001. "Multilevel models for meta-analysis, and their application to absolute risk differences." Statistical Methods in Medical Research no. 10 (6):375-92.
Thompson, S. G., and J. Higgins. 2002. "How should meta-regression analyses be undertaken and interpreted?" Statistics in Medicine no. 21 (11):1559-73.
Tierney, Jayne F., and Lesley A. Stewart. 2005. "Investigating patient exclusion bias in meta-analysis." International Journal of Epidemiology no. 34 (1):79-87.
Urbach, D. R., and N. N. Baxter. 2004. "Does it matter what a hospital is 'high volume' for? Specificity of hospital volume-outcome associations for surgical procedures: analysis of administrative data." BMJ no. 328 (7442):737-40.
van Houwelingen, H. C., L. R. Arends, and T. Stijnen. 2002. "Advanced methods in meta-analysis: multivariate approach and meta-regression." Statistics in Medicine no. 21 (4):589-624.
Van Spall, H. G., A. Toren, A. Kiss, and R. A. Fowler. 2007. "Eligibility criteria of randomized controlled trials published in high-impact general medical journals: a systematic sampling review." JAMA no. 297 (11):1233-40.
Varas-Lorenzo, C., L. A. Garcia-Rodriguez, S. Perez-Gutthann, and A. Duque-Oliart. 2000. "Hormone replacement therapy and incidence of acute myocardial infarction. A population-based nested case-control study." Circulation no. 101 (22):2572-8.
Viechtbauer, W. 2010. "Conducting meta-analyses in R with the metafor package." Journal of Statistical Software no. 36 (3).
Viswanathan, M., and N. D. Berkman. 2011. Development of the RTI Item Bank on Risk of Bias and Precision of Observational Studies. Rockville MD.
Walter, C. J., J. C. Dumville, C. E. Hewitt, K. C. Moore, D. J. Torgerson, P. J. Drew, and J. R. Monson. 2007. "The quality of trials in operative surgery." Annals of Surgery no. 246 (6):1104-9.
Walter, S. D. 2000. "Choice of effect measure for epidemiological data." Journal of Clinical Epidemiology no. 53 (9):931-9.
Wandmacher, Cornelius, and A. I. Johnson. 1995. Metric units in engineering--going SI : how to use the international systems of measurement units (SI) to solve standard engineering problems. Rev. ed. New York, N.Y.: ASCE Press.
Wang, J. L., T. T. Sun, Y. W. Lin, R. Lu, and J. Y. Fang. 2011. "Methodological reporting of randomized controlled trials in major hepato-gastroenterology journals in 2008 and 1998: a comparative study." BMC Medical Research Methodology no. 11:110.
Wells, George A., Beverley Shea, Julian PT Higgins, Jonathan Sterne, Peter Tugwell, and Barnaby C Reeves. 2013. "Checklists of methodological issues for review authors to consider when including non-randomized studies in systematic reviews." Research Synthesis Methods no. 4 (1):63-77.
Wente, M. N., C. M. Seiler, W. Uhl, and M. W. Buchler. 2003. "Perspectives of evidence-based surgery." Digestive Surgery no. 20 (4):263-9.
West, S., V. King, T. S. Carey, K. N. Lohr, N. McKoy, S. F. Sutton, and L. Lux. 2002. "Systems to rate the strength of scientific evidence." Evidence Report/Technology Assessment (Summary) (47):1-11.
Wexner, S. D., and S. M. Cohen. 1995. "Port site metastases after laparoscopic colorectal surgery for cure of malignancy." British Journal of Surgery no. 82 (3):295-8.
Williams, R. J., T. Tse, W. R. Harlan, and D. A. Zarin. 2010. "Registration of observational studies: is it time?" CMAJ: Canadian Medical Association Journal no. 182 (15):1638-42.
Wolf, B. R., and J. A. Buckwalter. 2006. "Randomized surgical trials and 'sham' surgery: relevance to modern orthopaedics and minimally invasive surgery." Iowa Orthopaedic Journal no. 26:107-11.
Wood, Lesley, Matthias Egger, Lise Lotte Gluud, Kenneth F. Schulz, Peter Juni, Douglas G. Altman, Christian Gluud, Richard M. Martin, Anthony J. G. Wood, and Jonathan A. C. Sterne. 2008. "Empirical evidence of bias in treatment effect estimates in controlled trials with different interventions and outcomes: meta-epidemiological study." BMJ no. 336 (7644):601-5.
Yamamoto, H., R. W. Hughes, Jr., K. W. Schroeder, T. R. Viggiano, and E. P. DiMagno. 1992. "Treatment of benign esophageal stricture by Eder-Puestow or balloon dilators: a comparison between randomized and prospective nonrandomized trials." Mayo Clinic Proceedings no. 67 (3):228-36.
Yamamoto, M., J. Okuda, K. Tanaka, K. Kondo, K. Asai, H. Kayano, S. Masubuchi, and K. Uchiyama. 2013. "Evaluating the learning curve associated with laparoscopic left hemicolectomy for colon cancer." American Surgeon no. 79 (4):366-71.
Zmora, O., P. Gervaz, and S. D. Wexner. 2001. "Trocar site recurrence in laparoscopic surgery for colorectal cancer: Myth or real concern?" Surgical Endoscopy no. 15 (8):788-93.
Appendix A
Literature Search Strategy for the Development of a Conceptual Framework of Bias in Non-Randomised Studies
Ovid MEDLINE(R) 1946 to January Week 4 2012
# Searches Results
1 scale*.mp. 387231
2 checklist*.mp. 16212
3 check-list*.mp. 2000
4 critic* apprais*.mp. 4672
5 tool*.mp. 306344
6 or/1-5 688068
7 valid*.mp. 332006
8 quality.mp. 560181
9 ((bias* or confounding) and (assess* or measure* or evaluat*)).mp. 67288
10 7 or 8 or 9 904895
11 6 and 10 136206
12 observational stud*.mp. 32638
13 exp Cohort Studies/ 1217162
14 cohort stud*.mp. 171787
15 exp case-control studies/ 579877
16 case control* stud*.mp. 169638
17 Cross-Sectional Studies/ 149188
18 cross sectional stud*.mp. 159276
19 followup stud*.mp. 654
20 follow-up stud*.mp. 470558
21 (nonrandom* adj2 stud*).mp. 2775
22 (non-random* adj2 stud*).mp. 2304
23 or/12-22 1508781
24 11 and 23 28183
25 limit 24 to systematic reviews 1416
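The numbered lines of the strategy combine as sets of records: `or/1-5` is the union of lines 1 to 5, and `6 and 10` is the intersection of lines 6 and 10. The following is a minimal sketch of that logic only. The record identifiers are hypothetical placeholders (not MEDLINE results), the study-design block is abridged to two lines, and the helper `or_range` is an illustrative name, not part of Ovid.

```python
def or_range(results, first, last):
    """Union of search lines first..last, mirroring Ovid's `or/first-last`."""
    combined = set()
    for n in range(first, last + 1):
        combined |= results[n]
    return combined


# Hypothetical record IDs standing in for the hits of each search line.
results = {
    1: {1, 2}, 2: {2, 3}, 3: set(), 4: {4}, 5: {5, 6},  # tool terms (lines 1-5)
    7: {2, 7}, 8: {4, 8}, 9: {5, 9},                    # validity, quality, bias terms
    12: {2, 10}, 13: {4, 5, 11},                        # study-design terms (abridged)
}

results[6] = or_range(results, 1, 5)                 # 6: or/1-5
results[10] = results[7] | results[8] | results[9]   # 10: 7 or 8 or 9
results[11] = results[6] & results[10]               # 11: 6 and 10
results[23] = or_range(results, 12, 13)              # 23: or/12-22 (abridged here)
results[24] = results[11] & results[23]              # 24: 11 and 23

print(sorted(results[24]))
```

A record survives to line 24 only if it matches at least one term in each of the three concept blocks, which is why the final count is far smaller than any single block.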
Appendix B
Literature Search Strategy for the Identification of Comparative Studies Evaluating Laparoscopy versus Conventional Surgery for Colon Cancer
i) MEDLINE
Ovid MEDLINE(R) 1950 to January 31, 2011.
# Searches
Colon, Colonic Diseases, & Colon Cancer Component
1 exp Colon/
2 exp Colonic Diseases/
3 exp Colorectal Neoplasms/
4 (Adenocarcinom: adj3 (colorect: or colon: or rect: or intestine: or large bowel: or bowel: or anal or anus or perianal or peri-anal or circumanal or sigmoid:)).mp.
5 (Adenom: adj3 (colorect: or colon: or rect: or intestine: or large bowel: or bowel: or anal or anus or perianal or peri-anal or circumanal or sigmoid:)).mp.
6 (Cancer: adj3 (colorect: or colon: or rect: or intestine: or large bowel: or bowel: or anal or anus or perianal or peri-anal or circumanal or sigmoid:)).mp.
7 (Carcinom: adj3 (colorect: or colon: or rect: or intestine: or large bowel: or bowel: or anal or anus or perianal or peri-anal or circumanal or sigmoid:)).mp.
8 (Malignan: adj3 (colorect: or colon: or rect: or intestine: or large bowel: or bowel: or anal or anus or perianal or peri-anal or circumanal or sigmoid:)).mp.
9 (Neoplas: adj3 (colorect: or colon: or rect: or intestine: or large bowel: or bowel: or anal or anus or perianal or peri-anal or circumanal or sigmoid:)).mp.
10 (Tumor: adj3 (colorect: or colon: or rect: or intestine: or large bowel: or bowel: or anal or anus or perianal or peri-anal or circumanal or sigmoid:)).mp.
11 (Tumour: adj3 (colorect: or colon: or rect: or intestine: or large bowel: or bowel: or anal or anus or perianal or peri-anal or circumanal or sigmoid:)).mp.
12 or/1-11
13 exp Intestines/
14 exp neoplasms by histologic type/ or exp neoplasms, hormone-dependent/ or exp neoplasms, multiple primary/ or exp neoplasms, post-traumatic/ or exp neoplasms, radiation-induced/ or exp neoplasms, second primary/ or exp neoplastic processes/ or exp neoplastic syndromes, hereditary/
15 13 and 14
16 12 or 15
Laparotomy / Open Surgery Component
17 surgical procedures, elective/
18 exp colectomy/
19 Ostomy/
20 enterostomy/
21 cecostomy/
22 colostomy/
23 ileostomy/
24 jejunostomy/
25 proctocolectomy, restorative/
26 surgically-created structures/
27 colonic pouches/
28 surgical stomas/
29 Laparotomy/
30 laparotom*.mp.
31 minilaparotom*.mp.
32 mini-laparotom*.mp.
33 (open adj3 surg*).mp.
34 (operative adj2 therap*).mp.
35 exp colorectal surgery/
36 General surgery/
37 conventional*.mp.
38 convert*.mp.
39 conversion*.mp.
40 reoperation/
41 suture techniques/
42 surgical stapling/
43 anastomos*.mp.
44 cecostom*.mp.
45 colectom*.mp.
46 coloanal pouch*.mp.
47 colo-anal pouch*.mp.
48 colocolonic.mp.
49 colo-colonic.mp.
50 colostom*.mp.
51 diversion?.mp.
52 enterostom*.mp.
53 hartmann*.mp.
54 hemicolectom*.mp.
55 hemi-colectom*.mp.
56 ileocolic*.mp.
57 ileostom*.mp.
58 "j-pouch*".mp.
59 jejunostom*.mp.
60 (mesorectal* adj2 excis*).mp.
61 ostom*.mp.
62 proctectom*.mp.
63 proctocolectom*.mp.
64 rectosigmoidectom*.mp.
65 recto-sigmoidectom*.mp.
66 sigmoidectom*.mp.
67 (surgical* adj2 approach*).mp.
68 (surgical* adj2 therap*).mp.
69 (digestive adj2 (surgic* or surger*)).mp.
70 surgeon*.mp.
71 su.fs.
72 traditional*.mp.
73 conservative*.mp.
74 or/17-73
Laparoscopy & Related Terms Component
75 exp Laparoscopy/
76 exp laparoscopes/
77 laparoscop:.mp.
78 celioscop*.mp.
79 coelioscop*.mp.
80 peritoneoscop*.mp.
81 Laparoendoscop*.mp.
82 Laparo-endoscop*.mp.
83 Minimal* invasive*.mp.
84 Surgical Procedures, Minimally Invasive/
85 Video-assisted Surgery/
86 or/75-85
87 16 and 74 and 86 Colon + Open Surgery + Laparoscopic Surgery
Limits:
88 limit 87 to humans
89 limit 88 to english language
90 remove duplicates from 89
ii) EMBASE
EMBASE 1980 to January 31, 2011.
# Searches
Colon, Colonic Diseases, & Colon Cancer Component
1 exp Colon/
2 exp Colon Diseases/
3 exp Rectum Cancer/
4 exp Colon Cancer/
5 (Adenocarcinom: adj3 (colorect: or colon: or rect: or intestine: or large bowel: or bowel: or anal or anus or perianal or peri-anal or circumanal or sigmoid:)).mp.
6 (Adenom: adj3 (colorect: or colon: or rect: or intestine: or large bowel: or bowel: or anal or anus or perianal or peri-anal or circumanal or sigmoid:)).mp.
7 (Cancer: adj3 (colorect: or colon: or rect: or intestine: or large bowel: or bowel: or anal or anus or perianal or peri-anal or circumanal or sigmoid:)).mp.
8 (Carcinom: adj3 (colorect: or colon: or rect: or intestine: or large bowel: or bowel: or anal or anus or perianal or peri-anal or circumanal or sigmoid:)).mp.
9 (Malignan: adj3 (colorect: or colon: or rect: or intestine: or large bowel: or bowel: or anal or anus or perianal or peri-anal or circumanal or sigmoid:)).mp.
10 (Metasta* adj3 (colorect: or colon: or rect: or intestine: or large bowel: or bowel: or anal or anus or perianal or peri-anal or circumanal or sigmoid:)).mp.
11 (Neoplas: adj3 (colorect: or colon: or rect: or intestine: or large bowel: or bowel: or anal or anus or perianal or peri-anal or circumanal or sigmoid:)).mp.
12 (Tumor: adj3 (colorect: or colon: or rect: or intestine: or large bowel: or bowel: or anal or anus or perianal or peri-anal or circumanal or sigmoid:)).mp.
13 (Tumour: adj3 (colorect: or colon: or rect: or intestine: or large bowel: or bowel: or anal or anus or perianal or peri-anal or circumanal or sigmoid:)).mp.
14 or/1-13
15 exp Intestine/
16 exp Neoplasm/
17 15 and 16
18 14 or 17
Laparotomy / Open Surgery Component
19 Elective Surgery/
20 exp Colon Surgery/
21 exp Enterostomy/
22 exp Colorectal Surgery/
23 (surgically-created adj2 structure*).mp.
24 Surgical Approach/
25 Laparotomy/
26 laparotom*.mp.
27 minilaparotom*.mp.
28 mini-laparotom*.mp.
29 (open adj3 surg*).mp.
30 (operative adj2 therap*).mp.
31 General surgery/
32 conventional*.mp.
33 convert*.mp.
34 conversion*.mp.
35 reoperation/
36 suturing method/
37 surgical stapling/
38 anastomos*.mp.
39 cecostom*.mp.
40 colectom*.mp.
41 coloanal pouch*.mp.
42 colo-anal pouch*.mp.
43 colocolonic.mp.
44 colo-colonic.mp.
45 colostom*.mp.
46 diversion?.mp.
47 enterostom*.mp.
48 hartmann*.mp.
49 hemicolectom*.mp.
50 hemi-colectom*.mp.
51 ileocolic*.mp.
52 ileostom*.mp.
53 "j-pouch*".mp.
54 jejunostom*.mp.
55 (mesorectal* adj2 excis*).mp.
56 ostom*.mp.
57 proctectom*.mp.
58 proctocolectom*.mp.
59 rectosigmoidectom*.mp.
60 recto-sigmoidectom*.mp.
61 sigmoidectom*.mp.
62 (surgical* adj2 approach*).mp.
63 (surgical* adj2 therap*).mp.
64 (digestive adj2 (surgic* or surger*)).mp.
65 surgeon*.mp.
66 su.fs.
67 traditional*.mp.
68 conservative*.mp.
69 or/19-68
Laparoscopy & Related Terms Component
70 exp Laparoscopy/
71 exp laparoscope/
72 exp Laparoscopic Surgery/
73 Endoscopic Surgery/
74 laparoscop:.mp.
75 celioscop*.mp.
76 coelioscop*.mp.
77 peritoneoscop*.mp.
78 Laparoendoscop*.mp.
79 Laparo-endoscop*.mp.
80 Minimal* invasive*.mp.
81 Minimally Invasive Surgery/
82 (video-assisted adj2 surg*).mp.
83 (videoassisted adj2 surg*).mp.
84 or/70-83
85 18 and 69 and 84 Colon + Open Sx + Laparoscopic Sx
86 limit 85 to humans
87 remove duplicates from 86
88 limit 87 to english language
Appendix C
Criteria for judging risk of bias in the Cochrane Risk of Bias Tool
RANDOM SEQUENCE GENERATION
Selection bias (biased allocation to interventions) due to inadequate generation of a randomised sequence.
Criteria for a judgement of ‘Low risk’ of bias.
The investigators describe a random component in the sequence generation process such as:
• Referring to a random number table;
• Using a computer random number generator;
• Coin tossing;
• Shuffling cards or envelopes;
• Throwing dice;
• Drawing of lots;
• Minimization*.
*Minimization may be implemented without a random element, and this is considered to be equivalent to being random.
Criteria for the judgement of ‘High risk’ of bias.
The investigators describe a non-random component in the sequence generation process. Usually, the description would involve some systematic, non-random approach, for example:
• Sequence generated by odd or even date of birth;
• Sequence generated by some rule based on date (or day) of admission;
• Sequence generated by some rule based on hospital or clinic record number.
Other non-random approaches happen much less frequently than the systematic approaches mentioned above and tend to be obvious. They usually involve judgement or some method of non-random categorization of participants, for example:
• Allocation by judgement of the clinician;
• Allocation by preference of the participant;
• Allocation based on the results of a laboratory test or a series of tests;
• Allocation by availability of the intervention.
ALLOCATION CONCEALMENT
Selection bias (biased allocation to interventions) due to inadequate concealment of allocations prior to assignment.
Criteria for a judgement of ‘Low risk’ of bias.
Participants and investigators enrolling participants could not foresee assignment because one of the following, or an equivalent method, was used to conceal allocation:
• Central allocation (including telephone, web-based and pharmacy-controlled randomization);
• Sequentially numbered drug containers of identical appearance;
• Sequentially numbered, opaque, sealed envelopes.
Criteria for the judgement of ‘High risk’ of bias.
Participants or investigators enrolling participants could possibly foresee assignments and thus introduce selection bias, such as allocation based on:
• Using an open random allocation schedule (e.g. a list of random numbers);
• Assignment envelopes were used without appropriate safeguards (e.g. if envelopes were unsealed or nonopaque or not sequentially numbered);
• Alternation or rotation;
• Date of birth;
• Case record number;
• Any other explicitly unconcealed procedure.
Criteria for the judgement of ‘Unclear risk’ of bias.
Insufficient information to permit judgement of ‘Low risk’ or ‘High risk’. This is usually the case if the method of concealment is not described or not described in sufficient detail to allow a definite judgement – for example if the use of assignment envelopes is described, but it remains unclear whether envelopes were sequentially numbered, opaque and sealed.
BLINDING OF PARTICIPANTS AND PERSONNEL
Performance bias due to knowledge of the allocated interventions by participants and personnel during the study.
Criteria for a judgement of ‘Low risk’ of bias.
Any one of the following:
• No blinding or incomplete blinding, but the review authors judge that the outcome is not likely to be influenced by lack of blinding;
• Blinding of participants and key study personnel ensured, and unlikely that the blinding could have been broken.
Criteria for the judgement of ‘High risk’ of bias.
Any one of the following:
• No blinding or incomplete blinding, and the outcome is likely to be influenced by lack of blinding;
• Blinding of key study participants and personnel attempted, but likely that the blinding could have been broken, and the outcome is likely to be influenced by lack of blinding.
Criteria for the judgement of ‘Unclear risk’ of bias.
Any one of the following:
• Insufficient information to permit judgement of ‘Low risk’ or ‘High risk’;
• The study did not address this outcome.
BLINDING OF OUTCOME ASSESSMENT
Detection bias due to knowledge of the allocated interventions by outcome assessors.
Criteria for a judgement of ‘Low risk’ of bias.
Any one of the following:
• No blinding of outcome assessment, but the review authors judge that the outcome measurement is not likely to be influenced by lack of blinding;
• Blinding of outcome assessment ensured, and unlikely that the blinding could have been broken.
Criteria for the judgement of ‘High risk’ of bias.
Any one of the following:
• No blinding of outcome assessment, and the outcome measurement is likely to be influenced by lack of blinding;
• Blinding of outcome assessment, but likely that the blinding could have been broken, and the outcome measurement is likely to be influenced by lack of blinding.
Criteria for the judgement of ‘Unclear risk’ of bias.
Any one of the following:
• Insufficient information to permit judgement of ‘Low risk’ or ‘High risk’;
• The study did not address this outcome.
INCOMPLETE OUTCOME DATA
Attrition bias due to amount, nature or handling of incomplete outcome data.
Criteria for a judgement of ‘Low risk’ of bias.
Any one of the following:
• No missing outcome data;
• Reasons for missing outcome data unlikely to be related to true outcome (for survival data, censoring unlikely to be introducing bias);
• Missing outcome data balanced in numbers across intervention groups, with similar reasons for missing data across groups;
• For dichotomous outcome data, the proportion of missing outcomes compared with observed event risk not enough to have a clinically relevant impact on the intervention effect estimate;
• For continuous outcome data, plausible effect size (difference in means or standardized difference in means) among missing outcomes not enough to have a clinically relevant impact on observed effect size;
• Missing data have been imputed using appropriate methods.
Criteria for the judgement of ‘High risk’ of bias.
Any one of the following:
• Reason for missing outcome data likely to be related to true outcome, with either imbalance in numbers or reasons for missing data across intervention groups;
• For dichotomous outcome data, the proportion of missing outcomes compared with observed event risk enough to induce clinically relevant bias in intervention effect estimate;
• For continuous outcome data, plausible effect size (difference in means or standardized difference in means) among missing outcomes enough to induce clinically relevant bias in observed effect size;
• ‘As-treated’ analysis done with substantial departure of the intervention received from that assigned at randomization;
• Potentially inappropriate application of simple imputation.
Criteria for the judgement of ‘Unclear risk’ of bias.
Any one of the following:
• Insufficient reporting of attrition/exclusions to permit judgement of ‘Low risk’ or ‘High risk’ (e.g. number randomized not stated, no reasons for missing data provided);
• The study did not address this outcome.
SELECTIVE REPORTING
Reporting bias due to selective outcome reporting.
Criteria for a judgement of ‘Low risk’ of bias.
Any of the following:
• The study protocol is available and all of the study’s pre-specified (primary and secondary) outcomes that are of interest in the review have been reported in the pre-specified way;
• The study protocol is not available but it is clear that the published reports include all expected outcomes, including those that were pre-specified (convincing text of this nature may be uncommon).
Criteria for the judgement of ‘High risk’ of bias.
Any one of the following:
• Not all of the study’s pre-specified primary outcomes have been reported;
• One or more primary outcomes is reported using measurements, analysis methods or subsets of the data (e.g. subscales) that were not pre-specified;
• One or more reported primary outcomes were not pre-specified (unless clear justification for their reporting is provided, such as an unexpected adverse effect);
• One or more outcomes of interest in the review are reported incompletely so that they cannot be entered in a meta-analysis;
• The study report fails to include results for a key outcome that would be expected to have been reported for such a study.
Criteria for the judgement of ‘Unclear risk’ of bias.
Insufficient information to permit judgement of ‘Low risk’ or ‘High risk’. It is likely that the majority of studies will fall into this category.
OTHER BIAS
Bias due to problems not covered elsewhere in the table.
Criteria for a judgement of ‘Low risk’ of bias.
The study appears to be free of other sources of bias.
Criteria for the judgement of ‘High risk’ of bias.
There is at least one important risk of bias. For example, the study:
• Had a potential source of bias related to the specific study design used; or
• Has been claimed to have been fraudulent; or
• Had some other problem.
Criteria for the judgement of ‘Unclear risk’ of bias.
There may be a risk of bias, but there is either:
• Insufficient information to assess whether an important risk of bias exists; or
• Insufficient rationale or evidence that an identified problem will introduce bias.
Adapted from Table 8.5.d, Higgins JPT, Altman DG, Sterne JAC (editors). Chapter 8: Assessing risk of bias in included studies. In: Higgins JPT, Green S (editors). Cochrane Handbook for Systematic Reviews of Interventions Version 5.1.0 (updated March 2011). The Cochrane Collaboration, 2011. Available from www.cochrane-handbook.org.
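The domain-level judgements above feed the trial-level labels used in this thesis (Strong RCTs at low risk of bias, Typical RCTs at unclear or high risk of bias). The sketch below shows one way such a label could be derived. The aggregation rule, that a trial is Strong only when every domain is judged low risk, is an assumption made for illustration and is not stated in this appendix; the names `Risk`, `DOMAINS` and `classify_trial` are likewise hypothetical.

```python
from enum import Enum


class Risk(Enum):
    LOW = "low"
    UNCLEAR = "unclear"
    HIGH = "high"


# The seven domains listed in the table above.
DOMAINS = (
    "random_sequence_generation",
    "allocation_concealment",
    "blinding_of_participants_and_personnel",
    "blinding_of_outcome_assessment",
    "incomplete_outcome_data",
    "selective_reporting",
    "other_bias",
)


def classify_trial(judgements):
    """Label a trial from its per-domain judgements.

    Assumed rule (not taken from the source): 'Strong RCT' only if all
    domains are low risk; any unclear or high judgement gives 'Typical RCT'.
    """
    missing = [d for d in DOMAINS if d not in judgements]
    if missing:
        raise ValueError(f"missing judgements for: {missing}")
    if all(judgements[d] is Risk.LOW for d in DOMAINS):
        return "Strong RCT"
    return "Typical RCT"


all_low = {d: Risk.LOW for d in DOMAINS}
one_unclear = dict(all_low, allocation_concealment=Risk.UNCLEAR)
print(classify_trial(all_low), "/", classify_trial(one_unclear))
```

A stricter or looser rule (for example, restricting attention to a subset of key domains) would only change the condition inside `classify_trial`.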
Appendix D
Comparative Studies of Laparoscopy versus Conventional Surgery for Colon Cancer Meeting a priori Exclusion Criteria
i) Systematic Reviews and Meta-Analyses
1. Abraham NS, Byrne CJ, Young JM, Solomon MJ. Meta-analysis of well-designed nonrandomized comparative studies of surgical procedures is as good as randomized controlled trials. J Clin Epidemiol. Mar 2010;63(3):238-245.
2. Abraham NS, Byrne CM, Young JM, Solomon MJ. Meta-analysis of non-randomized comparative studies of the short-term outcomes of laparoscopic resection for colorectal cancer.[see comment]. ANZ Journal of Surgery. 2007;77(7):508-516.
3. Abraham NS, Young JM, Solomon MJ. Meta-analysis of short-term outcomes after laparoscopic resection for colorectal cancer.[see comment]. British Journal of Surgery. 2004;91(9):1111-1124.
4. Angst E, Hiatt JR, Gloor B, Reber HA, Hines OJ. Laparoscopic surgery for cancer: A systematic review and a way forward. J Am Coll Surg. September 2010;211(3):412-423.
5. Bai HL, Chen B, Zhou Y, Wu XT. Five-year long-term outcomes of laparoscopic surgery for colon cancer. World Journal of Gastroenterology. October 21 2010;16(39):4992-4997.
6. Bonjer HJ, Hop WCJ, Nelson H, et al. Laparoscopically assisted vs open colectomy for colon cancer: A meta-analysis. Archives of Surgery. Mar 2007;142(3):298-303.
7. Chan M. Erratum: Systematic review on the short term outcome of laparoscopic resection for colon and rectosigmoid cancer [5]. Colorectal Disease. Mar 2008;10(3):305-306.
8. Chapman AE, Levitt MD, Hewett P, Woods R, Sheiner H, Maddern GJ. Laparoscopic-assisted resection of colorectal malignancies: a systematic review. Annals of Surgery. 2001;234(5):590-606.
9. Coratti F, Coratti A, Malatesti R, Testi W, Tani F. [Laparoscopic versus open resection for colorectal cancer: meta-analysis of the chief trials]. G Chir. Aug-Sep 2009;30(8-9):377-384.
10. Dowson HM, Cowie AS, Ballard K, Gage H, Rockall TA. Systematic review of quality of life following laparoscopic and open colorectal surgery. Colorectal Disease. 2008;10(8):757-768.
11. Dowson HM, Huang A, Soon Y, Gage H, Lovell DP, Rockall TA. Systematic review of the costs of laparoscopic colorectal surgery. Dis Colon Rectum. 2007;50(6):908-919.
12. Fingerhut A, Ata T, Chouillard E, Alexakis N, Veyrie N. Laparoscopic approach to colonic cancer: critical appraisal of the literature. Digestive Diseases. 2007;25(1):33-43.
13. Gervaz P, Pikarsky A, Utech M, et al. Converted laparoscopic colorectal surgery. Surg Endosc. 2001;15(8):827-832.
14. Gujral S, Avery KNL, Blazeby JM. Quality of life after surgery for colorectal cancer: clinical implications of results from randomised trials. Supportive Care in Cancer. 2008;16(2):127-132.
15. Hayes JL, Hansen P. Is laparoscopic colectomy for cancer cost-effective relative to open colectomy? ANZ Journal of Surgery. 2007;77(9):782-786.
16. Hernandez RA, De Verteuil RM, Fraser CM, Vale LD. Systematic review of economic evaluations of laparoscopic surgery for colorectal cancer. Colorectal Disease. 2008;10(9):859-868.
17. Hildebrandt U, Kreissler-Haag D, Lindemann W. [Laparoscopy-assisted colorectal resections: morbidity, conversions, outcomes of a decade]. Zentralbl Chir. 2001;126(4):323-332.
18. Jackson TD, Kaplan GG, Arena G, Page JH, Rogers Jr SO. Laparoscopic Versus Open Resection for Colorectal Cancer: A Metaanalysis of Oncologic Outcomes. J Am Coll Surg. Mar 2007;204(3):439-446.
19. Jun L. Systematic review of laparoscopic versus open surgery for colorectal cancer (Br J Surg 2006; 93; 921-928).[comment]. British Journal of Surgery. 2007;94(2):250; author reply 250.
20. Kahnamoui K, Cadeddu M, Farrokhyar F, Anvari M. Laparoscopic surgery for colon cancer: a systematic review. Can J Surg. 2007;50(1):48-57.
21. Kaido T. Current evidence supporting indications for laparoscopic surgery in colorectal cancer. Hepato-Gastroenterology. 2008;55(82-83):438-441.
22. Kehlet H. Systematic review of laparoscopic versus open surgery for colorectal cancer (Br J Surg 2006; 93: 921-928).[comment]. British Journal of Surgery. 2006;93(11):1434-1435.
23. Korolija D, Tadic S, Simic D. Extent of oncological resection in laparoscopic vs. open colorectal surgery: meta-analysis. Langenbecks Arch Surg. 2003;387(9-10):366-371.
24. Kuhry E, Schwenk W, Gaupset R, Romild U, Bonjer J. Long-term outcome of laparoscopic surgery for colorectal cancer: a cochrane systematic review of randomised controlled trials. Cancer Treatment Reviews. 2008;34(6):498-504.
25. Kuhry E, Schwenk WF, Gaupset R, Romild U, Bonjer HJ. Long-term results of laparoscopic colorectal cancer resection. Cochrane Database of Systematic Reviews. 2008(2):CD003432.
26. Kurian MS, Patterson E, Andrei VE, Edye MB. Hand-assisted laparoscopic surgery: an emerging technique. Surg Endosc. 2001;15(11):1277-1281.
27. Li J, Ding K-f, Zhang S-z. [Meta-analysis of short-term efficacy and safety after laparoscopic resection for colorectal cancer]. Chung Hua I Hsueh Tsa Chih. 2006;86(35):2485-2490.
28. Liang Y, Li G, Chen P, Yu J. Laparoscopic versus open colorectal resection for cancer: a meta-analysis of results of randomized controlled trials on recurrence. European Journal of Surgical Oncology. 2008;34(11):1217-1224.
29. Liang Y-c, Li G-x, Chen P-y, Yu J, Zhang C. [Laparoscopic versus conventional open resection for colorectal cancer: a meta-analysis on recurrence]. Zhonghua Wei Chang Wai Ke Za Zhi. Sep 2008;11(5):414-420.
30. Lourenco T, Murray A, Grant A, McKinley A, Krukowski Z, Vale L. Laparoscopic surgery for colorectal cancer: Safe and effective? - A systematic review. Surg Endosc. May 2008;22(5):1146-1160.
31. Lovett BE, Taylor I. Randomized controlled trials in colorectal disease; a review of recent trials. Colorectal Disease. 2001;3(1):58-64.
32. Luck A, Hensman C, Hewett P. Laparoscopic colectomy for cancer: a review. Australian & New Zealand Journal of Surgery. 1998;68(5):318-327.
33. Manterola C, Pineda V, Vial M. [Open versus laparoscopic resection in non-complicated colon cancer. A systematic review]. Cir Esp. 2005;78(1):28-33.
34. Martel G, Boushey RP, Marcello PW. Results of the Laparoscopic Colon Cancer Randomized Trials: An Evidence-Based Review. Seminars in Colon and Rectal Surgery. Dec 2007;18(4):210-219.
35. Maxwell-Armstrong CA, Robinson MH, Scholefield JH. Laparoscopic colorectal cancer surgery. American Journal of Surgery. 2000;179(6):500-507.
36. McLeod RS, Stern H, McKenzie ME. Canadian Association of General Surgeons Evidence Based Reviews in Surgery. 10. Laparoscopy-assisted colectomy versus open colectomy for treatment of nonmetastatic colon cancer: A randomized trial. Can J Surg. Jun 2004;47(3):209-211.
37. Moloo H, Haggar F, Grimshaw JM, et al. Hand assisted laparoscopic surgery versus conventional laparoscopy for colorectal surgery. Cochrane Database of Systematic Reviews. 2007(3):CD006585.
38. Murray A, Lourenco T, de Verteuil R, et al. Clinical effectiveness and cost-effectiveness of laparoscopic surgery for colorectal cancer: systematic reviews and economic evaluation. Health Technology Assessment. 2006;10(45):1-141, iii-iv.
39. Noel JK, Fahrbach K, Estok R, et al. Minimally Invasive Colorectal Resection Outcomes: Short-term Comparison with Open Procedures. J Am Coll Surg. Feb 2007;204(2):291-307.
40. Reza MM, Blasco JA, Andradas E, Cantero R, Mayol J. Systematic review of laparoscopic versus open surgery for colorectal cancer.[see comment]. British Journal of Surgery. 2006;93(8):921-928.
41. Sammour T, Kahokehr A, Chan S, Booth RJ, Hill AG. The humoral response after laparoscopic versus open colorectal surgery: a meta-analysis. J Surg Res. Nov 2010;164(1):28-37.
42. Sammour T, Kahokehr A, Connolly AB, Bissett IP, Hill AG. Does laparoscopic colectomy have a higher intraoperative complication rate than open colectomy? Annals of Surgery. March 2010;251(3):577-578.
43. Sato K, Adachi Y, Kitano S. [Trends in laparoscopic surgery for colorectal cancer: 10-year experience worldwide]. Nippon Geka Gakkai Zasshi. 2001;Journal of Japan Surgical Society. 102(2):236-242.
44. Schaeff B, Paolucci V, Thomopoulos J. Port site recurrences after laparoscopic surgery. A review. Digestive Surgery. 1998;15(2):124-134.
45. Schwenk W, Haase O, Neudecker J, Muller JM. Short term benefits for laparoscopic colorectal resection. Cochrane Database of Systematic Reviews. 2005(3):CD003145.
46. Stocchi L, Nelson H. Wound recurrences following laparoscopic-assisted colectomy for cancer. Archives of Surgery. 2000;135(8):948-958.
47. Tilney HS, Lovegrove RE, Purkayastha S, Heriot AG, Darzi AW, Tekkis PP. Laparoscopic vs open subtotal colectomy for benign and malignant disease. Colorectal Disease. Jun 2006;8(5):441-450.
48. Tjandra JJ, Chan MKY. Systematic review on the short-term outcome of laparoscopic resection for colon and rectosigmoid cancer. Colorectal Disease. Jun 2006;8(5):375-388.
49. Vlug MS, Wind J, van der Zaag E, Ubbink DT, Cense HA, Bemelman WA. Systematic review of laparoscopic vs open colonic surgery within an enhanced recovery programme. Colorectal Disease. 2009;11(4):335-343.
50. Yamamoto S, Fujita S, Ishiguro S, Akasu T, Moriya Y. Wound infection after a laparoscopic resection for colorectal cancer. Surgery Today. 2008;38(7):618-622.
ii) Biochemical Outcomes
1. Belizon A, Balik E, Feingold DL, et al. Major abdominal surgery increases plasma levels of vascular endothelial growth factor: Open more so than minimally invasive methods. Annals of Surgery. Nov 2006;244(5):792-798.
2. Bessa X, Castells A, Lacy AM, et al. Laparoscopic-assisted vs. open colectomy for colorectal cancer: influence on neoplastic cell mobilization. J Gastrointest Surg. 2001;5(1):66-73.
3. Bono A, Bianchi PP, Locatelli A, et al. Angiogenic cells, macroparticles and RNA transcripts in laparoscopic vs. open surgery for colorectal cancer. Cancer Biology and Therapy. 01 Oct 2010;10(7):682-685.
4. Braga M, Vignali A, Zuliani W, et al. Metabolic and functional results after laparoscopic colorectal surgery: a randomized, controlled trial. Dis Colon Rectum. 2002;45(8):1070-1077.
5. Brokelman W, Holmdahl L, Falk P, Klinkenbijl J, Reijnen M. The peritoneal fibrinolytic response to conventional and laparoscopic colonic surgery. J Laparoendosc Adv Surg Tech A. Aug 2009;Part A. 19(4):489-493.
6. Buchmann P, Christen D, Moll C, Flury R. [Intraperitoneal tumor seeding in colorectal carcinoma surgery--a comparison of laparoscopic versus open procedures in a longitudinal study]. Langenbecks Archiv fur Chirurgie - Supplement - Kongressband. 1996;113:573-576.
7. Buchmann P, Christen D, Moll C, Flury R, Sartoretti C. [Tumor cells in peritoneal irrigation fluid in conventional and laparoscopic surgery for colorectal carcinoma]. Swiss Surgery. 1996;Suppl 4:45-49.
8. Delgado S, Lacy AM, Filella X, et al. Acute phase response in laparoscopic and open colectomy in colon cancer: randomized study. Dis Colon Rectum. 2001;44(5):638-646.
9. Evans C, Galustian C, Kumar D, et al. Impact of surgery on immunologic function: comparison between minimally invasive techniques and conventional laparotomy for surgical resection of colorectal tumors. American Journal of Surgery. 2009;197(2):238-245.
10. Fukushima R, Kawamura YJ, Saito H, et al. Interleukin-6 and stress hormone responses after uncomplicated gasless laparoscopic-assisted and open sigmoid colectomy. Dis Colon Rectum. 1996;39(10 Suppl):S29-34.
11. Gelpi JR, Dorsey-Tyler K, Luchtefeld MA, Senagore AJ. Prospective comparison of gastric emptying after laparoscopic-aided colectomy versus open colectomy. American Surgeon. 1996;62(7):594-596; discussion 596-597.
12. Hammer JH, Basse L, Svendsen MN, et al. Impact of elective resection on plasma TIMP-1 levels in patients with colon cancer. Colorectal Disease. 2006;8(3):168-172.
13. Han S-A, Lee WY, Park C-M, Yun SH, Chun H-K. Comparison of immunologic outcomes of laparoscopic vs open approaches in clinical stage III colorectal cancer. International Journal of Colorectal Disease. May 2010;25(5):631-638.
14. Hewitt PM, Ip SM, Kwok SPY, et al. Laparoscopic-assisted vs. open surgery for colorectal cancer: Comparative study of immune effects. Diseases of the Colon and Rectum. Jul 1998;41(7):901-909.
15. Hildebrandt U, Kessler K, Plusczyk T, Pistorius G, Vollmar B, Menger MD. Comparison of surgical stress between laparoscopic and open colonic resections. Surg Endosc. 2003;17(2):242-246.
16. Hu X, Li H-z, Zhang J, An W-d, Zhang C-j. [Evaluation of the minimal invasiveness of laparoscopic operation for colorectal carcinoma]. Zhonghua Wei Chang Wai Ke Za Zhi. Sep 2005;8(5):404-406.
17. Jingli C, Rong C, Rubai X. Influence of colorectal laparoscopic surgery on dissemination and seeding of tumor cells. Surg Endosc. 2006;20(11):1759-1761.
18. Kim SH, Milsom JW, Gramlich TL, et al. Does laparoscopic vs. conventional surgery increase exfoliated cancer cells in the peritoneal cavity during resection of colorectal cancer? Dis Colon Rectum. 1998;41(8):971-978.
19. Kirman I, Cekic V, Poltaratskaia N, et al. Plasma from patients undergoing major open surgery stimulates in vitro tumor growth: Lower insulin-like growth factor binding protein 3 levels may, in part, account for this change. Surgery. 2002;132(2):186-192.
20. Kirman I, Cekic V, Poltaratskaia N, et al. The percentage of CD31+ T cells decreases after open but not laparoscopic surgery. Surg Endosc. 01 2003;17(5):754-757.
21. Kirman I, Cekic V, Poltoratskaia N, et al. Open surgery induces a dramatic decrease in circulating intact IGFBP-3 in patients with colorectal cancer not seen with laparoscopic surgery. Surg Endosc. 2005;19(1):55-59.
22. Kirman I, Poltaratskaia N, Cekic V, et al. Depletion of circulating insulin-like growth factor binding protein 3 after open surgery is associated with high interleukin-6 levels. Dis Colon Rectum. 2004;47(6):911-917; discussion 917-918.
23. Leung KL, Lai PB, Ho RL, et al. Systemic cytokine response after laparoscopic-assisted resection of rectosigmoid carcinoma: A prospective randomized trial. Annals of Surgery. 2000;231(4):506-511.
24. Leung KL, Tsang KS, Ng MHL, et al. Lymphocyte subsets and natural killer cell cytotoxicity after laparoscopically assisted resection of rectosigmoid carcinoma. Surg Endosc. 2003;17(8):1305-1310.
25. Mehigan BJ, Hartley JE, Drew PJ, et al. Changes in T cell subsets, interleukin-6, and C-reactive protein after laparoscopic and open colorectal resection for malignancy. Surg Endosc. 2001;15(11):1289-1293.
26. Neudecker J, Junghans T, Raue W, Ziemer S, Schwenk W. Fibrinolytic capacity in peritoneal fluid after laparoscopic and conventional colorectal resection: data from a randomized controlled trial.[see comment]. Langenbecks Arch Surg. 2005;390(6):523-527.
27. Neudecker J, Junghans T, Ziemer S, Raue W, Schwenk W. Effect of laparoscopic and conventional colorectal resection on peritoneal fibrinolytic capacity: a prospective randomized clinical trial. International Journal of Colorectal Disease. 2002;17(6):426-429.
28. Neudecker J, Junghans T, Ziemer S, Raue W, Schwenk W. Prospective randomized trial to determine the influence of laparoscopic and conventional colorectal resection on intravasal fibrinolytic capacity. Surg Endosc. 2003;17(1):73-77.
29. Neudecker J, Neudecker BA, Raue W, Stern R, Schwenk W. Hyaluronan levels during laparoscopic versus open colonic resections. Surg Endosc. 2008;22(3):660-663.
30. Ordemann J, Jacobi CA, Schwenk W, Stosslein R, Muller JM. Cellular and humoral inflammatory response after laparoscopic and conventional colorectal resections. Surg Endosc. 2001;15(6):600-608.
31. Ozawa A, Konishi F, Nagai H, Okada M, Kanazawa K. Cytokine and hormonal responses in laparoscopic-assisted colectomy and conventional open colectomy. Surgery Today. 2000;30(2):107-111.
32. Schwenk W, Jacobi C, Mansmann U, Bohm B, Muller JM. Inflammatory response after laparoscopic and conventional colorectal resections - results of a prospective randomized trial. Langenbecks Arch Surg. 2000;385(1):2-9.
33. Sietses C, Havenith CEG, Eijsbouts QAJ, et al. Laparoscopic surgery preserves monocyte-mediated tumor cell killing in contrast to the conventional approach. Surg Endosc. May 2000;14(5):456-460.
34. Svendsen MN, Werther K, Christensen IJ, Basse L, Nielsen HJ. Influence of open versus laparoscopically assisted colectomy on soluble vascular endothelial growth factor (sVEGF) and its soluble receptor 1 (sVEGFR1). Inflammation Research. 2005;54(11):458-463.
35. Tan M, Xu FF, Peng JS, et al. Changes in the level of serum liver enzymes after laparoscopic surgery. World Journal of Gastroenterology. 2003;9(2):364-367.
36. Tang CL, Eu KW, Tai BC, Soh JGS, Machin D, Seow-Choen F. Randomized clinical trial of the effect of open versus laparoscopically assisted colectomy on systemic immunity in patients with colorectal cancer. British Journal of Surgery. 2001;88(6):801-807.
37. Vignali A, Di Palo S, Orsenigo E, Ghirardelli L, Radaelli G, Staudacher C. Effect of prednisolone on local and systemic response in laparoscopic vs. open colon surgery: a randomized, double-blind, placebo-controlled trial. Dis Colon Rectum. 2009;52(6):1080-1088.
38. Voloshin T, Gingis-Velitski S, Shaked Y. The angiogenic profile of colorectal cancer patients following open or laparoscopic colectomy. Cancer Biology and Therapy. Oct 2010;10(7):686-688.
39. Whelan RL, Franklin M, Holubar SD, et al. Postoperative cell mediated immune response is better preserved after laparoscopic vs open colorectal resection in humans. Surg Endosc. 2003;17(6):972-978.
40. Wichmann MW, Huttl TP, Winter H, et al. Immunological effects of laparoscopic vs open colorectal surgery. A prospective clinical study. Archives of Surgery. Jul 2005;140(7):692-697.
41. Wu FPK, Hoekman K, Sietses C, et al. Systemic and peritoneal angiogenic response after laparoscopic or conventional colon resection in cancer patients: a prospective, randomized trial. Dis Colon Rectum. 2004;47(10):1670-1674.
42. Wu FPK, Sietses C, von Blomberg BME, van Leeuwen PAM, Meijer S, Cuesta MA. Systemic and peritoneal inflammatory response after laparoscopic or conventional colon resection in cancer patients: a prospective, randomized trial. Dis Colon Rectum. 2003;46(2):147-155.
43. Zhao G, Xiao G, Huang M-x, Long H-k. [Effect of laparoscopic radical operation on systemic immunity in patients with colorectal cancer]. Zhonghua Wei Chang Wai Ke Za Zhi. Sep 2005;8(5):407-409.
i) Non-English Articles
1. Baccari P, Di Palo S, Redaelli A, Carlucci M, Staudacher C. [Laparoscopic versus conventional surgery in the treatment of colorectal diseases]. Chir Ital. 2000;52(1):17-27.
2. Bohm B, Schwenk W, Grundel K, Junghans T, Muller JM. [Value of laparoscopic technique in primary colorectal carcinoma]. Chirurg. 1997;68(3):231-236.
3. Brummer S, Sohr D, Ruden H, Gastmeier P. [Surgical site infection rates using a laparoscopic approach: results of the German national nosocomial infections surveillance system]. Chirurg. 2007;78(10):910-914.
4. Buchmann P, Bischofberger U, De Lorenzi D, Christen D. [Early postoperative nutrition after laparoscopic and open colorectal resection]. Swiss Surgery. 1998;4(3):146-155.
5. Buchmann P, Christen D, Buschta G, Sartoretti C. [Intraperitoneal tumor seeding in colorectal carcinoma surgery--follow-up of a comparison of laparoscopic versus open procedure]. Langenbecks Archiv fur Chirurgie - Supplement - Kongressband. 1997;114:1122-1124.
6. Buchmann P, Christen D, Flury R, Luthy A, Bischofberger U. [Does laparoscopic colonic carcinoma surgery satisfy the radicality criteria of open surgery?]. Schweizerische Medizinische Wochenschrift. Journal Suisse de Medecine. 1995;125(39):1825-1829.
7. Chi P, Lin H-m, Chen Y-c, Xu Z-b. [Feasibility of lymphadenectomy with skeletonization in extended right hemicolectomy by hand-assisted laparoscopic surgery]. Zhonghua Wei Chang Wai Ke Za Zhi. Sep 2005;8(5):410-412.
8. Chi P, Lin H-m, Xu Z-b. [Comparison of surgical complication rate between laparoscopic and open radical resection for colorectal cancer]. Zhonghua Wei Chang Wai Ke Za Zhi. May 2006;9(3):221-224.
9. Frasson M, Braga M, Vignali A, et al. [Laparoscopic-assisted versus open surgery for colorectal cancer: postoperative morbidity in a single center randomized trial]. Minerva Chir. 2006;61(4):283-292.
10. Gong T, Wang T. Laparoscopic surgery for colorectal cancer. [Chinese]. World Chinese Journal of Digestology. Jul 2010;18(20):2121-2126.
11. Gutt CN, Hanisch E. [Laparoscopic resection in comparison with open resection of adenocarcinoma of the colon]. Zeitschrift fur Gastroenterologie. 1998;36(5):471-473.
12. Habr-Gama A, de Silva e Souza Junior AH, Araujo SE. [Videolaparoscopic access in the surgical treatment of colorectal cancer: critical analysis]. Revista Da Associacao Medica Brasileira. 1997;43(4):352-356.
13. Innocenti P, Aceto L, Di Bartolomeo N, et al. [Conventional versus laparoscopic surgery in tumors of the colon]. Supplementi di Tumori: Official Journal of Societa Italiana di Cancerologia. 2002;1(3):S1-4.
14. Junghans T, Raue W, Haase O, Neudecker J, Schwenk W. [Value of laparoscopic surgery in elective colorectal surgery with "fast-track"-rehabilitation]. Zentralbl Chir. 2006;131(4):298-303.
15. Kohler L, Eypasch E, Holthausen U, Troidl H. [Laparoscopic colon resection in carcinoma--beneficial or not?]. Langenbecks Archiv fur Chirurgie - Supplement - Kongressband. 1996;113:577-579.
16. Kohler L, Holthausen U, Troidl H. [Laparoscopic colorectal surgery--attempt at evaluating a new technology].[see comment]. Chirurg. 1997;68(8):794-800; discussion 800.
17. Konishi F, Nagai H, Kanazawa K. Laparoscopic colectomy for colorectal carcinomas. [Japanese]. Japanese Journal of Gastroenterological Surgery. 1999;32(8):2172-2176.
18. Kruger IM, Nilius J, Krings F, Bullermann C. [Analysis of the cost-income ratio for open and laparoscopic sigmoid resection]. Zentralbl Chir. 2004;129(4):285-290.
19. Kube R, Ptok H, Steinert R, et al. [Clinical value of laparoscopic surgery for colon cancer]. Chirurg. 2008;79(12):1145-1150.
20. Kube R, Ptok H, Steinert R, et al. Clinical value of laparoscopic surgery for colon cancer. [German]. Chirurg. December 2008;79(12):1145-1150.
21. Kuhry E, Saetnan E, Graeslie H, Gaupset R. [Laparoscopic surgery for colorectal cancer]. Tidsskrift for Den Norske Laegeforening. 2007;127(22):2946-2949.
22. Mao Z-h, Chen H-z, Li J-w, et al. [Comparison of inflammatory response after laparoscopic and conventional surgery for colorectal carcinoma]. Zhonghua Wei Chang Wai Ke Za Zhi. Jul 2006;9(4):297-300.
23. Martinek L, Dostalik J, Gunka I, Gunkova P, Vavra P. [Comparison of oncological outcomes between laparoscopic and open procedures in non-metastazing colonic carcinomas]. Rozhl Chir. Dec 2009;88(12):725-729.
24. Martinek L, Dostalik J, Vavra P, Gunikova P, Gunka I. [Implementation of POSSUM scoring system in assessing morbidity after laparoscopic colorectal surgery]. Rozhl Chir. 2008;87(1):26-31.
25. Mou Y-p, Yang P, Yan J-f, et al. [Clinical evaluation of laparoscopic radical resection of colon cancer]. Chung Hua Wai Ko Tsa Chih. 2006;44(9):581-583.
26. Pommergaard H-C, Olsen JA, Burgdorf SK, Achiam MP. [Laparoscopic versus right-sided hemicolectomy in cancer of colon therapy]. Ugeskr Laeger. Mar 29 2010;172(13):1034-1038.
27. Procacciante F, Flati D, Diamantini G, et al. [Severe postoperative complications in colorectal surgery for cancer. Incidence related to the techniques employed: open versus laparoscopic colectomy]. Chir Ital. 2008;60(3):329-336.
28. Ptok H, Steinert R, Meyer F, et al. [Long-term oncological results after laparoscopic, converted and primary open procedures for rectal carcinoma. Results of a multicenter observational study]. Chirurg. 2006;77(8):709-717.
29. Qian L-y, Wu J-h, Chen D-j, Li X-r, Wei Y-s. [Comparative study on long-term results of laparoscopic and open radical resection for colorectal carcinoma]. Zhonghua Wei Chang Wai Ke Za Zhi. Jul 2006;9(4):294-296.
30. Ramacciato G, D'Angelo F, Aurello P, et al. [Right hemicolectomy for colon cancer: a prospective randomised study comparing laparoscopic vs. open technique]. Chir Ital. 2008;60(1):1-7.
31. Sazhin VP, Gostkin PA, Soboleva VI, Siatkin DA, Sazhin IV, Bublikov ID. [Complex approach to the complicated forms of colorectal cancer]. Khirurgiia (Mosk). 2010(7):15-19.
32. Sazhin VP, Savel'ev VM, Pigin AS, Malashenko PA. [Role and perspectives of the use of laparoscopic surgery in colo-proctology]. Khirurgiia (Mosk). 1995(5):25-27.
33. Schneider C, Scheidbach H, Scheuerlein H, Kockerling F. [Prospective multicenter study of laparoscopic colorectal surgery. Quality assurance during introduction of new methods]. Zentralbl Chir. 2000;125 Suppl 2:164-168.
34. Schwenk W, Raue W, Haase O, Junghans T, Muller JM. ["Fast-track" colonic surgery-first experience with a clinical procedure for accelerating postoperative recovery]. Chirurg. 2004;75(5):508-514.
35. Shamsia RA. [Quality of life in patients after laparoscopic and open interventions for colonic tumors]. Klinicheskaia Khirurgiia. 2005(1):24-28.
36. Siani LM, Ferranti F, Marzano M, De Carlo A, Quintiliani A. [Five-year oncological results of laparoscopic versus open left hemicolectomy]. Chir Ital. Sep-Dec 2009;61(5-6):579-583.
37. Siani LM, Ferranti F, Marzano M, De Carlo A, Quintiliani A. [Laparoscopic versus open right hemicolectomy: 5-year oncology results]. Chir Ital. Sep-Dec 2009;61(5-6):573-577.
38. Smedh K, Strand E, Jansson P, et al. [Rapid recovery after colonic resection. Multimodal rehabilitation by means of Kehlet's method practiced in Vasteras]. Lakartidningen. 2001;98(21):2568-2574.
39. Wang Z-d, Wu Z-y, Li Y, Wu W-l, Lin F. [Clinical efficacy comparison between laparoscopy and open radical resection for 191 advanced colorectal cancer patients]. Zhonghua Wei Chang Wai Ke Za Zhi. Jul 2009;12(4):368-370.
40. Xu M, Yang XB, Shi LG, Wang YJ. Effect of laparoscopic resection on systemic stress responses in colorectal cancer patients. [Chinese]. Journal of Dalian Medical University. 2009;31(3):328-330.
Appendix E Comparative Studies of Laparoscopy versus Conventional Surgery for Colon Cancer Meeting a priori Inclusion Criteria
1. Lohsiriwat V, Lohsiriwat D, Chinswangwatanakul V, Akaraviputh T, Lert-Akyamanee N. Comparison of short-term outcomes between laparoscopically-assisted vs. transverse-incision open right hemicolectomy for right-sided colon cancer: a retrospective study. World Journal of Surgical Oncology. 2007;5:49.
2. Ng SSM, Leung KL, Lee JFY, Yiu RYC, Li JCM, Hon SSF. Long-term morbidity and oncologic outcomes of laparoscopic-assisted anterior resection for upper rectal cancer: ten-year results of a prospective, randomized trial. Dis Colon Rectum. 2009;52(4):558-566.
3. Lin JH, Whelan RL, Sakellarios NE, et al. Prospective study of ambulation after open and laparoscopic colorectal resection. Surgical Innovation. 2009;16(1):16-20.
4. Shabbir A, Roslani AC, Wong K-S, Tsang CBS, Wong H-B, Cheong W-K. Is laparoscopic colectomy as cost beneficial as open colectomy?[see comment]. ANZ Journal of Surgery. 2009;79(4):265-270.
5. Scarpa M, Erroi F, Ruffolo C, et al. Minimally invasive surgery for colorectal cancer: quality of life, body image, cosmesis, and functional results. Surg Endosc. 2009;23(3):577-582.
6. Kennedy GD, Heise C, Rajamanickam V, Harms B, Foley EF. Laparoscopy decreases postoperative complication rates after abdominal colectomy: results from the national surgical quality improvement program. Annals of Surgery. 2009;249(4):596-601.
7. Zmora O, Hashavia E, Munz Y, et al. Laparoscopic colectomy is associated with decreased postoperative gastrointestinal dysfunction. Surg Endosc. 2009;23(1):87-89.
8. Kemp JA, Finlayson SRG. Outcomes of laparoscopic and open colectomy: a national population-based comparison. Surgical Innovation. 2008;15(4):277-283.
9. Bilimoria KY, Bentrem DJ, Merkow RP, et al. Laparoscopic-assisted vs. open colectomy for cancer: comparison of short-term outcomes from 121 hospitals. J Gastrointest Surg. 2008;12(11):2001-2009.
10. Imai E, Ueda M, Kanao K, et al. Surgical site infection risk factors identified by multivariate analysis for patient undergoing laparoscopic, open colon, and gastric surgery. American Journal of Infection Control. 2008;36(10):727-731.
11. Mirza MS, Longman RJ, Farrokhyar F, Sheffield JP, Kennedy RH. Long-term outcomes for laparoscopic versus open resection of nonmetastatic colorectal cancer. J Laparoendosc Adv Surg Tech A. 2008;18(5):679-685.
12. Andersen LPH, Klein M, Gogenur I, Rosenberg J. Incisional hernia after open versus laparoscopic sigmoid resection. Surg Endosc. 2008;22(9):2026-2029.
13. Hewett PJ, Allardyce RA, Bagshaw PF, et al. Short-term outcomes of the Australasian randomized clinical study comparing laparoscopic and conventional open surgical treatments for colon cancer: the ALCCaS trial. Annals of Surgery. 2008;248(5):728-738.
14. Bilimoria KY, Bentrem DJ, Nelson H, et al. Use and outcomes of laparoscopic-assisted colectomy for cancer in the United States. Archives of Surgery. 2008;143(9):832-839; discussion 839-840.
15. Varela JE, Asolati M, Huerta S, Anthony T. Outcomes of laparoscopic and open colectomy at academic centers. American Journal of Surgery. 2008;196(3):403-406.
16. Lacy AM, Delgado S, Castells A, et al. The long-term results of a randomized clinical trial of laparoscopy-assisted versus open surgery for colon cancer.[see comment]. Annals of Surgery. 2008;248(1):1-7.
17. Buchanan GN, Malik A, Parvaiz A, Sheffield JP, Kennedy RH. Laparoscopic resection for colorectal cancer. British Journal of Surgery. 2008;95(7):893-902.
18. Nakamura T, Mitomi H, Ihara A, et al. Risk factors for wound infection after surgery for colorectal cancer. World Journal of Surgery. 2008;32(6):1138-1141.
19. Delaney CP, Chang E, Senagore AJ, Broder M. Clinical outcomes and resource utilization associated with laparoscopic and open colectomy using a large national database. Annals of Surgery. 2008;247(5):819-824.
20. Seitz G, Seitz EM, Kasparek MS, Konigsrainer A, Kreis ME. Long-term quality-of-life after open and laparoscopic sigmoid colectomy. Surg Laparosc Endosc Percutan Tech. 2008;18(2):162-167.
21. Lordan JT, Tilney HS, Shirol S, Jourdan I, Gudgeon AM. Does the laparoscopic colorectal surgery learning curve adversely affect the results of colorectal cancer resection? A 3-year prospective study in a district general hospital. Colorectal Disease. 2008;10(4):363-369.
22. Law WL, Fan JKM, Poon JTC, Choi HK, Lo OSH. Laparoscopic bowel resection in the setting of metastatic colorectal cancer. Annals of Surgical Oncology. 2008;15(5):1424-1428.
23. Ihedioha U, Mackay G, Leung E, Molloy RG, O'Dwyer PJ. Laparoscopic colorectal resection does not reduce incisional hernia rates when compared with open colorectal resection.[see comment]. Surg Endosc. 2008;22(3):689-692.
24. Frasson M, Braga M, Vignali A, Zuliani W, Di Carlo V. Benefits of laparoscopic colorectal resection are more pronounced in elderly patients. Dis Colon Rectum. 2008;51(3):296-300.
25. Steele SR, Brown TA, Rush RM, Martin MJ. Laparoscopic vs open colectomy for colon cancer: results from a large nationwide population-based analysis. J Gastrointest Surg. 2008;12(3):583-591.
26. Braga M, Frasson M, Vignali A, Zuliani W, Di Carlo V. Open right colectomy is still effective compared to laparoscopy: results of a randomized trial. Annals of Surgery. 2007;246(6):1010-1014; discussion 1014-1015.
27. Chung CC, Ng DCK, Tsang WWC, et al. Hand-assisted laparoscopic versus open right colectomy: a randomized controlled trial. Annals of Surgery. 2007;246(5):728-733.
28. McCloskey CA, Wilson MA, Hughes SJ, Eid GM. Laparoscopic colorectal surgery is safe in the high-risk patient: a NSQIP risk-adjusted analysis.[erratum appears in Surgery. 2008 Feb;143(2):301]. Surgery. 2007;142(4):594-597; discussion 597.e591-592.
29. Hinojosa MW, Murrell ZA, Konyalian VR, Mills S, Nguyen NT, Stamos MJ. Comparison of laparoscopic vs open sigmoid colectomy for benign and malignant disease at academic medical centers. J Gastrointest Surg. 2007;11(11):1423-1429; discussion 1429-1430.
30. Park J-S, Kang S-B, Kim S-W, Cheon G-N. Economics and the laparoscopic surgery learning curve: comparison with open surgery for rectosigmoid cancer. World Journal of Surgery. 2007;31(9):1827-1834.
31. Osarogiagbon RU, Ogbeide O, Ogbeide E, George RK. Hand-assisted laparoscopic colectomy compared with open colectomy in a nontertiary care setting. Clinical Colorectal Cancer. 2007;6(8):588-592.
32. Tong DKH, Law WL. Laparoscopic versus open right hemicolectomy for carcinoma of the colon. J Soc Laparoendosc Surg. 2007;11(1):76-80.
33. Salimath J, Jones MW, Hunt DL, Lane MK. Comparison of return of bowel function and length of stay in patients undergoing laparoscopic versus open colectomy. J Soc Laparoendosc Surg. 2007;11(1):72-75.
34. Napolitano L, Waku M, De Nicola P, et al. Laparoscopic colectomy in colon cancer. A single-center clinical experience. G Chir. 2007;28(4):126-133.
35. Janson M, Lindholm E, Anderberg B, Haglind E. Randomized trial of health-related quality of life after open and laparoscopic surgery for colon cancer. Surg Endosc. 2007;21(5):747-753.
36. MacKay G, Ihedioha U, McConnachie A, Serpell M, Molloy RG, O'Dwyer PJ. Laparoscopic colonic resection in fast-track patients does not enhance short-term recovery after elective surgery.[see comment]. Colorectal Disease. 2007;9(4):368-372.
37. Noblett SE, Horgan AF. A prospective case-matched comparison of clinical and financial outcomes of open versus laparoscopic colorectal resection. Surg Endosc. 2007;21(3):404-408.
38. Law WL, Lee YM, Choi HK, Seto CL, Ho JW. Impact of laparoscopic resection for colorectal cancer on operative outcomes and survival.[see comment]. Annals of Surgery. 2007;245(1):1-7.
39. Liang J-T, Huang K-C, Lai H-S, Lee P-H, Jeng Y-M. Oncologic results of laparoscopic versus conventional open surgery for stage II or III left-sided colon cancers: a randomized controlled trial. Annals of Surgical Oncology. 2007;14(1):109-117.
40. Del Rio P, Dell'Abate P, Soliani P, Tacci S, Arcuri MF, Sianesi M. Standardized laparoscopic right hemicolectomy technique for colon cancer. Minerva Chir. 2006;61(4):293-297.
41. Ng SSM, Li JCM, Lee JFY, Yiu RYC, Leung KL. Laparoscopic total colectomy for colorectal cancers: a comparative study. Surg Endosc. 2006;20(8):1193-1196.
42. Law WL, Lee YM, Choi HK, Seto CL, Ho JWC. Laparoscopic and open anterior resection for upper and mid rectal cancer: an evaluation of outcomes. Dis Colon Rectum. 2006;49(8):1108-1115.
43. Nakamura T, Mitomi H, Ohtani Y, et al. Comparison of long-term outcome of laparoscopic and conventional surgery for advanced colon and rectosigmoid cancer. Hepato-Gastroenterology. 2006;53(69):351-353.
44. Lezoche E, Guerrieri M, De Sanctis A, et al. Long-term results of laparoscopic versus open colorectal resections for cancer in 235 patients with a minimum follow-up of 5 years. Surg Endosc. 2006;20(4):546-553.
45. King PM, Blazeby JM, Ewings P, et al. Randomized clinical trial comparing laparoscopic and open surgery for colorectal cancer within an enhanced recovery programme. British Journal of Surgery. 2006;93(3):300-308.
46. Wahl P, Hahnloser D, Chanson C, Givel J-C. Laparoscopic and open colorectal surgery in everyday practice: retrospective study. ANZ Journal of Surgery. 2006;76(1-2):20-27.
47. Gonzalez R, Smith CD, Mason E, et al. Consequences of conversion in laparoscopic colorectal surgery. Dis Colon Rectum. 2006;49(2):197-204.
48. Salloum RM, Butler DC, Schwartz SI. Economic evaluation of minimally invasive colectomy.[see comment]. J Am Coll Surg. 2006;202(2):269-274.
49. Sample CB, Watson M, Okrainec A, Gupta R, Birch D, Anvari M. Long-term outcomes of laparoscopic surgery for colorectal cancer. Surg Endosc. 2006;20(1):30-34.
50. Sahakitrungruang C, Pattana-arun J, Tantiphlachiva K, Rojanasakul A. Laparoscopic versus open surgery for rectosigmoid and rectal cancer. J Med Assoc Thai. 2005;88 Suppl 4:S59-64.
51. Braga M, Frasson M, Vignali A, Zuliani W, Civelli V, Di Carlo V. Laparoscopic vs. open colectomy in cancer patients: long-term complications, quality of life, and survival. Dis Colon Rectum. 2005;48(12):2217-2223.
52. Vignali A, Di Palo S, Tamburini A, Radaelli G, Orsenigo E, Staudacher C. Laparoscopic vs. open colectomies in octogenarians: a case-matched control study. Dis Colon Rectum. 2005;48(11):2070-2075.
53. Veldkamp R, Kuhry E, Hop WCJ, et al. Laparoscopic surgery versus open surgery for colon cancer: short-term outcomes of a randomised trial. Lancet Oncol. 2005;6(7):477-484.
54. Pokala N, Delaney CP, Senagore AJ, Brady KM, Fazio VW. Laparoscopic vs open total colectomy: a case-matched comparative study. Surg Endosc. 2005;19(4):531-535.
55. Neri V, Ambrosi A, Fersini A, Valentino TP. Right colectomy for cancer: validity of laparoscopic approach. Annali Italiani di Chirurgia. 2004;75(6):649-653.
56. Kaiser AM, Kang J-C, Chan LS, Vukasin P, Beart RW, Jr. Laparoscopic-assisted vs. open colectomy for colon cancer: a prospective randomized trial. J Laparoendosc Adv Surg Tech A. 2004;14(6):329-334.
57. Kojima M, Konishi F, Okada M, Nagai H. Laparoscopic colectomy versus open colectomy for colorectal carcinoma: a retrospective analysis of patients followed up for at least 4 years. Surgery Today. 2004;34(12):1020-1024.
58. Vignali A, Braga M, Zuliani W, Frasson M, Radaelli G, Di Carlo V. Laparoscopic colorectal surgery modifies risk factors for postoperative morbidity. Dis Colon Rectum. 2004;47(10):1686-1693.
59. Baker RP, Titu LV, Hartley JE, Lee PWR, Monson JRT. A case-control study of laparoscopic right hemicolectomy vs. open right hemicolectomy. Dis Colon Rectum. 2004;47(10):1675-1679.
60. Capussotti L, Massucco P, Muratore A, Amisano M, Bima C, Zorzi D. Laparoscopy as a prognostic factor in curative resection for node positive colorectal cancer: results for a single-institution nonrandomized prospective trial. Surg Endosc. 2004;18(7):1130-1135.
61. Kang JC, Chung MH, Chao PC, et al. Hand-assisted laparoscopic colectomy vs open colectomy: a prospective randomized study. Surg Endosc. 2004;18(4):577-581.
62. Janson M, Bjorholt I, Carlsson P, et al. Randomized clinical trial of the costs of open and laparoscopic surgery for colonic cancer.[see comment]. British Journal of Surgery. 2004;91(4):409-417.
63. Kiran RP, Delaney CP, Senagore AJ, Millward BL, Fazio VW. Operative blood loss and use of blood products after laparoscopic and conventional open colorectal operations. Archives of Surgery. 2004;139(1):39-42.
64. Kayser J, Faber C, Bisdorff J, et al. Review of laparoscopic and open colorectal surgery in the "Zitha" Hospital (Luxembourg) in the year 2002. Bulletin de la Societe des Sciences Medicales du Grand-Duche de Luxembourg. 2003(1):7-16.
65. Inoue Y, Kimura T, Noro H, et al. Is laparoscopic colorectal surgery less invasive than classical open surgery? Quantitation of physical activity using an accelerometer to assess postoperative convalescence. Surg Endosc. 2003;17(8):1269-1273.
66. Basse L, Madsen JL, Billesbolle P, Bardram L, Kehlet H. Gastrointestinal transit after laparoscopic versus open colonic resection. Surg Endosc. 2003;17(12):1919-1922.
67. Kasparek MS, Muller MH, Glatzle J, et al. Postoperative colonic motility in patients following laparoscopic-assisted and open sigmoid colectomy. J Gastrointest Surg. 2003;7(8):1073-1081; discussion 1081.
68. Adachi Y, Sato K, Kakisako K, Inomata M, Shiraishi N, Kitano S. Quality of life after laparoscopic or open colonic resection for cancer. Hepato-Gastroenterology. 2003;50(53):1348-1351.
69. Sklow B, Read T, Birnbaum E, Fry R, Fleshman J. Age and type of procedure influence the choice of patients for laparoscopic colectomy. Surg Endosc. 2003;17(6):923-929.
70. Patankar SK, Larach SW, Ferrara A, et al. Prospective comparison of laparoscopic vs. open resections for colorectal adenocarcinoma over a ten-year period. Dis Colon Rectum. 2003;46(5):601-611.
71. Hasegawa H, Kabeshima Y, Watanabe M, Yamamoto S, Kitajima M. Randomized controlled trial of laparoscopic versus open colectomy for advanced colorectal cancer. Surg Endosc. 2003;17(4):636-640.
72. Senagore AJ, Madbouly KM, Fazio VW, Duepree HJ, Brady KM, Delaney CP. Advantages of laparoscopic colectomy in older patients. Archives of Surgery. 2003;138(3):252-256.
73. Vasilev K, Ivanov P, Gurbev G. Laparoscopic versus conventional colorectal surgery--a comparative trial. Acta chir. 2002;49(2):77-78.
74. Law WL, Chu KW, Tung PHM. Laparoscopic colorectal resection: a safe option for elderly patients. J Am Coll Surg. 2002;195(6):768-773.
75. Braga M, Vignali A, Gianotti L, et al. Laparoscopic versus open colorectal surgery: a randomized trial on short-term outcome. Annals of Surgery. 2002;236(6):759-766; discussion 767.
76. Winslow ER, Fleshman JW, Birnbaum EH, Brunt LM. Wound complications of laparoscopic vs open colectomy. Surg Endosc. 2002;16(10):1420-1425.
77. Lezoche E, Feliciotti F, Paganini AM, Guerrieri M, De Sanctis A, Campagnacci R. Laparoscopic colonic resection. J Laparoendosc Adv Surg Tech A. 2001;11(6):401-408.
78. Hong D, Tabet J, Anvari M. Laparoscopic vs. open resection for colorectal adenocarcinoma. Dis Colon Rectum. 2001;44(1):10-18; discussion 18-19.
79. Yamamoto S, Watanabe M, Hasegawa H, Kitajima M. Oncologic outcome of laparoscopic versus open surgery for advanced colorectal cancer. Hepato-Gastroenterology. 2001;48(41):1248-1251.
80. Nishiguchi K, Okuda J, Toyoda M, Tanaka K, Tanigawa N. Comparative evaluation of surgical stress of laparoscopic and open surgeries for colorectal carcinoma. Dis Colon Rectum. 2001;44(2):223-230.
81. Mall JW, Schwenk W, Rodiger O, Zippel K, Pollmann C, Muller JM. Blinded prospective study of the incidence of deep venous thrombosis following conventional or laparoscopic colorectal resection. British Journal of Surgery. 2001;88(1):99-100.
82. Curet MJ, Putrakul K, Pitcher DE, Josloff RK, Zucker KA. Laparoscopically assisted colon resection for colon carcinoma: perioperative results and long-term outcome. Surg Endosc. 2000;14(11):1062-1066.
83. Lezoche E, Feliciotti F, Paganini AM, Guerrieri M, Campagnacci R, De Sanctis A. Laparoscopic colonic resections versus open surgery: a prospective non-randomized study on 310 unselected cases. Hepato-Gastroenterology. 2000;47(33):697-708.
84. Hartley JE, Mehigan BJ, MacDonald AW, Lee PW, Monson JR. Patterns of recurrence and survival after laparoscopic and conventional resections for colorectal carcinoma. Annals of Surgery. 2000;232(2):181-186.
85. Marubashi S, Yano H, Monden T, et al. The usefulness, indications, and complications of laparoscopy-assisted colectomy in comparison with those of open colectomy for colorectal carcinoma. Surgery Today. 2000;30(6):491-496.
86. Kakisako K, Sato K, Adachi Y, Shiraishi N, Miyahara M, Kitano S. Laparoscopic colectomy for Dukes A colon cancer. Surg Laparosc Endosc Percutan Tech. 2000;10(2):66-70.
87. Stocchi L, Nelson H, Young-Fadok TM, Larson DR, Ilstrup DM. Safety and advantages of laparoscopic vs. open colectomy in the elderly: matched-control study. Dis Colon Rectum. 2000;43(3):326-332.
88. Delgado S, Lacy AM, Garcia Valdecasas JC, et al. Could age be an indication for laparoscopic colectomy in colorectal cancer? Surg Endosc. 2000;14(1):22-26.
89. Stewart BT, Stitz RW, Lumley JW. Laparoscopically assisted colorectal surgery in the elderly. British Journal of Surgery. 1999;86(7):938-941.
90. Leung KL, Meng WC, Lee JF, Thung KH, Lai PB, Lau WY. Laparoscopic-assisted resection of right-sided colonic carcinoma: a case-control study. J Surg Oncol. 1999;71(2):97-100.
91. Santoro E, Carlini M, Carboni F, Feroce A. Colorectal carcinoma: laparoscopic versus traditional open surgery. A clinical trial. Hepato-Gastroenterology. 1999;46(26):900-904.
92. Schwenk W, Bohm B, Witt C, Junghans T, Grundel K, Muller JM. Pulmonary function following laparoscopic or conventional colorectal resection: a randomized controlled evaluation. Archives of Surgery. 1999;134(1):6-12; discussion 13.
93. Bouvet M, Mansfield PF, Skibber JM, et al. Clinical, pathologic, and economic parameters of laparoscopic colon resection for cancer. American Journal of Surgery. 1998;176(6):554-558.
94. Schwenk W, Bohm B, Muller JM. Postoperative pain and fatigue after laparoscopic or conventional colorectal resections. A prospective randomized trial. Surg Endosc. 1998;12(9):1131-1136.
95. Lacy AM, Delgado S, Garcia-Valdecasas JC, et al. Port site metastases and recurrence after laparoscopic colectomy. A randomized trial. Surg Endosc. 1998;12(8):1039-1042.
96. Khalili TM, Fleshner PR, Hiatt JR, et al. Colorectal cancer: comparison of laparoscopic with open approaches. Dis Colon Rectum. 1998;41(7):832-838.
97. Milsom JW, Bohm B, Hammerhofer KA, Fazio V, Steiger E, Elson P. A prospective, randomized trial comparing laparoscopic versus conventional techniques in colorectal cancer surgery: a preliminary report.[see comment]. J Am Coll Surg. 1998;187(1):46-54; discussion 54-55.
98. Psaila J, Bulley SH, Ewings P, Sheffield JP, Kennedy RH. Outcome following laparoscopic resection for colorectal cancer.[see comment]. British Journal of Surgery. 1998;85(5):662-664.
99. Schwenk W, Bohm B, Haase O, Junghans T, Muller JM. Laparoscopic versus conventional colorectal resection: a prospective randomised study of postoperative ileus and early postoperative feeding. Langenbecks Arch Surg. 1998;383(1):49-55.
100. Leung KL, Kwok SP, Lau WY, et al. Laparoscopic-assisted resection of rectosigmoid carcinoma. Immediate and medium-term results. Archives of Surgery. 1997;132(7):761-764; discussion 765.
101. Goh YC, Eu KW, Seow-Choen F. Early postoperative results of a prospective series of laparoscopic vs. open anterior resections for rectosigmoid cancers. Dis Colon Rectum. 1997;40(7):776-780.
102. Philipson BM, Bokey EL, Moore JW, Chapuis PH, Bagge E. Cost of open versus laparoscopically assisted right hemicolectomy for cancer. World Journal of Surgery. 1997;21(2):214-217.
103. Ortiz H, Armendariz P, Yarnoz C. Early postoperative feeding after elective colorectal surgery is not a benefit unique to laparoscopy-assisted procedures.[see comment]. International Journal of Colorectal Disease. 1996;11(5):246-249.
104. Begos DG, Arsenault J, Ballantyne GH. Laparoscopic colon and rectal surgery at a VA hospital. Analysis of the first 50 cases. Surg Endosc. 1996;10(11):1050-1056.
105. Gellman L, Salky B, Edye M. Laparoscopic assisted colectomy. Surg Endosc. 1996;10(11):1041-1044.
106. Franklin ME, Jr., Rosenthal D, Abrego-Medina D, et al. Prospective comparison of open vs. laparoscopic colon surgery for carcinoma. Five-year results. Dis Colon Rectum. 1996;39(10 Suppl):S35-46.
107. Bokey EL, Moore JW, Chapuis PH, Newland RC. Morbidity and mortality following laparoscopic-assisted right hemicolectomy for cancer. Dis Colon Rectum. 1996;39(10 Suppl):S24-28.
108. Hotokezaka M, Dix J, Mentis EP, Minasi JS, Schirmer BD. Gastrointestinal recovery following laparoscopic vs open colon surgery. Surg Endosc. 1996;10(5):485-489.
109. Fleshman JW, Fry RD, Birnbaum EH, Kodner IJ. Laparoscopic-assisted and minilaparotomy approaches to colorectal diseases are similar in early outcome. Dis Colon Rectum. 1996;39(1):15-22.
110. Lacy AM, Garcia-Valdecasas JC, Pique JM, et al. Short-term outcome analysis of a randomized study comparing laparoscopic vs open colectomy for colon cancer. Surg Endosc. 1995;9(10):1101-1105.
111. Franklin ME, Jr., Rosenthal D, Norem RF. Prospective evaluation of laparoscopic colon resection versus open colon resection for adenocarcinoma. A multicenter study. Surg Endosc. 1995;9(7):811-816.
112. Saba AK, Kerlakian GM, Kasper GC, Hearn AT. Laparoscopic assisted colectomies versus open colectomy. Journal of Laparoendoscopic Surgery. 1995;5(1):1-6.
113. Ramos JM, Beart RW, Jr., Goes R, Ortega AE, Schlinkert RT. Role of laparoscopy in colorectal surgery. A prospective evaluation of 200 cases.[see comment]. Dis Colon Rectum. 1995;38(5):494-501.
114. Van Ye TM, Cattey RP, Henry LG. Laparoscopically assisted colon resections compare favorably with open technique. Surgical Laparoscopy & Endoscopy. 1994;4(1):25-31.
115. Senagore AJ, Luchtefeld MA, Mackeigan JM, Mazier WP. Open colectomy versus laparoscopic colectomy: are there differences? American Surgeon. 1993;59(8):549-553; discussion 553-544.
116. Poon JT, Law WL, Wong IW, et al. Impact of laparoscopic colorectal resection on surgical site infection. Annals of Surgery. January 2009;249(1):77-81.
117. Faiz O, Brown T, Colucci G, Kennedy RH. A cohort study of results following elective colonic and rectal resection within an enhanced recovery programme. Colorectal Disease. 2009;11(4):366-372.
118. Survival after laparoscopic surgery versus open surgery for colon cancer: long-term outcome of a randomised clinical trial. The Lancet Oncology. January 2009;10(1):44-52.
119. Park JS, Kang SB, Kim DW, Lee KH, Kim YH. Laparoscopic versus open resection without splenic flexure mobilization for the treatment of rectum and sigmoid cancer: A study from a single institution that selectively used splenic flexure mobilization. Surgical Laparoscopy, Endoscopy and Percutaneous Techniques. February 2009;19(1):62-68.
120. Chikkappa MG, Jagger S, Griffith JP, Ausobsky JR, Steward MA, Davies JB. In-house colorectal laparoscopic preceptorship: A model for changing a unit's practice safely and efficiently. International Journal of Colorectal Disease. 2009;24(7):771-776.
121. Gameiro M, Eichler W, Schwandner O, et al. Patient mood and neuropsychological outcome after laparoscopic and conventional colectomy. Surgical Innovation. 2008;15(3):171-178.
122. Cermak K, Thill V, Simoens CH, Smets D, Ngongang CH, Mendes Da Costa P. Surgical resection for colon cancer: Laparoscopic assisted vs. open colectomy. Hepato-Gastroenterology. Mar 2008;55(82-83):412-417.
123. King PM, Blazeby JM, Ewings P, Kennedy RH. Detailed evaluation of functional recovery following laparoscopic or open surgery for colorectal cancer within an enhanced recovery programme. International Journal of Colorectal Disease. Aug 2008;23(8):795-800.
124. Boni L, Di Giuseppe M, Bertoglio C, et al. Preliminary results of laparoscopic colorectal resections: Does surgeon's age influences outcomes? Surgical Oncology. Dec 2007;16:57-60.
125. Gonzalez IA, Fernandez EMLT, Pinero YH, et al. Effectiveness of colorectal laparoscopic surgery on patients at high anesthetic risk: An intervention cohort study. International Journal of Colorectal Disease. Jan 2008;23(1):101-106.
126. Fleshman J, Sargent DJ, Green E, et al. Laparoscopic colectomy for cancer is not inferior to open surgery based on 5-year data from the COST Study Group trial. Annals of Surgery. Oct 2007;246(4):655-662.
127. Jayne DG, Guillou PJ, Thorpe H, et al. Randomized trial of laparoscopic-assisted resection of colorectal carcinoma: 3-Year results of the UK MRC CLASICC trial group. Journal of Clinical Oncology. 20 2007;25(21):3061-3068.
128. Choi YS, Lee SI, Lee TG, Kim SW, Cheon G, Kang SB. Economic outcomes of laparoscopic versus open surgery for colorectal cancer in Korea. Surgery Today. Feb 2007;37(2):127-132.
129. Guo DY, Eteuati J, Hung Nguyen M, Lloyd D, Ragg JL. Laparoscopic assisted colectomy: Experience from a rural centre. ANZ Journal of Surgery. Apr 2007;77(4):283-286.
130. Feng B, Zheng MH, Mao ZH, et al. Clinical advantages of laparoscopic colorectal cancer surgery in the elderly. Aging - Clinical and Experimental Research. Jun 2006;18(3):191-195.
131. Franks PJ, Bosanquet N, Thorpe H, et al. Short-term costs of conventional vs laparoscopic assisted surgery in patients with colorectal cancer (MRC CLASICC trial). British Journal of Cancer. 03 2006;95(1):6-12.
132. Delaney CP, Pokala N, Senagore AJ, et al. Is laparoscopic colectomy applicable to patients with body mass index >30? A case-matched comparative study with open colectomy. Diseases of the Colon and Rectum. May 2005;48(5):975-981.
133. Guillou PJ, Quirke P, Thorpe H, et al. Short-term endpoints of conventional versus laparoscopic-assisted surgery in patients with colorectal cancer (MRC CLASICC trial): Multicentre, randomised controlled trial. Lancet. 14 2005;365(9472):1718-1726.
134. Zheng MH, Feng B, Lu AG, et al. Laparoscopic versus open right hemicolectomy with curative intent for colon carcinoma. World Journal of Gastroenterology. 21 2005;11(3):323-326.
135. Nelson H, Sargent DJ, Wieand HS, et al. A Comparison of Laparoscopically Assisted and Open Colectomy for Colon Cancer. New England Journal of Medicine. 13 2004;350(20):2050-2059+2114.
136. Leung KL, Kwok SPY, Lam SCW, et al. Laparoscopic resection of rectosigmoid carcinoma: Prospective randomised trial. Lancet. 10 2004;363(9416):1187-1192.
137. Delaney CP, Kiran RP, Senagore AJ, Brady K, Fazio VW. Case-Matched Comparison of Clinical and Financial Outcome after Laparoscopic or Open Colorectal Surgery. Annals of Surgery. Jul 2003;238(1):67-72.
138. Lezoche E, Feliciotti F, Guerrieri M, et al. Laparoscopic versus open hemicolectomy. [Italian, English]. Minerva Chir. Aug 2003;58(4):491-507.
139. Ma HF, Wang HM. Comparison between complications of laparoscopic anterior resection and conventional anterior resection for sigmoid colon cancer. Formosan Journal of Surgery. Jul 2003;36(4):166-172.
140. Feliciotti F, Paganini AM, Guerrieri M, De Sanctis A, Campagnacci R, Lezoche E. Results of laparoscopic vs open resections for colon cancer in patients with a minimum follow-up of 3 years. Surg Endosc. 2002;16(8):1158-1161.
141. Lacy AM, Garcia-Valdecasas JC, Delgado S, et al. Laparoscopy-assisted colectomy versus open colectomy for treatment of non-metastatic colon cancer: A randomised trial. Lancet. 29 2002;359(9325):2224-2229.
142. Champault GG, Barrat C, Raselli R, Elizalde A, Catheline JM. Laparoscopic versus open surgery for colorectal carcinoma: A prospective clinical trial involving 157 cases with a mean follow-up of 5 years. Surgical Laparoscopy, Endoscopy and Percutaneous Techniques. 2002;12(2):88-95.
143. Lujan HJ, Plasencia G, Jacobs M, Viamonte IM, Hartmann RF. Long-term survival after laparoscopic colon resection for cancer: Complete five-year follow-up. Diseases of the Colon and Rectum. 2002;45(4):491-501.
144. Lezoche E, Feliciotti F, Paganini AM, et al. Laparoscopic vs open hemicolectomy for colon cancer: Long-term outcome. Surg Endosc. 2002;16(4):596-602.
145. Weeks JC, Nelson H, Gelber S, Sargent D, Schroeder G. Short-term quality-of-life outcomes following laparoscopic- assisted colectomy vs open colectomy for colon cancer: A randomized trial. Journal of the American Medical Association. 16 2002;287(3):321-328.
146. Braga M, Vignali A, Zuliani W, et al. Training period in laparoscopic colorectal surgery: A case-matched comparative study with open surgery. Surg Endosc. 2002;16(1):31-35.
147. Chen WTL, Chen HC, Chiu CM, Lai YC, Hsu GH, Huang TM. Laparoscopic resection of colorectal cancer. Formosan Journal of Surgery. 2000;33(5):215-220.
148. Chen HH, Wexner SD, Iroatulam AJN, et al. Laparoscopic colectomy compares favorably with colectomy by laparotomy for reduction of postoperative ileus. Diseases of the Colon and Rectum. Jan 2000;43(1):61-65.
149. Schwandner O, Schiedeck THK, Killaitis C, Bruch HP. A case-control-study comparing laparoscopic versus open surgery for rectosigmoidal and rectal cancer. International Journal of Colorectal Disease. Aug 1999;14(3):158-163.
150. Stage JG, Schulze S, Moller P, et al. Prospective randomized study of laparoscopic versus open colonic resection for adenocarcinoma. British Journal of Surgery. 1997;84(3):391-396.
151. Ou H. Laparoscopic-assisted mini laparotomy with colectomy. Diseases of the Colon and Rectum. 1995;38(3):324-326.
152. Gray D, Lee H, Schlinkert R, Beart Jr RW. Adequacy of lymphadenectomy in laparoscopic-assisted colectomy for colorectal cancer: A preliminary report. J Surg Oncol. 1994;57(1):8-10.
153. Musser DJ, Boorse RC, Madera F, Reed IJF. Laparoscopic colectomy: At what cost? Surgical Laparoscopy and Endoscopy. 1994;4(1):1-5.
154. Tate JJT, Kwok S, Dawson JW, Lau WY, Li AKC. Prospective comparison of laparoscopic and conventional anterior resection. British Journal of Surgery. 1993;80(11):1396-1398.
155. Peters WR, Bartels TL. Minimally invasive colectomy: Are the potential benefits realized? Diseases of the Colon and Rectum. 1993;36(8):751-756.
156. Falk PM, Beart Jr RW, Wexner SD, et al. Laparoscopic colectomy: A critical appraisal. Diseases of the Colon and Rectum. 1993;36(1):28-34.
157. Wilks JA, Balentine CJ, Berger DH, et al. Establishment of a minimally invasive program at a Veterans' Affairs Medical Center leads to improved care in colorectal cancer patients. American Journal of Surgery. Nov 2009;198(5):685-692.
158. Taylor GW, Jayne DG, Brown SR, et al. Adhesions and incisional hernias following laparoscopic versus open surgery for colorectal cancer in the CLASICC trial. British Journal of Surgery. Jan 2010;97(1):70-78.
159. Allardyce RA, Bagshaw PF, Frampton CM, et al. Australasian Laparoscopic Colon Cancer Study shows that elderly patients may benefit from lower postoperative complication rates following laparoscopic versus open resection. British Journal of Surgery. Jan 2010;97(1):86-91.
160. Neudecker J, Klein F, Bittner R, et al. Short-term outcomes from a prospective randomized trial comparing laparoscopic and open surgery for colorectal cancer. British Journal of Surgery. Dec 2009;96(12):1458-1467.
161. Ptok H, Kube R, Schmidt U, et al. Conversion from laparoscopic to open colonic cancer resection - associated factors and their influence on long-term oncological outcome. European Journal of Surgical Oncology. Dec 2009;35(12):1273-1279.
162. Yin W-Y, Wei C-K, Tseng K-C, et al. Open colectomy versus laparoscopic-assisted colectomy supported by hand-assisted laparoscopic colectomy for resectable colorectal cancer: a comparative study with minimum follow-up of three years. Hepato-Gastroenterology. Jul-Aug 2009;56(93):998-1006.
163. Kim HJ, Lee IK, Lee YS, et al. A comparative study on the short-term clinicopathologic outcomes of laparoscopic surgery versus conventional open surgery for transverse colon cancer. Surg Endosc. Aug 2009;23(8):1812-1817.
164. Tan WS, Chew MH, Ooi BS, et al. Laparoscopic versus open right hemicolectomy: A comparison of short-term outcomes. International Journal of Colorectal Disease. 2009;24(11):1333-1339.
165. Faiz O, Warusavitarne J, Bottle A, Tekkis PP, Darzi AW, Kennedy RH. Laparoscopically assisted vs. open elective colonic and rectal resection: A comparison of outcomes in english national health service trusts between 1996 and 2006. Diseases of the Colon and Rectum. October 2009;52(10):1695-1704.
166. Konishi F, Okada M, Nagai H, Ozawa A, Kashiwagi H, Kanazawa K. Laparoscopic-assisted colectomy with lymph node dissection for invasive carcinoma of the colon. Surg Today. 1996;26(11):882-889.
167. Hoffman GC, Baker JW, Fitchett CW, Vansant JH. Laparoscopic-assisted colectomy. Initial experience. Annals of Surgery. Jun 1994;219(6):732-740; discussion 740-733.
168. Abdel-Halim MRE, Moore HM, Cohen P, Dawson P, Buchanan GN. Impact of laparoscopic right hemicolectomy for colon cancer. Ann R Coll Surg Engl. Apr 2010;92(3):211-217.
169. Akiyoshi T, Kuroyanagi H, Fujimoto Y, et al. Short-term outcomes of laparoscopic colectomy for transverse colon cancer. J Gastrointest Surg. May 2010;14(5):818-823.
170. Balentine CJ, Marshall C, Robinson C, et al. Obese patients benefit from minimally invasive colorectal cancer surgery. J Surg Res. Sep 2010;163(1):29-34.
171. Braga M, Frasson M, Zuliani W, Vignali A, Pecorelli N, Di Carlo V. Randomized clinical trial of laparoscopic versus open left colonic resection. British Journal of Surgery. Aug 2010;97(8):1180-1186.
172. da Luz Moreira A, Kiran RP, Kirat HT, et al. Laparoscopic versus open colectomy for patients with American Society of Anesthesiology (ASA) classifications 3 and 4: the minimally invasive approach is associated with significantly quicker recovery and reduced costs. Surg Endosc. Jun 2010;24(6):1280-1286.
173. El-Gazzaz G, Geisler D, Hull T. Risk of clinical leak after laparoscopic versus open bowel anastomosis. Surg Endosc. Aug 2010;24(8):1898-1903.
174. El-Gazzaz G, Hull T, Hammel J, Geisler D. Does a laparoscopic approach affect the number of lymph nodes harvested during curative surgery for colorectal cancer? Surg Endosc. Jan 2010;24(1):113-118.
175. Fujii S, Ota M, Ichikawa Y, et al. Comparison of short, long-term surgical outcomes and mid-term health-related quality of life after laparoscopic and open resection for colorectal cancer: a case-matched control study. International Journal of Colorectal Disease. Nov 2010;25(11):1311-1323.
176. Han KS, Choi GS, Park JS, Kim HJ, Park SY, Jun SH. Short-term outcomes of a laparoscopic left hemicolectomy for descending colon cancer: Retrospective comparison with an open left hemicolectomy. Journal of the Korean Society of Coloproctology. October 2010;26 (5):347-353.
177. Hemandas AK, Abdelrahman T, Flashman KG, et al. Laparoscopic colorectal surgery produces better outcomes for high risk cancer patients compared to open surgery. Annals of Surgery. Jul 2010;252(1):84-89.
178. Jayne DG, Thorpe HC, Copeland J, Quirke P, Brown JM, Guillou PJ. Five-year follow-up of the Medical Research Council CLASICC trial of laparoscopically assisted versus open surgery for colorectal cancer. British Journal of Surgery. Nov 2010;97(11):1638-1645.
179. Jiang JK, Chen WS, Wang SJ, Lin JK. A novel lifting system for minimally accessed surgery: A prospective comparison between "Laparo-V" gasless and CO2 pneumoperitoneum laparoscopic colorectal surgery. International Journal of Colorectal Disease. August 2010;25 (8):997-1004.
180. Kiran RP, El-Gazzaz GH, Vogel JD, Remzi FH. Laparoscopic approach significantly reduces surgical site infections after colorectal surgery: data from national surgical quality improvement program. J Am Coll Surg. Aug 2010;211(2):232-238.
181. Kurian AA, Suryadevara S, Vaughn D, et al. Laparoscopic colectomy in octogenarians and nonagenarians: a preferable option to open surgery? J Surg Educ. May-Jun 2010;67(3):161-166.
182. Lian L, Kalady M, Geisler D, Kiran RP. Laparoscopic colectomy is safe and leads to a significantly shorter hospital stay for octogenarians. Surgical Endoscopy and Other Interventional Techniques. August 2010;24 (8):2039-2043.
183. Lloyd GM, Kirby R, Hemingway DM, Keane FB, Miller AS, Neary P. The RAPID protocol enhances patient recovery after both laparoscopic and open colorectal resections. Surg Endosc. Jun 2010;24(6):1434-1439.
184. Madbouly KM, Senagore AJ, Delaney CP. Endogenous morphine levels after laparoscopic versus open colectomy.[Erratum appears in Br J Surg. 2010 Aug;97(8):1314]. British Journal of Surgery. May 2010;97(5):759-764.
185. Maeda T, Tan KY, Konishi F, et al. Accelerated learning curve for colorectal resection, open versus laparoscopic approach, can be attained with expert supervision. Surgical Endoscopy and Other Interventional Techniques. November 2010;24 (11):2850-2854.
186. Marshall CL, Chen GJ, Robinson CN, et al. Establishment of a minimally invasive surgery program leads to decreased inpatient cost of care in veterans with colon cancer. American Journal of Surgery. Nov 2010;200(5):632-635.
187. Morris EJA, Jordan C, Thomas JD, et al. Comparison of treatment and outcome information between a clinical trial and the National Cancer Data Repository. British Journal of Surgery. Feb 2011;98(2):299-307.
188. Nakamura T, Onozato W, Mitomi H, et al. Retrospective, matched case-control study comparing the oncologic outcomes between laparoscopic surgery and open surgery in patients with right-sided colon cancer. Surgery Today. 2009;39(12):1040-1045.
189. Pascual M, Alonso S, Pares D, et al. Randomized clinical trial comparing inflammatory and angiogenic response after open versus laparoscopic curative resection for colonic cancer. British Journal of Surgery. Jan 2011;98(1):50-59.
190. Tei M, Ikeda M, Haraguchi N, et al. Postoperative complications in elderly patients with colorectal cancer: comparison of open and laparoscopic surgical procedures. Surg Laparosc Endosc Percutan Tech. Dec 2009;19(6):488-492.
191. Basse L, Jakobsen DH, Bardram L, et al. Functional recovery after open versus laparoscopic colonic resection: a randomized, blinded study. Annals of Surgery. Mar 2005;241(3):416-423.
192. Braga M, Vignali A, Zuliani W, Frasson M, Di Serio C, Di Carlo V. Laparoscopic versus open colorectal surgery: cost-benefit analysis in a single-center randomized trial. Annals of Surgery. Dec 2005;242(6):890-895, discussion 895-896.
Appendix F Bayesian Models
A) Meta-Analysis Model (Binary Outcome)
model{
  for (i in 1:TrialNum){
    #Trial[i]~dnorm(0,0.0001)
    #StudyType[i]~dnorm(0,0.0001)
    #Likelihood
    ControlOC[i]~dbin(pControl[i], ControlTotal[i])
    TreatLAP[i]~dbin(pTreat[i], TreatTotal[i])
    #Linear model for logit of probability
    logit(pControl[i]) <- mu[i]
    logit(pTreat[i]) <- mu[i] + delta[i]
    #Prior on baseline logit
    mu[i] ~ dnorm (0, 0.0001)
    #Prior on log-odds-ratios
    delta[i] ~ dnorm(d, tau)
  }
  #Prior on hyperparameter: mean of log-odds ratio
  d ~ dnorm(0, 0.00001)
  populationOR <- exp(d)
  #Prior for random effects variance
  tau<- 1/(sd*sd)
  sd ~ dunif (0,3)
  #Generate predictive interval
  deltaNew~dnorm(d,tau)
  ORnew<-exp(deltaNew)
  ProbORnewLT1<-step(1-ORnew)
}
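For readers less familiar with BUGS syntax, the model above can be restated in conventional notation. This is a sketch rather than part of the original appendix. The symbols r and n (events and arm totals) are introduced here for illustration and correspond to ControlOC/TreatLAP and ControlTotal/TreatTotal; sigma corresponds to sd, with tau = 1/sigma^2. Because dnorm in BUGS takes a precision as its second argument, the normal distributions below are written with the equivalent variances (precision 0.0001 is variance 10^4; precision 0.00001 is variance 10^5).

```latex
\begin{align*}
r^{C}_i &\sim \mathrm{Binomial}(p^{C}_i,\; n^{C}_i), &
r^{T}_i &\sim \mathrm{Binomial}(p^{T}_i,\; n^{T}_i), \\
\operatorname{logit}(p^{C}_i) &= \mu_i, &
\operatorname{logit}(p^{T}_i) &= \mu_i + \delta_i, \\
\mu_i &\sim \mathcal{N}(0,\; 10^{4}), &
\delta_i &\sim \mathcal{N}(d,\; \sigma^{2}), \\
d &\sim \mathcal{N}(0,\; 10^{5}), &
\sigma &\sim \mathrm{Uniform}(0,\; 3), \\
\delta_{\mathrm{new}} &\sim \mathcal{N}(d,\; \sigma^{2}), &
\mathrm{OR}_{\mathrm{new}} &= \exp(\delta_{\mathrm{new}}).
\end{align*}
```

The node ProbORnewLT1 is the indicator that OR_new is at most 1, so its posterior mean estimates the predictive probability that the odds ratio in a new study favors laparoscopy.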
B) Meta-Analysis Model (Continuous Outcome)
model{
  for (i in 1:TrialNum){
    #Likelihood
    ControlOC[i]~dnorm(muControl[i], precisionControl[i])
    TreatLAP[i]~dnorm(muTreat[i], precisionTreat[i])
    precisionControl[i] <- nC[i]/(sdC[i]*sdC[i])
    precisionTreat[i] <- nT[i]/(sdT[i]*sdT[i])
    #Linear model for mean
    muTreat[i] <- muControl[i] + delta[i]
    #Prior on baseline mean
    muControl[i] ~ dnorm(0, 0.0001)
    #Prior on difference in LOS
    delta[i] ~ dnorm(d, tau)
  }
  #Prior on hyperparameter: mean of delta (treatment effect)
  d ~ dnorm(0, 0.00001)
  #Prior for random effects variance
  tau<- 1/(sd*sd)
  sd ~ dunif (0,15)
  #Generate new predictive interval
  deltaNew~dnorm(d,tau)
  #probability that LOS is -1 or more negative (favorable to LAP)
  DiffProb <- 1-step(d+1)
  #probability that LOS is -2 or more negative (favorable to LAP)
  #DiffProb2 <- 1-step(d+2)
}
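As a reading aid, the continuous-outcome model above can be written as follows. This is a sketch under the assumption that ControlOC[i] and TreatLAP[i] hold the observed arm means; the symbols y-bar, s, and n are introduced here for illustration (s and n correspond to sdC/sdT and nC/nT), and normal distributions are written with variances rather than BUGS precisions.

```latex
\begin{align*}
\bar{y}^{C}_i &\sim \mathcal{N}\!\left(\mu^{C}_i,\; \frac{(s^{C}_i)^{2}}{n^{C}_i}\right), &
\bar{y}^{T}_i &\sim \mathcal{N}\!\left(\mu^{T}_i,\; \frac{(s^{T}_i)^{2}}{n^{T}_i}\right), \\
\mu^{T}_i &= \mu^{C}_i + \delta_i, &
\delta_i &\sim \mathcal{N}(d,\; \sigma^{2}), \\
\mu^{C}_i &\sim \mathcal{N}(0,\; 10^{4}), &
d &\sim \mathcal{N}(0,\; 10^{5}), \\
\sigma &\sim \mathrm{Uniform}(0,\; 15), &
\mathrm{DiffProb} &= \mathbf{1}[\, d < -1 \,].
\end{align*}
```

Since step(x) equals 1 when x is at least 0, the expression 1-step(d+1) equals 1 exactly when d is below -1, so the posterior mean of DiffProb estimates the probability that the pooled mean difference is more negative than -1.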
C) Sensitivity Analysis Model (Binary Outcome)
model{
  for (i in 1:TrialNum) {
    #Trial[i]~dnorm(0,0.0001)
    #StudyType[i]~dnorm(0,0.0001)
    #Likelihood
    ControlOC[i]~dbin(pControl[i], ControlTotal[i])
    TreatLAP[i]~dbin(pTreat[i], TreatTotal[i])
    #Linear model for logit of probability
    logit(pControl[i]) <- mu[i]
    logit(pTreat[i]) <- mu[i] + delta[i]
    #Prior on baseline logit
    mu[i] ~ dnorm (0, 0.0001)
    #Prior on log-odds-ratios
    delta.star[i] ~ dnorm(d, tau)
    #Model DESIGN + Baseline Rate + Year
    delta[i] <- delta.star[i]+beta*(DESIGN[i]) + beta.rate*(mu[i]-mean.mu) + beta.year*(year[i]-mean(year[]))
    #Dummy Statements
    #DESIGN[i]~dnorm(0,1)
    #year[i]~dnorm(0,1)
  }
  #mean.mu, mean(mu[])
  mean.mu <- -4.2
  #prior on hyperparameter: mean of log-odds ratio
  d ~ dnorm(0, 0.00001)
  #prior for random effects variance
  tau<- 1/(sd*sd)
  sd ~ dunif (0,3)
  v<- sd*sd
  #Prior for betas
  beta~dnorm(0,0.1)
  beta.year~dnorm(0,0.1)
  beta.rate~dnorm(0,0.1)
  PopulationOR<-exp(d)
  ROR.year<-exp(beta.year)
  ROR.rate<-exp(beta.rate)
  ROR.design<-exp(beta)
}
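The meta-regression step in the sensitivity model above can be summarized in conventional notation. This is a sketch: the bar notation for the centring constants is introduced here for illustration, the coding of DESIGN is not stated in this appendix, and the normal priors are written with variances (BUGS precision 0.1 is variance 10).

```latex
\begin{align*}
\delta_i &= \delta^{*}_i
  + \beta\,\mathrm{DESIGN}_i
  + \beta_{\mathrm{rate}}\,(\mu_i - \bar{\mu})
  + \beta_{\mathrm{year}}\,(\mathrm{year}_i - \overline{\mathrm{year}}), \\
\delta^{*}_i &\sim \mathcal{N}(d,\; \sigma^{2}), \qquad
\beta,\ \beta_{\mathrm{rate}},\ \beta_{\mathrm{year}} \sim \mathcal{N}(0,\; 10), \\
\mathrm{ROR}_{\mathrm{design}} &= \exp(\beta), \qquad
\mathrm{ROR}_{\mathrm{rate}} = \exp(\beta_{\mathrm{rate}}), \qquad
\mathrm{ROR}_{\mathrm{year}} = \exp(\beta_{\mathrm{year}}).
\end{align*}
```

Here the centring constant for the baseline log-odds is fixed in the code at mean.mu = -4.2, while year is centred on its sample mean. If DESIGN is a 0/1 indicator, exp(beta) can be read as a ratio of odds ratios between the two design groups, holding baseline risk and year at their centring values.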
D) Sensitivity Analysis Model (Continuous Outcome)
model{
  for (i in 1:TrialNum){
    #Likelihood
    ControlOC[i]~dnorm(muControl[i], precisionControl[i])
    TreatLAP[i]~dnorm(muTreat[i], precisionTreat[i])
    precisionControl[i] <- nC[i]/(sdC[i]*sdC[i])
    precisionTreat[i] <- nT[i]/(sdT[i]*sdT[i])
    #Linear model for mean
    muTreat[i] <- muControl[i] + delta[i]
    #Prior on baseline mean
    muControl[i] ~ dnorm (0, 0.0001)
    #Prior on difference in LOS
    deltastar[i] ~ dnorm(d, tau)
    #Treatment effects at the average control group LOS
    delta[i]<- deltastar[i] + beta*(DESIGN[i]) + beta.rate*( muControl[i]-mean.mu) + beta.year*( year[i]-mean(year[]) )
    year[i]~dnorm(0,1)
  }
  mean.mu<- 4.2
  #Prior on hyperparameter: mean of delta (treatment effect)
  d ~ dnorm(0, 0.00001)
  #Prior for random effects variance
  tau<- 1/(sd*sd)
  sd ~ dunif (0,15)
  #Prior for the beta
  beta~dnorm(0,0.1)
  beta.rate~dnorm(0,0.1)
  beta.year~dnorm(0,0.1)
}
Appendix G Bayesian Meta-Analysis Results
A) Bayesian Meta-Analysis Results
Table G.1 Bayesian random-effects meta-analysis results for studies reporting post-operative complications.
                # of Studies   OR*    95% CrI†      sd§    Probability OR<1♦
All Studies     99             0.61   0.54, 0.69    0.44   0.87
NRS             79             0.59   0.51, 0.68    0.46   0.87
RCTs            20             0.71   0.53, 0.91    0.42   0.80
Typical RCTs    16             0.60   0.41, 0.84    0.46   0.87
Strong RCTs     4              0.99   0.64, 1.44    0.31   0.58

* Odds Ratio. OR<1 indicates that laparoscopy is associated with fewer post-operative complications.
† Credible Interval
§ Standard deviation (between-study heterogeneity)
♦ Probability that the OR is less than 1 (favoring laparoscopy).
Table G.2 Bayesian random-effects meta-analysis results for studies reporting mortality.
                # of Studies   OR*    95% CrI†      sd§    Probability OR<1♦
All Studies     96             0.56   0.44, 0.70    0.40   0.93
NRS             79             0.51   0.39, 0.66    0.44   0.93
RCTs            17             0.84   0.48, 1.39    0.32   0.72
Typical RCTs    13             0.99   0.30, 2.50    0.80   0.59
Strong RCTs     4              0.95   0.00, 2.52    0.66   0.66

* Odds Ratio. OR<1 indicates that laparoscopy is associated with lower mortality.
† Credible Interval
§ Standard deviation (between-study heterogeneity)
♦ Probability that the OR is less than 1 (favoring laparoscopy).
Table G.3 Bayesian random-effects meta-analysis results for studies reporting length of stay.
                # of Studies   MD*     95% CrI†        sd§    Probability MD<-1♦
All Studies     128            -2.74   -3.13, -2.35    1.97   1.00
NRS             106            -2.94   -3.40, -2.49    2.07   1.00
RCTs            22             -1.82   -2.53, -1.12    1.39   0.99
Typical RCTs    18             -2.16   -2.98, -1.32    1.42   1.00
Strong RCTs     4              -0.74   -2.30, 0.73     1.17   0.28

* Mean Difference, MD = mean_laparoscopy - mean_open. An MD<0 indicates that laparoscopy is associated with a shorter length of stay.
† Credible Interval
§ Standard deviation (between-study heterogeneity)
♦ Probability that the MD is -1 or more negative (favoring laparoscopy).
Table G.4 Bayesian random-effects meta-analysis results for studies reporting number of lymph nodes harvested.
                # of Studies   MD*     95% CrI†       sd§    Probability MD<-1♦
All Studies     76             -0.02   -0.52, 0.48    1.76   0.00
NRS             59             0.08    -0.55, 0.08    1.98   0.00
RCTs            17             -0.34   -1.03, 0.39    1.00   0.03
Typical RCTs    13             -0.25   -1.29, 0.75    1.20   0.06
Strong RCTs     4              -0.47   -2.64, 1.85    1.68   0.23

* Mean Difference, MD = mean_laparoscopy - mean_open. An MD<0 indicates that laparoscopy is associated with finding fewer lymph nodes in the surgical specimen.
† Credible Interval
§ Standard deviation (between-study heterogeneity)
♦ Probability that the MD is -1 or more negative (favoring open surgery).