Summary of Findings & Assessment of Quality of Evidence: Grade Workshop Sunday, October 17, 2010...
Summary of Findings & Assessment of Quality of Evidence:
GRADE Workshop, Sunday, October 17, 2010
0900 to 1700
Introduction
Introduction to facilitators
• Michelle Kho
• Jan Brozek
• Nancy Santesso
• Holger Schunemann
• Ingvil von Mehren Sæterdal
Agenda
Mix of presentations, interactive sessions, hands-on work and small group discussions
Systematic review process
Risk of Bias
Meta-analysis
Sensitivity analyses
High versus lower protein diets (studies with <20% losses to follow-up)
Change in Systolic blood pressure (mmHg)
Subgroup analysis
Funnel Plot
Medline Search Strategy for RCTs and Reviews
1 diet, protein-restricted/
2 diet, carbohydrate-restricted/
3 1 or 2
4 diet fads/
5 (carbohydrate* or protein*).ti,ab.
6 4 and 5
7 exp dietary proteins/
8 dietary carbohydrates/
9 (diet* or intake*).ti,ab.
10 (high* or increas* or rich or low* or restrict* or decreas* or reduc*).ti,ab.
11 (7 or 8) and 9 and 10
12 ((carbohydrate* or protein*) adj3 (high* or increas* or rich or low* or restrict* or decreas* or reduc*)).ti,ab.
13 12 and 9
14 3 or 6 or 11 or 13
15 randomized controlled trial.pt.
16 controlled clinical trial.pt.
17 randomized.ab.
18 placebo.ab.
19 clinical trials as topic.sh.
20 randomly.ab.
21 trial.ti.
22 or/15-21
23 humans.sh.
24 22 and 23
25 14 and 24
Systematic review process
Chapter 11: Presenting results and Summary of Findings Tables
Chapter 12: Interpreting results and drawing conclusions
Cochrane Handbook
Overview: Interpreting results of a review and GRADE
• how does GRADE fit into the process of moving from results to conclusions in systematic reviews
• what are the basic principles behind GRADE
Consider the following examples of moving from results to conclusions
How would you interpret the results of the meta-analyses and conclusions made by the authors?
Authors’ conclusions
• Short term beneficial effects were found for fasting for 7 to 10 days followed by a vegetarian diet when compared to ordinary diet.
The pooled SMD for pain reduction comparing glucosamine to placebo was 0.61, which represents a moderate clinically significant treatment benefit in favour of glucosamine
• What information do you think would increase or decrease your confidence in these results?
• What information do you think would indicate that more research is or is not necessary?
Work with your neighbor and discuss for 5 mins
Weighing the criteria for overall quality of evidence
• In fact in this example,
– Allocation concealment is unclear in one of the studies
– Only three of five studies measured major bleeding, a primary outcome in anticoagulation studies, suggesting selective outcome reporting
– The confidence intervals include potential for harm or no harm
• I might say that my confidence in the results is “low” and that more research is likely to change the results
The pooled SMD for pain reduction comparing glucosamine to placebo was 0.61, which represents a moderate clinically significant treatment benefit in favour of glucosamine
Likelihood of and confidence in an outcome
Quality of evidence across studies for an outcome
High: Further research is very unlikely to change our confidence in the estimate of effect or accuracy.
Moderate: Further research is likely to have an important impact on our confidence in the estimate of effect or accuracy and may change the estimate.
Low: Further research is very likely to have an important impact on our confidence in the estimate of effect or accuracy and is likely to change the estimate.
Very low: Any estimate of effect or accuracy is very uncertain.
GRADE: recommendation – quality of evidence
Clear separation:
1) 4 categories of quality of evidence: (High), (Moderate), (Low), (Very low)
– methodological quality of evidence
– likelihood of bias
– by outcome and across outcomes
2) Recommendation: 2 grades, conditional (aka weak) or strong (for or against an intervention)
– Balance of benefits and downsides, values and preferences, resource use and quality of evidence
*www.GradeWorking-Group.org
GRADE Quality of Evidence
In the context of a systematic review
• The quality of evidence reflects the extent to which we are confident that an estimate of effect is correct.
In the context of making recommendations
• The quality of evidence reflects the extent to which our confidence in an estimate of the effect is adequate to support a particular recommendation.
Determinants of quality
• RCTs
• observational studies
• 5 factors that can lower quality
1. limitations in detailed design and execution (risk of bias criteria)
2. Inconsistency (or heterogeneity)
3. Indirectness (PICO and applicability)
4. Imprecision (number of events and confidence intervals)
5. Publication bias
• 3 factors can increase quality
1. large magnitude of effect
2. all plausible residual confounding may be working to reduce the demonstrated effect or increase the effect if no effect was observed
3. dose-response gradient
1. Design and Execution/Risk of Bias
Examples:
• Inappropriate selection of exposed and unexposed groups
• Failure to adequately measure/control for confounding
• Selective outcome reporting
• Failure to blind (e.g. outcome assessors)
• High loss to follow-up
• Lack of concealment in RCTs
• Intention to treat principle violated
Design and Execution/RoB
From Cates, CDSR 2008
Design and Execution/RoB
Overall judgment required
2. Inconsistency of results (Heterogeneity)
• if inconsistency, look for explanation
– patients, intervention, comparator, outcome
• if unexplained inconsistency lower quality
Reminders for immunization uptake
• Judgment
– variation in size of effect
– overlap in confidence intervals
– statistical significance of heterogeneity
– I²
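As a hedged illustration of the last item, I² is commonly computed from Cochran's Q statistic and its degrees of freedom as max(0, (Q − df)/Q) × 100%. The sketch below assumes that standard formula; the function name `i_squared` and the Q values in the loop are hypothetical and chosen only for illustration.

```python
def i_squared(q: float, df: int) -> float:
    """Return I² as a percentage from Cochran's Q and its degrees of freedom.

    Assumes the usual definition I² = max(0, (Q - df) / Q) * 100.
    """
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0


# Hypothetical Q statistics, each with 10 degrees of freedom.
for q in (8.0, 15.0, 30.0):
    print(f"Q = {q:5.1f}, df = 10 -> I² = {i_squared(q, 10):.1f}%")
```

I² is only one input to the judgment; the other items in the list (variation in size of effect, overlap of confidence intervals) still need to be looked at directly.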
Inconsistency when 1 study?
• Do not downgrade
3. Directness of Evidence
generalizability, transferability, applicability
• differences in
– populations/patients (HIC – L/MIC, women in general – pregnant women)
– interventions (all techniques, new - old)
– comparator appropriate (newer technique – old or no technique)
– outcomes (important – surrogate: CIN I – cancer)
• indirect comparisons
– interested in A versus B
– have A versus C and B versus C
– Cryo + antibiotics versus no intervention versus Cryo - antibiotics
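One common way to obtain an A versus B estimate from A versus C and B versus C results is the adjusted indirect comparison (often attributed to Bucher): subtract the log relative risks and add their variances. The slides do not say which method was used, so the sketch below is an assumption, and the function name `indirect_rr` and all input numbers are hypothetical.

```python
import math


def indirect_rr(rr_ac, se_log_ac, rr_bc, se_log_bc, z=1.96):
    """Adjusted indirect comparison of A versus B through a common comparator C.

    Inputs are relative risks and the standard errors of their logarithms.
    Returns (RR_AB, lower 95% limit, upper 95% limit).
    """
    log_ab = math.log(rr_ac) - math.log(rr_bc)
    # Variances add, so the indirect estimate is less precise than either input.
    se_ab = math.sqrt(se_log_ac ** 2 + se_log_bc ** 2)
    return (math.exp(log_ab),
            math.exp(log_ab - z * se_ab),
            math.exp(log_ab + z * se_ab))


# Hypothetical inputs: A vs C RR 0.50, B vs C RR 0.80, both with SE(log RR) 0.20.
rr, low, high = indirect_rr(0.50, 0.20, 0.80, 0.20)
print(f"Indirect RR (A vs B) = {rr:.2f}, 95% CI {low:.2f} to {high:.2f}")
```

The wider interval of the indirect estimate is one reason indirect comparisons can lead to rating down for indirectness and sometimes imprecision as well.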
EVIDENCE PROFILE
Question: Cryotherapy with antibiotics vs no antibiotics for histologically confirmed CIN

Outcome: Major infection (follow-up 12 months¹; requiring hospitalisation or blood transfusion)
– No of studies: 16
– Design: observational studies
– Limitations: no serious limitations
– Inconsistency: no serious inconsistency
– Indirectness: serious²
– Imprecision: no serious imprecision
– Other: none
– Cryotherapy with antibiotics: 0/1600 (0%)
– No antibiotics: 10/4573 (0.22%)
– Relative effect (95% CI): RD 0 (0 to 0)
– Absolute effect: 0 fewer per 1000
– Importance: IMPORTANT

Outcome: Resource use
– not measured

Outcome: All severe adverse events (follow-up 12 months; major infections and bleeding, pelvic inflammatory disease, stenosis, etc.)
– No of studies: 17
– Design: observational studies
– Limitations: no serious limitations
– Inconsistency: no serious inconsistency
– Indirectness: serious²
– Imprecision: no serious imprecision
– Other: none
– Cryotherapy with antibiotics: 0/1705 (0%)
– No antibiotics: 22/5142 (0.43%)
– Relative effect (95% CI): RD 0 (0 to 0)
– Absolute effect: 0 fewer per 1000
– Importance: IMPORTANT

¹ All rates presented at 12 months with assumption that events would occur within this time frame.
² Indirect analysis between single arm observational studies
4. Publication Bias
• Should always be suspected
– Only small “positive” studies
– For profit interest
– Various methods to evaluate: none perfect, but clearly a problem
Egger M, Smith DS. BMJ 1995;310:752-54
I.V. Mg in acute myocardial infarction: publication bias
Meta-analysis: Yusuf S. Circulation 1993
ISIS-4: Lancet 1995
Egger M, Cochrane Colloquium Lyon 2001
Funnel plot
[Figure: standard error (0 to 3) plotted against odds ratio (0.1 to 10, log scale)]
Symmetrical: No publication bias
Egger M, Cochrane Colloquium Lyon 2001
Funnel plot
[Figure: standard error (0 to 3) plotted against odds ratio (0.1 to 10, log scale)]
Asymmetrical: Publication bias?
5. Imprecision
• Small sample size
– small number of events
• Wide confidence intervals
– uncertainty about magnitude of effect
• Extent to which confidence in estimate of effect adequate to support decision
Example: Immunization in children
For systematic reviews
• If the 95% CI excludes a relative risk (RR) of 1.0 and the total number of events or patients exceeds the OIS criterion, precision is adequate. If the 95% CI includes appreciable benefit or harm (we suggest a RR of under 0.75 or over 1.25 as a rough guide) rating down for imprecision may be appropriate even if OIS criteria are met.
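The rule of thumb above can be written as a small decision helper. This is a sketch of one possible reading, not an official GRADE algorithm: it assumes the OIS check comes first, and that the "appreciable benefit or harm" clause applies when the interval includes RR 1.0. The function name `imprecision_check` and the OIS values and patient total used for major bleeding are hypothetical; the confidence intervals and the 1002 patients are taken from elsewhere in this workshop.

```python
def imprecision_check(ci_low, ci_high, total_patients, ois,
                      benefit_rr=0.75, harm_rr=1.25):
    """Return a suggested judgment on imprecision for a relative risk.

    ci_low, ci_high: limits of the 95% CI for the relative risk.
    total_patients:  number of patients in the meta-analysis.
    ois:             optimal information size (hypothetical input here).
    """
    meets_ois = total_patients >= ois
    excludes_no_effect = ci_low > 1.0 or ci_high < 1.0
    includes_appreciable = ci_low < benefit_rr or ci_high > harm_rr

    if not meets_ois:
        return "consider rating down: OIS not met"
    if excludes_no_effect:
        return "precision adequate"
    if includes_appreciable:
        return "consider rating down: CI includes appreciable benefit or harm"
    return "judgment required"


# Major bleeding, RR 1.50 (0.26 to 8.80); patient total and OIS are hypothetical.
print(imprecision_check(0.26, 8.80, total_patients=2000, ois=1000))
# Hemorrhoid symptoms, RR 0.4 (0.29 to 0.57), 1002 patients; OIS is hypothetical.
print(imprecision_check(0.29, 0.57, total_patients=1002, ois=400))
```

The thresholds 0.75 and 1.25 are the rough guide quoted in the text and are exposed as parameters because they remain a matter of judgment.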
Optimal information size
• We suggest the following: if the total number of patients included in a systematic review is less than the number of patients generated by a conventional sample size calculation for a single adequately powered trial, consider rating down for imprecision. Authors have referred to this threshold as the “optimal information size” (OIS)
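A "conventional sample size calculation" can take several forms. The sketch below assumes the simple normal-approximation formula for comparing two proportions without continuity correction, with two-sided alpha 0.05 and beta 0.2; other formulas give somewhat different numbers. The function name `ois_total` is hypothetical.

```python
import math
from statistics import NormalDist


def ois_total(control_rate, rrr, alpha=0.05, beta=0.20):
    """Total sample size (both arms) for one adequately powered two-arm trial.

    control_rate: event rate in the control group (0 to 1).
    rrr:          relative risk reduction to detect (0 to 1).
    Uses n per arm = (z_{1-alpha/2} + z_{1-beta})^2 * (p1*q1 + p2*q2) / (p1 - p2)^2.
    """
    p1 = control_rate
    p2 = control_rate * (1.0 - rrr)
    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    z_beta = NormalDist().inv_cdf(1.0 - beta)
    variance_sum = p1 * (1.0 - p1) + p2 * (1.0 - p2)
    n_per_arm = (z_alpha + z_beta) ** 2 * variance_sum / (p1 - p2) ** 2
    return 2 * math.ceil(n_per_arm)


# The required size falls as the control event rate rises.
for rate in (0.2, 0.4, 0.6):
    print(f"control rate {rate:.1f}, RRR 25% -> total n = {ois_total(rate, 0.25)}")
```

A meta-analysis whose total number of patients falls below the value returned for plausible inputs would, under this reading, be a candidate for rating down for imprecision.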
Figure 1, Rating down for imprecision in guidelines: Thresholds are key
[Figure: ischemic stroke point estimate and confidence interval plotted against risk difference (%), with "Favors Intervention" and "Favors Control" sides]
• Threshold if side effects and toxicity appreciable, NNT = 100. Confidence interval crosses threshold, rate down for imprecision
• Threshold if side effects, toxicity and cost minimal, NNT = 200. Entire confidence interval to left of threshold, do not rate down for imprecision
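The two thresholds in Figure 1 translate into absolute risk reductions of 100/NNT percent: 1.0% for NNT = 100 and 0.5% for NNT = 200. The sketch below checks whether a confidence interval for the absolute risk reduction crosses a threshold. The interval used (0.6% to 2.0%) is hypothetical, since the transcript does not preserve the figure's values, and `crosses_threshold` is a hypothetical helper name.

```python
def crosses_threshold(arr_low_pct, arr_high_pct, nnt):
    """Return True if the CI for absolute risk reduction (in %) crosses the
    decision threshold implied by the given number needed to treat."""
    threshold_pct = 100.0 / nnt
    return arr_low_pct < threshold_pct < arr_high_pct


# Hypothetical 95% CI for the absolute risk reduction: 0.6% to 2.0%.
ci = (0.6, 2.0)
for nnt in (100, 200):
    verdict = "rate down" if crosses_threshold(*ci, nnt) else "do not rate down"
    print(f"NNT = {nnt}: threshold {100.0 / nnt:.1f}% -> {verdict} for imprecision")
```

The same interval leads to different judgments depending on the threshold, which is the point of the figure: in guidelines, imprecision is judged relative to the decision, not to the null effect alone.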
Figure 2: Corticosteroids to reduce hospital mortality in septic shock
Study         Year  Treatment  Control  Relative Risk (95% CI)
Annane        2002  95/151     103/149  0.91 [0.77, 1.07]
Bollaert      1998  8/22       12/19    0.58 [0.30, 1.10]
Briegel       1999  5/20       6/20     0.83 [0.30, 2.29]
Chawla        1999  6/23       10/21    0.55 [0.24, 1.25]
Confalonieri  2005  0/23       7/23     0.07 [0.005, 1.10]
Rinaldi       2006  2/20       2/20     1.00 [0.16, 6.42]
Sprung        2008  111/251    100/245  1.08 [0.88, 1.33]
Tandan        2005  11/14      13/14    0.85 [0.62, 1.15]
Yildiz        2002  8/20       12/20    0.67 [0.35, 1.27]
Random Effects Estimate, p=0.22 for heterogeneity, I²=25%: 0.88 [0.75, 1.03]
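To make the per-study columns concrete, the sketch below computes a relative risk and its 95% confidence interval from event counts, assuming the usual log-scale standard error sqrt(1/a − 1/n1 + 1/c − 1/n2). The function name `relative_risk` is hypothetical, and the formula does not handle zero-event arms (such as Confalonieri 2005), which need a continuity correction not shown here.

```python
import math


def relative_risk(events_t, n_t, events_c, n_c, z=1.96):
    """Relative risk with a 95% CI from 2x2 counts (no zero cells allowed)."""
    rr = (events_t / n_t) / (events_c / n_c)
    se_log = math.sqrt(1 / events_t - 1 / n_t + 1 / events_c - 1 / n_c)
    low = math.exp(math.log(rr) - z * se_log)
    high = math.exp(math.log(rr) + z * se_log)
    return rr, low, high


# Counts taken from the Figure 2 rows for Annane 2002 and Sprung 2008.
for name, counts in {"Annane 2002": (95, 151, 103, 149),
                     "Sprung 2008": (111, 251, 100, 245)}.items():
    rr, low, high = relative_risk(*counts)
    print(f"{name}: RR {rr:.2f} [{low:.2f}, {high:.2f}]")
```

For these two studies the results agree to two decimals with the values listed in the figure, which suggests the figure used the same or a very similar calculation.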
Figure 4: Optimal information size given alpha of 0.05 and beta of 0.2 for varying control event rates and relative risks
[Figure: total sample size required (0 to 6000) plotted against control group event rate (0.2 to 1.0), with one curve each for RRR = 30%, RRR = 25% and RRR = 20%]
For any chosen line, evidence meets optimal information size criterion if sample size above the line
Table 1: Optimal information size implications from Figure 5
Total Number of Events   Relative Risk Reduction   Implications for meeting OIS threshold
100 or less              < 30%                     Will almost never meet threshold whatever control event rate
200                      30%                       Will meet threshold for control event rates for ~ 25% or greater
200                      25%                       Will meet threshold for control event rates for ~ 50% or greater
200                      20%                       Will meet threshold only for control event rates for ~ 80% or greater
300                      > 30%                     Will meet threshold
300                      25%                       Will meet threshold for control event rates ~ 25% or greater
300                      20%                       Will meet threshold for control event rates ~ 60% or greater
400 or more              > 25%                     Will meet threshold for any control event rate
400 or more              20%                       Will meet threshold for control event rates of ~ 40% or greater
What can raise quality?
1. large magnitude can upgrade (RRR 50%/RR 2)
– very large two levels (RRR 80%/RR 5)
– criteria
• everyone used to do badly
• almost everyone does well
BMJ 2003
BMJ, 2003
Reminders for immunization uptake
What can raise quality?
2. dose response relation
– (higher INR – increased bleeding)
– childhood lymphoblastic leukemia
• risk for CNS malignancies 15 years after cranial irradiation
• no radiation: 1% (95% CI 0% to 2.1%)
• 12 Gy: 1.6% (95% CI 0% to 3.4%)
• 18 Gy: 3.3% (95% CI 0.9% to 5.6%)
In terms of high altitude sickness, symptoms generally do not manifest below 1500 m. From about 1500 to 2500 m, symptoms are generally mild, if experienced at all. At 2500 m, symptoms of mild to moderate acute mountain sickness (AMS) become quite common among unacclimatized visitors after rapid ascent. At this altitude high altitude pulmonary edema (HAPE) may also occur, but it is more common above 3000 m. Above 3000 to 4000 m, AMS is common among people who have not properly acclimatized, and the risk of severe consequences, including life-threatening HAPE and cerebral edema, is substantial.
What can raise quality?
3. all plausible residual confounding may be working to reduce the demonstrated effect or increase the effect if no effect was observed
All plausible residual confounding would result in an overestimate of effect
Hypoglycaemic drug phenformin causes lactic acidosis
The related agent metformin is under suspicion for the same toxicity.
Large observational studies have failed to demonstrate an association
– Clinicians would be more alert to lactic acidosis in the presence of the agent
• Vaccine – adverse effects
Quality assessment criteria

Study design and initial quality of a body of evidence
• Randomised trials: High
• Observational studies: Low

Lower if
• Risk of Bias: -1 Serious, -2 Very serious
• Inconsistency: -1 Serious, -2 Very serious
• Indirectness: -1 Serious, -2 Very serious
• Imprecision: -1 Serious, -2 Very serious
• Publication bias: -1 Likely, -2 Very likely

Higher if
• Large effect: +1 Large, +2 Very large
• Dose response: +1 Evidence of a gradient
• All plausible residual confounding: +1 Would reduce a demonstrated effect, +1 Would suggest a spurious effect if no effect was observed

Quality of a body of evidence
• A/High (four plus)
• B/Moderate (three plus)
• C/Low (two plus)
• D/Very low (one plus)
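The criteria above can be read as simple arithmetic on a four-level scale: start at High for randomised trials or Low for observational studies, subtract the downgrades, add the upgrades, and stay within the scale. The sketch below is only that arithmetic, with the hypothetical function name `grade_quality`. It does not replace the overall judgment that the workshop stresses.

```python
LEVELS = ["Very low", "Low", "Moderate", "High"]
START = {"randomised": 3, "observational": 1}


def grade_quality(design, downgrades=None, upgrades=None):
    """Combine starting level, downgrades and upgrades into a quality level.

    design:     "randomised" or "observational".
    downgrades: dict of factor -> levels to subtract (0, 1 or 2).
    upgrades:   dict of factor -> levels to add (0, 1 or 2).
    """
    downgrades = downgrades or {}
    upgrades = upgrades or {}
    for factor, steps in {**downgrades, **upgrades}.items():
        if steps not in (0, 1, 2):
            raise ValueError(f"{factor}: steps must be 0, 1 or 2")
    score = START[design] - sum(downgrades.values()) + sum(upgrades.values())
    # Clamp so the result never leaves the four-level scale.
    return LEVELS[max(0, min(len(LEVELS) - 1, score))]


# Randomised trials rated down once each for risk of bias and imprecision.
print(grade_quality("randomised", {"risk of bias": 1, "imprecision": 1}))
# Observational studies rated up once for a large effect.
print(grade_quality("observational", upgrades={"large effect": 1}))
```

Keeping the factors as named dictionary entries makes it easy to record, next to each number, the footnote explaining why a factor was or was not rated down.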
Pulling it all together and drawing conclusions
• carefully consider and assess all the factors that may influence the quality of evidence
• bear in mind that down- and upgrading for specific quality factors should be done in the context of all of the factors that influence the quality of evidence
• downgrading for one quality criterion may influence how the next quality criterion is dealt with
Within and among
• Downgrade or upgrade on a continuum
• Downgrade or upgrade
– WITHIN each category
– AMONG the categories
Example:
• Meta-analysis of 5 studies
• uncertainty about three factors: study limitations/RoB, inconsistency, and imprecision
• Uncertainty not serious enough to downgrade each factor
• Option to pick one or two levels to downgrade
• Indicate in footnotes why you did or did not downgrade for those factors (e.g. "There was some uncertainty but already downgraded for...")
Survival
• HR 0.77 (0.65 to 0.91)
• How confident are you that these results are true?
Study limitations
No, there are no serious limitations
Yes, there are serious limitations
Yes, there are very serious limitations
Would you downgrade for risk of bias?
From risk of bias to limitations in design
Quality now?
• High
Inconsistency
Who believes there is important inconsistency (rather than random error)?
No, there is no serious inconsistency
Yes, there is serious inconsistency
Yes, there is very serious inconsistency
Quality now?
• High
Indirectness
• Direct comparison?
• Population?
• Intervention?
• Outcome?
Quality now?
• High
Publication bias
Quality now?
• High
Imprecision
Quality now?
• High
• No upgrading
Major bleeding
• RR 1.50 (0.26 – 8.80)
Study limitations
No, there are no serious limitations – although there ….
Yes, there are serious limitations – most people would agree that selective reporting is….
Yes, there are very serious limitations – there is a risk of bias but only for the one criterion of selective reporting
Would you downgrade for risk of bias?
From risk of bias to limitations in design
Quality now?
• Moderate
Imprecision
Quality now?
• Low
• Observational studies could have provided higher quality evidence
Flavonoids for Hemorrhoids
• venotonic agents
– mechanism unclear, increase venous return
• popularity
– 90 venotonics commercialized in France
– none in Sweden and Norway
– France 70% of world market
• possibilities
– French misguided
– rest of world missing out
Systematic Review
• 14 trials, 1432 patients
• key outcome
– risk not improving/persistent symptoms
– 11 studies, 1002 patients, 375 events
– RR 0.4, 95% CI 0.29 to 0.57
• minimal side effects
• is France right?
• what is the quality of evidence?
What can lower quality?
• Study limitations/risk of bias
– lack of detail re concealment
– questionnaires not validated
• rate down quality for study limitations/RoB?
• indirectness – no problem
• inconsistency, need to look at the results
Review: Phlebotonics for hemorrhoids
Comparison: 01 Venotonics vs placebo
Outcome: 08 Overall improvement: no improvement/some improvement

Study or sub-category    log[RR] (SE)        Weight %   RR (random) 95% CI
01 Up to seven days
Chauvenet                -0.8916 (0.2376)    12.67      0.41 [0.26, 0.65]
Cospite                  -2.2073 (0.6117)    5.51       0.11 [0.03, 0.36]
Thanapongsathorn         -0.4308 (0.2985)    11.18      0.65 [0.36, 1.17]
Subtotal (95% CI)                            29.36      0.37 [0.18, 0.77]
Test for heterogeneity: Chi² = 6.92, df = 2 (P = 0.03), I² = 71.1%
Test for overall effect: Z = 2.67 (P = 0.008)

02 Up to four weeks
Annoni F                 -1.6094 (0.7073)    4.50       0.20 [0.05, 0.80]
Clyne MB                 -0.9943 (0.3983)    8.94       0.37 [0.17, 0.81]
Pirard J                 -1.1712 (0.3086)    10.94      0.31 [0.17, 0.57]
Thanapongsathorn         -1.1087 (1.1098)    2.18       0.33 [0.04, 2.91]
Thorp                    0.2624 (0.3291)     10.46      1.30 [0.68, 2.48]
Titapan                  -0.8916 (0.3691)    9.56       0.41 [0.20, 0.85]
Wijayanegara             -0.5978 (0.1375)    14.97      0.55 [0.42, 0.72]
Subtotal (95% CI)                            61.54      0.48 [0.32, 0.72]
Test for heterogeneity: Chi² = 13.87, df = 6 (P = 0.03), I² = 56.7%
Test for overall effect: Z = 3.57 (P = 0.0004)

03 Further than four weeks
Godeberg                 -1.7719 (0.3906)    9.10       0.17 [0.08, 0.37]
Subtotal (95% CI)                            9.10       0.17 [0.08, 0.37]
Test for heterogeneity: not applicable
Test for overall effect: Z = 4.54 (P < 0.00001)

Total (95% CI)                               100.00     0.40 [0.29, 0.57]
Test for heterogeneity: Chi² = 28.66, df = 10 (P = 0.001), I² = 65.1%
Test for overall effect: Z = 5.14 (P < 0.00001)

[Forest plot axis: RR from 0.001 to 1000 (log scale); left of 1 favours treatment, right of 1 favours control]
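The log[RR] (SE) column and the RR column of this forest plot are linked by exponentiation: RR = exp(log RR), with 95% limits exp(log RR ± 1.96 × SE). The sketch below assumes that standard relationship; the function name `rr_from_log` is hypothetical, and the two inputs are the Chauvenet and Godeberg rows of the forest plot.

```python
import math


def rr_from_log(log_rr, se, z=1.96):
    """Convert a log relative risk and its standard error to RR with 95% CI."""
    return (math.exp(log_rr),
            math.exp(log_rr - z * se),
            math.exp(log_rr + z * se))


# log[RR] and SE as listed for two of the included studies.
for study, (log_rr, se) in {"Chauvenet": (-0.8916, 0.2376),
                            "Godeberg": (-1.7719, 0.3906)}.items():
    rr, low, high = rr_from_log(log_rr, se)
    print(f"{study}: RR {rr:.2f} [{low:.2f}, {high:.2f}]")
```

Working on the log scale is why the intervals are asymmetric around the point estimate when shown as relative risks.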
Publication bias?
• size of studies
– 40 to 234 patients, most around 100
• all industry sponsored
[Funnel plot for Review: Phlebotonics for hemorrhoids; Comparison: 01 Venotonics vs placebo; Outcome: 08 Overall improvement: no improvement/some improvement. Horizontal axis: RR (fixed), 0.001 to 1000 on a log scale; vertical axis: 0.0 to 1.6]
What can lower quality?
• risk of bias
– lack of detail re concealment
– questionnaires not validated
• Inconsistency
– heterogeneity p = 0.001; I² 65.1%
• indirectness
• imprecision
– RR 0.4, 95% CI 0.29 to 0.57
• Publication bias
– 40 to 234 patients, most around 100
Conclusions
• WHO guidelines should be based on the best available evidence to be evidence based
• GRADE is the approach used by WHO and gaining acceptance internationally
– combines what is known in health research methodology and provides a structured approach to improve communication
– Does not avoid judgments but provides framework
• Criteria for evidence assessment across questions and outcomes
• Criteria for moving from evidence to recommendations
• Transparent, systematic
– four categories of quality of evidence
– two grades for strength of recommendations
• Transparency in decision making and judgments is key