Assessing Applicability

Prepared for:

The Agency for Healthcare Research and Quality (AHRQ)

Training Modules for Systematic Reviews Methods Guide

www.ahrq.gov

Systematic Review Process Overview

To describe applicability and substantiate its importance

To delineate a systematic approach to assessing applicability by using PICOS domains

To recognize common features that may impact the applicability of studies

Learning Objectives

PICOS domains = population, intervention, comparators, outcomes, and setting

Applicability is defined as the extent to which the results observed in published studies are likely to reflect the expected outcomes when an intervention is applied to broader populations under real-world conditions.

Similar terms include external validity, generalizability, directness, and relevance.

Defining Applicability

Unless the evidence synthesized within a systematic review is applicable to relevant populations under real-world practice conditions, the conclusions may be inappropriate for guiding clinical or policy decisions.

The Importance of Applicability in Systematic Reviews

Frame issues of applicability with reference to specific clinical or policy questions the review is intended to inform.

Applicability needs to be considered at the outset: when the scope of the review is determined, and when the key questions are identified.

This module focuses on handling applicability issues at the analyze-and-synthesize-data stage.

Framing Applicability Issues

Clinical experts and stakeholders can provide general information important to framing applicability issues.

What does the population of interest look like?

What types of care or procedures are routine or represent standard of care?

Are certain subpopulations characteristically different from others?

Applicability Resources

Registry or epidemiological information, practice guidelines, consensus papers, book chapters, and general reviews can provide useful applicability information.

Applicability issues do not have to be reviewed for each study. They are used to place the available literature in context.

Other Applicability Resources

Studies often do not report on factors needed to judge applicability.

Increased use of CONSORT diagrams helps, but deficiencies remain.

Only a small number of studies are designed with applicability in mind.

Most trials are efficacy trials that maximize internal validity at the expense of applicability.

How Individual Studies Consider Applicability

CONSORT = Consolidated Standards of Reporting Trials

Study parameters may affect applicability through these different mechanisms:

Effect modifiers

Modifiers of baseline rates

Factors that reduce ability to generalize results to everyday practice

Study Parameters Affect Applicability

Applicability should be judged separately for each important question.

Factors influencing benefits and harms often depend on distinct physiologic processes.

The most applicable evidence may differ when considering a benefit or a harm.

Applicability Judged for Each Question

Applicability of a body of evidence is not the same as applicability of individual studies.

A collection of studies may provide broadly applicable evidence even if individual studies do not.

Consistency of study results across different populations and settings increases confidence in the applicability of the results.

Applicability depends on context and cannot be assessed with a simple rating scale.

There are no empiric data validating any scoring system for applicability.

Applicability of a Body of Evidence

Applicability is best reported separately from quality or strength of evidence.

Internal validity concerns are not altered by changes in perspective, but applicability is.

Some evidence hierarchies (GRADE) consider applicability, or “directness,” when assessing the quality of evidence.

For systematic reviews in which multiple perspectives (clinician, patient, policymaker) are considered, applicability should be assessed separately.

Judge Applicability and Strength of Evidence Separately

Stepwise approach to applicability:

1. Identify beforehand a limited number of factors that are most likely to influence applicability of evidence for the specific issue being reviewed and report the relevant elements from individual studies.

2. Judge limitations to the applicability of individual studies based on those factors.

3. Summarize the applicability of the body of evidence from multiple studies with respect to specific populations, interventions, comparisons, and outcomes.

Three Steps To Assess Applicability

Identify most important applicability issues

Understand the real-world context

Understand how studies differ from the real-world context

Use PICOS format to organize factors

Report features from individual studies that impact applicability in evidence tables

Step 1. Determine and Report the Most Important Factors in PICOS Format

PICOS = population, intervention, comparators, outcomes, and setting

In a trial of women with osteoporosis, only 4,000 of the 54,000 women screened were enrolled; the enrollees were younger, healthier, and more adherent to therapy than is typical of women with osteoporosis.

A trial of etanercept for juvenile arthritis excluded patients with side effects during an active run-in period; the trial found a low incidence of adverse events.

Clinical trials used to inform Medicare decisions enrolled patients who were younger (60 vs. 75 years of age) and more often male (75% vs. 42%) than is typical of Medicare patients with cardiovascular disease.

Population and Applicability: Examples
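The osteoporosis example lends itself to a quick calculation. The sketch below computes the proportion of screened individuals who were enrolled, one of the population features suggested for extraction into evidence tables. The helper name enrollment_fraction is hypothetical and is not part of the AHRQ guidance; only the 4,000 and 54,000 figures come from the example.

```python
def enrollment_fraction(enrolled: int, screened: int) -> float:
    """Return the fraction of screened individuals who were enrolled."""
    if screened <= 0:
        raise ValueError("screened must be positive")
    if not 0 <= enrolled <= screened:
        raise ValueError("enrolled must be between 0 and screened")
    return enrolled / screened


# Figures from the osteoporosis trial example: 4,000 enrolled of 54,000 screened.
fraction = enrollment_fraction(enrolled=4_000, screened=54_000)
print(f"{fraction:.1%} of screened women were enrolled")  # prints "7.4% of screened women were enrolled"
```

A low fraction does not by itself show that results are inapplicable, but it signals that the enrolled group may differ from the community population and that the eligibility criteria deserve a closer look.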

Conditions That Limit Applicability | Features That Should Be Extracted Into Evidence Tables

Narrow eligibility criteria, high exclusion rate, low enrollment | Eligibility criteria, proportion of screened individuals enrolled

Differences between patients in study and the community | Demographics (range and mean): age, gender, race, ethnicity

Narrow or unrepresentative severity or stage of illness | Severity or stage of illness (referral or primary care population)

Run-in periods with high exclusion rates | Run-in period: attrition rate before randomization and reasons (e.g., nonadherence, adverse drug events, no response)

Event rates markedly different from those in the community | Event rates in treatment and control groups

Disease prevalence in study population different from that in the community | Prevalence of disease (for diagnostic studies)

Population and Applicability

Studies of behavioral modification to promote healthy diet employ a larger number and longer duration of visits than those available to most community patients.

The use of pill counts in antiretroviral trials does not always translate into effectiveness in real-world practice.

Combining iron and zinc attenuates the ability of iron to raise hemoglobin levels.

Trials of carotid endarterectomy select surgeons with extensive experience and low complication rates and are not representative of average vascular surgeons.

Intervention and Applicability: Examples

Conditions That Limit Applicability | Features That Should Be Extracted Into Evidence Tables

Regimen not reflective of current practice | Medication dose, schedule, duration

Intensity of intervention not feasible for routine use | Intensity of behavioral interventions

Monitoring practices or visit frequency not used in practice | Adherence to interventions

Versions not in common use | Version of rapidly changing technology

Cointerventions that likely modify effectiveness of therapy | Cointerventions

Level of training not widely available | Training/skill level of intervention team (surgery/diagnostics)

Intervention and Applicability

A fixed-dose study that compared high-dose duloxetine to low-dose paroxetine

Many trials evaluating magnesium as a treatment for acute myocardial infarction that were conducted before thrombolytic drugs, antiplatelet drugs, β-blockers, and primary percutaneous coronary intervention (PCI) were used

Only 1 of 23 trials that compared bypass surgery to PCI used drug-eluting stents

Comparator and Applicability: Examples

Trials of biologics for rheumatoid arthritis that used radiographic progression rather than symptom evaluations as an outcome measure

Trials comparing cyclooxygenase-2 inhibitors and nonsteroidal antiinflammatory drugs that used endoscopy-evaluated ulceration rather than symptomatic ulcers as an outcome measure

Outcomes and Applicability: Examples

Conditions That Limit Applicability | Features That Should Be Extracted Into Evidence Tables

Comparator

Regimen not reflective of current practice | Medication dose, schedule, duration (if applicable)

Use of substandard alternative therapy | Comparator chosen vs. others available (if applicable)

Outcomes

Surrogate endpoints, brief follow-up periods, improper definitions for outcomes, composite endpoints | Outcomes (benefits and harms) and how they were defined

Comparator, Outcomes, and Applicability

Studies evaluating the benefits of breast self-examinations conducted in China and Russia, countries that do not employ routine mammography screening as in the United States

Studies of open surgical abdominal aortic aneurysm repair showing an inverse relationship between hospital volume and short-term mortality

Setting and Applicability: Examples

Setting and Applicability

Conditions That Limit Applicability | Features That Should Be Extracted Into Evidence Tables

Settings in which standards of care differ markedly from setting of interest | Geographic setting

Specialty population or level of care that differs from community | Clinical setting

Make judgments about applicability of individual studies

Look for effectiveness trials

Specifically designed to produce applicable results

Report on applicability

Highlight effectiveness trials

Compare and contrast effectiveness and efficacy trials

Clearly report and highlight major applicability issues of efficacy trials and what impact they may have on the interpretation of the study

Step 2. Make and Report Judgments About Major Limitations to the Applicability of Individual Studies

Seven criteria:

1. Enroll a primary-care population

2. Have less-stringent eligibility criteria

3. Assess health-related outcomes

4. Have a long study duration and clinically relevant treatment modalities

5. Assess for adverse events

6. Have an adequate sample size to assess minimally important differences from a patient perspective

7. Use intention-to-treat analysis

Effectiveness Trials

Gartlehner G, et al. J Clin Epidemiol 2006;59:1040-8; Gartlehner G, et al. Int J Technol Assess Health Care 2009;25:323-30.
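The seven criteria can be recorded per trial as a simple checklist. The sketch below is illustrative only: the short keys and the function unmet_criteria are hypothetical names, and the code deliberately lists which criteria are not met rather than producing a score, since no scoring system for applicability has been validated.

```python
# Short keys are hypothetical labels; the descriptions paraphrase the seven criteria.
CRITERIA = {
    "primary_care": "Enrolls a primary-care population",
    "broad_eligibility": "Has less-stringent eligibility criteria",
    "health_outcomes": "Assesses health-related outcomes",
    "duration_and_modalities": "Has a long study duration and clinically relevant treatment modalities",
    "adverse_events": "Assesses adverse events",
    "sample_size": "Has an adequate sample size to assess minimally important differences",
    "intention_to_treat": "Uses intention-to-treat analysis",
}


def unmet_criteria(met):
    """Return descriptions of the criteria a trial does not meet, in list order."""
    unknown = set(met) - set(CRITERIA)
    if unknown:
        raise KeyError(f"unknown criteria: {sorted(unknown)}")
    return [text for key, text in CRITERIA.items() if key not in met]


# Hypothetical trial: these judgments are invented for illustration only.
trial_met = {"health_outcomes", "adverse_events", "intention_to_treat"}
for text in unmet_criteria(trial_met):
    print("Not met:", text)
```

Listing the unmet criteria keeps the reviewer's attention on the specific features that limit applicability, which is what the comments column of an evidence table needs.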

For efficacy trials, clearly report characteristics that may limit applicability.

Describe those characteristics in the text or in evidence tables under the heading “comments” or “limitations.”

Describe how important factors would affect applicability and the expected direction and magnitude of bias.

Make Judgments About the Applicability of Individual Studies

Evidence Table Template for Applicability

Trial: Smith et al.24

Population: Heart failure population; mean age: 65 years; NYHA class II or III: 83%

Intervention: Surgical debulking of myocardium

Comparator: Watchful waiting (ACE inhibitor use, 34%; β-blocker use, 40%)

Outcomes: Hospitalizations and survival; median followup at 1 year

Setting: Single, large, tertiary care hospital

Comments: An efficacy trial; limited standardization of intervention; comparator did not include optimal medical therapy; unclear how the benefits and harms would compare in a smaller community hospital

After reviewing this sample evidence table, assume that four other trials are available with similar PICOS domains.

Slutsky J, et al. In: Methods guide for comparative effectiveness reviews. Available at: http://www.effectivehealthcare.ahrq.gov/ehc/products/118/324/2008_1118CompareInterventions.pdf.

ACE = angiotensin-converting enzyme; NYHA = New York Heart Association; PICOS = population, intervention, comparators, outcomes, and setting
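For reviewers who keep evidence tables in a script or spreadsheet export, the template maps naturally onto a small record type. The following is a minimal sketch, assuming one record per trial with one field per column; the class name EvidenceRow and its method are hypothetical, and the values are taken from the sample row.

```python
from dataclasses import dataclass, fields


@dataclass
class EvidenceRow:
    """One row of an applicability evidence table (PICOS columns plus comments)."""
    trial: str
    population: str
    intervention: str
    comparator: str
    outcomes: str
    setting: str
    comments: str = ""

    def as_lines(self) -> list[str]:
        """Render the row as 'Label: value' lines, one per column."""
        return [f"{f.name.capitalize()}: {getattr(self, f.name)}" for f in fields(self)]


# Values copied from the sample evidence table row.
row = EvidenceRow(
    trial="Smith et al.",
    population="Heart failure; mean age 65 years; NYHA class II or III: 83%",
    intervention="Surgical debulking of myocardium",
    comparator="Watchful waiting (ACE inhibitor use, 34%; β-blocker use, 40%)",
    outcomes="Hospitalizations and survival; median followup at 1 year",
    setting="Single, large, tertiary care hospital",
    comments="Efficacy trial; comparator did not include optimal medical therapy",
)
print("\n".join(row.as_lines()))
```

Keeping the comments field alongside the PICOS fields makes it harder to record a trial without also noting its main applicability limitation.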

Describe the limitations of aggregate evidence in the conclusions.

Generate a summary applicability table with a PICOS structure.

Aggregate applicability is not the sum of applicability of individual studies.

Each individual study can have relatively low applicability in at least one PICOS domain, but the body of evidence can have high applicability.

Step 3. Consider and Summarize the Applicability of a Body of Evidence From Individual Studies
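One way to picture the difference between individual-study and body-of-evidence applicability is to ask how much of a target population's age range the studies cover collectively. The sketch below is an illustration under stated assumptions, not a validated measure: the function uncovered_gaps, the target range, and every study age range are hypothetical.

```python
def uncovered_gaps(target, study_ranges):
    """Return the parts of the target (low, high) range that no study range covers."""
    low, high = target
    gaps = []
    cursor = low  # everything below cursor is already covered or reported as a gap
    for start, end in sorted(study_ranges):
        if end <= cursor:
            continue  # this study adds no new coverage
        if start >= high:
            break  # remaining studies lie beyond the target range
        if start > cursor:
            gaps.append((cursor, start))
        cursor = max(cursor, end)
        if cursor >= high:
            break
    if cursor < high:
        gaps.append((cursor, high))
    return gaps


target_ages = (40, 85)  # hypothetical target population

# Each study is narrow, but together the studies span the target range.
spread_out = [(40, 55), (50, 65), (60, 75), (70, 85)]
# Studies cluster at younger ages, leaving older patients unrepresented.
clustered = [(40, 55), (45, 55), (42, 58), (50, 60)]

print(uncovered_gaps(target_ages, spread_out))  # prints []
print(uncovered_gaps(target_ages, clustered))   # prints [(60, 85)]
```

In the first set, no single study resembles the target population, yet the body of evidence covers it; in the second, adding more studies of the same kind does not close the gap.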


Which Series of Studies Has an Applicability Issue?

[Figure: Scenario 1 and Scenario 2 each place Studies 1 through 4 along an axis labeled Lower Age and Higher Age.]

Step 3. Applicability Summary Table Template

Domain | Description of Applicability Evidence for a Key Question

Population | Describe the general characteristics of enrolled populations and how they may differ from the target population. Where possible, describe proportions (percentage older than 65) rather than an age range or average age.

Intervention | Describe the general characteristics of the interventions, how they compare to routine practice, and how this may affect benefits or harms.

Comparators | Describe comparators, whether they reflect best alternative therapy, and how this may impact treatment effect size.

Outcomes | Describe outcomes most frequently reported and over what period of time. Describe whether or not these outcomes reflect the most important benefits or harms.

Setting | Describe the settings in which the studies were conducted and the potential implications of using the intervention elsewhere.

Step 3. Completed Applicability Summary Table

Domain | Description of Applicability Evidence for a Key Question

Population | The population and disease stage are representative of the United States population with heart failure.

Intervention | The intervention is plausible.

Comparators | Watchful waiting is reasonable if the baseline treatment in both groups was standard medical therapy. Standard medical therapy is not being used in most patients. Subgroup analyses suggest that benefits are predominantly in those patients not receiving standard therapy.

Outcomes | Although hospitalizations and survival are being evaluated, other outcomes, including harms, are not.

Setting | The settings for the studies are large tertiary medical centers, which may overestimate the benefits of therapy in actual practice and accentuate the harms.

Comparison: Myocardial debulking versus watchful waiting

Strength of Evidence: Low

Conclusions With Description of Applicability: When compared with watchful waiting, the use of myocardial debulking reduced hospitalizations without affecting survival. These results are predominantly limited to patients 65 years of age and older with NYHA class II and III disease. This is similar to the population with heart failure in the United States. In a subgroup analysis, benefits were accentuated in patients not receiving standard medical therapy with ACE inhibitors and β-blockers, thus limiting applicability. No harms were being evaluated. These trials were conducted in select, large, tertiary medical centers and may not be applicable to other settings.

Summary Table for Body of Evidence Applicability

ACE = angiotensin-converting enzyme; NYHA = New York Heart Association

The PICOS framework is useful for organizing contextual factors of interest.

Input from clinical experts/stakeholders and a review of current practices can help identify specific applicability issues.

Population-based surveys, pharmacoepidemiologic studies, and large case series or registries can help determine how trial populations and circumstances differ from normal practice.

Key Messages (I)


Reviewers should routinely select the most important characteristics that may limit applicability and assess whether variation within the factors may modify outcome effects.

Reports should clearly highlight issues about applicability of individual studies in a “comments” or “limitations” section in evidence tables or in the text.

Meta-regression and/or separate applicability summary tables may help reviewers, and those using the reports, to see how the body of evidence applies to the question at hand.

Judgments about applicability of the evidence should consider the entire body of studies.

Important limitations of the applicability of the evidence should be described within each summary conclusion.

Key Messages (II)

Gartlehner G, Hansen RA, Nissman D, et al. A simple and valid tool distinguished efficacy from effectiveness studies. J Clin Epidemiol 2006;59:1040-8.

Gartlehner G, Thieda P, Hansen RA, et al. Inadequate reporting of trials compromises the applicability of systematic reviews. Int J Technol Assess Health Care 2009;25:323-30.

Shadish W, Cook T. Experimental and quasi-experimental design for generalized causal inference. Boston: Houghton Mifflin; 2002.

Thorpe KE, Zwarenstein M, Oxman AD, et al. A pragmatic-explanatory continuum indicator summary (PRECIS): a tool to help trial designers. J Clin Epidemiol 2009;62:464-75.

References

This presentation was prepared by C. Michael White, Pharm.D., FCP, FCCP, a member of the University of Connecticut/Hartford Hospital Evidence-based Practice Center.

This module is based on chapter 6 in version 1.0 of the Methods Reference Guide for Effectiveness and Comparative Effectiveness Reviews (available at: http://effectivehealthcare.ahrq.gov/repFiles/2007_10Draft MethodsGuide.pdf).

Author
