Clinicians, Probability, and EBM Michael A. Kohn 11/29/2007.

Clinicians, Probability, and EBM

Michael A. Kohn

11/29/2007

Criticisms of Evidence-Based Treatment Discussed in Chapter 12

1) EBM over-values randomized blinded trials and denigrates other forms of evidence, including clinical experience.

2) Evidence-based treatment recommendations tend towards the nihilistic, failing to recommend treatments that some clinicians, professional societies or disease-specific advocacy groups believe are effective.

3) EBM has been or might be used by payers as an excuse to deny payment and limit clinician autonomy.

Outline of Today’s Talk

• Review of Bayesian updating described in Chapters 3 and 4

• Discussion of applying Bayesian updating in clinical practice

• Course review

Probability Updating

1) Convert the patient’s pre-test probability of disease, P(D+), to pre-test odds, Odds(D+):

Odds(D+) = P(D+)/[1 – P(D+)]

2) Find the likelihood ratio for the patient’s test result, LR(r):

LR(r) = P(r|D+)/P(r|D-)*

3) Get post-test odds, Odds(D+|r), by multiplying pre-test odds by likelihood ratio:

Odds(D+|r) = Odds(D+) x LR(r)

4) Convert post-test odds to post-test probability:

Prob(D+|r) = Odds(D+|r)/[1+Odds(D+|r)]

*Try to remember this definition, not LR(+) = sensitivity/(1-specificity) or LR(-) = (1- sensitivity)/specificity, which only work for dichotomous tests.
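The four updating steps can be sketched in a few lines of Python. This is only an illustration: the function names are made up for this example, and the inputs (pre-test probability 10%, LR(r) = 4) are illustrative numbers.

```python
def prob_to_odds(p: float) -> float:
    """Step 1: convert a probability (0 <= p < 1) to odds."""
    return p / (1.0 - p)


def odds_to_prob(odds: float) -> float:
    """Step 4: convert odds back to a probability."""
    return odds / (1.0 + odds)


def post_test_probability(pre_test_prob: float, likelihood_ratio: float) -> float:
    """Steps 1, 3 and 4: probability -> odds, multiply by LR(r), odds -> probability."""
    post_test_odds = prob_to_odds(pre_test_prob) * likelihood_ratio
    return odds_to_prob(post_test_odds)


# Illustrative inputs only: pre-test probability 10%, LR(r) = 4.
print(round(post_test_probability(0.10, 4.0), 3))
```

Step 2 (finding LR(r)) is left to the caller, because it depends on the test and on the particular result r.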

Threshold Probability

If B is the benefit of treating a D+ individual (or the cost of failing to treat) and C is the cost of treating a D- individual unnecessarily, then

Ptt = C/(B+C)

Threshold Probability

If Ctreat is the treatment cost (in $), and the maximum cost-effectiveness ratio is CER* (in $/bad outcome prevented), then

Ptt = Ctreat/(ARR x CER*)

Where ARR is the absolute risk reduction of treating D+ individuals --usually obtained from an RCT of the treatment in D+ individuals.

By definition D+ individuals have a positive gold standard test.

Assumes treatment does not reduce risk at all in D- individuals.
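A minimal sketch of the two threshold formulas, with hypothetical inputs (a $100 treatment, ARR of 5%, CER* of $10,000 per bad outcome prevented). The function names are invented for this example. The two forms agree if C is taken as the treatment cost and B as the net benefit ARR x CER* minus the treatment cost, which is an interpretation added here rather than something stated on the slides.

```python
def threshold_from_costs(c: float, b: float) -> float:
    """Ptt = C / (B + C)."""
    return c / (b + c)


def threshold_from_cer(treatment_cost: float, arr: float, max_cer: float) -> float:
    """Ptt = Ctreat / (ARR x CER*)."""
    return treatment_cost / (arr * max_cer)


# Hypothetical numbers, chosen only for illustration.
treatment_cost = 100.0   # dollars
arr = 0.05               # absolute risk reduction in D+ individuals
max_cer = 10_000.0       # dollars per bad outcome prevented

ptt_cer = threshold_from_cer(treatment_cost, arr, max_cer)

# Same threshold via C/(B+C), taking C = treatment cost and
# B = ARR x CER* - treatment cost (net benefit of treating a D+ patient).
ptt_costs = threshold_from_costs(treatment_cost, arr * max_cer - treatment_cost)

print(ptt_cer, ptt_costs)
```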

Using Test Results

Treat if the updated probability of disease is greater than the treatment threshold:

Treat when

P(D+|r) > Ptt

Need 3 Things

1) Pre-test probability: P(D+)

2) Likelihood ratio of test result: LR(r)

3) Treatment threshold: Ptt

Example

A 58 y.o woman presents to the E.R with an episodic pressing/burning chest pain that began two days earlier for the first time in her life. The pain started while she was walking, radiates to the back and is accompanied by nausea, diaphoresis and mild dyspnea, but is not increased on inspiration. The latest episode of pain ended half an hour prior to her arrival.

Risk factors: hypertension known for years partially treated (in the past), truncal obesity (height 161 cm, weight 85 kg). She denies smoking, diabetes mellitus, hypercholesterolemia or a family history of heart disease.

She currently takes no medications.

On physical examination upon arrival: appears to be in distress, pulse regular 100/min, B.P 135/80, 18 respirations/min, temperature 36.7°. The lungs are clear, the heart sounds are normal with no murmurs or extra sounds, the abdomen is soft with no organomegaly. No pedal edema is noted and the peripheral pulses are normal.

Cahan A, et al. Qjm. 2003 Oct;96(10):763-9.

On the E.C.G: normal sinus rhythm 101/min, axis 45°, borderline ST elevation of 0.5 mm in leads V2-V4.

Admit to telemetry?

Homework/Exam Problem

Pretest Probability 10%

ECG                                   ACS     No ACS
Diagnostic (STEMI)                    30%     1%
Suggestive (ST Depression)            20%     5%
Indeterminate (LBBB, paced rhythm)    10%     10%
Non-specific                          39%     54%
Normal                                1%      30%
Total                                 100%    100%

Consider this patient’s ECG “suggestive”

Admit if probability of ACS is 1 in 5 or greater

Homework/Exam Problem

• Pre-test Odds = 0.1/(1-0.1) = 0.111

• LR(“Suggestive”) = 20%/5% = 4 (See Note)

• Post-test Odds = 0.111 x 4 = 0.444

• Post-Test Prob = 0.444/(1+0.444) = 31%

• 31% > 1/5 = 20%, so admit

Note: The LR for a test with more than two results is not calculated by doing anything with sensitivity or specificity; it is the probability of the result in disease divided by the probability of the result in non-disease.
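As a sketch of the point in the Note, the likelihood ratio for every ECG category can be computed directly from the two columns of the homework table, with no reference to sensitivity or specificity. The dictionary and function names below are invented for this example; the percentages are the ones in the table.

```python
# Percent of ACS and non-ACS patients with each ECG result (homework table).
ECG_TABLE = {
    "Diagnostic": (30, 1),
    "Suggestive": (20, 5),
    "Indeterminate": (10, 10),
    "Non-specific": (39, 54),
    "Normal": (1, 30),
}


def likelihood_ratio(result: str) -> float:
    """LR(r) = P(r|D+) / P(r|D-)."""
    pct_disease, pct_no_disease = ECG_TABLE[result]
    return pct_disease / pct_no_disease


def post_test_probability(pre_test_prob: float, lr: float) -> float:
    """Probability -> odds, multiply by LR, odds -> probability."""
    post_odds = pre_test_prob / (1.0 - pre_test_prob) * lr
    return post_odds / (1.0 + post_odds)


PRE_TEST_PROB = 0.10     # given in the homework problem
ADMIT_THRESHOLD = 0.20   # "1 in 5 or greater"

for result in ECG_TABLE:
    lr = likelihood_ratio(result)
    post = post_test_probability(PRE_TEST_PROB, lr)
    decision = "admit" if post >= ADMIT_THRESHOLD else "do not admit"
    print(f"{result:15s} LR = {lr:6.2f}  post-test = {post:6.1%}  {decision}")
```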


Could also calculate no ECG-ECG threshold probability*

• Threshold Odds = 0.2/(1-0.2) = 0.25

• LR(“Diagnostic”) = 30%/1% = 30

• Pre-test Odds = 0.25/30 = 0.00833

• Pre-Test Probability = 0.0083/(1+0.0083) = 0.82%

If pre-test probability < 0.82%, don’t do ECG.

*Assume ECG is riskless and costless (which it basically is).


Could also calculate ECG-admit threshold probability

• Threshold Odds = 0.2/(1-0.2) = 0.25

• LR(“Normal”) = 1%/30% = 1/30 = 0.033

• Pre-test Odds = 0.25/0.0333 = 7.5

• Pre-Test Probability = 7.5/8.5 = 88%

If pre-test probability > 88% admit even if ECG is normal.
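Both threshold calculations follow the same pattern: convert the treatment threshold to odds, divide by the likelihood ratio of the most extreme result, and convert back. A minimal sketch, with a function name invented for this example and the ECG assumed riskless and costless as on the slide:

```python
def pretest_threshold(treatment_threshold: float, lr: float) -> float:
    """Pre-test probability at which a result with likelihood ratio `lr`
    moves the post-test probability exactly to the treatment threshold."""
    threshold_odds = treatment_threshold / (1.0 - treatment_threshold)
    pre_test_odds = threshold_odds / lr
    return pre_test_odds / (1.0 + pre_test_odds)


ADMIT_THRESHOLD = 0.20

# Below this, even a "Diagnostic" ECG (LR = 30%/1% = 30) would not lead to admission.
no_ecg_vs_ecg = pretest_threshold(ADMIT_THRESHOLD, 30 / 1)

# Above this, even a "Normal" ECG (LR = 1%/30%) would not prevent admission.
ecg_vs_admit = pretest_threshold(ADMIT_THRESHOLD, 1 / 30)

print(f"{no_ecg_vs_ecg:.2%}  {ecg_vs_admit:.1%}")
```

Carried at full precision, the lower threshold comes out at roughly 0.83%; the slide's 0.82% reflects rounding the intermediate odds to 0.0083.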

Does any clinician really do this in daily practice?

Real Clinical Practice

Pretest Probability 10% (????)

ECG ACS No ACS





Normal 1% 30%

(??) 100% 100%

Consider this patient’s ECG “suggestive” (??)

Admit if probability of ACS 1 in 5 or greater (????)

Problem 1: Pre-Test Probability

• In contrast with the homework examples, pre-test probabilities are never given.

• Patients have widely varying pre-test probabilities.

• We clinicians are very bad, even irrational, at estimating them.

Problem 2: Likelihood Ratios

• In contrast with the homework examples, likelihood ratios are rarely known.

• We clinicians are very bad, even irrational, about using them.

Problem 3: Treatment Thresholds

• In contrast with the homework examples, treatment thresholds are never given.

• Consequences of error vary widely from patient to patient and depend on multiple diverse factors.

Pre-test probabilities

What is the probability (in percents), in your opinion, that the patient has:

• Active coronary artery disease?

• A dissecting aortic aneurysm?

• Reflux esophagitis?

• Biliary colic?

• Anxiety disorder?

Cahan A, et al. Qjm. 2003 Oct;96(10):763-9.

Figure 3. The range of estimated probabilities for each of the five diagnoses suggested to the participants. For each diagnosis, the ranges of “crude” and “standardized” probabilities are shown as the left- and right-hand lines, respectively. The means are shown as dots.

Note: Ranges from 1-99%

Figure 2. Frequency distribution of the total probabilities assigned by participants. The mean total probability was 136.7% (± 53.9%). Sixty-five percent of participants had a total probability > 100% (i.e. exhibited subadditivity)


Why so high?

Heuristics Used in Probability Estimation

• Representativeness

• Availability

• Adjustment from an anchor

These heuristics can lead to biased estimates. See Chapter 12 for details.

Using Likelihood Ratios

• Prevalence of a disease is 1/1000

• Dichotomous test has false positive rate of 5%

• What is the probability of disease in a person with a positive result? (Assume that you know nothing about the person’s symptoms or signs)

Casscells W, et al.N Engl J Med. 1978 Nov 2;299(18):999-1001


Consider 2x2 table method.

         D+     D-     Total
T+       1      50     51       P(D+|T+) = 1/51 = 2%
T-       0      949    949
Total    1      999    1000


• Pre-test Prob = 1/1000

• Pre-test Odds = 1/999

• ASSUME False positive rate = 1 – Spec and Sens = 100%

• LR(+) = 100%/5% = 20

• Post-Test Odds = 1/999 x 20 = 20/999

• Post-Test Prob = 20/1019 = 2%

Casscells W, et al.N Engl J Med. 1978 Nov 2;299(18):999-1001
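The 2x2 table method and the odds/LR method give the same answer, which a short sketch can confirm. The function names are invented for this example, and sensitivity is assumed to be 100%, as on the slide.

```python
def ppv_from_counts(prevalence: float, sensitivity: float,
                    false_positive_rate: float, n: float = 1000.0) -> float:
    """2x2 table method: fill in expected counts for n people, then read off TP/(TP+FP)."""
    diseased = prevalence * n
    healthy = n - diseased
    true_pos = sensitivity * diseased
    false_pos = false_positive_rate * healthy
    return true_pos / (true_pos + false_pos)


def ppv_from_lr(prevalence: float, sensitivity: float,
                false_positive_rate: float) -> float:
    """Odds method: pre-test odds x LR(+) -> post-test odds -> probability."""
    pre_test_odds = prevalence / (1.0 - prevalence)
    lr_positive = sensitivity / false_positive_rate
    post_test_odds = pre_test_odds * lr_positive
    return post_test_odds / (1.0 + post_test_odds)


# Prevalence 1/1000, false positive rate 5%, sensitivity ASSUMED to be 100%.
by_table = ppv_from_counts(1 / 1000, 1.0, 0.05)
by_odds = ppv_from_lr(1 / 1000, 1.0, 0.05)
print(f"{by_table:.1%}  {by_odds:.1%}")
```

The table on the slide rounds the 49.95 expected false positives to 50, which is why it shows 1/51 rather than 20/1019; both round to 2%.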


11/60 surveyed gave 2% for the answer

BUT,

27 gave 95% for the answer,

SO

Maybe the problem was misunderstanding about what was meant by “false positive rate.”

They interpreted it as 1 – PPV.

Casscells W, et al. N Engl J Med. 1978 Nov 2;299(18):999-1001

Decision-making vs. Probability Estimation

A 58 y.o woman presents to the E.R with an episodic pressing/burning chest pain…

Ask MDs for pre-test probability of ACS and you get answers ranging from 1% to 99%

Ask them to choose a next step*, and 100% will get an ECG.

*You could give them options: a) send patient to cath lab, b) admit, c) get ECG, and d) send home.

Decision-making vs. Probability Estimation

On the E.C.G: normal sinus rhythm 101/min, axis 45°, borderline ST elevation of 0.5 mm in leads V2-V4.

Ask MDs for post-test probability of ACS and you get answers ranging from 1% to 99%

Ask them to choose a next step*, and most will get additional tests, such as troponin I, repeat ECG.

*You could give them options: a) send patient to cath lab, b) admit, c) get more tests (e.g., troponin I, repeat ECG), and d) send home.

MD 1

• Prob ACS: 90%

• Next Steps: call cardiology - pt needs admission or at least serial trops then stress testing prior to d/c home.

MD 2

• Prob ACS: My estimated probability for aortic arch aneurysm is 5%, acute MI is 10% and unstable angina is 95%.

• Next Steps: check troponin, routine labs, CXR (?mediastinum?) and discuss with cards for risk stratification (e.g. stress echo) when possible.

MD 3

• Prob ACS: (not stated)

• Next Steps: she needs cardiac markers. If she is pain free and her markers are negative and serial EKG's are negative, I'd get a stress echo. Certainly cardiology consult.

MD 4

• Prob ACS: Greater than 80%

• Next Steps: Troponin, aspirin, consider ntg and/or morphine if the patient is in severe pain, beta-blockade, cxr, repeat ECG in 10 minutes, repeat troponin, [Cardiology consultation, admit]

MD 5

• Prob ACS: I'm not good at ascertaining probabilities

• Next Steps: Chest x ray, D Dimer, serial troponins, and would be on the phone with cardiology now. ASA would be given as well as a Beta Blocker, and she would be most likely be admitted no matter the outcome of any single test short of coronary angiography.

MD 6

• Prob ACS: less than 50% (35%) but too high for me to send home

• Next Steps: iv, o2, monitor, ecg (serial), cxr, cbc, chem 7, trop. ASA, beta blockers, nitrates for cp only. regardless of trop number would call cardiologist to eval.

MD 7

• Prob ACS: 80%mi, Pericarditis, 10%, Colecystitis or Coledocholithiasis 10%

• Next Steps: Troponin and repeat ecg

MD 8

• Prob ACS: I'd give her a high probability (> 85%) of ACS.

• Next Steps: I'd go down the therapeutic algorithm - ASA, B-blocker, etc., call Cardiology, and pursue/rule-out alternative diagnoses (GI, vascular causes, etc.) all the while expecting that she would be going to the cath lab. In the absence of another better alternative and with no contraindication to angiography, I think a negative EST would be a false-negative and a waste of time.

Why teach Bayes’s rule if we can’t or won’t use it in our clinical practice?

Why teach EBD to you?

You will --

• develop clinical policies (decision rules and guidelines),

• evaluate proposed clinical policies,

• develop diagnostic tests,

• evaluate diagnostic tests.

Does understanding the mathematical process help clinical decision making, even though we don’t use it explicitly in our clinical practice?

Course Review

Kappa

                                      ECG READER A
READER B         Diagnostic   Suggestive   Indeterminate   Non-specific   Normal   Total
Diagnostic       12           3            0               6              2        23
Suggestive       5            4            2               2              0        13
Indeterminate    0            5            4               3              0        12
Non-specific     3            0            3               5              2        13
Normal           0            2            4               2              2        10
Total            20           14           13              18             6        71

Observed Agreement: 27/71 = 38%

Expected

6.5 4.5 4.2 5.8 1.9 23

3.7 2.6 2.4 3.3 1.1 13

3.4 2.4 2.2 3.0 1.0 12

3.7 2.6 2.4 3.3 1.1 13

2.8 2.0 1.8 2.5 0.8 10

20 14 13 18 6 71

Sum of the expected diagonal cells: 6.5 + 2.6 + 2.2 + 3.3 + 0.8 = 15.4

Expected Agreement: 15.4/71 = 21.7%

Unweighted Kappa

Kappa = (38%-21.7%)/(100%-21.7%) = 0.209

Linear Weights

1.00 0.75 0.50 0.25 0.00

0.75 1.00 0.75 0.50 0.25

0.50 0.75 1.00 0.75 0.50

0.25 0.50 0.75 1.00 0.75

0.00 0.25 0.50 0.75 1.00

Weighted Actual

12 2.25 0 1.5 0

3.75 4 1.5 1 0

0 3.75 4 2.25 0

0.75 0 2.25 5 1.5

0 0.5 2 1.5 2

51.5

Weighted Actual Agreement: 51.5/71 = 72.5%

Do not calculate marginals after applying weights.

Weighted Expected

6.5 3.4 2.1 1.5 0.0

2.7 2.6 1.8 1.6 0.3

1.7 1.8 2.2 2.3 0.5

0.9 1.3 1.8 3.3 0.8

0.0 0.5 0.9 1.9 0.8

43.2

Weighted Expected Agreement: 43.2/71 = 60.8%

Do not calculate marginals after applying weights.

Weighted Kappa

Weighted Actual Agreement = 72.5%

Weighted Expected Agreement = 60.8%

Kappa = (72.5% - 60.8%) / (100% - 60.8%)

Linear Weighted Kappa = 0.30
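The unweighted and linear-weighted kappa calculations can be checked with a short sketch. The `kappa` function below is written for this example (it is not from the course materials); expected counts come from the unweighted marginals, and weights are applied only to the cell counts, in line with the warning not to calculate marginals after applying weights.

```python
# Rows: Reader B, columns: Reader A, in the order
# Diagnostic, Suggestive, Indeterminate, Non-specific, Normal.
COUNTS = [
    [12, 3, 0, 6, 2],
    [5, 4, 2, 2, 0],
    [0, 5, 4, 3, 0],
    [3, 0, 3, 5, 2],
    [0, 2, 4, 2, 2],
]


def kappa(table, weights=None):
    """Cohen's kappa; with `weights` given, weighted kappa."""
    k = len(table)
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(k)) for j in range(k)]
    if weights is None:
        # Unweighted: full credit on the diagonal, none elsewhere.
        weights = [[1.0 if i == j else 0.0 for j in range(k)] for i in range(k)]
    cells = [(i, j) for i in range(k) for j in range(k)]
    observed = sum(weights[i][j] * table[i][j] for i, j in cells) / n
    # Expected cell count is (row total x column total) / n, from unweighted marginals.
    expected = sum(weights[i][j] * row_totals[i] * col_totals[j] for i, j in cells) / n**2
    return (observed - expected) / (1.0 - expected)


K = len(COUNTS)
LINEAR_WEIGHTS = [[1.0 - abs(i - j) / (K - 1) for j in range(K)] for i in range(K)]

print(round(kappa(COUNTS), 3))
print(round(kappa(COUNTS, LINEAR_WEIGHTS), 2))
```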

Testing Threshold Odds*: Imperfect, but Costless Test

C = Cost of Treating D-

B = Cost of Failing to Treat D+

Oddstt = C/B

No Treat-Test Threshold Odds: Oddstt / LR(+)

Test-Treat Threshold Odds: Oddstt / LR(-)

*Remember, you still need to convert back to probability.

Testing Threshold Probabilities*: Perfect, but Costly Test

T = Cost of Test

No Treat-Test Threshold Probability: T/B

Test-Treat Threshold Probability: 1 – T/C

*Note that these are probabilities not odds

Testing Threshold Odds*: Imperfect and Costly Test

No Treat-Test Threshold Odds:

[(C) P(+|D−) +T] ∕ [(B) P(+|D+) –T]

Test-Treat Threshold Odds:

[(C) P(-|D−) – T] ∕ [(B) P(-|D+) + T]

*Remember, you still need to convert back to probability.

ROC Curves

ECG ACS No ACS





Normal 1% 30%

100% 100%

List from most abnormal to least abnormal

ROC Curves

ECG                                   ACS     No ACS   TPR     FPR*
Diagnostic (STEMI)                    30%     1%       30%     1%
Suggestive (ST Depression)            20%     5%       50%     6%
Indeterminate (LBBB, paced rhythm)    10%     10%      60%     16%
Non-specific                          39%     54%      99%     70%
Normal                                1%      30%      100%    100%
Total                                 100%    100%

* Note that this is not specificity; it’s 1 – specificity.
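The TPR and FPR columns are running totals of the ACS and No ACS columns, taken from most abnormal to least abnormal. A sketch of that accumulation follows; the names are invented for this example. The area under the curve is not reported on the slides, so the trapezoidal calculation is included only as an illustration.

```python
# (ECG result, % of ACS patients, % of non-ACS patients), most to least abnormal.
ECG_ROWS = [
    ("Diagnostic (STEMI)", 30, 1),
    ("Suggestive (ST Depression)", 20, 5),
    ("Indeterminate (LBBB, paced rhythm)", 10, 10),
    ("Non-specific", 39, 54),
    ("Normal", 1, 30),
]


def roc_points(rows):
    """Return (FPR, TPR) pairs in percent, starting at the origin."""
    points = [(0, 0)]
    tpr = fpr = 0
    for _, pct_disease, pct_no_disease in rows:
        tpr += pct_disease      # cumulative sensitivity at this cutoff
        fpr += pct_no_disease   # cumulative 1 - specificity at this cutoff
        points.append((fpr, tpr))
    return points


def area_under_curve(points):
    """Trapezoidal area under the ROC curve, as a fraction between 0 and 1."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area / (100.0 * 100.0)


POINTS = roc_points(ECG_ROWS)
for fpr, tpr in POINTS:
    print(f"FPR {fpr:3d}%  TPR {tpr:3d}%")
print(area_under_curve(POINTS))
```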

ROC Curves

[Figure: ROC curve plotting Sensitivity (y-axis) against 1 - Specificity (x-axis), both from 0% to 100%.]

Biases in Studies of Dx Test Accuracy

• Overfitting Bias – “Data snooped” cutoffs take advantage of chance variations in the derivation set, making the test look falsely good.

• Incorporation Bias – index test part of gold standard (Sensitivity Up, Specificity Up)

• Verification/Referral Bias – positive index test increases referral to gold standard (Sensitivity Up, Specificity Down)

• Double Gold Standard – positive index test causes application of definitive gold standard, negative index test results in clinical follow-up (Sensitivity Up, Specificity Up)

• Spectrum Bias

– D+ sickest of the sick (Sensitivity Up)

– D- wellest of the well (Specificity Up)

Biases in Studies of Screening Tests

Biases that occur only when comparing outcomes in those diagnosed with the disease (D+ individuals) – not the entire screened vs unscreened populations:

Lead-Time (Zero time-point shift)

Length (Differing natural history)


Stage migration bias occurs only when comparing stage-specific outcomes/prognosis.

Volunteer bias can occur when comparing entire screened to unscreened (if not randomized to screening)

“Sticky diagnosis bias” makes screening look bad by selectively attributing bad outcomes to disease in the screened group.


Pseudodisease/Overdiagnosis:

Compare outcome in those diagnosed with disease rather than in all those screened vs. all those not screened.

Kind of like stage migration bias, where there are only two stages: Stage 1 and Stage 0. You are comparing Stage 1-specific outcomes.

Multiple Tests

LR for combined test results

NT      NBE     Trisomy 21 +   %       Trisomy 21 -   %       LR
Pos     Pos     158            47%     36             0.7%    69
Pos     Neg     54             16%     442            8.5%    1.9
Neg     Pos     71             21%     93             1.8%    12
Neg     Neg     50             15%     4652           89%     0.2
Total           333            100%    5223           100%

P(r|D+) / P(r|D-) = (158/333) / (36/5223) = 47% / 0.7% = 69
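The same calculation applies to every combination of NT and NBE results. A sketch using the counts from the table, with names invented for this example:

```python
# (NT result, NBE result) -> (count with trisomy 21, count without trisomy 21)
COMBINED_COUNTS = {
    ("Pos", "Pos"): (158, 36),
    ("Pos", "Neg"): (54, 442),
    ("Neg", "Pos"): (71, 93),
    ("Neg", "Neg"): (50, 4652),
}


def likelihood_ratios(counts):
    """LR(r) = P(r|D+) / P(r|D-) for each combined result r."""
    total_disease = sum(d for d, _ in counts.values())
    total_no_disease = sum(nd for _, nd in counts.values())
    return {
        result: (d / total_disease) / (nd / total_no_disease)
        for result, (d, nd) in counts.items()
    }


LRS = likelihood_ratios(COMBINED_COUNTS)
for (nt, nbe), lr in LRS.items():
    print(f"NT {nt}, NBE {nbe}: LR = {lr:.2f}")
```

Treating each combination as its own result avoids having to assume that the two tests are independent.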

RCTs: ARR, NNT, and BOTE CEA

Measures of Treatment Effect

RR = Risk Ratio = [a/(a+b)] / [c/(c+d)]

RR < 1 means treatment is beneficial

RRR = Relative Risk Reduction = 1 - RR

             Bad Outcome   No Bad Outcome   Totals
Treatment    a             b                a + b
Control      c             d                c + d
Totals       a + c         b + d            N = a + b + c + d

Measures of Treatment Effect

ARR = Absolute Risk Reduction = c/(c+d) - a/(a+b)

NNT = Number Needed to Treat (to prevent 1 bad outcome) = 1/ARR

             Bad Outcome   No Bad Outcome   Totals
Treatment    a             b                a + b
Control      c             d                c + d
Totals       a + c         b + d            N = a + b + c + d

Back-of-the-Envelope Cost Effectiveness Analysis

How many patients do I need to treat (at the treatment cost) to prevent 1 bad outcome?

Number Needed to Treat (NNT) = 1/ARR

Cost of preventing one bad outcome = NNT x Treatment Cost*

*This is just ∆$Cost/∆Risk.
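A sketch tying the treatment-effect measures to the back-of-the-envelope cost calculation. The function names, the 2x2 counts (5/100 bad outcomes with treatment, 10/100 with control) and the $100 treatment cost are all hypothetical.

```python
def treatment_effect(a: int, b: int, c: int, d: int) -> dict:
    """Measures from the 2x2 table: a, b = treatment row; c, d = control row."""
    risk_treated = a / (a + b)
    risk_control = c / (c + d)
    rr = risk_treated / risk_control
    arr = risk_control - risk_treated
    return {
        "RR": rr,
        "RRR": 1.0 - rr,
        "ARR": arr,
        "NNT": 1.0 / arr,  # undefined when ARR is 0
    }


def cost_per_bad_outcome_prevented(nnt: float, treatment_cost: float) -> float:
    """NNT x treatment cost."""
    return nnt * treatment_cost


# Hypothetical trial: 5/100 bad outcomes with treatment, 10/100 with control.
EFFECT = treatment_effect(a=5, b=95, c=10, d=90)
COST = cost_per_bad_outcome_prevented(EFFECT["NNT"], treatment_cost=100.0)
print(EFFECT, COST)
```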

Alternatives to RCTs

Instrumental Variables

Associated with the predictor of interest and not independently associated with the outcome. (There is no causal relationship between the instrument and the outcome, and there is no confounder other than the predictor of interest.)

Alternatives to RCTs

Propensity Scores

Logistic model predicts treatment rather than outcome.

Stratify by probability of treatment and compare outcomes in those treated to outcomes in those not treated.

Assessing the Importance of Confounding

You think a treatment improves an outcome, but are worried that a 3rd factor (a confounder) is the true cause of improved outcomes and your treatment is merely associated with this confounder.

Look at a different outcome that you don’t think the treatment could affect, and see if the treatment still seems to do so.

If treatment does appear to improve the second outcome, confounding is a problem.

Assessing the Importance of Confounding: Example

You think sigmoidoscopy decreases colon cancer death, but are worried that a confounder (e.g., better health care) is the true cause of decreased colon cancer deaths and sigmoidoscopy is merely associated with this confounder.

Look at death from proximal colon cancers (beyond the reach of the sigmoidoscope) that cannot be affected by sigmoidoscopy itself, and see if the sigmoidoscopy still seems to improve outcomes.

If so, confounding is a problem.

P-values and Confidence Intervals

• Section today.

• Problem 11.7 – If the baseline rate (in the placebo group) of a “bad” outcome is very low, then a relative risk reduction may have to be very large to make treatment worthwhile. Look at the confidence interval around the absolute risk reduction.

RCT of Prophylactic Antibiotics to Prevent Dog-Bite Infections

You don’t think prophylactic antibiotics are worth it if you need to treat more than 20 people to prevent one infection. (NNT ≤ 20, ARR ≥ 0.05)

You expect a baseline infection rate of 11% and power your study to pick up an ARR of 5% (NNT = 20, RRR = 50%) by picking a group size of 500.

RCT of Prophylactic Antibiotics to Prevent Dog-Bite Infections

                            Infection   No Infection   Total   Risk
Prophylactic Antibiotics    8           492            500     0.016
Placebo                     15          485            500     0.03

P = 0.14

        Estimate   95% Conf. Intrv.
RR      0.53       0.23 - 1.25
ARR     0.014      -0.005 - 0.03
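The point estimates and intervals can be approximated with standard large-sample (Wald-type) formulas. The slides do not say how the intervals were computed, so the method below is an assumption; it happens to agree with the reported values after rounding. Function names are invented for this example.

```python
import math


def rr_with_ci(events_tx, n_tx, events_ctl, n_ctl, z=1.96):
    """Risk ratio with a CI computed on the log scale (assumed method)."""
    rr = (events_tx / n_tx) / (events_ctl / n_ctl)
    se_log_rr = math.sqrt(1 / events_tx - 1 / n_tx + 1 / events_ctl - 1 / n_ctl)
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


def arr_with_ci(events_tx, n_tx, events_ctl, n_ctl, z=1.96):
    """Absolute risk reduction (control risk minus treatment risk) with a Wald CI."""
    risk_tx = events_tx / n_tx
    risk_ctl = events_ctl / n_ctl
    arr = risk_ctl - risk_tx
    se = math.sqrt(risk_tx * (1 - risk_tx) / n_tx + risk_ctl * (1 - risk_ctl) / n_ctl)
    return arr, arr - z * se, arr + z * se


# Dog-bite trial counts: 8/500 infections with antibiotics, 15/500 with placebo.
RR, RR_LO, RR_HI = rr_with_ci(8, 500, 15, 500)
ARR, ARR_LO, ARR_HI = arr_with_ci(8, 500, 15, 500)
print(f"RR  {RR:.2f} ({RR_LO:.2f} to {RR_HI:.2f})")
print(f"ARR {ARR:.3f} ({ARR_LO:.3f} to {ARR_HI:.3f})")
```

The upper limit of the ARR interval (about 0.03) is below the 0.05 that was judged the minimum worthwhile benefit, which is the point of looking at the interval around the absolute risk reduction.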
