Darlene Goldstein 29 January 2003 Receiver Operating Characteristic Methodology.
-
date post
21-Dec-2015 -
Category
Documents
-
view
242 -
download
0
Transcript of Darlene Goldstein 29 January 2003 Receiver Operating Characteristic Methodology.
![Page 1: Darlene Goldstein 29 January 2003 Receiver Operating Characteristic Methodology.](https://reader030.fdocuments.in/reader030/viewer/2022033016/56649d555503460f94a3369e/html5/thumbnails/1.jpg)
Darlene Goldstein 29 January 2003
Receiver Operating Characteristic Methodology
![Page 2: Darlene Goldstein 29 January 2003 Receiver Operating Characteristic Methodology.](https://reader030.fdocuments.in/reader030/viewer/2022033016/56649d555503460f94a3369e/html5/thumbnails/2.jpg)
Outline
• Introduction
• Hypothesis testing
• ROC curve
• Area under the ROC curve (AUC)
• Examples using ROC
• Concluding remarks
![Page 3: Darlene Goldstein 29 January 2003 Receiver Operating Characteristic Methodology.](https://reader030.fdocuments.in/reader030/viewer/2022033016/56649d555503460f94a3369e/html5/thumbnails/3.jpg)
Introduction to ROC curves
• ROC = Receiver Operating Characteristic
• Started in electronic signal detection theory (1940s - 1950s)
• Has become very popular in biomedical applications, particularly radiology and imaging
• Also used in machine learning applications to assess classifiers
• Can be used to compare tests/procedures
![Page 4: Darlene Goldstein 29 January 2003 Receiver Operating Characteristic Methodology.](https://reader030.fdocuments.in/reader030/viewer/2022033016/56649d555503460f94a3369e/html5/thumbnails/4.jpg)
ROC curves: simplest case
• Consider diagnostic test for a disease
• Test has 2 possible outcomes:
– ‘postive’ = suggesting presence of disease
– ‘negative’
• An individual can test either positive or negative for the disease
• Prof. Mean...
![Page 5: Darlene Goldstein 29 January 2003 Receiver Operating Characteristic Methodology.](https://reader030.fdocuments.in/reader030/viewer/2022033016/56649d555503460f94a3369e/html5/thumbnails/5.jpg)
Hypothesis testing refresher
• 2 ‘competing theories’ regarding a population parameter:– NULL hypothesis H (‘straw man’)– ALTERNATIVE hypothesis A (‘claim’, or
theory you wish to test)• H: NO DIFFERENCE
– any observed deviation from what we expect to see is due to chance variability
• A: THE DIFFERENCE IS REAL
![Page 6: Darlene Goldstein 29 January 2003 Receiver Operating Characteristic Methodology.](https://reader030.fdocuments.in/reader030/viewer/2022033016/56649d555503460f94a3369e/html5/thumbnails/6.jpg)
Test statistic
• Measure how far the observed data are from what is expected assuming the NULL H by computing the value of a test statistic (TS) from the data
• The particular TS computed depends on the parameter
• For example, to test the population mean , the TS is the sample mean (or standardized sample mean)
• The NULL is rejected fi the TS falls in a user-specified ‘rejection region’
![Page 7: Darlene Goldstein 29 January 2003 Receiver Operating Characteristic Methodology.](https://reader030.fdocuments.in/reader030/viewer/2022033016/56649d555503460f94a3369e/html5/thumbnails/7.jpg)
True disease state vs. Test result
not rejected rejected
No disease (D = 0)
specificity
XType I error (False +)
Disease (D = 1) X
Type II error (False -)
Power 1 - ; sensitivity
DiseaseTest
![Page 8: Darlene Goldstein 29 January 2003 Receiver Operating Characteristic Methodology.](https://reader030.fdocuments.in/reader030/viewer/2022033016/56649d555503460f94a3369e/html5/thumbnails/8.jpg)
Specific Example
Test Result
Pts Pts with with diseasdiseasee
Pts Pts without without the the diseasedisease
![Page 9: Darlene Goldstein 29 January 2003 Receiver Operating Characteristic Methodology.](https://reader030.fdocuments.in/reader030/viewer/2022033016/56649d555503460f94a3369e/html5/thumbnails/9.jpg)
Test Result
Call these patients “negative”
Call these patients “positive”
Threshold
![Page 10: Darlene Goldstein 29 January 2003 Receiver Operating Characteristic Methodology.](https://reader030.fdocuments.in/reader030/viewer/2022033016/56649d555503460f94a3369e/html5/thumbnails/10.jpg)
Test Result
Call these patients “negative”
Call these patients “positive”
without the diseasewith the disease
True Positives
Some definitions ...
![Page 11: Darlene Goldstein 29 January 2003 Receiver Operating Characteristic Methodology.](https://reader030.fdocuments.in/reader030/viewer/2022033016/56649d555503460f94a3369e/html5/thumbnails/11.jpg)
Test Result
Call these patients “negative”
Call these patients “positive”
without the diseasewith the disease
False Positives
![Page 12: Darlene Goldstein 29 January 2003 Receiver Operating Characteristic Methodology.](https://reader030.fdocuments.in/reader030/viewer/2022033016/56649d555503460f94a3369e/html5/thumbnails/12.jpg)
Test Result
Call these patients “negative”
Call these patients “positive”
without the diseasewith the disease
True negatives
![Page 13: Darlene Goldstein 29 January 2003 Receiver Operating Characteristic Methodology.](https://reader030.fdocuments.in/reader030/viewer/2022033016/56649d555503460f94a3369e/html5/thumbnails/13.jpg)
Test Result
Call these patients “negative”
Call these patients “positive”
without the diseasewith the disease
False negatives
![Page 14: Darlene Goldstein 29 January 2003 Receiver Operating Characteristic Methodology.](https://reader030.fdocuments.in/reader030/viewer/2022033016/56649d555503460f94a3369e/html5/thumbnails/14.jpg)
Test Result
without the diseasewith the disease
‘‘‘‘-’-’’’
‘‘‘‘+’+’’’
Moving the Threshold: right
![Page 15: Darlene Goldstein 29 January 2003 Receiver Operating Characteristic Methodology.](https://reader030.fdocuments.in/reader030/viewer/2022033016/56649d555503460f94a3369e/html5/thumbnails/15.jpg)
Test Result
without the diseasewith the disease
‘‘‘‘-’-’’’
‘‘‘‘+’+’’’
Moving the Threshold: left
![Page 16: Darlene Goldstein 29 January 2003 Receiver Operating Characteristic Methodology.](https://reader030.fdocuments.in/reader030/viewer/2022033016/56649d555503460f94a3369e/html5/thumbnails/16.jpg)
Tru
e P
osi
tive R
ate
(s
en
siti
vit
y)
0%
100%
False Positive Rate (1-specificity)
0%
100%
ROC curve
![Page 17: Darlene Goldstein 29 January 2003 Receiver Operating Characteristic Methodology.](https://reader030.fdocuments.in/reader030/viewer/2022033016/56649d555503460f94a3369e/html5/thumbnails/17.jpg)
Tru
e P
osi
tive
Ra
te
0%
100%
False Positive Rate0%
100%
Tru
e P
osi
tive
Ra
te
0%
100%
False Positive Rate0%
100%
A good test: A poor test:
ROC curve comparison
![Page 18: Darlene Goldstein 29 January 2003 Receiver Operating Characteristic Methodology.](https://reader030.fdocuments.in/reader030/viewer/2022033016/56649d555503460f94a3369e/html5/thumbnails/18.jpg)
Best Test: Worst test:T
rue
Po
sitiv
e R
ate
0%
100%
False Positive Rate
0%
100%
Tru
e P
osi
tive
R
ate
0%
100%
False Positive Rate
0%
100%
The distributions don’t overlap at all
The distributions overlap completely
ROC curve extremes
![Page 19: Darlene Goldstein 29 January 2003 Receiver Operating Characteristic Methodology.](https://reader030.fdocuments.in/reader030/viewer/2022033016/56649d555503460f94a3369e/html5/thumbnails/19.jpg)
‘Classical’ estimation
• Binormal model:
– X ~ N(0,1) in nondiseased population
– X ~ N(a, 1/b) in diseased population
• Then
ROC(t) = (a + b-1(t)) for 0 < t < 1
• Estimate a, b by ML using readings from sets of diseased and nondiseased patients
![Page 20: Darlene Goldstein 29 January 2003 Receiver Operating Characteristic Methodology.](https://reader030.fdocuments.in/reader030/viewer/2022033016/56649d555503460f94a3369e/html5/thumbnails/20.jpg)
ROC curve estimation with continuous data
• Many biochemical measurements are in fact continuous, e.g. blood glucose vs. diabetes
• Can also do ROC analysis for continuous (rather than binary or ordinal) data
• Estimate ROC curve (and smooth) based on empirical ‘survivor’ function (1 – cdf) in diseased and nondiseased groups
• Can also do regression modeling of the test result
• Another approach is to model the ROC curve directlyas a function of covariates
![Page 21: Darlene Goldstein 29 January 2003 Receiver Operating Characteristic Methodology.](https://reader030.fdocuments.in/reader030/viewer/2022033016/56649d555503460f94a3369e/html5/thumbnails/21.jpg)
Area under ROC curve (AUC)
• Overall measure of test performance
• Comparisons between two tests based on differences between (estimated) AUC
• For continuous data, AUC equivalent to Mann-Whitney U-statistic (nonparametric test of difference in location between two populations)
![Page 22: Darlene Goldstein 29 January 2003 Receiver Operating Characteristic Methodology.](https://reader030.fdocuments.in/reader030/viewer/2022033016/56649d555503460f94a3369e/html5/thumbnails/22.jpg)
Tru
e P
osi
tive
Ra
te
0%
100%
False Positive Rate
0%
100%
Tru
e P
osi
tive
R
ate
0%
100%
False Positive Rate
0%
100%
Tru
e P
osi
tive
R
ate
0%
100%
False Positive Rate
0%
100%
AUC = 50%
AUC = 90% AUC =
65%
AUC = 100%
Tru
e P
osi
tive
R
ate
0%
100%
False Positive Rate
0%
100%
AUC for ROC curves
![Page 23: Darlene Goldstein 29 January 2003 Receiver Operating Characteristic Methodology.](https://reader030.fdocuments.in/reader030/viewer/2022033016/56649d555503460f94a3369e/html5/thumbnails/23.jpg)
Interpretation of AUC
• AUC can be interpreted as the probability that the test result from a randomly chosen diseased individual is more indicative of disease than that from a randomly chosen nondiseased individual: P(Xi Xj | Di = 1, Dj = 0)
• So can think of this as a nonparametric distance between disease/nondisease test results
![Page 24: Darlene Goldstein 29 January 2003 Receiver Operating Characteristic Methodology.](https://reader030.fdocuments.in/reader030/viewer/2022033016/56649d555503460f94a3369e/html5/thumbnails/24.jpg)
Problems with AUC
• No clinically relevant meaning
• A lot of the area is coming from the range of large false positive values, no one cares what’s going on in that region (need to examine restricted regions)
• The curves might cross, so that there might be a meaningful difference in performance that is not picked up by AUC
![Page 25: Darlene Goldstein 29 January 2003 Receiver Operating Characteristic Methodology.](https://reader030.fdocuments.in/reader030/viewer/2022033016/56649d555503460f94a3369e/html5/thumbnails/25.jpg)
Examples using ROC analysis
• Threshold selection for ‘tuning’ an already trained classifier (e.g. neural nets)
• Defining signal thresholds in DNA microarrays (Bilban et al.)
• Comparing test statistics for identifying differentially expressed genes in replicated microarray data (Lönnstedt and Speed)
• Assessing performance of different protein prediction algorithms (Tang et al.)
• Inferring protein homology (Karwath and King)
![Page 26: Darlene Goldstein 29 January 2003 Receiver Operating Characteristic Methodology.](https://reader030.fdocuments.in/reader030/viewer/2022033016/56649d555503460f94a3369e/html5/thumbnails/26.jpg)
Homology Induction ROC
![Page 27: Darlene Goldstein 29 January 2003 Receiver Operating Characteristic Methodology.](https://reader030.fdocuments.in/reader030/viewer/2022033016/56649d555503460f94a3369e/html5/thumbnails/27.jpg)
Concluding remarks – remaining challenges in ROC methodology
• Inference for ROC curve when no ‘gold standard’
• Role of ROC in combining information?
• Incorporating time into ROC analysis
• Alternatives to ROC for describing test accuracy?
• Generalization of positive/negative predictive value to continuous test? (+/-) predictive value = proportion of patients with
(+/-) result who are correctly diagnosed = True/(True + False)