Umut ARSLAN, Ergun KARAAĞAOĞLU, Banu BOZKURT, Murat İRKEÇ
3rd EMR-IBS International Conference, CORFU, 2005
INFORMATION THEORY and KULLBACK-LEIBLER DISTANCE
APPROACH IN THE EVALUATION OF DIAGNOSTIC TEST
PERFORMANCE
In clinical medicine, different diagnostic tests may be used either to show the presence (ruling in) or absence (ruling out) of a given condition.
Most of these tests are imperfect: they cannot discriminate between patients with and without the given condition with 100% accuracy.
Different measures can be used to evaluate the performance of diagnostic tests.
These measures are:
Sensitivity (True positive rate-TP)
Specificity (True negative rate-TN)
1-Sensitivity (False negative rate-FN)
1-Specificity (False positive rate-FP)
Combinations of these measures, such as likelihood ratios, are also valuable for evaluating diagnostic tests with binary outcomes.
When the diagnostic variable is non-binary, ROC curve analysis is another performance criterion.
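As a minimal sketch of how these measures relate to one another, the function below computes them from the four cells of a 2x2 table. The function name, the argument names and the counts are hypothetical and are not taken from the study; the arguments are counts, whereas TP, TN, FP and FN in the slides denote rates.

```python
def binary_test_measures(n_tp, n_fn, n_fp, n_tn):
    """Rates and likelihood ratios from 2x2 counts (illustrative helper).

    n_tp, n_fn: diseased subjects with positive / negative results.
    n_fp, n_tn: disease-free subjects with positive / negative results.
    Assumes specificity is strictly between 0 and 1, so neither
    likelihood ratio divides by zero.
    """
    sensitivity = n_tp / (n_tp + n_fn)
    specificity = n_tn / (n_tn + n_fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "false_negative_rate": 1 - sensitivity,
        "false_positive_rate": 1 - specificity,
        "lr_positive": sensitivity / (1 - specificity),
        "lr_negative": (1 - sensitivity) / specificity,
    }


# Made-up counts, for illustration only.
measures = binary_test_measures(n_tp=80, n_fn=20, n_fp=10, n_tn=90)
for name, value in measures.items():
    print(f"{name}: {value:.3f}")
```

The positive likelihood ratio is sensitivity divided by the false positive rate, and the negative likelihood ratio is the false negative rate divided by specificity.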
The discriminating power of a diagnostic test is determined by three parameters. Two of them (the sensitivity and the specificity) depend only on the properties of the diagnostic test itself, while the third (the prevalence of the target disorder) depends on the population being studied [1].
Although ROC curves help us to understand many important features of diagnostic tests, they cannot be used to determine whether one test performs better than another at a specific cut-off point and a given prevalence, so new approaches are needed to evaluate tests.
Besides these conventional methods, new techniques are being studied for the evaluation and applicability of diagnostic tests; they provide more detailed conclusions about the diagnostic criteria.
At this point, information theory and its extension, the Kullback-Leibler distance, appear to be powerful tools for handling such problems.
The uncertainty about the disease under consideration before the diagnostic test is applied is referred to as the “a priori uncertainty”; the uncertainty that remains after the test result is known is referred to as the “a posteriori uncertainty”.
Uncertainty is measured in bits. As indicated by Shannon and Weaver, for mutually exclusive events Zi, each occurring with probability P(Zi), the uncertainty is [2,3]:
\[ H(Z) = -\sum_{i} P(Z_i)\,\log_2 P(Z_i) \]
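The uncertainty formula can be sketched in a few lines of Python. This is only an illustration: the function name is hypothetical, and the 2% prevalence is borrowed from the range quoted for glaucoma later in the slides to show the a priori uncertainty of a disease/no-disease question.

```python
import math


def entropy_bits(probabilities):
    """Shannon uncertainty, in bits, of mutually exclusive events."""
    if not math.isclose(sum(probabilities), 1.0, abs_tol=1e-9):
        raise ValueError("probabilities must sum to 1")
    # Events with zero probability contribute nothing to the sum.
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


# A priori uncertainty about disease status at an assumed prevalence of 2%.
prevalence = 0.02
prior_uncertainty = entropy_bits([prevalence, 1 - prevalence])
print(f"a priori uncertainty: {prior_uncertainty:.4f} bits")
```

The uncertainty is largest (1 bit) when the two outcomes are equally likely, which is why a test can never provide more than 1 bit about a binary disease status.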
The difference between the “a priori uncertainty” and the “a posteriori uncertainty” is regarded as the gain in information obtained from the diagnostic test and is referred to as the information content of that test.
Metz et al. developed an equation that measures the information content, I, in bits, as given below [4,5]:
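\[
\begin{aligned}
I ={}& (\mathrm{TP})(\mathrm{Pr})\,\log_2\!\left[\frac{\mathrm{TP}}{B}\right] + (\mathrm{FP})(1-\mathrm{Pr})\,\log_2\!\left[\frac{\mathrm{FP}}{B}\right] \\
&+ (1-\mathrm{TP})(\mathrm{Pr})\,\log_2\!\left[\frac{1-\mathrm{TP}}{1-B}\right] + (1-\mathrm{FP})(1-\mathrm{Pr})\,\log_2\!\left[\frac{1-\mathrm{FP}}{1-B}\right]
\end{aligned}
\qquad (1)
\]
where
\[ B = (\mathrm{TP})(\mathrm{Pr}) + (\mathrm{FP})(1-\mathrm{Pr}) \]
Here TP and FP are the true and false positive rates and Pr is the prevalence.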
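Equation (1) can be evaluated directly. The sketch below is one possible transcription, not the authors' code; the function name is hypothetical, and cells with zero joint probability are skipped so that the logarithm is never taken of zero.

```python
import math


def information_content(tp, fp, pr):
    """Information content I (bits) from equation (1).

    tp, fp: true and false positive rates; pr: prevalence.
    """
    b = tp * pr + fp * (1 - pr)  # overall probability of a positive result

    def term(joint, rate, marginal):
        # joint * log2(rate / marginal); a zero joint probability contributes 0.
        return joint * math.log2(rate / marginal) if joint > 0 else 0.0

    return (
        term(tp * pr, tp, b)
        + term(fp * (1 - pr), fp, b)
        + term((1 - tp) * pr, 1 - tp, 1 - b)
        + term((1 - fp) * (1 - pr), 1 - fp, 1 - b)
    )


# Illustrative values only: a test with TP = 0.8 and FP = 0.1 at 2% prevalence.
print(f"{information_content(tp=0.8, fp=0.1, pr=0.02):.4f} bits")
```

Two sanity checks follow from the definition: a test with TP equal to FP carries no information, and a perfect test at 50% prevalence recovers the full 1 bit of a priori uncertainty.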
Note that, for a given test, the information gain is a function of the cut-off and the prevalence.
Using information theory, it is possible to determine the test that has the maximum information and to choose the best cut-off value, that is, the point that maximizes the information content at a given prevalence.
The graphical methods introduced by information theory enable us to compare diagnostic tests over a wide range of prevalence values in terms of their maximum information and to identify how the best cut-off point changes as the disease prevalence changes [4,5,6].
Glaucoma is one of the leading causes of blindness worldwide, with a prevalence as high as 8% among Black populations and around 1% to 2% among white populations.
It is a progressive optic neuropathy characterized by increased intraocular pressure (IOP), optic nerve head damage, retinal nerve fiber layer (RNFL) defects and typical glaucomatous visual field loss.
In the last decade, many diagnostic imaging devices, including laser scanning technology and optical coherence tomography, have come into use worldwide in glaucoma. These techniques allow quantitative evaluation of the optic nerve head and the RNFL and have considerable sensitivity and specificity for glaucoma screening. They yield many parameters. However, there is no clear cut-off point for these parameters that separates glaucoma patients from healthy subjects, since most of the measurements overlap between the groups. Many statistical techniques are used to evaluate the measurements and make a correct diagnosis.
The Nerve Fiber Analyzer (NFA GDx), one of these imaging devices, measures the RNFL thickness using 14 parameters:
Symmetry
Superior Ratio
Inferior Ratio
Superior/Nasal
Maximum Modulation
Superior Maximum
Inferior Maximum
The Number
Ellipse Modulation
Average Thickness
Ellipse Average
Superior Average
Inferior Average
Superior Integral
We used information theory to determine the parameter with the highest discriminating power and the cut-off points associated with this parameter.
To determine the parameter with the highest information content over a wide range of prevalence values, and the best cut-off point for a given prevalence, information theory uses two types of graphs:
1. Graph of maximum information versus prevalence over a wide range of prevalence values (MIP curve)
2. Graph of the information content versus cut-off for a given prevalence
Under the binormal assumption, the performance of a diagnostic test may be described using two indices:
Δm, the distance between the two populations’ means, expressed in units of the standard deviation of the disease free (D-) distribution, and
s, the ratio of the standard deviations,
s = SD(D-)/SD(D+) (2)
In algebraic terms, the relationship between the true and false positive rates is given by the straight-line equation
ZTP = s·ZFP + s·Δm
where ZTP and ZFP are the normal deviates corresponding to TP and FP.
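The straight line follows from the definitions of Δm and s in a few steps. The derivation below is a sketch under assumptions the slides do not state: higher values of the diagnostic variable indicate disease, a result above the cut-off c is called positive, and Φ is the standard normal distribution function. The symbols μ, σ and c are introduced here for illustration.

```latex
% Assumed setup: X | D- ~ N(\mu_{D-}, \sigma_{D-}^2), X | D+ ~ N(\mu_{D+}, \sigma_{D+}^2),
% and a result is positive when X > c.
\begin{align*}
\mathrm{TP} &= \Phi(Z_{TP}), & Z_{TP} &= \frac{\mu_{D+} - c}{\sigma_{D+}}, \\
\mathrm{FP} &= \Phi(Z_{FP}), & Z_{FP} &= \frac{\mu_{D-} - c}{\sigma_{D-}}, \\
\Delta m &= \frac{\mu_{D+} - \mu_{D-}}{\sigma_{D-}}, & s &= \frac{\sigma_{D-}}{\sigma_{D+}}.
\end{align*}
% Split the numerator of Z_{TP} at \mu_{D-} and rescale both parts by \sigma_{D-}:
\begin{align*}
Z_{TP} &= \frac{\mu_{D+} - \mu_{D-}}{\sigma_{D+}} + \frac{\mu_{D-} - c}{\sigma_{D+}} \\
       &= \frac{\sigma_{D-}}{\sigma_{D+}}\,\Delta m + \frac{\sigma_{D-}}{\sigma_{D+}}\,Z_{FP} \\
       &= s\,Z_{FP} + s\,\Delta m .
\end{align*}
```

The cut-off c cancels out of the final relationship, which is why the line depends only on s and Δm.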
For a given prevalence, a curve of information versus the diagnostic variable can be drawn by using the s and Δm values. The steps are as follows [4]:
Step 1. For different cut-off values, the TP and the corresponding normal deviates ZTP were found. The set of cut-off values and ZTP were then fitted to a polynomial, so that a cut-off could be calculated for each value of ZTP.
Step 2. For a set of values of ZFP, the corresponding ZTP values were calculated by using the equation ZTP = s·ZFP + s·Δm.
Step 3. The ZTP and ZFP values were converted to TP and FP.
Step 4. To calculate the information, the FP and TP values found in Step 3 and the prevalence were substituted into the information content equation (1).
Step 5. To express the information as a function of the diagnostic variable, a ZTP was found for each ZFP by using the equation in Step 2, and this was substituted into the polynomial defined in Step 1 to get the corresponding cut-off.
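Steps 2 to 4 can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: the function names, the grid of ZFP values and the values of Δm, s and prevalence are all hypothetical, and the polynomial mapping back to cut-off values (Steps 1 and 5) is left out, so the curve is indexed by ZFP instead of by the diagnostic variable.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # converts normal deviates to rates (Step 3)


def information_content(tp, fp, pr):
    """Information content I (bits) from equation (1)."""
    b = tp * pr + fp * (1 - pr)

    def term(joint, rate, marginal):
        return joint * math.log2(rate / marginal) if joint > 0 else 0.0

    return (
        term(tp * pr, tp, b)
        + term(fp * (1 - pr), fp, b)
        + term((1 - tp) * pr, 1 - tp, 1 - b)
        + term((1 - fp) * (1 - pr), 1 - fp, 1 - b)
    )


def information_curve(delta_m, s, pr, z_fp_values):
    """Return rows of (ZFP, TP, FP, information) for a binormal test."""
    rows = []
    for z_fp in z_fp_values:
        z_tp = s * z_fp + s * delta_m       # Step 2: straight-line equation
        tp = STD_NORMAL.cdf(z_tp)           # Step 3: deviates to rates
        fp = STD_NORMAL.cdf(z_fp)
        rows.append((z_fp, tp, fp, information_content(tp, fp, pr)))  # Step 4
    return rows


# Hypothetical binormal test: means two SDs apart, equal variances, Pr = 0.5.
z_grid = [i / 10 for i in range(-30, 31)]
curve = information_curve(delta_m=2.0, s=1.0, pr=0.5, z_fp_values=z_grid)
best = max(curve, key=lambda row: row[3])
print(f"max information {best[3]:.4f} bits at ZFP = {best[0]:.1f}")
```

Repeating the maximization over a grid of prevalence values would give one point of an MIP curve per prevalence.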
Figure 1. Maximum Information Contents of Parameters Versus Prevalence (MIP Curves).
[Plot: x-axis Prevalence (0.00 to 1.00); y-axis Maximum Information Content (bits) (0.00 to 0.50); one curve for each of the 14 GDx parameters.]
Figure 2. Information Content of “The Number” Versus Cut-off Values at Two Different Prevalences
[Plot: x-axis “The Number” cut-off (15.0 to 60.0); y-axis Information Content (bits) (0.00 to 0.08); curves for Pr = 1% and Pr = 2%.]
The Kullback-Leibler distance is a concept within information theory and is also known as relative entropy or divergence [7].
It measures the distance between two probability distributions.
Let the ordinal diagnostic test results be indexed by i, i=1,2,…,K.
The proportions of diseased and disease free subjects in the ith testing category are assumed known and are denoted fi and gi, respectively, with
\[ \sum_{i} f_i = \sum_{i} g_i = 1 \]
That is,
fi=Pr(test result is i amongst diseased subjects) and
gi=Pr(test result is i amongst disease free subjects).
The Likelihood Ratio (LR) function for this test is LRi=fi/gi.
The Kullback-Leibler distance measures the distance between the diseased and the disease free distributions (the f and the g distributions).
Using the disease free distribution as the reference, the Kullback-Leibler distance, denoted D(f||g), is defined by the following equation:
For Ordinal Test Results
\[ D(f\|g) = \sum_{i=1}^{K} f_i \log\frac{f_i}{g_i} = \sum_{i=1}^{K} f_i \log LR_i \]
\[ D(g\|f) = \sum_{i=1}^{K} g_i \log\frac{g_i}{f_i} = \sum_{i=1}^{K} g_i \log(1/LR_i) \]
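For ordinal results the two distances can be computed as below. This is a minimal sketch: the function name and the three-category distributions are made up for illustration, and the natural logarithm is used because the tabulated exp D values later in the slides correspond to natural logs.

```python
import math


def kl_distance(p, q):
    """Kullback-Leibler distance D(p||q), using natural logarithms.

    p and q are lists of category proportions that each sum to 1.
    A category with p_i = 0 contributes nothing; a category with
    p_i > 0 but q_i = 0 makes the distance infinite.
    """
    total = 0.0
    for p_i, q_i in zip(p, q):
        if p_i == 0:
            continue
        if q_i == 0:
            return math.inf
        total += p_i * math.log(p_i / q_i)
    return total


# Hypothetical proportions over K = 3 ordered test categories.
f = [0.1, 0.3, 0.6]  # diseased subjects
g = [0.6, 0.3, 0.1]  # disease free subjects

print(f"D(f||g) = {kl_distance(f, g):.4f}")  # rule in
print(f"D(g||f) = {kl_distance(g, f):.4f}")  # rule out
```

In general the two distances differ, because the Kullback-Leibler distance is not symmetric; they coincide in this example only because the made-up distributions mirror each other.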
For Binary Test Results
\[ D(f\|g) = (1-Se)\log\frac{1-Se}{Sp} + Se\log\frac{Se}{1-Sp} = (1-Se)\log LR_{-} + Se\log LR_{+} \]
\[ D(g\|f) = Sp\log\frac{Sp}{1-Se} + (1-Sp)\log\frac{1-Sp}{Se} = Sp\log(1/LR_{-}) + (1-Sp)\log(1/LR_{+}) \]
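The binary case can be sketched in terms of the likelihood ratios. The function name and the sensitivity and specificity values are hypothetical, and natural logarithms are assumed.

```python
import math


def kl_binary(se, sp):
    """Return (D(f||g), D(g||f)) for a binary test.

    se: sensitivity, sp: specificity, both strictly between 0 and 1.
    """
    lr_pos = se / (1 - sp)   # likelihood ratio of a positive result
    lr_neg = (1 - se) / sp   # likelihood ratio of a negative result
    d_fg = (1 - se) * math.log(lr_neg) + se * math.log(lr_pos)
    d_gf = sp * math.log(1 / lr_neg) + (1 - sp) * math.log(1 / lr_pos)
    return d_fg, d_gf


# Illustrative values only.
d_fg, d_gf = kl_binary(se=0.8, sp=0.9)
print(f"D(f||g) = {d_fg:.4f}, D(g||f) = {d_gf:.4f}")
```

With these made-up values the rule-in distance exceeds the rule-out distance, which reflects the higher specificity than sensitivity.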
Since the mathematics required to compute the Kullback-Leibler distance is complex for continuous diagnostic variables, it is more practical to categorize a continuous diagnostic variable into an ordinal scale.
In order to rule in or rule out a disease, two quantities, Pin and Pout, can be formulated. The first is
\[ P_{in} = \exp D(f\|g) \]
Pin is the ratio, for a randomly selected diseased subject, of the post test disease odds to the pre test disease odds.
Pin measures the “increase” in disease odds for a diseased subject.
Let the ordinal diagnostic test results be indexed by i, i=1,2,…,K (K≥2).
Before testing, every subject has the same pre test disease odds, denoted h. After testing, the post test disease odds differ depending on the test results.
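The post test disease odds is (fi/gi)h for the ith category of the test results.
Among the diseased subjects in the testing population, the geometrically averaged post test disease odds is
\[ \prod_{i=1}^{K}\left(\frac{f_i}{g_i}\,h\right)^{f_i} = h\prod_{i=1}^{K}\left(\frac{f_i}{g_i}\right)^{f_i} = h\cdot\exp\left(\sum_{i=1}^{K} f_i\log\frac{f_i}{g_i}\right) = h\cdot P_{in} \]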
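The identity can be checked numerically. In the sketch below the distributions f and g and the pre test odds h are made up, the function names are hypothetical, and all proportions are assumed to be strictly positive so that every ratio and logarithm is defined.

```python
import math


def geometric_mean_post_test_odds(f, g, h):
    """Geometric average of the post test odds (f_i/g_i)*h among the diseased,
    weighting category i by the proportion f_i of diseased subjects in it."""
    product = 1.0
    for f_i, g_i in zip(f, g):
        product *= ((f_i / g_i) * h) ** f_i
    return product


def p_in(f, g):
    """Pin = exp D(f||g), with natural logarithms."""
    return math.exp(sum(f_i * math.log(f_i / g_i) for f_i, g_i in zip(f, g)))


f = [0.1, 0.3, 0.6]   # hypothetical diseased proportions
g = [0.6, 0.3, 0.1]   # hypothetical disease free proportions
h = 0.02 / 0.98       # pre test odds at an assumed 2% prevalence

left = geometric_mean_post_test_odds(f, g, h)
right = h * p_in(f, g)
print(f"geometric mean odds = {left:.6f}, h * Pin = {right:.6f}")
```

The pre test odds factor out because the exponents f_i sum to 1, which is the step between the first and second expressions in the equation.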
Similarly, it can be shown that the geometrically averaged post test disease odds among the disease free subjects in the testing population is h/Pout, where
\[ P_{out} = \exp D(g\|f) \]
Pout measures the “decrease” in disease odds of a disease free subject.
With these new performance indices, we can have a better understanding of the rule in and rule out potentials of a diagnostic test.
A diagnostic test with greater D(f||g) (and greater Pin), if administered, will on average make disease presence more likely among the diseased subjects in the testing population; its potential for ruling in disease is higher.
In contrast, a diagnostic test with greater D(g||f) (and greater Pout), if administered, will on average make disease presence less likely among the disease free subjects; it has a higher rule out potential.
GDx Parameters        D(f||g)    exp D(f||g)
The Number            1.37393    3.95084
Maximum Modulation    1.04413    2.84093
Superior Ratio        1.01587    2.76176
Inferior Maximum      1.00882    2.74237
Ellipse Modulation    0.97852    2.66050
Superior Average      0.91469    2.49601
Inferior Ratio        0.90279    2.46648
Superior/Nasal        0.89588    2.44948
Superior Maximum      0.88383    2.42015
Ellipse Average       0.81593    2.26127
Inferior Average      0.79773    2.22048
Superior Integral     0.74410    2.10454
Average Thickness     0.62698    1.87195
Symmetry              0.04826    1.04944
GDx Parameters        D(g||f)    exp D(g||f)
The Number            1.02595    2.78974
Inferior Ratio        0.95252    2.59222
Superior Ratio        0.94909    2.58335
Maximum Modulation    0.94310    2.56793
Superior Maximum      0.84195    2.32089
Ellipse Modulation    0.80664    2.24036
Inferior Maximum      0.78916    2.20154
Superior Average      0.73443    2.08430
Superior/Nasal        0.69222    1.99815
Superior Integral     0.61580    1.85113
Inferior Average      0.59339    1.81011
Ellipse Average       0.56779    1.76436
Average Thickness     0.39611    1.48603
Symmetry              0.04175    1.04264
Rank  GDx Parameters (Rule in)  GDx Parameters (Rule out)  Information Theory
1     The Number                The Number                 The Number
2     Maximum Modulation        Inferior Ratio             Ellipse Modulation
3     Superior Ratio            Superior Ratio             Inferior Ratio
4     Inferior Maximum          Maximum Modulation         Maximum Modulation
5     Ellipse Modulation        Superior Maximum           Superior Maximum
6     Superior Average          Ellipse Modulation         Superior Average
7     Inferior Ratio            Inferior Maximum           Superior Ratio
8     Superior/Nasal            Superior Average           Inferior Maximum
9     Superior Maximum          Superior/Nasal             Superior/Nasal
10    Ellipse Average           Superior Integral          Inferior Average
11    Inferior Average          Inferior Average           Superior Integral
12    Superior Integral         Ellipse Average            Ellipse Average
13    Average Thickness         Average Thickness          Average Thickness
14    Symmetry                  Symmetry                   Symmetry
Discussion:
[Figure: Pin and Pout values (y-axis, 1.0 to 4.0) for each GDx parameter (x-axis), ordered from The Number to Symmetry; series: Pin, Pout.]
The ROC curve helps us to evaluate a diagnostic test as a whole. The area under the ROC curve gives the probability that a randomly selected diseased subject will have a more positive test result than a randomly selected non-diseased subject.
It does not take the prevalence of the disease into account.
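The probabilistic reading of the area under the ROC curve can be sketched by comparing every diseased subject with every non-diseased subject. The function name and the scores are hypothetical; the sketch assumes that higher scores are more positive and counts ties as one half, a common convention that the slides do not specify.

```python
def auc_by_pairs(diseased_scores, healthy_scores):
    """Estimate the AUC as the proportion of (diseased, non-diseased) pairs
    in which the diseased subject has the more positive result."""
    wins = 0.0
    for d in diseased_scores:
        for n in healthy_scores:
            if d > n:
                wins += 1.0
            elif d == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(diseased_scores) * len(healthy_scores))


# Made-up scores for illustration.
print(auc_by_pairs([3, 4, 5], [1, 2, 3]))
```

Neither group's size relative to the other affects the result, which is one way to see that the area does not depend on prevalence.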
A better approach to evaluating a diagnostic test is the use of information theory, which takes the prevalence into account.
By means of information theory, the maximum information content of a diagnostic test for a given prevalence and the best cut-off point at different prevalences can be calculated.
The Kullback-Leibler distance, which is an extension of information theory, further helps to evaluate the ruling in and ruling out potentials of a diagnostic test.
“The Number” is the best parameter in terms of the ROC curve, information theory and the Kullback-Leibler distance, both for ruling in and for ruling out the disease, and “Symmetry” is the worst in all categories. The ordering of the other parameters differs among the above-mentioned methods.
Larger values of the Kullback-Leibler distance indicate greater separation between the two distributions.
For a parameter, as D(f||g) increases, the corresponding Pin value also increases.
Among the GDx parameters, “The Number” seems to be the best parameter for ruling in the disease, whereas “Symmetry” is the worst.
For “The Number”, the ratio of post test odds to pre test odds of being diseased is 3.95 for a randomly selected diseased subject. In other words, for a diseased subject “The Number” increases the disease odds by a factor of 3.95.
Similarly, for a parameter, as D(g||f) increases, Pout increases. For ruling out the disease, “The Number” is again the best parameter and “Symmetry” is the worst.
The ratio of pre test disease odds to post test disease odds for a randomly selected disease free subject is 2.79 for “The Number”. In other words, for a disease free subject “The Number” decreases the disease odds by a factor of 2.79.
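To make these factors concrete, the sketch below applies the tabulated values for “The Number” to an assumed pre test prevalence of 2%, the upper end of the range quoted earlier for white populations. The prevalence choice and the helper names are illustrative assumptions, and the results are geometric averages over a group, not the odds of any individual patient.

```python
def odds_from_probability(p):
    return p / (1 - p)


def probability_from_odds(odds):
    return odds / (1 + odds)


PREVALENCE = 0.02   # assumed pre test probability of glaucoma
P_IN = 3.95084      # exp D(f||g) for "The Number", from the table
P_OUT = 2.78974     # exp D(g||f) for "The Number", from the table

h = odds_from_probability(PREVALENCE)   # pre test disease odds
post_odds_diseased = h * P_IN           # geometric average among the diseased
post_odds_disease_free = h / P_OUT      # geometric average among the disease free

print(f"pre test odds: {h:.5f}")
print(f"diseased: odds {post_odds_diseased:.5f}, "
      f"probability {probability_from_odds(post_odds_diseased):.4f}")
print(f"disease free: odds {post_odds_disease_free:.5f}, "
      f"probability {probability_from_odds(post_odds_disease_free):.4f}")
```

At low prevalence even the best parameter leaves the average post test probability among diseased subjects well below one half, which is why prevalence matters when interpreting these indices.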
REFERENCES
1. Mossman D, Somoza E. Maximizing Diagnostic Information From the Dexamethasone Suppression Test. Arch Gen Psychiatry. 1989;46:653-660.
2. Shannon CE, Weaver W. The Mathematical Theory of Communication. University of Illinois Press, Urbana, 1949:18-20.
3. Mossman D, Somoza E. Diagnostic Tests and Information Theory. J Neuropsychiatry Clin Neurosci. 1992;4:95-98.
4. Somoza E, Mossman D. Optimizing REM Latency as a Diagnostic Test for Depression Using Receiver Operating Characteristic Analysis and Information Theory. Biol Psychiatry. 1990;27:990-1006.
5. Somoza E, Esperon LS, Mossman D. Evaluation and Optimization of Diagnostic Tests Using Receiver Operating Characteristic Analysis and Information Theory. Int J Biomed Comput. 1989;24:153-189.
6. Somoza E, Mossman D. Comparing Diagnostic Tests Using Information Theory: The INFO-ROC Technique. J Neuropsychiatry Clin Neurosci. 1992;4:214-219.
7. Lee WC. Selecting Diagnostic Tests for Ruling Out or Ruling In Disease: The Use of the Kullback-Leibler Distance. Int J Epidemiol. 1999;28:521-525.