Umut ARSLAN, Ergun KARAAĞAOĞLU Banu BOZKURT, Murat İRKEÇ 3rd EMR-IBS International Conference...

48
Umut ARSLAN, Ergun KARAAĞAOĞLU Banu BOZKURT, Murat İRKEÇ 3rd EMR-IBS International Conference CORFU, 2005 1 INFORMATION THEORY and KULLBACK- LEIBLER DISTANCE APPROACH IN THE EVALUATION OF DIAGNOSTIC TEST PERFORMANCE

Transcript of Umut ARSLAN, Ergun KARAAĞAOĞLU Banu BOZKURT, Murat İRKEÇ 3rd EMR-IBS International Conference...

Page 1: Umut ARSLAN, Ergun KARAAĞAOĞLU Banu BOZKURT, Murat İRKEÇ 3rd EMR-IBS International Conference CORFU, 2005 1 INFORMATION THEORY and KULLBACK-LEIBLER DISTANCE.

Umut ARSLAN, Ergun KARAAĞAOĞLUBanu BOZKURT, Murat İRKEÇ

3rd EMR-IBS International Conference CORFU, 2005 1

INFORMATION THEORY and KULLBACK-LEIBLER DISTANCE

APPROACH IN THE EVALUATION OF DIAGNOSTIC TEST

PERFORMANCE

Page 2: Umut ARSLAN, Ergun KARAAĞAOĞLU Banu BOZKURT, Murat İRKEÇ 3rd EMR-IBS International Conference CORFU, 2005 1 INFORMATION THEORY and KULLBACK-LEIBLER DISTANCE.

Umut ARSLAN, Ergun KARAAĞAOĞLUBanu BOZKURT, Murat İRKEÇ

3rd EMR-IBS International Conference CORFU, 2005 2

In clinical medicine, different diagnostic tests may be used either to show the presence (ruling in) or absence (ruling out) of a given condition.

Most of these tests are not perfect, that is, they can not discriminate between patients with and without the given condition with 100% accuracy.

Page 3: Umut ARSLAN, Ergun KARAAĞAOĞLU Banu BOZKURT, Murat İRKEÇ 3rd EMR-IBS International Conference CORFU, 2005 1 INFORMATION THEORY and KULLBACK-LEIBLER DISTANCE.

Umut ARSLAN, Ergun KARAAĞAOĞLUBanu BOZKURT, Murat İRKEÇ

3rd EMR-IBS International Conference CORFU, 2005 3

Different measures can be used to evaluate the performance of diagnostic tests.

these measures are

Sensitivity (True positive rate-TP)

Specificity (True negative rate-TN)

1-Sensitivity (False negative rate-FN)

1-Specificity (False positive rate-FP)

Page 4: Umut ARSLAN, Ergun KARAAĞAOĞLU Banu BOZKURT, Murat İRKEÇ 3rd EMR-IBS International Conference CORFU, 2005 1 INFORMATION THEORY and KULLBACK-LEIBLER DISTANCE.

Umut ARSLAN, Ergun KARAAĞAOĞLUBanu BOZKURT, Murat İRKEÇ

3rd EMR-IBS International Conference CORFU, 2005 4

A combination of these measures, such as likelihood ratios, are also valuable measures for evaluating diagnostic tests (with binary outcomes).

Another performance criterion is ROC curve analysis if diagnostic variable is non-binary.

Page 5: Umut ARSLAN, Ergun KARAAĞAOĞLU Banu BOZKURT, Murat İRKEÇ 3rd EMR-IBS International Conference CORFU, 2005 1 INFORMATION THEORY and KULLBACK-LEIBLER DISTANCE.

Umut ARSLAN, Ergun KARAAĞAOĞLUBanu BOZKURT, Murat İRKEÇ

3rd EMR-IBS International Conference CORFU, 2005 5

The discriminating power of a diagnostic test is determined by three parameters, two of which (the sensitivity and specificity) depend only on the proporties of diagnostic test itself, while the third (the prevalence of the target disorder) depends on the population being studied1.

Page 6: Umut ARSLAN, Ergun KARAAĞAOĞLU Banu BOZKURT, Murat İRKEÇ 3rd EMR-IBS International Conference CORFU, 2005 1 INFORMATION THEORY and KULLBACK-LEIBLER DISTANCE.

Umut ARSLAN, Ergun KARAAĞAOĞLUBanu BOZKURT, Murat İRKEÇ

3rd EMR-IBS International Conference CORFU, 2005 6

Although ROC curves help us to understand many important features of diagnostic tests, they can not be used to determine whether one test performs better than another at a different cut-off point at a given prevalence, so new approaches are needed to evaluate tests.

Page 7: Umut ARSLAN, Ergun KARAAĞAOĞLU Banu BOZKURT, Murat İRKEÇ 3rd EMR-IBS International Conference CORFU, 2005 1 INFORMATION THEORY and KULLBACK-LEIBLER DISTANCE.

Umut ARSLAN, Ergun KARAAĞAOĞLUBanu BOZKURT, Murat İRKEÇ

3rd EMR-IBS International Conference CORFU, 2005 7

Besides these conventional methods new techniques are being studied for the evaluation and applicability of diagnostic tests, which provide more detailed conclusions, about the diagnostic criteria.

Page 8: Umut ARSLAN, Ergun KARAAĞAOĞLU Banu BOZKURT, Murat İRKEÇ 3rd EMR-IBS International Conference CORFU, 2005 1 INFORMATION THEORY and KULLBACK-LEIBLER DISTANCE.

Umut ARSLAN, Ergun KARAAĞAOĞLUBanu BOZKURT, Murat İRKEÇ

3rd EMR-IBS International Conference CORFU, 2005 8

At this point, information theory, and its extension, the Kullback-Leibler distance seem as powerful tools for handling such problems.

Page 9: Umut ARSLAN, Ergun KARAAĞAOĞLU Banu BOZKURT, Murat İRKEÇ 3rd EMR-IBS International Conference CORFU, 2005 1 INFORMATION THEORY and KULLBACK-LEIBLER DISTANCE.

Umut ARSLAN, Ergun KARAAĞAOĞLUBanu BOZKURT, Murat İRKEÇ

3rd EMR-IBS International Conference CORFU, 2005 9

The uncertainty related to the disease under consideration before the diagnostic test is applied, is referred to as “a priori uncertainty” and after the test result comes out, the uncertainty is referred to as “a posteriori uncertainty”.

Page 10: Umut ARSLAN, Ergun KARAAĞAOĞLU Banu BOZKURT, Murat İRKEÇ 3rd EMR-IBS International Conference CORFU, 2005 1 INFORMATION THEORY and KULLBACK-LEIBLER DISTANCE.

Umut ARSLAN, Ergun KARAAĞAOĞLUBanu BOZKURT, Murat İRKEÇ

3rd EMR-IBS International Conference CORFU, 2005 10

Uncertainty is measured in terms of bits, and as indicated by Shannon and Weaver, for i mutually exclusive events (Zi), each with probabilities P (Zi) of occurring2,3:

i

i2i )P(Z)logP(ZH(Z)

Page 11: Umut ARSLAN, Ergun KARAAĞAOĞLU Banu BOZKURT, Murat İRKEÇ 3rd EMR-IBS International Conference CORFU, 2005 1 INFORMATION THEORY and KULLBACK-LEIBLER DISTANCE.

Umut ARSLAN, Ergun KARAAĞAOĞLUBanu BOZKURT, Murat İRKEÇ

3rd EMR-IBS International Conference CORFU, 2005 11

The difference between the “a posteriori uncertainty” and “a priori uncertainty” is regarded as the gain in information obtained by the diagnostic test and is referred to as the information content of that test.

Metz at al. developed an equation that measures the information content, I, in bits, as given below4,5:

Page 12: Umut ARSLAN, Ergun KARAAĞAOĞLU Banu BOZKURT, Murat İRKEÇ 3rd EMR-IBS International Conference CORFU, 2005 1 INFORMATION THEORY and KULLBACK-LEIBLER DISTANCE.

Umut ARSLAN, Ergun KARAAĞAOĞLUBanu BOZKURT, Murat İRKEÇ

3rd EMR-IBS International Conference CORFU, 2005 12

where,

Pr)-(FP)(1+(TP)(Pr)=B

(1)

[ ][ ][ ] [ ][ ] [ ]B)-FP)/(1-(1log×Pr)-FP)(1-(1+

B)-TP)/(1-(1log×TP)(Pr)-(1+

(FP/B)log×Pr)-(FP)(1+

(TP/B)log×(TP)(Pr)=I

2

2

2

2

Page 13: Umut ARSLAN, Ergun KARAAĞAOĞLU Banu BOZKURT, Murat İRKEÇ 3rd EMR-IBS International Conference CORFU, 2005 1 INFORMATION THEORY and KULLBACK-LEIBLER DISTANCE.

Umut ARSLAN, Ergun KARAAĞAOĞLUBanu BOZKURT, Murat İRKEÇ

3rd EMR-IBS International Conference CORFU, 2005 13

Note that for a given test, the information gain will be a function of the cut-off and the prevalence.

By using information theory, it is possible to determine the test that has the maximum information and to choose the best cut-off value, the point that maximize the information content at a given prevalence.

Page 14: Umut ARSLAN, Ergun KARAAĞAOĞLU Banu BOZKURT, Murat İRKEÇ 3rd EMR-IBS International Conference CORFU, 2005 1 INFORMATION THEORY and KULLBACK-LEIBLER DISTANCE.

Umut ARSLAN, Ergun KARAAĞAOĞLUBanu BOZKURT, Murat İRKEÇ

3rd EMR-IBS International Conference CORFU, 2005 14

The graphical methods introduced by information theory enables us to compare diagnostic tests over a wide range of prevalence values in terms of their maximum information and to identify how best cut-off point may change as the disease prevalence changes4,5,6.

Page 15: Umut ARSLAN, Ergun KARAAĞAOĞLU Banu BOZKURT, Murat İRKEÇ 3rd EMR-IBS International Conference CORFU, 2005 1 INFORMATION THEORY and KULLBACK-LEIBLER DISTANCE.

Umut ARSLAN, Ergun KARAAĞAOĞLUBanu BOZKURT, Murat İRKEÇ

3rd EMR-IBS International Conference CORFU, 2005 15

Glaucoma is one of the leading causes of blindness worldwide, with a prevalence rate as high as 8% among negros and around 1% to 2% among white population.

It is a progressive optic neuropathy characterized with increased intraocular pressure (IOP), optic nerve head damage, retinal nerve fiber layer (RNFL) defects and typical glaucomatous visual field loss.

Page 16: Umut ARSLAN, Ergun KARAAĞAOĞLU Banu BOZKURT, Murat İRKEÇ 3rd EMR-IBS International Conference CORFU, 2005 1 INFORMATION THEORY and KULLBACK-LEIBLER DISTANCE.

Umut ARSLAN, Ergun KARAAĞAOĞLUBanu BOZKURT, Murat İRKEÇ

3rd EMR-IBS International Conference CORFU, 2005 16

In the last decade, many diagnostic imaging devices including laser scanning technology and optical coherence tomography are being used worldwide in glaucoma. These techniques allow quantitative evaluation of optic nerve head and RNFL and have considerable sensitivity and specificity for glaucoma screening. We obtain a lot of parameters with these techniques. However, there is no clear cut point for these parameters separating the glaucoma patients from the healthy subjects, since most of the measurements show overlap between the groups. Many statistical techniques are used to evaluate the measurements and make a correct diagnosis.

Page 17: Umut ARSLAN, Ergun KARAAĞAOĞLU Banu BOZKURT, Murat İRKEÇ 3rd EMR-IBS International Conference CORFU, 2005 1 INFORMATION THEORY and KULLBACK-LEIBLER DISTANCE.

Umut ARSLAN, Ergun KARAAĞAOĞLUBanu BOZKURT, Murat İRKEÇ

3rd EMR-IBS International Conference CORFU, 2005 17

Nerve Fiber Analyzer (NFA GDx) which is one of these imaging devices, measures the RNFL thickness using 14 parameters.

Symmetry

Superior Ratio

Inferior Ratio

Superior/Nasal

Maximum Modulation

Superior Maximum

Inferior Maximum

The Number

Ellipse Modulation

Average Thickness

Ellipse Average

Superior Average

Inferior Average

Superior Integral

Page 18: Umut ARSLAN, Ergun KARAAĞAOĞLU Banu BOZKURT, Murat İRKEÇ 3rd EMR-IBS International Conference CORFU, 2005 1 INFORMATION THEORY and KULLBACK-LEIBLER DISTANCE.

Umut ARSLAN, Ergun KARAAĞAOĞLUBanu BOZKURT, Murat İRKEÇ

3rd EMR-IBS International Conference CORFU, 2005 18

We used information theory for determining the parameter with the highest discriminating power and the cut-off points related with this parameter.

Page 19: Umut ARSLAN, Ergun KARAAĞAOĞLU Banu BOZKURT, Murat İRKEÇ 3rd EMR-IBS International Conference CORFU, 2005 1 INFORMATION THEORY and KULLBACK-LEIBLER DISTANCE.

Umut ARSLAN, Ergun KARAAĞAOĞLUBanu BOZKURT, Murat İRKEÇ

3rd EMR-IBS International Conference CORFU, 2005 19

In order to determine to parameter which has the highest information content for a wide range of prevalence (MIP curve) and the best cut-off point for a given prevalence, information theory utilizes two types of graphs.

2. Graph of the information content versus cut-off for a given prevalence

1. Graph of maximum information versus for a wide range of prevalence (MIP curve)

Page 20: Umut ARSLAN, Ergun KARAAĞAOĞLU Banu BOZKURT, Murat İRKEÇ 3rd EMR-IBS International Conference CORFU, 2005 1 INFORMATION THEORY and KULLBACK-LEIBLER DISTANCE.

Umut ARSLAN, Ergun KARAAĞAOĞLUBanu BOZKURT, Murat İRKEÇ

3rd EMR-IBS International Conference CORFU, 2005 20

Under the binormal assumption, the performance of a diagnostic test may be described using two indices:

m, the distance between the two populations’ means expressed units of standard deviation of the disease free (D-) disribution, and

s, the ratio of the standard deviations,

SD (D-)/SD(D+) (2)

Page 21: Umut ARSLAN, Ergun KARAAĞAOĞLU Banu BOZKURT, Murat İRKEÇ 3rd EMR-IBS International Conference CORFU, 2005 1 INFORMATION THEORY and KULLBACK-LEIBLER DISTANCE.

Umut ARSLAN, Ergun KARAAĞAOĞLUBanu BOZKURT, Murat İRKEÇ

3rd EMR-IBS International Conference CORFU, 2005 21

In algebraic terms, relationship between true and false positive rates will be given by the stright line equation

ZTP=sZFP + sΔm

Page 22: Umut ARSLAN, Ergun KARAAĞAOĞLU Banu BOZKURT, Murat İRKEÇ 3rd EMR-IBS International Conference CORFU, 2005 1 INFORMATION THEORY and KULLBACK-LEIBLER DISTANCE.

Umut ARSLAN, Ergun KARAAĞAOĞLUBanu BOZKURT, Murat İRKEÇ

3rd EMR-IBS International Conference CORFU, 2005 22

For a given prevalence, by using s and Δm values, a curve of information versus the diagnostic variable can be drawn. The steps are as follows4:Step 1. For different cut-off values the TP and corresponding

normal deviates ZTP were found. The set of cut-off values and ZTP were then fitted to a polynomial and for each

value of ZTP a cut-off was calculated.Step 2. For a set of values of ZFP, corresponding ZTP values were

calculated by using equation ZTP=sZFP + sΔmStep 3. The ZTP and ZFP values were converted to TP and FP. Step 4. To calculate the information, FP and TP values found in

and the prevalence were substituted into information content equation (1).

Step 5. In order to express the information as a function of the diagnostic variable, by using equation in Step 2, for each ZFP a ZTP was found and this was substituted in the polynomial defined in Step 1 to get the corresponding

cut- off.

Page 23: Umut ARSLAN, Ergun KARAAĞAOĞLU Banu BOZKURT, Murat İRKEÇ 3rd EMR-IBS International Conference CORFU, 2005 1 INFORMATION THEORY and KULLBACK-LEIBLER DISTANCE.

Umut ARSLAN, Ergun KARAAĞAOĞLUBanu BOZKURT, Murat İRKEÇ

3rd EMR-IBS International Conference CORFU, 2005 23

0,00

0,05

0,10

0,15

0,20

0,25

0,30

0,35

0,40

0,45

0,50

0,00 0,20 0,40 0,60 0,80 1,00

Prevalance

Max

imum

Inf

orm

atio

n C

onte

nt (B

its)

)

Symmetry

Superior Ratio

Inferior Ratio

Superior/Nasal

Maximum Modulation

Superior Maximum

Inferior Maximum

The Number

Ellipse Modulation

Average Thickness

Ellipse Average

Superior Average

Inferior Average

Superior Integral

Figure 1. Maximum Information Contents of Parameters Versus Prevalence (MIP Curves).

Page 24: Umut ARSLAN, Ergun KARAAĞAOĞLU Banu BOZKURT, Murat İRKEÇ 3rd EMR-IBS International Conference CORFU, 2005 1 INFORMATION THEORY and KULLBACK-LEIBLER DISTANCE.

Umut ARSLAN, Ergun KARAAĞAOĞLUBanu BOZKURT, Murat İRKEÇ

3rd EMR-IBS International Conference CORFU, 2005 24

Pr=1%

Pr=2%

0,00

0,01

0,02

0,03

0,04

0,05

0,06

0,07

0,08

15,0 20,0 25,0 30,0 35,0 40,0 45,0 50,0 55,0 60,0

The Number Cut-off

Info

rmat

ion

Con

ten

t (B

its)

Figure 2. Information Content of “The Number” Versus Cut-off Values at Two Different Prevalences

Page 25: Umut ARSLAN, Ergun KARAAĞAOĞLU Banu BOZKURT, Murat İRKEÇ 3rd EMR-IBS International Conference CORFU, 2005 1 INFORMATION THEORY and KULLBACK-LEIBLER DISTANCE.

Umut ARSLAN, Ergun KARAAĞAOĞLUBanu BOZKURT, Murat İRKEÇ

3rd EMR-IBS International Conference CORFU, 2005 25

The Kullback-Leibler distance is a concept within information theory, and is also known as relative entropy or divergence7.

It measures the distance between two probability distributions.

Let the ordinal diagnostic test results be indexed by i, i=1,2,…,K.

The proportions of the diseased subjects and disease free in the ith testing category are assumed known and are denoted as fi and gi, respectively.

1 i i ii gf

Page 26: Umut ARSLAN, Ergun KARAAĞAOĞLU Banu BOZKURT, Murat İRKEÇ 3rd EMR-IBS International Conference CORFU, 2005 1 INFORMATION THEORY and KULLBACK-LEIBLER DISTANCE.

Umut ARSLAN, Ergun KARAAĞAOĞLUBanu BOZKURT, Murat İRKEÇ

3rd EMR-IBS International Conference CORFU, 2005 26

That is,

fi=Pr(test result is i amongst diseased subjects) and

gi=Pr(test result is i amongst disease free subjects).

The Likelihood Ratio (LR) function for this test is LRi=fi/gi.

Page 27: Umut ARSLAN, Ergun KARAAĞAOĞLU Banu BOZKURT, Murat İRKEÇ 3rd EMR-IBS International Conference CORFU, 2005 1 INFORMATION THEORY and KULLBACK-LEIBLER DISTANCE.

Umut ARSLAN, Ergun KARAAĞAOĞLUBanu BOZKURT, Murat İRKEÇ

3rd EMR-IBS International Conference CORFU, 2005 27

The Kullback-Leibler distance measures the distance between the diseased and the disease free distributions (the f and the g distributions).

Using the disaese free distribution as the reference, the Kullback-Leibler distance (denoted) as D(f||g) is defined by the following equation:

Page 28: Umut ARSLAN, Ergun KARAAĞAOĞLU Banu BOZKURT, Murat İRKEÇ 3rd EMR-IBS International Conference CORFU, 2005 1 INFORMATION THEORY and KULLBACK-LEIBLER DISTANCE.

Umut ARSLAN, Ergun KARAAĞAOĞLUBanu BOZKURT, Murat İRKEÇ

3rd EMR-IBS International Conference CORFU, 2005 28

K

iii

K

i i

ii LRf

g

ff

11

loglogg)||D(f

K

iii

K

i i

ii LRg

f

gg

11

)/1log(logf)||D(g

For Ordinal Test Results

Page 29: Umut ARSLAN, Ergun KARAAĞAOĞLU Banu BOZKURT, Murat İRKEÇ 3rd EMR-IBS International Conference CORFU, 2005 1 INFORMATION THEORY and KULLBACK-LEIBLER DISTANCE.

Umut ARSLAN, Ergun KARAAĞAOĞLUBanu BOZKURT, Murat İRKEÇ

3rd EMR-IBS International Conference CORFU, 2005 29

+- logLRSe+logLRSe)-(1=

Sp-1

SelogSe+

Sp

Se-1logSe)-(1=g)||D(f

••

••

)log(1/LRSp)-(1+)log(1/LRSp= Se

Sp-1logSp)-(1+

Se-1

SplogSp=f)||D(g

+- ••

••

For Binary Test Results

Page 30: Umut ARSLAN, Ergun KARAAĞAOĞLU Banu BOZKURT, Murat İRKEÇ 3rd EMR-IBS International Conference CORFU, 2005 1 INFORMATION THEORY and KULLBACK-LEIBLER DISTANCE.

Umut ARSLAN, Ergun KARAAĞAOĞLUBanu BOZKURT, Murat İRKEÇ

3rd EMR-IBS International Conference CORFU, 2005 30

Since the mathematics required for the computation of Kullback-Leibler distance are complex for continious diagnostic variables, it is more practical to categorize a continuous diagnostic variable into ordinal scale.

Page 31: Umut ARSLAN, Ergun KARAAĞAOĞLU Banu BOZKURT, Murat İRKEÇ 3rd EMR-IBS International Conference CORFU, 2005 1 INFORMATION THEORY and KULLBACK-LEIBLER DISTANCE.

Umut ARSLAN, Ergun KARAAĞAOĞLUBanu BOZKURT, Murat İRKEÇ

3rd EMR-IBS International Conference CORFU, 2005 31

g)||expD(f=Pin

Pin is the ratio, for a randomly selected diseased subject, of the post test disease odds to pre test disease odds.

Pin measures the “increase” in disease odds for a diseased subject.

In order to rule in or rule out a disease two quantities, Pin and Pout can be formulated as follows

Page 32: Umut ARSLAN, Ergun KARAAĞAOĞLU Banu BOZKURT, Murat İRKEÇ 3rd EMR-IBS International Conference CORFU, 2005 1 INFORMATION THEORY and KULLBACK-LEIBLER DISTANCE.

Umut ARSLAN, Ergun KARAAĞAOĞLUBanu BOZKURT, Murat İRKEÇ

3rd EMR-IBS International Conference CORFU, 2005 32

Let the ordinal diagnostic test results be indexed by i, i=1,2,…,K (K≥2).

Before testing, each subject has an equal pre test disease odds, which is denoted as h. After testing, the post test disease odds differ depending on the test results.

Page 33: Umut ARSLAN, Ergun KARAAĞAOĞLU Banu BOZKURT, Murat İRKEÇ 3rd EMR-IBS International Conference CORFU, 2005 1 INFORMATION THEORY and KULLBACK-LEIBLER DISTANCE.

Umut ARSLAN, Ergun KARAAĞAOĞLUBanu BOZKURT, Murat İRKEÇ

3rd EMR-IBS International Conference CORFU, 2005 33

The post test disease odds is (fi/gi)h for ith category of the test results.

Among the diseased subjects in the testing population, the geometrically averaged post test disease odds is

in

K

i i

ii

fK

i i

i

fK

i i

i Phg

ffh

g

fhh

g

fii

.logexp.111

Page 34: Umut ARSLAN, Ergun KARAAĞAOĞLU Banu BOZKURT, Murat İRKEÇ 3rd EMR-IBS International Conference CORFU, 2005 1 INFORMATION THEORY and KULLBACK-LEIBLER DISTANCE.

Umut ARSLAN, Ergun KARAAĞAOĞLUBanu BOZKURT, Murat İRKEÇ

3rd EMR-IBS International Conference CORFU, 2005 34

Similarly, we can prove that the geometrically averaged post test disease odds, among the disease free subjects in the testing population, is h/Pout.

f)||expD(g=Pout

Pout measures the “decrease” in disease odds of a disease free subject.

Page 35: Umut ARSLAN, Ergun KARAAĞAOĞLU Banu BOZKURT, Murat İRKEÇ 3rd EMR-IBS International Conference CORFU, 2005 1 INFORMATION THEORY and KULLBACK-LEIBLER DISTANCE.

Umut ARSLAN, Ergun KARAAĞAOĞLUBanu BOZKURT, Murat İRKEÇ

3rd EMR-IBS International Conference CORFU, 2005 35

With these new performance indices, we can now have a better undertanding of the rule in and rule out potentials of a diagnostic test.

A diagnostic test with greater D(f||g) (and greater Pin), if administered, will on average make disease presence more likely among the diseased subjects in the testing population-its potential of ruling in disease is higher.

Page 36: Umut ARSLAN, Ergun KARAAĞAOĞLU Banu BOZKURT, Murat İRKEÇ 3rd EMR-IBS International Conference CORFU, 2005 1 INFORMATION THEORY and KULLBACK-LEIBLER DISTANCE.

Umut ARSLAN, Ergun KARAAĞAOĞLUBanu BOZKURT, Murat İRKEÇ

3rd EMR-IBS International Conference CORFU, 2005 36

Where as a diagnostic test with greater D(g||f) (and greater Pout), if administered, will on average make disease presence less likely among the disease free and it has a higher rule out potential.

Page 37: Umut ARSLAN, Ergun KARAAĞAOĞLU Banu BOZKURT, Murat İRKEÇ 3rd EMR-IBS International Conference CORFU, 2005 1 INFORMATION THEORY and KULLBACK-LEIBLER DISTANCE.

Umut ARSLAN, Ergun KARAAĞAOĞLUBanu BOZKURT, Murat İRKEÇ

3rd EMR-IBS International Conference CORFU, 2005 37

GDx Parameters D(f||g) expD(f||g)

The Number 1,37393 3,95084

Maximum Modulation 1,04413 2,84093

Superior Ratio 1,01587 2,76176

Inferior Maximum 1,00882 2,74237

Ellipse Modulation 0,97852 2,66050

Superior Average 0,91469 2,49601

Inferior Ratio 0,90279 2,46648

Superior/Nasal 0,89588 2,44948

Superior Maximum 0,88383 2,42015

Ellipse Average 0,81593 2,26127

Inferior Average 0,79773 2,22048

Superior Integral 0,74410 2,10454

Average Thickness 0,62698 1,87195

Symmetry 0,04826 1,04944

Page 38: Umut ARSLAN, Ergun KARAAĞAOĞLU Banu BOZKURT, Murat İRKEÇ 3rd EMR-IBS International Conference CORFU, 2005 1 INFORMATION THEORY and KULLBACK-LEIBLER DISTANCE.

Umut ARSLAN, Ergun KARAAĞAOĞLUBanu BOZKURT, Murat İRKEÇ

3rd EMR-IBS International Conference CORFU, 2005 38

GDx Parameters D(g||f)expD(g||

f)

The Number 1,02595 2,78974

Inferior Ratio 0,95252 2,59222

Superior Ratio 0,94909 2,58335

Maximum Modulation 0,94310 2,56793

Superior Maximum 0,84195 2,32089

Ellipse Modulation 0,80664 2,24036

Inferior Maximum 0,78916 2,20154

Superior Average 0,73443 2,08430

Superior/Nasal 0,69222 1,99815

Superior Integral 0,61580 1,85113

Inferior Average 0,59339 1,81011

Ellipse Average 0,56779 1,76436

Average Thickness 0,39611 1,48603

Symmetry 0,04175 1,04264

Page 39: Umut ARSLAN, Ergun KARAAĞAOĞLU Banu BOZKURT, Murat İRKEÇ 3rd EMR-IBS International Conference CORFU, 2005 1 INFORMATION THEORY and KULLBACK-LEIBLER DISTANCE.

Umut ARSLAN, Ergun KARAAĞAOĞLUBanu BOZKURT, Murat İRKEÇ

3rd EMR-IBS International Conference CORFU, 2005 39

GDx Parameters (Rule in)

The Number

Maximum Modulation

Superior Ratio

Inferior Maximum

Ellipse Modulation

Superior Average

Inferior Ratio

Superior/Nasal

Superior Maximum

Ellipse Average

Inferior Average

Superior Integral

Average Thickness

Symmetry

GDx Parameters (Rule out)

The Number

Inferior Ratio

Superior Ratio

Maximum Modulation

Superior Maximum

Ellipse Modulation

Inferior Maximum

Superior Average

Superior/Nasal

Superior Integral

Inferior Average

Ellipse Average

Average Thickness

Symmetry

Information Theory

The Number

Ellipse Modulation

Inferior Ratio

Maximum Modulation

Superior Maximum

Superior Average

Superior Ratio

Inferior Maximum

Superior/Nasal

Inferior Average

Superior Integral

Ellipse Average

Average Thickness

Symmetry

Discussion:

Page 40: Umut ARSLAN, Ergun KARAAĞAOĞLU Banu BOZKURT, Murat İRKEÇ 3rd EMR-IBS International Conference CORFU, 2005 1 INFORMATION THEORY and KULLBACK-LEIBLER DISTANCE.

Umut ARSLAN, Ergun KARAAĞAOĞLUBanu BOZKURT, Murat İRKEÇ

3rd EMR-IBS International Conference CORFU, 2005 40

1,0

1,5

2,0

2,5

3,0

3,5

4,0

Th

e N

um

ber

Ma

xim

um

Mo

du

lati

on

Su

per

ior

Ra

tio

Infe

rior

Max

imu

m

Ell

ipse

Mo

du

lati

on

Su

per

ior

Av

erag

e

Infe

rior

Ra

tio

Su

per

ior/

Na

sal

Su

per

ior

Max

imu

m

Ell

ipse

Aver

ag

e

Infe

rior

Av

era

ge

Su

per

ior

Inte

gra

l

Aver

age

Th

ick

nes

s

Sym

met

ry

GDx Parameters

Pin

an

d P

ou

t V

alu

es

Pin

Pout

Page 41: Umut ARSLAN, Ergun KARAAĞAOĞLU Banu BOZKURT, Murat İRKEÇ 3rd EMR-IBS International Conference CORFU, 2005 1 INFORMATION THEORY and KULLBACK-LEIBLER DISTANCE.

Umut ARSLAN, Ergun KARAAĞAOĞLUBanu BOZKURT, Murat İRKEÇ

3rd EMR-IBS International Conference CORFU, 2005 41

ROC curve helps us to evaluate a diagnostic test as a whole. The area under the ROC curve gives the probability that a randomly selected diseased subject will have more positive test result when compared to a nondiseased subject.

It does not take into account the prevalence of the disease.

Page 42: Umut ARSLAN, Ergun KARAAĞAOĞLU Banu BOZKURT, Murat İRKEÇ 3rd EMR-IBS International Conference CORFU, 2005 1 INFORMATION THEORY and KULLBACK-LEIBLER DISTANCE.

Umut ARSLAN, Ergun KARAAĞAOĞLUBanu BOZKURT, Murat İRKEÇ

3rd EMR-IBS International Conference CORFU, 2005 42

A better approximation for evaluating a diagnostic test is the use of information theory, where the prevalence is considered.

By means of information theory the maximum information content of a diagnostic test for a given prevalence, best cut-off point at different prevalences can be calculated.

Page 43: Umut ARSLAN, Ergun KARAAĞAOĞLU Banu BOZKURT, Murat İRKEÇ 3rd EMR-IBS International Conference CORFU, 2005 1 INFORMATION THEORY and KULLBACK-LEIBLER DISTANCE.

Umut ARSLAN, Ergun KARAAĞAOĞLUBanu BOZKURT, Murat İRKEÇ

3rd EMR-IBS International Conference CORFU, 2005 43

Kullback-Leibler distance which is an extention of information theory further helps to evaluate the ruling in and ruling out potentials of a diagnostic test.

Page 44: Umut ARSLAN, Ergun KARAAĞAOĞLU Banu BOZKURT, Murat İRKEÇ 3rd EMR-IBS International Conference CORFU, 2005 1 INFORMATION THEORY and KULLBACK-LEIBLER DISTANCE.

Umut ARSLAN, Ergun KARAAĞAOĞLUBanu BOZKURT, Murat İRKEÇ

3rd EMR-IBS International Conference CORFU, 2005 44

“The Number” is the best parameter in terms of ROC curve, information theory and Kullback-Leibler distance both for ruling in and ruling out the disease and “Symmetry” is the worst in all categories. The ordering of other parameters may change in terms of above mentioned methods.

Page 45: Umut ARSLAN, Ergun KARAAĞAOĞLU Banu BOZKURT, Murat İRKEÇ 3rd EMR-IBS International Conference CORFU, 2005 1 INFORMATION THEORY and KULLBACK-LEIBLER DISTANCE.

Umut ARSLAN, Ergun KARAAĞAOĞLUBanu BOZKURT, Murat İRKEÇ

3rd EMR-IBS International Conference CORFU, 2005 45

Larger values of Kullback-Leibler distance indicates greater seperation between two distributions.

For a parameter as D(f||g) increases the corresponding Pin value also increases.

Among GDx parameters “The Number” seems to be the best parameter for ruling in the disease, whereas “Symmetry” is the worst.

Page 46: Umut ARSLAN, Ergun KARAAĞAOĞLU Banu BOZKURT, Murat İRKEÇ 3rd EMR-IBS International Conference CORFU, 2005 1 INFORMATION THEORY and KULLBACK-LEIBLER DISTANCE.

Umut ARSLAN, Ergun KARAAĞAOĞLUBanu BOZKURT, Murat İRKEÇ

3rd EMR-IBS International Conference CORFU, 2005 46

For “The Number” the ratio of post test odds to pre test odds of being diseased is 3.95 for a randomly selected diseased subject. For a disease subject “The Number” increases the disease odds by 3.95 times.

Page 47: Umut ARSLAN, Ergun KARAAĞAOĞLU Banu BOZKURT, Murat İRKEÇ 3rd EMR-IBS International Conference CORFU, 2005 1 INFORMATION THEORY and KULLBACK-LEIBLER DISTANCE.

Umut ARSLAN, Ergun KARAAĞAOĞLUBanu BOZKURT, Murat İRKEÇ

3rd EMR-IBS International Conference CORFU, 2005 47

Similarly, for a parameter as D(g||f) increases Pout increases. For ruling out the disease again “The Number” is the best and “Symmetry” is the worst parameter for ruling out the disease.

The ratio of pre test disease odds to post test disease odds for a randomly selected disease free subject is 2.79 for “The Number” parameter. For a disease free subject “The Number” decreases the disease odds by 2.79 times.

Page 48: Umut ARSLAN, Ergun KARAAĞAOĞLU Banu BOZKURT, Murat İRKEÇ 3rd EMR-IBS International Conference CORFU, 2005 1 INFORMATION THEORY and KULLBACK-LEIBLER DISTANCE.

Umut ARSLAN, Ergun KARAAĞAOĞLUBanu BOZKURT, Murat İRKEÇ

3rd EMR-IBS International Conference CORFU, 2005 48

REFERENCES1. Mossman D, Somoza E. Maximizing Diagnostic Information From the

Dexamethasone Suppression Test. Arch Gen Psychiatry. 1989;46:653-660.

2. Shannon, CE, Weaver W. The Mathematical Theory of Communication, İllinois Press, Urbana, 1949:18-20.

3. Mossman D, Somoza E. Diagnostic Tests and Information Theory. J Neuropsychiatry Clin Neurosci. 1992;4:95-98.

4. Somoza E, Mossman D. Optimizing REM Latency as a Diagnostic Test for Depression Using Receiver Operating Characteristic Analysis and Information Theory. Biol Psychiatry. 1990;27:990-1006

5. Somoza E, Esperon LS, Mossman D. Evaluation and Optimization of Diagnostic Tests Using Receiver Operating Characteristic Analysis and Information Theory. Int J Biomed Comput. 1989;24:153-189

6. Somoza E, Mossman D. Comparing Diagnostic Tests Using Information Theory: The INFO-ROC Technique. J Neuropsychiatry Clin Neurosci. 1992;4:214-219.

7. Wen-Chung Lee. Selecting Diagnostic Tests for Ruling Out or Ruling In Disease: The Use of the Kullback-Leibler Distance. International Journal of Epidemiology. 28(1999) 521-525.