
Decision Analysis as the Basis for Computer-Aided Management of Acute Renal Failure

G. ANTHONY GORRY, Ph.D.

JEROME P. KASSIRER, M.D.

ALVIN ESSIG, M.D.

WILLIAM B. SCHWARTZ, M.D.

Boston, Massachusetts

From the Sloan School of Management, Massachusetts Institute of Technology, the Medical Service of the New England Medical Center Hospitals, and the Department of Medicine, Tufts University School of Medicine, Boston, Massachusetts. This study was supported in part by Grants LM 01508, HE 759, HE 13648 and HL 14322 from the National Institutes of Health and the Samuel Bass Foundation. Requests for reprints should be addressed to Dr. William B. Schwartz, New England Medical Center Hospitals, 171 Harrison Avenue, Boston, Massachusetts 02111. Manuscript accepted March 28, 1973.

In recent years many attempts have been made to use the computer as an aid to diagnosis, but little has been done to exploit the potential of computer technology as a more general aid to decision making. We describe the use of the discipline of decision analysis as the basis for an experimental interactive computer program designed to assist the physician in the clinical management of acute oliguric renal failure. The program deals with alternative courses of action, either tests or treatments, for which the potential risks or benefits may be large, and it balances the anticipated risk of a given strategy against the anticipated benefit that it offers the patient. The appraisals of the different courses of action open to the physician are expressed in quantitative terms as expected value. The program has been evaluated by comparing its recommendations to those of experienced nephrologists in 18 simulated cases of acute oliguric renal failure. Agreement between the nephrologists and the program was found in more than 90 per cent of cases, but the experiments identified a series of problems that must be resolved if the program is eventually to be widely useful as a “consultant.” For example, it will be necessary to develop strategies for dealing with multiple diseases occurring simultaneously, with signs and symptoms that frequently are not independent of one another, with changes in the pattern of a disease over time, and with the weighing of priorities for carrying out tests and treatments. We conclude that computer-aided management with the aid of decision analysis is a promising area for further investigation.

October 1973 The American Journal of Medicine Volume 55 473

Sample Physician-Computer Dialogue

Phase I Program

The program you are about to use is intended for the diagnosis of acute renal failure. The program considers 14 of the most frequent causes of acute renal failure, continually revising its opinion of the likelihood of each diagnosis in light of new data provided by the user. The diseases included and their abbreviations are

ATN     Acute tubular necrosis
FARF    Functional acute renal failure (fluid loss, shock, etc.)
OBSTR   Urinary tract obstruction (bladder neck, bilateral ureteral, etc.)
AGN     Poststreptococcal glomerulonephritis
CN      Renal cortical necrosis
HS      Hepatorenal syndrome
PYE     Acute fulminant pyelonephritis without obstruction
AE      Atheromatous embolism
RI      Renal infarction (bilateral)
RVT     Renal vein thrombosis (after severe dehydration in children)
VASC    Renal vasculitis (“allergic” vasculitis, polyarteritis, lupus erythematosus, rapidly progressive glomerulonephritis, Goodpasture’s syndrome, Wegener’s granulomatosis)
SCL     Scleroderma
CGAE    Chronic glomerulonephritis with acute exacerbation
MH      Malignant hypertension with malignant nephrosclerosis

The Program Now Begins.

A Priori Probabilities Are*

Probability   Disease
0.400         FARF
0.250         ATN
0.100         AGN
0.100         OBSTR
0.050         VASC
0.030         MH
0.030         CGAE
0.020         CN
0.010         PYE
0.005         HS
0.003         AE
0.002         SCL
0.002         RVT
0.002         RI

Question 1: What is the patient’s age?
1. 0-10
2. 11-30
3. 31-50
4. 51-70
5. Over 70
Reply: 5

The current distribution is
Disease   Probability
FARF      0.58
OBSTR     0.22
ATN       0.09

Question 2: What is the sex?
1. Male
2. Pregnant Female
3. Non-Pregnant Female
Reply: 1

The current distribution is
Disease   Probability
FARF      0.54
OBSTR     0.26
ATN       0.08

Question 3: What was the blood pressure at the onset of oliguria?
1. Low
2. Normal
3. Moderately Elevated
4. Severely Elevated
Reply: 3

The current distribution is
Disease   Probability
OBSTR     0.44
FARF      0.30
VASC      0.08

Question 4: What is the daily urine volume?
1. Less than 50 cc/day
2. 50-400 cc/day
Reply: 2

The current distribution is
Disease   Probability
FARF      0.53
OBSTR     0.24
AE        0.08

Question 5: What is the kidney size on plain film of the abdomen?
1. Small
2. Normal
3. Large
4. Very Large
Reply: 3

The current distribution is
Disease   Probability
OBSTR     0.80
FARF      0.12
PYE       0.04

Question 6: Was there a large fluid loss preceding the onset of oliguria?
Reply: No

The current distribution is
Disease   Probability
OBSTR     0.88
PYE       0.05
FARF      0.03

Question 7: What is the degree of proteinuria?
1. 0
2. trace to 2+
3. 3+ to 4+
Reply: 1

The current distribution is
Disease   Probability
OBSTR     0.94
FARF      0.03
PYE       0.03

Question 8: Is there a history of prolonged hypotension preceding the onset of oliguria?
Reply: No

The current distribution is
Disease   Probability
OBSTR     0.96
PYE       0.03

*Note that these probabilities do not sum to 1.0. The program permits such entries in order to facilitate the specification of probabilities by the physician. The numbers given are normalized by the program after entry to assure that the probabilities employed do sum to 1.0.

Figure 1. Typical interactive dialogue between the physician and the phase I computer program. The final diagnosis, which was arrived at after eight questions were asked, was urinary tract obstruction.

For a decade or more, efforts have been made to develop computer programs to assist in diagnosis [1]. The initial approach was to gather a complete profile of signs, symptoms and laboratory data and then to process this information utilizing Bayes theorem [2,3]. Although such a strategy often works well, it has the drawback that it requires massive amounts of clinical data, far more than can be obtained routinely in clinical practice. This difficulty has, however, been overcome recently by a computer program which operates in the interactive mode and which usually can arrive at a diagnosis quickly by requesting only the most critical information [4,5]. This latter program, like its predecessors, still has the serious deficiency that it is indifferent to the risks and pain involved in various tests and has no way of balancing the dangers and discomforts of a procedure against the value of the information to be gained. In this sense it lacks a key element that characterizes the practice of a good physician.

We describe an interactive computer program which deals with this problem by incorporating the potential risks and potential benefits of tests and treatments into the decision-making process, utilizing the discipline of decision analysis [2].* As a prototype for study we chose acute oliguric renal failure.

The program is divided into two portions: phase I, which considers only tests that involve little risk or discomfort, e.g., historic data, chemical tests of blood, and phase II, which utilizes tests or treatments for which the potential risks are significant.

We also describe the structure of the program, the way in which it has performed in the diagnosis and management of simulated clinical cases, and the problems that must be resolved if the technic is to have value as a “consultant” to the practicing physician.

The system to be described has been implemented on a time-sharing facility at the Massachusetts Institute of Technology, utilizing Fortran 4 as a programming language.

*In an accompanying paper we have shown how the discipline of decision analysis can be utilized without the aid of a computer in the management of complex clinical disorders [3].

METHODS

Selection of the Clinical Problem. The clinical problem of acute renal failure was selected for several reasons. First, the number of diseases causing acute oliguric renal failure is relatively small and manageable. Second, the problem is within the field of our expertise. Third, the clinical characteristics and the therapy of the diseases causing acute renal failure are rather well defined.

The Phase I Program. The phase I portion of the program, as mentioned earlier, considers only tests for which the risk or cost is negligible so that the potential benefit can therefore be measured solely in terms of the expected amount of information to be gained. The program operates in a sequential mode, engaging in an interactive dialogue with the physician (Figure 1) and has two basic functions. The first, the inference function, evaluates the diagnostic significance of new attributes (signs, symptoms and laboratory results) in light of the facts already available about a patient. The second function, the question selection function, determines which question should be asked next in order to maximize the expected gain in information. The underlying concepts of both of these functions will be discussed subsequently. The computer programs have been described elsewhere and will not be considered in detail here [5].

The inference function: The inference function is the means by which the program interprets diagnostic evidence about a patient. Given the a priori probabilities of the disease being considered, the program revises these estimates on the basis of new information. To do this the program utilizes three types of data: (1) P(D): the initial or current probability that a patient with the medical problem under consideration has a given disease D. (2) P(S/D): the probability of an attribute S, given the disease D. (3) P(S): the probability of the attribute S occurring in the medical problem under consideration. As we will discuss, initial values of P(D) and the values for P(S/D) are based on expert opinion. At each stage in the diagnostic process, the revised probabilities P(D/S) are obtained in accordance with Bayes rule, P(D/S) = P(S/D) P(D) / P(S). They then replace the values P(D) for the next stage.

TABLE I  Sample of Data Base for Proteinuria: Phase I Program

                                                          Probabilities*
Disease                                                   0       Trace to 2+   3+ to 4+
Acute tubular necrosis                                    0.1     0.8           0.1
Functional acute renal failure                            0.8     0.2           0.001
Urinary tract obstruction                                 0.7     0.3           0.001
Poststreptococcal glomerulonephritis                      0.01    0.2           0.8
Renal cortical necrosis                                   0.01    0.8           0.2
Hepatorenal syndrome                                      0.8     0.2           0.001
Acute fulminant pyelonephritis without obstruction        0.4     0.6           0.001
Atheromatous embolism                                     0.1     0.8           0.1
Renal infarction (bilateral)                              0.1     0.7           0.2
Renal vein thrombosis                                     0.001   0.1           0.9
Renal vasculitis                                          0.01    0.2           0.8
Scleroderma                                               0.1     0.4           0.5
Chronic glomerulonephritis with acute exacerbation        0.001   0.2           0.8
Malignant hypertension with malignant nephrosclerosis     0.001   0.4           0.6

* Note that these probabilities do not always sum to 1.0. The program permits such entries in order to facilitate the specification of probabilities by the physician. The numbers given are normalized by the program after entry to assure that the probabilities employed do sum to 1.0.

Question selection function: The question selection function analyzes the available diagnostic questions and chooses the one which promises the maximum reduction of uncertainty about the diagnosis. The difference between the current uncertainty and the expected uncertainty if the program were to receive a given answer measures the potential value of each individual question. This function is described in detail in the “Appendix.”

Termination of questioning: All the information requested (e.g., blood chemistry values, historic data) can be collected with virtually no risk to the patient, but because the information-gathering process does have some nonrisk costs (in terms of money, loss of time, pain, etc.) the questioning strategy minimizes, by use of a termination probability, the number of questions which must be asked. If any disease attains a probability higher than this termination probability, the program terminates the questioning and prints out the two or three leading diagnoses with their associated probability values.

The data base: The phase I program “understands” a medical problem only in terms of the contents of the data base. Each of the components of the data base is described.
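The inference, question selection and termination steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' Fortran program: the function names are hypothetical, Shannon entropy is assumed as the measure of "uncertainty" (the Appendix defines the measure actually used), and only the three leading diseases before Question 7 of Figure 1 are carried, renormalized among themselves. The likelihoods are the proteinuria rows of Table I, normalized as the table footnote describes.

```python
import math

def normalize(weights):
    """Scale non-negative weights so that they sum to 1.0."""
    total = sum(weights.values())
    return {key: value / total for key, value in weights.items()}

def bayes_update(prior, likelihood, answer):
    """Return P(D/S) for every disease D after observing answer S."""
    joint = {d: prior[d] * likelihood[d][answer] for d in prior}
    return normalize(joint)  # dividing by the sum is dividing by P(S)

def entropy(dist):
    """Shannon entropy in bits (an assumed uncertainty measure)."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

def expected_gain(prior, likelihood):
    """Current uncertainty minus the expected uncertainty after the answer."""
    answers = next(iter(likelihood.values())).keys()
    expected_after = 0.0
    for answer in answers:
        p_answer = sum(prior[d] * likelihood[d][answer] for d in prior)  # P(S)
        if p_answer > 0:
            expected_after += p_answer * entropy(bayes_update(prior, likelihood, answer))
    return entropy(prior) - expected_after

# Leading diseases before Question 7 of Figure 1, renormalized
# (illustrative simplification: the program itself tracks all 14 diseases).
prior = normalize({"OBSTR": 0.88, "PYE": 0.05, "FARF": 0.03})

# Proteinuria rows of Table I, normalized per the table footnote.
proteinuria = {
    "OBSTR": normalize({"0": 0.7, "trace to 2+": 0.3, "3+ to 4+": 0.001}),
    "PYE": normalize({"0": 0.4, "trace to 2+": 0.6, "3+ to 4+": 0.001}),
    "FARF": normalize({"0": 0.8, "trace to 2+": 0.2, "3+ to 4+": 0.001}),
}

posterior = bayes_update(prior, proteinuria, "0")   # reply 1: no proteinuria
gain = expected_gain(prior, proteinuria)            # value of asking the question
TERMINATION_PROBABILITY = 0.90                      # level used in the first experiment
done = max(posterior.values()) > TERMINATION_PROBABILITY
print({d: round(p, 3) for d, p in posterior.items()}, round(gain, 3), done)
```

With these inputs the revised probability of obstruction comes out near 0.93, close to the 0.94 printed after Question 7 in Figure 1; an exact match is not expected because the remaining eleven diseases are left out of the sketch. In a full version, `expected_gain` would be evaluated for every unasked question and the question with the largest value would be asked next.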

Disease list: The disease list for the phase I program is shown in Figure 1. For purposes of the present studies we have assumed that any patient will have one and only one disease; possible strategies for dealing with combinations of two or more diseases will be discussed later.

Attributes: The data base specifies the attributes of potential diagnostic significance for the diseases under consideration. We defined 31 questions relating to such items as the degree of hematuria, urine sodium concentration, urine sediment, history of recent streptococcal infection and roentgenologic data. For each question an appropriate choice of possible answers is provided. For example, for the question relating to proteinuria three answers are recognized; namely, none, trace to 2+, or 3+ to 4+ (Table I). For this particular question, more than three categories were thought to provide no additional diagnostic capability, whereas less than three were estimated to reduce the capacity to distinguish one disorder from another. In the case of other questions, the matrix consisted of as few as two and as many as seven possible answers. It should be noted that if extra attributes that have little diagnostic value are included in the data base, the program will automatically discover that they are unimportant through the use of the question selection function. Thus, when there was uncertainty about whether to include an attribute, it was retained rather than deleted. Figure 1 shows an example of how the questions relating to attributes are employed by the program.

In addition to diseases and attributes, the data base also contains relationships between them (see example, Table I), i.e., an estimate of the probability that a patient having a given disease will manifest a particular attribute. The number of such probabilities can be quite large; for example, in a medical area in which 20 diseases and 50 signs and symptoms are considered, the program requires 1,000 probabilities for disease-attribute pairs. The current phase I program utilizes 1,036 probabilities in the data base. The required probabilities were obtained by asking an experienced nephrologist a question such as the following: of every 100 patients known to have acute renal failure secondary to urinary tract obstruction, how many will have a urine volume less than 50 cc/day? Two basic assumptions underlie this approach. The first is that a clinician who performs well in diagnosis must have a good grasp of the relevant probabilities through his experience and familiarity with the literature. The second is that the physician can state his opinion with reasonable accuracy. We have relied on subjective probabilities in this study because examination of the literature concerning acute renal failure has indicated that detailed quantitative information is, in a high percentage of instances, not yet available.

A priori probabilities: The a priori probability of a disease represents the likelihood that the disease will be present in a patient, given no knowledge whatever concerning him except the general nature of his presenting problem. In acute renal failure, for example, the program must take into account that acute tubular necrosis is a much more common disease than cortical necrosis (Figure 1).

Routine questions: Four questions are asked routinely at the outset in an effort to simulate the usual approach of physicians (Figure 1). These questions ascertain the patient’s sex, age, blood pressure and urine volume at the onset of acute renal failure. Although the routine use of these questions is unnecessary because, as mentioned earlier, the program has the capability to analyze each question to ascertain its potential value, this approach seems desirable in order to make the program more acceptable to the user.

Questions related to specific diseases: The question selection function will not ordinarily choose a question relating to a rare disease because the expected information associated with such a question is quite small. Therefore, in an effort not to overlook a relatively uncommon disease because a critical question is not asked, we have introduced disease-related questions. That is, if at any stage during the diagnostic process the probability value for the rare disease in question exceeds a predetermined threshold, the disease-related question is automatically asked. For example, if the probability of scleroderma reaches a value of 0.1, a question is asked concerning the presence of skin, intestinal or lung lesions. Disease-related questions are also asked when a common disease with a clinical pattern similar to a rare disease achieves a high probability value. If, for example, the probability of acute glomerulonephritis exceeds 0.6, a question is asked concerning skin, lung or joint lesions in order to obtain data that might lead to a diagnosis of vasculitis.

The Phase II Program. Computer strategies: The phase II program differs from the phase I program in that it must balance the expected risk of a given strategy against the expected benefit. Technics derived from utility theory [2] were employed to make quantitative assessments of the opinions of experts. The basic ideas underlying these technics are discussed elsewhere [3]. Briefly, the expected value of a test in a given disease state reflects the diagnostic usefulness of the test measured against the associated pain, the cost, the time of skilled personnel required, the risk, etc. Similarly, the expected value of a treatment reflects the potential for benefiting the patient measured against the associated pain, risk and other costs. With regard to assessment of values in this study, two comments are pertinent. First, while preparing the data base it became apparent that in the case of acute renal failure, the risk of a serious complication from tests or treatments was far more important in our minds than monetary costs, discomfort and inconvenience. Second, all judgments regarding values were made only by us. No attempt has yet been made to account for the preferences of patients or of other physicians. The way in which the values of other physicians and patients can be incorporated is clearly an important matter that requires additional study.

In simplest terms, the test/treatment selection function utilized by the phase II program considers whether it is best to treat the patient immediately or to first carry out an additional diagnostic test. At any point in the problem-solving process the program considers each possible treatment on the assumption that no further diagnostic tests are to be used. It then selects the treatment with maximal expected value, that is, the treatment which promises the largest benefit relative to its risk [3]. However, before recommending this treatment, the program turns to the evaluation of all further available tests to determine whether additional testing offers an advantage. For each possible result of each test, the program, through the use of the inference function, simulates the change in the probability distribution which would occur if a given result were obtained. For this new distribution a new choice of optimal treatment is made and the expected value of this treatment is computed. Taking into account the likelihood of each result, an expected value of the test followed by the most appropriate treatment is obtained. The program then incorporates the risk of the test in order to determine whether the over-all expected value of the test is greater than that of immediate treatment.

The data base: The data base for the phase II program contains three tests (Table II). For each test there is a matrix which represents the likelihood that a test will yield various results, given that the patient has one of the 14 disease states. A second matrix contains the probabilities that each of these tests will lead to a complication in each of the diseases considered. For example, we estimated that an open kidney biopsy was accompanied by a 10 per cent chance of serious complication for the patient with functional acute renal failure and a 25 per cent chance of a serious complication for a patient with pyelonephritis and acute renal failure.
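The treat-now versus test-first comparison can be illustrated with a small Python sketch. Everything in it is hypothetical except the three consequence values, which are taken from the scale shown in Table IV: the two diseases "A" and "B", the two treatments, the test and all probabilities are invented for illustration. The text does not say exactly how the risk of a test is incorporated, so the sketch assumes the simplest choice, adding the expected cost of a serious complication.

```python
# Consequence values: improved, unchanged, worse (scale shown in Table IV).
OUTCOME_VALUES = (5000, -2500, -5000)

def treatment_value(dist, outcome_probs):
    """Expected value of one treatment under a disease distribution."""
    return sum(
        dist[d] * sum(p * v for p, v in zip(outcome_probs[d], OUTCOME_VALUES))
        for d in dist
    )

def best_treatment(dist, treatments):
    """Treatment with maximal expected value, assuming no further tests."""
    values = {name: treatment_value(dist, probs) for name, probs in treatments.items()}
    name = max(values, key=values.get)
    return name, values[name]

def value_of_testing_first(dist, result_probs, complication_prob, treatments):
    """Expected value of doing the test, then giving the best treatment."""
    results = next(iter(result_probs.values())).keys()
    value = 0.0
    for result in results:
        p_result = sum(dist[d] * result_probs[d][result] for d in dist)
        if p_result == 0:
            continue
        # Simulated Bayes revision of the distribution for this result.
        posterior = {d: dist[d] * result_probs[d][result] / p_result for d in dist}
        value += p_result * best_treatment(posterior, treatments)[1]
    # Assumption: test risk enters as the expected cost of a serious complication.
    p_complication = sum(dist[d] * complication_prob[d] for d in dist)
    return value + p_complication * OUTCOME_VALUES[2]

# Hypothetical two-disease example (not from the paper's data base).
dist = {"A": 0.5, "B": 0.5}
treatments = {
    "treat_A": {"A": (0.8, 0.1, 0.1), "B": (0.1, 0.3, 0.6)},
    "treat_B": {"A": (0.1, 0.3, 0.6), "B": (0.8, 0.1, 0.1)},
}
result_probs = {
    "A": {"positive": 0.9, "negative": 0.1},
    "B": {"positive": 0.1, "negative": 0.9},
}
complication_prob = {"A": 0.05, "B": 0.05}

treat_now = best_treatment(dist, treatments)
test_first = value_of_testing_first(dist, result_probs, complication_prob, treatments)
recommendation = "test first" if test_first > treat_now[1] else f"give {treat_now[0]} now"
print(treat_now, round(test_first, 1), recommendation)
```

In this invented case neither treatment is attractive while the two diseases are equally likely, so a reasonably discriminating test with a 5 per cent complication rate is worth its risk. Raising `complication_prob` or lowering the discriminating power of the test tips the comparison back toward immediate treatment.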

TABLE II  Treatments, Tests and Consequences: Phase II Program

Tests
(1) Biopsy. Open kidney biopsy under spinal anesthesia.
(2) Retrograde. Standard retrograde pyelography.
(3) Arteriography. Transfemoral renal arteriography.

Treatments
(1) The “Conservative” approach. Maintain fluid balance. Dialyze if necessary for the prevention or treatment of uremia. Continue program for at least seven days.
(2) Intravenous fluids. To volume-contracted patient, give fluids and electrolytes over a 1 or 2 day period in amounts sufficient to restore normal hydration. If state of hydration is uncertain or appears normal, give, on 2 successive days, 1 liter/day more of fluid than the amount required to maintain water balance.
(3) Surgery for urinary tract obstruction. Operate immediately to relieve the obstruction.
(4) Steroids. Give prednisone, 80 mg/day for at least 7 days. It is assumed that treatment with this drug is superimposed on the conservative approach (No. 1).
(5) Antibiotics. Give appropriate doses of antibiotics and adjust drug dose in light of renal function, quantitative urine cultures and patient’s clinical response. Treatment with antibiotics is superimposed on the conservative approach (No. 1).
(6) Surgery for clots. Operate immediately on renal vessels to remove clot.
(7) Antihypertensive drugs. Give sufficient quantity of antihypertensive drugs to bring blood pressure under adequate control. Treatment with drugs is superimposed on the conservative approach (No. 1).
(8) Heparin. Give sufficient heparin to prolong clotting time to 20 minutes. Treatment with this drug is superimposed on the conservative approach (No. 1).

Consequences
(1) Condition improved.
(2) Condition unchanged.
(3) Condition worse.

In addition to tests, the program contains eight treatments (Table II) and a matrix that gives the probabilities that a given treatment will result in one of the three consequences: “improved,” “unchanged” and “worse.” For example, it was estimated that following steroid therapy, a patient with acute renal failure secondary to vasculitis has a 15 per cent chance that his condition will improve, a 25 per cent chance that it will not change and a 60 per cent chance that it will become worse (Table III). Obviously, the discrimination provided by this degree of definition of outcomes is rather gross, but we felt justified in using this approach as a first approximation.

Refinement of the data base: The data bases for the phase I and phase II programs were refined by giving to each program a variety of clinical problems and checking the resulting decisions for agreement with ours. In this initial phase of study, the decisions of the program were notably different in a number of instances. In each such discrepancy, a review indicated that imprecision in the definitions of tests and treatments had led to incorrect subjective estimates in the data base. For example, we learned that we had not adequately defined factors such as duration of therapy or treatments that were mutually exclusive. For each program, then, the data base evolved from its initial definition to one which is now more specific as well as consistent with our judgment.

TABLE III  Sample of Data Base for Phase II Program: Therapeutic Effects of Steroid Therapy in Various Disease States (Expressed as Probabilities)

                                                          Result
Disease                                                   Improved   No Change   Worse
Acute tubular necrosis                                    0.60       0.20        0.20
Functional acute renal failure                            0.05       0.35        0.60
Urinary tract obstruction                                 0.05       0.60        0.35
Poststreptococcal glomerulonephritis                      0.40       0.40        0.20
Renal cortical necrosis                                   0.05       0.75        0.20
Hepatorenal syndrome                                      0.05       0.05        0.90
Acute fulminant pyelonephritis without obstruction        0.05       0.05        0.90
Atheromatous embolism                                     0.05       0.70        0.25
Renal infarction (bilateral)                              0.01       0.14        0.85
Renal vein thrombosis                                     0.10       0.30        0.60
Renal vasculitis                                          0.15       0.25        0.60
Scleroderma                                               0.05       0.05        0.90
Chronic glomerulonephritis with acute exacerbation        0.40       0.35        0.25
Malignant hypertension with malignant nephrosclerosis     0.05       0.05        0.90

Parenthetically, we should note that we made certain peripheral changes in the program to facilitate its use. For example, explanations of the intention of the program were written, and the responses of the program to various questions were provided in English. Large matrices were broken down into their component parts and made available to display in a simplified form that permits ready review of the data. In addition, the program was adapted to interact with a cathode ray tube console, making it possible to display large amounts of information rapidly.

RESULTS

Testing the Phase I Program. After preliminary testing, the phase I program was presented with 33 new simulated case histories considered to be typical examples of the 14 diseases in the data base. Each diagnostic problem consisted of a set of signs or symptoms for a hypothetical patient with a specified diagnosis. Whenever the program selected a question for use, the answer was provided in accordance with the hypothetical case history. Both the accuracy of the program’s diagnosis and the number of questions required were evaluated. The “termination level” at which a diagnosis was accepted by the program was set initially at 90 per cent. In all 33 instances the program reached the 0.90 probability level for some disease. Furthermore, in all but two of the hypothetical cases (renal vein thrombosis in children), the diagnosis made by the program was identical with that given by the physicians. In these latter two cases functional acute renal failure was diagnosed. The average number of questions asked before a diagnosis was reached (including the four obligatory questions relating to age, sex, urine volume and blood pressure level at the onset of oliguria) was 7.7. Thus, on the average, the program reached a 90 per cent certainty level by asking only about one fourth of the total of 31 questions that were available.

We then set the threshold for establishing a diagnosis at 0.95 instead of 0.90 and again presented the 33 hypothetical cases to the program. At this level there was total agreement between the diagnoses made by the program and those established by the physicians. The two cases of renal vein thrombosis misdiagnosed at the 90 per cent threshold level were now correctly diagnosed. In this experiment the program asked an average of 8.7 questions including the four obligatory questions.

Testing the Phase II Program. As a preliminary step the program was asked to select the most appropriate treatment for each disease in the data base. This was accomplished by setting the probability of each disease in turn to 1.0. Recall that the best treatment for a given disease is not stored in the data base. Instead, the data base contains probabilities of consequences of each treatment/disease pair (Table III) and the values of these consequences. The program selects a treatment by first computing an expected value for each treatment, given a disease.* In the case of acute tubular necrosis, for example, the program ranked “conservative therapy” as the best choice, obviously an appropriate recommendation. In fact, for each disease the treatment chosen by the program agreed with that chosen by us (Table IV).

TABLE IV  Computer’s Choice of Treatment and Associated Expected Value for Each Disease*

Disease                                                   Treatment                  Expected Value
(Condition improved)                                                                 (+5,000)
Functional acute renal failure                            Intravenous fluids         +3,750
Urinary tract obstruction                                 Surgery for obstruction    +3,125
Acute tubular necrosis                                    Conservative therapy       +1,750
Chronic glomerulonephritis with acute exacerbation        Conservative therapy       +850
Acute fulminant pyelonephritis (without obstruction)      Antibiotics                +750
Poststreptococcal glomerulonephritis                      Conservative therapy       +750
Renal infarction (bilateral)                              Surgical removal of clot   -1,250
Renal vein thrombosis                                     Surgical removal of clot   -1,250
Renal cortical necrosis                                   Heparin                    -1,875
Atheromatous embolism                                     Conservative therapy       -2,500
(Condition unchanged)                                                                (-2,500)
Renal vasculitis                                          Steroids                   -2,875
Malignant hypertension with malignant nephrosclerosis     Antihypertensive drugs     -3,125
Hepatorenal syndrome                                      Intravenous fluids         -3,875
Scleroderma                                               Conservative therapy       -4,250
(Condition worse)                                                                    (-5,000)

* In each instance the treatment selected by the computer for the disease state was the same as that chosen by the physicians.

In reviewing the expected values of the best treatment for all 14 diseases, it became apparent that these values are, as might be expected, a reliable measure of the seriousness of the clinical situation (Table IV). (We have here considered the value of a serious complication to be -5,000, and the value of a marked improvement in the patient’s condition, +5,000.) Thus, if the value of the best treatment for a given disease is very low, then either the disease is basically irreversible or any possibly effective treatment is hazardous. Similarly, if the value is high, then it is likely that the patient’s condition will improve with therapeutic measures which involve little risk.

Our study showed that the prognostic index for the 14 diseases in question ranged from a high of +3,750 to a low of -4,250 (Table IV). The diseases at the high end of this range included functional acute renal failure (FARF), urinary tract obstruction, acute tubular necrosis (ATN) and acute glomerulonephritis. The lowest values were obtained in diseases almost invariably irreversible; the three lowest values, for example, were found for patients with acute renal failure secondary to malignant hypertension, hepatorenal syndrome and scleroderma.

*It is evident that given the diagnosis, further tests cannot be useful because no further information can be gained.

The final test of the phase II program was conducted using a set of 18 hypothetical clinical problems in which there was a varying degree of uncertainty as to the true diagnosis (such as P(ATN) = 0.75 and P(FARF) = 0.25, i.e., a patient thought to have a 75 per cent chance of having acute tubular necrosis and a 25 per cent chance of having functional acute renal failure). The two nephrologists agreed on the appropriate test or treatment for each of these cases, and their decisions served as the standard for evaluating the response of the program. In 14 of the 18 cases the decision of the program and the physicians agreed (Table V), and in the remaining four cases the first choice made by the program was considered by the physicians to be a reasonable one (Table VI).
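The expected-value comparison that underlies these choices can be sketched in a few lines. This is only an illustration under stated assumptions: the consequence values come from the scale markers in Table IV, but every probability in `OUTCOME_PROBS` is hypothetical (the actual entries of the data base are not reproduced here), the function names are invented, and only two diseases and two treatments are considered, with diagnostic tests left out. The two single-disease entries were picked so that they reproduce the Table IV values of +3,750 (intravenous fluids in FARF) and +1,750 (conservative therapy in ATN).

```python
# Values of the three consequences, from the scale markers in Table IV.
OUTCOME_VALUES = {"improved": 5000, "unchanged": -2500, "worse": -5000}

# HYPOTHETICAL P(consequence | treatment, disease); illustrative only.
OUTCOME_PROBS = {
    ("intravenous fluids", "FARF"): {"improved": 0.85, "unchanged": 0.10, "worse": 0.05},
    ("intravenous fluids", "ATN"): {"improved": 0.30, "unchanged": 0.30, "worse": 0.40},
    ("conservative therapy", "FARF"): {"improved": 0.40, "unchanged": 0.50, "worse": 0.10},
    ("conservative therapy", "ATN"): {"improved": 0.60, "unchanged": 0.30, "worse": 0.10},
}

TREATMENTS = ["intravenous fluids", "conservative therapy"]


def value_given_disease(treatment, disease):
    """Expected value of a treatment when the disease is known."""
    probs = OUTCOME_PROBS[(treatment, disease)]
    return sum(p * OUTCOME_VALUES[outcome] for outcome, p in probs.items())


def expected_value(treatment, disease_probs):
    """Expected value of a treatment, weighted by the diagnostic probabilities."""
    return sum(p * value_given_disease(treatment, d) for d, p in disease_probs.items())


def best_treatment(disease_probs, treatments=TREATMENTS):
    """Treatment with the highest expected value for this patient."""
    return max(treatments, key=lambda t: expected_value(t, disease_probs))


certain_atn = {"ATN": 1.0, "FARF": 0.0}    # probability of one disease set to 1.0
uncertain = {"ATN": 0.50, "FARF": 0.50}    # same probabilities as Case 11 in Table V

print(best_treatment(certain_atn))
print(best_treatment(uncertain))
```

Setting one disease probability to 1.0 corresponds to the preliminary test of the phase II program; a mixed distribution corresponds to the 18 uncertain cases. Because the cross-disease probabilities are invented, the ranking for the mixed case shows only the mechanics, not the program's actual figures.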

TABLE V Choices of Tests and/or Treatments in the Hypothetical Cases Used to Test the Phase II Program*

| Case No. | Hypothetical Case | Physician 1st Choice | Program 1st Choice |
|---|---|---|---|
| 1 | ATN 0.12, OBSTR 0.75, AGN 0.13 | Retrograde pyelography | Retrograde pyelography |
| 2 | ATN 0.05, FARF 0.90, AGN 0.05 | Intravenous fluids | Intravenous fluids |
| 3 | ATN 0.85, FARF 0.10, AGN 0.05 | Intravenous fluids | Intravenous fluids |
| 4 | ATN 0.50, OBSTR 0.50 | Retrograde pyelography | Retrograde pyelography |
| 5 | SCL 0.05, CGAE 0.20, MH 0.75 | Antihypertensive drugs | Antihypertensive drugs |
| 6 | AGN 0.75, MH 0.25 | Antihypertensive drugs | Antihypertensive drugs |
| 7 | AGN 0.50, MH 0.50 | Antihypertensive drugs | Antihypertensive drugs |
| 8 | CN 0.50, RI 0.50 | Heparin | Arteriography |
| 9 | AGN 0.20, PYE 0.80 | Antibiotics | Antibiotics |
| 10 | AGN 0.40, PYE 0.40, AE 0.20 | Antibiotics | Antibiotics |
| 11 | ATN 0.50, FARF 0.50 | Intravenous fluids | Intravenous fluids |
| 12 | ATN 0.30, OBSTR 0.10, PYE 0.60 | Retrograde pyelography | Antibiotics |
| 13 | ATN 0.30, OBSTR 0.60, PYE 0.10 | Retrograde pyelography | Retrograde pyelography |
| 14 | SCL 0.50, MH 0.50 | Antihypertensive drugs | Antihypertensive drugs |
| 15 | ATN 0.50, AE 0.50 | Conservative therapy | Conservative therapy |
| 16 | AGN 0.60, CN 0.30, SCL 0.10 | Heparin | Conservative therapy |
| 17 | ATN 0.60, RVT 0.20, SCL 0.20 | Arteriography | Conservative therapy |
| 18 | AGN 0.50, VASC 0.50 | Steroids | Steroids |

* For disease names matching abbreviations see Figure 1.

TABLE VI Detailed Analysis of Hypothetical Cases in Which Physicians’ and Program’s 1st Decision Differed

| Case No. | Diagnostic Probabilities* | Physician Choices | Program Choices | Expected Value | Per cent Difference Between Utilities† |
|---|---|---|---|---|---|
| 8 | CN 0.50, RI 0.50 | 1st: Heparin | 1st: Arteriography | -2,268 | . . . |
| | | 2nd: Arteriography | 2nd: Surgery (clot) | -2,375 | 1.1 |
| | | | 3rd: Heparin | -2,875 | 6.1 |
| 12 | ATN 0.30, OBSTR 0.10, PYE 0.60 | 1st: Retrograde pyelography | 1st: Antibiotics | +350 | . . . |
| | | 2nd: Antibiotics | 2nd: Retrograde pyelography | -590 | 9.4 |
| 16 | AGN 0.60, CN 0.30, SCL 0.10 | 1st: Heparin | 1st: Conservative therapy | -988 | . . . |
| | | 2nd: Biopsy | 2nd: Heparin | -1,000 | 0.1 |
| 17 | ATN 0.60, RVT 0.20, SCL 0.20 | 1st: Arteriography | 1st: Conservative therapy | -400 | . . . |
| | | 2nd: Conservative therapy | 2nd: Heparin | -475 | 0.8 |

* For disease names matching abbreviations see Figure 1.

† These calculations are based on the full utility scale of from -5,000 to +5,000.

COMMENTS

Formal strategies for dealing with the problem of risks and benefits of one course versus another, so-called decision analysis, have been utilized widely in management and in economics, but have received scant attention in medicine. In a previous paper, we have described the qualitative approach to the use of decision analysis in clinical practice and have suggested that this technic can not only be applied to difficult medical problems, but also that it can be taught easily [3]. We have also described the use of quantitative decision analysis [3] but have indicated that the quantitative approach is not likely to win wide acceptance among physicians since the computational requirements are relatively large and the effort involved probably cannot be justified except in the occasional very complex case.

Given this problem, we have in the present study turned to the computer for a possible solution. In essence, we have explored the possibility that the computer might make the quantitative approach sufficiently simple so that formal decision analysis might potentially become widely useful as a means of assisting the physician in determining the values of alternative courses of action.

Our experience with a prototype program for acute renal failure supports the value of this approach. Using data bases derived from expert opinion, the program performed notably well when challenged by a series of hypothetical acute renal failure problems; in almost all instances the diagnostic or management decision was the same as that of two experts. Specifically, the phase I program, which deals with low risk, routine tests, diagnosed all 33 test cases correctly at the 0.95 probability level after utilizing only one third to one half of the questions available. The phase II program, which deals with tests and treatments involving significant risks, made choices which agreed with the experts in 14 of 18 cases, and in each instance in which disagreements occurred, the first choice made by the program was a reasonable alternative to that made by the clinicians.

These results support the view, expressed earlier, that subjective probability estimates should provide a good initial basis for computer-based decision analysis of complex clinical problems. It is evident that such probability estimates must be approximately correct if the program is to perform successfully; but it also appears that small errors in the estimates are not important, primarily because the decisions made by the program are based on the combination of large aggregates of such numbers. The influence of small errors is probably further lessened by the fact that value judgments also play an important role in shaping the program’s decision.

Potential Advantages of a Computer-Based System. If a computer-based system were eventually to become operational and widely used, it would offer a number of advantages which are perhaps worth emphasizing. First, because computers can store large volumes of information relevant to a clinical problem (e.g., probabilities, values), they would reduce the burden on physician memory that is now imposed by a huge and burgeoning number of important medical facts. Thus, facts relevant to each clinical problem would be immediately accessible in an organized framework as defined by the appropriate decision tree. A second advantage of a computer-based system would be its consistent performance. Although most physicians can make reasonable judgments regarding probabilities and values of outcomes, they, for a variety of reasons (inadequate time, fatigue, recent unfortunate outcome in a similar case), do not perform consistently in merging these judgments into diagnostic and therapeutic strategies. By contrast, given the data, the performance of the computer program does not vary. Finally, the system would permit ready separation of probabilities from value judgments. Since this in effect would make it convenient to introduce information that is specific for the patient, it would protect against a possible dehumanizing influence of the computer.

Problems for Future Research in Decision Analysis. Although our initial work has proved encouraging, it is evident that there are a number of important problems that must be resolved before computer-based decision analysis can become a practical tool for assisting the clinician. Patients frequently do not present with uncomplicated, clearly defined illnesses such as considered in the acute renal failure program, and technics must therefore be devised which will allow the program to cope with the real world complexities. Let us briefly consider several problems posed by these complexities and possible strategies for their solution. First, in both diagnosis and management the experimental program does not consider the possibility that the patient is suffering from more than one disease. Thus, we have no capability for dealing with the complex issues that may result when the clinical presentation of the primary illness is modified by a second disorder. In certain cases the solution to this problem may lie in the redefinition of the discrete disorders such as “obstructive uropathy” and “pyelonephritis” into a single new disease state, “obstructive uropathy and pyelonephritis.” If, however, there are large numbers of such combinations, this approach will almost certainly make excessive demands on the construction of appropriate data bases and for this reason other solutions may well have to be sought.

Our present program also does not consider the fact that the signs and symptoms of a disease often change over time and that the clinical picture may therefore vary enormously depending on the point in his illness at which the patient presents himself to the physician. Acute glomerulonephritis is a good example of this problem. A single set of probabilities obviously will not describe adequately the findings that are encountered at each stage of the disease, nor will such a set of probabilities provide a proper basis for management decisions that are in many instances a critical function of the stage of the disease (e.g., conservative management early in the illness as compared to renal transplantation or maintenance dialysis after several months of anuria). Still a further problem relating to choice of treatment is brought into focus by the difference between the recommendations of the physician and the management program in Case 12 (Table VI). In this case the difficulty clearly stemmed from the fact that two simultaneous approaches were indicated; one, a treatment for the acute problem posed by severe pyelonephritis, and the other, a diagnostic procedure (retrograde pyelography) to rule out the possibility of underlying urinary tract obstruction. Obviously, programs must be constructed which will have the capability of recognizing circumstances such as these in which two or more actions should be carried out concurrently.

A further problem to be faced is the means of deriving appropriate values for various treatments or tests. In the present study the clinicians stated their preferences for a “typical” patient with acute renal failure; in so doing they made global judgments without an explicit analysis of those factors which might be relevant to a particular patient. These might include quality of life, disability, loss of income, pain, anxiety, etc. For the program to have practical value it must incorporate such patient-specific information provided by either the individual physician or the patient himself. This list of limitations, which could readily be extended, points to a few of the important directions in which future work must be directed.

The Limitations of Decision Analysis. Although decision analysis promises to be of considerable value in the diagnosis and management of limited, well defined clinical problems, this technic alone is unlikely to deal with the wide range of complex issues involved in computer-aided decision making. One basic problem is that the introduction of the full complexity of clinical medicine produces a tremendous expansion of the data base, making its analysis unfeasible even for a computer. Thus, a program, if it is to be broadly useful, must somehow rapidly narrow the scope of a problem so that the use of decision analysis becomes feasible. What will these strategies be? We believe that some hint can be derived from the considerations examined in our previous paper on the qualitative approach to decision analysis [3]. As discussed in this previous paper, the expert clinician recognizes many patterns or situations which enable him to reduce the definition of the clinical problem to manageable size prior to his detailed analysis, and it appears that a computer-based system will have to incorporate similar strategies if it is to successfully deal with a broad range of complex problems. For example, if the physician knows that there is an epidemic of β-hemolytic streptococcal infection in the community and a patient presents with proteinuria, gross hematuria and red blood cell casts, he can make an immediate preliminary diagnosis of acute glomerulonephritis without any attempt at analysis. His experience has shown this to be a “good guess” and the capacity to use such experience will almost certainly be a feature of an effective computer system. Similarly, just as a competent clinician discounts the diagnostic significance of hematuria in a patient with an indwelling catheter, so the computer will have to do the same without elaborate formalisms. Such strategies are important since their use would greatly diminish the size of the data base that would otherwise have to be constructed when dealing with a presenting finding such as abdominal pain, coma and other difficult problems in diagnosis and management.

If relatively few such “pieces of advice” were necessary in order for the program to deal effectively with a complex area such as renal disease, a decision analysis program could be modified to incorporate such information. We believe, however, that the number of rules a program would need to cope with a difficult case in its entirety is too large to be accommodated in a structured program. Instead, there is a need to develop a system which can accept such rules directly from the physician. A general approach to such a programming strategy has recently been described [6] and would appear to provide a logical starting point for the development of a program that could supplement the approach we describe. Such a system, by providing the capability to modify and control decision analysis, could, in principle, make it possible for the computer ultimately to become a broadly useful tool in both the diagnosis and management of a difficult clinical problem.

APPENDIX

The Question Selection Function. The uncertainty measure used in the question selection function is based on the probability distribution for the diseases. If one disease has a very high probability, then there is little uncertainty; whereas if all diseases are equally likely, the uncertainty is at a maximum. For a given probability distribution, the program uses a function H (called entropy) to assign a number to the associated uncertainty [7].

If we designate the entropy associated with probability $p$ for condition 1 and probability $1-p$ for condition 2 by $H(p, 1-p)$, the following general relationships hold: (1) $H(p, 1-p) = H(1-p, p)$ for any $p$ between 0 and 1. (2) $H(1.0, 0) < H(0.9, 0.1) < H(0.8, 0.2) < \ldots < H(0.5, 0.5)$. Therefore, the more uncertain the diagnosis, the greater the value of H. Because it has a number of properties which correspond to our intuitive notions of uncertainty, the entropy function has proved a useful measure in many contexts, such as analysis of language by computers, engineering design and the design of codes for communication [7].
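As a worked check of relationship (2), the two-condition entropy can be evaluated at a few points using the definition of H given with Figure 2 (logarithms to the base 2, with the convention that a zero-probability term contributes 0). The numerical values below are our own arithmetic, rounded to three decimals, and do not appear in the original text.

```latex
\begin{aligned}
H(p, 1-p)   &= -p \log_2 p - (1-p) \log_2 (1-p) \\
H(1.0, 0)   &= 0 \\
H(0.9, 0.1) &= -0.9 \log_2 0.9 - 0.1 \log_2 0.1 \approx 0.137 + 0.332 = 0.469 \\
H(0.8, 0.2) &= -0.8 \log_2 0.8 - 0.2 \log_2 0.2 \approx 0.258 + 0.464 = 0.722 \\
H(0.5, 0.5) &= -0.5 \log_2 0.5 - 0.5 \log_2 0.5 = 1
\end{aligned}
```

The values increase steadily from 0 (diagnosis certain) to 1 (two equally likely diagnoses), in agreement with the ordering stated above.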

The program first determines the value of H for the current probability distribution. Then, for each possible question, the program computes an expected value for H which will result if that question is used. The general way in which this is done can be seen in the following example.

Suppose only two diseases are being considered, acute tubular necrosis and functional acute renal failure, with current probabilities of 0.30 and 0.70, respectively. The value of H associated with these probabilities is calculated. Consider the question “Are there tubular cell casts (TCC) in the patient’s urine?” Assume there are two possible answers to this question, “Yes” (denoted by “TCC”) and “No” (denoted by “no TCC”). For each answer, the program simulates the receipt of that answer and uses Bayes rule to compute a revised probability distribution (see the previous paper [3]). The value of H for this new distribution is then computed. Given the current disease probabilities and the probabilities that each answer will be received (the expression in the denominator of Bayes rule), it is then possible to compute an expected value of H which incorporates the likelihood of all possible answers to the question. The diagram in Figure 2 gives a schematic of this process. For this example, on the average, the question promises to reduce uncertainty.

Figure 2. Structure of the question selection function used in the phase I computer program. For further details, see “Appendix.”

[Diagram: the current distribution P(ATN, FARF) = (0.30, 0.70) leads to the question “Tubular cell casts?” The answer “TCC” has probability P(TCC) = 0.38 and yields P(ATN, FARF) = (0.64, 0.36); the answer “no TCC” has probability P(no TCC) = 0.62 and yields P(ATN, FARF) = (0.10, 0.90).]

H is defined by

$$H(p_1, p_2, \ldots, p_n) = -p_1 \log_2 p_1 - p_2 \log_2 p_2 - \cdots - p_n \log_2 p_n$$

where $0 \le p_k \le 1$ for $k = 1, 2, \ldots, n$ and $p_1 + p_2 + \cdots + p_n = 1$; $\log_2 p_k$ is the logarithm of $p_k$ to the base 2. If we define $p_k \log_2 p_k = 0$ for $p_k = 0$, then $-p_k \log_2 p_k \ge 0$ for all $p_k$ between 0 and 1.

For the present example the expected reduction in uncertainty is

$$H(0.30, 0.70) - 0.38\,H(0.64, 0.36) - 0.62\,H(0.10, 0.90) = 0.2326$$

The question selection function chooses that question for which the reduction in uncertainty is greatest (Figure 2).
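The selection rule can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function names and the second, deliberately uninformative question are hypothetical, and the likelihoods P(TCC | ATN) = 0.8 and P(TCC | FARF) = 0.2 are assumptions chosen because they reproduce P(TCC) = 0.38 and come close to the revised distributions shown in Figure 2. The resulting reduction is therefore near, but not identical to, the figure's value.

```python
import math


def entropy(dist):
    """H = -sum p_k log2 p_k, with zero-probability terms contributing 0."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)


def expected_entropy_reduction(prior, likelihoods):
    """Current H minus the expected H after asking one question.

    likelihoods[answer][disease] is P(answer | disease).
    """
    expected_h = 0.0
    for lik in likelihoods.values():
        # Denominator of Bayes rule: probability of receiving this answer.
        p_answer = sum(prior[d] * lik[d] for d in prior)
        if p_answer == 0:
            continue
        posterior = {d: prior[d] * lik[d] / p_answer for d in prior}
        expected_h += p_answer * entropy(posterior)
    return entropy(prior) - expected_h


def select_question(prior, questions):
    """Choose the question with the greatest expected reduction in H."""
    return max(questions, key=lambda q: expected_entropy_reduction(prior, questions[q]))


prior = {"ATN": 0.30, "FARF": 0.70}

# ASSUMED likelihoods; not taken from the paper's data base.
questions = {
    "tubular cell casts": {
        "TCC": {"ATN": 0.8, "FARF": 0.2},
        "no TCC": {"ATN": 0.2, "FARF": 0.8},
    },
    "uninformative question": {
        "yes": {"ATN": 0.5, "FARF": 0.5},
        "no": {"ATN": 0.5, "FARF": 0.5},
    },
}

reduction = expected_entropy_reduction(prior, questions["tubular cell casts"])
print(round(reduction, 3))
print(select_question(prior, questions))
```

A question whose answers are equally likely under every disease leaves the distribution unchanged and so offers no expected reduction, which is why the rule prefers the question about tubular cell casts in this toy setting.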


REFERENCES

1. Lusted LB: Introduction to Medical Decision Making. Springfield, Ill, Charles C Thomas, 1968.

2. Raiffa H: Decision Analysis: Introductory Lectures on Choices Under Uncertainty. Reading, Mass, Addison-Wesley, 1968.

3. Schwartz WB, Gorry GA, Kassirer JP, Essig A: Decision analysis and clinical judgment. Amer J Med 55: 459, 1973.

4. Gorry GA, Barnett GO: Sequential diagnosis by computer. JAMA 205: 849, 1968.

5. Gorry GA: Strategies for computer-aided diagnosis. Mathematical Biosciences 2: 293, 1968.

6. Sussman GJ, Winograd T, Charniak E: Microplanner Reference Manual: Memo No. 203 A of the Artificial Intelligence Laboratory, Cambridge, Mass, 1971.

7. Pierce JR: Symbols, Signals and Noise: The Nature and Process of Communication. New York, Harper Torchbooks, 1961.
