Lucila Ohno-Machado, MD, PhD Division of Health Sciences and Technology Harvard Medical School...

Post on 18-Jan-2018


Lucila Ohno-Machado, MD, PhD
machado@dsg.harvard.edu

Division of Health Sciences and Technology

Harvard Medical School
Massachusetts Institute of Technology

Introduction to HST 951
Medical Decision Support

Welcome

Objectives
• Provide a practical approach to medical decision support
• Put a strong emphasis on computer-based applications that utilize concepts from the fields of artificial intelligence and statistics
• Focus on principled predictive modeling in biomedicine

Audience
• Background in quantitative methods is desirable
• Undergraduates
• Graduate students and post-doctoral fellows (MDs) in medical informatics

Decision Support Cycle

[Figure: cycle diagram with the stages Goals, Model Selection, Data Pre-Processing, Model Construction, and System Evaluation.]

Types of Models

What type of support is needed?

• “Exploratory analysis”
• “Confirmatory analysis” (gold-standard)

• Clustering
• Classification

Neural Networks

[Figure: network diagram. Inputs (Age = 34, Gender = 2, Mitoses = 4) connect through weighted links to an output labeled “Probability of Cancer” (0.6).]

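Logistic Regression

[Figure: inputs (independent variables: Age = 34, Gender = 1, Mitoses = 4) are multiplied by coefficients (.5, .8, .4) and combined into an output (prediction) labeled “Probability of cancer” (0.6).]

$$p = \frac{1}{1 + e^{-\left(\sum_i \beta_i x_i + \text{cte}\right)}}$$

where $x_i$ are the inputs, $\beta_i$ are the coefficients, and cte is the constant term.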
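The logistic formula can be sketched in a few lines of Python. This is a minimal illustration, not the course's implementation: the coefficient and constant values below are hypothetical, because the slide does not say which coefficient belongs to which input or what the constant term is.

```python
import math

def logistic_probability(inputs, coefficients, constant):
    """Return p = 1 / (1 + exp(-(sum_i beta_i * x_i + constant)))."""
    linear_term = sum(b * x for b, x in zip(coefficients, inputs)) + constant
    return 1.0 / (1.0 + math.exp(-linear_term))

# Hypothetical values, chosen only for illustration (not taken from the slide).
inputs = [34, 1, 4]              # age, gender, mitoses
coefficients = [0.05, 0.8, 0.4]  # assumed betas
constant = -4.0                  # assumed constant term ("cte")

p = logistic_probability(inputs, coefficients, constant)
print(f"predicted probability: {p:.3f}")
```

When the linear term is zero the function returns exactly 0.5, which is a quick sanity check on any implementation.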

CART

Rough Sets

Models

Requirements, Strengths and Weaknesses, Application Examples

• Naïve Bayes
• Bayesian Networks
• Logistic Regression
• Neural Networks
• Classification Trees
• Rough Set Models
• Support Vector Machines
• Clustering (Hierarchical and Partitioning)

Evaluation and Comparisons

Classification
• Calibration (plots, goodness-of-fit)
• Discrimination (ROC areas)
• Explanation (variable selection)
• Outliers, influential observations (case selection)

Clustering
• Distance metrics
• Homogeneity
• Inter-cluster distance
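The clustering criteria listed above can be made concrete with a small sketch. It assumes Euclidean distance as the metric, mean distance to the centroid as a homogeneity measure, and centroid-to-centroid distance as the inter-cluster distance; the slides do not fix any of these choices, and the two clusters are made-up points.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def centroid(points):
    """Coordinate-wise mean of a list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))

def mean_distance_to_centroid(points):
    """One possible homogeneity measure: smaller means a tighter cluster."""
    c = centroid(points)
    return sum(euclidean(p, c) for p in points) / len(points)

# Hypothetical two-dimensional clusters.
cluster_a = [(0.0, 0.0), (0.0, 2.0), (2.0, 0.0), (2.0, 2.0)]
cluster_b = [(10.0, 10.0), (10.0, 12.0), (12.0, 10.0), (12.0, 12.0)]

homogeneity_a = mean_distance_to_centroid(cluster_a)
inter_cluster = euclidean(centroid(cluster_a), centroid(cluster_b))
print(homogeneity_a, inter_cluster)
```

A good partition has small within-cluster distances relative to the distance between clusters.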

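[Figure: normal (nl) and disease cases along an axis from 1.0 to 3.0, with a threshold at 1.7 separating the TN, FN, FP, and TP regions.]

                   True nl   True D   Total
Predicted “D”         10       40       50
Predicted “nl”        40       10       50
Total                 50       50

Sensitivity = 40/50 = .8
Specificity = 40/50 = .8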
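The arithmetic above can be checked with a short sketch. It reads the 2x2 table as TP = 40, FN = 10, TN = 40, FP = 10; the function names are illustrative.

```python
def sensitivity(tp, fn):
    """Fraction of diseased cases that the test labels as diseased."""
    return tp / (tp + fn)

def specificity(tn, fp):
    """Fraction of normal cases that the test labels as normal."""
    return tn / (tn + fp)

# Counts read from the 2x2 table at threshold 1.7.
tp, fn, tn, fp = 40, 10, 40, 10

print(sensitivity(tp, fn))  # 0.8
print(specificity(tn, fp))  # 0.8
```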

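ROC curve

[Figure: sensitivity (0 to 1) plotted against 1 - specificity (0 to 1), with points marked for Threshold 1.4, Threshold 1.7, and Threshold 2.0. Three 2x2 tables accompany the plot.]

                   True nl   True D   Total
Predicted “D”         20       50       70
Predicted “nl”        30        0       30
Total                 50       50

                   True nl   True D   Total
Predicted “D”         10       40       50
Predicted “nl”        40       10       50
Total                 50       50

                   True nl   True D   Total
Predicted “D”          0       40       40
Predicted “nl”        50       10       60
Total                 50       50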
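Each 2x2 table contributes one point to the ROC curve. The sketch below assumes the three tables correspond to thresholds 1.4, 1.7, and 2.0 in that order (my reading of the figure, since a lower threshold labels more cases as diseased), and it uses the trapezoidal rule for the area, which is one common choice rather than something the slide specifies.

```python
def roc_point(tp, fn, fp, tn):
    """Return (1 - specificity, sensitivity) for one 2x2 table."""
    false_positive_rate = fp / (fp + tn)
    true_positive_rate = tp / (tp + fn)
    return (false_positive_rate, true_positive_rate)

# Counts read from the three tables; the threshold labels are an assumption.
tables = {
    1.4: dict(tp=50, fn=0, fp=20, tn=30),
    1.7: dict(tp=40, fn=10, fp=10, tn=40),
    2.0: dict(tp=40, fn=10, fp=0, tn=50),
}

# Add the trivial end points (0, 0) and (1, 1), then sort by false positive rate.
points = sorted([roc_point(**t) for t in tables.values()] + [(0.0, 0.0), (1.0, 1.0)])

# Trapezoidal area under the piecewise-linear curve through the points.
auc = sum((x1 - x0) * (y0 + y1) / 2 for (x0, y0), (x1, y1) in zip(points, points[1:]))
print(points)
print(round(auc, 3))
```

With only three operating points the area is a coarse estimate; more thresholds give a smoother curve.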

ROC Curves

[Figure: ROC curves (sensitivity vs. 1 - specificity, both axes from 0 to 1) for the LR, NN, and RS models.]

Calibration Curves

[Figure: sum of system's estimates plotted against sum of real outcomes, both axes from 0 to 1; one region of the plot is labeled “overestimation”.]

[Figure: three calibration plots titled RS Model, LR Model, and NN Model. Each has a horizontal axis labeled “Observed” (0 to 0.8) and a vertical axis from 0 to 1.]
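A calibration curve of this kind can be built by sorting cases by predicted probability, splitting them into groups, and comparing the sum of the estimates with the sum of the real outcomes in each group. The sketch below assumes equal-sized groups and uses made-up predictions and outcomes; the slides do not say how the groups were formed.

```python
def calibration_table(predicted, observed, n_groups):
    """Return (sum of estimates, sum of outcomes) for each group of cases.

    Cases are sorted by predicted probability and split into n_groups
    consecutive groups of (nearly) equal size.
    """
    pairs = sorted(zip(predicted, observed))
    groups = []
    for g in range(n_groups):
        start = g * len(pairs) // n_groups
        end = (g + 1) * len(pairs) // n_groups
        chunk = pairs[start:end]
        groups.append((sum(p for p, _ in chunk), sum(o for _, o in chunk)))
    return groups

# Hypothetical model outputs and true outcomes (1 = event occurred).
predicted = [0.1, 0.1, 0.2, 0.2, 0.8, 0.8, 0.9, 0.9]
observed = [0, 0, 0, 1, 1, 1, 1, 0]

for estimate_sum, outcome_sum in calibration_table(predicted, observed, 2):
    label = "overestimation" if estimate_sum > outcome_sum else "no overestimation"
    print(f"estimates={estimate_sum:.1f} outcomes={outcome_sum} -> {label}")
```

A well-calibrated model gives points close to the diagonal, where the two sums agree.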

Important Topics

• Decision Analysis
• Cost-effectiveness analysis

• Design of Experiments

• Real-World Applications

• Blocking inferences: quantifying anonymity

Examples of Projects

Students have worked in the past in different domains
• Diagnosis of
  – Coronary Artery Disease
  – Breast Cancer
  – Melanoma
• Prognosis in
  – Interventional Cardiology
  – Spinal Cord Injury
  – AIDS
  – Pregnancy

Data Mining and Predictive Modeling in (Bio) Medical Databases

[Figure: area under ROC (0.75 to 0.91) and balance (0.05 to 0.45) by year (1 to 6) for the Logistic and Neural Net models.]

We emphasize comparison of different models

[Figure: Logistic Regression.]

Modeling the Risk of Major In-Hospital Complications Following Percutaneous Coronary Interventions

Frederic S. Resnic, Lucila Ohno-Machado, Gavin J. Blake, Jimmy Pavliska, Andrew Selwyn, Jeffrey J. Popma

ACC, 2000

Methods

• Consecutive BWH patients, 1/97 through 2/99 randomly divided into training (n = 1,877) and test (n = 927) sets

• Outcomes: death and combined death, CABG or MI (MACE)

• Validation using independent dataset: 3/99 - 12/99 (n = 1,460)

Dataset: Attributes

History: age, gender, diabetes, iddm, history CABG, baseline creatinine, CRI, ESRD, hyperlipidemia
Presentation: acute MI, primary, rescue, CHF class, angina class, cardiogenic shock, failed CABG
Angiographic: occluded, lesion type (A, B1, B2, C), graft lesion, vessel treated, ostial
Procedural: number lesions, multivessel, number stents, stent types (8), closure device, gp 2b3a antagonists, dissection post, rotablator, atherectomy, angiojet, max pre stenosis, max post stenosis, no reflow
Operator/Lab: annual volume, device experience, daily volume, lab device experience, unscheduled case

Data Source: Medical Record, Clinician Derived

Study Population

                           Development Set    Validation Set
                           1/97-2/99          3/99-12/99        p-value
Cases                      2,804              1,460
Women                      909 (32.4%)        433 (29.7%)       p=.066
Age > 74yrs                595 (21.2%)        308 (22.5%)       p=.340
Acute MI                   250 (8.9%)         144 (9.9%)        p=.311
  Primary                  156 (5.6%)         95 (6.5%)         p=.214
  Shock                    62 (2.2%)          20 (1.4%)         p=.058
Class 3/4 CHF              176 (6.3%)         80 (5.5%)         p=.298
gp IIb/IIIa antagonist     1,005 (35.8%)      777 (53.2%)       p<.001
Death                      67 (2.4%)          24 (1.6%)         p=.110
Death, MI, CABG (MACE)     177 (6.3%)         96 (6.6%)         p=.739


Logistic regression

These models are based on statistics and can only discover linear relationships among the data: the log-odds of the outcome are modeled as a linear combination of the inputs.

Complications in Coronary Intervention

[Figure: the inputs age, IDDM, CHF class, type, number, and procedure feed a model whose output is the probability of complication (0.6).]

Logistic and Score Models for Death

                         Logistic Regression Model                   Prognostic Risk Score Model
Variable                 Odds Ratio   p-value   beta coefficient     Risk Value
Age > 74yrs              2.51         0.02       0.921                2
B2/C Lesion              2.12         0.05       0.752                1
Acute MI                 2.06         0.13       0.724                1
Class 3/4 CHF            8.41         0.00       2.129                4
Left main PCI            5.93         0.03       1.779                3
IIb/IIIa Use             0.57         0.20      -0.554               -1
Stent Use                0.53         0.12      -0.626               -1
Cardiogenic Shock        7.53         0.00       2.019                4
Unstable Angina          1.70         0.17       0.531                1
Tachycardic              2.78         0.04       1.022                2
Chronic Renal Insuf.     2.58         0.06       0.948                2
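The relation between the columns can be checked numerically: each beta coefficient is approximately the natural logarithm of the odds ratio. The sketch below also tries one possible scaling rule for the risk values, dividing each beta by the smallest absolute beta (0.531) and rounding. That rule is my assumption: it happens to reproduce the listed values, but the slides do not state how the score was derived. The example patient is hypothetical.

```python
import math

# (variable, odds ratio, beta coefficient, risk value) as listed on the slide.
MODEL = [
    ("Age > 74yrs", 2.51, 0.921, 2),
    ("B2/C Lesion", 2.12, 0.752, 1),
    ("Acute MI", 2.06, 0.724, 1),
    ("Class 3/4 CHF", 8.41, 2.129, 4),
    ("Left main PCI", 5.93, 1.779, 3),
    ("IIb/IIIa Use", 0.57, -0.554, -1),
    ("Stent Use", 0.53, -0.626, -1),
    ("Cardiogenic Shock", 7.53, 2.019, 4),
    ("Unstable Angina", 1.70, 0.531, 1),
    ("Tachycardic", 2.78, 1.022, 2),
    ("Chronic Renal Insuf.", 2.58, 0.948, 2),
]

# beta should be close to ln(odds ratio); report the largest discrepancy.
max_gap = max(abs(math.log(odds) - beta) for _, odds, beta, _ in MODEL)

# Assumed scaling rule: divide by the smallest |beta| and round to an integer.
unit = min(abs(beta) for _, _, beta, _ in MODEL)
scaled = {name: round(beta / unit) for name, _, beta, _ in MODEL}

def risk_score(present_factors):
    """Sum the listed risk values for the factors present in a patient."""
    values = {name: value for name, _, _, value in MODEL}
    return sum(values[name] for name in present_factors)

# Hypothetical patient: older than 74 and in cardiogenic shock.
print(max_gap, risk_score({"Age > 74yrs", "Cardiogenic Shock"}))
```

An integer score like this trades some precision for a model that can be added up at the bedside.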

Neural Network

[Figure: network diagram. Inputs (independent variables: Age = 34, Gender = 2, Mitoses = 4) connect through weights to a hidden layer, which connects through further weights to the prediction (dependent variable) labeled “Probability of Cancer” (0.6).]

Neural networks

These are mathematical models that can discover non-linear relationships among the data
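A forward pass through a network with one hidden layer shows where the non-linearity comes from: each hidden unit applies a sigmoid to a weighted sum of the inputs, and the output unit does the same to the hidden activations. This is a minimal sketch with hypothetical weights and pre-scaled inputs; it is not the network shown in the figure, whose wiring cannot be recovered from the transcript.

```python
import math

def sigmoid(z):
    """Logistic activation, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def forward(inputs, hidden_weights, hidden_biases, output_weights, output_bias):
    """Compute the output of a one-hidden-layer network."""
    hidden = [
        sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)
        for weights, bias in zip(hidden_weights, hidden_biases)
    ]
    return sigmoid(sum(w * h for w, h in zip(output_weights, hidden)) + output_bias)

# Hypothetical, already-scaled inputs and weights (two hidden units).
inputs = [0.34, 1.0, 0.4]                       # age/100, gender, mitoses/10
hidden_weights = [[0.6, 0.5, 0.8], [0.2, 0.1, 0.3]]
hidden_biases = [0.0, 0.0]
output_weights = [0.7, 0.2]
output_bias = -0.5

print(forward(inputs, hidden_weights, hidden_biases, output_weights, output_bias))
```

Without the hidden layer the model reduces to logistic regression, which is why the two are natural candidates for comparison.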

Neural networks for predicting death and complications

[Figure: network with inputs age, IDDM, CHF class, type, number, and procedure, and outputs disease free, death, and other complications.]

Death Models
Validation Set: 1460 Cases

[Figure: ROC curves (sensitivity vs. 1 - specificity, both axes from 0.00 to 1.00) for the LR, Score, and aNN models, with a reference line at ROC = 0.50.]

ROC Area
LR: 0.840
Score: 0.855
aNN: 0.835

Risk Score of Death: BWH Experience
Unadjusted Overall Mortality Rate = 2.1%

[Figure: bar chart of number of cases (0 to 3000) and mortality risk (0% to 60%) by risk score category (0 to 2, 3 to 4, 5 to 6, 7 to 8, 9 to 10, >10). Data labels as extracted, in order: 53.6%, 12.4%, 21.5%, 2.2%; 62%, 26%, 7.6%, 2.9%, 1.6%, 1.3%, 0.4%, 1.4%.]

CART

Regression Trees

These are models that partition the data using one variable at a time, and can model non-linear relationships among data
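Partitioning on one variable at a time can be sketched as a search for the threshold that leaves the purest groups on each side. The example below assumes Gini impurity as the purity measure, which is a common choice in CART but is not stated in the slides, and it uses made-up ages and outcomes. A full tree would repeat this search over every variable and recurse on each side.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels (0.0 means a pure group)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)

def best_split(values, labels):
    """Return (threshold, weighted impurity) of the best 'value <= threshold' split."""
    best = None
    # The largest value is skipped so that the right-hand side is never empty.
    for threshold in sorted(set(values))[:-1]:
        left = [y for x, y in zip(values, labels) if x <= threshold]
        right = [y for x, y in zip(values, labels) if x > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if best is None or score < best[1]:
            best = (threshold, score)
    return best

# Hypothetical data: age and whether the patient is diseased.
ages = [30, 35, 40, 50, 60, 70]
diseased = [0, 0, 0, 1, 1, 1]

print(best_split(ages, diseased))  # (40, 0.0)
```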

Diagnosis of Melanoma (Michael Binder, Greg Sharp et al., 1999)

Dermatoscopy

0 - TEST: null VALUE: null Num Cases: 700.0 Num Dsrd: 241.0
2 - TEST: breath VALUE: 1 Num Cases: 75.0 Num Dsrd: 1.0
********PRUNED!!! ********PRUNED!!!
1 - TEST: breath VALUE: 0 Num Cases: 625.0 Num Dsrd: 240.0
4 - TEST: CWtender VALUE: 1 Num Cases: 11.0 Num Dsrd: .0
3 - TEST: CWtender VALUE: 0 Num Cases: 614.0 Num Dsrd: 240.0
8 - TEST: age VALUE: >32 Num Cases: 611.0 Num Dsrd: 240.0
10 - TEST: Duration VALUE: >72 Num Cases: 3.0 Num Dsrd: .0
9 - TEST: Duration VALUE: <=72 Num Cases: 608.0 Num Dsrd: 240.0
12 - TEST: Duration VALUE: >48 Num Cases: 2.0 Num Dsrd: 2.0
11 - TEST: Duration VALUE: <=48 Num Cases: 606.0 Num Dsrd: 238.0
14 - TEST: prevang VALUE: 1 Num Cases: 340.0 Num Dsrd: 92.0
18 - TEST: Epis VALUE: 1 Num Cases: 8.0 Num Dsrd: .0
17 - TEST: Epis VALUE: 0 Num Cases: 332.0 Num Dsrd: 92.0
22 - TEST: Worsening VALUE: >72 Num Cases: 6.0 Num Dsrd: .0
21 - TEST: Worsening VALUE: <=72 Num Cases: 326.0 Num Dsrd: 92.0
28 - TEST: Duration VALUE: >36 Num Cases: 3.0 Num Dsrd: .0
27 - TEST: Duration VALUE: <=36 Num Cases: 323.0 Num Dsrd: 92.0
36 - TEST: Worsening VALUE: >28 Num Cases: 3.0 Num Dsrd: 2.0
35 - TEST: Worsening VALUE: <=28 Num Cases: 320.0 Num Dsrd: 90.0
44 - TEST: age VALUE: >55 Num Cases: 240.0 Num Dsrd: 81.0
52 - TEST: Worsening VALUE: >0 Num Cases: 238.0 Num Dsrd: 81.0
64 - TEST: OldMI VALUE: 1 Num Cases: 49.0 Num Dsrd: 9.0
74 - TEST: Smokes VALUE: 0 Num Cases: 37.0 Num Dsrd: 9.0
86 - TEST: age VALUE: >65 Num Cases: 30.0 Num Dsrd: 5.0
********PRUNED!!! ********PRUNED!!!
85 - TEST: age VALUE: <=65 Num Cases: 7.0 Num Dsrd: 4.0
98 - TEST: Worsening VALUE: >2 Num Cases: 5.0 Num Dsrd: 2.0
97 - TEST: Worsening VALUE: <=2 Num Cases: 2.0 Num Dsrd: 2.0
73 - TEST: Smokes VALUE: 1 Num Cases: 12.0 Num Dsrd: .0
63 - TEST: OldMI VALUE: 0 Num Cases: 189.0 Num Dsrd: 72.0
72 - TEST: Nausea VALUE: 0 Num Cases: 165.0 Num Dsrd: 57.0
84 - TEST: Duration VALUE: >16 Num Cases: 3.0 Num Dsrd: 2.0
83 - TEST: Duration VALUE: <=16 Num Cases: 162.0 Num Dsrd: 55.0
********PRUNED!!! ********PRUNED!!!
71 - TEST: Nausea VALUE: 1 Num Cases: 24.0 Num Dsrd: 15.0
82 - TEST: Back VALUE: 0 Num Cases: 21.0 Num Dsrd: 15.0
94 - TEST: post VALUE: 1 Num Cases: 1.0 Num Dsrd: .0
93 - TEST: post VALUE: 0 Num Cases: 20.0 Num Dsrd: 15.0
81 - TEST: Back VALUE: 1 Num Cases: 3.0 Num Dsrd: .0
51 - TEST: Worsening VALUE: <=0 Num Cases: 2.0 Num Dsrd: .0
43 - TEST: age VALUE: <=55 Num Cases: 80.0 Num Dsrd: 9.0
50 - TEST: Worsening VALUE: >1 Num Cases: 68.0 Num Dsrd: 5.0
********PRUNED!!! ********PRUNED!!! ********PRUNED!!! ********PRUNED!!! ********PRUNED!!! ********PRUNED!!! ********PRUNED!!! ********PRUNED!!!
49 - TEST: Worsening VALUE: <=1 Num Cases: 12.0 Num Dsrd: 4.0
60 - TEST: age VALUE: >47 Num Cases: 10.0 Num Dsrd: 2.0
68 - TEST: OldMI VALUE: 1 Num Cases: 1.0 Num Dsrd: 1.0
67 - TEST: OldMI VALUE: 0 Num Cases: 9.0 Num Dsrd: 1.0
********PRUNED!!! ********PRUNED!!!
59 - TEST: age VALUE: <=47 Num Cases: 2.0 Num Dsrd: 2.0
13 - TEST: prevang VALUE: 0 Num Cases: 266.0 Num Dsrd: 146.0
16 - TEST: Duration VALUE: >0 Num Cases: 259.0 Num Dsrd: 146.0
20 - TEST: post VALUE: 1 Num Cases: 13.0 Num Dsrd: 2.0
26 - TEST: Diabetes VALUE: 1 Num Cases: 1.0 Num Dsrd: 1.0
25 - TEST: Diabetes VALUE: 0 Num Cases: 12.0 Num Dsrd: 1.0
********PRUNED!!! ********PRUNED!!!
19 - TEST: post VALUE: 0 Num Cases: 246.0 Num Dsrd: 144.0
24 - TEST: Nausea VALUE: 0 Num Cases: 202.0 Num Dsrd: 105.0
32 - TEST: OldMI VALUE: 1 Num Cases: 13.0 Num Dsrd: 1.0
42 - TEST: BP VALUE: 1 Num Cases: 1.0 Num Dsrd: 1.0
41 - TEST: BP VALUE: 0 Num Cases: 12.0 Num Dsrd: .0
31 - TEST: OldMI VALUE: 0 Num Cases: 189.0 Num Dsrd: 104.0
40 - TEST: age VALUE: >37 Num Cases: 184.0 Num Dsrd: 103.0
48 - TEST: Epis VALUE: 1 Num Cases: 8.0 Num Dsrd: 2.0
58 - TEST: Duration VALUE: >8 Num Cases: 2.0 Num Dsrd: 2.0
57 - TEST: Duration VALUE: <=8 Num Cases: 6.0 Num Dsrd: .0
47 - TEST: Epis VALUE: 0 Num Cases: 176.0 Num Dsrd: 101.0
56 - TEST: Duration VALUE: >15 Num Cases: 2.0 Num Dsrd: .0
55 - TEST: Duration VALUE: <=15 Num Cases: 174.0 Num Dsrd: 101.0
66 - TEST: Lipids VALUE: 1 Num Cases: 1.0 Num Dsrd: 1.0
65 - TEST: Lipids VALUE: 0 Num Cases: 173.0 Num Dsrd: 100.0
76 - TEST: Sweating VALUE: 0 Num Cases: 73.0 Num Dsrd: 32.0
********PRUNED!!! ********PRUNED!!!
75 - TEST: Sweating VALUE: 1 Num Cases: 100.0 Num Dsrd: 68.0
88 - TEST: Duration VALUE: >8 Num Cases: 7.0 Num Dsrd: 2.0
104 - TEST: Rarm VALUE: 0 Num Cases: 5.0 Num Dsrd: .0
103 - TEST: Rarm VALUE: 1 Num Cases: 2.0 Num Dsrd: 2.0
87 - TEST: Duration VALUE: <=8 Num Cases: 93.0 Num Dsrd: 66.0
********PRUNED!!! ********PRUNED!!!
39 - TEST: age VALUE: <=37 Num Cases: 5.0 Num Dsrd: 1.0
23 - TEST: Nausea VALUE: 1 Num Cases: 44.0 Num Dsrd: 39.0
30 - TEST: age VALUE: >47 Num Cases: 41.0 Num Dsrd: 39.0
38 - TEST: Duration VALUE: >7 Num Cases: 7.0 Num Dsrd: 5.0
46 - TEST: Larm VALUE: 0 Num Cases: 1.0 Num Dsrd: .0
45 - TEST: Larm VALUE: 1 Num Cases: 6.0 Num Dsrd: 5.0
54 - TEST: Rarm VALUE: 0 Num Cases: 5.0 Num Dsrd: 5.0
53 - TEST: Rarm VALUE: 1 Num Cases: 1.0 Num Dsrd: .0
37 - TEST: Duration VALUE: <=7 Num Cases: 34.0 Num Dsrd: 34.0
29 - TEST: age VALUE: <=47 Num Cases: 3.0 Num Dsrd: .0
15 - TEST: Duration VALUE: <=0 Num Cases: 7.0 Num Dsrd: .0
7 - TEST: age VALUE: <=32 Num Cases: 3.0 Num Dsrd: .0

[Figure: classification tree for melanoma diagnosis. Nodes test asymmetry, border, detail, and color (splits such as “< 2” and “> 10”), and leaves are labeled “benign” or “malig”.]

Performance using ABCD rule

[Figure: two ROC plots (sensitivity vs. 1 - specificity, both axes from 0 to 1), titled “ROC CURVES ABCD RULE” and “ROC CURVES OVERALL DIAGNOSIS”.]

Rough Sets

These are mathematical models that derive rules for grouping cases based on Boolean logic

Multiple subsamples of a large table are created and combined for rule extraction

#   Sex   T3     FTI    TT4   TSH   Med   Status
1   F     1.05   49.9   48    3.8   N     OK
2   M     1.10   50.1   49    4.7   Y     sick
3   F     1.3    170    51    5.8   N     OK
4   M     1.4    175    200   0.4   N     sick

[Figure: the table above is repeated many times on the slide, depicting the multiple subsamples.]
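One way to see how rules can be grouped from such a table is through the standard rough set notions of lower and upper approximation, which the slides do not spell out. Cases that share the same values on a chosen set of attributes are indiscernible; the lower approximation of "sick" contains the classes that are entirely sick, and the upper approximation contains the classes with at least one sick case. The sketch uses only the categorical columns of the four-row table, since the slides do not say how the numeric columns were discretized.

```python
from collections import defaultdict

# The four cases from the table, restricted to categorical columns.
ROWS = {
    1: {"Sex": "F", "Med": "N", "Status": "OK"},
    2: {"Sex": "M", "Med": "Y", "Status": "sick"},
    3: {"Sex": "F", "Med": "N", "Status": "OK"},
    4: {"Sex": "M", "Med": "N", "Status": "sick"},
}

def approximations(rows, attributes, target_value):
    """Return (lower, upper) approximations of the set with Status == target_value."""
    # Group cases that cannot be told apart using the chosen attributes.
    classes = defaultdict(set)
    for row_id, row in rows.items():
        classes[tuple(row[a] for a in attributes)].add(row_id)

    target = {row_id for row_id, row in rows.items() if row["Status"] == target_value}
    lower, upper = set(), set()
    for members in classes.values():
        if members <= target:   # every case in the class has the target status
            lower |= members
        if members & target:    # at least one case in the class has it
            upper |= members
    return lower, upper

print(approximations(ROWS, ["Sex"], "sick"))  # ({2, 4}, {2, 4})
print(approximations(ROWS, ["Med"], "sick"))  # ({2}, {1, 2, 3, 4})
```

In this tiny table Sex alone separates the two outcomes exactly, while Med alone yields a certain rule only for case 2.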

If [(number>2) and …] then Complication = true

Rules

Comparison of Practical Prediction Models for Ambulation Following Spinal Cord Injury (Rowland et al., 1998)

Study Population
Spinal Cord Injury Model Systems of Care Database

• Admitted to one of 24 federally funded designated regional SCI care systems
• 17,861 patients who sustained a spinal cord injury between 1973 and 1997
• 1755 patients had data for LEMS scores, 1993 to 1997
• 1138 had complete data for variables of interest

SCI Mortality NN Design
Input & Output

Admission Info (9 items)
• system days
• injury days
• age
• gender
• racial/ethnic group
• level of neurologic fxn
• ASIA impairment index
• UEMS
• LEMS

Ambulation (1 item)
• yes
• no

Results: ROC Curve Area

Model ROC Curve Area Standard Error

Logistic Regression 0.925 0.016

Neural Network 0.923 0.015

Rough Set 0.914 0.016
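A rough check of whether two of these areas differ can be made with a z statistic built from the reported standard errors. The sketch below assumes the two estimates are independent, which is a simplification: models evaluated on the same cases give correlated areas, so a method that accounts for that correlation would be more appropriate for a formal comparison.

```python
import math

def z_statistic(auc_a, se_a, auc_b, se_b):
    """z = difference in areas / standard error of the difference.

    Assumes independent estimates (no correlation term), which is an
    assumption made here for illustration only.
    """
    return (auc_a - auc_b) / math.sqrt(se_a ** 2 + se_b ** 2)

# Values from the results table: logistic regression vs. rough set.
z = z_statistic(0.925, 0.016, 0.914, 0.016)
print(round(z, 3))
```

A value well below 1.96 in absolute terms suggests the difference is within what sampling variation alone could produce, under the stated assumption.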

Results: ROC Curves

[Figure: ROC curves (sensitivity vs. 1 - specificity, both axes from 0 to 1) for the LR, NN, and RS models.]

Other methods

Support Vector Machines, multiple variations of the nearest neighbor algorithm, etc.

Heart Attack Alert Program (Wang et al., 2001)

Cox’s Models for Prediction

[Figure: plot with horizontal axis “time (years)”.]

Genetic Algorithms

Search mechanism

• Used for variable selection (model construction)

• Case selection (regression diagnostics)

• Multidisorder diagnosis
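Used for variable selection, a genetic algorithm searches over bit masks in which each bit says whether a variable is included in the model. The sketch below is a minimal version with tournament selection, one-point crossover, bit-flip mutation, and elitism. The fitness function is a stand-in I made up: in practice it would be something like the cross-validated performance of a model built on the selected variables.

```python
import random

def hypothetical_fitness(mask):
    """Made-up fitness: reward variables 0, 2, 5 and penalize the others."""
    useful = {0, 2, 5}
    return sum(1.0 if i in useful else -0.5 for i, bit in enumerate(mask) if bit)

def genetic_search(fitness, n_bits, population_size=20, generations=30,
                   mutation_rate=0.1, seed=0):
    """Return the best bit mask found; n_bits must be at least 2."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n_bits)]
                  for _ in range(population_size)]
    best = max(population, key=fitness)
    for _ in range(generations):
        next_population = [best[:]]  # elitism: always keep the current best
        while len(next_population) < population_size:
            # Tournament selection: best of three randomly chosen individuals.
            parent_a = max(rng.sample(population, 3), key=fitness)
            parent_b = max(rng.sample(population, 3), key=fitness)
            cut = rng.randint(1, n_bits - 1)  # one-point crossover
            child = parent_a[:cut] + parent_b[cut:]
            # Bit-flip mutation.
            child = [1 - bit if rng.random() < mutation_rate else bit
                     for bit in child]
            next_population.append(child)
        population = next_population
        best = max(population, key=fitness)
    return best

selected = genetic_search(hypothetical_fitness, n_bits=8)
print(selected, hypothetical_fitness(selected))
```

The same search scheme applies to case selection by letting each bit mark whether a case is kept.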

People

• Brigham and Women’s Hospital
• Children’s Hospital
• EECS MIT
• School of Public Health
• Partners Information Systems

Administrivia

Grading based on
• 30% homework (almost every week)/participation
• 30% midterm, open notes
• 40% project (no final exam)

Lectures on the WWW for reference
Handouts with Prof. Szolovits’ assistant at NE-43 r416

Questions/Suggestions

• machado@dsg.harvard.edu
• isaac_kohane@harvard.edu
• psz@mit.edu