By Michael (Zhe) Chen Helen Culbertson James Rappaport ...yahavi1/Projects/CP2010T4_prez.pdf ·...

By Michael (Zhe) Chen Helen Culbertson James Rappaport Elan Rozmaryn Mark Schmidt BUDT733 12/7/2010


Page 1

By

Michael (Zhe) Chen

Helen Culbertson

James Rappaport

Elan Rozmaryn

Mark Schmidt

BUDT733

12/7/2010

Page 2

Agenda

• Issue

• Data Source

• Analysis

• Action

Page 3

Issue

• Kidney transplant selection criteria

• Medical Criteria

○ End-stage renal disease

○ Time spent waiting

○ Blood type

• Demographic Blind?

○ Gender

○ Ethnicity

○ Socio-Economic Status

Page 4

Issue

• Is there any truth to the allegations?

• Can ethnic and socio-economic factors explain variances in expected wait times for kidney transplant candidates?

Page 5

Data Source

• The United Network for Organ Sharing database

• Over 150,000 patients

• Spanning ten years

• Using over 50 different variables

Page 6

Analysis

1. Preprocessing

2. Categorization

• How do we define the project

3. Evaluation

• What variables kept patients within the range

4. Modeling

5. Results

Page 7

Analysis - Preprocessing

1. Determined medically appropriate indicators

• Conducted domain-related research

2. Identified demographic variables to test

3. Limited scope to adults

• Due to differences in pediatric criteria

4. Limited scope to U.S. citizens

• Due to citizenship criteria ambiguity

5. Created a sample of 10,000 records

• Due to software limitations
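The filtering and sampling steps above can be sketched as follows. This is only an illustration: the field names `age` and `citizenship`, the adult cutoff of 18, and the use of a simple random sample are all assumptions, since the slides do not say how the 10,000 records were drawn.

```python
import random

ADULT_AGE = 18  # assumption: the slides do not give the cutoff used


def preprocess(records, sample_size=10_000, seed=0):
    """Keep adult U.S. citizens, then draw a simple random sample."""
    eligible = [
        r for r in records
        if r["age"] >= ADULT_AGE and r["citizenship"] == "US"
    ]
    if len(eligible) <= sample_size:
        return eligible  # nothing to downsample
    # A seeded generator keeps the sample reproducible between runs.
    return random.Random(seed).sample(eligible, sample_size)


# Toy records standing in for the registry extract (hypothetical fields).
records = [
    {"age": age, "citizenship": cit}
    for age in range(10, 70)
    for cit in ("US", "OTHER")
]
sample = preprocess(records, sample_size=20)
print(len(sample))
```

Fixing the seed matters here because every later step (tolerance range, tree, regression) depends on which records land in the sample.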

Page 8

Analysis - Categorization

• Assumed normal distribution

• Created tolerance range

○ Average days waiting plus 2 standard deviations

○ Records were either:

• within range (Success)

• outside range (Failure)
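A minimal sketch of the tolerance rule, assuming the range is a one-sided upper bound (mean plus two standard deviations) and that the sample standard deviation is used; the slides specify neither. The waiting times below are made-up values for illustration.

```python
import statistics


def tolerance_labels(wait_days, k=2.0):
    """Label each record 1 (Success) if its wait is within mean + k*SD, else 0."""
    mean = statistics.mean(wait_days)
    sd = statistics.stdev(wait_days)  # sample SD; population SD is equally plausible
    upper = mean + k * sd
    labels = [1 if days <= upper else 0 for days in wait_days]
    return upper, labels


# Hypothetical waiting times in days, with one very long wait.
waits = [120, 200, 250, 310, 400, 450, 520, 600, 700, 3000]
upper, labels = tolerance_labels(waits)
print(round(upper, 1), labels)
```

Because waiting times are typically right-skewed, the mean and standard deviation are both pulled upward by long waits, so the share of records labelled Failure depends heavily on the normality assumption stated on the slide.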

Page 9

Analysis - Evaluation

• Classification tree

• Ethnicity

○ The most important variable

• GFR (a measure of kidney function)

○ Unimportant
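One way a classification tree ranks variables is by how much a split on each one reduces node impurity. The sketch below assumes Gini impurity as the criterion, which the slides do not state, and uses invented success/failure counts purely to show why one variable can dominate while another contributes nothing.

```python
def gini(successes, failures):
    """Gini impurity of a node with two classes."""
    n = successes + failures
    if n == 0:
        return 0.0
    p = successes / n
    return 2 * p * (1 - p)


def gini_gain(left, right):
    """Impurity reduction from splitting a parent into two children.

    left and right are (successes, failures) tuples for the child nodes.
    """
    parent = (left[0] + right[0], left[1] + right[1])
    n = sum(parent)
    weighted = (sum(left) / n) * gini(*left) + (sum(right) / n) * gini(*right)
    return gini(*parent) - weighted


# Hypothetical counts: the first split separates outcomes, the second does not.
candidate_splits = {
    "ethnicity": ((90, 10), (60, 40)),
    "gfr": ((75, 25), (75, 25)),
}
gains = {name: gini_gain(*children) for name, children in candidate_splits.items()}
best = max(gains, key=gains.get)
print(best, gains)
```

A variable whose split leaves both children with the same class mix as the parent has zero gain, which is the sense in which a tree can report a medical indicator as unimportant.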

Page 10

Analysis - Modeling

• Decision Tree Output

[Figure: decision tree. Root split on ethcat_1 at 0.5 (4140 and 5858 records); lower splits on hlamat_2, gender_d, and education_ne. Node counts by level: 2920, 1220, 4571, 1287; 1222, 1698, 1747, 2824; 1804, 1020.]

Page 11

Analysis - Modeling

• Regression Output

Input variables       Coefficient    Std. Error      p-value       Odds
Constant term          2.90582538    0.12922119      0             18.28032564
ethcat_black          -1.08263218    0.10598912      0             0.33870283
ethcat_hispanic       -1.07214999    0.13583644      0             0.34227183
ethcat_asian          -1.08779919    0.19312915      0.00000002    0.33695728
ethcat_amer_indian    -1.47905588    0.32332769      0.00000477    0.2278527
ethcat_hawaiian       -1.722031      0.46229008      0.00019531    0.17870285
ethcat_multiple       -0.61347246    0.60676026      0.3119866     0.54146737
hlamat_2              -0.07645346    0.11885482      0.52006137    0.92639601
hlamat_3               0.18034148    0.13425711      0.17918953    1.19762623
hlamat_4               0.81629068    0.15910229      0.00000029    2.26209331
hlamat_5               1.14885104    0.26462233      0.00001415    3.15456653
hlamat_6               1.67617679    0.39389876      0.00002087    5.34508133
hlamat_7               1.23707902    0.35201064      0.00044088    3.44553447
abo_mat_2              2.13338709    0.38291278      0.00000003    8.44341755
abo_mat_3             13.3478384     492.7839356     0.97839069    626458.625
gender_male            0.22174832    0.09027406      0.01403406    1.24825716
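The Odds and p-value columns of the regression output can be recomputed from the coefficient and standard error. The sketch below assumes the odds are exp(coefficient) and the p-values come from a two-sided Wald z-test; the slides do not name the test, but these formulas agree with the reported figures for the three ethnicity rows used here.

```python
import math


def odds_ratio(coef):
    """Odds multiplier implied by a logistic regression coefficient."""
    return math.exp(coef)


def wald_p_value(coef, std_err):
    """Two-sided p-value for H0: coefficient = 0, using a normal approximation."""
    z = coef / std_err
    # erfc(|z| / sqrt(2)) equals 2 * (1 - Phi(|z|)) for the standard normal CDF Phi.
    return math.erfc(abs(z) / math.sqrt(2))


# (coefficient, standard error) pairs taken from the regression output.
rows = {
    "ethcat_black": (-1.08263218, 0.10598912),
    "ethcat_hawaiian": (-1.722031, 0.46229008),
    "ethcat_multiple": (-0.61347246, 0.60676026),
}
results = {
    name: (odds_ratio(coef), wald_p_value(coef, se))
    for name, (coef, se) in rows.items()
}
for name, (odds, p) in results.items():
    print(f"{name}: odds={odds:.4f}, p={p:.6f}")
```

Read this way, an odds value of about 0.34 for ethcat_black means the odds of landing within tolerance are roughly a third of those for the White reference group, holding the other inputs fixed.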

Page 12

Analysis - Modeling

• Measures of Fit

Residual df                   9982
Residual Dev.                 3891.517578
% Success in training data    94.46889378
# Iterations used             15
Multiple R-squared            0.09001472

Class    Prob.
1        0.944688938  <-- Success
0        0.055311062
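The reported Multiple R-squared appears consistent with a deviance-based measure, 1 minus the ratio of residual deviance to the deviance of an intercept-only model. This is an inference from the numbers, not something the slides state. The record count of 9998 is also an assumption, taken as the 9982 residual degrees of freedom plus the 16 fitted coefficients.

```python
import math


def null_deviance(n, p_success):
    """Deviance of an intercept-only logistic model with success share p_success."""
    log_lik = n * (
        p_success * math.log(p_success)
        + (1 - p_success) * math.log(1 - p_success)
    )
    return -2.0 * log_lik


n_records = 9998            # assumption: 9982 residual df + 16 coefficients
p_success = 0.944688938     # class probability for Success on the slide
residual_dev = 3891.517578  # residual deviance on the slide

null_dev = null_deviance(n_records, p_success)
r_squared = 1 - residual_dev / null_dev
print(round(null_dev, 2), round(r_squared, 5))
```

A value near 0.09 means the fitted model reduces deviance by only about 9% relative to predicting the base rate, which is worth keeping in mind when about 94% of records are Success to begin with.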

Page 13

Analysis - Modeling

Odds of Landing Within Tolerance Compared to White

Ethnicity          Odds
Black              33.87%
Hispanic           34.23%
Asian              33.70%
American Indian    22.79%
Hawaiian           17.87%
Multi_Racial       54.15%

Page 14

Visualization

[Figure: Pre-processing: Within Tolerance by Ethnicity. Chart of the proportion within tolerance for each category: White, Black, Alaskan-NA, Hawaiian-Islander, Hispanic, Asian, Multi-racial.]

Page 15

Visualization

[Figure: Odds of Landing Within Tolerance Compared to White. Value axis runs from 0.00 to 0.50.]

Page 16

Analysis - Results

• Education (socio-economic proxy)

• Not Significant

• Ethnicity

• Significant

○ White was used as reference category

○ Non-White

• All ethnicities less likely to be within tolerance (the multi-racial difference was not statistically significant, p = 0.31)

• Least Likely

- Hawaiians & Pacific Islanders

- Native Americans

Page 17

Action

• Additional Research

• Other non-medical, non-demographic factors

• Analyze reason behind discrepancy

• Make changes

• Legislation & Policy

• Education

Page 18

Questions?