By Michael (Zhe) Chen Helen Culbertson James Rappaport ...yahavi1/Projects/CP2010T4_prez.pdf ·...
By
Michael (Zhe) Chen
Helen Culbertson
James Rappaport
Elan Rozmaryn
Mark Schmidt
BUDT733
12/7/2010
Agenda
• Issue
• Data Source
• Analysis
• Action
Issue
• Kidney transplant selection criteria
• Medical Criteria
  ○ End-stage renal disease
  ○ Time spent waiting
  ○ Blood type
• Demographic Blind?
  ○ Gender
  ○ Ethnicity
  ○ Socio-Economic Status
Issue
• Is there any truth to the allegations?
• Can ethnic and socio-economic factors explain variances in expected wait times for kidney transplant candidates?
Data Source
• The United Network for Organ Sharing database
• Over 150,000 patients
• Spanning ten years
• Using over 50 different variables
Analysis
1. Preprocessing
2. Categorization
   • How do we define the project?
3. Evaluation
   • What variables kept patients within the range?
4. Modeling
5. Results
Analysis - Preprocessing
1. Determined medically appropriate indicators
   • Conducted domain-related research
2. Identified demographic variables to test
3. Limited scope to adults
   • Due to differences in pediatric criteria
4. Limited scope to U.S. citizens
   • Due to citizenship criteria ambiguity
5. Created a sample of 10,000 records
   • Due to software limitations
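The scope limits and sampling step above can be sketched in a few lines. This is only an illustration: the field names `age` and `citizenship`, the adult cutoff of 18, and the use of simple random sampling are assumptions, since the slides do not say how the 10,000 records were drawn.

```python
import random

def preprocess(records, sample_size=10_000, seed=0):
    """Keep adult U.S. citizens, then draw a simple random sample.

    The field names, the age cutoff of 18, and the sampling method are
    hypothetical; the slides only state the scope limits and sample size.
    """
    eligible = [
        r for r in records
        if r["age"] >= 18 and r["citizenship"] == "US_CITIZEN"
    ]
    rng = random.Random(seed)              # fixed seed for repeatability
    k = min(sample_size, len(eligible))    # never ask for more than exists
    return rng.sample(eligible, k)

# Tiny made-up input, only to show the call shape.
demo = [
    {"age": 45, "citizenship": "US_CITIZEN"},
    {"age": 12, "citizenship": "US_CITIZEN"},
    {"age": 60, "citizenship": "NON_CITIZEN"},
    {"age": 33, "citizenship": "US_CITIZEN"},
]
sample = preprocess(demo, sample_size=2)
```

Filtering before sampling keeps the sample size at the requested value whenever enough eligible records exist.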
Analysis - Categorization
• Assumed normal distribution
• Created tolerance range
  ○ Average days waiting plus 2 standard deviations
  ○ Records were either:
    • within range (Success)
    • outside range (Failure)
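The labeling rule above can be sketched as follows. The wait times are made up, and the choice of the sample standard deviation (rather than the population one) is an assumption, since the slides do not say which was used.

```python
import math
import statistics

def tolerance_threshold(wait_days):
    """Upper limit of the tolerance range: mean plus 2 standard deviations."""
    mean = statistics.mean(wait_days)
    sd = statistics.stdev(wait_days)       # sample SD (assumed)
    return mean + 2 * sd

def label_records(wait_days):
    """Return 1 (Success) if within the range, 0 (Failure) otherwise."""
    limit = tolerance_threshold(wait_days)
    return [1 if d <= limit else 0 for d in wait_days]

# Share of a normal distribution lying below mean + 2 SD, for reference.
normal_share = 0.5 * (1 + math.erf(2 / math.sqrt(2)))

# Hypothetical wait times in days, not taken from the UNOS data.
wait_days = [120, 300, 450, 200, 380, 260, 310, 150, 2900, 410]
labels = label_records(wait_days)
```

If wait times were exactly normal, about 97.7% of records would fall below this limit, so the Failure class is small by construction.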
Analysis - Evaluation
• Classification tree
• Ethnicity
  ○ The most important variable
• GFR (a measure of kidney function)
  ○ Unimportant
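Analysis - Modeling
• Decision Tree Output
[Figure: classification tree. The root node splits on ethcat_1 at 0.5, sending 4,140 and 5,858 records to its two branches. The next splits use hlamat_2, followed by gender_d and education_ne. Node counts shown below the root: 2920, 1220, 4571, 1287; 1222, 1698, 1747, 2824; 1804, 1020.]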
Analysis - Modeling
• Regression Output

Input variables       Coefficient    Std. Error     p-value       Odds
Constant term          2.90582538    0.12922119     0             18.28032564
ethcat_black          -1.08263218    0.10598912     0             0.33870283
ethcat_hispanic       -1.07214999    0.13583644     0             0.34227183
ethcat_asian          -1.08779919    0.19312915     0.00000002    0.33695728
ethcat_amer_indian    -1.47905588    0.32332769     0.00000477    0.2278527
ethcat_hawaiian       -1.722031      0.46229008     0.00019531    0.17870285
ethcat_multiple       -0.61347246    0.60676026     0.3119866     0.54146737
hlamat_2              -0.07645346    0.11885482     0.52006137    0.92639601
hlamat_3               0.18034148    0.13425711     0.17918953    1.19762623
hlamat_4               0.81629068    0.15910229     0.00000029    2.26209331
hlamat_5               1.14885104    0.26462233     0.00001415    3.15456653
hlamat_6               1.67617679    0.39389876     0.00002087    5.34508133
hlamat_7               1.23707902    0.35201064     0.00044088    3.44553447
abo_mat_2              2.13338709    0.38291278     0.00000003    8.44341755
abo_mat_3              13.3478384    492.7839356    0.97839069    626458.625
gender_male            0.22174832    0.09027406     0.01403406    1.24825716
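The Odds column in the regression output is the exponential of each coefficient, and the reported p-values appear consistent with a two-sided Wald test (coefficient divided by standard error, compared to a standard normal). The sketch below recomputes both for three rows of the output. That the software used exactly this test is an assumption, and the helper names are illustrative.

```python
import math

# (coefficient, standard error) pairs copied from the regression output.
rows = {
    "ethcat_black": (-1.08263218, 0.10598912),
    "ethcat_hawaiian": (-1.722031, 0.46229008),
    "ethcat_multiple": (-0.61347246, 0.60676026),
}

def odds_ratio(coef):
    """Odds relative to the reference category (White): exp(coefficient)."""
    return math.exp(coef)

def wald_p_value(coef, se):
    """Two-sided p-value for z = coef / se under a standard normal."""
    z = coef / se
    return math.erfc(abs(z) / math.sqrt(2))

for name, (coef, se) in rows.items():
    print(f"{name:18s} odds={odds_ratio(coef):.4f} "
          f"p={wald_p_value(coef, se):.6f}")
```

Read this way, an odds value of 0.3387 for ethcat_black means the odds of landing within tolerance are roughly a third of those for the White reference group, holding the other inputs fixed.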
Analysis - Modeling
• Measures of Fit

Residual df                   9982
Residual Dev.                 3891.517578
% Success in training data    94.46889378
# Iterations used             15
Multiple R-squared            0.09001472

Class    Prob.
1        0.944688938    <-- Success
0        0.055311062
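The Multiple R-squared figure can be read as a deviance-based measure: one minus the ratio of the residual deviance to the deviance of an intercept-only model. The sketch below checks that reading against the reported numbers. It assumes the record count equals the residual df plus the 16 fitted coefficients, and that the software defines the statistic this way; neither is stated on the slides.

```python
import math

n_records = 9982 + 16            # residual df + 16 coefficients (assumed)
success_rate = 0.944688938       # share of Success records in training data
residual_deviance = 3891.517578  # reported Residual Dev.

def null_deviance(n, p):
    """Deviance of a model that predicts probability p for every record."""
    return -2.0 * n * (p * math.log(p) + (1 - p) * math.log(1 - p))

d_null = null_deviance(n_records, success_rate)
pseudo_r2 = 1.0 - residual_deviance / d_null
print(round(d_null, 1), round(pseudo_r2, 4))
```

Under these assumptions the result is about 0.0900, in line with the reported 0.09001472. The predictors therefore account for only about 9% of the null deviance.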
Analysis - Modeling
Odds of Landing Within Tolerance Compared to White

Ethnicity          Odds
Black              33.87%
Hispanic           34.23%
Asian              33.70%
American Indian    22.79%
Hawaiian           17.87%
Multi_Racial       54.15%
Visualization
[Figure: Pre-processing: Within Tolerance by Ethnicity. Chart of the proportion within tolerance for White, Black, Hispanic, Asian, Alaskan-NA, Hawaiian-Islander, and Multi-racial.]
Visualization
[Figure: Odds of Landing Within Tolerance Compared to White. Axis ticks run from 0.00 to 0.50.]
Analysis - Results
• Education (socio-economic proxy)
  ○ Not significant
• Ethnicity
  ○ Significant
  ○ White was used as reference category
  ○ Non-White
    • All ethnicities less likely to be within tolerance
    • Least likely
      - Hawaiians & Pacific Islanders
      - Native Americans
Action
• Additional Research
  ○ Other non-medical, non-demographic factors
  ○ Analyze reason behind discrepancy
• Make changes
  ○ Legislation & Policy
  ○ Education
Questions?