Source: docs.salford-systems.com/MarshaWilcox.pdf

Tree-Based Methods in Drug Safety Research

Salford Systems Data Mining Conference

March 30, 2006

Page 2 | Copyright © 2006 i3

Outline

In-going question: “Can tree-based methods help here?”

Background

Methods

Focus on splitting rules for CART

Brief description of Stochastic Gradient Boosting and Random Forests

Results

– Question 1 - Associations not discernible by epidemiologic analyses?

– Question 2 - What else can be learned about the outcomes?

Limitations

Background

Common disorder

Medications

– Potential adverse outcomes

  – Cerebrovascular (n=2 types)

  – Cardiovascular (n=4)

  – Multiple events

  – Death

Concern re: particular drug class

– Focus drug

– Compared with all other drugs used to treat the disorder

Background

Epidemiologic Analyses

– Matched retrospective cohort study

  – Focus drug users, users of other drugs in the class, matched controls

– Four entry points into the study

– Using state-of-the-art propensity score matching

– Conclusion: No difference in the occurrence of cardiovascular or cerebrovascular events in the two treated groups

– Increased risk for those who are treated with any drug in the class compared with the control group

Objectives of the Tree-Based Analyses

Are there patterns of covariates associated with the outcomes that are better identified using tree-based methods?

1. Are there patterns of association with the outcomes and drugs not amenable to standard epidemiologic analyses?

2. What else can be learned about the etiology of the target outcomes from this study?

Methods

Methods - Sample

Medical and pharmacy claims data

– 3 Groups (N ≈ 50,000)

  – Focus drug ≈ 12,500

  – Other drugs ≈ 12,500

  – Controls ≈ 25,000

  – Propensity score matched

– 4 Cohorts

  – Entry points into the study

  – 6 months apart

Methods - Outcomes

Outcomes

– Individual (7 binary, one polychotomous – 8 levels)

  – One-at-a-time – individuals with multiple outcomes appear in more than one analysis

  – One polychotomous outcome variable that includes a “Multiple” level

– Grouped by type (one polychotomous – 5 levels)

  – Cardiovascular

  – Cerebrovascular

  – Multiple

  – Death

  – None

– Any vs. None (binary)

– Continuous – count of the number of outcomes (0-3)

Methods - Covariates

Covariates

– Prescription drug use (binary indicators)

  – 17 drug classes, total Rx, cardiovascular Rx

– Medical claims 6 months prior to study entry (binary indicators)

  – 19 classes of medical conditions

– Number of hospital visits

  – ER, ICU, psychiatry, psychology, cardiovascular, other, other mental, # stays

– Costs

  – Total, total drug

– Demographics

  – Age (year of birth), gender, region

– No personally identifying information of any kind

Statistical Methods

– Recursive partitioning (using CART)

  – More on specific splitting rules …

– Random forests (using Salford’s RF)

  – Methods summary to follow – see Breiman and Cutler refs

– Stochastic gradient boosting (using TreeNet)

  – Methods summary to follow – see Friedman refs

Statistical Methods - Details

CART

– Mix of equal and data priors for CART

– Misclassification costs – cost of misclassifying a case as a control = 1.5

Random Forests

– 1,000 trees grown, 3 variables considered at each split

– Balanced prior probabilities

Stochastic Gradient Boosting

– Misclassification costs as in CART

– 500 trees grown

– Balanced prior probabilities

Splitting Rules

Statistical Methods - Splitting Rules

Splitting rules can affect the classification accuracy of a tree

– Some might argue that pruning is equally, if not more, important

In some cases, the purity of particular nodes may be of more interest than overall accuracy

– Given 2 different structures with equal accuracy

Splitting Rules

Conceptually:

Systematically choose a split-point and divide the sample into two groups

Construct a measure (impurity or goodness – is the cup ½ empty or ½ full?)

– More on this …

Assess split using some weighted combination of the measure and class probabilities

– i.e. a small child node with perfect information contributes only a small amount to the adjudication of the quality of the split

Iterate across all possible splits – choose the “best” using given measure
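
The steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not CART's implementation: the function names `gini` and `best_split` and the toy data are hypothetical, Gini is used as the impurity measure, and candidate split points are taken as midpoints between adjacent distinct values.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a node: 1 - sum_j p(j|t)^2."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(values, labels):
    """Try every candidate split point; return (threshold, weighted impurity)."""
    n = len(values)
    distinct = sorted(set(values))
    best_threshold, best_score = None, float("inf")
    for lo, hi in zip(distinct, distinct[1:]):
        threshold = (lo + hi) / 2  # midpoint between adjacent values
        left = [y for x, y in zip(values, labels) if x <= threshold]
        right = [y for x, y in zip(values, labels) if x > threshold]
        # Weight each child's impurity by its share of the parent node.
        score = len(left) / n * gini(left) + len(right) / n * gini(right)
        if score < best_score:
            best_threshold, best_score = threshold, score
    return best_threshold, best_score


# Hypothetical toy data, not taken from the study.
ages = [25, 31, 38, 42, 55, 63]
outcome = ["none", "none", "none", "any", "any", "any"]
threshold, score = best_split(ages, outcome)
print(threshold, score)  # 40.0 0.0
```

Because each child is weighted by its share of the parent, a tiny pure child node moves the score only slightly, which is the point made in the bullet above.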

Statistical Methods - Splitting Rules

Candidate splitting rules:

– Gini

– Symmetric Gini

– Entropy

– Twoing

– Ordered Twoing

– Class probability

[Figure: a parent node split into a left child and a right child]

Best?

Gini – A Little History

Developed by Italian statistician Corrado Gini in 1912

– Designed to measure income inequality

– Can be used to assess inequality, or impurity, in any distribution

– Related to the Lorenz curve (1905) of income distribution

Graph: http://en.wikipedia.org/wiki/Gini_coefficient

Gini – A Little Math

GINI “impurity” criterion:

$$GINI(t) = 1 - \sum_j [p(j|t)]^2$$

where $p(j|t)$ = relative frequency of class j at node t

Split quality measured by:

$$GINI_{split} = \sum_{j=1}^{k} \frac{n_j}{n_p} \, GINI(j)$$

where $n_j$ = number of individuals at child node j

$n_p$ = number of individuals at parent node p

Split with the minimum GINI index is chosen

[Figure: a parent node split into a left child and a right child]

Gini – An Example

Simple Gini Example

Parent Node: n = 200

                 Child 1           Child 2
Cases            50                50
Controls         50                50
node n           100               100
Prop Cases       50%               50%
Prop Controls    50%               50%
Gini             Gini(s1) = 0.50   Gini(s2) = 0.50

Gini(1,2) = 0.50
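
The example above can be checked numerically. The sketch below is illustrative only: the helper names `gini` and `gini_split` are hypothetical, and the second split (80/20 vs 20/80) is an invented contrast, not a result from the study.

```python
def gini(counts):
    """Gini impurity from class counts at a node: 1 - sum_j p(j|t)^2."""
    n = sum(counts)
    return 1.0 - sum((c / n) ** 2 for c in counts)


def gini_split(children):
    """Weighted Gini of a split: sum over children of (n_j / n_p) * GINI(j)."""
    n_parent = sum(sum(counts) for counts in children)
    return sum(sum(counts) / n_parent * gini(counts) for counts in children)


# (cases, controls) in each child node, as on the slide.
child_1 = (50, 50)
child_2 = (50, 50)
print(gini(child_1), gini(child_2))     # 0.5 0.5
print(gini_split([child_1, child_2]))   # 0.5

# Hypothetical, more informative split of the same 200 individuals.
informative = gini_split([(80, 20), (20, 80)])
print(round(informative, 2))            # 0.32
```

The slide's split leaves both children as mixed as the parent, so the weighted Gini stays at 0.50; a split that actually separates cases from controls lowers it.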

Gini

Categorical variables:

– Count each class

Continuous variables:

– Sort the variable

– Choose one or more splits

Tends to find the largest classes first

Misclassification costs are incorporated by adjusting prior probabilities
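
One way to picture the cost adjustment is the "altered priors" device described in the CART monograph: each class prior is multiplied by the cost of misclassifying that class and the results are renormalized. The sketch below assumes that form, assumes the cost depends only on the true class, and uses the sample proportions (283 cases, 600 non-cases) and the 1.5 cost quoted elsewhere in this deck; the function name `altered_priors` is hypothetical and this is not claimed to be the exact computation inside the software.

```python
def altered_priors(priors, costs):
    """Rescale priors by per-class misclassification cost, then renormalize."""
    weighted = {cls: priors[cls] * costs[cls] for cls in priors}
    total = sum(weighted.values())
    return {cls: value / total for cls, value in weighted.items()}


# Assumption: data priors taken from the case/non-case counts in the deck.
priors = {"case": 283 / 883, "control": 600 / 883}
# Cost of misclassifying a case as a control = 1.5; the reverse error = 1.
costs = {"case": 1.5, "control": 1.0}

adjusted = altered_priors(priors, costs)
for cls in priors:
    print(cls, round(priors[cls], 3), "->", round(adjusted[cls], 3))
```

Raising the cost of missing a case has the same effect on the splitting criterion as treating cases as more common than they are in the data.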

Symmetric Gini

Gini criterion using symmetric misclassification costs

– Difference:

  – At the pruning stage Gini can use nonsymmetric costs

  – Symmetric Gini imposes symmetry on the cost matrix

– Motivation: highly nonsymmetric costs can cause the impurity function to behave badly (not concave)

– See CART monograph for nice example

– Something to consider when costs are important and not balanced

Entropy – A Little History

Claude Shannon introduced the concept in 1948

– “A Mathematical Theory of Communication” – Bell Labs

– Minimum # of bits needed to encode a string of symbols

Many (>30) definitions and interpretations

– Physics: the amount of disorder in a system

– Statistics and Information Theory: information – as in statistical genetics

Entropy is a measure of uncertainty, or conversely, information

Entropy – A Little Math

Entropy calculation

$$i(t) = -\sum_j p(j|t) \log[p(j|t)]$$

(Recall that Gini uses $[p(j|t)]^2$)

Similar evaluation:

– Change in entropy = parent entropy – weighted entropy of the child nodes

Multi-level outcomes

– Looks for splits where as many levels as possible are divided perfectly or near perfectly.
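
The entropy rule can be sketched the same way as Gini. In the hedged example below the function names are hypothetical, the logarithm is taken base 2 (the slide does not state a base), and the class counts are invented for illustration.

```python
import math


def entropy(counts):
    """Node entropy i(t) = -sum_j p(j|t) * log2 p(j|t); empty classes add 0."""
    n = sum(counts)
    return -sum((c / n) * math.log2(c / n) for c in counts if c > 0)


def entropy_change(parent, children):
    """Parent entropy minus the size-weighted entropy of the child nodes."""
    n = sum(parent)
    weighted = sum(sum(counts) / n * entropy(counts) for counts in children)
    return entropy(parent) - weighted


parent = (100, 100)  # hypothetical (cases, controls)
print(entropy(parent))                                    # 1.0
print(entropy_change(parent, [(50, 50), (50, 50)]))       # 0.0
print(entropy_change(parent, [(100, 0), (0, 100)]))       # 1.0
```

An uninformative split gains nothing, while a split that separates the classes perfectly recovers the full bit of uncertainty in the parent node.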

Twoing – History & Math

First proposed in the original 1984 CART monograph

Idea: in a multi-class problem, separate the classes into two “superclasses”

– Choose split with greatest decrease in node impurity

– “Strategic” splits – information re: class similarities

Attempts to find groups of up to 50% of the data each

– Power-modified Twoing forces splits to be close to 50%

Has been suggested for difficult problems – low signal / noise ratio

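Twoing criterion for a split into left (L) and right (R) child nodes:

$$\frac{p_L \, p_R}{4} \left[ \sum_j \left| p(j|t_L) - p(j|t_R) \right| \right]^2$$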
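
A minimal sketch of the twoing criterion, assuming the standard form from the CART monograph in which the split maximizing the value is preferred. The function name `twoing` and the three-class counts are hypothetical.

```python
def twoing(left_counts, right_counts):
    """Twoing value: (p_L * p_R / 4) * (sum_j |p(j|t_L) - p(j|t_R)|)^2.

    left_counts and right_counts hold per-class counts in each child node;
    both children are assumed to be non-empty.
    """
    n_left, n_right = sum(left_counts), sum(right_counts)
    n = n_left + n_right
    p_left, p_right = n_left / n, n_right / n
    diff = sum(
        abs(l / n_left - r / n_right)
        for l, r in zip(left_counts, right_counts)
    )
    return p_left * p_right / 4 * diff ** 2


# Hypothetical three-class outcome split into two child nodes.
print(twoing((25, 25, 0), (25, 25, 0)))             # 0.0: children are identical
print(round(twoing((40, 10, 0), (0, 10, 40)), 4))   # 0.16
print(twoing((50, 0), (0, 50)))                     # 0.25: perfect 50/50 split
```

The `p_L * p_R` factor is largest when the two children are the same size, which is why twoing tends toward groups of up to 50% of the data each.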

Ordered Twoing

Twoing designed for ordered outcomes

Constraint:

– Only considers grouping adjacent classes

Example:

– Twoing - consider any combinations of numeric values (e.g. 2,5 vs 1,3,4)

– Ordered twoing - only consider adjacent splits (1,2,3 vs 4,5)

– … a severity scale, for example
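
The difference between the two rules is easiest to see by listing the superclass groupings each one may consider. The sketch below is an illustration with hypothetical function names, using a five-level outcome coded 1 to 5 as in the example above.

```python
from itertools import combinations


def twoing_groupings(classes):
    """All ways to divide the classes into two non-empty superclasses."""
    classes = list(classes)
    first, rest = classes[0], classes[1:]
    groupings = []
    # Fix the first class on the left so mirror images are not counted twice.
    for size in range(len(rest) + 1):
        for combo in combinations(rest, size):
            left = [first, *combo]
            right = [c for c in rest if c not in combo]
            if right:
                groupings.append((left, right))
    return groupings


def ordered_twoing_groupings(classes):
    """Only groupings that keep adjacent (ordered) classes together."""
    classes = sorted(classes)
    return [(classes[:i], classes[i:]) for i in range(1, len(classes))]


levels = [1, 2, 3, 4, 5]
unordered = twoing_groupings(levels)
ordered = ordered_twoing_groupings(levels)
print(len(unordered))  # 15
print(len(ordered))    # 4
print(ordered)
```

With five levels, twoing may consider 15 groupings (including 2,5 vs 1,3,4), while ordered twoing is restricted to the 4 cut points along the scale.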

Class Probability

Probability trees instead of classification trees

– Estimate of the probability that a case is in the class

  – Class assignment also

– Measure of association (or affinity) with each class

– Example

  – Useful to estimate the relative probabilities of a disease w/o ruling any out

Splitting Rules

Using different rules will have the most effect for multilevel outcomes

Good practice to use several splitting rules and compare the results

Tree-Based Methods

[Figure: schematic of the three methods. CART: a single tree (parent node with left and right children). Random Forests: several trees, each casting a vote (Class=1, Class=2, Class=1, Class=1). Stochastic Gradient Boosting: a sequence of trees.]

Your vote counts!

Model Quality Across Methods

Variable importance

Classification accuracy

Random Forests - Brief Overview

Random Forests

Concept:

– Grow an ensemble of trees using bootstrapped samples

  – Each “votes” on the classification

Random sample of the data – WITH replacement

  – Usually, about 1/3 of the data are “out of bag” (OOB)

  – Can be used for validation purposes

– New observations are classified by all of the trees

  – Majority vote is the final classification for the observation

Sampling

– Random sample equal in size to the original sample (WITH replacement)

– Sub-sample of all available covariates

Full trees grown – *no* pruning
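
The bootstrap and voting steps can be illustrated without growing any trees. The sketch below uses hypothetical function names; the sample size of 883 is borrowed from the analysis sample described later in the deck, and the four votes mirror the ones shown in the schematic figure. The expected out-of-bag share is (1 - 1/n)^n, roughly 37%, which is the "about 1/3" quoted above.

```python
import random
from collections import Counter


def bootstrap_indices(n, rng):
    """Draw n row indices WITH replacement from a sample of size n."""
    return [rng.randrange(n) for _ in range(n)]


def oob_fraction(n, rng):
    """Share of rows that never appear in one bootstrap sample."""
    in_bag = set(bootstrap_indices(n, rng))
    return 1 - len(in_bag) / n


def majority_vote(votes):
    """Final classification: the class predicted by the most trees."""
    return Counter(votes).most_common(1)[0][0]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
fractions = [oob_fraction(883, rng) for _ in range(200)]
mean_oob = sum(fractions) / len(fractions)
print(round(mean_oob, 2))

print(majority_vote([1, 2, 1, 1]))  # 1
```

Each tree's out-of-bag rows act as a built-in test sample for that tree, which is why no separate holdout set is strictly required.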

Random Forests

The error of a forest depends upon:

– Strength of the individual trees

– Correlation among the trees

Random Forests

Evaluation criteria:

– Classification accuracy on the OOB samples

– Variable importance

  – Relative contribution of each variable to the final classification solution

Random Forests

Proximities for each pair of observations can be estimated

– Useful for clustering / segmentation

Nice feature – not used in these analyses

Available in CART 6.2 and in code from Adele Cutler - flexible

Stochastic Gradient Boosting - Brief Overview

Stochastic Gradient Boosting

Boosting methods:

– Sequential algorithms – weights at a particular step are dependent upon previously fitted functions

Model residuals from previous stage

– Sub-sample of the data

– Small trees, usually with 2 to 8 nodes, at each step.

– Trees are combined by adding individual scores

Stochastic Gradient Boosting

Friedman: a weighting coefficient, or “shrinkage parameter”, is applied at each step

– slow the learning process

– results in better classification accuracy (Friedman, 2001)

Version 6.2 allows investigation of 3 shrinkage parameters in one run; 0.001, 0.01, 0.1

– “Battery” option LEARNRATE
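
The ingredients described above (model the residuals, use a sub-sample, grow a small tree, add its shrunken score) can be sketched in plain Python. This is a simplified illustration, not TreeNet: it assumes a squared-error loss on a numeric outcome, two-node trees (stumps), one covariate, and invented toy data; all function names are hypothetical. It is run at the three learning rates listed above.

```python
import random


def fit_stump(xs, residuals):
    """Least-squares stump (one split, two terminal nodes) fitted to residuals."""
    best = None
    distinct = sorted(set(xs))
    for lo, hi in zip(distinct, distinct[1:]):
        t = (lo + hi) / 2
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        left_mean, right_mean = sum(left) / len(left), sum(right) / len(right)
        sse = sum((r - left_mean) ** 2 for r in left)
        sse += sum((r - right_mean) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, t, left_mean, right_mean)
    if best is None:  # no split possible: predict the mean residual everywhere
        mean = sum(residuals) / len(residuals)
        return float("inf"), mean, mean
    return best[1:]


def boost(xs, ys, n_trees, learn_rate, subsample=0.5, seed=0):
    """Return the fitted values after n_trees shrunken stumps."""
    rng = random.Random(seed)
    preds = [sum(ys) / len(ys)] * len(xs)  # start from the overall mean
    k = max(2, int(subsample * len(xs)))
    for _ in range(n_trees):
        idx = rng.sample(range(len(xs)), k)  # sub-sample without replacement
        t, left_mean, right_mean = fit_stump(
            [xs[i] for i in idx], [ys[i] - preds[i] for i in idx]
        )
        # Add the new tree's score, scaled by the shrinkage parameter.
        preds = [
            p + learn_rate * (left_mean if x <= t else right_mean)
            for p, x in zip(preds, xs)
        ]
    return preds


def mse(ys, preds):
    return sum((y - p) ** 2 for y, p in zip(ys, preds)) / len(ys)


# Hypothetical toy data: a step function in one covariate.
xs = list(range(20))
ys = [0.0 if x < 10 else 1.0 for x in xs]
errors = {
    rate: mse(ys, boost(xs, ys, n_trees=200, learn_rate=rate))
    for rate in (0.001, 0.01, 0.1)
}
for rate, err in errors.items():
    print(rate, round(err, 4))
```

With a fixed number of trees, the smallest learning rate has barely moved from the starting mean, which illustrates why smaller shrinkage values generally need many more trees.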

Stochastic Gradient Boosting

Distinct advantage

– Much higher degree of accuracy than traditional methods, e.g. logistic or ordinary least squares regression

– Better accuracy than single-tree methods (e.g. RP) or parallel “forest” methods (e.g. random forests)

– Model implementation in an independent sample is no more statistically difficult than with single-tree methods

– But, models are considerably larger

– No simple graphical representation

– Model characterization:

  – Classification accuracy in test or holdout samples (v-fold cross-validation)

  – Number and relative importance of variables used

Summary

CART

– Single tree-structured model

– Nice graphical representation

– Several methods for splitting criteria

Random Forests

– Ensemble of trees based on bootstrapped samples

– Majority “vote”

– More accurate than CART

– No single graphical representation

Stochastic Gradient Boosting

– Ensemble – small trees based on partial samples

– Models residuals

– Best for many classes of problems

So,

How do these methods help discover the etiology of outcomes in these data?

Results

Question 1: Adverse Drug Events

Adverse Drug Events: Results

Are there patterns of association with the outcomes and specific drugs not amenable to standard epidemiologic analyses?

The short answer is No, in this case.

There were no analyses in which the focus drug, separated from its class, was associated with the outcome.

Not with any construction of the outcome

Not with any of the tree-based statistical methods

Conclusion: Verification of the epidemiologic analysis

The Good News:

If there is no association, you’re not likely to make one up using Tree-Based methods

Results

Question 2: Etiology of Outcomes in these Data

Etiology: Cardiovascular & Cerebrovascular Outcomes

What else can be learned about the etiology of the target outcomes from this study?

Methods

– Sample of all outcomes and 600 randomly selected non-outcomes constructed (N=883)

– Outcomes examined from individual to any/none (as before)

– RP, SGB, and RF methods employed

  – CART version 6.2

CART Classification Accuracy

“Any / None” outcome, Gini splitting rules

Misclassification cost (1,0) = 1.5, mixed prior probabilities of class membership

Classification Accuracy

– 32% of the sample are cases

  – Indices: Estimation: 233, Validation: 215

Prediction Success -- Focus Class 1 -- Estimation -- Row %

Actual Class   Total Cases   Percent Correct   Predicted 1 (N=366)   Predicted 0 (N=517)
1              283           75.27             75.27                 24.73
0              600           74.50             25.50                 74.50

Total: 883.00
Average: 74.88
Overall % Correct: 74.75

Prediction Success -- Focus Class 1 -- Validation -- Row %

Actual Class   Total Cases   Percent Correct   Predicted 1 (N=403)   Predicted 0 (N=480)
1              283           72.79             72.79                 27.21
0              600           67.17             32.83                 67.17

Total: 883.00
Average: 69.98
Overall % Correct: 68.97
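
The "Average" and "Overall % Correct" summary lines follow directly from the per-class rows. The short check below is an illustration with a hypothetical helper name: the average is the unweighted mean of the two class accuracies, and the overall figure weights each class by its number of observations.

```python
def summarize(rows):
    """rows: list of (n_actual, percent_correct) pairs, one per actual class."""
    total = sum(n for n, _ in rows)
    average = sum(pct for _, pct in rows) / len(rows)   # unweighted mean
    overall = sum(n * pct for n, pct in rows) / total   # weighted by class size
    return total, average, overall


# (Total Cases, Percent Correct) for class 1 and class 0, from the tables.
estimation = [(283, 75.27), (600, 74.50)]
validation = [(283, 72.79), (600, 67.17)]

est_total, est_average, est_overall = summarize(estimation)
val_total, val_average, val_overall = summarize(validation)
print(f"Estimation: total={est_total}, overall={est_overall:.2f}")
print(f"Validation: total={val_total}, overall={val_overall:.2f}")
```

Because non-cases outnumber cases roughly two to one, the overall figure sits closer to the non-case accuracy than the simple average does.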

[Figure: CART tree for the “Any / None” outcome, shown here as an indented outline]

Node 1 (Class = Any, N = 883)
  AGE <= 39.50: Node 2 (Class = None, N = 420)
    HS_VIZ_OTHER <= 5.50: Terminal Node 1 (Class = None, N = 369); None 326 (88.3%), Any 43 (11.7%)
    HS_VIZ_OTHER > 5.50: Terminal Node 2 (Class = Any, N = 51); None 28 (54.9%), Any 23 (45.1%)
  AGE > 39.50: Node 3 (Class = Any, N = 463)
    EPI_COST <= 3698.13: Node 4 (Class = Any, N = 330)
      BLOCK4$ = (2003_00,...): Node 5 (Class = Any, N = 265)
        AGE <= 50.50: Node 6 (Class = Any, N = 169)
          EPI_COST <= 254.68: Terminal Node 3 (Class = None, N = 52); None 43 (82.7%), Any 9 (17.3%)
          EPI_COST > 254.68: Node 7 (Class = Any, N = 117)
            EPI_COST <= 2270.25: Terminal Node 4 (Class = Any, N = 86); None 45 (52.3%), Any 41 (47.7%)
            EPI_COST > 2270.25: Terminal Node 5 (Class = None, N = 31); None 25 (80.6%), Any 6 (19.4%)
        AGE > 50.50: Terminal Node 6 (Class = Any, N = 96); None 43 (44.8%), Any 53 (55.2%)
      BLOCK4$ = (2004_01): Terminal Node 7 (Class = None, N = 65); None 53 (81.5%), Any 12 (18.5%)
    EPI_COST > 3698.13: Terminal Node 8 (Class = Any, N = 133); None 37 (27.8%), Any 96 (72.2%)

Figure annotations: (Sample mean age=39.7), (~75th %ile), (Mean=$4,068), (75th %ile=$2,911)
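
A single tree like the one in the figure translates directly into nested conditions. The sketch below is an illustration: the argument names follow the variable labels in the figure, the function and table names are hypothetical, and because the levels of BLOCK4$ are elided in the figure, any value other than "2004_01" is assumed to follow the other branch. The terminal-node counts are copied from the figure and reproduce the estimation-sample accuracy reported for CART.

```python
def classify(age, hs_viz_other, epi_cost, block4):
    """Return (terminal node, assigned class) by following the tree's splits."""
    if age <= 39.5:
        return (1, "None") if hs_viz_other <= 5.5 else (2, "Any")
    if epi_cost > 3698.13:
        return (8, "Any")
    if block4 == "2004_01":  # assumption: all other levels take the other branch
        return (7, "None")
    if age > 50.5:
        return (6, "Any")
    if epi_cost <= 254.68:
        return (3, "None")
    return (4, "Any") if epi_cost <= 2270.25 else (5, "None")


# Terminal node -> (assigned class, n None, n Any), copied from the figure.
TERMINAL_NODES = {
    1: ("None", 326, 43),
    2: ("Any", 28, 23),
    3: ("None", 43, 9),
    4: ("Any", 45, 41),
    5: ("None", 25, 6),
    6: ("Any", 43, 53),
    7: ("None", 53, 12),
    8: ("Any", 37, 96),
}

total_cases = sum(n_any for _, _, n_any in TERMINAL_NODES.values())
total_non_cases = sum(n_none for _, n_none, _ in TERMINAL_NODES.values())
cases_correct = sum(
    n_any for cls, _, n_any in TERMINAL_NODES.values() if cls == "Any"
)
non_cases_correct = sum(
    n_none for cls, n_none, _ in TERMINAL_NODES.values() if cls == "None"
)

print(total_cases, total_non_cases)                         # 283 600
print(round(100 * cases_correct / total_cases, 2))          # 75.27
print(round(100 * non_cases_correct / total_non_cases, 2))  # 74.5
```

For example, `classify(45, 0, 1000.0, "2003_00")` lands in terminal node 4 and is assigned class "Any".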

Variable Importance

[Figure: bar chart of standardized variable importance, scale 0 to 100. Variables in the order listed on the chart: Total Cost, Age, Total Rx Cost, Rx History, BL High BP, Lab History, Other Hosp Visits, # Hosp Stays, Cohort, Cardio Rx History, BL Statin, Cardio Hosp Visits, Psych Hosp Visits]

Conclusion

The older you are, and the more prior illness you have had, the more likely you are to be susceptible to these outcomes.

– Confirms what we already know

Need more information, and probably more accurate classification

– Additional source data, e.g. charts

– Claims data are designed for insurance payment

But, what if…

… I used a different splitting rule?

[Figure: relative error by splitting rule. x-axis: Sym Gini, Gini, Twoing, Ord. Twoing, Entropy, Cl. Prob; y-axis: Rel. Error, 0.65 to 0.80]

Test Rel. Error

Gini (8) (0.612)

Min = 0.6122

Median = 0.6273

Mean = 0.6496

Max = 0.7702

Sym Gini (3) (0.628)

Relative error -> Gini

Smallest tree -> Symmetric Gini

Average Variable Importance

Average variable importance across methods – similar to Gini

… I used a different outcome?

4-Level classification (cerebral or cardiac, death, multiple, none)

But what if…

… I used a different tree-based method?

Random Forests – OOB Test Sample Results

459 trees, ROC = 0.60

229 trees, ROC = 0.64

Rows: actual class. Columns: predicted class (row %).

Actual Class          Total Cases   Percent Correct   Cerebral or cardiac   Death     Multiple   None
                                                      (N=1)                 (N=120)   (N=198)    (N=564)
Cerebral or cardiac   233           0.43              0.43                  22.32     37.77      39.48
Death                 13            76.92             0                     76.92     0          23.08
Multiple              37            37.84             0                     35.14     37.84      27.03
None                  600           76.5              0                     7.5       16         76.5

Actual Class   Total Cases   Percent Correct   Cardiac   Cerebral   Death     Multiple   None
                                               (N=0)     (N=134)    (N=108)   (N=144)    (N=497)
Cardiac        197           0                 0         21.83      18.27     25.89      34.01
Cerebral       36            19.44             0         19.44      33.33     25         22.22
Death          13            69.23             0         7.69       69.23     0          23.08
Multiple       37            27.03             0         18.92      29.73     27.03      24.32
None           600           68.33             0         12.67      6.67      12.33      68.33

Random Forests – OOB Results

325 trees, ROC = 0.76

SGB?

Actual Class   Total Cases   Percent Correct   None (N=612)   Any (N=271)
None           600           82.83             82.83          17.17
Any            283           59.36             40.64          59.36

Stochastic Gradient Boosting (TreeNet)

169 trees, ROC=0.79

114 trees, ROC = 0.84

Rows: actual class. Columns: predicted class (row %).

Actual Class          Total Cases   Percent Correct   Cerebral or cardiac   Death    Multiple   None
                                                      (N=555)               (N=36)   (N=52)     (N=240)
Cerebral or cardiac   233           75.97             75.97                 6.87     10.30      6.87
Death                 13            61.54             23.08                 61.54    0.00       15.38
Multiple              37            8.11              72.97                 10.81    8.11       8.11
None                  600           36.50             58.00                 1.33     4.17       36.50

Actual Class   Total Cases   Percent Correct   None (N=517)   Any (N=366)
None           600           70.33             70.33          29.67
Any            283           66.43             33.57          66.43

Tree-Based Model Comparisons

Classification Accuracy - Any/None Outcome - Validation Samples

Of note: top variables were the same across methods

Method                         Cases (% correct)   Non-Cases (% correct)
CART                           72.8                67.2
Random Forests                 59.4                82.8
Stochastic Gradient Boosting   66.4                70.3
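
The per-class figures above can be combined into single summaries for a side-by-side view. The sketch below is illustrative and uses hypothetical names; it assumes the validation sample sizes reported earlier (283 cases, 600 non-cases) and computes both the unweighted average of the two class accuracies and the overall accuracy weighted by class size.

```python
N_CASES, N_NON_CASES = 283, 600

# Method -> (% of cases correct, % of non-cases correct), from the table above.
results = {
    "CART": (72.8, 67.2),
    "Random Forests": (59.4, 82.8),
    "Stochastic Gradient Boosting": (66.4, 70.3),
}

summary = {}
for method, (cases_pct, non_cases_pct) in results.items():
    average = (cases_pct + non_cases_pct) / 2
    overall = (cases_pct * N_CASES + non_cases_pct * N_NON_CASES) / (
        N_CASES + N_NON_CASES
    )
    summary[method] = (average, overall)
    print(f"{method}: average={average:.2f}, overall={overall:.2f}")
```

The two summaries can rank methods differently because the overall figure rewards accuracy on the larger non-case group, so the choice of summary should follow the cost of each type of error.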

Conclusions

There were no associations of the focus drug and the adverse outcomes in the data

– Tree-based methods confirmed the finding

– Methods focus: variable importance

No surprises in the etiology of the outcomes using the available data

– Sometimes simple really is best

– Methods focus: classification accuracy

Methods Conclusions

In this case, CART was the most parsimonious model

– Classification accuracy in the same range as SGB

– Variable importance ranks similar

Advantage: Graphically tractable tree

– Intuitive

– Easy to program

Sometimes, simple is best

Limitations

Bias of claims data

None of these tree-based analyses used “person-time”

– Can be accomplished with Tree-Structured Survival Analytic methods

– Available in S-Plus

– In development at Salford