Dynamic Integration of Classifiers for Handling Concept Drift

Alexey Tsymbal, Department of Computer Science, Trinity College Dublin, Ireland
Mykola Pechenizkiy, Dept. of Mathematical IT, University of Jyväskylä, Finland
Pádraig Cunningham, Department of Computer Science, Trinity College Dublin, Ireland
Seppo Puuronen, Dept. of CS and IS, University of Jyväskylä, Finland

IEEE CBMS’06: DM Track, Salt Lake City, Utah, USA, June 21-23, 2006
Slides dated 22.06.06



Outline

Introduction
– Supervised Learning
– The Problem of Concept Drift (CD)
Approaches to Handle CD
– Instance selection; instance weighting; and ensemble learning
Dynamic Integration of Classifiers for Handling CD
– Dynamic Selection, Dynamic Integration, and their mix
Domain of Antibiotic Resistance
– How resistance occurs, concept drift context
Experiment design
– C4.5 ensembles with static and dynamic integration
Results and Conclusion


The Task of Classification

J classes, n training observations, p features.
Given n training instances (x_i, y_i), where x_i are the values of the attributes and y_i is the class.
Goal: given a new instance x_0, predict its class y_0.

[Diagram: the training set and a new instance to be classified enter CLASSIFICATION; the output is the class membership of the new instance]

Examples:
- diagnosis of thyroid diseases;
- heart attack prediction, etc.


The Task of Classification

Predicting Antibiotic Resistance
– predict the sensitivity of a pathogen to an antibiotic based on data about the antibiotic, the isolated pathogen, and the demographic and clinical features of the patient.


The Problem of Concept Drift

Changes in the hidden context can induce more or less radical (gradual or abrupt) changes in the target concept.
– A typical example is antibiotic resistance:
• pathogen sensitivity may change over time as new pathogen strains develop resistance to antibiotics that were previously effective.
– Even in the most strictly controlled environments, some unexpected changes may happen due to:
• failure and replacement of some medical equipment, or
• changes in personnel,
causing the necessity to change the model.
– The need to change the current model because of a change in the data distribution is called virtual concept drift.
An effective learner should be able to track such changes and adapt to them quickly.


Approaches to Handle Concept Drift

Instance selection:
– select instances relevant to the current concept;
– generalize from a moving window and use the learnt concepts for prediction only in the immediate future;
– case-base editing strategies in CBR that delete noisy, irrelevant and redundant cases.
Instance weighting:
– weighting according to “age” and competence with respect to the current concept;
– weighting techniques handle CD worse than analogous instance selection techniques (due to overfitting the data).
Ensemble learning:
– maintains a set of concept descriptions, predictions of which are combined using e.g. a form of voting;
– divides the data into sequential blocks of fixed size and builds an ensemble on them.
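The first two families above can be illustrated with a minimal Python sketch. The function names, the window size and the exponential half-life form of the age weighting are assumptions made for illustration only; the slides do not specify them.

```python
from collections import deque

def sliding_window(stream, window_size):
    """Instance selection: after each arrival, yield only the most recent instances."""
    window = deque(maxlen=window_size)  # older instances fall out automatically
    for instance in stream:
        window.append(instance)
        yield list(window)

def age_weights(n_instances, half_life):
    """Instance weighting by age (assumed exponential decay, not from the slides).

    The newest instance gets weight 1.0; the weight halves every
    `half_life` positions further into the past.
    """
    return [0.5 ** ((n_instances - 1 - i) / half_life) for i in range(n_instances)]
```

A learner would be retrained on each yielded window, or trained on all instances with the returned weights.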


Handling Concept Drift with Ensembles

An ensemble is constructed as a set of concept descriptions corresponding to different time intervals.

[Diagram: time axis; training set for next base classifier]

Usually simple voting is used for model combination
– it does not work in complex domains with local concept drift.
Our basic idea: use local accuracies for model combination in order to handle local concept drift
– adapts to concept drift better (e.g. with antibiotic resistance data).
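As a rough sketch of the block-based ensemble with simple voting described on this slide: the helper names are hypothetical, and `majority_class_learner` is a deliberately trivial stand-in for a real base learner such as C4.5.

```python
from collections import Counter

def split_into_blocks(instances, block_size):
    """Cut a time-ordered list of (x, y) pairs into consecutive fixed-size blocks."""
    return [instances[i:i + block_size] for i in range(0, len(instances), block_size)]

def train_block_ensemble(instances, block_size, learn):
    """Train one base model per block; `learn` maps a block to a callable model."""
    return [learn(block) for block in split_into_blocks(instances, block_size)]

def simple_vote(models, x):
    """Static integration: every model has one vote, whatever its age or accuracy."""
    votes = Counter(model(x) for model in models)
    return votes.most_common(1)[0][0]

def majority_class_learner(block):
    """Hypothetical stand-in for C4.5: always predicts the block's most frequent class."""
    label = Counter(y for _, y in block).most_common(1)[0][0]
    return lambda x: label
```

With a stream whose class changes in the last block, the older models still outvote the newest one, which is the weakness of simple voting that motivates using local accuracies.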


Local Concept Drift

In the real world, concept drift may often be local:
– changes in the concept or data distribution occur in some regions of the instance space only,
• only particular bacteria may develop resistance to certain antibiotics, while resistance to the others could remain the same;
– the type and severity of changes may depend on the location in the instance space.
Local CD: changes in concept and data distribution occurring at an instance level rather than at the data set level.
– Local CD occurs between two consecutive time points
• if there is a sub-space of the whole instance space whose changes of concept and/or data distribution differ from those in the rest of the data.
– This is reflected by a different change in the (local) predictive performance of the currently used model in this sub-space.


Stability of Regions: Rotating Hyperplane

Base models of an ensemble should not be discarded if
- their global accuracy on the current block of data falls, but they are still good experts in the stable parts of the data.
One solution to this problem is the use of DIC (dynamic integration of classifiers):
- the models are integrated at an instance level according to their local accuracies.

[Figure: stability of regions in the rotating hyperplane problem at time points t1, t2, t3, t4]
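To make the notion of stable regions concrete, here is a two-dimensional toy version of a rotating hyperplane in Python. The 2-D setting, the function names and the labelling convention are illustrative assumptions and are not taken from the slides.

```python
import math

def hyperplane_label(point, angle):
    """Label a 2-D point by its side of a line through the origin.

    The line makes `angle` radians with the x-axis; points on the side of
    the line's normal vector get label 1, all others label 0.
    """
    normal = (-math.sin(angle), math.cos(angle))
    side = point[0] * normal[0] + point[1] * normal[1]
    return 1 if side >= 0 else 0

def stable_fraction(points, angle_before, angle_after):
    """Fraction of points whose label is unchanged after the rotation."""
    unchanged = sum(
        hyperplane_label(p, angle_before) == hyperplane_label(p, angle_after)
        for p in points
    )
    return unchanged / len(points)
```

Even after a rotation, part of the instance space keeps its labels, so a model trained before the rotation remains a good expert there.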


Local Concept Drift

Most gradual CDs may be considered local if:
– the velocity of changes is small relative to the arriving instances in the data stream;
– most regions of the data remain stable.
Most abrupt CDs are
– not local, unless substantial sub-areas remain stable between the two changing concepts;
– local, if the change relates to a subgroup of the whole population.
CD may also be complex: different concept or data distribution changes (potentially also proceeding differently) occur in different clusters
– changes in antibiotic resistance (AR) and data distribution are usually different for different bacteria in the AR problem.
Local CD occurs at an instance level, so its treatment should be at that level as well!
Potential approaches to handle local CD:
– CBR: a case base is updated at an instance level;
– a hybrid of ensemble learning and instance selection;
– ensemble integration based on local accuracies.



How Antibiotic Resistance Happens

In spontaneous DNA mutation, bacterial DNA may mutate spontaneously. Drug-resistant tuberculosis arises this way.

In a form of microbial sex called transformation, one bacterium may take up DNA from another bacterium. Penicillin-resistant gonorrhea results from transformation.

Resistance can also be acquired from a small circle of DNA called a plasmid, which can flit from one type of bacterium to another.

– A single plasmid can provide a slew of different resistances.

– In 1968, 12,500 people in Guatemala died in an epidemic of Shigella diarrhea. The microbe harbored a plasmid carrying resistances to four antibiotics!


Data Collection & Organization

N.N. Burdenko Institute of Neurosurgery
Bacterial analyzer “Vitek-60” (by “bioMérieux”)
Information Systems: "Microbiologist" & "Microbe"

Each instance is one sensitivity test:
– the pathogen that is isolated during the bacterial identification analysis,
– the antibiotic that is used in the sensitivity test,
– the result of the sensitivity test itself (sensitive, resistant or intermediate), obtained from “Vitek” according to the NCCLS guidelines.
– The above information is connected with the patient, his or her demographic data (sex, age) and hospitalization in the Institute (main department, days spent in ICU, days spent in the hospital before the test, etc.).

4430 sensitivity tests corresponding to a single specimen (liquor), including the meningitis cases of the years 2002-2004.


Classification over Sequential Data Blocks

[Figure: accuracy for C4.5 ensembles over sequential data blocks 1 to 27; accuracy axis from 0.3 to 0.9; series: v, wv, ds, dv, dvs]


Weighted Average of Classification Accuracy

[Figure: min, aver and max accuracy for C4.5 ensembles; accuracy axis from 0.4 to 0.9; series: v, wv, ds, dv, dvs]


Summary and Conclusions

In the real world concepts are often not stable but change with time, which is known as the problem of concept drift (CD).

Among the most popular and effective approaches to handling CD is ensemble learning:

– a set of concept descriptions built on data blocks corresponding to different time intervals is maintained, and

– the final prediction is the aggregated prediction of ensemble members.

We suggested a dynamic integration approach for ensembles (DIC) used in handling CD:

– integrates the base classifiers at an instance level, assigning to them weights proportional to their local accuracy on each instance considered.

We considered an example of CD from the area of antibiotic resistance.

We demonstrated that DIC often results in better accuracy with the considered data set than the more commonly used weighted voting:

– this supports our hypothesis that favors DIC for handling CD, especially in the presence of local CD.
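The instance-level integration summarized above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the Euclidean distance, the 1/(1+d) neighbour weighting, the midpoint cutoff used in the combined variant, and all function names are hypothetical choices. The labels DS, DV and DVS are taken from the figure legends.

```python
import math
from collections import defaultdict

def local_accuracies(models, x, validation, k):
    """Distance-weighted accuracy of each model on the k validation pairs nearest to x."""
    neighbours = sorted(validation, key=lambda pair: math.dist(pair[0], x))[:k]
    # Assumed weighting: closer neighbours count more (1 / (1 + distance)).
    weights = [1.0 / (1.0 + math.dist(nx, x)) for nx, _ in neighbours]
    total = sum(weights)
    return [
        sum(w for w, (nx, ny) in zip(weights, neighbours) if model(nx) == ny) / total
        for model in models
    ]

def dynamic_selection(models, accuracies, x):
    """DS: use only the model with the highest local accuracy."""
    best = max(range(len(models)), key=lambda i: accuracies[i])
    return models[best](x)

def dynamic_voting(models, accuracies, x):
    """DV: each model votes with a weight proportional to its local accuracy."""
    scores = defaultdict(float)
    for model, accuracy in zip(models, accuracies):
        scores[model(x)] += accuracy
    return max(scores, key=scores.get)

def dynamic_voting_with_selection(models, accuracies, x):
    """DVS: drop locally weak models, then apply DV (midpoint cutoff is an assumption)."""
    cutoff = (min(accuracies) + max(accuracies)) / 2
    kept = [(m, a) for m, a in zip(models, accuracies) if a >= cutoff]
    return dynamic_voting([m for m, _ in kept], [a for _, a in kept], x)
```

In a region where only the newest model is correct on the neighbouring validation instances, all three rules follow that model even when older models form the majority of the ensemble.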


Contact Info

Mykola Pechenizkiy

Department of Mathematical Information Technology,

University of Jyväskylä, FINLAND

E-mail: [email protected]

http://www.cs.jyu.fi/~mpechen

THANK YOU!

MS Power Point slides of this and other recent talks and full texts of selected publications are available online at: http://www.cs.jyu.fi/~mpechen


Additional Slides …


Antibiotic Resistance in Nosocomial Infections

3-40% of patients admitted to hospital acquire an infection during their stay, and the risk of hospital-acquired (nosocomial) infection has risen steadily in recent decades.

The frequency depends mostly on the type of operation conducted, being greater for “dirty” operations (10-40%) and smaller for “pure” operations (3-7%). For example, a serious infectious complication such as postoperative meningitis is often the result of nosocomial infection.

Antibiotics are the drugs commonly used to fight infections caused by bacteria.

According to statistics from the Centers for Disease Control and Prevention (CDC), more than 70% of the bacteria that cause hospital-acquired infections are resistant to at least one of the antibiotics most commonly used to treat infections.

Analysis of the microbiological data included in antibiograms collected in different institutions over different periods of time is considered one of the most important activities for restraining the spread of antibiotic resistance and avoiding the negative consequences of this phenomenon.


Antibiotic sensitivity of different bacteria

Comparing the antibiotic sensitivity of different bacteria

© Jim Deacon, Institute of Cell and Molecular Biology, The University of Edinburgh


The emergence of antibiotic resistance

Effects of different antibiotics on growth of a Bacillus strain. The right-hand image shows a close-up of the novobiocin disk (marked by an arrow on the whole plate). In this case some individual mutant cells in the bacterial population were resistant to the antibiotic and have given rise to small colonies in the zone of inhibition.

© Jim Deacon, Institute of Cell and Molecular Biology, The University of Edinburgh


How Antibiotic Resistance Happens

Horizontal Gene Transfer (© Grace Yim and Fan Sozzi)


Mechanisms of Antibiotic Resistance

© Grace Yim and Fan Sozzi


Mechanisms of Antibiotic Resistance

Antibiotic: Method of resistance
Chloramphenicol: reduced uptake into cell
Tetracycline: active efflux from the cell
β-lactams, Erythromycin, Lincomycin: eliminates or reduces binding of antibiotic to target
β-lactams, Erythromycin: hydrolysis
Aminoglycosides, Chloramphenicol, Fosfomycin, Lincomycin: inactivation of antibiotic by enzymatic modification
β-lactams, Fusidic Acid: sequestering of the antibiotic by protein binding
Sulfonamides, Trimethoprim: metabolic bypass of inhibited reaction
Sulfonamides, Trimethoprim: overproduction of antibiotic target (titration)
Bleomycin: binding of specific immunity protein to antibiotic


Dataset Characteristics

Patient and hospitalization related
Sex: {Male, Female}
Age: Integer
Recurring stay: {True, False}
Days of stay in NSI: Integer
Days of stay in ICU: Integer
Days of stay in NSI before specimen was received: Integer
Bacterium is isolated when patient is in ICU: {True, False}
Main department: {1, …, 10}
Department of stay (departments + ICU): {1, …, 11}

Pathogen and pathogen groups
Pathogen name: {Pat_name1, …, Pat_name17}
Gram(+/-): {True, False}
Staphylococcus: {True, False}
Enterococcus: {True, False}
Enterobacteria: {True, False}
Nonfermenters: {True, False}

Antibiotic and antibiotic groups
Antibiotic name: {Ant_name1, …, Ant_name39}
Group1: {True, False}
…
Group15: {True, False}

Sensitivity: {Sensitive, Intermediate, Resistant}


Experiment design

In Naïve Bayes, a normal distribution was assumed for numeric features, and the Laplace correction with a multiplicative factor of 1 was used in probability estimation for categorical features.
C4.5 decision trees were built using 0.25 as the confidence factor for pruning and 2 as the minimum number of instances per leaf.
With all ensembles considered here we use the simple so-called “replace the loser” ensemble pruning strategy:
– if the ensemble size is greater than or equal to 25, the worst classifier, according to the current validation estimates, is replaced with a new one trained on the most recent data.
We experimented with 5 different neighbourhood sizes k: 7, 15, 31, 63, and 127.
– Naturally, accuracy usually decreases as the neighbourhood size increases, becoming closer to that of static voting.
– Our experiments demonstrated that DIC was not very sensitive to the size of the neighbourhood.
– A reason for that is the locally weighted learning scheme used, with which the more distant an instance is from the current test instance, the less influence it has on the prediction of local performance.
– However, the smaller neighbourhoods (7 and 15) sometimes result in noisy performance estimates and inferior accuracies (especially with DS).
– We continue our analysis of experimental results focusing on a neighbourhood size of 31, as it usually gives the best improvement due to DIC in the problems considered.
WEKA3 environment: Data Mining Software in Java:
– http://www.cs.waikato.ac.nz/ml/weka/
– Default settings were used in the WEKA learning algorithms used in our experiments.
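The “replace the loser” pruning rule described above might look like the following Python sketch. The function name and the behaviour below the size limit (simply appending the new model) are assumptions; the slide only states what happens once the ensemble has reached 25 members.

```python
def replace_the_loser(ensemble, validation_scores, new_model, max_size=25):
    """Return the ensemble after adding a model trained on the most recent data.

    Once the ensemble holds `max_size` or more models, the one with the lowest
    current validation score is replaced by `new_model`. Below that size the
    new model is appended (assumed behaviour, not stated in the slides).
    """
    updated = list(ensemble)  # leave the caller's list untouched
    if len(updated) >= max_size:
        worst = min(range(len(updated)), key=lambda i: validation_scores[i])
        updated[worst] = new_model
    else:
        updated.append(new_model)
    return updated
```

Here `validation_scores[i]` is assumed to be the current validation estimate for `ensemble[i]`, with higher values meaning better models.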