
A Similarity Evaluation Technique for Data Mining with Ensemble of Classifiers

Seppo Puuronen, Vagan Terziyan

International Workshop on Similarity Search

1-2 September 1999, Florence (Italy)

Authors

Seppo Puuronen

Department of Computer Science and Information Systems

University of Jyvaskyla, FINLAND

Vagan Terziyan

Department of Artificial Intelligence

Kharkov State Technical University of Radioelectronics, UKRAINE

Contents

• The Research Problem and Goal

• Basic Concepts

• External Similarity Evaluation

• Evaluation of Classifiers Competence

• An Example

• Internal Similarity Evaluation

• Conclusions

The Research Problem

During the past several years, in a variety of application domains, researchers in machine learning, computational learning theory, pattern recognition and statistics have tried to combine efforts to learn how to create and combine an ensemble of classifiers.

The primary goal of combining several classifiers is to obtain a more accurate prediction than can be obtained from any single classifier alone.

Goal

The goal of this research is to develop a simple similarity evaluation technique for classification problems that are solved with an ensemble of classifiers.

Classification here means finding an appropriate class for a given instance, among the available classes, based on the classifications produced by an ensemble of classifiers.

Basic Concepts: Training Set (TS)

TS of an ensemble of classifiers is a quadruple <D,C,S,P>:

• D is the set of instances D1, D2,..., Dn to be classified;

• C is the set of classes C1, C2,..., Cm , that are used to classify the instances;

• S is the set of classifiers S1, S2,..., Sr , which select classes to classify the instances;

• P is the set of semantic predicates that define relationships between D, C, S.

Basic Concepts: Semantic Predicate P

$$
P(D_i, C_j, S_k) =
\begin{cases}
1, & \text{if the classifier } S_k \text{ uses class } C_j \text{ to classify instance } D_i; \\
-1, & \text{if } S_k \text{ refuses to use } C_j \text{ to classify } D_i; \\
0, & \text{if } S_k \text{ does not use or refuse to use } C_j \text{ to classify } D_i.
\end{cases}
$$

Problem 1: Deriving External Similarity Values

[Figure: the sets D (instances), C (classes) and S (classifiers), with the external relations DCi,j between Di and Cj, SCk,j between Sk and Cj, and SDk,i between Sk and Di.]

External Similarity Values

External Similarity Values (ESV): binary relations DC, SC, and SD between the elements of (sub)sets of D and C; S and C; and S and D.

ESV are based on total support among all the classifiers for voting for the appropriate classification (or refusal to vote)

Problem 2: Deriving Internal Similarity Values

[Figure: the sets D (instances), C (classes) and S (classifiers), with the internal relations DDi’,i’’ between Di’ and Di’’, CCj’,j’’ between Cj’ and Cj’’, and SSk’,k’’ between Sk’ and Sk’’.]

Internal Similarity Values

Internal Similarity Values (ISV): binary relations between two subsets of D, two subsets of C and two subsets of S.

ISV are based on total support among all the classifiers for voting for the appropriate connection (or refusal to vote)

Why We Need Similarity Values (or a Distance Measure)?

• Distance between instances is used by agents to recognize nearest neighbors for any classified instance.

• Distance between classes is necessary to define the misclassification error during the learning phase.

• Distance between classifiers is useful to evaluate weights of all classifiers to be able to integrate them by weighted voting.

Deriving External Relation DC: How Well a Class Fits the Instance

$$
DC_{i,j} = CD_{j,i} = \sum_{k=1}^{r} P(D_i, C_j, S_k), \quad \forall D_i \in D, \ \forall C_j \in C
$$

[Figure: instance Di, class Cj and three classifiers Sk1, Sk2, Sk3, with DCi,j = 3.]
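As a minimal sketch of the DC derivation, assume the predicate values are stored in a nested list `P[k][i][j]` holding P(Di, Cj, Sk). The layout, the function name `derive_dc` and the toy votes are illustrative assumptions, not part of the slides.

```python
# Hypothetical toy data: P[k][i][j] = P(D_i, C_j, S_k) for r = 3 classifiers,
# n = 2 instances and m = 2 classes (1 = selects, -1 = refuses, 0 = abstains).
P = [
    [[1, -1], [0, 1]],   # votes of S_1
    [[1, 0], [-1, 1]],   # votes of S_2
    [[1, -1], [0, 0]],   # votes of S_3
]

def derive_dc(P):
    """Return DC[i][j] = sum over all classifiers k of P(D_i, C_j, S_k)."""
    r, n, m = len(P), len(P[0]), len(P[0][0])
    return [[sum(P[k][i][j] for k in range(r)) for j in range(m)]
            for i in range(n)]

DC = derive_dc(P)
print(DC)  # [[3, -2], [-1, 2]]
```

With these votes, all three classifiers select the first class for the first instance, so DC[0][0] = 3, the same situation as in the figure.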

Deriving External Relation SC: Measures Classifiers Competence in the Area of Classes

The value of the relation (Sk,Cj) represents the total support that the classifier Sk obtains by selecting (or refusing to select) the class Cj when classifying all the instances.

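$$
SC_{k,j} = CS_{j,k} = \sum_{i=1}^{n} DC_{i,j} \cdot P(D_i, C_j, S_k), \quad \forall S_k \in S, \ \forall C_j \in C
$$

Example of SC Relation

[Figure: example of the SC relation between the classifiers, instances and classes.]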
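The SC relation can be sketched in the same way. The nested-list layout `P[k][i][j]`, the helper names and the toy votes below are assumptions made for illustration only.

```python
# Hypothetical toy data: P[k][i][j] = P(D_i, C_j, S_k) for r = 3 classifiers,
# n = 2 instances and m = 2 classes (1 = selects, -1 = refuses, 0 = abstains).
P = [
    [[1, -1], [0, 1]],   # votes of S_1
    [[1, 0], [-1, 1]],   # votes of S_2
    [[1, -1], [0, 0]],   # votes of S_3
]

def derive_dc(P):
    """DC[i][j]: total vote of all classifiers for class C_j on instance D_i."""
    r, n, m = len(P), len(P[0]), len(P[0][0])
    return [[sum(P[k][i][j] for k in range(r)) for j in range(m)]
            for i in range(n)]

def derive_sc(P):
    """SC[k][j]: sum over instances i of DC[i][j] * P(D_i, C_j, S_k)."""
    dc = derive_dc(P)
    r, n, m = len(P), len(P[0]), len(P[0][0])
    return [[sum(dc[i][j] * P[k][i][j] for i in range(n)) for j in range(m)]
            for k in range(r)]

SC = derive_sc(P)
print(SC)  # [[3, 4], [4, 2], [3, 2]]
```

A vote contributes positively when it agrees in sign with the group total DC and negatively when it goes against it, which is why SC can be read as support received from the ensemble.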

Deriving External Relation SD: Measures “Competence” of Classifiers in the Area of Instances

The value of the relation (Sk,Di) represents the total support that the classifier Sk receives by selecting (or refusing to select) all the classes when classifying the instance Di.

$$
SD_{k,i} = DS_{i,k} = \sum_{j=1}^{m} DC_{i,j} \cdot P(D_i, C_j, S_k), \quad \forall S_k \in S, \ \forall D_i \in D
$$

Example of SD Relation

[Figure: classifier Sk, instance Di and classes C1 and C2, with CD1,i = -3, CD2,i = 5 and SDk,i = 2.]
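The numbers in the SD example can be checked with a short sketch. It assumes that the classifier Sk selects both classes (P = 1 for C1 and for C2), which is the only combination of votes in {-1, 0, 1} that turns CD values of -3 and 5 into SD = 2. The function name is illustrative.

```python
def support_on_instance(cd_values, votes):
    """SD_{k,i}: sum over classes j of DC_{i,j} * P(D_i, C_j, S_k)."""
    return sum(dc * p for dc, p in zip(cd_values, votes))

cd_values = [-3, 5]   # CD_{1,i} and CD_{2,i} as given in the example
votes = [1, 1]        # assumed: S_k selects both C1 and C2 for D_i

sd = support_on_instance(cd_values, votes)
print(sd)  # 2
```

Selecting C1 against the group opinion costs 3 units of support, and selecting C2 with the group gains 5.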

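Standardizing External Relations to the Interval [0,1]

$$
\text{standardized value} = \frac{\text{value} - \min(\text{value})}{\max(\text{value}) - \min(\text{value})}
$$

A tilde marks the standardized value of a relation:

$$
\widetilde{DC}_{i,j} = \widetilde{CD}_{j,i} = \frac{DC_{i,j} + r}{2r}
$$

$$
\widetilde{SC}_{k,j} = \widetilde{CS}_{j,k} = \frac{SC_{k,j} + n(r-2)}{2n(r-1)}
$$

$$
\widetilde{SD}_{k,i} = \widetilde{DS}_{i,k} = \frac{SD_{k,i} + m(r-2)}{2m(r-1)}
$$

n is the number of instances

m is the number of classes

r is the number of classifiers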
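The three standardization formulas can be written as small functions. This is a sketch with illustrative function names. The extreme raw values used in the checks follow from the definitions: DC lies in [-r, r], SC in [-n(r-2), nr] and SD in [-m(r-2), mr].

```python
def standardize_dc(dc, r):
    """Map DC from [-r, r] to [0, 1]."""
    return (dc + r) / (2 * r)

def standardize_sc(sc, n, r):
    """Map SC from [-n(r-2), n*r] to [0, 1]."""
    return (sc + n * (r - 2)) / (2 * n * (r - 1))

def standardize_sd(sd, m, r):
    """Map SD from [-m(r-2), m*r] to [0, 1]."""
    return (sd + m * (r - 2)) / (2 * m * (r - 1))

# Example sizes: n = 3 instances, m = 5 classes, r = 4 classifiers.
n, m, r = 3, 5, 4
print(standardize_dc(-r, r), standardize_dc(r, r))                       # 0.0 1.0
print(standardize_sc(-n * (r - 2), n, r), standardize_sc(n * r, n, r))   # 0.0 1.0
print(standardize_sd(-m * (r - 2), m, r), standardize_sd(m * r, m, r))   # 0.0 1.0
```

The lower bound -n(r-2) is reached when a classifier votes against all r-1 others on every instance: its own vote is counted inside DC, so each term is at worst -(r-2) rather than -r.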

Competence of a Classifier

[Figure: a classifier placed between the instances (instance Di, conceptual pattern of features) and the classes (class Cj, conceptual pattern of class definition), showing its competence in the instance area and its competence in the area of classes.]

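Classifier’s Evaluation: Competence Quality in an Instance Area

$$
Q^{D}(S_k) = \frac{1}{n} \sum_{i=1}^{n} \widetilde{SD}_{k,i}
$$

where $\widetilde{SD}_{k,i}$ is the value of SDk,i standardized to [0,1]. This is a measure of the “classification abilities” of a classifier relative to instances, from the support point of view.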

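Classifier’s Evaluation: Competence Quality in the Area of Classes

$$
Q^{C}(S_k) = \frac{1}{m} \sum_{j=1}^{m} \widetilde{SC}_{k,j}
$$

where $\widetilde{SC}_{k,j}$ is the value of SCk,j standardized to [0,1]. This is a measure of the “classification abilities” of a classifier in the correct use of classes, from the support point of view.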

Quality Balance Theorem

$$
Q^{D}(S_k) = Q^{C}(S_k)
$$

The evaluation of a classifier’s competence (ranking, weighting, quality evaluation) does not depend on the competence area, “real world of instances” or “conceptual world of classes”, because both competence values are always equal.

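Proof

Here a tilde marks a relation value standardized to [0,1].

$$
\begin{aligned}
Q^{D}(S_k) &= \frac{1}{n}\sum_{i=1}^{n}\widetilde{SD}_{k,i}
 = \frac{1}{n}\sum_{i=1}^{n}\frac{SD_{k,i} + m(r-2)}{2m(r-1)} \\
&= \frac{1}{n}\sum_{i=1}^{n}\frac{\sum_{j=1}^{m}\bigl(DC_{i,j}\cdot P(D_i,C_j,S_k)\bigr) + m(r-2)}{2m(r-1)} = \dots \\
\dots &= \frac{1}{m}\sum_{j=1}^{m}\frac{\sum_{i=1}^{n}\bigl(DC_{i,j}\cdot P(D_i,C_j,S_k)\bigr) + n(r-2)}{2n(r-1)} \\
&= \frac{1}{m}\sum_{j=1}^{m}\frac{SC_{k,j} + n(r-2)}{2n(r-1)}
 = \frac{1}{m}\sum_{j=1}^{m}\widetilde{SC}_{k,j} = Q^{C}(S_k)
\end{aligned}
$$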
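The theorem can also be checked numerically. The sketch below draws random votes (the sizes, the seed and the function name `qualities` are arbitrary choices for illustration) and compares the two competence values for every classifier.

```python
import random

def qualities(P):
    """Return a list of (Q_D, Q_C) pairs, one per classifier, for votes P[k][i][j]."""
    r, n, m = len(P), len(P[0]), len(P[0][0])
    dc = [[sum(P[k][i][j] for k in range(r)) for j in range(m)] for i in range(n)]
    pairs = []
    for k in range(r):
        # Raw support of S_k on each instance (SD) and on each class (SC).
        sd = [sum(dc[i][j] * P[k][i][j] for j in range(m)) for i in range(n)]
        sc = [sum(dc[i][j] * P[k][i][j] for i in range(n)) for j in range(m)]
        # Averages of the standardized values.
        q_d = sum((v + m * (r - 2)) / (2 * m * (r - 1)) for v in sd) / n
        q_c = sum((v + n * (r - 2)) / (2 * n * (r - 1)) for v in sc) / m
        pairs.append((q_d, q_c))
    return pairs

random.seed(0)
r, n, m = 4, 3, 5
P = [[[random.choice((-1, 0, 1)) for _ in range(m)] for _ in range(n)]
     for _ in range(r)]

for q_d, q_c in qualities(P):
    # The two values agree up to floating-point rounding.
    assert abs(q_d - q_c) < 1e-12
```

Both averages reduce to the same double sum over instances and classes divided by 2nm(r-1), which is why the order of summation does not matter.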

An Example

Let us suppose that four classifiers have to classify three papers submitted to a conference with five conference topics.

Each classifier should select the appropriate conference topics for every paper.

The final goal is to obtain a cooperative result of all the classifiers concerning the “paper - topic” relation.

C (classes) Set in the Example

Classes - Conference Topics     Notation

AI and Intelligent Systems      C1

Analytical Technique            C2

Real-Time Systems               C3

Virtual Reality                 C4

Formal Methods                  C5

S (classifiers) Set in the Example

Classifiers - “Referees”        Notation

A.B.                            S1

H.R.                            S2

M.L.                            S3

R.S.                            S4

D (instances) Set in the Example

Instances                       Notation

Paper 1                         D1

Paper 2                         D2

Paper 3                         D3

Selections Made for the Instance “Paper 1”

D1

P(D,C,S)   C1     C2      C3     C4     C5

S1          1     -1      -1      0     -1

S2          0+    -1**     0++    1*    -1***

S3          0      0      -1      1      0

S4          1     -1       0      0      1

Classifier H.R. considers “Paper 1” to fit the topic Virtual Reality* and refuses to include it in Analytical Technique** or Formal Methods***. H.R. does not choose or refuse to choose the AI and Intelligent Systems+ or Real-Time Systems++ topics to classify “Paper 1”.


Selections Made for the Instance “Paper 2”

D2

P          C1     C2      C3     C4     C5

S1         -1      0      -1      0      1

S2          1     -1      -1      0      0

S3          1     -1       0      1      1

S4         -1      0       0      1      0

Selections Made for the Instance “Paper 3”

D3

P          C1     C2      C3     C4     C5

S1          1      0       1     -1      0

S2          0      1       0     -1      1

S3         -1     -1       1     -1      1

S4         -1     -1       1     -1      1

Result of Cooperative Paper Classification Based on DC Relation

Paper 1: AI and Intelligent Systems, Virtual Reality, NOT Analytical Technique, NOT Real-Time Systems

Paper 2: Virtual Reality, Formal Methods, NOT Analytical Technique, NOT Real-Time Systems

Paper 3: Real-Time Systems, Formal Methods, NOT Virtual Reality
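The cooperative result can be reproduced from the three selection tables by summing the votes per topic. The decision rule used below, accepting a topic when DC is at least 2 and refusing it when DC is at most -2, is an assumption: the slides do not state the rule, but this threshold matches all three listed results.

```python
TOPICS = ["AI and Intelligent Systems", "Analytical Technique",
          "Real-Time Systems", "Virtual Reality", "Formal Methods"]

# VOTES[paper][k][j] = P(D_i, C_j, S_k); rows S1..S4, columns C1..C5.
VOTES = {
    "Paper 1": [[1, -1, -1, 0, -1], [0, -1, 0, 1, -1],
                [0, 0, -1, 1, 0], [1, -1, 0, 0, 1]],
    "Paper 2": [[-1, 0, -1, 0, 1], [1, -1, -1, 0, 0],
                [1, -1, 0, 1, 1], [-1, 0, 0, 1, 0]],
    "Paper 3": [[1, 0, 1, -1, 0], [0, 1, 0, -1, 1],
                [-1, -1, 1, -1, 1], [-1, -1, 1, -1, 1]],
}

SUPPORT_THRESHOLD = 2  # assumed cut-off, not given in the slides

def cooperative_result(rows):
    """Return the DC values of one paper plus the accepted and refused topics."""
    dc = [sum(column) for column in zip(*rows)]
    chosen = [t for t, v in zip(TOPICS, dc) if v >= SUPPORT_THRESHOLD]
    refused = [t for t, v in zip(TOPICS, dc) if v <= -SUPPORT_THRESHOLD]
    return dc, chosen, refused

for paper, rows in VOTES.items():
    dc, chosen, refused = cooperative_result(rows)
    print(paper, dc, chosen, ["NOT " + t for t in refused])
```

For “Paper 1” the DC values are [2, -3, -2, 2, -1], so Formal Methods (-1) falls below the assumed threshold in either direction and is left out of the result.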

Results of Classifiers’ Competence Evaluation (based on SC and SD sets)

… Proposals obtained from the classifier A.B. should be accepted if they concern the topics Real-Time Systems and Virtual Reality or the instances “Paper 1” and “Paper 3”, and these proposals should be rejected if they concern AI and Intelligent Systems or “Paper 2”. In some cases it seems possible to accept classification proposals from the classifier A.B. if they concern Analytical Technique and Formal Methods. All four classifiers are expected to give acceptable proposals concerning “Paper 3”, and only the suggestion of the classifier M.L. can be accepted if it concerns “Paper 2” ...
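The competence statements quoted above can be checked against the standardized SD and SC values of the example. The sketch assumes that a proposal is acceptable when the standardized value is above 0.5, rejected when it is below 0.5, and borderline at exactly 0.5. This reading is an assumption, chosen because it agrees with the quoted text.

```python
REFEREES = ["A.B.", "H.R.", "M.L.", "R.S."]

# VOTES[i][k][j] = P(D_i, C_j, S_k); papers D1..D3, rows S1..S4, columns C1..C5.
VOTES = [
    [[1, -1, -1, 0, -1], [0, -1, 0, 1, -1], [0, 0, -1, 1, 0], [1, -1, 0, 0, 1]],
    [[-1, 0, -1, 0, 1], [1, -1, -1, 0, 0], [1, -1, 0, 1, 1], [-1, 0, 0, 1, 0]],
    [[1, 0, 1, -1, 0], [0, 1, 0, -1, 1], [-1, -1, 1, -1, 1], [-1, -1, 1, -1, 1]],
]
n, r, m = 3, 4, 5
DC = [[sum(VOTES[i][k][j] for k in range(r)) for j in range(m)] for i in range(n)]

def sd_std(k, i):
    """Standardized support of classifier k on instance i."""
    raw = sum(DC[i][j] * VOTES[i][k][j] for j in range(m))
    return (raw + m * (r - 2)) / (2 * m * (r - 1))

def sc_std(k, j):
    """Standardized support of classifier k on class j."""
    raw = sum(DC[i][j] * VOTES[i][k][j] for i in range(n))
    return (raw + n * (r - 2)) / (2 * n * (r - 1))

for k, name in enumerate(REFEREES):
    print(name,
          "papers:", [round(sd_std(k, i), 3) for i in range(n)],
          "topics:", [round(sc_std(k, j), 3) for j in range(m)])
```

For A.B. the topic values are 7/18, 9/18, 13/18, 10/18 and 9/18: below 0.5 for AI and Intelligent Systems, above it for Real-Time Systems and Virtual Reality, and exactly 0.5 for Analytical Technique and Formal Methods, the “in some cases” topics.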

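Deriving Internal Similarity Values

[Figure: a) via one intermediate set: the internal relation A’A”I between subsets A’ and A” of Set A is derived through Set I from the relations A’I and IA”. b) via two intermediate sets: the internal relation A’A”IJ is derived through Set I and Set J from the relations A’I, IJ and JA”.]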

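Internal Similarity for Classifiers: Instance-Based Similarity

[Figure: classifier subsets S’ and S’’ linked through the instance set D by the relations S’D and DS’’, giving the internal relation S’S’’D.]

$$
\forall S' \subseteq S, \ \forall S'' \subseteq S: \quad S'S''^{D} = S'D \times DS''
$$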

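Internal Similarity for Classifiers: Class-Based Similarity

[Figure: classifier subsets S’ and S’’ linked through the class set C by the relations S’C and CS’’, giving the internal relation S’S’’C.]

$$
\forall S' \subseteq S, \ \forall S'' \subseteq S: \quad S'S''^{C} = S'C \times CS''
$$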

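Internal Similarity for Classifiers: Class-Instance-Based Similarity

[Figure: classifier subsets S’ and S’’ linked through the class set C and the instance set D by the relations S’C, CD and DS’’, giving the internal relation S’S’’CD.]

$$
\forall S' \subseteq S, \ \forall S'' \subseteq S: \quad S'S''^{CD} = S'C \times CD \times DS''
$$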
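As an illustration of the instance-based case, the sketch below assumes that the composition of the relations S’D and DS’’ is an ordinary matrix product of the raw support matrices, with DS taken as the transpose of SD. The composition operator is not legible in the slides, so this is one possible reading rather than the authors' stated method. The SD values are hypothetical.

```python
def transpose(matrix):
    """Swap rows and columns of a list-of-lists matrix."""
    return [list(column) for column in zip(*matrix)]

def compose(left, right):
    """Matrix product, used here as the assumed composition of two relations."""
    return [[sum(a * b for a, b in zip(row, column)) for column in zip(*right)]
            for row in left]

# Hypothetical raw SD values: 3 classifiers (rows) by 2 instances (columns).
SD = [
    [5, 2],
    [3, 3],
    [5, 0],
]
DS = transpose(SD)

# SS_D[k1][k2]: instance-based similarity between classifiers k1 and k2.
SS_D = compose(SD, DS)
print(SS_D)  # [[29, 21, 25], [21, 18, 15], [25, 15, 25]]
```

Under this reading the result is symmetric, and two classifiers score high when they receive strong support on the same instances.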

Conclusion

We discussed methods for deriving the total support of each binary similarity relation. This can be used, for example, to derive the most supported classification result and to evaluate the classifiers according to their competence.

We also discussed relations between elements taken from the same set: instances, classes, or classifiers. This can be used, for example, to divide classifiers into groups of similar competence relative to the instance-class environment.
