AI - Fuzzy - Decision Trees



    Fuzzy Decision Trees

    Professor J. F. Baldwin

Classification and Prediction

For classification, the universe of the target attribute is a discrete set. For prediction, the universe of the target attribute is continuous.

For prediction, use a fuzzy partition of the target universe T:

[Figure: five triangular fuzzy sets f1, f2, f3, f4, f5 on the target universe T, peaking at the points a, b, c, d, e respectively.]

Arrange the fuzzy sets so that there are equal numbers of training data points in each of the intervals [a, b], [b, c], [c, d], [d, e].
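As a concrete illustration, here is a minimal Python sketch of such an equal-frequency partition, assuming triangular fuzzy sets that peak at one partition point and fall to zero at the neighbouring points (the function and variable names are illustrative, not from the slides):

```python
import numpy as np

def equal_frequency_points(values, n_sets=5):
    """Peak points a..e chosen as quantiles of the training values, so
    each interval between consecutive peaks contains an equal number
    of training data points."""
    return np.quantile(values, np.linspace(0.0, 1.0, n_sets))

def triangular_memberships(x, peaks):
    """Membership of x in each triangular fuzzy set: set i peaks at
    peaks[i] and falls linearly to 0 at the neighbouring peaks."""
    mu = np.zeros(len(peaks))
    if x <= peaks[0]:
        mu[0] = 1.0
    elif x >= peaks[-1]:
        mu[-1] = 1.0
    else:
        i = np.searchsorted(peaks, x) - 1   # x lies in [peaks[i], peaks[i+1]]
        frac = (x - peaks[i]) / (peaks[i + 1] - peaks[i])
        mu[i], mu[i + 1] = 1.0 - frac, frac
    return mu
```

With this arrangement any value has at most two non-zero memberships, and they sum to 1 - the property relied on later when new cases are evaluated.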


Target translation for prediction

Each row of the training set Tr

A1    A2    …    An    T    Pr
a11   a12   …    a1n   t1   p1

is translated into two rows, one for each target fuzzy set with non-zero membership at t1, where χ_f denotes the membership function of f:

A1    A2    …    An    T        Pr
a11   a12   …    a1n   f_i      p1 χ_{f_i}(t1)
a11   a12   …    a1n   f_{i+1}  p1 χ_{f_{i+1}}(t1)

Repeat for each row, collecting equivalent rows and adding their probabilities. This gives the translated training set Tr', which is now a classification set.
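A minimal sketch of this translation step, reusing the triangular_memberships helper from the sketch above and assuming each row of Tr is represented as an (attrs, t, p) triple with attrs a hashable tuple (a representation chosen here, not given in the slides):

```python
def translate_target(tr, peaks):
    """Target translation for prediction: each row (attrs, t, p) of Tr
    is replaced by one row per target fuzzy set f_i with chi_fi(t) > 0,
    carrying probability p * chi_fi(t).  Equivalent rows are collected
    by adding probabilities, giving the classification set Tr'."""
    merged = {}
    for attrs, t, p in tr:
        for i, chi in enumerate(triangular_memberships(t, peaks)):
            if chi > 0.0:
                key = (attrs, i)          # attribute values plus class f_i
                merged[key] = merged.get(key, 0.0) + p * chi
    return [(attrs, i, pr) for (attrs, i), pr in merged.items()]
```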

Preparing the one-attribute reduced database for a continuous attribute

Form the table (Ai, T, Pr) - from Tr' if prediction, from Tr if classification - by summing out all the other attributes:

Pr(A_i, T) = Σ_{A_1} … Σ_{A_{i-1}} Σ_{A_{i+1}} … Σ_{A_n} Pr(A_1, …, A_n, T)

For the continuous attribute Ai, choose the number of fuzzy sets and partition its universe as before:

[Figure: triangular fuzzy sets g1, g2, g3, g4, g5 on the universe of Ai, peaking at points a, b, c, d, e, with an equal number of data points in each interval.]

Translating the continuous Ai values onto the gi, in the same way as the target translation above, gives the reduced database (gi, T, Pr).
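The marginalisation can be sketched directly from the formula. Here the rows of Tr' are the (attrs, target, probability) triples produced by translate_target above, and the attribute tuple is indexed by position (again an assumed representation):

```python
from collections import defaultdict

def one_attribute_reduced(tr_prime, i):
    """Pr(A_i, T): sum the probabilities in Tr' over every attribute
    except A_i, leaving a table keyed by (value of A_i, target)."""
    table = defaultdict(float)
    for attrs, t, p in tr_prime:
        table[(attrs[i], t)] += p
    return dict(table)
```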


Fuzzy ID3

Using the training set Tr' and the one-attribute reduced databases for all continuous attributes, we can use the ID3 method previously given to determine the decision tree for predicting or classifying the target, and also to post-prune it.

We modify the stopping condition: do not expand node N if the entropy

S = −Σ_T Pr(T) Ln Pr(T)

for that node is < some value v. Node N will then be a leaf with probability distribution {t_i : θ_i} over the target values.

You can also limit the depth of the tree to some value, for example expand the tree only to depth 4.
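A sketch of the modified stopping test; the threshold v = 0.1 is an illustrative value (the slides leave v unspecified) and the depth limit of 4 is the example the slides give:

```python
import math

def should_expand(target_dist, depth, v=0.1, max_depth=4):
    """Fuzzy ID3 stopping test: expand node N only if its target
    distribution {t: Pr(t)} is still impure (entropy S >= v) and the
    depth limit has not yet been reached."""
    s = -sum(p * math.log(p) for p in target_dist.values() if p > 0.0)
    return s >= v and depth < max_depth
```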

Evaluating a new case for classification

[Figure: a node testing attribute Ai, with branches for the fuzzy sets g1, g2, …, gn.]

The value of a continuous attribute will have a probability distribution over {g_i}, with only 2 non-zero probabilities.

A new case will therefore propagate through many branches of the tree, arriving at leaf node N_j with probability β_j, determined by multiplying the probabilities of all branches on the path to N_j.

Let the distributions for the leaf nodes be N_j : {t_i : θ_ij}. The overall distribution is

{t_i : Σ_j β_j θ_ij}

Decision: choose t_k where max_i Σ_j β_j θ_ij = Σ_j β_j θ_kj.
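A sketch of this evaluation, assuming the propagation step has already produced the list of reached leaves as (β_j, leaf distribution) pairs (this representation is chosen here, not given in the slides):

```python
def classify(leaves):
    """Combine the leaves reached by a new case.  Each element of
    `leaves` is (beta_j, theta_j): beta_j is the product of the branch
    probabilities on the path to leaf N_j, and theta_j maps each target
    class t_i to theta_ij.  Returns the class t_k maximising
    sum_j beta_j * theta_ij, together with the overall distribution."""
    overall = {}
    for beta, theta in leaves:
        for t, th in theta.items():
            overall[t] = overall.get(t, 0.0) + beta * th
    return max(overall, key=overall.get), overall
```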


Evaluating a new case for prediction

[Figure: a node testing attribute Ai, with branches for the fuzzy sets g1, g2, …, gn.]

As before, the value of a continuous attribute has a probability distribution over {g_i} with only 2 non-zero probabilities, and a new case propagates through many branches of the tree, arriving at leaf node N_j with probability β_j determined by multiplying the probabilities of all branches on the path to N_j.

Let the distributions for the leaf nodes be N_j : {f_i : θ_ij}. The overall distribution is {f_i : Σ_j β_j θ_ij}, and the

predicted value = Σ_i μ(f_i) Σ_j β_j θ_ij

where μ(f_i) is a representative defuzzified value of the target fuzzy set f_i.
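The prediction case differs from classification only in the final defuzzification. In this sketch μ(f_i) is taken to be the peak of the target fuzzy set f_i - an assumption, since the slides do not define μ:

```python
def predict(leaves, mu):
    """Defuzzified prediction: form the overall distribution over the
    target fuzzy sets {f_i} exactly as for classification, then replace
    each f_i by a representative value mu[i] (here assumed to be its
    peak) and return the probability-weighted sum."""
    overall = {}
    for beta, theta in leaves:
        for i, th in theta.items():
            overall[i] = overall.get(i, 0.0) + beta * th
    return sum(mu[i] * p for i, p in overall.items())
```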

Fuzzy Sets important for Data Mining

[Figure: profit as a function of income and outgoing, each on the universe [0, 1]; each universe is partitioned with the two fuzzy sets {small, large}.]

[Figure: a decision tree branching on OUTGOING {small, large} and then on INCOME {small, large}, with leaf predictions profit: 0.543, profit: 0.874, profit: 0.165 and profit: 0.543.]

This gives 94.14% correct prediction of profit.

Two crisp sets on each universe can give at most only 50% accuracy. We would require 16 crisp sets on each universe to give the same accuracy as a two-fuzzy-set partition.


Ellipse Example

[Figure: an ellipse of legal points inside the square [-1.5, 1.5] × [-1.5, 1.5], with illegal points outside it; axes X and Y.]

The X and Y universes are each partitioned into 5 fuzzy sets:

about_-1.5  = [-1.5:1, -0.75:0]
about_-0.75 = [-1.5:0, -0.75:1, 0:0]
about_0     = [-0.75:0, 0:1, 0.75:0]
about_0.75  = [0:0, 0.75:1, 1.5:0]
about_1.5   = [0.75:0, 1.5:1]

The tree was learnt on 126 random points from [-1.5, 1.5]².
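These Fril-style [point: membership] lists define piecewise-linear fuzzy sets. A small sketch of how they can be evaluated in Python, transcribing the definitions above (np.interp handles the two shoulder sets by holding the end value outside the listed points):

```python
import numpy as np

# The slide's [point: membership] definitions as (points, values) pairs.
partition = {
    "about_-1.5":  ([-1.5, -0.75],       [1, 0]),
    "about_-0.75": ([-1.5, -0.75, 0.0],  [0, 1, 0]),
    "about_0":     ([-0.75, 0.0, 0.75],  [0, 1, 0]),
    "about_0.75":  ([0.0, 0.75, 1.5],    [0, 1, 0]),
    "about_1.5":   ([0.75, 1.5],         [0, 1]),
}

def membership(x, points, values):
    """Piecewise-linear interpolation through the (point, value) pairs."""
    return float(np.interp(x, points, values))

# e.g. membership(0.3, *partition["about_0"]) -> 0.6
```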

Tree for the Ellipse example

[Figure: the learnt tree. The root tests X with the five branches about_-1.5 … about_1.5; the three middle branches lead to nodes testing Y with the same five fuzzy sets. Each leaf carries a (legal, illegal) probability pair, from L:0 I:1 at the about_-1.5 branch to L:1 I:0 at the centre; the full set of leaf supports is listed in the Fril rule below.]


General Fril Rule

((Classification = legal) if (
  ((X is about_-1.5))
  ((X is about_-0.75) & (Y is about_-1.5))
  ((X is about_-0.75) & (Y is about_-0.75))
  ((X is about_-0.75) & (Y is about_0))
  ((X is about_-0.75) & (Y is about_0.75))
  ((X is about_-0.75) & (Y is about_1.5))
  ((X is about_0) & (Y is about_-1.5))
  ((X is about_0) & (Y is about_-0.75))
  ((X is about_0) & (Y is about_0))
  ((X is about_0) & (Y is about_0.75))
  ((X is about_0) & (Y is about_1.5))
  ((X is about_0.75) & (Y is about_-1.5))
  ((X is about_0.75) & (Y is about_-0.75))
  ((X is about_0.75) & (Y is about_0))
  ((X is about_0.75) & (Y is about_0.75))
  ((X is about_0.75) & (Y is about_1.5))
  ((X is about_1.5)))) :
((0 0) (0.0092 0.0092) (0.3506 0.3506) (0.5090 0.5090) (0.3455 0.3455)
 (0.0131 0.0131) (0.1352 0.1352) (0.8131 0.8131) (1 1) (0.8178 0.8178)
 (0.1327 0.1327) (0.0109 0.0109) (0.3629 0.3629) (0.5090 0.5090)
 (0.3455 0.3455) (0.0131 0.0131) (0 0))

Results

The above tree was tested on 960 points forming a regular grid on [-1.5, 1.5]², giving 99.168% correct classification.

[Figure: the control surface for the positive quadrant.]


Iris Classification

Data: 3 classes - Iris-Setosa, Iris-Versicolor and Iris-Virginica - with 50 instances of each class.

Attributes:
1. sepal length in cm - universe [4.3, 7.9]
2. sepal width in cm - universe [2, 4.4]
3. petal length in cm - universe [1, 6.9]
4. petal width in cm - universe [0.1, 2.5]

A fuzzy partition of 5 fuzzy sets is used on each universe.

Iris Decision Tree

[Figure: the learnt Iris tree. The root tests attribute 4 (petal width) with the branches v_small4 … v_large4; deeper nodes test attribute 3 (petal length), then attributes 2 (sepal width) and 1 (sepal length), each partitioned into {v_small, small, medium, large, v_large}. Each leaf carries a probability distribution over the three classes, e.g. v_small4 → (1 0 0) for Iris-Setosa.]

Gives 98.667% accuracy on test data.


Diabetes in Pima Indians

Data: 768 females over 21 years - 384 training, 384 test. 2 classes - diabetic (d) and not diabetic (nd).

Attributes:
1. Number of times pregnant
2. Plasma glucose concentration
3. Diastolic blood pressure
4. Triceps skin fold thickness
5. 2-hour serum insulin
6. Body mass index
7. Diabetes pedigree function
8. Age

Diabetes mellitus in the Pima Indian population living near Phoenix, Arizona - 5 fuzzy sets used for each attribute.

The decision tree was generated to a maximum depth of 4, giving a tree of 161 branches. This gave an accuracy of 81.25% on the training set and 79.9% on the test set.

With the forward pruning algorithm the tree complexity is halved to 80 branches. This reduced tree gives an accuracy of 80.46% on the training set and 78.38% on the test set.

Post pruning reduces the complexity to 28 branches, giving 78.125% on the training set and 78.9% on the test set.

Diabetes Tree

[Figure: Decision Tree for the Pima Indian problem. The root tests attribute 2 (plasma glucose) with the branches v_small2 … v_large2; deeper nodes test attributes 8 (age), 7 (diabetes pedigree function), 3 (blood pressure), 6 (body mass index) and 5 (serum insulin). Each leaf carries a probability pair over (nd: not diabetic, d: diabetic), e.g. v_small2 & v_small8 → (nd:0.99 d:0.01).]


SIN XY Prediction Example

The database consists of 528 triples (X, Y, sin XY), where the pairs (X, Y) form a regular grid on [0, 3]².

Input fuzzy sets on X and Y:

about_0      = [0:1, 0.333333:0]
about_0.3333 = [0:0, 0.333333:1, 0.666667:0]
about_0.6667 = [0.333333:0, 0.666667:1, 1:0]
about_1      = [0.666667:0, 1:1, 1.33333:0]
about_1.333  = [1:0, 1.33333:1, 1.66667:0]
about_1.667  = [1.33333:0, 1.66667:1, 2:0]
about_2      = [1.66667:0, 2:1, 2.33333:0]
about_2.333  = [2:0, 2.33333:1, 2.66667:0]
about_2.6667 = [2.33333:0, 2.66667:1, 3:0]
about_3      = [2.66667:0, 3:1]

Target fuzzy sets:

class_1 = [-1:1, 0:0]
class_2 = [-1:0, 0:1, 0.380647:0]
class_3 = [0:0, 0.380647:1, 0.822602:0]
class_4 = [0.380647:0, 0.822602:1, 1:0]
class_5 = [0.822602:0, 1:1]

Fuzzy ID3 gives a decision tree with 100 branches.

[Figure: the sin XY control surface.]

Percentage error of 4.22% on a regular test set of 1023 points.