Presentation on Decision Trees. Presented to: Sir Marooof Pasha.


Description:

Decision Trees: rules for classifying data using attributes. The tree consists of decision nodes and leaf nodes. A decision node has two or more branches, each representing values for the attribute tested. A leaf node represents a classification and requires no additional testing.

Transcript of: Presentation on Decision Trees. Presented to: Sir Marooof Pasha.

Page 1:

Presentation on Decision Trees
Presented to: Sir Marooof Pasha

Page 2:

Group members:
Kiran Shakoor 07-05
Nazish Yaqoob 07-11
Razeena Ameen 07-25
Ayesha Yaseen 07-31
Nudrat Rehman 07-47

Page 3:

Decision Trees
Rules for classifying data using attributes.
The tree consists of decision nodes and leaf nodes.
A decision node has two or more branches, each representing values for the attribute tested.
A leaf node represents a classification and requires no additional testing.
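A minimal sketch of these two node types in Python (the names LeafNode, DecisionNode and classify are illustrative, not from the slides):

class LeafNode:
    """Terminal node: holds a class label; no further testing is needed."""
    def __init__(self, label):
        self.label = label

class DecisionNode:
    """Internal node: tests one attribute and has one branch per attribute value."""
    def __init__(self, attribute, branches):
        self.attribute = attribute   # e.g. "Outlook"
        self.branches = branches     # dict mapping attribute value -> child node

def classify(node, example):
    """Follow the matching branches down to a leaf and return its label."""
    while isinstance(node, DecisionNode):
        node = node.branches[example[node.attribute]]
    return node.label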

Page 4:

The Process of Constructing a Decision Tree
Select an attribute to place at the root of the decision tree and make one branch for every possible value.
Repeat the process recursively for each branch (a sketch of this recursion follows).
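A minimal sketch of that two-step process, assuming examples are dicts with a "class" key and that choose_attribute is some attribute-selection rule (ID3's information gain, introduced on later slides, is one choice). Plain nested dicts stand in for the node classes sketched earlier:

from collections import Counter

def build_tree(examples, attributes, choose_attribute):
    """Return a class label (leaf) or a nested dict {attribute: {value: subtree}}."""
    labels = [e["class"] for e in examples]
    if len(set(labels)) == 1 or not attributes:
        return Counter(labels).most_common(1)[0][0]   # leaf: (majority) class
    best = choose_attribute(examples, attributes)     # attribute placed at this node
    remaining = [a for a in attributes if a != best]
    return {best: {value: build_tree([e for e in examples if e[best] == value],
                                     remaining, choose_attribute)
                   for value in sorted({e[best] for e in examples})}}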

Page 5:

Decision Trees Example

[Figure: a small example tree. The root tests Age (<30 vs >=30); a Car Type node separates Minivan from Sports and Truck; the leaves are labelled YES and NO. The slide also shows the examples plotted against Age (0 to 60).]

Page 6:

ID code   Outlook    Temperature   Humidity   Windy   Play
a         Sunny      Hot           High       False   No
b         Sunny      Hot           High       True    No
c         Overcast   Hot           High       False   Yes
d         Rainy      Mild          High       False   Yes
e         Rainy      Cool          Normal     False   Yes
f         Rainy      Cool          Normal     True    No
g         Overcast   Cool          Normal     True    Yes
h         Sunny      Mild          High       False   No
i         Sunny      Cool          Normal     False   Yes
j         Rainy      Mild          Normal     False   Yes
k         Sunny      Mild          Normal     True    Yes
l         Overcast   Mild          High       True    Yes
m         Overcast   Hot           Normal     False   Yes
n         Rainy      Mild          High       True    No
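The same weather dataset written out as Python data, so the entropy and information-gain sketches later in this transcript can be run against it (the variable name weather is illustrative):

weather = [
    {"Outlook": o, "Temperature": t, "Humidity": h, "Windy": w, "class": play}
    for o, t, h, w, play in [
        ("Sunny",    "Hot",  "High",   False, "No"),
        ("Sunny",    "Hot",  "High",   True,  "No"),
        ("Overcast", "Hot",  "High",   False, "Yes"),
        ("Rainy",    "Mild", "High",   False, "Yes"),
        ("Rainy",    "Cool", "Normal", False, "Yes"),
        ("Rainy",    "Cool", "Normal", True,  "No"),
        ("Overcast", "Cool", "Normal", True,  "Yes"),
        ("Sunny",    "Mild", "High",   False, "No"),
        ("Sunny",    "Cool", "Normal", False, "Yes"),
        ("Rainy",    "Mild", "Normal", False, "Yes"),
        ("Sunny",    "Mild", "Normal", True,  "Yes"),
        ("Overcast", "Mild", "High",   True,  "Yes"),
        ("Overcast", "Hot",  "Normal", False, "Yes"),
        ("Rainy",    "Mild", "High",   True,  "No"),
    ]
]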

Page 7:

Decision Tree Example

[Figure: example decision tree.]

Page 8:

Advantages of decision trees
Simple to understand and interpret.
Able to handle both numerical and categorical data.
Possible to validate a model using statistical tests.

Page 9:

Nudrat Rehman 07-47

Page 10:

What is ID3?
A mathematical algorithm for building the decision tree.
Invented by J. Ross Quinlan in 1979.
Uses Information Theory, invented by Shannon in 1948.
Builds the tree from the top down, with no backtracking.
Information Gain is used to select the most useful attribute for classification.

Page 11:

Entropy
A formula to calculate the homogeneity of a sample.
A completely homogeneous sample has entropy of 0.
An equally divided sample has entropy of 1.
Entropy(S) = -p+ log2(p+) - p- log2(p-) for a sample of positive and negative elements.

Page 12:

The formula for entropy (for a sample S with a proportion p+ of positive and p- of negative examples):

Entropy(S) = -p+ log2(p+) - p- log2(p-)

Page 13:

Entropy Example
Entropy(S) = -(9/14) log2(9/14) - (5/14) log2(5/14) = 0.940
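A quick Python check of this arithmetic (a sketch; the entropy helper below is not from the slides):

from math import log2

def entropy(pos, neg):
    """Binary entropy of a sample with pos positive and neg negative examples."""
    total = pos + neg
    return -sum(c / total * log2(c / total) for c in (pos, neg) if c)

print(round(entropy(9, 5), 3))   # 0.94, matching the 0.940 above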

Page 14:

Information Gain (IG)
The information gain is based on the decrease in entropy after a dataset is split on an attribute.
Which attribute creates the most homogeneous branches?
First the entropy of the total dataset is calculated.
The dataset is then split on the different attributes.
The entropy for each branch is calculated, then added proportionally to get the total entropy for the split.
The resulting entropy is subtracted from the entropy before the split.
The result is the Information Gain, or decrease in entropy.
The attribute that yields the largest IG is chosen for the decision node.
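A sketch of that procedure in Python: entropy before the split, minus the size-weighted entropy of each branch. The example call uses the class counts for the Outlook split of the weather data on Page 6 (the function names are illustrative):

from math import log2

def entropy(counts):
    """Entropy of a sample given its per-class counts, e.g. (9, 5)."""
    total = sum(counts)
    return -sum(c / total * log2(c / total) for c in counts if c)

def information_gain(parent_counts, branch_counts):
    """parent_counts: class counts before the split; branch_counts: one count tuple per branch."""
    total = sum(parent_counts)
    split_entropy = sum(sum(b) / total * entropy(b) for b in branch_counts)
    return entropy(parent_counts) - split_entropy

# Outlook split of the 14 weather examples (9 Yes, 5 No):
#   Sunny -> 2 Yes / 3 No,  Overcast -> 4 Yes / 0 No,  Rainy -> 3 Yes / 2 No
print(round(information_gain((9, 5), [(2, 3), (4, 0), (3, 2)]), 3))   # ~0.247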

Page 15:

Information Gain (cont'd)
A branch set with entropy of 0 is a leaf node.
Otherwise, the branch needs further splitting to classify its dataset.
The ID3 algorithm is run recursively on the non-leaf branches, until all data is classified.
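Putting the last few slides together, a compact sketch of this ID3-style recursion, assuming examples are dicts with a "class" key (as in the weather-data sketch after Page 6); the names id3, entropy and gain are illustrative:

from collections import Counter
from math import log2

def entropy(examples):
    counts = Counter(e["class"] for e in examples)
    total = len(examples)
    return -sum(c / total * log2(c / total) for c in counts.values())

def gain(examples, attribute):
    total = len(examples)
    remainder = 0.0
    for value in {e[attribute] for e in examples}:
        subset = [e for e in examples if e[attribute] == value]
        remainder += len(subset) / total * entropy(subset)
    return entropy(examples) - remainder

def id3(examples, attributes):
    labels = [e["class"] for e in examples]
    if len(set(labels)) == 1 or not attributes:        # entropy 0, or nothing left to split on
        return Counter(labels).most_common(1)[0][0]    # leaf node
    best = max(attributes, key=lambda a: gain(examples, a))
    remaining = [a for a in attributes if a != best]
    return {best: {v: id3([e for e in examples if e[best] == v], remaining)
                   for v in sorted({e[best] for e in examples})}}

# e.g. id3(weather, ["Outlook", "Temperature", "Humidity", "Windy"])
# places Outlook at the root of the tree.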

Page 16:

Nazish Yaqoob 07-11

Page 17:

Example: The Simpsons

Page 18:

Person   Hair Length   Weight   Age   Class
Homer    0"            250      36    M
Marge    10"           150      34    F
Bart     2"            90       10    M
Lisa     6"            78       8     F
Maggie   4"            20       1     F
Abe      1"            170      70    M
Selma    8"            160      41    F
Otto     10"           180      38    M
Krusty   6"            200      45    M
Comic    8"            290      38    ?
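The labelled rows of this table as Python data, so the splits tried on the next three slides can be reproduced; Comic is left out because its class is unknown (the variable name simpsons is illustrative):

from collections import Counter

# (name, hair length in inches, weight, age, class)
simpsons = [
    ("Homer",   0, 250, 36, "M"),
    ("Marge",  10, 150, 34, "F"),
    ("Bart",    2,  90, 10, "M"),
    ("Lisa",    6,  78,  8, "F"),
    ("Maggie",  4,  20,  1, "F"),
    ("Abe",     1, 170, 70, "M"),
    ("Selma",   8, 160, 41, "F"),
    ("Otto",   10, 180, 38, "M"),
    ("Krusty",  6, 200, 45, "M"),
]

# Class counts for the "Weight <= 160" branch, for example:
print(Counter(cls for _, _, w, _, cls in simpsons if w <= 160))   # Counter({'F': 4, 'M': 1})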

Page 19:

Let us try splitting on Hair Length.

Hair Length <= 5?   (yes / no)

Entropy(S) = -(p/(p+n)) log2(p/(p+n)) - (n/(p+n)) log2(n/(p+n))
Gain(A) = E(current set) - Σ E(all child sets)

Entropy(4F,5M) = -(4/9) log2(4/9) - (5/9) log2(5/9) = 0.9911
Entropy(1F,3M) = -(1/4) log2(1/4) - (3/4) log2(3/4) = 0.8113   (yes branch)
Entropy(3F,2M) = -(3/5) log2(3/5) - (2/5) log2(2/5) = 0.9710   (no branch)

Gain(Hair Length <= 5) = 0.9911 - (4/9 * 0.8113 + 5/9 * 0.9710) = 0.0911

Page 20:

Let us try splitting on Weight.

Weight <= 160?   (yes / no)

Entropy(4F,5M) = -(4/9) log2(4/9) - (5/9) log2(5/9) = 0.9911
Entropy(4F,1M) = -(4/5) log2(4/5) - (1/5) log2(1/5) = 0.7219   (yes branch)
Entropy(0F,4M) = -(0/4) log2(0/4) - (4/4) log2(4/4) = 0   (no branch)

Gain(Weight <= 160) = 0.9911 - (5/9 * 0.7219 + 4/9 * 0) = 0.5900

Page 21:

Let us try splitting on Age.

Age <= 40?   (yes / no)

Entropy(4F,5M) = -(4/9) log2(4/9) - (5/9) log2(5/9) = 0.9911
Entropy(3F,3M) = -(3/6) log2(3/6) - (3/6) log2(3/6) = 1   (yes branch)
Entropy(1F,2M) = -(1/3) log2(1/3) - (2/3) log2(2/3) = 0.9183   (no branch)

Gain(Age <= 40) = 0.9911 - (6/9 * 1 + 3/9 * 0.9183) = 0.0183
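A quick check of the three candidate splits in Python, using the (F, M) class counts shown on the last three slides (a sketch; the helpers are not from the slides):

from math import log2

def entropy(f, m):
    total = f + m
    return -sum(c / total * log2(c / total) for c in (f, m) if c)

def gain(parent, branches):
    total = sum(parent)
    return entropy(*parent) - sum(sum(b) / total * entropy(*b) for b in branches)

print(round(gain((4, 5), [(1, 3), (3, 2)]), 4))   # Hair Length <= 5 : 0.0911
print(round(gain((4, 5), [(4, 1), (0, 4)]), 4))   # Weight <= 160   : 0.59
print(round(gain((4, 5), [(3, 3), (1, 2)]), 4))   # Age <= 40       : 0.0183

Weight gives by far the largest gain, so it is chosen for the root node.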

Page 22:

[Figure: the partially grown tree. The root tests Weight <= 160; its "yes" branch then tests Hair Length <= 2.]

Of the 3 features we had, Weight was best. But while people who weigh over 160 are perfectly classified (as males), the under-160 people are not perfectly classified... So we simply recurse!

This time we find that we can split on Hair Length, and we are done!

Page 23:

[Figure: the finished tree. The root tests Weight <= 160; "no" classifies as Male; "yes" leads to Hair Length <= 2, where "yes" classifies as Male and "no" as Female.]

We don't need to keep the data around, just the test conditions.

How would these people be classified?

Page 24:

It is trivial to convert Decision Trees to rules…

[Figure: the same tree as on the previous slide: Weight <= 160? "no" is Male; "yes" leads to Hair Length <= 2?, where "yes" is Male and "no" is Female.]

Rules to Classify Males/Females:
If Weight greater than 160, classify as Male
Else if Hair Length less than or equal to 2, classify as Male
Else classify as Female
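The same rules written directly as code, a minimal sketch (the function name and parameters are illustrative):

def classify(weight, hair_length):
    if weight > 160:
        return "Male"
    elif hair_length <= 2:
        return "Male"
    else:
        return "Female"

# The unlabelled "Comic" row from Page 18 (weight 290, hair 8") would be classified as:
print(classify(weight=290, hair_length=8))   # Male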