Classification & ID3


Page 1: Classification & ID3

Classification & ID3

Dr. Riggs

Spring 2004

Page 2: Classification & ID3


Classification Problem

• Given some number of observed features
• Predict an unobserved feature (the 'class')
  – Example: given the features of a borrower, predict whether he will default
• An interesting problem is learning such classification rules from examples

Page 3: Classification & ID3


Example Data (?id ?size ?color ?shape ?class)

• (item 1 medium blue brick yes)
• (item 2 small red sphere yes)
• (item 3 large green pillar yes)
• (item 4 large green sphere yes)
• (item 5 small red wedge no)
• (item 6 large red wedge no)
• (item 7 large red pillar no)
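The same learning set as a minimal Python sketch; the tuple layout (id, size, color, shape, class) and the name EXAMPLES are assumptions of these notes, reused by the entropy and gain sketches on later pages.

# The seven training examples above, one (id, size, color, shape, cls) tuple each.
EXAMPLES = [
    (1, "medium", "blue",  "brick",  "yes"),
    (2, "small",  "red",   "sphere", "yes"),
    (3, "large",  "green", "pillar", "yes"),
    (4, "large",  "green", "sphere", "yes"),
    (5, "small",  "red",   "wedge",  "no"),
    (6, "large",  "red",   "wedge",  "no"),
    (7, "large",  "red",   "pillar", "no"),
]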

Page 4: Classification & ID3


Distinguish ALL By Size

{1,2,3,4,5,6,7} = All

  small  → {2, 5}        class: Y N
  medium → {1}           class: Y
  large  → {3, 4, 6, 7}  class: Y Y N N

Rule: (feature ?id size medium) => (class ?id yes)

Page 5: Classification & ID3


Distinguish {2 5} By Shape

{2, 5} = size: small

  sphere → {2}  class: Y
  wedge  → {5}  class: N

(feature ?id size small) (feature ?id shape sphere) => (class ?id yes)

(feature ?id size small) (feature ?id shape wedge) => (class ?id no)

Page 6: Classification & ID3


Distinguish {3 4 6 7} By Color

{3, 4, 6, 7} = size: large

  green → {3, 4}  class: Y Y
  red   → {6, 7}  class: N N

(feature ?id size large) (feature ?id color green) => (class ?id yes)

(feature ?id size large) (feature ?id color red) => (class ?id no)
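Taken together, the rules from pages 4-6 form one complete decision tree. A minimal Python sketch of that tree follows; the function name classify is mine, and sending unseen shapes or colors to "no" is an assumption the slides leave unspecified.

def classify(size, color, shape):
    """Apply the rules learned on pages 4-6."""
    if size == "medium":
        return "yes"  # (feature ?id size medium) => yes
    if size == "small":
        # small items are decided by shape: sphere => yes, wedge => no
        return "yes" if shape == "sphere" else "no"
    # large items are decided by color: green => yes, red => no
    return "yes" if color == "green" else "no"

# Reproduces the class column of every example:
assert all(classify(s, c, sh) == cls for _, s, c, sh, cls in EXAMPLES)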

Page 7: Classification & ID3


Considerations

• Are the examples enough?
  – The examples must be sufficient to tell the classes apart
  – In general this question is unsolvable from the examples alone
• Are the rules the most efficient?
  – We could have made other choices
  – What should we use to compare choices?

Page 8: Classification & ID3


Entropy

• Measures 'disorder'
• Definition (lg = log2):

  H(m1, …, mn) = − Σ(i=1..n) Pr(mi) · lg(Pr(mi))

• Example (entropy of the learning set):
  – Messages (m1…m7): Y Y Y Y N N N
  – Pr(Y) = 4/7  Pr(N) = 3/7
  – H = − [ 4/7·lg(4/7) + 3/7·lg(3/7) ]
      = − [ .571·(−.807) + .429·(−1.222) ] = .985
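A small Python sketch of this definition, checked against the slide's .985; the function name entropy is mine.

from collections import Counter
from math import log2

def entropy(labels):
    """H = -sum(Pr(m) * lg(Pr(m))) over the class frequencies in labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

print(round(entropy(["Y", "Y", "Y", "Y", "N", "N", "N"]), 3))  # 0.985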

Page 9: Classification & ID3


Gain

If a set is partitioned by a feature into subsets, the gain in entropy is:

  original entropy − the weighted sum of the subset entropies

• E.g.: partition ALL = {1,2,3,4,5,6,7} by COLOR

  {blue: 1} {red: 2 5 6 7} {green: 3 4}       partition
  => {blue: Y} {red: Y N N N} {green: Y Y}    map

  H(blue) = 0   H(red) = .811   H(green) = 0

• GAIN(color) = H(all) − Σ(ss ∈ {blue, red, green}) |ss|/|all| · H(ss)
              = .985 − ( 1/7·0 + 4/7·.811 + 2/7·0 ) = .522
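A minimal sketch of the same computation, reusing the entropy and EXAMPLES sketches above; indexing features by tuple position is an assumption of these notes.

def gain(examples, feature_index):
    """Original entropy minus the weighted sum of subset entropies."""
    labels = [ex[-1] for ex in examples]
    weighted = 0.0
    for value in {ex[feature_index] for ex in examples}:
        subset = [ex[-1] for ex in examples if ex[feature_index] == value]
        weighted += len(subset) / len(examples) * entropy(subset)
    return entropy(labels) - weighted

print(round(gain(EXAMPLES, 2), 3))  # color is tuple index 2 -> 0.522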

Page 10: Classification & ID3


Distinguish All By Color

{1,2,3,4,5,6,7} = All

  blue  → {1}           map: Y
  red   → {2, 5, 6, 7}  map: Y N N N
  green → {3, 4}        map: Y Y

  H: 0   −1/4·lg(1/4) − 3/4·lg(3/4) = .5 + .311 = .811   0
  wH = 1/7·0 + 4/7·.811 + 2/7·0 = .464

Gain = .985 − .464 = .521

Page 11: Classification & ID3


Distinguish All By Shape

{1,2,3,4,5,6,7} = All

  brick  → {1}     class: Y
  sphere → {2, 4}  class: Y Y
  pillar → {3, 7}  class: Y N
  wedge  → {5, 6}  class: N N

  H: 0   0   1   0
  wH = 1/7·0 + 2/7·0 + 2/7·1 + 2/7·0 = .286

Gain = .985 − .286 = .699
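With the gain sketch above, all three single-feature splits can be compared directly. The size figure (.128) is not stated on the slides but follows from the same formula, and it shows why the entropy criterion prefers shape over the size-first tree built on pages 4-6.

for name, index in [("size", 1), ("color", 2), ("shape", 3)]:
    print(name, round(gain(EXAMPLES, index), 3))
# size 0.128, color 0.522, shape 0.699 -> ID3 splits on shape first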

Page 12: Classification & ID3


ID3

1. Given: a learning set (LS)
   – examples with features and an outcome (class)
2. Use each (feature, value) pair to partition the LS
3. Calculate H for each partition P(f,v)
4. Calculate the gain for each feature f:
   original H − Σ(v) |P(f,v)| / |LS| · H(P(f,v))
5. Partition by the feature with the highest gain
6. Apply ID3 recursively to any subsets P(f,v) with H > 0
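A compact recursive sketch of these six steps, built on the entropy and gain functions above; the tree-as-dict representation, the FEATURES index map, and all names are assumptions of these notes, not from the slides.

FEATURES = {"size": 1, "color": 2, "shape": 3}  # tuple index per feature

def id3(examples, features=FEATURES):
    labels = [ex[-1] for ex in examples]
    # Step 6 stopping case: a pure subset (H = 0) becomes a leaf;
    # with no features left, fall back to the majority class.
    if entropy(labels) == 0 or not features:
        return max(set(labels), key=labels.count)
    # Steps 2-5: partition by the feature with the highest gain.
    best = max(features, key=lambda f: gain(examples, features[f]))
    index = features[best]
    rest = {f: i for f, i in features.items() if f != best}
    return {
        (best, value): id3([ex for ex in examples if ex[index] == value], rest)
        for value in {ex[index] for ex in examples}
    }

print(id3(EXAMPLES))
# Splits on shape first (gain .699); the mixed pillar subset {3, 7}
# is then resolved by color.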