Rule Induction with Extension Matrices, by Yuen F. Helbig and Dr. Xindong Wu (posted 21-Dec-2015)
Rule Induction with Extension Matrices
Yuen F. Helbig
Dr. Xindong Wu
Outline
Extension matrix approach for rule induction
The MFL and MCV optimization problems
The AE1 solution
The HCV solution
Noise handling and discretization in HCV
Comparison of HCV with ID3-like algorithms, including C4.5 and C4.5rules
Extension Matrix Terminology
$a$: number of attributes
$X_a$: the a-th attribute
$e^+$: vector of positive examples
$e^-$: vector of negative examples
$v_{ak}^+$: value of the a-th attribute in the k-th positive example
$n$: number of negative examples
$p$: number of positive examples
$(r_{ij})_{a \times b}$: ij-th element of an $a \times b$ matrix
$A(i,j)$: ij-th element of matrix A
A positive example is an example that belongs to a known class, say ‘Play’
All the other examples can be called negative examples
Extension Matrix Definitions
$e_k^+ = (v_{1k}^+, \ldots, v_{ak}^+)$
e.g., (overcast, mild, high, windy) => Play
$e_k^- = (v_{1k}^-, \ldots, v_{ak}^-)$
e.g., (rainy, hot, high, windy) => Don’t Play
Negative example matrix is defined as
$NEM = (e_1^-, \ldots, e_n^-)^T = (r_{ij})_{n \times a}$
rainy hot high windy
rainy cool normal windy
sunny hot normal windy
sunny mild high windy
Negative Example Matrix
The extension matrix (EM) of a positive example $e_k^+$ against NEM is defined as
$EM_k = (r_{ij})_{n \times a}, \quad k \in \{1, \ldots, p\}$
$r_{ij} = *$ (dead element) when $v_{jk}^+ = NEM_{ij}$
$r_{ij} = NEM_{ij}$ when $v_{jk}^+ \neq NEM_{ij}$
Extension Matrix
Example Extension Matrix
rainy hot high windy
rainy cool normal windy
sunny hot normal windy
sunny mild high windy
Negative Example Matrix (NEM)
Positive Example
overcast mild high windy
Example Extension Matrix
rainy  hot   *       *
rainy  cool  normal  *
sunny  hot   normal  *
sunny  *     *       *
Extension Matrix (EM)
Positive Example
overcast mild high windy
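The construction on this slide can be sketched in a few lines of Python. This is a minimal illustration, assuming examples are stored as lists of strings and that `"*"` marks a dead element; the function name `extension_matrix` is a hypothetical helper, not something defined in the slides.

```python
DEAD = "*"  # marker for a dead element (assumption: plain string marker)

def extension_matrix(nem, positive):
    """Replace each NEM entry that equals the positive example's value by DEAD."""
    return [
        [DEAD if value == positive[j] else value for j, value in enumerate(row)]
        for row in nem
    ]

# Negative example matrix and positive example taken from the slides.
nem = [
    ["rainy", "hot", "high", "windy"],
    ["rainy", "cool", "normal", "windy"],
    ["sunny", "hot", "normal", "windy"],
    ["sunny", "mild", "high", "windy"],
]
positive = ["overcast", "mild", "high", "windy"]

em = extension_matrix(nem, positive)
for row in em:
    print(" ".join(row))
```

Every surviving (non-dead) entry is a value on which the positive example differs from that negative example, which is what makes it usable in a selector of the form [Xj ≠ value].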
A set of n non-dead elements that come from n different rows (one element per row) is called a path in an extension matrix
e.g., {X1 ≠ 1, X2 ≠ 0, X1 ≠ 1} and {X1 ≠ 1, X3 ≠ 1, X2 ≠ 0} are paths in the extension matrix on this slide
[Figure: extension matrix with attribute columns X1, X2, X3]
Paths in Extension Matrices
Conjunctive Formulas
A path $\{r_{1j_1}, \ldots, r_{nj_n}\}$ in the $EM_k$ of the positive example k against NEM corresponds to a conjunctive formula or cover
$L = \bigwedge_{i=1}^{n} [X_{j_i} \neq r_{ij_i}]$
Path: {X1 ≠ 1, X2 ≠ 0, X1 ≠ 1}
Formula: X1 ≠ 1 ∧ X2 ≠ 0 ∧ X1 ≠ 1
Path: {X1 ≠ 1, X3 ≠ 1, X2 ≠ 0}
Formula: X1 ≠ 1 ∧ X3 ≠ 1 ∧ X2 ≠ 0
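To make the path-to-formula correspondence concrete, the sketch below enumerates every path of a small extension matrix and checks that the resulting conjunction of "not equal" selectors accepts the positive example and rejects every negative example. It uses the weather example from the earlier slides; the helper names `paths` and `satisfies` are hypothetical.

```python
from itertools import product

DEAD = "*"  # marker for a dead element (assumption)

def paths(em):
    """Return an iterator over all paths: one non-dead (column, value) pair per row."""
    choices = [[(j, v) for j, v in enumerate(row) if v != DEAD] for row in em]
    return product(*choices)

def satisfies(example, path):
    """True when the example satisfies every selector [X_j != v] on the path."""
    return all(example[j] != v for j, v in path)

nem = [
    ["rainy", "hot", "high", "windy"],
    ["rainy", "cool", "normal", "windy"],
    ["sunny", "hot", "normal", "windy"],
    ["sunny", "mild", "high", "windy"],
]
positive = ["overcast", "mild", "high", "windy"]
em = [
    ["rainy", "hot", DEAD, DEAD],
    ["rainy", "cool", "normal", DEAD],
    ["sunny", "hot", "normal", DEAD],
    ["sunny", DEAD, DEAD, DEAD],
]

all_paths = list(paths(em))
covers_positive = all(satisfies(positive, p) for p in all_paths)
rejects_negatives = all(not satisfies(neg, p) for p in all_paths for neg in nem)
print(len(all_paths), covers_positive, rejects_negatives)
```

Row i of the matrix contributes the selector that excludes negative example i, which is why a path needs one element from each row.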
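Disjunction Matrix of $EM_{i_1}, \ldots, EM_{i_k}$:
$EMD = (r_{ij})_{n \times a}$
$r_{ij} = *$ when $\exists k_1 \in \{i_1, \ldots, i_k\}: EM_{k_1}(i,j) = *$
$r_{ij} = NEM(i,j)$ otherwise
A path $\{r_{1j_1}, \ldots, r_{nj_n}\}$ in the EMD of $\{e_{i_1}^+, \ldots, e_{i_k}^+\}$ against NE $= (e_1^-, \ldots, e_n^-)$ corresponds to a conjunctive formula or cover
$L = \bigwedge_{i=1}^{n} [X_{j_i} \neq r_{ij_i}]$
which covers all of $\{e_{i_1}^+, \ldots, e_{i_k}^+\}$ against NE, and vice-versa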
Extension Matrix Disjunction
EMD Example
rainy hot high windy
rainy cool normal windy
sunny hot normal windy
sunny mild high windy
Negative Example Matrix (NEM)
EMD Example
rainy  hot   *       *
rainy  cool  normal  *
sunny  hot   normal  *
sunny  *     *       *
Extension Matrix Disjunction (EMD)
Positive Example
overcast mild high windy
EMD Example
rainy  hot   *  *
rainy  cool  *  *
sunny  hot   *  *
sunny  *     *  *
Positive Example
overcast mild normal calm
Extension Matrix Disjunction (EMD)
EMD Example
*      *     *  *
*      cool  *  *
sunny  *     *  *
sunny  *     *  *
Positive Example
rainy hot high calm
Extension Matrix Disjunction (EMD)
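The three EMD slides can be reproduced with a short sketch. It assumes the same list-of-lists representation with `"*"` for dead elements; `extension_matrix`, `disjunction`, and `has_path` are hypothetical helper names. A matrix has a path exactly when no row is entirely dead.

```python
DEAD = "*"  # marker for a dead element (assumption)

def extension_matrix(nem, positive):
    return [[DEAD if v == positive[j] else v for j, v in enumerate(row)] for row in nem]

def disjunction(em_a, em_b):
    """An element is dead in the EMD if it is dead in either input matrix."""
    return [
        [DEAD if DEAD in (x, y) else x for x, y in zip(row_a, row_b)]
        for row_a, row_b in zip(em_a, em_b)
    ]

def has_path(em):
    """A path needs one non-dead element in every row."""
    return all(any(v != DEAD for v in row) for row in em)

nem = [
    ["rainy", "hot", "high", "windy"],
    ["rainy", "cool", "normal", "windy"],
    ["sunny", "hot", "normal", "windy"],
    ["sunny", "mild", "high", "windy"],
]
positives = [
    ["overcast", "mild", "high", "windy"],
    ["overcast", "mild", "normal", "calm"],
    ["rainy", "hot", "high", "calm"],
]
em1, em2, em3 = (extension_matrix(nem, e) for e in positives)

emd12 = disjunction(em1, em2)
emd123 = disjunction(emd12, em3)
print(has_path(emd12), has_path(emd123))
```

The first two positive examples share a path, so a single conjunctive formula can cover both. Adding the third kills the whole first row, so no common formula exists and that example has to start a new group.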
MFL and MCV (1)
The minimum formula problem (MFL)
Generating a conjunctive formula that covers a positive example or an intersecting group of positive examples against NEM and has the minimum number of different conjunctive selectors
The minimum cover problem (MCV)
Seeking a cover that covers all positive examples in PE against NEM and has the minimum number of conjunctive formulae, with each conjunctive formula being as short as possible
MFL and MCV (2)
NP-hard
Two complete algorithms are designed to solve them when each attribute domain $D_i$ ($i \in \{1, \ldots, a\}$) satisfies $|D_i| = 2$
$O(na2^a)$ for MFL
$O(n2^a4^a + pa^24^a)$ for MCV
When $|D_i| > 2$, the domain can be decomposed into several, each having base 2
AE1 Heuristic
Starting search from columns with the most non-dead elements
Simplifying redundancy by deductive inference rules in mathematical logic
Problems with AE1
Can easily lose the optimum solution
X1  X2  X3
1   *   *
*   0   1
1   0   *
*   0   1
1   0   *
*   *   1
Here, AE1 will select [X2 ≠ 0], [X1 ≠ 1], and [X3 ≠ 1], instead of [X1 ≠ 1] and [X3 ≠ 1]
Simplifying redundancy for MFL and MCV is itself NP-hard
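The weakness described on this slide can be illustrated with a small sketch. It is not AE1 itself: `greedy_columns` only mimics the "start from the column with the most non-dead elements" idea, and `minimum_columns` is a brute-force search added for comparison. The 6-row matrix is one reading of the slide's figure (assumption), with `"*"` for dead elements and columns X1, X2, X3 at indices 0, 1, 2.

```python
from itertools import combinations

DEAD = "*"  # marker for a dead element (assumption)

# One reading of the figure on the "Problems with AE1" slide (assumption).
em = [
    ["1", DEAD, DEAD],
    [DEAD, "0", "1"],
    ["1", "0", DEAD],
    [DEAD, "0", "1"],
    ["1", "0", DEAD],
    [DEAD, DEAD, "1"],
]

def greedy_columns(em):
    """Repeatedly pick the column with the most non-dead elements in uncovered rows."""
    uncovered = set(range(len(em)))
    chosen = []
    while uncovered:
        counts = [sum(1 for i in uncovered if em[i][j] != DEAD) for j in range(len(em[0]))]
        best = max(range(len(counts)), key=counts.__getitem__)
        if counts[best] == 0:
            raise ValueError("no path exists in this matrix")
        chosen.append(best)
        uncovered -= {i for i in uncovered if em[i][best] != DEAD}
    return chosen

def minimum_columns(em):
    """Exhaustively find a smallest set of columns that touches every row."""
    width = len(em[0])
    for size in range(1, width + 1):
        for cols in combinations(range(width), size):
            if all(any(row[j] != DEAD for j in cols) for row in em):
                return list(cols)
    return None

print(greedy_columns(em), minimum_columns(em))
```

The greedy choice starts with X2 because it has four non-dead elements, yet rows 1 and 6 still force X1 and X3, so X2 ends up redundant. The exhaustive search is exponential in the number of attributes, which is consistent with the problem being NP-hard.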
HCV is an extension matrix based rule induction algorithm which is heuristic, attribute-based, and noise-tolerant
Divides the positive examples into intersecting groups.
Uses HFL heuristics to find a conjunctive formula which covers each intersecting group.
Low order polynomial time complexity at induction time
What is HCV ?
HCV Issues
The HCV algorithm
The HFL heuristics
Speed and efficiency
Noise handling capabilities
Dealing with numeric and nominal data
Accuracy and description compactness
HCV Algorithm (1)
Procedure HCV(EM1, ..., EMp; Hcv)
integer n, a, p
matrix EM1(n,a), ..., EMp(n,a), D(p)
set Hcv
S1: D ← 0      /* D(j) = 1 (j = 1, ..., p) indicates that EMj has been put into an intersecting group */
    Hcv ← {}   /* initialization */
S2: for i = 1 to p, do
      if D(i) = 0 then { EM ← EMi
HCV Algorithm (2)
        for j = i+1 to p, do
          if D(j) = 0 then
            { EM2 ← EM ∨ EMj
              If there exists at least one path in EM2
                then { EM ← EM2, D(j) ← 1 }
            }
        next j
        call HFL(EM; Hfl)
        Hcv ← Hcv ∨ Hfl
      }
    next i
Return (Hcv)
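The grouping loop of the procedure can be sketched as follows. This is a partial illustration that only forms the intersecting groups and their merged matrices; the call to HFL is left out. The list `grouped` plays the role of the flag array D, and all function names are hypothetical.

```python
DEAD = "*"  # marker for a dead element (assumption)

def extension_matrix(nem, positive):
    return [[DEAD if v == positive[j] else v for j, v in enumerate(row)] for row in nem]

def disjunction(em_a, em_b):
    return [[DEAD if DEAD in (x, y) else x for x, y in zip(ra, rb)]
            for ra, rb in zip(em_a, em_b)]

def has_path(em):
    return all(any(v != DEAD for v in row) for row in em)

def intersecting_groups(nem, positives):
    """Return a list of (member indices, merged matrix), one entry per group."""
    ems = [extension_matrix(nem, e) for e in positives]
    grouped = [False] * len(ems)          # the flag array D of the procedure
    groups = []
    for i in range(len(ems)):
        if grouped[i]:
            continue
        grouped[i] = True
        em, members = ems[i], [i]
        for j in range(i + 1, len(ems)):
            if grouped[j]:
                continue
            candidate = disjunction(em, ems[j])
            if has_path(candidate):       # merge only if a common formula still exists
                em, grouped[j] = candidate, True
                members.append(j)
        groups.append((members, em))
    return groups

nem = [
    ["rainy", "hot", "high", "windy"],
    ["rainy", "cool", "normal", "windy"],
    ["sunny", "hot", "normal", "windy"],
    ["sunny", "mild", "high", "windy"],
]
positives = [
    ["overcast", "mild", "high", "windy"],
    ["overcast", "mild", "normal", "calm"],
    ["rainy", "hot", "high", "calm"],
]
groups = intersecting_groups(nem, positives)
print([members for members, _ in groups])
```

In the full procedure each merged matrix would be handed to HFL, and the resulting formulas would be combined into the cover Hcv.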
HFL - Fast Strategy
Selector [X5 ≠ {normal, dry-peep}] can be a possible selector, which will cover all 5 rows
absent  slight  strip  *     normal
*       *       hole   fast  dry peep
low     slight  strip  *     normal
absent  slight  spot   fast  dry peep
low     medium  *      fast  normal
HFL - Precedence
X1  X2  X3
1   *   *
*   0   1
1   0   *
*   0   1
1   0   *
*   *   1
Selectors [X1 ≠ 1] and [X3 ≠ 1] are two inevitable selectors in the above extension matrix
HFL - Elimination
Attribute X2 can be eliminated by X3
[Figure: example extension matrix]
HFL - Least Frequency
Attribute X1 can be eliminated and there still exists a path
[Figure: example extension matrix]
HFL Algorithm (1)
Procedure HFL(EM; Hfl)
S0: Hfl ← {}
S1: /* the fast strategy */
Try the fast strategy on all these rows which haven't
been covered;
If successful, add a corresponding selector to Hfl
and return(Hfl)
S2: /* the precedence strategy */
Apply the precedence strategy to the uncovered rows;
If some inevitable selectors are found,
add them to Hfl, label all the rows they cover,
and go to S1
HFL Algorithm (2)
S3: /* the elimination strategy */
Apply the elimination strategy to those attributes
that have neither been selected nor eliminated;
If an eliminable selector is found, reset all the
elements in the corresponding column with *,
and go to S2.
S4: /* the least frequency strategy */
Apply the least frequency strategy to those attributes
which have neither been selected nor eliminated,
and find a least frequency selector;
Reset all the elements in the corresponding column with *,
and go to S2.
Return(Hfl)
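The four strategies can be put together in a compact sketch. It is an interpretation under stated assumptions, not the authors' implementation: an attribute is taken to be eliminable when another attribute is non-dead in every uncovered row where it is non-dead; eliminated columns are dropped from an `active` set instead of being overwritten with `*`; and ties in the fast strategy are broken toward the last column so that the result matches the pneumonia example in the slides. The input matrix is the disjunction matrix of the five pneumonia examples, with columns Fever, Cough, X-ray, ESR, Auscultation at indices 0 to 4.

```python
DEAD = "*"  # marker for a dead element (assumption)

def hfl(em):
    """Return {column: excluded values}; each entry is one selector [X_col != values]."""
    uncovered = set(range(len(em)))      # rows not yet covered by a chosen selector
    active = set(range(len(em[0])))      # columns neither selected nor eliminated
    formula = {}

    def rows_of(j):
        return {i for i in uncovered if em[i][j] != DEAD}

    def select(j):
        rows = rows_of(j)
        formula[j] = {em[i][j] for i in rows}
        uncovered.difference_update(rows)
        active.discard(j)

    while uncovered:
        # S1, fast strategy: a column with no dead element in the uncovered rows.
        full = [j for j in active if rows_of(j) == uncovered]
        if full:
            select(max(full))            # tie-break toward the last column (assumption)
            continue
        # S2, precedence strategy: rows with a single non-dead element force a selector.
        alive = {i: [j for j in active if em[i][j] != DEAD] for i in uncovered}
        if any(not cols for cols in alive.values()):
            raise ValueError("no path exists in this matrix")
        inevitable = sorted({cols[0] for cols in alive.values() if len(cols) == 1})
        if inevitable:
            for j in inevitable:
                select(j)
            continue
        # S3, elimination strategy: x is redundant if some y is alive wherever x is.
        order = sorted(active)
        eliminable = next((x for x in order for y in order
                           if x != y and rows_of(x) <= rows_of(y)), None)
        # S4, least frequency strategy: otherwise drop the sparsest column.
        if eliminable is None:
            eliminable = min(order, key=lambda j: len(rows_of(j)))
        active.discard(eliminable)
    return formula

emd = [
    ["absent", DEAD, "strip", DEAD, "normal"],
    [DEAD, DEAD, "hole", "fast", DEAD],
    [DEAD, DEAD, "strip", DEAD, "normal"],
    ["absent", DEAD, DEAD, "fast", DEAD],
    [DEAD, DEAD, DEAD, "fast", "normal"],
]
print(hfl(emd))
```

Looping back to the fast strategy after an elimination, rather than jumping straight to S2, does not change the outcome: dropping a column cannot make another column free of dead elements.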
Complexity of HFL
S1 - $O(na)$
S2 - $O(na)$
S3 - $O(na^2)$
S4 - $O(na)$
Overall - $O(a(na + na + na^2 + na)) = O(na^3)$
Complexity of HCV
Worst case time complexity
$O\left(\sum_{i=1}^{p}\left(na + \sum_{j=i+1}^{p}(2na + na + na + 1) + (na^3 + 1)\right)\right) = O(pna^3 + p^2na)$
Space requirement: $2na$
HCV Example
Fever   Cough   X-Ray  ESR     Auscultation  Disease
high    heavy   flack  normal  bubble like   Pneumonia
medium  heavy   flack  normal  bubble like   Pneumonia
low     slight  spot   normal  dry peep      Pneumonia
high    medium  flack  normal  bubble like   Pneumonia
medium  slight  flack  normal  bubble like   Pneumonia
absent  slight  strip  normal  normal        Tuberculosis
high    heavy   hole   fast    dry peep      Tuberculosis
low     slight  strip  normal  normal        Tuberculosis
absent  slight  spot   fast    dry peep      Tuberculosis
low     medium  flack  fast    normal        Tuberculosis
HCV Example
absent slight strip normal normal
high heavy hole fast dry peep
low slight strip normal normal
absent slight spot fast dry peep
low medium flack fast normal
NEM
HCV Example
absent  slight  strip  *     normal
*       *       hole   fast  dry peep
low     slight  strip  *     normal
absent  slight  spot   fast  dry peep
low     medium  *      fast  normal
EM1
Positive Example 1
high heavy flack normal bubble like
HCV Example
absent  slight  strip  *     normal
high    *       hole   fast  dry peep
low     slight  strip  *     normal
absent  slight  spot   fast  dry peep
low     medium  *      fast  normal
EM2
Positive Example 2
medium heavy flack normal bubble like
HCV Example
absent  *       strip  *     normal
high    heavy   hole   fast  *
*       *       strip  *     normal
absent  *       *      fast  *
*       medium  flack  fast  normal
EM3
Positive Example 3
low slight spot normal dry peep
HCV Example
absent  slight  strip  *     normal
*       heavy   hole   fast  dry peep
low     slight  strip  *     normal
absent  slight  spot   fast  dry peep
low     *       *      fast  normal
EM4
Positive Example 4
high medium flack normal bubble like
HCV Example
absent  *       strip  *     normal
high    heavy   hole   fast  dry peep
low     *       strip  *     normal
absent  *       spot   fast  dry peep
low     medium  *      fast  normal
EM5
Positive Example 5
medium slight flack normal bubble like
HCV Example
EM1 ∨ EM2
absent  slight  strip  *     normal
*       *       hole   fast  dry peep
low     slight  strip  *     normal
absent  slight  spot   fast  dry peep
low     medium  *      fast  normal
HCV Example
EM1 ∨ EM2 ∨ EM3
absent  *       strip  *     normal
*       *       hole   fast  *
*       *       strip  *     normal
absent  *       *      fast  *
*       medium  *      fast  normal
HCV Example
EM1 ∨ EM2 ∨ EM3 ∨ EM4
absent  *  strip  *     normal
*       *  hole   fast  *
*       *  strip  *     normal
absent  *  *      fast  *
*       *  *      fast  normal
HCV Example
EM1 ∨ EM2 ∨ EM3 ∨ EM4 ∨ EM5
absent  *  strip  *     normal
*       *  hole   fast  *
*       *  strip  *     normal
absent  *  *      fast  *
*       *  *      fast  normal
HCV Example
HFL Step 1: Fast Strategy
absent  *  strip  *     normal
*       *  hole   fast  *
*       *  strip  *     normal
absent  *  *      fast  *
*       *  *      fast  normal
HFL Rules = {}
HCV Example
HFL Step 2: Precedence
absent  *  strip  *     normal
*       *  hole   fast  *
*       *  strip  *     normal
absent  *  *      fast  *
*       *  *      fast  normal
HFL Rules = {}
HCV Example
HFL Step 3: Elimination
absent  *  strip  *     normal
*       *  hole   fast  *
*       *  strip  *     normal
absent  *  *      fast  *
*       *  *      fast  normal
HFL Rules = {}
HCV Example
HFL Step 4: Least-Frequency
absent  *  strip  *     normal
*       *  hole   fast  *
*       *  strip  *     normal
absent  *  *      fast  *
*       *  *      fast  normal
HFL Rules = {}
HCV Example
HFL Step 4: Least-Frequency
*  *  strip  *     normal
*  *  hole   fast  *
*  *  strip  *     normal
*  *  *      fast  *
*  *  *      fast  normal
HFL Rules = {}
HCV Example
HFL Step 2: Precedence
*  *  strip  *     normal
*  *  hole   fast  *
*  *  strip  *     normal
*  *  *      fast  *
*  *  *      fast  normal
HFL Rules = {ESR ≠ fast}
HCV Example
HFL Step 2: Precedence
*  *  strip  *  normal
*  *  *      *  *
*  *  strip  *  normal
*  *  *      *  *
*  *  *      *  *
HFL Rules = {ESR ≠ fast}
HCV Example
HFL Step 1: Fast Strategy
*  *  strip  *  normal
*  *  *      *  *
*  *  strip  *  normal
*  *  *      *  *
*  *  *      *  *
HFL Rules = {ESR ≠ fast, Auscultation ≠ normal}
HCV Example
HFL Step 1: Fast Strategy
*  *  *  *  normal
*  *  *  *  *
*  *  *  *  normal
*  *  *  *  *
*  *  *  *  *
HFL Rules = {ESR ≠ fast, Auscultation ≠ normal}
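As a quick sanity check, the two selectors found by HFL can be applied to the ten training examples of the data table. The sketch below assumes the rule reads "ESR ≠ fast and Auscultation ≠ normal implies Pneumonia"; the function name `rule` and the tuple layout are illustrative choices.

```python
# Training examples from the slides: (Fever, Cough, X-ray, ESR, Auscultation, Disease).
records = [
    ("high", "heavy", "flack", "normal", "bubble like", "Pneumonia"),
    ("medium", "heavy", "flack", "normal", "bubble like", "Pneumonia"),
    ("low", "slight", "spot", "normal", "dry peep", "Pneumonia"),
    ("high", "medium", "flack", "normal", "bubble like", "Pneumonia"),
    ("medium", "slight", "flack", "normal", "bubble like", "Pneumonia"),
    ("absent", "slight", "strip", "normal", "normal", "Tuberculosis"),
    ("high", "heavy", "hole", "fast", "dry peep", "Tuberculosis"),
    ("low", "slight", "strip", "normal", "normal", "Tuberculosis"),
    ("absent", "slight", "spot", "fast", "dry peep", "Tuberculosis"),
    ("low", "medium", "flack", "fast", "normal", "Tuberculosis"),
]

def rule(record):
    """Conjunction of the two selectors: ESR != fast and Auscultation != normal."""
    _fever, _cough, _xray, esr, auscultation, _disease = record
    return esr != "fast" and auscultation != "normal"

predicted = [rule(r) for r in records]
actual = [r[-1] == "Pneumonia" for r in records]
print(predicted == actual)
```

The rule accepts all five Pneumonia examples and rejects all five Tuberculosis examples, using two of the five attributes.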
HCV Example
HCV generated rule
C4.5rules generated rule
Example (8)
HCV versus AE1
The use of disjunctive matrix
Reasonable solution to MFL and MCV
Noise handling
Discretization of attributes
HCV Noise Handling
Don’t care values are dead elements
Approximate partitioning
Stopping criteria
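The first point, treating don't-care values as dead elements, can be sketched as a small change to the extension matrix construction. The `"#"` marker for a don't-care value is an assumption made for this illustration, as is the choice to also treat a don't-care in the positive example as killing that column's entry.

```python
DEAD = "*"        # marker for a dead element (assumption)
DONT_CARE = "#"   # marker for a don't-care value (assumption)

def extension_matrix(nem, positive):
    """Build an extension matrix in which don't-care values never yield a selector."""
    def cell(value, positive_value):
        if DONT_CARE in (value, positive_value) or value == positive_value:
            return DEAD
        return value
    return [[cell(v, positive[j]) for j, v in enumerate(row)] for row in nem]

# Hypothetical data with one don't-care on each side.
nem = [
    ["rainy", "#", "high"],
    ["sunny", "hot", "normal"],
]
positive = ["overcast", "hot", "#"]
em = extension_matrix(nem, positive)
print(em)
```

A don't-care entry could stand for any value, so it cannot be relied on to separate a positive example from a negative one, which is why it contributes no non-dead element.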
Discretization of Attributes
Information Gain Heuristic
Stop splitting criteria
Stop if the information gain on all cut points is the same.
Stop if the number of examples to split is less than a certain number.
Limit the total number of intervals.
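The information gain heuristic for choosing a cut point on a numeric attribute can be sketched as below. This is a generic entropy-based binary split, not necessarily the exact procedure used in HCV; the temperature values and class labels are made-up illustration data, and `best_cut` is a hypothetical helper.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy of a list of class labels, in bits."""
    total = len(labels)
    return -sum(c / total * log2(c / total) for c in Counter(labels).values())

def best_cut(values, labels):
    """Return (gain, cut) for the midpoint cut with the highest information gain."""
    pairs = sorted(zip(values, labels))
    base = entropy([label for _, label in pairs])
    best = (0.0, None)
    for k in range(1, len(pairs)):
        if pairs[k - 1][0] == pairs[k][0]:
            continue                      # no cut between equal values
        left = [label for _, label in pairs[:k]]
        right = [label for _, label in pairs[k:]]
        remainder = (len(left) * entropy(left) + len(right) * entropy(right)) / len(pairs)
        gain = base - remainder
        if gain > best[0]:
            best = (gain, (pairs[k - 1][0] + pairs[k][0]) / 2)
    return best

# Hypothetical numeric attribute with two classes.
values = [36.5, 37.0, 38.5, 39.0, 39.5, 40.0]
labels = ["T", "T", "P", "P", "P", "P"]
gain, cut = best_cut(values, labels)
print(round(gain, 3), cut)
```

Applying `best_cut` recursively to each side, and stopping under the criteria listed above, would yield the intervals used as nominal values.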
Comparison (1)
                         Training Set 1      Training Set 2      Training Set 3
Algorithm                rules  conditions   rules  conditions   rules  conditions
ID3                      53     216          105    498          30     98
C4.5                     60     262          113    566          27     89
C4.5 with grouping       9      31           55     353          20     102
C4.5 Rules               31     101          97     374          23     65
C4.5rules with grouping  8      19           46     188          11     35
NewID                    21     143          59     401          18     101
HCV                      7      16           39     168          18     62
Table 1: Number of rules and conditions using the Monk 1, 2 and 3 datasets as training sets 1, 2 and 3 respectively
Comparison (2)
Table 2: Accuracy
Algorithm                Test Set 1  Test Set 2  Test Set 3
ID3                      83.3%       68.3%       94.4%
C4.5                     82.4%       69.7%       90.3%
C4.5 with grouping       100%        82.4%       93.1%
C4.5 Rules               92.4%       75.7%       85.4%
C4.5rules with grouping  100%        81.0%       91.4%
NewID                    93%         78%         89%
HCV                      100%        81.7%       90.3%
Comparison (3)
Conclusions
Rules generated in HCV take the form of variable-valued logic rules, rather than decision trees
HCV generates very compact rules in low-order polynomial time
Noise handling and discretization
Predictive accuracy comparable to the ID3 family of algorithms, viz. C4.5 and C4.5rules