
Page 1: NeSy-2006, ECAI-06 Workshop 29 August, 2006, Riva del Garda, Italy

NeSy-2006, ECAI-06 Workshop, 29 August 2006, Riva del Garda, Italy

Jim Prentzas & Ioannis Hatzilygeroudis

Construction of Neurules from Training Examples: A Thorough Investigation

University of Patras, Dept of Computer Engin. & Informatics & TEI of Lamia, Dept of Informatics & Computer Technology, GREECE

Page 2: NeSy-2006, ECAI-06 Workshop 29 August, 2006, Riva del Garda, Italy

Outline
• Neurules: An overview
• Neurules: Syntax and semantics
• Production process - Splitting
• Neurules and Generalization
• Conclusions

Page 3: NeSy-2006, ECAI-06 Workshop 29 August, 2006, Riva del Garda, Italy

Outline
• Neurules: An overview
• Neurules: Syntax and semantics
• Production process - Splitting
• Neurules and Generalization
• Conclusions

Page 4: NeSy-2006, ECAI-06 Workshop 29 August, 2006, Riva del Garda, Italy

Neurules: An Overview (1)
• Neurules integrate symbolic (propositional) rules and the adaline neural unit
• Give pre-eminence to the symbolic framework
• Neurules were initially designed as an improvement to propositional rules (as far as efficiency is concerned) and were produced from them
• To facilitate knowledge acquisition, a method for producing neurules from empirical data was specified

Page 5: NeSy-2006, ECAI-06 Workshop 29 August, 2006, Riva del Garda, Italy

Neurules: An Overview (2)
• Preserve the naturalness and modularity of production rules to some (large?) degree
• Reduce the size of the produced knowledge base
• Increase inference efficiency
• Allow for efficient and natural explanations

Page 6: NeSy-2006, ECAI-06 Workshop 29 August, 2006, Riva del Garda, Italy

Outline
• Neurules: An overview
• Neurules: Syntax and semantics
• Production process - Splitting
• Neurules and Generalization
• Conclusions

Page 7: NeSy-2006, ECAI-06 Workshop 29 August, 2006, Riva del Garda, Italy

Neurules: Syntax and Semantics (1)

(sf0) if C1 (sf1), C2 (sf2), …, Cn (sfn)
then D

Ci: conditions (e.g. ‘fever is high’)
D: conclusion (e.g. ‘disease is inflammation’)
sf0: bias factor; sfi: significance factors

Page 8: NeSy-2006, ECAI-06 Workshop 29 August, 2006, Riva del Garda, Italy

Neurules: Syntax and Semantics (2)

A neurule corresponds to an adaline unit: the conditions C1, C2, …, Cn are its inputs, weighted by the significance factors sf1, sf2, …, sfn, with bias sf0 and output D:

D = f( sf0 + Σi=1..n sfi · Ci ),  where f is the bipolar threshold function: f(x) = 1 if x ≥ 0, −1 otherwise

Ci ∈ {1, −1, 0} corresponding to {true, false, unknown}
D ∈ {1, −1} corresponding to {success, failure}
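To make the semantics concrete, here is a minimal Python sketch (not the authors' code; the function name and the convention f(x) = 1 for x ≥ 0 are assumptions) of how a single neurule computes its conclusion from the formula above:

```python
def evaluate_neurule(bias, significance_factors, condition_values):
    """Compute D = f(sf0 + sum_i sfi * Ci) for one neurule.

    condition_values[i] is 1 (true), -1 (false) or 0 (unknown);
    the result is 1 (success) or -1 (failure).
    """
    weighted_sum = bias + sum(
        sf * c for sf, c in zip(significance_factors, condition_values)
    )
    return 1 if weighted_sum >= 0 else -1


# Example: bias -1.0, two conditions with factors 2.0 and 0.5;
# inputs (true, false) give 2.0 - 0.5 - 1.0 = 0.5 >= 0, so D = 1 (success).
print(evaluate_neurule(-1.0, [2.0, 0.5], [1, -1]))
```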

Page 9: NeSy-2006, ECAI-06 Workshop 29 August, 2006, Riva del Garda, Italy

Outline
• Neurules: An overview
• Neurules: Syntax and semantics
• Production process - Splitting
• Neurules and Generalization
• Conclusions

Page 10: NeSy-2006, ECAI-06 Workshop 29 August, 2006, Riva del Garda, Italy

Initial Neurules Construction
1. Make one initial neurule for each possible conclusion, either intermediate or final (i.e. for each value of the intermediate and output attributes, according to dependency information).

2. The conditions of each initial neurule include all attributes that affect its conclusion, according to dependency information, and all their values.

3. Set the bias and significance factors to some initial values (e.g. 0).
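As a rough illustration of the three steps (the data layout and function name are hypothetical, not the authors' implementation), assuming dependency information maps each intermediate or output attribute to the attributes it depends on:

```python
def initial_neurules(dependencies, domains):
    """dependencies: {conclusion_attribute: [condition_attribute, ...]}
    domains: {attribute: [possible values]}"""
    neurules = []
    for concl_attr, cond_attrs in dependencies.items():
        for concl_value in domains[concl_attr]:            # step 1: one neurule per conclusion value
            conditions = [(attr, value)                     # step 2: all affecting attributes
                          for attr in cond_attrs            #         and all their values
                          for value in domains[attr]]
            neurules.append({
                "conclusion": (concl_attr, concl_value),
                "conditions": conditions,
                "bias": 0.0,                                # step 3: factors initialised to 0
                "significance_factors": [0.0] * len(conditions),
            })
    return neurules
```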

Page 11: NeSy-2006, ECAI-06 Workshop 29 August, 2006, Riva del Garda, Italy

Neurules Production Process
1. Use dependency information to construct the initial neurules.

2. For each initial neurule, create its training set from the available empirical data.

3. Train each initial neurule with its training set:
3.1 If training succeeds, produce the resulting neurule.
3.2 If not, split the training set into two subsets of close examples and apply steps 3.1 and 3.2 recursively to each subset.
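A sketch of steps 2-3 under stated assumptions: a single threshold unit is trained with simple perceptron-style corrections (a stand-in for the adaline/LMS training actually used), training "succeeds" when every example in the set is classified correctly, and the splitting function (e.g. the closeness-based split sketched on a later slide) is passed in. All names are hypothetical.

```python
def train_unit(examples, n_inputs, lr=0.1, max_epochs=1000):
    """Try to fit one threshold unit (bias + weights) so that every
    training example [v1, ..., vn, d] is classified correctly."""
    bias, weights = 0.0, [0.0] * n_inputs
    for _ in range(max_epochs):
        errors = 0
        for *v, d in examples:
            out = 1 if bias + sum(w * x for w, x in zip(weights, v)) >= 0 else -1
            if out != d:                           # correct only misclassified examples
                errors += 1
                bias += lr * d
                weights = [w + lr * d * x for w, x in zip(weights, v)]
        if errors == 0:
            return bias, weights                   # step 3.1: training succeeded
    return None                                    # not separable by a single unit


def produce_neurules(examples, n_inputs, split):
    """Step 3: train; on failure split the set and recurse (step 3.2)."""
    fitted = train_unit(examples, n_inputs)
    if fitted is not None:
        return [fitted]                            # one produced neurule
    subset_a, subset_b = split(examples)           # e.g. a closeness-based split
    return (produce_neurules(subset_a, n_inputs, split) +
            produce_neurules(subset_b, n_inputs, split))
```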

Page 12: NeSy-2006, ECAI-06 Workshop 29 August, 2006, Riva del Garda, Italy

Splitting Strategies (1)
DEFINITIONS

training example: [v1 v2 … vn d]

success example: d = 1, failure example: d = -1

closeness: the number of common values vi between two success examples

least closeness pair (LCP): a pair of success examples with the least closeness in a training set
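In code, these definitions could look like the following sketch (helper names are hypothetical; it assumes a training set with at least two success examples):

```python
from itertools import combinations

def closeness(ex1, ex2):
    """Number of common input values vi between two examples [v1, ..., vn, d]."""
    return sum(a == b for a, b in zip(ex1[:-1], ex2[:-1]))

def least_closeness_pairs(success_examples):
    """All index pairs of success examples whose closeness is minimal (the LCPs)."""
    scores = {
        (i, j): closeness(success_examples[i], success_examples[j])
        for i, j in combinations(range(len(success_examples)), 2)
    }
    least = min(scores.values())
    return [pair for pair, score in scores.items() if score == least]
```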

Page 13: NeSy-2006, ECAI-06 Workshop 29 August, 2006, Riva del Garda, Italy

Splitting Strategies (2)
REQUIREMENTS

1. Each subset contains all failure examples, to avoid misactivations

2. Each subset contains at least one success example, to assure activation of the corresponding neurule

3. The two subsets should not contain common success examples, to avoid activation of more than one neurule for the same data

Page 14: NeSy-2006, ECAI-06 Workshop 29 August, 2006, Riva del Garda, Italy

Splitting Strategies (1)
STRATEGY: CLOSENESS-SPLIT

1. Find the LCPs of the training set S and choose one. Its elements are called pivots.

2. Create two subsets of S, each containing one of the pivots and the success examples of S that are closer to its pivot.

3. Insert in both subsets all the failure examples of S.

4. Train two copies of the initial neurule, one with each subset.
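Putting the four steps together, a sketch of CLOSENESS-SPLIT (reusing the closeness and least_closeness_pairs helpers sketched earlier; step 1's "choose one" is simply the first LCP here, leaving the selection heuristics of a later slide aside). Step 4 then amounts to handing the two returned subsets back to the training step, as in the produce_neurules sketch above.

```python
def closeness_split(examples):
    successes = [e for e in examples if e[-1] == 1]
    failures  = [e for e in examples if e[-1] == -1]
    i, j = least_closeness_pairs(successes)[0]        # step 1: pick an LCP
    pivot_a, pivot_b = successes[i], successes[j]      # its elements are the pivots
    subset_a, subset_b = [pivot_a], [pivot_b]
    for k, example in enumerate(successes):
        if k in (i, j):
            continue
        # step 2: each remaining success example joins the subset of the closer pivot
        if closeness(example, pivot_a) >= closeness(example, pivot_b):
            subset_a.append(example)
        else:
            subset_b.append(example)
    # step 3: all failure examples go into both subsets
    return subset_a + failures, subset_b + failures
```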

Page 15: NeSy-2006, ECAI-06 Workshop 29 August, 2006, Riva del Garda, Italy

Splitting Strategies (2)

C1 C2 C3 C4 C5 C6 C7 C8 C9 C10 C11 C12 C13 D

-1 -1 1 -1 -1 1 1 1 -1 1 -1 -1 -1 -1

-1 -1 1 -1 -1 1 1 1 1 -1 -1 -1 -1 1

-1 -1 1 1 -1 1 -1 -1 -1 -1 -1 -1 -1 -1

-1 -1 1 1 -1 1 -1 -1 -1 -1 -1 -1 1 -1

-1 -1 1 1 -1 1 -1 -1 -1 -1 -1 1 -1 -1

-1 -1 1 1 -1 1 -1 -1 -1 -1 -1 1 1 1

-1 -1 1 1 -1 1 -1 -1 -1 1 -1 -1 -1 -1

-1 -1 1 1 -1 1 -1 -1 -1 1 -1 1 -1 -1

-1 -1 1 1 -1 1 -1 -1 1 -1 -1 1 -1 -1

-1 -1 1 1 -1 1 -1 -1 1 -1 1 -1 -1 1

-1 1 1 -1 -1 -1 -1 1 -1 -1 -1 -1 1 -1

-1 1 1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 -1

-1 1 1 -1 -1 1 -1 1 -1 -1 -1 -1 1 -1

-1 1 1 -1 -1 1 -1 1 -1 1 -1 -1 -1 1

1 1 1 1 -1 -1 -1 -1 -1 -1 -1 -1 1 -1

1 1 1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1

1 1 1 1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1

1 1 1 1 1 -1 -1 -1 -1 -1 -1 -1 -1 1

P1-P5: the five success examples (rows with D = 1)

AN EXAMPLE: Training Set

Page 16: NeSy-2006, ECAI-06 Workshop 29 August, 2006, Riva del Garda, Italy

Splitting Strategies (3)

P1-P5: Success examples, F: Set of failure examples

AN EXAMPLE: Splitting Tree

Page 17: NeSy-2006, ECAI-06 Workshop 29 August, 2006, Riva del Garda, Italy

Splitting Strategies (4)

(-13.5) if venous-conc is slight (12.4),
        venous-conc is moderate (8.2),
        venous-conc is normal (8.0),
        venous-conc is high (1.2),
        blood-conc is moderate (11.6),
        blood-conc is slight (8.3),
        blood-conc is normal (4.4),
        blood-conc is high (1.6),
        arterial-conc is moderate (8.8),
        arterial-conc is slight (-5.7),
        cap-conc is moderate (8.4),
        cap-conc is slight (4.5),
        scan-conc is normal (8.4)
then disease is inflammation

AN EXAMPLE: A produced neurule

Page 18: NeSy-2006, ECAI-06 Workshop 29 August, 2006, Riva del Garda, Italy

Splitting Strategies (5)
STRATEGY: ALTERN-SPLIT1

1. If all success examples or only failure examples are misclassified, use the closeness-based split.

2. If some of the success examples and none of the failure examples are misclassified, split the training set into two subsets: one containing the correctly classified success examples and one containing the misclassified success examples. Add all failure examples to both subsets.

3. If some (not all) of the success examples and some or all of the failure examples are misclassified, split the training set into two subsets: one containing the correctly classified success examples and the other the misclassified success examples. Add all failure examples to both subsets.

Page 19: NeSy-2006, ECAI-06 Workshop 29 August, 2006, Riva del Garda, Italy

Splitting Strategies (6)
STRATEGY: ALTERN-SPLIT2

1. If all success examples or only failure examples are misclassified, use the closeness-based split.

2. If some of the success examples and none of the failure examples are misclassified, split the training set into two subsets: one containing the correctly classified success examples and one containing the misclassified success examples. Add all failure examples to both subsets.

3. If some of the success examples and some of the failure examples are misclassified, use the closeness-based split (see the sketch below).
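A sketch of the ALTERN-SPLIT2 case analysis (names are hypothetical; `misclassified` stands for the examples the unsuccessfully trained neurule still gets wrong, and the fallback is the closeness_split sketched earlier). ALTERN-SPLIT1 differs only in case 3, where it splits by correctly/incorrectly classified success examples instead of falling back to the closeness-based split.

```python
def altern_split2(examples, misclassified, fallback=closeness_split):
    successes = [e for e in examples if e[-1] == 1]
    failures  = [e for e in examples if e[-1] == -1]
    wrong_succ = [e for e in successes if e in misclassified]
    wrong_fail = [e for e in failures  if e in misclassified]

    # cases 1 and 3: all successes wrong, only failures wrong, or mixed errors
    if len(wrong_succ) == len(successes) or not wrong_succ or wrong_fail:
        return fallback(examples)

    # case 2: some (not all) successes and no failures misclassified
    right_succ = [e for e in successes if e not in wrong_succ]
    return right_succ + failures, wrong_succ + failures
```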

Page 20: NeSy-2006, ECAI-06 Workshop 29 August, 2006, Riva del Garda, Italy

Splitting Strategies (7)
LCP SELECTION HEURISTICS

• Random Choice (RC)
  – Pick an LCP at random
• Best Distribution (BD)
  – Choose the LCP that distributes the elements of the other LCPs into different subsets
• Mean Closeness (MC)
  – Choose the LCP that creates subsets with the greatest mean closeness
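As an illustration, one plausible reading of the MC heuristic is sketched below (names are hypothetical, and `split_on(lcp)` is assumed to return the two success-example subsets that splitting on that LCP would produce); RC and BD would follow the same shape with a different scoring rule.

```python
from itertools import combinations

def mean_closeness(success_examples):
    """Average pairwise closeness within one subset of success examples."""
    pairs = list(combinations(success_examples, 2))
    if not pairs:
        return 0.0
    return sum(closeness(a, b) for a, b in pairs) / len(pairs)

def select_lcp_mc(lcps, split_on):
    """MC: pick the LCP whose split gives the greatest mean closeness."""
    return max(lcps, key=lambda lcp: sum(mean_closeness(s) for s in split_on(lcp)))
```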

Page 21: NeSy-2006, ECAI-06 Workshop 29 August, 2006, Riva del Garda, Italy

Experimental Results (1)

Dataset               CLOSENESS-SPLIT    ALTERN-SPLIT1        ALTERN-SPLIT2
                      RC   MC   BD       RC    MC    BD       RC   MC   BD
Monks1_train (124)    17   17   13       22    24    24       19   16   13
Monks2_train (169)    46   47   38       34    32    33       43   49   39
Monks3_train (122)    14   11   12       15    15    15       14   11   13
Tic-Tac-Toe (958)     26   26   24       44    41    40       43   41   38
Car (1728)           151  163  153      189   171   169      152  161  154
Nursery (12960)      830  839  823     1330  1382  1378      837  842  821

Comparing LCP Selection Heuristics


Page 23: NeSy-2006, ECAI-06 Workshop 29 August, 2006, Riva del Garda, Italy

Experimental Results (2)

Dataset               CLOSENESS-SPLIT    ALTERN-SPLIT1        ALTERN-SPLIT2
                      RC   MC   BD       RC    MC    BD       RC   MC   BD
Monks1_train (124)    17   17   13       22    24    24       19   16   13
Monks2_train (169)    46   47   38       34    32    33       43   49   39
Monks3_train (122)    14   11   12       15    15    15       14   11   13
Tic-Tac-Toe (958)     26   26   24       44    41    40       43   41   38
Car (1728)           151  163  153      189   171   169      152  161  154
Nursery (12960)      830  839  823     1330  1382  1378      837  842  821

Comparing Splitting Strategies

Page 24: NeSy-2006, ECAI-06 Workshop 29 August, 2006, Riva del Garda, Italy

Experimental Results (3)

• LCP Selection Heuristics
  – BD performs better in most cases
  – MC, although the most computationally expensive, is rather the worst
  – RC, although the simplest, does quite well

• Splitting Strategies
  – CLOSENESS-SPLIT does better than the others
  – ALTERN-SPLIT2 does better than ALTERN-SPLIT1
  – The ‘closeness’ heuristic proves to be a good choice

Page 25: NeSy-2006, ECAI-06 Workshop 29 August, 2006, Riva del Garda, Italy

Outline
• Neurules: An overview
• Neurules: Syntax and semantics
• Production process - Splitting
• Neurules and Generalization
• Conclusions

Page 26: NeSy-2006, ECAI-06 Workshop 29 August, 2006, Riva del Garda, Italy

Neurules and Generalization

• Generalization is an important characteristic of NN-based systems

• Neurules had never been tested for their generalization capabilities, due to the way they were used

• We present here an investigation of their generalization capabilities in comparison with the adaline unit and a back-propagation neural network (BPNN)

• We use the same data sets used for the comparison of the strategies

Page 27: NeSy-2006, ECAI-06 Workshop 29 August, 2006, Riva del Garda, Italy

Experimental Results (1)
Impact of LCP Selection Heuristics on Generalization

Dataset        RC        MC        BD
Monks1         100%      100%      100%
Monks2         96.30%    96.99%    97.92%
Monks3         92.36%    93.52%    96.06%
Tic-Tac-Toe    98.85%    97.50%    98.12%
Car            94.44%    94.56%    94.50%
Nursery        99.63%    99.53%    99.52%

Page 28: NeSy-2006, ECAI-06 Workshop 29 August, 2006, Riva del Garda, Italy

Experimental Results (2)
Neurules Generalization vs Adaline and BPNN

Dataset        Adaline Unit   Neurules   BPNN
Monks1         67.82%         100%       100%
Monks2         43.75%         97.92%     100%
Monks3         92.13%         96.06%     97.22%
Tic-Tac-Toe    61.90%         98.85%     98.23%
Car            78.93%         94.56%     95.72%
Nursery        82.26%         99.63%

Page 29: NeSy-2006, ECAI-06 Workshop 29 August, 2006, Riva del Garda, Italy

Experimental Results (3)

• LCP Selection Heuristics Impact
  – None of RC, MC, BD has a clearly better impact, but RC and BD seem to do better than MC

• Neurules
  – Do considerably better than the adaline unit itself
  – Slightly worse than, but very close to, the BPNN
  – Creating the BPNN is more time consuming than creating a neurule base

Page 30: NeSy-2006, ECAI-06 Workshop 29 August, 2006, Riva del Garda, Italy

Outline
• Neurules: An overview
• Neurules: Syntax and semantics
• Production process - Splitting
• Neurules and Generalization
• Conclusions

Page 31: NeSy-2006, ECAI-06 Workshop 29 August, 2006, Riva del Garda, Italy

Conclusions

• The “closeness” heuristic used in the neurule production process proves to be quite effective

• The random choice selection heuristic does adequately well

• Neurules generalize quite well

Page 32: NeSy-2006, ECAI-06 Workshop 29 August, 2006, Riva del Garda, Italy

Future Plans

• Compare ‘closeness’ with other known machine learning heuristics (e.g. distance-based heuristics)

• Use neurules for rule extraction