Abstract


Page 1: Abstract

This study presents an analysis of two modified fuzzy ARTMAP neural networks. The modifications are first introduced mathematically. Then, the performance of the systems is studied on benchmark examples with noiseless data. It is shown that each modified ARTMAP system achieves classification accuracy superior to that of standard fuzzy ARTMAP, while retaining comparable complexity of the internal code.

1. In the first modified ARTMAP system, a graded choice-by-difference (CBD) signal function makes the choice signal Tj depend on the input position, even when the input lies within the category box Rj. Namely, an input near the center of the box Rj generates a larger signal Tj than an input near the boundary of the box. To ensure that the same input would choose the same category if it were immediately re-presented (direct access), the ART match rule was also modified to correspond to the new choice rule. The resulting graded signal function system creates more accurate decision boundaries, especially when these boundaries are not parallel to the input space axes.

Page 2: Abstract

2. In the second modified ARTMAP system, all category boxes Rj are point boxes. This simplified network learns with a fast-commit/no-recode rule, which does not allow any learning at node j once category j has been established. In addition, vigilance (ρ) is set to zero, which eliminates the matching system. Each input that makes a predictive error creates a new category, which is encoded as the input itself. The classification accuracy obtained with this point-box system is better than that of the other studied systems. However, the point-box system has a potential drawback in that its memory requirements may be high for large databases. To alleviate this problem, a strategy for on-line elimination of redundant categories is proposed and evaluated. This strategy can be interpreted as a rule for on-line forgetting of certain stored memories. Its application leads to a significant reduction in memory requirements while retaining classification accuracy.

Page 3: Abstract

ARTMAP: Neural network for supervised learning.

[Diagram labels: Input, Signal function, Category choice, Output class b]

Page 4: Abstract

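1. Signal Functions

ART1: Weber law (1987)

$$T_j = \frac{|\mathbf{A} \wedge \mathbf{w}_j|}{\alpha + |\mathbf{w}_j|}$$

Fuzzy ARTMAP: Choice-by-difference (CBD, 1994)

$$T_j = (2 - \alpha) M - d(R_j, \mathbf{a}) - \alpha |R_j|$$

NEW: Graded signal function that improves behavior for certain types of data

Notation:

$$\mathbf{A} = (\mathbf{a}, \mathbf{a}^c), \qquad \mathbf{w}_j = (\mathbf{u}_j, \mathbf{v}_j^c), \qquad |R_j| = M - |\mathbf{w}_j|$$

[Diagram (M = 2): category box R_j with corners u_j and v_j in the (a_1, a_2) plane; input a at distance d from the box]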
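The two signal functions above can be sketched numerically. This is a minimal illustration, not the authors' code: it assumes the Weber-law signal is T_j = |A ∧ w_j| / (α + |w_j|) with ∧ taken as the component-wise minimum, that the CBD signal takes the geometric form T_j = (2 − α)M − d(R_j, a) − α|R_j|, and that d(R_j, a) is the city-block distance from the input to the box. The function names and the value of `alpha` are hypothetical.

```python
import numpy as np

def weber_signal(a, u, v, alpha):
    """Weber-law choice signal |A ^ w_j| / (alpha + |w_j|) (assumed form)."""
    A = np.concatenate([a, 1.0 - a])   # complement-coded input A = (a, a^c)
    w = np.concatenate([u, 1.0 - v])   # weight vector w_j = (u_j, v_j^c)
    return np.minimum(A, w).sum() / (alpha + w.sum())

def box_distance(a, u, v):
    """City-block distance from a to the box [u, v]; zero when a is inside."""
    return (np.maximum(u - a, 0.0) + np.maximum(a - v, 0.0)).sum()

def cbd_signal(a, u, v, alpha):
    """CBD choice signal (2 - alpha) * M - d(R_j, a) - alpha * |R_j| (assumed form)."""
    M = a.size
    box_size = (v - u).sum()           # |R_j| = M - |w_j|
    return (2.0 - alpha) * M - box_distance(a, u, v) - alpha * box_size

# Hypothetical box and inputs in a two-dimensional input space.
u, v = np.array([0.2, 0.2]), np.array([0.6, 0.6])
inside_center = cbd_signal(np.array([0.4, 0.4]), u, v, alpha=0.01)
inside_edge = cbd_signal(np.array([0.3, 0.5]), u, v, alpha=0.01)
outside = cbd_signal(np.array([0.8, 0.4]), u, v, alpha=0.01)
```

Under these assumptions the two inputs inside the box receive the same CBD signal, which is the flat top that the graded signal function is meant to replace.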

Page 5: Abstract

(1994) CBD Signal Function

DIAGONAL Circle-in-Square (CIS)

94.41 % correct, 40.7 coding nodes

91.70 % correct, 15.2 coding nodes

SIMULATIONS: Average of 10 runs, 1 training epoch

Training points: 1,000 (DIAG) or 10,000 (CIS), Testing points: 10,000

Page 6: Abstract

NEW Graded Signal Function

DIAGONAL Circle-in-Square (CIS)

96.73 % correct (steepness parameter = 0.5), 46.4 coding nodes

95.26 % correct (steepness parameter = 1), 17.8 coding nodes

• Smoother boundaries
• Improved % correct but slightly more nodes

Page 7: Abstract

Choice Signal with CBD and Graded Signal Function in Two-dimensional Input Space

(1994) CBD signal function: Signal is constant for inputs a within Rj. Flat top (steepness parameter = 0).

NEW Graded signal function: Signal is position-dependent for inputs within Rj. Peak height is determined by the steepness parameter.

Page 8: Abstract

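Decision Boundaries Between Overlapping Category Boxes with CBD and Graded Signal Function

[Figure: overlapping category boxes R1 and R2, with the decision boundary shown for CBD and for the graded signal function]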

Page 9: Abstract

Graded Signal Function Implementation: Match

ARTMAP design principle: Repeated presentation of the same input should lead to a choice of the correct node without search.

Need to modify match function.

[Diagram labels: Input, choice signal Tj, Match function (Reset or Resonance?), Output class b]

Page 10: Abstract

2. Point ARTMAP

Minimum algorithm with critical properties of Adaptive Resonance Theory

[Diagram labels: ARTMAP (Input, Output class b); Point ARTMAP (Input a, Labeled coding points, Output class)]

Page 11: Abstract

Point ARTMAP - Learning Cycle

1. New input presentation leads to a choice of closest coding point.

2a. If chosen coding node matches the output class ... do nothing.

2b. If it does not ... store current input into memory.

[Diagram labels: Input a, Output class]
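The learning cycle above can be sketched in a few lines. This is an illustrative sketch rather than the authors' implementation: the class name `PointARTMAP`, its method names, and the use of city-block distance to pick the closest coding point are assumptions, since the slides do not state the metric.

```python
import numpy as np

class PointARTMAP:
    """Minimal sketch: stored inputs act as labeled coding points."""

    def __init__(self):
        self.points = []   # stored coding points (copies of past inputs)
        self.labels = []   # output class attached to each coding point

    def predict(self, a):
        """Step 1: choose the closest coding point and return its class."""
        if not self.points:
            return None    # empty network makes no prediction
        a = np.asarray(a, dtype=float)
        # City-block distance is an assumption made for this sketch.
        dist = np.abs(np.asarray(self.points) - a).sum(axis=1)
        return self.labels[int(np.argmin(dist))]

    def train_one(self, a, label):
        """Steps 2a/2b: do nothing if correct, otherwise store the input."""
        if self.predict(a) != label:
            self.points.append(np.asarray(a, dtype=float))
            self.labels.append(label)

net = PointARTMAP()
for a, label in [([0.1, 0.1], 0), ([0.9, 0.9], 1), ([0.2, 0.2], 0)]:
    net.train_one(a, label)
```

In this small run the third input is already classified correctly by its nearest coding point, so it is not stored; only inputs that cause a predictive error add a node.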

Page 12: Abstract

Point ARTMAP

Circle-in-Square (CIS): 98.72 % correct, 275 coding nodes

DIAGONAL: 97.8 % correct, 59.2 coding nodes

Page 13: Abstract

Point ARTMAP - Results

• Best performance
• Fastest learning
• When training stops at a given fuzzy ARTMAP accuracy, memory sizes are comparable
• Potential danger of many coding nodes

HOW to restrict network size while assuring improvement of its performance as more patterns are presented?

Possible solution: Compute continually for each node a measure of its usefulness/criticality and eliminate least useful nodes as needed.

Page 14: Abstract

Usefulness Rule

How to compute usefulness of nodes?
• Required general properties:
  – local, fast, simple computation
  – computed only for one (or a few) nodes per input presentation

Criterion for acceptability of any usefulness rule:
Let MN be the number of training patterns from a given training set in response to which the network reaches size N. A usefulness rule must assure that the network code learned in response to M > MN inputs is better than that for MN inputs.

When to eliminate a node?
• Many possible choices; in this study, a “hard limit”:
  – Define the maximum network size N
  – Once network size N is reached, one of the existing nodes (the least useful one) is eliminated every time a new node is created
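The hard-limit policy can be sketched as a small helper. This is only an illustration under stated assumptions: the function name `add_node_with_hard_limit`, the list-based storage, and breaking ties by taking the first least-useful node are hypothetical choices; new nodes start with usefulness zero, as stated in the definition of the rule.

```python
def add_node_with_hard_limit(nodes, usefulness, new_node, max_size):
    """Add new_node, first evicting the least useful node if the network is full.

    nodes and usefulness are parallel lists that are modified in place.
    """
    if len(nodes) >= max_size:
        # Least useful existing node; ties go to the earliest node (assumption).
        worst = min(range(len(nodes)), key=lambda j: usefulness[j])
        del nodes[worst]
        del usefulness[worst]
    nodes.append(new_node)
    usefulness.append(0.0)   # initial usefulness is zero

nodes = ["n0", "n1", "n2"]
usefulness = [2.0, -1.0, 0.5]
add_node_with_hard_limit(nodes, usefulness, "n3", max_size=3)
```

With this policy the network size never exceeds `max_size` once the limit has been reached, so memory use stays bounded while new inputs can still be stored.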

Page 15: Abstract

Usefulness Rule - Definition

General definition: The usefulness of a node reflects the predictive error that would result if the node were eliminated.

Implemented rule:

Usefulness is updated every time a node wins the competition and gives a correct prediction (step 2a).

Usefulness increases if the node is critical, i.e., if elimination of the current winner would lead to a predictive error.

Usefulness decreases if the node is non-critical, i.e., the network would give a correct prediction even without it.

Initial usefulness is zero.
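One way to read the implemented rule is sketched below. The slides do not give the size of the increment or decrement, so the symmetric `step` of 1.0 is an assumption, as are the function name `update_usefulness`, the city-block distance, and treating a winner with no runner-up as critical. Criticality is checked by asking what the runner-up coding point would have predicted.

```python
import numpy as np

def update_usefulness(points, labels, usefulness, a, label, step=1.0):
    """Update the winner's usefulness after a correct prediction (step 2a).

    Returns the index of the winning node. usefulness is modified in place.
    """
    a = np.asarray(a, dtype=float)
    dist = np.abs(np.asarray(points, dtype=float) - a).sum(axis=1)
    order = np.argsort(dist)
    winner = int(order[0])
    if labels[winner] != label:
        return winner            # predictive error: handled by step 2b, no update
    # Without the winner, the runner-up would make the prediction.
    critical = len(order) < 2 or labels[int(order[1])] != label
    usefulness[winner] += step if critical else -step
    return winner

points = [[0.1, 0.1], [0.2, 0.2], [0.9, 0.9]]
labels = [0, 0, 1]
usefulness = [0.0, 0.0, 0.0]
update_usefulness(points, labels, usefulness, [0.12, 0.12], 0)  # non-critical winner
update_usefulness(points, labels, usefulness, [0.8, 0.8], 1)    # critical winner
```

Only the winner and the runner-up are examined for each input, which matches the requirement that the computation be local and touch only one or a few nodes per presentation.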

Page 16: Abstract

Point ARTMAP with On-line Elimination

[Plot: % correct (axis 98.7 to 99.1) versus iteration number in 10,000s (axis 0 to 50)]

Circle-in-Square (CIS) with network size frozen after 10,000 iterations (275 nodes)

Page 17: Abstract

Development of Internal Code in Point ARTMAP with On-line Elimination

[Figure panels: 10,000; 100,000; 200,000; 300,000; 400,000; 500,000 training points]

Learned code after presentation of training sets of different sizes

Page 18: Abstract

Point ARTMAP with On-Line Elimination - Discussion

Properties:
• ability to improve coding with time without growing in size
• ability to correct a learned error
• ability to adapt in a non-stationary environment
• can be understood as a rule for optimal forgetting

In a noisy environment:
• without on-line elimination, the system will grow without limits
• current elimination rule can lead to incorrect behavior
• goal - find a rule that will secure optimum performance

Page 19: Abstract

Summary

• Graded signal function is an extension of the CBD signal function that distinguishes between points within category boxes.

• Point ARTMAP is a minimum version of a system for supervised learning based on Adaptive Resonance Theory. It is fast and very simple, with a tendency to proliferate stored categories. This deficiency can be alleviated by several local pruning rules.

• Both systems, especially Point ARTMAP, performed very well on benchmark problems with noiseless data.

• Additional testing on noisy data is needed.

Page 20: Abstract

Appendix 1: Simulations with Diagonal data

[Plot 1: % correct predictions (axis 80 to 100) versus steepness parameter (0 to 1) for fuzzy ARTMAP, with Point ARTMAP shown alongside; curves for 100, 1000, and 10000 training points]

[Plot 2: number of coding nodes (axis 0 to 80) versus steepness parameter (0 to 1) for fuzzy ARTMAP, with Point ARTMAP shown alongside; curves for 100, 1000, and 10000 training points; a value of 177.3 is annotated next to the Point ARTMAP label]

Simulations of fuzzy ARTMAP with standard CBD (steepness parameter = 0) and with graded CBD function (steepness parameter > 0), and of Point ARTMAP on the diagonal benchmark problem

Page 21: Abstract

Appendix 2: Simulations with Circle-in-the-square data

[Plot 1: % correct predictions (axis 80 to 100) versus steepness parameter (0 to 1) for fuzzy ARTMAP, with Point ARTMAP shown alongside; curves for 100, 1000, and 10000 training points]

[Plot 2: number of coding nodes (axis 0 to 80) versus steepness parameter (0 to 1) for fuzzy ARTMAP, with Point ARTMAP shown alongside; curves for 100, 1000, and 10000 training points; a value of 275 is annotated next to the Point ARTMAP label]

Simulations of fuzzy ARTMAP with standard CBD (steepness parameter = 0) and with graded CBD function (steepness parameter > 0), and of Point ARTMAP on the circle-in-the-square benchmark problem