A NEAT Way for Evolving Echo State Networks


Transcript of A NEAT Way for Evolving Echo State Networks

Page 1: A NEAT Way for Evolving Echo State Networks

ECAI 2010

A NEAT Way for Evolving Echo State Networks

Kyriakos C. Chatzidimitriou, Pericles A. Mitkas

Intelligent Systems and Software Engineering Labgroup, Electrical and Computer Eng. Dept., Aristotle University of Thessaloniki

Informatics and Telematics Institute, Centre for Research and Technology-Hellas

Thessaloniki, Greece

Page 2: A NEAT Way for Evolving Echo State Networks


The problem

• Engineer fully autonomous, intelligent agents
• Model tasks as reinforcement learning problems
• Need good function approximators (FAs)
• Plus properties like:
  – Non-linear
  – Non-Markovian
  – Generalization

Page 3: A NEAT Way for Evolving Echo State Networks


Adaptive function approximators

• Problems continue:
  – Q: What FA to choose?
  – A: Something powerful/suitable
  – Q: How to adjust the parameters?
    • Neural nets: number of neurons? Topology? Weights?
  – A: Adaptive function approximators
• FAs built automatically, ad hoc, per problem/environment
• How? Through the synthesis of learning and evolution
  – Good for autonomy
  – Good for the user

Page 4: A NEAT Way for Evolving Echo State Networks


Our proposal

• Our proposal for an adaptive FA methodology
  – Built bottom-up, combining powerful ideas and algorithms from the research literature into a single methodology
  – Each one fills a different gap, developing into something complete
  – Design goal: cover as many aspects as possible

Page 5: A NEAT Way for Evolving Echo State Networks


The Ingredients

• 1 ESN (Echo State Network) [H. Jaeger]
• 50% NEAT (NeuroEvolution of Augmenting Topologies) [K. Stanley]
• 50% TD-learning

Page 6: A NEAT Way for Evolving Echo State Networks


Basic Echo State Network

• If the output units are linear: y(t) = w·u(t) + w′·x(t)
  – A linear function over (a) linear and (b) non-linear, temporal features
• Reservoir properties:
  – Large number of features
  – Sparse
  – Mean around 0
  – Spectral radius less than 1
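A minimal sketch of these ideas in Python (the sizes, the 10% density, and the 0.9 target radius are illustrative assumptions, not the paper's settings):

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_res = 1, 50                 # illustrative sizes

# Sparse reservoir with mean around 0 and spectral radius below 1
W_res = rng.uniform(-1, 1, (n_res, n_res))
W_res[rng.random((n_res, n_res)) > 0.1] = 0.0            # keep ~10% of connections
W_res *= 0.9 / np.max(np.abs(np.linalg.eigvals(W_res)))  # scale radius to 0.9
W_in = rng.uniform(-1, 1, (n_res, n_in))

def step(x, u):
    """Reservoir update: a large pool of non-linear, temporal features."""
    return np.tanh(W_in @ u + W_res @ x)

def readout(w, w_prime, u, x):
    """Linear output units: y(t) = w u(t) + w' x(t)."""
    return w @ u + w_prime @ x
```

Because y(t) is linear in the readout weights, training reduces to linear regression or linear TD methods over the features u(t) and x(t).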

Page 7: A NEAT Way for Evolving Echo State Networks


NEAT – Basic Principles

• Start minimally and complexify
• Weight & structural mutation
• Speciation through clustering to protect innovation
• Crossover networks through historical markings on connections
• Adapt NEAT and use its principles as a meta-search algorithm for ESNs

[Figure: example networks over numbered nodes 1–3, growing in complexity]

Page 8: A NEAT Way for Evolving Echo State Networks

Evolution and learning

• Combine global and local search
• Evolution helps learning "avoid traps"
• Learning helps evolution "find better locations" nearby
• ESNs allow us to do that easily:
  – The readout is linear, so one can use all the classic linear RL/SL learning algorithms (TD, LS, LSTD, etc.)
  – Only the evolution part needs adjusting – adapt NEAT


Page 9: A NEAT Way for Evolving Echo State Networks


Initialization

• Start minimally with 1 reservoir neuron
  – XOR problem
• Input weights: random in [-1, 1]
• Output weights: 0
• Reservoir weights:
  – Random in [-1, 1]
  – Given density
  – Mean(Wres) = 0
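A minimal sketch of this initialization; the function and variable names are assumptions, not the authors' code:

```python
import numpy as np

rng = np.random.default_rng()

def init_genome(n_in, n_out, density=0.5):      # density value is an assumption
    n_res = 1                                   # start minimally: one reservoir neuron
    W_in = rng.uniform(-1, 1, (n_res, n_in))    # input weights: random in [-1, 1]
    W_out = np.zeros((n_out, n_in + n_res))     # output weights: 0
    # Reservoir weights in [-1, 1] at the given density; symmetric uniform
    # sampling keeps Mean(Wres) around 0.
    W_res = rng.uniform(-1, 1, (n_res, n_res))
    W_res[rng.random(W_res.shape) > density] = 0.0
    return W_in, W_res, W_out
```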

Page 10: A NEAT Way for Evolving Echo State Networks


Mutation – Add node

• Node added
  – Adds a new feature; the genome grows
  – Historical markings are assigned to the node
  – All its reservoir connections are initially disabled
• They are later enabled through link mutation or crossover (sketched below)

[Figure: a new node 4 added to a reservoir with nodes 1–3]
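A hedged sketch of the add-node mutation; the Genome container and marking bookkeeping are assumptions for illustration:

```python
import numpy as np
from dataclasses import dataclass, field

rng = np.random.default_rng()

@dataclass
class Genome:                       # assumed container, not the authors' code
    W_in: np.ndarray
    W_res: np.ndarray
    node_markings: list = field(default_factory=list)

def mutate_add_node(genome, marking):
    """Grow the reservoir by one neuron whose links start out disabled."""
    n = genome.W_res.shape[0]
    W = np.zeros((n + 1, n + 1))
    W[:n, :n] = genome.W_res                         # existing links unchanged
    genome.W_res = W                                 # new row/column: all zero, i.e. disabled
    new_row = rng.uniform(-1, 1, (1, genome.W_in.shape[1]))
    genome.W_in = np.vstack([genome.W_in, new_row])  # input weights for the new node
    genome.node_markings.append(marking)             # historical marking on the node
```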

Page 11: A NEAT Way for Evolving Echo State Networks


More Mutations

• Add/remove connections
• Mutate weights (see the sketch below):
  – Restart
  – Perturb
• Weights are added/changed so as to keep Mean(Wres) = 0
• Mutate density/spectral radius:
  – Restart
  – Perturb

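As referenced above, a sketch of the restart/perturb weight mutations; the rates and perturbation scale are assumptions:

```python
import numpy as np

rng = np.random.default_rng()

def mutate_weights(W, p_restart=0.1, sigma=0.1):
    """Restart some active weights, perturb the rest, keep Mean(Wres) ~ 0."""
    active = W != 0                                        # only touch enabled links
    restart = active & (rng.random(W.shape) < p_restart)
    W = np.where(restart, rng.uniform(-1, 1, W.shape), W)  # restart from scratch
    W = np.where(active & ~restart,
                 W + sigma * rng.normal(size=W.shape), W)  # small perturbation
    W[active] -= W[active].mean()                          # re-center toward mean 0
    return W
```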

Page 12: A NEAT Way for Evolving Echo State Networks


Crossover

[Figure: two parent networks, one with reservoir nodes 1–5 and one with nodes 1–3]

Let's assume the smaller genome is also the fittest.
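A hedged sketch of NEAT-style crossover through historical markings; the gene fields are assumptions, not the authors' code:

```python
import random
from dataclasses import dataclass

@dataclass
class Gene:
    innovation: int       # historical marking
    weight: float

def crossover(fitter_genes, other_genes):
    """Align genes by innovation number; disjoint/excess come from the fitter."""
    other = {g.innovation: g for g in other_genes}
    child = []
    for g in fitter_genes:
        match = other.get(g.innovation)
        if match is not None and random.random() < 0.5:
            child.append(match)   # matching gene: inherit from either parent
        else:
            child.append(g)       # unmatched half or disjoint/excess: from the fitter
    return child
```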

Page 13: A NEAT Way for Evolving Echo State Networks


Alignment

[Figure: the two genomes aligned by their historical markings]

Page 14: A NEAT Way for Evolving Echo State Networks


Fittest

[Figure: the offspring follows the fittest genome, here nodes 1–3]

Page 15: A NEAT Way for Evolving Echo State Networks


Alignment

[Figure: the two genomes aligned by their historical markings]

Page 16: A NEAT Way for Evolving Echo State Networks


Largest

[Figure: the offspring follows the largest genome, here nodes 1–5]

Page 17: A NEAT Way for Evolving Echo State Networks


Speciation

• ESNs are supposed to be sparse
• Structural similarity on connections, as in NEAT, would eliminate the notion of speciation
• Instead, measure similarity on the basic macroscopic ESN properties:
  – spectral radius, density, # nodes
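A sketch of such a distance measure; the coefficients and the species threshold are assumptions, by analogy with NEAT's compatibility distance:

```python
from dataclasses import dataclass

@dataclass
class MacroProps:                 # assumed genome summary, not the authors' code
    spectral_radius: float
    density: float
    n_nodes: int

def compatibility(a, b, c1=1.0, c2=1.0, c3=0.1):   # coefficients are assumptions
    """Distance over macroscopic ESN properties."""
    return (c1 * abs(a.spectral_radius - b.spectral_radius)
            + c2 * abs(a.density - b.density)
            + c3 * abs(a.n_nodes - b.n_nodes))

# Two genomes join the same species when compatibility(a, b) < threshold.
```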

Page 18: A NEAT Way for Evolving Echo State Networks


Learning

• Use simple gradient-descent (GD) TD-learning for RL
• Use least squares for time series – online updating is not required
• Tons of methods can be used here (either under TD-learning or doing policy search using EC)
• Also tested Darwinian versus Lamarckian evolution
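Minimal sketches of the two learning modes on the linear readout; the step size, discount, and ridge term are illustrative assumptions:

```python
import numpy as np

def td_update(w, phi, reward, phi_next, alpha=0.01, gamma=0.99):
    """Gradient-descent TD(0) for the RL case; phi collects [u(t); x(t)]."""
    delta = reward + gamma * (w @ phi_next) - (w @ phi)   # TD error
    return w + alpha * delta * phi                        # gradient-descent step

def ls_readout(Phi, Y, ridge=1e-8):
    """Batch least squares for time series; no online updating required."""
    n = Phi.shape[1]
    return np.linalg.solve(Phi.T @ Phi + ridge * np.eye(n), Phi.T @ Y)
```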

Page 19: A NEAT Way for Evolving Echo State Networks


Basic Flow

Init Pop → Simulation + Learning → Fitness → Speciation → Selection → Mutation + Crossover → Next Gen (and back to Simulation); the final Champion is evaluated for its generalization performance.
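A toy-scale sketch of this loop; the genome is reduced to a single float and every operator is a trivial stand-in, purely to make the flow concrete:

```python
import random

def random_genome():     return random.uniform(-1, 1)             # Init Pop
def learn(g):            return g + 0.01                          # Simulation + Learning
def fitness(g):          return -abs(g - 0.5)                     # Fitness
def speciate(pop):       return [pop]                             # Speciation (one species)
def reproduce(ps, n):    return [random.choice(ps) + random.gauss(0, 0.1)
                                 for _ in range(n)]               # Mutation + Crossover

def evolve(pop_size=20, generations=50):
    population = [random_genome() for _ in range(pop_size)]
    for _ in range(generations):
        population = [learn(g) for g in population]
        parents = [g for sp in speciate(population)
                   for g in sorted(sp, key=fitness, reverse=True)[:5]]  # Selection
        population = reproduce(parents, pop_size)                 # Next Gen
    return max(population, key=fitness)                           # Champion
```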

Page 20: A NEAT Way for Evolving Echo State Networks


Experiments

• Reinforcement Learning
  – Mountain Car
  – Single & Double Pole Balancing
• Time Series
  – Mackey-Glass

Page 21: A NEAT Way for Evolving Echo State Networks


Time Series – Mackey-Glass

• Better prediction errors than another recent TWEANN-over-ESN algorithm
  – One order of magnitude better, for both test and generalization errors
• Main differences: we start minimally and perform crossover and speciation

Page 22: A NEAT Way for Evolving Echo State Networks


Mountain Car

• Same behavior as the NEAT+Q algorithm
  – NEAT+Q = "NEAT" + "Q-learning through back-propagation" – "recurrences"
• Same generalization behavior (around -50)
• Our approach also solves the non-Markovian 2D and 3D mountain car problems with learning (only the position signal is available, not the speed)

Page 23: A NEAT Way for Evolving Echo State Networks


Pole Balancing

• Comparable performance to NEAT with respect to the number of networks evaluated
• Our approach takes more time due to the learning procedure
• A first warning bell that a more advanced learning algorithm than simple GD should be accommodated

Page 24: A NEAT Way for Evolving Echo State Networks


Vs.

• Simple ESN
  – Its problems are probably due to online learning (online learning, RL, and NNs are not a good triplet)
  – Not a problem with our approach, since neuroevolution finds ESNs "that are better able to learn"
• Linear TD-learning (no reservoir)
  – No reservoir means no non-Markovian signals; worse behavior
• No learning, only evolution
  – No clear conclusions, but versus simple GD TD-learning this is a second warning bell

Page 25: A NEAT Way for Evolving Echo State Networks


Experiments

• Reinforcement Learning
  – Mountain Car
  – Single & Double Pole Balancing
  – 3D Mountain Car
  – Server Job Scheduling [+++] (15% improvement over NEAT+Q)
• Time Series
  – Mackey-Glass [+]
  – MSO [-]
  – Lorenz Attractor [+]

Page 26: A NEAT Way for Evolving Echo State Networks


Future Work

• Even more automation, driven by the problem at hand
  – For example, adapting operator probabilities online
• Test new RL-TD learning techniques, e.g., iLSTD, GQ
• More difficult test-beds:
  – TAC Ad Auctions
  – Poker
  – Real-Time Strategy

Page 27: A NEAT Way for Evolving Echo State Networks

ECAI 2010

Thank you for your attention

Questions?

Kyriakos [email protected]

http://issel.ee.auth.gr