A temporal classifier system using spiking neural networks


Description

David Howard, Larry Bull, Pier Luca Lanzi. "A temporal classifier system using spiking neural networks". IWLCS, 2011

Transcript of A temporal classifier system using spiking neural networks

  • 1. A temporal classifier system using spiking neural networks
    Gerard David Howard, Larry Bull & Pier-Luca Lanzi
    {david4.howard, larry.bull}@uwe.ac.uk
    pierluca.lanzi@polimi.it

2. Contents
Intro & Motivation
System architecture: Spiking XCSF
Constructivism (nodes and connections)
Working in continuous space
Comparison to MLP / Q-learner
Taking time into consideration
Comparison to MLP
Simulated robotics
3. Motivation
Many real-world tasks incorporate continuous space and continuous time
Autonomous robotics remains an open question: robots will require some degree of knowledge self-shaping, i.e. control over their internal knowledge representation
We introduce an LCS containing spiking networks and demonstrate the usefulness of the representation
Handles continuous space and continuous time
Representation structure dependent on environment
4. XCSF
Includes computed prediction, calculated from the input state (augmented by the constant x0) and a weight vector; each classifier has its own weight vector
Weights are updated linearly using a modified delta rule
Main differences from canonical:
SNN replaces condition and calculates action
Self-adaptive parameters give autonomous learning control
Topology of networks altered in GA cycle
Generalisation from computed prediction, computed actions and network topologies
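The computed prediction and modified delta rule above can be sketched as follows. This is a minimal illustration, not the authors' implementation; the learning rate and the normalisation by the squared input magnitude are standard XCSF conventions assumed here.

```python
import numpy as np

X0 = 1.0  # constant augmenting the input state

def computed_prediction(weights, state):
    """Classifier prediction as a linear function of the augmented input state."""
    x = np.concatenate(([X0], state))  # prepend the constant x0
    return float(np.dot(weights, x))

def delta_rule_update(weights, state, target, eta=0.2):
    """Modified (normalised) delta rule: step size scaled by |x|^2."""
    x = np.concatenate(([X0], state))
    error = target - np.dot(weights, x)
    return weights + (eta / np.dot(x, x)) * error * x

# Toy usage: repeated updates drive the prediction toward the target payoff.
w = np.zeros(3)
for _ in range(100):
    w = delta_rule_update(w, np.array([0.4, 0.7]), target=1000.0)
print(round(computed_prediction(w, np.array([0.4, 0.7]))))  # -> 1000
```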
5. Spiking networks
Spiking networks have temporal functionality
We use Integrate-and-Fire (IAF) neurons
Each neuron has a membrane potential (m) that varies through time
When m exceeds a threshold, the neuron sends a spike to every neuron in the network that it has a forward connection to, and resets m
Membrane potential is a way of implementing memory
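The IAF behaviour described above can be sketched in a few lines. The threshold value, input current, and reset-to-zero are illustrative assumptions, not the paper's parameters.

```python
class IAFNeuron:
    """Minimal integrate-and-fire neuron (threshold value is an assumption)."""
    def __init__(self, threshold=1.0):
        self.m = 0.0  # membrane potential, varies through time
        self.threshold = threshold

    def step(self, input_current):
        # integrate the incoming current into the membrane potential
        self.m += input_current
        if self.m > self.threshold:
            # spike: sent to every forward-connected neuron, then m resets
            self.m = 0.0
            return True
        return False

neuron = IAFNeuron()
spikes = [neuron.step(0.4) for _ in range(10)]
print(spikes.count(True))  # -> 3 spikes over 10 steps
```

Because the membrane potential carries charge across time steps, the neuron's output depends on input history, which is how spiking networks implement memory.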
6. Spiking networks
IAF spiking network replaces condition and action; 2 input nodes, 3 output nodes
Each input state is processed 5 times by the spiking network. Neural outputs are spike trains: high (>= 3 spikes) or low (< 3 spikes)

11. Continuous grid world
Goal region where x + y > 1.9; darker regions of the grid represent higher expected payoff. Reaching the goal returns a reward of 1000, else 0
Agent starts randomly anywhere in the grid except the goal state, aims to reach the goal (moving 0.05 per step) in the fewest possible steps (avg. opt. 18.6)
[Figure: agent position in the 2D grid; both axes run from 0.00 to 1.00]
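The continuous grid task can be sketched like so. This is a hypothetical reconstruction from the slide's numbers (step size 0.05, reward 1000 at the goal, 0 otherwise); the goal region being the corner where x + y > 1.9 is an assumption.

```python
import random

STEP = 0.05           # movement per action
REWARD_GOAL = 1000.0  # reward on reaching the goal, else 0

MOVES = {'N': (0.0, STEP), 'E': (STEP, 0.0),
         'S': (0.0, -STEP), 'W': (-STEP, 0.0)}

def at_goal(pos):
    # goal region assumed to be the top-right corner, x + y > 1.9
    return pos[0] + pos[1] > 1.9

def reset():
    """Start anywhere in the unit grid except the goal region."""
    while True:
        pos = (random.random(), random.random())
        if not at_goal(pos):
            return pos

def step(pos, action):
    """Apply one move, clamp to the unit grid, return (new_pos, reward)."""
    dx, dy = MOVES[action]
    new = (min(max(pos[0] + dx, 0.0), 1.0),
           min(max(pos[1] + dy, 0.0), 1.0))
    return new, (REWARD_GOAL if at_goal(new) else 0.0)

pos, r = step((0.9, 0.98), 'E')
print(r)  # 0.95 + 0.98 = 1.93 > 1.9, so -> 1000.0
```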
12. Discrete movement
Agent can make a single discrete movement (N, E, S, W): N=(HIGH,HIGH), E=(HIGH,LOW), etc.
Experimental parameters: N=20000, γ=0.95, β=0.2, ε0=0.005, θGA=50, θDEL=50
XCSF parameters as normal.
Initial prediction error in new classifiers=0.01, initial fitness=0.1
An additional trial from a fixed location lets us perform t-tests. Stability records the first trial from which 50 consecutive trials reach the goal state from this location.
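The action selection above can be sketched by thresholding each output node's spike count (high when >= 3 spikes over the 5 presentations of the input state) and looking up the HIGH/LOW pair. The slide gives N and E explicitly; the S and W mappings here are assumed by extension from its "etc.", and only two of the output nodes are used for the four moves.

```python
HIGH_THRESHOLD = 3  # spikes over 5 presentations of the input state

# N=(HIGH,HIGH) and E=(HIGH,LOW) from the slide; S and W assumed by extension
ACTIONS = {(True, True): 'N', (True, False): 'E',
           (False, True): 'S', (False, False): 'W'}

def decode(spike_counts):
    """Threshold each output node's spike count and look up the action."""
    high = tuple(c >= HIGH_THRESHOLD for c in spike_counts)
    return ACTIONS[high]

print(decode((4, 1)))  # (HIGH, LOW) -> 'E'
```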
13. Discrete movement

  • Stability = 50 consecutive optimals from a fixed location

14. Fewer macroclassifiers = greater generalisation

15. Lower mutation rate = more stable evolutionary process