
Evolving a Sigma-Pi Network as a Network Simulator

by Justin Basilico

Problem description

To evolve a neural network that acts as a general network which can be "programmed" by its inputs to act as a variety of different networks.

Input: Another network and its input.
Output: The output of the given network on the input.

Use a sigma-pi network and evolve its connectivity using a genetic algorithm.
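For reference, a sigma-pi unit computes a weighted sum of products of selected inputs rather than a plain weighted sum. A minimal sketch (the function name, the index-tuple representation, and the sigmoid activation are illustrative assumptions, not the implementation from the slides):

```python
import numpy as np

def sigma_pi_unit(x, conjuncts, weights):
    """One sigma-pi unit: a weighted sum of products of selected inputs.

    x         -- input vector
    conjuncts -- list of index tuples; each tuple picks the inputs whose
                 product forms one term
    weights   -- one weight per conjunct
    """
    net = sum(w * np.prod([x[i] for i in idx])
              for idx, w in zip(conjuncts, weights))
    return 1.0 / (1.0 + np.exp(-net))   # sigmoid; linear units also appear later

# Example: computes sigmoid(0.5 * x0 * x1 + 1.0 * x2)
y = sigma_pi_unit([0.2, 0.8, -0.4], conjuncts=[(0, 1), (2,)], weights=[0.5, 1.0])
```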

Problem motivation

If a network can be created to simulate other networks given as input, then perhaps we can build neural networks that act upon other neural networks.

It would be interesting to see if one network could apply backpropagation to another network.

Previous work

This problem remains largely unexplored.

Evolutionary techniques have been applied to networks similar to sigma-pi networks:

Janson & Frenzel, Training Product Unit Neural Networks with Genetic Algorithms (1993). Evolved product unit networks, which are "more powerful" than sigma-pi networks since they allow a variable exponent.

Previous work

Papers from class that provide some information and inspiration:

Plate, Randomly connected sigma-pi neurons can form associator networks (2000)

Belew, McInerney, & Schraudolph, Evolving Networks: Using the Genetic Algorithm with Connectionist Learning (1991)

Chalmers, The Evolution of Learning: An Experiment in Genetic Connectionism (1990)

Approach (Overview)

Generate a testing set of 100 random networks to simulate.

Generate an initial population of chromosomes for the sigma-pi network.

For each generation, decode each chromosome into a sigma-pi network and use the fitness function to evaluate the network's fitness as a simulator on the testing set. A sketch of this loop follows below.
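A minimal, runnable sketch of this loop, with stubbed-out helpers standing in for the pieces detailed on later slides (all names, sizes, and stub bodies are illustrative assumptions):

```python
import random

CHROMOSOME_LENGTH = 64          # depends on the sigma-pi architecture
POP_SIZE = 100

def make_random_case():         # stub: a random network plus its input and output
    return None

def decode(chrom):              # stub: chromosome -> sigma-pi network
    return chrom

def fitness(network, chrom, testing_set):   # stub: smaller is better
    return random.random()

def next_generation(ranked):    # stub: selection, crossover, mutation
    return ranked

testing_set = [make_random_case() for _ in range(100)]   # 100 networks to simulate
population = ["".join(random.choice("01") for _ in range(CHROMOSOME_LENGTH))
              for _ in range(POP_SIZE)]

for generation in range(500):
    # Decode each chromosome, score it as a simulator, and breed the next
    # generation from the ranked population (best first).
    ranked = sorted(population, key=lambda c: fitness(decode(c), c, testing_set))
    population = next_generation(ranked)
```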

Approach (Overview)

First try to simulate single-layer networks:
- 2 input and 1 output units
- 2 input and 2 output units

Then try it on a multi-layer network:
- 2 input, 2 hidden, and 2 output units

Approach: Input encoding

The simulation network is given the input to the simulated network along with the weight values for the network it is simulating.

Generate a fully-connected, feed-forward network with random weights along with a random input, then feed the input through the network to get the output.

Approach: Input encoding

Example: [Figure: a feed-forward network with inputs input1 and input2, a bias, two hidden units, and one output, with first-layer weights w30, w31, w32, w40, w41, w42 and second-layer weights w50, w53, w54. The input encoding concatenates the network's inputs with weight layer 1 and weight layer 2.]
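A sketch of how one such training case might be generated, assuming the fully-connected 2-2-1 linear network in the example above (the function and variable names are illustrative). Note that the encoding then holds 2 + 6 + 3 = 11 values, matching the 11-unit input layer mentioned in the results:

```python
import numpy as np

rng = np.random.default_rng()

def random_case():
    x = rng.uniform(-1.0, 1.0, size=2)           # random input (input1, input2)
    W1 = rng.uniform(-1.0, 1.0, size=(2, 3))     # hidden weights, bias column first
    W2 = rng.uniform(-1.0, 1.0, size=(1, 3))     # output weights, bias column first
    h = W1 @ np.append(1.0, x)                   # linear hidden units
    y = W2 @ np.append(1.0, h)                   # linear output unit
    # The simulator's input: the simulated network's own input followed by
    # its weights, layer by layer. The target is the simulated output y.
    encoding = np.concatenate([x, W1.ravel(), W2.ravel()])
    return encoding, y
```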

Approach: Target output encoding

The target output is the output that the randomly weighted network produces on its random input.

Approach: Chromosome encoding

Each chromosome encodes the connectivity (architecture) of the sigma-pi network.

To simplify things, network weights are allowed to be either 1.0, signifying that a connection is present, or 0.0, signifying that it is not.

Initialize each chromosome to a random string of bits.

Approach: Chromosome encoding

To encode the connectivity of a layer with m units (plus a bias unit) to a layer with n units, use a binary string of length (m + 1) × n.

Example: 011010 110001
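A sketch of decoding one layer's connectivity from such a string, assuming the bits are laid out as n rows of (m + 1) bits, one row per destination unit, with the extra bit for the bias connection (the layout is an assumption; the slides only give the total length):

```python
import numpy as np

def decode_layer(bits, m, n):
    """Turn (m + 1) * n bits into an n x (m + 1) 0/1 connectivity matrix."""
    assert len(bits) == (m + 1) * n
    return np.array([int(b) for b in bits]).reshape(n, m + 1)

# The example string read as m = 5 source units (plus bias) feeding n = 2 units.
W = decode_layer("011010110001", m=5, n=2)
# W[i, j] == 1 means destination unit i receives source unit j
# (column 0 is taken as the bias connection in this sketch).
```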

Approach: Genetic algorithm

Selection: Chromosomes are ranked by fitness; the probability of selection is based on rank.

Crossover: Randomly select bits in the chromosome for crossover. (I might add in some sort of functional unit here.)

Mutation: Each bit in every chromosome has a mutation rate of 0.01.

A sketch of these three operators follows below.
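A minimal sketch of the operators, reading "randomly select bits for crossover" as uniform crossover and using linear rank weights for selection (both readings are assumptions):

```python
import random

MUTATION_RATE = 0.01

def select(ranked):
    """Rank-based selection: ranked is ordered best-first, so the best
    chromosome gets the largest linear rank weight."""
    n = len(ranked)
    return random.choices(ranked, weights=[n - i for i in range(n)], k=1)[0]

def crossover(a, b):
    """Uniform crossover: each bit position drawn from either parent."""
    return "".join(x if random.random() < 0.5 else y for x, y in zip(a, b))

def mutate(chrom):
    """Flip each bit independently with probability MUTATION_RATE."""
    return "".join(('1' if b == '0' else '0') if random.random() < MUTATION_RATE
                   else b for b in chrom)
```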

Approach: Fitness function

Build a sigma-pi network from the chromosome.

Test the sigma-pi network on the testing set of 100 networks.

Better chromosomes have smaller fitness values.

Approach: Fitness function

Attempt 1: Mean squared error. Problem: the evolved networks just always guessed 0.5 because a sigmoid activation function was used.

Attempt 2: Number of outputs that are not within a threshold of 0.05 of the target. Problem: we want an optimal solution with as few weights in the network as possible.

Attempt 3: Use the second function and also factor in the number of 1's in the chromosome (sketched below).
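A sketch of the third fitness function, counting outputs that miss their target by more than the threshold and penalizing the number of 1's in the chromosome (the penalty weight is an assumption; the slides give no value):

```python
THRESHOLD = 0.05
ONES_PENALTY = 0.01   # assumed weight on the count of 1's

def fitness(network, chromosome, testing_set):
    """Smaller is better: misses on the test set plus a sparsity penalty."""
    misses = sum(1 for encoding, target in testing_set
                 if abs(network(encoding) - target) > THRESHOLD)
    return misses + ONES_PENALTY * chromosome.count("1")
```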

Results

So far:

Tried to train a backpropagation network to do the simulation, but it did not work.

Managed to evolve sigma-pi network architectures that simulate simple, one-layer networks with linear units.

Still working on simulating networks with two layers and with sigmoid units.

Results: Network with 2 input and 1 output units and linear activation

Population: 100. Optimal solution found after 24 generations.

[Figure: the simulated network, with inputs input1 and input2, a bias, and weights w30, w31, w32.]

Results: Network with 2 input and 2 output units and linear activation

Population: 150. Optimal solution found after 121 generations (stabilizes after 486).

[Figure: the simulated network, with inputs input1 and input2, a bias, and weights w30, w31, w32, w40, w41, w42.]

Results (so far): Network with 2 input, 2 hidden, and 1 output units and linear activation

Have not gotten it to work yet. The sigma-pi network has 3 hidden layers (11-9-5-3-1).

The problem might have to do with the sparse connectivity of the solution, where input weights need to be "saved" for later layers.

Potential solution: fix more of the network architecture so that the chromosome is smaller.

Future work

Expand the evolution parameters to allow wider variation in the evolved networks (network weights, activation functions).

Try to simulate larger networks.

Evolve a network that implements backpropagation:
- Start small with just the delta rule for output and hidden units.
- Work up to a network that does full backpropagation.

More future work

Evolve networks that create their own learning rules.

Use a learning algorithm for training sigma-pi networks rather than evolution.

Create a sigma-pi network that simulates other sigma-pi networks.