
Lecture 5: Linear Genetic Programming

CIU036 Artificial Intelligence 2, 2010

Krister Wolff, Ph.D.

Department of Applied Mechanics

Chalmers University of Technology

41296 Goteborg, Sweden


November 23, 2010

Table of Contents

1 GP fundamentals

2 LGP fundamentals

3 Function regression using LGP

4 Evolution of robot gaits using LGP

5 Sources

- - K. Wolff, Ph.D. November 2010 2/38

1 GP fundamentals

The flow of a generic EA

- The flow of the algorithm is pretty much the same as for GAs, but in GP the individuals of the population can be interpreted as computer programs (i.e. a list of instructions, written in a programming language, that is used to control the behavior of a machine).


Representation

- In genetic programming the individuals are represented as trees:
- GP trees consist of operators (e.g. +, ∗) and terminals (e.g. x, 1).
- Evaluation proceeds repeatedly from the left: f(x) = (x + 1)x + 1
- Here, the individual represents a mathematical function. Function fitting is an important application area for GP.
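As a minimal sketch of how such a tree can be stored and evaluated, the snippet below encodes f(x) = (x + 1)x + 1 as nested tuples. The tuple layout and the names `OPS`, `evaluate` and `tree` are illustrative assumptions, not something prescribed by the lecture.

```python
import operator

# Operators available at the internal nodes (an assumed, minimal set).
OPS = {"+": operator.add, "*": operator.mul}

def evaluate(node, x):
    """Recursively evaluate a GP tree for a given value of the variable x."""
    if isinstance(node, tuple):
        # Internal node: (operator, left subtree, right subtree).
        op, left, right = node
        return OPS[op](evaluate(left, x), evaluate(right, x))
    if node == "x":
        return x        # terminal: the variable
    return node         # terminal: a numeric constant

# f(x) = (x + 1) * x + 1
tree = ("+", ("*", ("+", "x", 1), "x"), 1)
```

For example, `evaluate(tree, 2)` gives (2 + 1) · 2 + 1 = 7.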


Crossover and mutation

- Crossover in tree-based GP:
- Mutation in tree-based GP:
  - Randomly select a node in the tree and replace the existing tree at that point with a new subtree.
  - The new subtree is created in the same way as the initial population.
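The mutation step described above could be sketched as follows, again with trees stored as nested tuples. The helper names, the terminal and operator sets, the depth limit and the 0.3 probability of stopping early are all assumptions made for illustration.

```python
import random

TERMINALS = ["x", 1]      # assumed terminal set
OPERATORS = ["+", "*"]    # assumed operator set

def random_tree(depth, rng):
    """Grow a random tree; the same routine would create the initial population."""
    if depth == 0 or rng.random() < 0.3:
        return rng.choice(TERMINALS)
    return (rng.choice(OPERATORS), random_tree(depth - 1, rng), random_tree(depth - 1, rng))

def node_paths(tree, path=()):
    """List the path (sequence of child indices) to every node in the tree."""
    paths = [path]
    if isinstance(tree, tuple):
        for i in (1, 2):
            paths += node_paths(tree[i], path + (i,))
    return paths

def replace_at(tree, path, subtree):
    """Return a copy of tree with the node at path replaced by subtree."""
    if not path:
        return subtree
    children = list(tree)
    children[path[0]] = replace_at(tree[path[0]], path[1:], subtree)
    return tuple(children)

def mutate(tree, rng, max_depth=2):
    """Subtree mutation: pick a random node and put a new random subtree there."""
    path = rng.choice(node_paths(tree))
    return replace_at(tree, path, random_tree(max_depth, rng))

rng = random.Random(0)
parent = ("+", ("*", ("+", "x", 1), "x"), 1)
child = mutate(parent, rng)   # the parent itself is left unchanged
```

Because tuples are immutable, `replace_at` builds a new tree and the parent remains available for further selection.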


GP literature

- Books on genetic programming (only two examples):
  - Wolfgang Banzhaf et al., Genetic Programming: An Introduction.
  - John R. Koza et al., Genetic Programming III.


2 LGP fundamentals

The atoms of linear genetic programming

- Linear genetic programming (LGP) is a different "flavor" of GP. The individuals are represented by linear structures built up of registers and instructions:
  - Registers
    - Variable registers (denoted r_i) are utilized for input, manipulating data, and storing the resulting output from a calculation.
    - Constant registers (denoted c_i) are write-protected once they have been initialized with a value.
  - Instructions
    - An instruction consists of an operator (e.g. +, −, ×, ÷) and operand(s) (i.e. the arguments): r3 := c2 + r1
- Note that LGP makes it possible to use multiple program outputs, whereas tree-based GP provides only one program output.


LGP programs

- Example: A simple LGP program that consists of five instructions:

    1: r2 := r1 + c1
    2: r3 := r2 + r2
    3: r1 := r2 × r3
    4: if (r1 > c2)
    5: r1 := r1 + c1

- LGP program execution starts with the topmost instruction and proceeds down the list. It stops when the last instruction has been executed.
- Conditional branching (i.e. instruction 4): If the condition is true, the subsequent instruction (i.e. instruction 5) is executed; otherwise that instruction is skipped.
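The execution rule can be illustrated with a small interpreter for the five-instruction program. This is only a sketch: the tuple representation of instructions, the function name `run` and the register values used in the two calls are assumptions chosen to exercise both outcomes of the branch.

```python
def run(program, regs):
    """Execute instructions top to bottom; a false branch skips the next one."""
    regs = dict(regs)              # work on a copy of the register contents
    pc = 0                         # index of the instruction to execute next
    while pc < len(program):
        instr = program[pc]
        if instr[0] == "if>":      # conditional branch: ("if>", a, b)
            _, a, b = instr
            pc += 1 if regs[a] > regs[b] else 2
        else:                      # assignment: (dest, op, a, b)
            dest, op, a, b = instr
            if op == "+":
                regs[dest] = regs[a] + regs[b]
            else:                  # op == "*"
                regs[dest] = regs[a] * regs[b]
            pc += 1
    return regs

program = [
    ("r2", "+", "r1", "c1"),       # 1: r2 := r1 + c1
    ("r3", "+", "r2", "r2"),       # 2: r3 := r2 + r2
    ("r1", "*", "r2", "r3"),       # 3: r1 := r2 * r3
    ("if>", "r1", "c2"),           # 4: if (r1 > c2)
    ("r1", "+", "r1", "c1"),       # 5: r1 := r1 + c1
]

# Hypothetical register contents: with c2 = 3 the branch is taken,
# with c2 = 100 instruction 5 is skipped.
taken = run(program, {"r1": 1, "r2": 0, "r3": 0, "c1": 1, "c2": 3})
skipped = run(program, {"r1": 1, "r2": 0, "r3": 0, "c1": 1, "c2": 100})
```

In the first call r1 ends at 9 (instruction 5 executed); in the second it stays at 8.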


Examples of typical LGP operators

- A basic LGP instruction set (i.e. set of allowed instructions):

    Instruction          Description
    Addition             r_i := r_j + r_k
    Subtraction          r_i := r_j − r_k
    Multiplication       r_i := r_j × r_k
    Division (1)         r_i := r_j / r_k
    Exponentiation       r_i := e^(r_j)
    Logarithm            r_i := ln r_j
    Sine                 r_i := sin r_j
    Cosine               r_i := cos r_j
    Square               r_i := r_j^2
    Square root          r_i := √(r_j)
    Conditional branch   if r_i > r_j
    Conditional branch   if r_i ≤ r_j

- The operand(s) (r_j, r_k) can be either variable registers or constant registers, but the destination (r_i) must be a variable register.

(1) See the next slide (Semantic correctness of LGP programs).

Semantic correctness of LGP programs

- An invalid instruction in an LGP program may cause a program crash. Examples of such problems are:
  - Illegal jump instructions (may arise from augmented branching instructions).
  - Division by zero.
- For the latter case a protected division operator is used:

\[
r_i := \begin{cases} r_j / r_k & \text{if } r_k \neq 0 \\ c_{\max} & \text{otherwise} \end{cases}
\]

  where c_max is a large pre-specified constant.
- Newly generated programs must be screened in order to ensure semantic correctness.
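A protected division operator of this kind takes only a few lines. In the sketch below the value of `C_MAX` is an assumption, since the lecture only states that c_max is large and pre-specified.

```python
C_MAX = 1.0e6  # assumed value; the lecture only says "a large pre-specified constant"

def protected_div(rj, rk, c_max=C_MAX):
    """Return rj / rk, or c_max when the divisor is zero."""
    return rj / rk if rk != 0 else c_max
```

Using this function wherever an evolved program divides means a zero divisor yields a large but finite value instead of raising an exception.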


LGP chromosomes

- Example of an LGP encoding scheme. Let the operator, operands, and destination be represented as a sequence of integers (2):
  - Each gene consists of four integers, e.g. 1 2 1 4.
  - In the case of three-number instructions (e.g. Sine, Cosine), the fourth number of the gene should be ignored!

(2) Misprint! 5 1 5 1 should be 5 1 1 5.

LGP encoding

- Consider the registers:
  - The set of variable registers: R = {r1, r2, r3}
  - The set of constant registers: C = {c1, c2, c3}
  - The union of these sets, R ∪ C, is denoted A = {r1, r2, r3, c1, c2, c3}.
- Consider the operators:
  - The set of operators addition, subtraction, multiplication, division, and the conditional branching r_i > r_j: O = {o1, o2, o3, o4, o5}
- Consider the first gene of the chromosome: 1 2 1 4.
  - 1: Operator ∈ O (addition)
  - 2: Destination ∈ R (variable register)
  - 1: Operand 1 ∈ A (any register)
  - 4: Operand 2 ∈ A (any register)
- Thus, the decoded instruction is: r2 := r1 + c1.
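The decoding of a gene amounts to a few table lookups. The sketch below follows the sets R, C, A and O defined above; the function name `decode`, the output format, and the choice to ignore the destination field of a branch gene are assumptions.

```python
R = ["r1", "r2", "r3"]            # variable registers
C = ["c1", "c2", "c3"]            # constant registers
A = R + C                         # all registers, in the order used for operands
O = ["+", "-", "*", "/", ">"]     # operators o1..o5

def decode(gene):
    """Translate a four-integer gene into a readable instruction (1-based indices)."""
    op, dest, a, b = gene
    symbol = O[op - 1]
    if symbol == ">":             # conditional branch: destination field is unused (assumed)
        return f"if ({A[a - 1]} > {A[b - 1]})"
    return f"{R[dest - 1]} := {A[a - 1]} {symbol} {A[b - 1]}"
```

Here `decode((1, 2, 1, 4))` returns the string `r2 := r1 + c1`, matching the instruction decoded by hand.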


Evaluation of LGP chromosome

- Before a chromosome can be evaluated, the registers must be initialized. Suppose that:
  - Constant registers: c1 = 1, c2 = 3, c3 = 10
  - Variable registers: r1 = 1, r2 = r3 = 0

    Genes      Instruction        Result (r1, r2, r3)
    1 2 1 4    r2 := r1 + c1      1, 2, 0
    1 3 2 2    r3 := r2 + r2      1, 2, 4
    3 1 2 3    r1 := r2 × r3      8, 2, 4
    5 1 1 5    if (r1 > c2)       8, 2, 4
    1 1 1 4    r1 := r1 + c1      9, 2, 4

- Furthermore, suppose that the input was provided through register r1 and that the output was taken from r1 as well: Output = r1 = 9
- However, note that any of the registers r1 to r3 could be utilized as input or output register(s)!
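The register trace in the table can be reproduced by executing the integer-encoded chromosome directly. The following is a sketch under the encoding described on the previous slides (operator, destination, operand 1, operand 2, all 1-based); the function name, the trace format and the value used for c_max are assumptions.

```python
def evaluate(chromosome, variables, constants):
    """Run an integer-encoded LGP chromosome; return (r1, r2, r3) after each gene."""
    regs = list(variables) + list(constants)    # A = [r1, r2, r3, c1, c2, c3]
    n_var = len(variables)
    trace = []
    skip = False
    for op, dest, a, b in chromosome:
        if skip:                                # the preceding branch was false
            skip = False
        elif op == 5:                           # conditional branch: if a > b
            skip = not (regs[a - 1] > regs[b - 1])
        else:
            x, y = regs[a - 1], regs[b - 1]
            if op == 1:
                value = x + y
            elif op == 2:
                value = x - y
            elif op == 3:
                value = x * y
            else:                               # op == 4: protected division
                value = x / y if y != 0 else 1.0e6   # assumed c_max
            regs[dest - 1] = value
        trace.append(tuple(regs[:n_var]))
    return trace

chromosome = [(1, 2, 1, 4), (1, 3, 2, 2), (3, 1, 2, 3), (5, 1, 1, 5), (1, 1, 1, 4)]
trace = evaluate(chromosome, variables=[1, 0, 0], constants=[1, 3, 10])
output = trace[-1][0]   # the output is read from r1
```

With the initial values above, `trace` matches the five rows of the table and `output` is 9.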


Evolutionary operators in LGP

- It is common to use two-point, length-varying (non-homologous) crossover in LGP.
- Apart from being non-homologous, crossover in LGP works in much the same way as for the bit strings commonly used in GAs.
- Mutation in LGP proceeds as follows: Randomly select an instruction for mutation, then with some probability:
  - change (i) the destination register, (ii) the operator, and (iii) the operands, in each case within the allowed ranges.
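Both operators can be sketched on chromosomes stored as lists of four-integer genes. The function names and parameters below are assumptions; in particular, the sketch places crossover points only between instructions and applies the mutation probability independently to each field of the selected instruction.

```python
import random

def two_point_crossover(p1, p2, rng):
    """Swap a random segment of p1 with a random segment of p2.

    The two segments are chosen independently, so they may differ in length
    and position (non-homologous) and the offspring lengths may change.
    """
    a1, b1 = sorted(rng.sample(range(len(p1) + 1), 2))
    a2, b2 = sorted(rng.sample(range(len(p2) + 1), 2))
    child1 = p1[:a1] + p2[a2:b2] + p1[b1:]
    child2 = p2[:a2] + p1[a1:b1] + p2[b2:]
    return child1, child2

def mutate(chromosome, rng, p_mut, n_ops, n_var, n_all):
    """Select one instruction; redraw each field with probability p_mut."""
    ranges = (n_ops, n_var, n_all, n_all)   # operator, destination, operand 1, operand 2
    mutated = list(chromosome)
    i = rng.randrange(len(mutated))         # instruction selected for mutation
    mutated[i] = tuple(
        rng.randint(1, hi) if rng.random() < p_mut else field
        for field, hi in zip(mutated[i], ranges)
    )
    return mutated
```

Drawing each new value from `1..hi` keeps every field inside its allowed range, so a mutated gene always decodes to a valid operator, destination and operands.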


3 Function regression using LGP

Example 3.11: Function regression

Objective

An "unknown" function f(x) has been measured at L = 101 equidistant points in the range [−2, 2]. Use LGP to find a function f̂(x) that fits the measured data.

- Solution: An LGP system was implemented with the following setup:
  - The set of operators was taken as O = {o1, o2, o3, o4} = {+, −, ×, /}, encoded as 1, 2, 3 and 4.
  - The set of constant registers, C = {c1, c2, c3}, was initialized to c1 = 1, c2 = 3, c3 = −1.
  - The set of variable registers, R = {r1, r2, r3}, was initialized to r1 = x_i, r2 = r3 = 0. Thus, R ∪ C = A = {r1, r2, r3, c1, c2, c3}.
  - The L = 101 data points f(x_i), i = 1, ..., L, were considered one at a time, taking the output f̂(x_i) as the content of r1 at the end of the evaluation.



- The fitness function was taken as F = 1/E_RMS, where E_RMS is the root-mean-square (RMS) error:

\[
E_{\mathrm{RMS}} = \sqrt{\frac{1}{L} \sum_{i=1}^{L} \left( f(x_i) - \hat{f}(x_i) \right)^2}
\]

- The evolutionary parameters:
  - Tournament selection, size 5, p_tour = 0.75
  - Two-point non-homologous crossover, p_c = 0.20
  - Mutation rate p_mut = 0.04
  - Initial chromosome length: [5, 25]
  - Population size: 100 individuals
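The fitness measure F = 1/E_RMS translates directly into code. In this sketch the function names are assumptions, and so is the handling of a perfect fit (E_RMS = 0), which the lecture does not discuss; it is mapped to infinite fitness here.

```python
import math

def rms_error(f_measured, f_model):
    """Root-mean-square deviation between measured and model values."""
    L = len(f_measured)
    return math.sqrt(sum((y - y_hat) ** 2 for y, y_hat in zip(f_measured, f_model)) / L)

def fitness(f_measured, f_model):
    """F = 1 / E_RMS; a perfect fit (E_RMS = 0) is mapped to infinity (assumption)."""
    e = rms_error(f_measured, f_model)
    return math.inf if e == 0 else 1.0 / e
```

Here `f_model` would hold the content of r1 after running the chromosome once for each x_i.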



- Result of the LGP run, after about 50,000 generations:
  - The "unknown" function: f(x) = 1 + 8x − 3x^3 − 2x^4 + x^6.
  - A typical "best" function found by the LGP system: f̂(x) = 9x + x^2 − 3x^3 − 2x^4 + x^6, E_RMS = 1.723.
  - The chromosome consisted of 13 instructions.
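The reported error can be cross-checked numerically. The sketch below evaluates both polynomials at the 101 sample points, taking the x^4 coefficient of the evolved function to be −2, the value for which the computed error agrees with the reported one. The function and variable names are assumptions.

```python
import math

def f_true(x):
    """The "unknown" target function."""
    return 1 + 8 * x - 3 * x**3 - 2 * x**4 + x**6

def f_found(x):
    """The evolved function, with the x^4 coefficient taken as -2 (see lead-in)."""
    return 9 * x + x**2 - 3 * x**3 - 2 * x**4 + x**6

L = 101
xs = [-2 + 4 * i / (L - 1) for i in range(L)]   # equidistant points in [-2, 2]
e_rms = math.sqrt(sum((f_true(x) - f_found(x)) ** 2 for x in xs) / L)
print(round(e_rms, 3))   # prints 1.723
```

The two functions then differ only by x^2 + x − 1, which is why the error stays small even though the constant and linear terms do not match.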


4 Evolution of robot gaits using LGP

Example: Generation of bipedal robot gaits using LGP

Objective

Evolve bipedal walking for a simulated, high-DOF humanoid robot by means of LGP,

- from first principles,
- with no domain knowledge on walking,
- with no information about the morphology available to the LGP system.


The simulated high DOF humanoid robot

- Generic humanoid robot with 26 DOFs (cf. Honda ASIMO).
- Sensor feedback: (i) 1 inertial measurement unit (IMU), (ii) 2 accelerometers and (iii) 26 joint angle sensors.
- Simulation environment: Open Dynamics Engine (ODE), see http://www.ode.org/

LGP setup

- The operator set: O = {o1, o2, o3, o4, o5, o6, o7} = {+, −, ×, /, sin, >, ≤}, which were encoded as 1, 2, 3, 4, 5, 6 and 7.
- The set of registers (variable and/or constant):
  - I = {r0, ..., r25} are used for input. Initialized with joint angles. Constant during execution of an individual.
  - R = {r26, ..., r51} are used as output registers (variable during execution). Initialized with joint angles.
  - C = {r52, ..., r54} are used for the ephemeral constants (see below).
  - S = {r55, ..., r66} are used for the sensors (IMU and accelerometers: variable).
  - Thus, A = I ∪ R ∪ C ∪ S = {r0, ..., r66}.
- Note: An ephemeral constant is initialized with a random value at the beginning of an LGP run.


Controller model


LGP instruction set

- The LGP for bipedal walking used the following instruction set:

    Instruction type           General notation     Input range
    Arithmetic operations      r_i := r_j + r_k     r_i, r_j, r_k ∈ R
                               r_i := r_j − r_k
                               r_i := r_j × r_k
                               r_i := r_j / r_k
    Trigonometric functions    r_i := sin r_j       r_i, r_j ∈ R
    Conditional branches       if (r_j > r_k)       r_j, r_k ∈ R
                               if (r_j ≤ r_k)


Encoding scheme for walking

- The operators were encoded as: O = {o1, o2, o3, o4, o5, o6, o7} = {+, −, ×, /, sin, >, ≤} = {1, 2, 3, 4, 5, 6, 7}.
- Consider the following LGP gene: 55 51 3 42.
- The encoding follows this scheme:
  - 55: Operand 1 ∈ A (any register)
  - 51: Operand 2 ∈ A (any register)
  - 3: Operator ∈ O (multiplication)
  - 42: Destination ∈ R (variable register)
- Thus, the corresponding instruction is: r42 := r55 × r51.
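Decoding under this scheme, combined with a basic range check, might look as follows. The register ranges come from the LGP setup slide; the function name, the error messages, and the decisions to ignore operand 2 for sin and the destination for branches are assumptions.

```python
OPERATORS = {1: "+", 2: "-", 3: "*", 4: "/", 5: "sin", 6: ">", 7: "<="}
N_REGISTERS = 67                     # A = {r0, ..., r66}
OUTPUT_REGISTERS = range(26, 52)     # R = {r26, ..., r51}

def decode(gene):
    """Decode (operand 1, operand 2, operator, destination) into an instruction."""
    a, b, op, dest = gene
    if not (0 <= a < N_REGISTERS and 0 <= b < N_REGISTERS):
        raise ValueError("operand outside A")
    symbol = OPERATORS[op]
    if symbol in (">", "<="):        # branches: destination field ignored (assumed)
        return f"if (r{a} {symbol} r{b})"
    if dest not in OUTPUT_REGISTERS:
        raise ValueError("destination outside R")
    if symbol == "sin":              # unary operator: operand 2 ignored (assumed)
        return f"r{dest} := sin r{a}"
    return f"r{dest} := r{a} {symbol} r{b}"
```

For the gene 55 51 3 42, `decode((55, 51, 3, 42))` returns `r42 := r55 * r51`. A check of this kind is one way to screen newly generated genes for validity.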


An LGP program: (individual and chromosome)


Genetic operators

- Two-point, non-homologous crossover was used.
- As for the mutation operator, a standard LGP procedure was used (described on a previous slide).


LGP parameters

- The parameters of the LGP system are listed in a Koza tableau:


Evolutionary algorithm

- The main steps in the EA can be summarized as follows:
  - (1) Randomly initialize a population.
  - (2) Evaluate all individuals according to the fitness measure:
    - (2.1) Execute the individual for N simulation timesteps.
    - (2.2) Compute the fitness value according to the fitness function.
  - (3) Perform tournament selection.
  - (4) Apply the genetic operators crossover and mutation in order to produce two new individuals.
  - (5) Replace the two losing individuals in the population with the new offspring.
  - (6) If the termination criterion is not met, go to step (2).
  - (7) Stop. The best individual represents the best solution found.
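The steps above can be sketched as a generic steady-state loop. Several details here are assumptions rather than the lecture's actual setup: a tournament of four individuals (two winners breed, two losers are replaced), a fixed number of tournaments as the termination criterion, and a toy bit-counting fitness standing in for the robot simulation. Since this toy fitness is deterministic, only the new offspring are re-evaluated.

```python
import random

def evolve(fitness, random_individual, crossover, mutate, pop_size, n_tournaments, rng):
    """Steady-state EA: in each tournament two winners breed and replace two losers."""
    population = [random_individual(rng) for _ in range(pop_size)]     # step 1
    scores = [fitness(ind) for ind in population]                      # step 2
    for _ in range(n_tournaments):                                     # step 6 (fixed budget)
        # Step 3: rank four randomly drawn individuals (assumed tournament size).
        idx = sorted(rng.sample(range(pop_size), 4), key=lambda i: scores[i], reverse=True)
        winners, losers = idx[:2], idx[2:]
        # Step 4: crossover and mutation produce two offspring.
        c1, c2 = crossover(population[winners[0]], population[winners[1]], rng)
        offspring = [mutate(c1, rng), mutate(c2, rng)]
        # Step 5: the offspring replace the losers and are evaluated.
        for i, child in zip(losers, offspring):
            population[i] = child
            scores[i] = fitness(child)
    best = max(range(pop_size), key=lambda i: scores[i])               # step 7
    return population[best], scores[best]

# Toy usage: maximize the number of ones in a 10-bit list.
def one_point(p1, p2, rng):
    cut = rng.randrange(1, len(p1))
    return p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]

def flip(ind, rng):
    return [1 - g if rng.random() < 0.1 else g for g in ind]

best, score = evolve(
    fitness=sum,
    random_individual=lambda rng: [rng.randint(0, 1) for _ in range(10)],
    crossover=one_point,
    mutate=flip,
    pop_size=20,
    n_tournaments=200,
    rng=random.Random(0),
)
```

In the walking application, `fitness` would instead run the individual in the simulator for N timesteps and then apply the fitness function.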


Objective function

- A simple fitness function turned out to work best:
- where r_L and r_R are the position vectors of the robot's feet and y is the unit vector in the forward direction.
- Thus, the fitness was (roughly) proportional to the distance walked by the robot along a straight line.


Additions

- Energy discharging: Human bipedal walking is very energy efficient!
- Termination of slow individuals. If progress is too slow, terminate!
- Different body proportions. An adult and a child have different body proportions.
- Dual controllers (brains). The robot started from a standstill, thus a transition phase from standing to walking was involved!
- Reducing the number of foot steps. Large steps were preferred!
- Climbing over obstacles. Force the individuals to take larger and higher steps!
- Lifting force. Learning to walk without any supervision might be too hard!


Results

- Individuals evolved that could walk essentially indefinitely.


Robot walking

- A gait cycle:


Conclusions from the LGP study

- Unseeded evolution of bipedal gaits was achieved in physical robot simulations (runs lasting about 12 hours of real time).
- Individuals evolved that could walk essentially indefinitely.
- Energy discharging was the only addition that made a significant improvement.
- The dual controller approach made the resulting gaits look more realistic.
- However, the (best) evolved gaits were still not very humanlike, since the incentive (for the EA) to generate such gaits was insufficient.



Sources

- This lecture was based on the following material:
  - (1) Course book by Wahde, M., pp. 72-78.
  - (2) Paper on LGP by Wolff, K., Evolution of biped locomotion using linear genetic programming. Will be available on the course web page shortly.
