Lecture 5: Linear Genetic Programmingwolff/AI2/Lect05LGP.pdf · Table of Contents 1 GP fundamentals...
Transcript of Lecture 5: Linear Genetic Programmingwolff/AI2/Lect05LGP.pdf · Table of Contents 1 GP fundamentals...
Lecture 5: Linear Genetic ProgrammingCIU036 Artificial Intelligence 2, 2010
Krister Wolff, Ph.D.
Department of Applied MechanicsChalmers University of Technology
41296 Goteborg, Sweden
November 23, 2010
Table of Contents
1 GP fundamentals
2 LGP fundamentals
3 Function regression using LGP
4 Evolution of robot gaits using LGP
5 Sources
- - K. Wolff, Ph.D. November 2010 2/38
Table of Contents
1 GP fundamentals
2 LGP fundamentals
3 Function regression using LGP
4 Evolution of robot gaits using LGP
5 Sources
GP fundamentals - - K. Wolff, Ph.D. November 2010 3/38
The flow of a generic EA
I The flow of the algorithm is pretty much the same as for GAs, but inGP the individuals of the population can be interpreted as computerprograms (i.e. a list of instructions, written in a programminglanguage, that is used to control the behavior of a machine).
GP fundamentals - - K. Wolff, Ph.D. November 2010 4/38
Representation
I In genetic programming the individuals are represented as trees:
I GP trees consist of operators (e.g. +, ∗) and terminals (e.g. x , 1).
I Evaluation proceeds repetedly from the left: f (x) = (x + 1)x + 1
I Here, the individual is representing a mathematical function.Function fitting is an important application area for GP.
GP fundamentals - - K. Wolff, Ph.D. November 2010 5/38
Crossover and mutation
I Crossover in tree-based GP:
I Mutation in tree-based GP:
I Randomly select a node in the tree and replace the existing tree at thatpoint with a new subtree.
I The new subtree is created in the same way as the initial population.
GP fundamentals - - K. Wolff, Ph.D. November 2010 6/38
GP literature
I Books on Genetic programming (only two examples):
I Wolfgang Banzhaf EtAl; Geneticprogramming An introduction.
I John R. Koza EtAl; GeneticProgramming III.
GP fundamentals - - K. Wolff, Ph.D. November 2010 7/38
Table of Contents
1 GP fundamentals
2 LGP fundamentals
3 Function regression using LGP
4 Evolution of robot gaits using LGP
5 Sources
LGP fundamentals - - K. Wolff, Ph.D. November 2010 8/38
The atoms of linear genetic programming
I Linear genetic programming (LGP) is a different ”flavor” of GP. Theindividuals are represented by linear structures built up of registersand instructions:
I RegistersI Variable registers; (denoted ri ) are utilized for input, manipulating
data, and storing the resulting output from a calculation.I Constant registers; (denoted ci ) are write protected once they have
been initialized with a value.
I InstructionsI An instruction consists of operator (e.g. +, −, ×, ÷) and operand(s)
(i.e. the arguments): r3 := c2 + r1
I Note that LGP facilitates the possibilty of using multiple programoutputs, whereas tree-based GP provides only one program output.
LGP fundamentals - - K. Wolff, Ph.D. November 2010 9/38
LGP programs
I Example: A simple LGP program that consist of five instructions:
1 : r2 := r1 + c1
2 : r3 := r2 + r2
3 : r1 := r2 × r3
4 : if (r1 > c2)
5 : r1 := r1 + c1
I LGP program execution starts with the topmost instruction, andproceeds down the list. It stops when the last instruction have beenexecuted.
I Conditional branching (i.e. instruction 4): If true, the subsequentinstruction (i.e. instruction 5) is executed, otherwise that instructionis skipped.
LGP fundamentals - - K. Wolff, Ph.D. November 2010 10/38
Examples of typical LGP operators
I A basic LGP instruction set (i.e. set of allowed instructions):
Instruction Description Instruction Description
Addition ri := rj + rk Sine ri := sin rj
Subtraction ri := rj − rk Cosine ri := cos rj
Multiplication ri := rj × rk Square ri := rj2
Division1 ri := rj/rk Square root ri :=√
rj
Exponentiation ri := erj Conditional branch if ri > rj
Logarithm ri := ln rj Conditional branch if ri ≤ rj
I The operand(s) (rj , rk) , can either be variable registers or constantregisters, but the destination (ri ) must be a variable register.
1see next slideLGP fundamentals - - K. Wolff, Ph.D. November 2010 11/38
Semantic correctness of LGP programs
I An invalid instruction of an LGP program may cause a program crash.Examples of such problems are:
I Illegal jump instruction (may arise from augmented branchinginstructions).
I Division by zero.
I For the latter case a protected division operator is used:
ri :=
{rj/rk if rk 6= 0cmax otherwise
where cmax is a large pre-specified constant.
I Newly generate programs must be subject to screening in order toensure semantic correctness.
LGP fundamentals - - K. Wolff, Ph.D. November 2010 12/38
LGP chromosomes
I Example of LGP encoding scheme. Let the operator, operands, anddestination be represented as a sequence of integers2:
I Each gene consists of four integers, e.g. 1 2 1 4.
I In the case of three number instructions (e.g. Sine, Cosine) thefourt number of the gene should be ignored!
2Misprint! 5 1 5 1 should be 5 1 1 5LGP fundamentals - - K. Wolff, Ph.D. November 2010 13/38
LGP encoding
I Consider the registers:I The set of variable registers: R = {r1, r2, r3}I The set of constant registers: C = {c1, c2, c3}I The union of these sets, R∪ C, is denoted A = {r1, r2, r3, c1, c2, c3}.
I Consider the operators:
I The set of operators addition, subtraction, multiplication, division,and the conditional branching ri > rj : O = {o1, o2, o3, o4, o5}
I Consider the first gene of the chromosome: 1 2 1 4.
I 1: Operator ∈ O (addition)I 2: Destination ∈ R (variable register)I 1: Operand 1 ∈ A (any register)I 4: Operand 2 ∈ A (any register)
I Thus, the decoded instruction is: r2 := r1 + c1.
LGP fundamentals - - K. Wolff, Ph.D. November 2010 14/38
Evaluation of LGP chromosome
I Before a chromosome can be evaluated, the registers must beinitialized. Suppose that:
I Constant registers: c1 = 1, c2 = 3, c3 = 10I Variable registers: r1 = 1, r2 = r3 = 0
Genes Instruction Result (r1, r2, r3)
1 2 1 4 r2 := r1 + c1 1, 2, 0
1 3 2 2 r3 := r2 + r2 1, 2, 4
3 1 2 3 r1 := r2 × r3 8, 2, 4
5 1 1 5 if(r1 > c2) 8, 2, 4
1 1 1 4 r1 := r1 + c1 9, 2, 4
I Furtermore, suppose that the input was provided through register r1and that the output was taken from r1 as well: Output = r1 = 9
I However, note that any of the registers r1 − r3 could be utilized asinput or output register(s)!
LGP fundamentals - - K. Wolff, Ph.D. November 2010 15/38
Evolutionary operators in LGP
I It is common to use two-point, length-varying (non-homologous),crossover in LGP:
I Apart from being non-homologous, crossover in LGP works prettymuch in the same way as for the bit strings commonly used in GAs.
I Mutation in LGP proceeds as follows: Randomly select aninstruction for mutation, then with some probability:
I change (i) the destination register, (ii) the operator, and (iii) theoperands. That is, within the allowed ranges of course!
LGP fundamentals - - K. Wolff, Ph.D. November 2010 16/38
Table of Contents
1 GP fundamentals
2 LGP fundamentals
3 Function regression using LGP
4 Evolution of robot gaits using LGP
5 Sources
Function regression using LGP - - K. Wolff, Ph.D. November 2010 17/38
Example 3.11: Function regression
Objective
An ”unknown” function f (x) has been measured at L = 101 equidistantpoints in the range [−2, 2]. Use LGP to find a function f (x) that fits themeasured data.
I Solution: An LGP system was implemented with the following setup:I The set of operators where taken asO = {o1, o2, o3, o4} = {+,−,×, /}, encoded as 1, 2, 3 and 4.
I The set of constant registers, C = {c1, c2, c3}, where initialized toc1 = 1, c2 = 3, c3 = −1.
I The set of variable registers, R = {r1, r2, r3}, where initialized tor1 = xi , r2 = r3 = 0. Thus, R∪ C = A = {r1, r2, r3, c1, c2, c3}.
I The L = 101 data points f (xi ), i = 1, ..., L where considered one at atime taking, the output f (xi ) as the content of r1 at the end of theevaluation.
Function regression using LGP - - K. Wolff, Ph.D. November 2010 18/38
Example 3.11: Function regression
I The fitness function was taken as F = 1/ERMS, where ERMS is theroot-mean-square (RMS) error:
ERMS =
√√√√1
L
L∑i=1
(f (xi )− f (xi ))2
I The evolutionary parameters:
I Tournament selection, size 5, ptour = 0.75I Two-point non-homologous crossover, pc = 0.20I Mutation rate pmut = 0.04.I Chromosome lenght initially: [5, 25].I Population size: 100 individuals.
Function regression using LGP - - K. Wolff, Ph.D. November 2010 19/38
Example 3.11: Function regression
I Result of the LGP run, after about 50,000 generations:
I The ”unknown” function: f (x) = 1 + 8x − 3x3 − 2x4 + x6.I A typical ”best” function found by the LGP system:
f (x) = 9x + x2 − 3x3 + 2x4 + x6, ERMS = 1.723.I The chromosome consisted of 13 instructions.
Function regression using LGP - - K. Wolff, Ph.D. November 2010 20/38
Table of Contents
1 GP fundamentals
2 LGP fundamentals
3 Function regression using LGP
4 Evolution of robot gaits using LGP
5 Sources
Evolution of robot gaits using LGP - - K. Wolff, Ph.D. November 2010 21/38
Example: Generation of bipedal robot gaits using LGP
Objective
Evolve bipedal walking for a simulated, high DOF humanoid robot, bymeans of LGP,
I from first principles,
I with no domain knowledge on walking,
I no information of morphology availible to the LGP system.
Evolution of robot gaits using LGP - - K. Wolff, Ph.D. November 2010 22/38
The simulated high DOF humanoid robot
I Generic humanoid robot with 26 DOFs (cf. Honda ASIMO).I Sensor feedback: (i) 1 Inertial measurement unit (IMU), (ii) 2
accelerometers and (iii) 26 joint angle sensors.I Simulation environment:
Open dynamics engine (ODE), see http://www.ode.org/Evolution of robot gaits using LGP - - K. Wolff, Ph.D. November 2010 23/38
LGP setup
I The operator set:O = {o1, o2, o3, o4, o5, o6, o7} = {+,−,×, /, sin, >,≤}, which wereencoded as 1, 2, 3, 4, 5, 6 and 7.
I The set of registers (variable and/or constant):
I I = {r0, ..., r25} are used for input. Initialized with joint angles.Constant during execution of en individual.
I R = {r26, ..., r51} are used as output registers (variable duringexecution). Initialized with joint angles.
I C = {r52, ..., r54} are used for the ephemeral constants (see below).
I S = {r55, ..., r66} are used for the sensors (IMU and accelerometers:variable).
I Thus, A = I ∪ R ∪ C ∪ S = {r0, ..., r66}.I Note: An ephemeral constant is initialized with a random value in
the beginning of an LGP run.
Evolution of robot gaits using LGP - - K. Wolff, Ph.D. November 2010 24/38
Controller model
Evolution of robot gaits using LGP - - K. Wolff, Ph.D. November 2010 25/38
LGP instruction set
I The LGP for bipedal walking used the following instruction set:
Instruction typeGeneralnotation
Input range
Arithmetic ri := rj + rk ri , rj , rk ∈ Roperations ri := rj − rk
ri := rj × rkri := rj/rk
Trigonometric ri := sin rj ri , rj ∈ Rfunctions
Conditional if (rj > rk) rj , rk ∈ Rbranches if (rj ≤ rk)
Evolution of robot gaits using LGP - - K. Wolff, Ph.D. November 2010 26/38
Encoding scheme for walking
I The operators where encoded as: O = {o1, o2, o3, o4, o5, o6, o7} ={+,−,×, /, sine, >,≤} = {1, 2, 3, 4, 5, 6, 7}.
I Consider the following LGP gene: 55 51 3 42.
I The encoding follows this scheme:
I 55: Operand 1 ∈ A (any register)I 51: Operand 2 ∈ A (any register)I 3: Operator ∈ O (multiplication)I 42: Destination ∈ R (variable register)
I Thus, the corresponding instruction is: r42 := r55 × r51.
Evolution of robot gaits using LGP - - K. Wolff, Ph.D. November 2010 27/38
An LGP program: (individual and chromosome)
Evolution of robot gaits using LGP - - K. Wolff, Ph.D. November 2010 28/38
Genetic operators
I Two-point, non-homologous, crossover was used:
I As for the mutation operator, an LGP standard procedure was used(described on a previous slide).
Evolution of robot gaits using LGP - - K. Wolff, Ph.D. November 2010 29/38
LGP parameters
I The parameters of the LGP system are listed in a Koza tableau:
Evolution of robot gaits using LGP - - K. Wolff, Ph.D. November 2010 30/38
Evolutionary algorithm
I The main steps in the EA can be summarized as follows:
I (1) Randomly initialize a population.
I (2) Evaluate all individuals according to the fitness measure:
I (2.1) Execute the individual for N simulation timesteps.I (2.2) Compute the fitness value according to the fitness function.
I (3) Perform tournament selection.
I (4) Apply genetic operators crossover and mutation in order toproduce two new individuals.
I (5) Replace the two losers individuals in the population with the newoffspring.
I (6) If termination criterion is not met, go to step (2).
I (7) Stop. The best individual represents the best solution found.
Evolution of robot gaits using LGP - - K. Wolff, Ph.D. November 2010 31/38
Objective function
I A simple fitness function turned out to work best:
I where rL and rR are the position vectors of the robot’s feet and y isthe unit vector in forwards direction.
I Thus, the fitness was (rougly) proportionate to the distance walked bythe robot, along a straight line.
Evolution of robot gaits using LGP - - K. Wolff, Ph.D. November 2010 32/38
Additions
I Energy discharging: Human bipedal walking is very energy efficient!
I Termination of slow individuals. If too slow progress, terminate!
I Different body proportions. An adult and a child have different bodyproportions.
I Dual controllers (brains). The robot was starting from standstill, thusa transition phase from standing to walking was involved!
I Reducing the number of foot steps. Large steps were prefered!
I Climbing over obstacles. Force the individuals to take larger andhigher steps!
I Lifting force. Learning to walk without any supervision might be toohard!
Evolution of robot gaits using LGP - - K. Wolff, Ph.D. November 2010 33/38
Results
I Individuals evolved that could walk essentially indefinitely.
Evolution of robot gaits using LGP - - K. Wolff, Ph.D. November 2010 34/38
Robot walking
I A gait cycle:
Evolution of robot gaits using LGP - - K. Wolff, Ph.D. November 2010 35/38
Conlusions from the LGP study
I Unseeded evolution of bipedal gaits was achieved in physical robotsimulations (runs lasting about 12 hours of real time).
I Individuals evolved that could walk essentially indefinitely.
I Energy discharging was the only addition that made a significantimprovement.
I The dual controller approach made the resulting gaits look morerealistic.
I However, the (best) evolved gaits were still not very humanlike, sincethe incentive (for the EA) to generate such gaits was insufficient.
Evolution of robot gaits using LGP - - K. Wolff, Ph.D. November 2010 36/38
Table of Contents
1 GP fundamentals
2 LGP fundamentals
3 Function regression using LGP
4 Evolution of robot gaits using LGP
5 Sources
Sources - - K. Wolff, Ph.D. November 2010 37/38
Sources
I This lecture was based on the following material:(1) Course book by Wahde, M., pp. 72-78.(2) Paper on LGP, by Wolff, K., Evolution of biped locomotion usinglinear genetic programming. Will be availible on the course web pageshortly.
Sources - - K. Wolff, Ph.D. November 2010 38/38