Different Varieties of Genetic Programming Je-Gun Joung.

Post on 13-Jan-2016


Different Varieties of Genetic Programming

Je-Gun Joung

Some of the Many Different Structures Used for GP

9.1 GP with Tree Genomes

Mutation Operators Applied in Tree-based GP

Point Mutation

[Figure: an example tree before and after point mutation; one * node is replaced by +]

Permutation

[Figure: an example tree before and after permutation; the arguments x 1 of one node are reordered to 1 x]

Hoist

[Figure: an example tree before and after hoist; one subtree becomes the new individual]

Expansion Mutation

[Figure: an example tree before and after expansion mutation; a terminal is replaced by a subtree]

Collapse Subtree Mutation

[Figure: an example tree before and after collapse subtree mutation; a subtree is replaced by a terminal]

Subtree Mutation

[Figure: an example tree before and after subtree mutation; a subtree is replaced by another subtree]
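The six mutation operators can be sketched on a small tree type. This is a minimal illustration, not the slides' implementation: trees as nested tuples, the binary function set, the terminal set, and every function name are assumptions made for the example.

```python
import random

# Trees are nested tuples (operator, left, right); terminals are plain strings.
FUNCTIONS = ["+", "-", "*"]   # assumed function set, all binary
TERMINALS = ["x", "1"]        # assumed terminal set

def paths(tree, prefix=()):
    """List the path (tuple of child indices) to every node of the tree."""
    found = [prefix]
    if isinstance(tree, tuple):
        for i, child in enumerate(tree[1:], start=1):
            found.extend(paths(child, prefix + (i,)))
    return found

def get(tree, path):
    """Return the node reached by following path from the root."""
    for i in path:
        tree = tree[i]
    return tree

def put(tree, path, new):
    """Return a copy of tree with the node at path replaced by new."""
    if not path:
        return new
    i = path[0]
    return tree[:i] + (put(tree[i], path[1:], new),) + tree[i + 1:]

def random_tree(rng, depth):
    """Grow a random tree of at most the given depth."""
    if depth == 0 or rng.random() < 0.3:
        return rng.choice(TERMINALS)
    return (rng.choice(FUNCTIONS), random_tree(rng, depth - 1), random_tree(rng, depth - 1))

def point_mutation(tree, path, rng):
    """Exchange one node for a random node of the same kind."""
    node = get(tree, path)
    if isinstance(node, tuple):
        return put(tree, path, (rng.choice(FUNCTIONS),) + node[1:])
    return put(tree, path, rng.choice(TERMINALS))

def permutation(tree, path):
    """Swap the two arguments of the function node at path."""
    op, left, right = get(tree, path)
    return put(tree, path, (op, right, left))

def hoist(tree, path):
    """The subtree at path becomes the whole new individual."""
    return get(tree, path)

def expansion(tree, path, rng, depth=2):
    """Replace the terminal at path (assumed to be a terminal) by a random subtree."""
    new = (rng.choice(FUNCTIONS), random_tree(rng, depth - 1), random_tree(rng, depth - 1))
    return put(tree, path, new)

def collapse(tree, path, rng):
    """Replace the subtree at path by a random terminal."""
    return put(tree, path, rng.choice(TERMINALS))

def subtree_mutation(tree, path, rng, depth=2):
    """Replace the subtree at path by a random subtree."""
    return put(tree, path, random_tree(rng, depth))

parent = ("+", ("*", "x", "1"), ("-", "x", "1"))
print(permutation(parent, (2,)))   # ('+', ('*', 'x', '1'), ('-', '1', 'x'))
print(hoist(parent, (1,)))         # ('*', 'x', '1')
```

In a GP run the path would be drawn at random, for example with `rng.choice(paths(tree))`; it is passed explicitly here so that each operator can be checked by hand.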

Crossover Operators Applied in Tree-based GP

Subtree Exchange Crossover

Selfcrossover

Module Crossover
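Subtree exchange crossover, the first operator in the list above, can be sketched as follows. The nested-tuple tree type and all function names are assumptions for illustration; self crossover and module crossover are not covered.

```python
import random

# Trees are nested tuples (operator, left, right); terminals are plain strings.

def paths(tree, prefix=()):
    """List the path (tuple of child indices) to every node of the tree."""
    found = [prefix]
    if isinstance(tree, tuple):
        for i, child in enumerate(tree[1:], start=1):
            found.extend(paths(child, prefix + (i,)))
    return found

def get(tree, path):
    """Return the node reached by following path from the root."""
    for i in path:
        tree = tree[i]
    return tree

def put(tree, path, new):
    """Return a copy of tree with the node at path replaced by new."""
    if not path:
        return new
    i = path[0]
    return tree[:i] + (put(tree[i], path[1:], new),) + tree[i + 1:]

def subtree_exchange(mum, dad, path_mum, path_dad):
    """Swap the subtree at path_mum in mum with the subtree at path_dad in dad."""
    from_mum, from_dad = get(mum, path_mum), get(dad, path_dad)
    return put(mum, path_mum, from_dad), put(dad, path_dad, from_mum)

def random_crossover(mum, dad, rng):
    """Pick both crossover points uniformly among the nodes of each parent."""
    return subtree_exchange(mum, dad, rng.choice(paths(mum)), rng.choice(paths(dad)))

mum = ("+", ("*", "x", "x"), "1")
dad = ("-", "x", ("*", "1", "x"))
kid1, kid2 = subtree_exchange(mum, dad, (1,), (2,))
print(kid1)   # ('+', ('*', '1', 'x'), '1')
print(kid2)   # ('-', 'x', ('*', 'x', 'x'))
```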

9.2 GP with Linear Genomes

Linear GP acts on linear genomes, like program code represented by bit strings or code for register machines.

The influence of change in a linear structure can be expected to follow the linear order in which the instructions are executed.

In tree-based GP, all operators select nodes uniformly from a tree.

In linear GP, all operators select nodes uniformly from a sequence.

9.2.1 Evolutionary Program Induction with Introns

Wineberg and Oppacher [1994] have formulated an evolutionary programming method they call EPI (evolutionary program induction).

They use fixed-length strings to code their individuals and a GA-like crossover.

The code is constructed to maintain a fixed structure within the chromosome that allows similar alleles to compete against each other at a locus during

9.2.2 Developmental Genetic Programming

Developmental genetic programming (DGP) is an extension of GP by a developmental step.

In tree-based GP, the space of genotypes (search space) is usually identical to the space of phenotypes (solution space).

DGP maps binary sequences, the genotypes, through a developmental process into separate phenotypes.

The Genotype-phenotype Mapping

[Figure: Genotype → Genotype-Phenotype Mapping (GPM) → Phenotype. The mapping implements the constraints, taking the unconstrained search space to the constrained solution space.]

9.2.3 An Example: Evolution in C

Symbolic function regression

[Equation: the target function of the regression problem; its terms are not recoverable from this transcript]

An Example Result

Runs lasted for 50 generations at most, with a population size of 500 individuals.

In one experimental run, the genotype was

1100 0010 1000 0111 1001 0010 1101 1001 0111 1100

0000 1011 1001 1110 1001 1010 1101 0011 1100 1111

0101 1010 0110 1110 0001

The raw symbol sequence is

T*(a)*R)aE+C)E)SRDT)vSqE*

Repairing transforms this illegal sequence into

{T((a)*R(a+m)+(S(D((v+q+D

This sequence is unfinished; repairing terminates by completing the sequence into

{T((a)*R(a+m))+(S(D((v+q+D(m)))))}

Finally, editing produces

double ind(double m, double v, double a)

{return T((a)*R(a+m))+(S(D((v+q+D(m))))); }

A C compiler takes over to generate an executable that is valid on the underlying hardware platform.

This executable is the final phenotype encoded by the genotype.
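The chain from genotype to C source can be sketched in a few lines. This is a heavily simplified illustration: the 3-bit codon table is hypothetical (the code table behind the example above is not given), and the repair step here simply drops illegal symbols and completes the sequence, which is an assumption and not the repair rule used in the example.

```python
# Hypothetical 3-bit codon table; "S(" is a unary function symbol written in
# the single-letter style of the example above.
CODE = {
    "000": "a", "001": "m", "010": "v", "011": "+",
    "100": "*", "101": "(", "110": ")", "111": "S(",
}
VARIABLES = {"a", "m", "v"}
OPERATORS = {"+", "*"}

def decode(bits):
    """Map each complete 3-bit codon to a symbol (the raw symbol sequence)."""
    usable = len(bits) - len(bits) % 3
    return [CODE[bits[i:i + 3]] for i in range(0, usable, 3)]

def repair(symbols, filler="m"):
    """Keep only symbols that are legal at their position, then complete the sequence."""
    out, depth, need_operand = [], 0, True
    for s in symbols:
        if s in VARIABLES and need_operand:
            out.append(s)
            need_operand = False
        elif s in OPERATORS and not need_operand:
            out.append(s)
            need_operand = True
        elif s.endswith("(") and need_operand:
            out.append(s)
            depth += 1
        elif s == ")" and not need_operand and depth > 0:
            out.append(s)
            depth -= 1
        # any other symbol is illegal here and is dropped (simplification)
    if need_operand:
        out.append(filler)               # finish a dangling operator or open call
    return "".join(out) + ")" * depth    # close every parenthesis still open

def edit(body):
    """Wrap the repaired expression into C source, in the shape shown in the example."""
    return "double ind(double m, double v, double a)\n{return " + body + "; }"

bits = "101000110110011111010100"
raw = decode(bits)           # ['(', 'a', ')', ')', '+', 'S(', 'v', '*']
print(edit(repair(raw)))     # body is (a)+S(v*m)
```

The second ")" is illegal because no parenthesis is open, and the trailing "*" leaves the sequence unfinished, so the sketch shows both kinds of correction on a short genotype.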

[Equation: not recoverable from this transcript]

9.2.4 Machine Language

The polynomial (x-1)^2 + (x-1)^3 as a linear program:

1: x=x-1

2: y=x*x

3: x=x*y

4: y=x+y

[Figure 9.13: graph of this program over the nodes x, -1, *, *, + and y]

[Figure: the representation of (x-1)^2 + (x-1)^3 in a tree-based genome]
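The four-instruction program can be checked against the tree form with a tiny register machine. The instruction format (destination, left operand, operator, right operand) and the function names are assumptions made for this sketch.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

# Each instruction: (destination register, left operand, operator, right operand).
# Operands are register names or integer constants.
PROGRAM = [
    ("x", "x", "-", 1),     # 1: x = x - 1
    ("y", "x", "*", "x"),   # 2: y = x * x
    ("x", "x", "*", "y"),   # 3: x = x * y
    ("y", "x", "+", "y"),   # 4: y = x + y
]

def run(program, x):
    """Execute the instructions in order and return the final value of y."""
    regs = {"x": x, "y": 0}
    for dest, left, op, right in program:
        a = regs[left] if isinstance(left, str) else left
        b = regs[right] if isinstance(right, str) else right
        regs[dest] = OPS[op](a, b)
    return regs["y"]

def tree_value(x):
    """The same polynomial written out as the tree would compute it."""
    return (x - 1) * (x - 1) + (x - 1) * ((x - 1) * (x - 1))

print(run(PROGRAM, 3), tree_value(3))   # 12 12
```

The linear program computes x-1 once and reuses it through the registers, while the tree form repeats the x-1 subtree for every use.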

The reasons for using machine code in GP, as opposed to higher-level languages:

The most efficient optimization can be done at the machine code level.

High-level tools might simply not be available for a target processor.

It could be more convenient to let the computer evolve small pieces of machine code programs itself rather than learning to master machine code programming.

Reasons for Using Binary Machine Code

The GP algorithm can be made very fast by having the individual programs in the population in binary machine code.

The system is also much more memory efficient than a tree-based GP system.

An additional advantage is that memory consumption is stable during evolution with no need for garbage collection.

The JB Language

0 = BLOCK (group statements)

1 = LOOP

2 = SET

3 = ZERO (clear)

4 = INCREMENT

Individual genome:

0 0 1 3 1 9 1 2 1 4 1 7

Block stat. 1 stat.2

register 1 = 0

repeat stat.1, register2

register1 = register1+1

The GEMS System

One of the most extensive systems for evolution of machine code is the GEMS system [Crepeau, 1995].

The system includes an almost complete interpreter for the Z-80 8-bit microprocessor.

The Z-80 has 691 different instructions, and GEMS implements 660 instructions.

It has so far been used to evolve a “hello world” program consisting of 58 instructions.

The Crossover of GEMS

9.2.5 An Example: Evolution in Machine Language

9.3 GP with Graph Genomes

9.3.1 PADO

The graph-based GP system PADO (Parallel Algorithm Discovery and Orchestration) [Teller and Veloso, 1995]

Each program has a stack and an indexed memory for its own use of intermediate values and for communication.

There are also the following special nodes in a program:

Start node

Stop node

Subprogram calling nodes

Library subprogram calling nodes

The Representation of a Program and Subprogram in PADO

[Fig 9.19: a main program and a subprogram (private or public), each drawn as a graph with a START node and a STOP node, together with the stack and the indexed memory]
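A graph program of this kind can be sketched as a loop that walks from START to STOP. The slides only name the stack, the indexed memory, and the special nodes; giving every node an action plus a branch choice, the memory size, the step limit, and all names below are assumptions made for illustration, and subprogram calls are left out.

```python
def run(graph, max_steps=100, memory_size=8):
    """Walk the graph from START until STOP (or the step limit) is reached."""
    stack, memory = [], [0] * memory_size
    node = "START"
    for _ in range(max_steps):
        if node == "STOP":
            break
        action, branch = graph[node]
        action(stack, memory)          # the action works on stack and memory
        node = branch(stack, memory)   # the branch picks the next node
    return stack, memory

def push_zero(stack, memory):
    stack.append(0)

def increment(stack, memory):
    stack.append(stack.pop() + 1)
    memory[0] = stack[-1]              # keep a copy in the indexed memory

def always_count(stack, memory):
    return "COUNT"

def loop_until_three(stack, memory):
    return "STOP" if stack[-1] >= 3 else "COUNT"

# A hypothetical program that counts to three.
GRAPH = {
    "START": (push_zero, always_count),
    "COUNT": (increment, loop_until_three),
}

stack, memory = run(GRAPH)
print(stack, memory[0])   # [3] 3
```

The step limit is there because a graph, unlike a tree, can contain cycles, so a program is not guaranteed to reach STOP.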

9.3.2 Cellular Encoding

9.4 Other Genomes

9.4.1 STROGANOFF

Iba, Sato, and deGaris [1995] have introduced a more complicated structure into the nodes of a tree that could represent a program.

They base their approach on the well-known Group Method of Data Handling (GMDH).

To understand STructured Representation On Genetic Algorithms for Nonlinear Function Fitting (STROGANOFF), one first needs GMDH.

The STROGANOFF method applies GP crossover and mutation to a population of polynomial nodes.

Group Method of Data Handling (GMDH)

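[Figure: a GMDH tree with polynomial nodes P1 to P4 over the input variables X1 to X5]

$P_j(x_1, x_2) = z_j = a_0 + a_1 x_1 + a_2 x_2 + a_3 x_1^2 + a_4 x_2^2 + a_5 x_1 x_2$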
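Evaluating such a tree bottom-up can be sketched as follows, assuming each internal node applies the two-input quadratic polynomial a0 + a1 x1 + a2 x2 + a3 x1^2 + a4 x2^2 + a5 x1 x2 to the outputs of its two children. The tree encoding, the coefficient values, and the function names are assumptions; the coefficients are taken as given, and fitting them to data is outside this sketch.

```python
def node_poly(coeffs, x1, x2):
    """Two-input quadratic polynomial computed by one GMDH node."""
    a0, a1, a2, a3, a4, a5 = coeffs
    return a0 + a1 * x1 + a2 * x2 + a3 * x1 ** 2 + a4 * x2 ** 2 + a5 * x1 * x2

def evaluate(tree, inputs):
    """tree is an input name such as 'X1', or a tuple (coeffs, left, right)."""
    if isinstance(tree, str):
        return inputs[tree]
    coeffs, left, right = tree
    return node_poly(coeffs, evaluate(left, inputs), evaluate(right, inputs))

# Hypothetical two-node tree: P2 = X1 + X2, and P1 = 1 + P2 * X3.
P2 = ((0, 1, 1, 0, 0, 0), "X1", "X2")
P1 = ((1, 0, 0, 0, 0, 1), P2, "X3")

print(evaluate(P1, {"X1": 1, "X2": 2, "X3": 4}))   # 13
```

Because every node has exactly two inputs and one output, subtrees can be exchanged between trees without breaking the structure.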

Crossover of trees of GMDH

[Figure: two parent trees, with roots P1 and Pa, exchange the subtrees rooted at P4 and Pb to give two offspring]

Different Mutation of trees of GMDH

[Figure: four panels, (a) to (d), each showing a different mutation of a GMDH tree with polynomial nodes P1 to P4 over the inputs X1 to X5]

9.4.2 GP Using Context-Free Grammars

By the use of a context-free grammar, typing and syntax are automatically assured throughout the evolutionary process.

A context-free grammar can be considered a four-tuple consisting of a start symbol $S$, a set of terminals, a set of non-terminals $N$, and a set of production rules.

Definition 9.2 A terminal of a context-free grammar is a symbol for which no production rule exists in the grammar.

Definition 9.3 A production rule is a substitution of the kind $X \rightarrow Y$, where $X \in N$ and $Y$ is a string of terminals and non-terminals.

A Grammatical Structure

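[Figure: a derivation tree that starts at S and expands B nodes into operators with two B arguments or into T nodes, ending in the terminals x and 1]

S → B

B → + B B | - B B | * B B | % B B | T

T → x | 1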

S : the start symbol

B : a binary expression

T : a terminal

x and 1 : a variable and a constant
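Random individuals can be grown directly from such a grammar, so every one of them is syntactically valid by construction. The sketch assumes the grammar S → B, B → + B B | - B B | * B B | % B B | T, T → x | 1; the depth limit and the function names are assumptions made for illustration.

```python
import random

GRAMMAR = {
    "S": [["B"]],
    "B": [["+", "B", "B"], ["-", "B", "B"], ["*", "B", "B"], ["%", "B", "B"], ["T"]],
    "T": [["x"], ["1"]],
}

def derive(symbol, rng, depth=0, max_depth=4):
    """Expand symbol into a list of terminals (a prefix expression)."""
    if symbol not in GRAMMAR:            # a terminal: no production rule exists
        return [symbol]
    rules = GRAMMAR[symbol]
    if depth >= max_depth:               # past the limit, allow only the shortest rules
        shortest = min(len(rule) for rule in rules)
        rules = [rule for rule in rules if len(rule) == shortest]
    tokens = []
    for part in rng.choice(rules):
        tokens.extend(derive(part, rng, depth + 1, max_depth))
    return tokens

def is_prefix_expression(tokens):
    """Check that tokens form exactly one complete prefix expression."""
    needed = 1
    for token in tokens:
        if needed == 0:
            return False
        needed += 1 if token in "+-*%" else -1
    return needed == 0

rng = random.Random(0)
individual = derive("S", rng)
print(" ".join(individual), is_prefix_expression(individual))
```

Crossover and mutation can keep this guarantee by only exchanging or regrowing subtrees that are rooted at the same non-terminal.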

9.4.3 Genetic Programming of L-Systems

Lindenmayer systems (also known as L-systems) [Lindenmayer, 1968][Prusinkiewicz and Lindenmayer, 1990] have been introduced independently into the area of genetic programming by different researchers [Koza, 1993][Jacob, 1994][Hemmi et al., 1994].

L-systems were invented for the purpose of modeling biological structure formation.

Rewriting all non-terminals in parallel is important in this respect.

L-systems in their simplest form (0L-systems) are context-free grammars whose production rules are applied not sequentially but simultaneously to the growing tree of non-terminals.
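Parallel rewriting in a 0L-system can be sketched on strings. The rule set below is Lindenmayer's well-known algae example, used here only as an illustration; the function names are assumptions.

```python
def rewrite(word, rules):
    """Apply the production rules to every symbol of word in the same step."""
    return "".join(rules.get(symbol, symbol) for symbol in word)

def grow(axiom, rules, steps):
    """Rewrite the axiom the given number of times."""
    word = axiom
    for _ in range(steps):
        word = rewrite(word, rules)
    return word

ALGAE = {"A": "AB", "B": "A"}   # predecessor -> successor

for n in range(5):
    print(n, grow("A", ALGAE, n))
# 0 A
# 1 AB
# 2 ABA
# 3 ABAAB
# 4 ABAABABA
```

Each step replaces all symbols at once, in contrast to a sequential derivation that would expand one non-terminal at a time.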

Context-free L-system Individual Encoding a Production Rule System of Lindenmayer type

[Figure: the individual as a tree; the root 0L-System has an AxiomA branch and LRule branches, and each LRule has a pred and a succ child]