Talk II - GEP - SS (1)
8/7/2019 Talk II - GEP - SS (1)
GENE EXPRESSION PROGRAMMING
Dr. Sundaram Suresh
School of Computer Engineering
Nanyang Technological University
Singapore
Email: [email protected]
Textbook
Candida Ferreira, Gene Expression Programming: Mathematical Modeling by an Artificial Intelligence, Angra do Heroismo, Portugal, 2002.
Weblink
http://www.gene-expression-programming.com/
www.gepsoft.com
GEP code can be found in
http://jgep.sourceforge.net/
http://www.gene-expression-programming.com/Downloads.asp
Workshop on bio-inspired Computing, VTU, Mysore, 7-10, June, 2010
What is Gene Expression Programming?
GEP is also an evolutionary algorithm.
Gene expression programming was developed by incorporating both the idea of simple, linear chromosomes of fixed length used in GAs and the ramified structures of different sizes and shapes used in GP.
Genes - each codes for a smaller program or sub-expression tree.
GEP chromosomes are designed to allow the creation of multiple genes.
It is worth emphasizing that GEP is the only genetic algorithm with multiple genes.
GEP
GA – does not represent the I/O relationship mathematically.
GP – complexity in the genetic operators, and the tree length increases due to genetic operations.
GEP – a combination of the GA string representation and the GP mathematical expression.
GEP uses the genetic operators of GA to change the tree length. But here the length of the string remains the same.
GEP is based on the human genome…
GEP Representation
We want to represent the arithmetic expression sqrt((a+b)*(c-d)).
A chromosome is made of genes.
n – maximum number of arguments taken by any function in the function set.
Gene – head and tail.
The head length (h) is specified for a given problem.
The tail length (t) is calculated from the head length and n:
t = h(n-1)+1
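The tail formula guarantees enough terminals even in the worst case, where every head symbol is a function of n arguments: h functions need h·n children, h-1 of which are already in the head, leaving h(n-1)+1 for the tail. A minimal sketch of the calculation (the function names `tail_length` and `gene_length` are illustrative, not from the talk):

```python
def tail_length(head_length: int, max_arity: int) -> int:
    """Return t = h(n-1)+1 for head length h and maximum arity n."""
    if head_length < 1 or max_arity < 1:
        raise ValueError("head_length and max_arity must be positive")
    return head_length * (max_arity - 1) + 1


def gene_length(head_length: int, max_arity: int) -> int:
    """Total number of symbols in one gene: head plus tail."""
    return head_length + tail_length(head_length, max_arity)


print(tail_length(10, 2))  # 11
print(gene_length(4, 2))   # 9
```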
GEP Representation
Arithmetic expression - Gene Equivalent – K-expression
0 1 2 3 4 5 6 7
Q * + - a b c d
Three Gene Representation
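A K-expression is read as a level-order (breadth-first) listing of the expression tree: the first symbol is the root, and each function takes its arguments from the next unread symbols. The sketch below decodes the K-expression above under that reading; the helper names `build_tree` and `to_infix` and the nested-tuple tree format are assumptions made for illustration.

```python
from collections import deque

# Number of arguments per function; any other symbol is a terminal.
ARITY = {"Q": 1, "*": 2, "+": 2, "-": 2}


def build_tree(kexpr: str):
    """Build a (symbol, children) tree by filling levels left to right."""
    symbols = iter(kexpr)
    root = (next(symbols), [])
    queue = deque([root])
    while queue:
        symbol, children = queue.popleft()
        for _ in range(ARITY.get(symbol, 0)):
            child = (next(symbols), [])
            children.append(child)
            queue.append(child)
    return root


def to_infix(node) -> str:
    """Render the tree as a conventional infix string."""
    symbol, children = node
    if not children:
        return symbol
    if symbol == "Q":
        return f"sqrt({to_infix(children[0])})"
    left, right = (to_infix(child) for child in children)
    return f"({left} {symbol} {right})"


print(to_infix(build_tree("Q*+-abcd")))  # sqrt(((a + b) * (c - d)))
```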
Tree Construction of Three Gene (figure not recovered)
Biological Process in Gene
The main operations that occur in a genome are:
Genome replication
Genome restructuring
Transcription
Translation and post-translation modifications
Replication
Replication of DNA molecules.
Each strand acts as a template for a new, complementary strand.
When copying is complete, there will be two daughter DNA molecules, each identical in sequence to the mother molecule.
Genome Restructuring
This operation modifies the gene structure.
It introduces genetic diversity.
As in GA and GP, in GEP populations of individuals (computer programs) evolve by developing new abilities and becoming better adapted to the environment, due to the genetic modifications accumulated over a certain number of generations.
Mutation
Recombination
Transposition
Gene duplication
Solution Representation
As in GP, in GEP the chromosomes (solutions) are represented using a function set and a terminal set.
In GEP, chromosomes are represented using genes.
Genes – heads and tails.
Heads are coded using both the function set and the terminal set.
Tails are coded using only the terminal set.
Let F be the function set, F = {*,+,-,Q}, where Q is the square root.
Let T be the terminal set, T = {a,b,c,d}.
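The head/tail rule can be sketched as a random gene generator: head symbols are drawn from F ∪ T and tail symbols from T only. This is a minimal illustration using the F and T given above; the function name `random_gene` and the use of a seeded `random.Random` are assumptions, not part of the talk.

```python
import random

FUNCTIONS = {"*": 2, "+": 2, "-": 2, "Q": 1}  # symbol -> number of arguments
TERMINALS = ["a", "b", "c", "d"]


def random_gene(head_length: int, rng: random.Random) -> str:
    """Create one gene: head from F and T, tail from T only."""
    max_arity = max(FUNCTIONS.values())
    tail_len = head_length * (max_arity - 1) + 1  # t = h(n-1)+1
    head_pool = list(FUNCTIONS) + TERMINALS
    head = [rng.choice(head_pool) for _ in range(head_length)]
    tail = [rng.choice(TERMINALS) for _ in range(tail_len)]
    return "".join(head + tail)


rng = random.Random(0)  # seeded only to make the run repeatable
gene = random_gene(4, rng)
print(gene, len(gene))  # length is always 4 + 5 = 9
```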
Solution Representation
We want to represent the arithmetic expression sqrt((a+b)*(c-d)).
A chromosome is made of genes.
n – maximum number of arguments for the elements in the function set.
Gene – head and tail.
The head length (h) is specified for a given problem.
The tail length (t) is calculated from the head length and n: t = h(n-1)+1
Arithmetic expression - Gene Equivalent – K-expression
0 1 2 3 4 5 6 7
Q * + - a b c d
Example
K-expression (figure not recovered)
Equivalent tree (figure not recovered)
Representation…
The structural organization of GEP genes is better understood in terms of open reading frames (ORFs).
In biology, an ORF, or coding sequence of a gene, begins with the “start” codon, continues with the amino acid codons, and ends at a termination codon.
However, a gene is more than the respective ORF, with sequences upstream from the start codon and sequences downstream from the stop codon.
Although in GEP the start site is always the first position of a gene, the termination point does not always coincide with the last position of a gene.
It is common for GEP genes to have non-coding regions downstream from the termination point.
Use of Non-Coding Region
The non-coding regions are, in fact, the essence of GEP and evolvability, for they allow modification of the genome using any genetic operator without restrictions, always producing syntactically correct programs without the need for a complicated editing process or highly constrained ways of implementing genetic operators.
Indeed, this is the paramount difference between GEP and previous GP implementations, with or without linear genomes.
Example with Head and Tail
Function set: {Q,+,-,*,/}
Terminal set: {a,b}
Function arguments (n) = 2
Head = 10
Tail = 10(2-1)+1 = 11
The K-expression (figure not recovered)
Boldface represents the tail.
Here the ORF ends at position 10, whereas the gene ends at position 20.
The ORF is the phenotype representation.
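The ORF end can be found without building the tree: start with one required symbol (the root), and let each symbol consume one requirement while adding as many as it has arguments. The sketch below applies this to a gene with head = 10 and tail = 11. The gene string itself is an assumption chosen to match the parameters stated on the slide, since the slide's own K-expression did not survive; `orf_end` is an illustrative name.

```python
# Arguments per function for F = {Q,+,-,*,/}; a and b are terminals.
ARITY = {"Q": 1, "+": 2, "-": 2, "*": 2, "/": 2}


def orf_end(gene: str) -> int:
    """Return the position of the last symbol belonging to the ORF."""
    needed = 1  # symbols still required; initially just the root
    for position, symbol in enumerate(gene):
        needed += ARITY.get(symbol, 0) - 1
        if needed == 0:
            return position
    raise ValueError("gene ends before the expression is complete")


gene = "+Q-/b*aaQbaabaabbaaab"  # assumed example gene: h = 10, t = 11
print(len(gene) - 1)  # 20, the last position of the gene
print(orf_end(gene))  # 10, the last position of the ORF
```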
Use of Non-Coding Region (three figure-only slides; figures not recovered)
MultiGene Representation
So far we have seen the single-gene representation.
Now we discuss the multi-gene representation of the chromosome.
The number of genes in a chromosome can be greater than one.
For every problem, the number of genes and the head length are fixed in advance.
Three Gene Representation
Three genes
n = 2; h = 4, so t = 4(2-1)+1 = 5 and each gene has h + t = 9 symbols
K-expressions for
Gene1 :
Gene2 :
Gene3 :
Position ‘0’ is the start of each gene and position ‘8’ is the end of the gene.
The ORF ending of each gene can be calculated after tree construction.
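Because every gene has the same fixed length, a multi-gene chromosome can be cut into genes by position alone. The sketch below shows this for n = 2 and h = 4; the three gene strings are hypothetical (the slide's own K-expressions are not available), and `split_genes` is an illustrative name.

```python
def split_genes(chromosome: str, head_length: int, max_arity: int) -> list[str]:
    """Cut a chromosome into equal-length genes of h + t symbols."""
    tail_len = head_length * (max_arity - 1) + 1
    gene_len = head_length + tail_len
    if len(chromosome) % gene_len != 0:
        raise ValueError("chromosome length is not a multiple of the gene length")
    return [chromosome[i:i + gene_len]
            for i in range(0, len(chromosome), gene_len)]


# Hypothetical three-gene chromosome over F = {Q,*,+,-}, T = {a,b}.
chromosome = "*Qb+ababa" + "-+abbabab" + "Q*a-baabb"
for gene in split_genes(chromosome, head_length=4, max_arity=2):
    print(gene, "positions 0 to", len(gene) - 1)
```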
Three Gene Representation (figure not recovered)
Tree Construction (figure not recovered)
Translation
The process of converting the K-expressions into an expression tree (ET) and reducing it to mathematical form is called translation.
GEP chromosomes are composed of one or more ORFs, and obviously the encoded individuals have different degrees of complexity.
The simplest individuals are encoded in a single gene, and the “organism” is, in this case, the product of a single gene - an ET.
In other cases, the organism is a multi-subunit ET, in which the different sub-ETs are linked together by a particular function.
In other cases, the organism emerges from the spatial organization of different sub-ETs (e.g., in planning and problems with multiple outputs).
In yet other cases, the organism emerges from the interaction of conventional sub-ETs with different domains (e.g., neural networks).
However, in all cases, the whole organism is encoded in a linear genome.
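Translation of a multi-subunit ET can be sketched as: evaluate the ORF of each gene, then combine the sub-ET values with the linking function. The example below assumes addition as the link and uses two hypothetical genes with h = 4; the function names and the terminal values in `env` are illustrative only.

```python
import math
import operator

# symbol -> (number of arguments, implementation)
FUNCTIONS = {
    "+": (2, operator.add),
    "-": (2, operator.sub),
    "*": (2, operator.mul),
    "Q": (1, math.sqrt),
}


def evaluate_gene(gene: str, env: dict) -> float:
    """Evaluate the ORF of one gene, read as a level-order tree."""
    arities = []
    needed = 1
    for symbol in gene:  # scan until the ORF is complete
        arity = FUNCTIONS[symbol][0] if symbol in FUNCTIONS else 0
        arities.append(arity)
        needed += arity - 1
        if needed == 0:
            break
    # In level order, the children of a node follow all earlier children.
    child_start, next_child = [], 1
    for arity in arities:
        child_start.append(next_child)
        next_child += arity
    values = [0.0] * len(arities)
    for i in range(len(arities) - 1, -1, -1):  # children before parents
        if arities[i] == 0:
            values[i] = env[gene[i]]
        else:
            args = values[child_start[i]:child_start[i] + arities[i]]
            values[i] = FUNCTIONS[gene[i]][1](*args)
    return values[0]


def evaluate_chromosome(genes, env, link=operator.add) -> float:
    """Link the sub-ET values left to right with the linking function."""
    result = evaluate_gene(genes[0], env)
    for gene in genes[1:]:
        result = link(result, evaluate_gene(gene, env))
    return result


env = {"a": 3.0, "b": 1.0, "c": 5.0, "d": 1.0}
genes = ["Q*+-abcda", "*ab+ababa"]  # hypothetical genes, h = 4, t = 5
print(evaluate_gene(genes[0], env))     # sqrt((3+1)*(5-1)) = 4.0
print(evaluate_chromosome(genes, env))  # 4.0 + 3*1 = 7.0
```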
Example with + Link (figure not recovered)
Analysis
Can we represent the final tree with a single K-expression?
What is the equivalent K-expression?
Issues?
It is difficult for the genetic operators to evolve such a gene because it has fewer tail symbols.
Multi-gene representation – faster convergence.
Example 2 for Non-coding Region in Multi-gene
The chromosome has two genes.
Head = 3 and tail = 4
Operators: N – NOT, O – OR
Fig. a) represents the K-expression.
Fig. b) the first operator is the connecting operator.
In gene 1 – OOcacab – the last two characters ‘ab’ belong to the non-coding region.
In gene 2 – NNNbbcb – the last three characters ‘bcb’ belong to the non-coding region.
Gene1
Gene2
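The two non-coding regions stated above can be checked mechanically by counting how many symbols each gene still requires. A minimal sketch, assuming N takes one argument and O takes two (the helper name `split_orf` is illustrative):

```python
ARITY = {"N": 1, "O": 2}  # N = NOT, O = OR; a, b, c are terminals


def split_orf(gene: str) -> tuple[str, str]:
    """Return (coding region, non-coding region) of one gene."""
    needed = 1  # the root
    for position, symbol in enumerate(gene):
        needed += ARITY.get(symbol, 0) - 1
        if needed == 0:
            return gene[:position + 1], gene[position + 1:]
    raise ValueError("gene ends before the expression is complete")


for gene in ("OOcacab", "NNNbbcb"):
    print(gene, split_orf(gene))
# OOcacab ('OOcac', 'ab')
# NNNbbcb ('NNNb', 'bcb')
```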
Example 3 (figure not recovered)
Example 4 (figure not recovered)
Points to Remember
The type of linking function, as well as the number of genes and the length of each gene, are chosen a priori for each problem.
So, we can always start with a single-gene chromosome, gradually increasing the length of the head; if it becomes very large, we can increase the number of genes and, of course, choose a function to link them.
In some cases, another linking function might be more appropriate.
The idea, of course, is to find a good solution, and GEP provides the means of finding one.
Mutation
Mutations can occur anywhere in the chromosome.
However, the structural organization of chromosomes must remain intact.
In the heads any symbol can change into another (function or terminal); in the tails terminals can only change into terminals.
This way, the structural organization of chromosomes is maintained, and all the new individuals produced by mutation are structurally correct programs.
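The mutation rule can be sketched as a per-position point mutation whose replacement pool depends on whether the position lies in the head or the tail. The function name `mutate`, the per-symbol `rate` parameter, and the example gene are assumptions for illustration; the symbol sets are the F and T used earlier in the talk.

```python
import random

FUNCTIONS = ["*", "+", "-", "Q"]
TERMINALS = ["a", "b", "c", "d"]


def mutate(gene: str, head_length: int, rate: float, rng: random.Random) -> str:
    """Point-mutate one gene while keeping the head/tail organization."""
    symbols = list(gene)
    for position in range(len(symbols)):
        if rng.random() < rate:
            # Head: any symbol is allowed. Tail: terminals only.
            pool = FUNCTIONS + TERMINALS if position < head_length else TERMINALS
            # The draw may return the current symbol, leaving it unchanged.
            symbols[position] = rng.choice(pool)
    return "".join(symbols)


mother = "Q*+-abcda"  # hypothetical gene with h = 4, t = 5
daughter = mutate(mother, head_length=4, rate=0.3, rng=random.Random(42))
print(mother, "->", daughter)
```

Because the tail can never receive a function, the daughter always decodes to a valid expression tree, whatever the mutation rate.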
Mutation – Mother genome
K-expression – Equation 3.5 (figure not recovered)
Equivalent ET (figure not recovered)
Daughter Genome (figure not recovered)
Mutation… (figure not recovered)
Neutral Mutation
A mutation that occurs in the non-coding region is called a neutral mutation. It does not affect the ET: mother and daughter have the same ET.
The ORF ends at position 7 of the head.
Suppose a mutation occurs at tail position 9:
change ‘a’ to ‘b’.
The ‘phenotype’ of the daughter genome is the same as the mother's.
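A neutral mutation can be demonstrated by comparing the ORFs of mother and daughter. The gene below is hypothetical, built only to match the description above (ORF ends at position 7, ‘a’ at tail position 9); the slide's actual gene is not available, and `orf` is an illustrative helper.

```python
ARITY = {"Q": 1, "*": 2, "+": 2, "-": 2}


def orf(gene: str) -> str:
    """Return the coding region (ORF) of a gene."""
    needed = 1
    for position, symbol in enumerate(gene):
        needed += ARITY.get(symbol, 0) - 1
        if needed == 0:
            return gene[:position + 1]
    raise ValueError("gene ends before the expression is complete")


mother = "Q+*-abcd" + "bacdabcda"            # hypothetical: h = 8, t = 9
daughter = mother[:9] + "b" + mother[10:]    # position 9: 'a' -> 'b'

print(len(orf(mother)) - 1)          # 7, the ORF ends at position 7
print(mother != daughter)            # True, the genotypes differ
print(orf(mother) == orf(daughter))  # True, the phenotype is unchanged
```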
Comments on Mutation
If a function is mutated into a terminal or vice versa, or a function of one argument is mutated into a function of two arguments or vice versa, the ET is modified drastically.
The change in tree size takes place without increasing the computational complexity.
It is worth noticing that in GEP there are no constraints on either the kind of mutation or the number of mutations in a chromosome: in all cases the newly created individuals are syntactically correct programs.
A) Transposition…
Any sequence in the genome might become an IS element; therefore these elements are randomly selected throughout the chromosome.
A copy of the transposon is made and inserted at any position in the head of a gene, except at the start position.
Typically, an IS transposition rate (pis) of 0.1 and a set of three IS elements of different lengths are used.
The transposition operator randomly chooses the chromosome, the start of the IS element, the target site, and the length of the transposon.
A) Transposition…
Suppose that the sequence “bba” in gene 2 (positions 12 through 14) was chosen to be an IS element, and the target site was bond 6 in gene 1 (between positions 5 and 6).
Then, a cut is made in bond 6 and the block “bba” is copied into the site of insertion.
During transposition, the sequence upstream from the insertion site stays unchanged, whereas the sequence downstream from the copied IS element loses, at the end of the head, as many symbols as the length of the IS element (in this case the sequence “a*b” was deleted).
Mother and daughter chromosomes (figure not recovered)
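The IS transposition step described above can be sketched as an insert-and-truncate on the head of the target gene. The two-gene chromosome below (h = 10, t = 11) is an assumption chosen to be consistent with the positions and symbols quoted in the text, since the slide's own chromosomes were not recovered; the function name and parameters are illustrative.

```python
def is_transpose(chromosome: str, gene_length: int, head_length: int,
                 source_start: int, length: int,
                 target_gene: int, target_pos: int) -> str:
    """Copy an IS element into the head of target_gene at target_pos."""
    if not 1 <= target_pos < head_length:
        raise ValueError("target must be inside the head, not at the root")
    element = chromosome[source_start:source_start + length]
    start = target_gene * gene_length
    head = chromosome[start:start + head_length]
    # Insert the copy, then cut the head back to its fixed length.
    new_head = (head[:target_pos] + element + head[target_pos:])[:head_length]
    return chromosome[:start] + new_head + chromosome[start + head_length:]


gene1 = "*-+*a-+a*b" + "babbaababab"  # assumed gene 1: head + tail
gene2 = "Q**+abQbb*" + "aabbaaaabba"  # assumed gene 2: "bba" at 12..14
mother = gene1 + gene2
daughter = is_transpose(mother, gene_length=21, head_length=10,
                        source_start=21 + 12, length=3,
                        target_gene=0, target_pos=6)
print(mother[:10], "->", daughter[:10])  # *-+*a-+a*b -> *-+*a-bba+
print(daughter[10:] == mother[10:])      # True: tail and gene 2 unchanged
```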
Root…
During root transposition, the whole head shifts to accommodate the RIS element, losing, at the same time, the last symbols of the head (as many as the length of the RIS element).
As with IS elements, the tail of the gene subjected to transposition and all nearby genes stay unchanged.
Note, again, that the newly created programs are syntactically correct because the structural organization of the chromosome is maintained.
The modifications caused by root transposition are extremely radical, because the root itself is modified.
In nature, if a transposable element is inserted at the beginning of the coding sequence of a gene, it drastically changes the encoded protein.
Like mutation and IS transposition, root insertion has a tremendous transforming power and is excellent for creating genetic variation.
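Root transposition can be sketched as prepending the RIS element to the head and truncating the head to its fixed length. The sketch assumes the usual GEP convention that a RIS element must start with a function, found by scanning downstream from a chosen point in the head; that rule is not stated in this transcript, and the gene and function name are hypothetical.

```python
ARITY = {"Q": 1, "*": 2, "+": 2, "-": 2, "/": 2}


def ris_transpose(gene: str, head_length: int,
                  scan_start: int, length: int) -> str:
    """Move a copy of a RIS element to the root of the gene."""
    head, tail = gene[:head_length], gene[head_length:]
    for position in range(scan_start, head_length):
        if head[position] in ARITY:  # assumed rule: element starts at a function
            element = gene[position:position + length]
            # The head shifts right and loses its last len(element) symbols.
            new_head = (element + head)[:head_length]
            return new_head + tail
    return gene  # no function found downstream: nothing happens


mother = "+a*bQ" + "ababab"  # hypothetical gene: h = 5, t = 6
daughter = ris_transpose(mother, head_length=5, scan_start=1, length=3)
print(mother, "->", daughter)  # +a*bQababab -> *bQ+aababab
```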
c) Gene Transposition
In gene transposition an entire gene functions as a transposon and transposes itself to the beginning of the chromosome.
In contrast to the other forms of transposition, in gene transposition the transposon (the gene) is deleted at the place of origin.
This way, the length of the chromosome is maintained.
The chromosome to undergo gene transposition is randomly chosen, and one of its genes (except the first, obviously) is randomly chosen to transpose.
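Gene transposition amounts to moving one whole gene to the front of the chromosome. A minimal sketch with a hypothetical three-gene chromosome (h = 4, gene length 9); the function name is illustrative.

```python
def gene_transpose(chromosome: str, gene_length: int, gene_index: int) -> str:
    """Move the gene at gene_index to the beginning of the chromosome."""
    genes = [chromosome[i:i + gene_length]
             for i in range(0, len(chromosome), gene_length)]
    if not 1 <= gene_index < len(genes):
        raise ValueError("choose any gene except the first")
    moved = genes.pop(gene_index)  # deleted at the place of origin
    return "".join([moved] + genes)


mother = "*Qb+ababa" + "-+abbabab" + "Q*a-baabb"  # hypothetical genes
daughter = gene_transpose(mother, gene_length=9, gene_index=2)
print(daughter)  # Q*a-baabb*Qb+ababa-+abbabab
```

With a commutative linking function such as addition, this operator alone does not change the phenotype; it only reorders the genes.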
Gene… (figure not recovered)
Recombination Operator
Three types of recombination operator:
One-point
Two-point
Gene
Two parents are randomly chosen and paired to exchange genetic material between them.
One-point (figure not recovered)
Two-point
In two-point recombination the chromosomes are paired and the two points of recombination are randomly chosen.
The material between the recombination points is afterwards exchanged between the two chromosomes, forming two new daughter chromosomes.
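The three recombination operators can be sketched as string exchanges between two parents of equal length. Since both parents share the same head/tail layout, symbols only ever swap with symbols at the same position, so the daughters remain valid. The parent chromosomes below are hypothetical (two genes each, h = 4), and expressing gene recombination as a two-point exchange at gene boundaries is a design choice of this sketch.

```python
import random


def one_point(p1: str, p2: str, point: int) -> tuple[str, str]:
    """Exchange everything downstream of a single crossover point."""
    return p1[:point] + p2[point:], p2[:point] + p1[point:]


def two_point(p1: str, p2: str, first: int, second: int) -> tuple[str, str]:
    """Exchange the material between two crossover points."""
    first, second = sorted((first, second))
    return (p1[:first] + p2[first:second] + p1[second:],
            p2[:first] + p1[first:second] + p2[second:])


def gene_recombination(p1: str, p2: str, gene_length: int,
                       gene_index: int) -> tuple[str, str]:
    """Exchange one entire gene between the parents."""
    start = gene_index * gene_length
    return two_point(p1, p2, start, start + gene_length)


parent1 = "*Qb+ababa" + "-+abbabab"  # hypothetical parents
parent2 = "Q*a-baabb" + "+a*bbbaab"
rng = random.Random(7)  # seeded only to make the run repeatable
points = rng.sample(range(1, len(parent1)), 2)
print(two_point(parent1, parent2, *points))
print(gene_recombination(parent1, parent2, gene_length=9, gene_index=1))
```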
Two-point… (figure not recovered)
Gene Recombination (figure not recovered)