Lecture on bioinformatics, chapter 2: optimization and genetic algorithms

Fabian J. Theis

Institute of Biophysics, University of Regensburg, Germany

[email protected]

4th May 2005

Theis Optimization and genetic algorithms

Outline

- Introduction
  - Reinforcement learning
  - Optimization
  - Imitate nature
  - Genetic algorithms
- Algorithm
  - Basic algorithm
  - Data representation
  - Selection
  - Reproduction
- Properties and extensions
  - Overview
  - Convergence analysis
  - Schema theorem
  - Genetic programming
- Examples
  - 2d-function optimization
  - Genetic Mastermind
  - Hyperplane detection

Introduction

- idea of genetic algorithms (GAs)
  - extract optimization strategies nature uses successfully → Darwinian evolution
  - transform them for application in mathematical optimization theory
- abstract goal: find the global optimum of a problem/function in a defined phase space

Optimization

- GA as special kind of reinforcement learning
  - no access to the full problem/function
  - but: rewards are given for a given action/search space position
  - goal: use rewards to find optimum
  - this contrasts to learning by (given) examples as in supervised learning, e.g. using neural networks
  - → traverse search space manually

Optimization

- simple algorithm: random sampling
  - pick a single location in the search space
  - store it if reward is higher than at previous locations, discard it otherwise
  - repeat
- other such algorithms
  - Markov-Chain-Monte-Carlo search (MCMC)
  - simulated annealing
  - if derivative of reward is available: (conjugate) gradient ascent/descent etc.
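The random sampling loop above can be sketched in a few lines of Python. This is only an illustration: the names `random_sampling`, `reward` and `sample`, and the toy reward function, are assumptions made for the example and do not come from the lecture.

```python
import random

def random_sampling(reward, sample, n_iter=1000, seed=0):
    """Return the best of n_iter independently sampled locations."""
    rng = random.Random(seed)
    best_x = sample(rng)
    best_r = reward(best_x)
    for _ in range(n_iter - 1):
        x = sample(rng)            # pick a single location in the search space
        r = reward(x)
        if r > best_r:             # store it if its reward beats all previous ones
            best_x, best_r = x, r  # (otherwise it is simply discarded)
    return best_x, best_r

# Hypothetical toy reward with a single maximum at x = 2.
best_x, best_r = random_sampling(
    reward=lambda x: -(x - 2.0) ** 2,
    sample=lambda rng: rng.uniform(-5.0, 5.0),
)
```

Each sample is drawn independently of the previous ones, so the search uses the reward only to decide what to keep, not where to look next.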

Optimization

- simple algorithmic maximization possible e.g. by gradient ascent:
  - a differentiable function $f : \mathbb{R}^n \to \mathbb{R}$ can be maximized by local updates in directions of its gradient
  - given sufficiently small learning rate $\eta > 0$ and a starting point $x(0) \in \mathbb{R}^n$
- local maxima of $f$ can be found by iterating

  $$x(t+1) = x(t) + \eta \, \Delta x(t)$$

  with

  $$\Delta x(t) = (Df)(x(t))^\top = \operatorname{grad} f(x(t)) = \frac{\partial f}{\partial x}(x(t))$$

  the gradient of $f$ at $x(t)$.
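As a minimal sketch of the update rule x(t+1) = x(t) + η grad f(x(t)), the following Python code runs gradient ascent on a toy concave function. The function `gradient_ascent`, the quadratic example and the chosen values of η and the step count are assumptions for illustration only.

```python
def gradient_ascent(grad, x0, eta=0.1, n_steps=200):
    """Iterate x(t+1) = x(t) + eta * grad f(x(t)) for a fixed number of steps."""
    x = list(x0)
    for _ in range(n_steps):
        g = grad(x)
        x = [xi + eta * gi for xi, gi in zip(x, g)]
    return x

# Hypothetical example: f(x) = -(x1 - 1)^2 - (x2 + 2)^2, maximum at (1, -2).
def grad_f(x):
    return [-2.0 * (x[0] - 1.0), -2.0 * (x[1] + 2.0)]

x_max = gradient_ascent(grad_f, x0=[0.0, 0.0])
```

For this f the single local maximum is also the global one. On a function with several maxima the same loop would stop at whichever maximum is uphill from the starting point.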

Stochastic optimization

- Problem: minimize multivariate, real function $f(x)$, $x = (x_1, x_2, \ldots, x_n)^\top$
- if no analytical expression of $f(x)$ is given (but only some values) or $f$ is non-continuous, $f$ can be minimized by stochastic methods
- Random search
  - choose $x_1$ randomly
  - calculate $f(x_1)$
  - choose $\delta$ randomly
  - calculate $f(x_1 + \delta)$
  - if $f(x_1 + \delta) < f(x_1)$: choose $x_1 + \delta$ as new starting point
  - if $f(x_1 + \delta) > f(x_1)$: choose new $\delta$
  - choose $\delta = \delta(t)$ with $\lim_{t \to \infty} \delta(t) = 0$
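The random search steps can be sketched as follows. The Gaussian perturbation, the geometric schedule for the step size δ(t) and the test function are assumptions chosen for the example; the slide only requires that δ(t) tends to 0.

```python
import random

def random_search(f, x_start, delta0=1.0, decay=0.99, n_iter=2000, seed=0):
    """Minimize f by accepting only random perturbations that lower f."""
    rng = random.Random(seed)
    x = list(x_start)
    fx = f(x)
    for t in range(n_iter):
        scale = delta0 * decay ** t               # delta(t) shrinks towards 0
        cand = [xi + rng.gauss(0.0, scale) for xi in x]
        f_cand = f(cand)
        if f_cand < fx:                           # new starting point
            x, fx = cand, f_cand
        # otherwise keep x and draw a new delta in the next iteration
    return x, fx

# Hypothetical test function with its minimum at the origin.
def sphere(x):
    return sum(xi * xi for xi in x)

x_min, f_min = random_search(sphere, [3.0, -2.0])
```

Because worse candidates are never accepted, the value of f along the search can only decrease, which also means the search cannot leave a local minimum once the step size has become small.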

Stochastic optimization

- Metropolis sampling
  - like random search but
  - if $f(x_1 + \delta) > f(x_1)$ only use newly generated $x$ with probability

    $$P = \frac{1}{1 + \exp\left(\frac{f(x_1 + \delta) - f(x_1)}{\alpha}\right)}$$

    with $\lim_{t \to \infty} \alpha(t) = 0$
  - then the algorithm leaves a local minimum with $P > 0$
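A hedged sketch of this acceptance rule is given below. The acceptance probability follows the formula on the slide; the uniform proposal, the geometric schedule for α(t), the overflow guard and the double-well test function are assumptions added for the example.

```python
import math
import random

def accept_probability(f_new, f_old, alpha):
    """P = 1 / (1 + exp((f_new - f_old) / alpha))."""
    z = (f_new - f_old) / alpha
    if z > 700.0:                 # math.exp would overflow; P is effectively 0
        return 0.0
    return 1.0 / (1.0 + math.exp(z))

def metropolis_search(f, x0, step=0.5, alpha0=1.0, decay=0.995,
                      n_iter=3000, seed=0):
    """Random search that sometimes accepts a worse point."""
    rng = random.Random(seed)
    x, fx = x0, f(x0)
    best_x, best_f = x, fx
    for t in range(n_iter):
        alpha = alpha0 * decay ** t               # alpha(t) shrinks towards 0
        cand = x + rng.uniform(-step, step)
        f_cand = f(cand)
        # improvements are always taken, worse points only with probability P
        if f_cand <= fx or rng.random() < accept_probability(f_cand, fx, alpha):
            x, fx = cand, f_cand
        if fx < best_f:
            best_x, best_f = x, fx
    return best_x, best_f

# Hypothetical one-dimensional function with two local minima.
def double_well(x):
    return x ** 4 - 3.0 * x ** 2 + x

best_x, best_f = metropolis_search(double_well, x0=1.5)
```

While α is large, uphill moves are accepted with probability close to 1/2. As α shrinks the rule approaches plain random search.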

Genetic algorithms

- here: imitate nature’s robust way of evolving successful organisms
  - organisms ill-suited to an environment die off, whereas fit ones reproduce
  - offspring is similar to the parents, so population fitness increases with generations
  - mutation can randomly generate new species
  - ‘The Origin of Species by Means of Natural Selection’, C.R. Darwin, D. Appleton and Company, NY, 1897
- history:
  - introduced by J. Holland 1975
  - further investigated by his students, e.g. K. DeJong 1975
  - more recently theoretical advances, e.g. by M. Vose 1993

Genetic algorithms

- what’s good for nature is good for artificial systems
- imagine population of individual ‘explorers’ sent into the optimization phase-space
  - explorer is defined by its genes, encoding its phase-space position
  - optimization problem is given by a fitness function
- the struggle of ‘life’ begins
  - selection
  - crossover
  - mutation
- according to these rules populations tend to increase overall fitness

Genetic algorithms

- advantages
  - global not only local optimization
  - simple and hence easy to implement
  - easy parallelization possible
- disadvantages
  - how to encode phase-space position
  - rather low speed and high computational cost
  - parameter dependencies (population size, selection and reproduction parameters)

Basic genetic algorithm

Data: population, a set of individuals
      fitness-function Fitness, a function measuring fitness of an individual
Result: an individual

repeat
    parents ← Selection(population, Fitness)
    population ← Reproduction(parents)
until some individual is fit enough;
return the best individual in population according to Fitness
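The pseudocode can be turned into a small runnable sketch. The driver `genetic_algorithm` mirrors the repeat/until loop, with a generation cap added so that it always terminates. Everything else is an assumption made for illustration: the "OneMax" toy problem (count the 1-bits), binary tournament selection, one-point crossover and the mutation rate of 0.02.

```python
import random

def genetic_algorithm(population, fitness, selection, reproduction,
                      fit_enough, max_generations=200):
    """repeat/until loop of the basic GA, with a generation cap added."""
    for _ in range(max_generations):
        parents = selection(population, fitness)
        population = reproduction(parents)
        if any(fit_enough(ind) for ind in population):
            break
    return max(population, key=fitness)

# Hypothetical toy problem ("OneMax"): maximize the number of 1-bits.
rng = random.Random(0)
N_BITS, POP_SIZE = 16, 20

def fitness(ind):
    return sum(ind)

def selection(population, fitness):
    # binary tournaments; an individual may win several of them
    return [max(rng.sample(population, 2), key=fitness) for _ in population]

def reproduction(parents):
    children = []
    for a, b in zip(parents[0::2], parents[1::2]):
        r = rng.randrange(1, N_BITS)                 # crossover point
        for child in (a[:r] + b[r:], b[:r] + a[r:]):
            # flip each bit with a small probability
            children.append([1 - g if rng.random() < 0.02 else g for g in child])
    return children

start = [[rng.randint(0, 1) for _ in range(N_BITS)] for _ in range(POP_SIZE)]
best = genetic_algorithm(start, fitness, selection, reproduction,
                         fit_enough=lambda ind: fitness(ind) == N_BITS)
```

Passing selection and reproduction as functions keeps the driver identical to the pseudocode, so other operators can be swapped in without touching the loop.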

Individual

- an individual encodes the data space position
- classic GA approach: representation by word (chromosome) over a finite alphabet
  - each letter is called gene
  - real DNA: alphabet is $\{A, G, T, C\}$
  - here: usually binary alphabet $\{0, 1\}$
  - some authors speak more generally of evolutionary programming if alphabet is larger
  - finite alphabet implies discrete search space
- continuous search space
  - use continuous ‘alphabet’, i.e. genes $\in \mathbb{R}$ or bounded genes $\in [a, b]$
  - so individual $\in \mathbb{R}^n$ respectively $\in [a_1, b_1] \times \ldots \times [a_n, b_n]$
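One common way to connect the two representations is to read a binary chromosome as an integer and scale it into a bounded interval [a, b]. The following sketch shows such a mapping. The functions `decode` and `encode` and the linear scaling are assumptions for illustration and are not prescribed by the lecture.

```python
def decode(bits, a, b):
    """Map a binary chromosome (most significant bit first) to a value in [a, b]."""
    k = int("".join(str(bit) for bit in bits), 2)
    return a + (b - a) * k / (2 ** len(bits) - 1)

def encode(x, a, b, n_bits):
    """Map x in [a, b] to the nearest representable binary chromosome."""
    k = round((x - a) / (b - a) * (2 ** n_bits - 1))
    return [int(c) for c in format(k, f"0{n_bits}b")]

chromosome = [1, 0, 1, 1, 0, 0, 1, 0]
position = decode(chromosome, -1.0, 1.0)     # a point in [-1, 1]
```

With n bits the interval is covered by 2^n equally spaced points, which makes the discreteness of the search space explicit.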

Selection

- goal: select individuals that produce the next generation
- probabilistic selection
  - based on fitness function $f$
  - better individuals have increased chance of reproduction
  - usually selection with replacement → very fit individuals reproduce several times
- selection probabilities
  - roulette wheel (Holland 1975)

    $$P(\text{choice of individual } i) = \frac{f(i)}{\sum_j f(j)}$$

    problem: negative $f$? minimization?
  - ranking methods, i.e. choose individuals according to fitness rank, e.g. normalized geometric ranking (Joines and Houck 1994)
  - tournament selection, i.e. select best among a randomly selected subset
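Roulette wheel and tournament selection can be sketched with Python's standard `random` module. The function names, the tournament size and the integer toy population are assumptions for illustration. The roulette wheel version rejects negative fitness values, which is the problem noted above.

```python
import random

def roulette_probabilities(fitness_values):
    """P(i) = f(i) / sum_j f(j); needs non-negative values with positive sum."""
    total = sum(fitness_values)
    if min(fitness_values) < 0 or total <= 0:
        raise ValueError("roulette wheel needs non-negative fitness, positive sum")
    return [v / total for v in fitness_values]

def roulette_wheel(population, fitness, n, rng):
    """Draw n parents with replacement, proportionally to fitness."""
    probs = roulette_probabilities([fitness(ind) for ind in population])
    return rng.choices(population, weights=probs, k=n)

def tournament(population, fitness, n, rng, size=2):
    """Draw n parents; each is the best of a random subset of the given size."""
    return [max(rng.sample(population, size), key=fitness) for _ in range(n)]

# Hypothetical population of integers whose fitness is their own value.
rng = random.Random(1)
population = [0, 1, 2, 3]
wheel_picks = roulette_wheel(population, lambda i: i, 50, rng)
tour_picks = tournament(population, lambda i: i, 5, rng, size=4)
```

Tournament selection only compares fitness values, so it also works for negative fitness and, with `min` in place of `max`, for minimization.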

Reproduction

- typically consists of two stages
  - crossover (or mating): selected individuals are randomly paired and (usually two) children are produced
  - mutation: genes can be altered by random mutation to a different value according to a small probability
- use genetic operators to produce and alter new offspring → basic search mechanism in GAs

Crossover

- let $x, y \in A^n$ be the genes of the two parents
- simple crossover
  - choose $r$ randomly in $\{1, \ldots, n\}$
  - generate children $x', y' \in A^n$ by

    $$x'_i := \begin{cases} x_i & \text{if } i < r \\ y_i & \text{otherwise} \end{cases} \qquad y'_i := \begin{cases} y_i & \text{if } i < r \\ x_i & \text{otherwise} \end{cases}$$

- in the case of continuous genes: arithmetic crossover
  - choose $r$ randomly in $[0, 1]$
  - generate children $x', y' \in A^n$ by

    $$x' := r x + (1 - r) y$$

    $$y' := (1 - r) x + r y$$
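Both crossover operators translate directly into Python. This is a minimal sketch: the function names and the example parents are assumptions, and `rng.random()` draws r from [0, 1) rather than the closed interval [0, 1].

```python
import random

def simple_crossover(x, y, rng):
    """One-point crossover: genes with 1-based index i < r stay with their parent."""
    n = len(x)
    r = rng.randint(1, n)          # r drawn uniformly from {1, ..., n}
    cut = r - 1                    # the first r - 1 genes have index i < r
    return x[:cut] + y[cut:], y[:cut] + x[cut:]

def arithmetic_crossover(x, y, rng):
    """Children are the convex combinations r*x + (1-r)*y and (1-r)*x + r*y."""
    r = rng.random()               # r in [0, 1)
    child1 = [r * xi + (1 - r) * yi for xi, yi in zip(x, y)]
    child2 = [(1 - r) * xi + r * yi for xi, yi in zip(x, y)]
    return child1, child2

rng = random.Random(0)
c1, c2 = simple_crossover([0] * 8, [1] * 8, rng)
d1, d2 = arithmetic_crossover([0.0, 2.0], [1.0, 4.0], rng)
```

Note that r = 1 makes the simple crossover return the two parents with their roles swapped, and that the arithmetic children always lie on the line segment between the parents.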

Mutation

- let $x_i \in A$ be the gene of an individual that is to be mutated
- binary gene: binary mutation
  - $x'_i := 1 - x_i$
- discrete or continuous bounded $A$: uniform mutation
  - set $x'_i$ to be a uniformly randomly chosen element of $A$
- also possible: non-uniform mutation
  - needs fixed distribution for element choice
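The three mutation variants can be sketched as follows. The per-gene mutation probability `p`, the function names and the choice of a clipped Gaussian as the fixed distribution for non-uniform mutation are assumptions made for this example.

```python
import random

def binary_mutation(bits, p, rng):
    """Flip each bit (x_i -> 1 - x_i) independently with probability p."""
    return [1 - b if rng.random() < p else b for b in bits]

def uniform_mutation(genes, bounds, p, rng):
    """Replace each gene with probability p by a uniform draw from its interval."""
    return [rng.uniform(lo, hi) if rng.random() < p else g
            for g, (lo, hi) in zip(genes, bounds)]

def gaussian_mutation(genes, bounds, p, sigma, rng):
    """Non-uniform variant: add Gaussian noise, then clip to the interval."""
    out = []
    for g, (lo, hi) in zip(genes, bounds):
        if rng.random() < p:
            g = min(hi, max(lo, g + rng.gauss(0.0, sigma)))
        out.append(g)
    return out

rng = random.Random(0)
bounds = [(0.0, 1.0), (-5.0, 5.0)]
flipped = binary_mutation([0, 1, 1, 0], 1.0, rng)        # p = 1 flips every bit
resampled = uniform_mutation([0.5, 0.0], bounds, 1.0, rng)
nudged = gaussian_mutation([0.5, 0.0], bounds, 1.0, 0.1, rng)
```

In practice p is kept small so that most genes of most offspring are inherited unchanged.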

One generation example

Analytical example

- optimize $f(x) := x^2$ in $[0, 1]$
- $x \to 0.b_9 b_8 \ldots b_0$ ⇒ 1024 values of $x$
- minimal distance of adjacent values is $2^{-10}$
- mutation of the $i$-th bit after the binary point changes $x$ by $\delta = \pm 2^{-i}$: $x \to x + \delta$
- crossover of two bit strings:

  $$x = 0.b_9 b_8 \ldots b_0$$

  $$y = 0.a_9 a_8 \ldots a_0 = x + \delta$$

  $$\Rightarrow \; z = 0.b_9 b_8 \ldots b_i \, a_{i-1} \ldots a_0$$

- crossover corresponds to improved stochastic search
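The numbers in this example can be checked with a short sketch. The helper names and the concrete bit strings are assumptions for illustration. Bits are stored most significant first, and `flip` counts positions from the binary point, which is an indexing choice made here.

```python
N_BITS = 10

def decode(bits):
    """bits = [b9, b8, ..., b0]; returns the binary fraction 0.b9 b8 ... b0."""
    return sum(b * 2.0 ** -(k + 1) for k, b in enumerate(bits))

def flip(bits, i):
    """Flip the i-th bit after the binary point (i = 1, ..., 10)."""
    out = list(bits)
    out[i - 1] = 1 - out[i - 1]
    return out

def crossover(x_bits, y_bits, cut):
    """Child takes its leading `cut` bits from x and the remaining bits from y."""
    return x_bits[:cut] + y_bits[cut:]

# All 2^10 chromosomes, decoded and sorted.
all_values = sorted(
    decode([(k >> (N_BITS - 1 - j)) & 1 for j in range(N_BITS)])
    for k in range(2 ** N_BITS)
)

x = [0, 1, 1, 0, 0, 0, 0, 0, 0, 0]      # decodes to 0.375
y = [0, 1, 1, 0, 0, 0, 0, 1, 0, 1]      # x plus a small delta
z = crossover(x, y, cut=8)              # lies between x and y
```

When the parents share their leading bits, as x and y do here, the child z stays in the neighbourhood of both. This is one way to read the remark that crossover acts as an improved stochastic search.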

Progress diagram of a genetic algorithm

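1. Gen := 0; create an initial random population.
2. Termination criteria satisfied? If yes: end.
3. If no: evaluate the fitness of each individual and set i := 0.
4. i = M? If yes: Gen := Gen + 1 and go back to step 2.
5. If no: select a genetic operation, with probability $P_r$ (reproduction), $P_c$ (crossover) or $P_m$ (mutation):
   - reproduction ($P_r$): select one individual, perform reproduction, copy it into the new population
   - crossover ($P_c$): select two individuals, perform crossover, insert the two offspring into the new population, i := i + 1
   - mutation ($P_m$): select one individual, perform mutation, insert the mutant into the new population
6. i := i + 1 and go back to step 4.

Here i counts the individuals already inserted into the new population, which holds M individuals.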
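The progress diagram translates almost line by line into a loop. The sketch below is one possible reading, not the lecture's implementation: the fixed number of generations as termination criterion, fitness-proportional selection, the default probabilities and all function names are assumptions made for illustration. It is run on the 10-bit example $f(x) = x^2$.

```python
import random

def run_ga(fitness, random_individual, crossover, mutate,
           M=20, generations=30, p_r=0.1, p_c=0.8, p_m=0.1, seed=0):
    """Generational loop following the progress diagram (illustrative sketch)."""
    rng = random.Random(seed)
    population = [random_individual(rng) for _ in range(M)]   # Gen := 0
    for _ in range(generations):                              # termination criterion
        scores = [fitness(ind) for ind in population]         # evaluate fitness

        def select():
            # fitness-proportional choice of one individual (assumed scheme)
            return rng.choices(population, weights=scores, k=1)[0]

        new_population = []                                    # i := 0
        while len(new_population) < M:                         # i = M ?
            op = rng.choices(["r", "c", "m"], weights=[p_r, p_c, p_m], k=1)[0]
            if op == "r":                                      # reproduction
                new_population.append(select())
            elif op == "c":                                    # crossover: two offspring
                new_population.extend(crossover(select(), select(), rng))
            else:                                              # mutation
                new_population.append(mutate(select(), rng))
        population = new_population[:M]   # crossover may overshoot M by one
    return population

def decode(bits):
    return sum(b * 2.0 ** -(k + 1) for k, b in enumerate(bits))

def one_point(a, b, rng):
    cut = rng.randint(1, len(a) - 1)
    return [a[:cut] + b[cut:], b[:cut] + a[cut:]]

def flip_one(a, rng):
    i = rng.randrange(len(a))
    return a[:i] + [1 - a[i]] + a[i + 1:]

final = run_ga(
    fitness=lambda bits: decode(bits) ** 2 + 1e-9,  # small offset keeps all weights positive
    random_individual=lambda rng: [rng.randint(0, 1) for _ in range(10)],
    crossover=one_point,
    mutate=flip_one,
)
print(max(decode(ind) for ind in final))
```

The operators always return new lists, so copying an individual by reference in the reproduction branch is safe here.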


Convergence analysis

- the analysis is based on the notion of a 'schema' (J. Holland, 1975)
- example goal: maximize $f(x) = x^2$ in $[0, 1]$
- generate $N$ numbers in $[0, 1]$ using 10-bit fixed-point coding
- generate each new population by choosing codes with a probability that decreases with decreasing fitness, e.g.

  chain   successive generations (left to right)
  1       0.1...  →  0.1...  →  0.11...  →  ...  →  0.11...
  2       0.1...  →  0.1...  →  0.11...  →  ...  →  0.11...
  ...
  N       0.0...  →  0.1...  →  0.10...  →  ...  →  0.11...

  so that the population approaches 0.1111111111
- each crossover defines new points, and points close to $x = 1$ become more and more frequent in the population; the final step size is $2^{-10}$


Schema

A schema is a bit pattern over the symbols {0, 1, .} that represents a set of binary strings; the symbol '.' stands for an arbitrary bit.

Genetic algorithms consist of a finite series of three steps:

1. choice of parent bit patterns
2. recombination
3. mutation

With what probability do the fitter bit patterns survive from generation to generation?
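To make the notion of a schema concrete, the following sketch tests whether a bit string contains a schema and counts the matching strings in a small population. Strings are represented as Python `str` objects, and the names `contains` and `count` as well as the sample population are chosen here for illustration only.

```python
def contains(chain, schema):
    """True if the bit string `chain` contains (matches) `schema`.

    Both are strings of equal length; '.' in the schema matches either bit.
    """
    return len(chain) == len(schema) and all(
        s == "." or s == c for c, s in zip(chain, schema)
    )

def count(population, schema):
    """Number of chains in the population that contain the schema."""
    return sum(contains(chain, schema) for chain in population)

population = ["1101", "1001", "0111", "1110"]
print(count(population, "1..1"))  # 2  (1101 and 1001)
print(count(population, "11.."))  # 2  (1101 and 1110)
```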


Consider the probability that the fitter bit patterns are selected for generating offspring and that they survive the recombination and mutation operations.

- population at time $t$: $N$ binary chains (schemas) of length $l$
- $O(H, t)$: number of elements of the population at time $t$ that contain the schema $H$
- $d(H)$: diameter of a schema, i.e. the length of the shortest subpattern containing all non-'.' characters (fixed bits), e.g. d(..1.1..) = d(1.1) = 3
- maximize fitness $f$ defined on all binary chains of length $l$
- selection of a parent chain with probability

  $$p(H_j) = \frac{f(H_j)}{\sum_{i=1}^{N} f(H_i)}$$

- mean fitness of the population

  $$f_\mu = \frac{1}{N} \sum_{i=1}^{N} f(H_i) \quad\Rightarrow\quad p(H_j) = \frac{f(H_j)}{N \cdot f_\mu}$$
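The diameter and the selection probabilities defined on this slide can be computed directly. This is a small sketch under stated assumptions: the fitness function (the integer value of the bit string) and the sample population are invented for the example, and returning 0 for a schema without fixed bits is a convention chosen here, not one given in the lecture.

```python
def diameter(schema):
    """d(H): length of the shortest substring covering all fixed (non-'.') bits."""
    fixed = [k for k, s in enumerate(schema) if s != "."]
    return fixed[-1] - fixed[0] + 1 if fixed else 0  # 0 for all-wildcard: our convention

def selection_probabilities(population, fitness):
    """p(H_j) = f(H_j) / (N * f_mu) for every chain H_j of the population."""
    values = [fitness(chain) for chain in population]
    n = len(values)
    f_mu = sum(values) / n                 # mean fitness of the population
    return [v / (n * f_mu) for v in values]

population = ["1101", "1001", "0111", "1110"]
fitness = lambda chain: int(chain, 2)      # hypothetical fitness: 13, 9, 7, 14

probs = selection_probabilities(population, fitness)
print(diameter("..1.1.."))                 # 3
print(probs)                               # 13/43, 9/43, 7/43, 14/43
```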


Selection

- selection probability of a chain containing schema $H$:

  $$P = \sum_{j=1}^{k} \frac{f(H_j)}{N \cdot f_\mu}$$

  where $H_1, \ldots, H_k$ are the chains of the population containing schema $H$
- fitness $f(H)$ of schema $H$ in generation $t$:

  $$f(H) = \frac{\sum_j f(H_j)}{O(H, t)} \quad\Rightarrow\quad P = \frac{O(H, t)\, f(H)}{N \cdot f_\mu}$$

- probability $P_A$ that two chains containing $H$ are selected as parents:

  $$P_A = \left[\frac{O(H, t)\, f(H)}{N \cdot f_\mu}\right]^2$$

- probability $P_B$ that exactly one of two selected chains contains $H$:

  $$P_B = 2 \cdot \frac{O(H, t)\, f(H)}{N \cdot f_\mu} \cdot \left(1 - \frac{O(H, t)\, f(H)}{N \cdot f_\mu}\right)$$
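The three selection probabilities can be evaluated for a concrete population. The sketch below assumes that both parents are drawn independently with fitness-proportional probabilities, as the squared form of $P_A$ suggests; the population, the fitness function and the name `schema_selection` are illustrative assumptions.

```python
def contains(chain, schema):
    """True if `chain` matches `schema`; '.' matches either bit."""
    return all(s == "." or s == c for c, s in zip(chain, schema))

def schema_selection(population, fitness, schema):
    """Return (P, P_A, P_B) for `schema` under fitness-proportional selection."""
    n = len(population)
    f_mu = sum(fitness(c) for c in population) / n            # mean fitness
    members = [c for c in population if contains(c, schema)]
    o = len(members)                                          # O(H, t)
    f_h = sum(fitness(c) for c in members) / o if o else 0.0  # schema fitness f(H)
    p = o * f_h / (n * f_mu)
    return p, p ** 2, 2 * p * (1 - p)

population = ["1101", "1001", "0111", "1110"]
fitness = lambda chain: int(chain, 2)      # hypothetical fitness: 13, 9, 7, 14

p, p_a, p_b = schema_selection(population, fitness, "1..1")
print(p, p_a, p_b)   # P = 22/43; P_A + P_B + (1 - P)^2 sums to 1
```

The remaining probability mass $(1 - P)^2$ is the case in which neither parent contains $H$.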


Recombination

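- probability that schema $H$ is contained in the child chain: 1 if both parents contain $H$, and 1/2 on average if only one parent contains it
- schema $H$ is divided during crossover with probability

  $$P_{div} = \frac{d(H) - 1}{l - 1}$$

  hence the probability of survival during recombination is

  $$
  \begin{aligned}
  W &\geq \left(\frac{O(H,t)\,f(H)}{N \cdot f_\mu}\right)^2 + \frac{2}{2}\left(\frac{O(H,t)\,f(H)}{N \cdot f_\mu}\right)\left(1 - \frac{O(H,t)\,f(H)}{N \cdot f_\mu}\right)\left(1 - \frac{d(H)-1}{l-1}\right) \\
  &\geq \frac{O(H,t)\,f(H)}{N \cdot f_\mu}\left(1 - \frac{d(H)-1}{l-1}\left(1 - \frac{O(H,t)\,f(H)}{N \cdot f_\mu}\right)\right)
  \end{aligned}
  $$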
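The last step of this bound is pure algebra and can be checked by hand. Writing $P$ for $O(H,t) f(H) / (N \cdot f_\mu)$ and $P_{div}$ for $(d(H)-1)/(l-1)$, the shorthand the lecture itself uses when stating the schema theorem, the expansion reads:

```latex
\begin{aligned}
P^2 + \tfrac{2}{2}\,P\,(1-P)\,(1-P_{div})
  &= P\,\bigl[\,P + (1-P)(1-P_{div})\,\bigr] \\
  &= P\,\bigl[\,1 - P_{div} + P\,P_{div}\,\bigr] \\
  &= P\,\bigl(1 - P_{div}\,(1-P)\bigr)
\end{aligned}
```

So the second relation in the bound holds with equality.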


Mutation

- during mutation, a schema $H$ with $b(H)$ fixed bits survives with probability $(1 - p)^{b(H)}$, where $p$ is the mutation probability per bit
- hence the total survival probability is

  $$W' \geq \frac{O(H,t)\,f(H)}{N \cdot f_\mu}\left(1 - \frac{d(H)-1}{l-1}\left(1 - \frac{O(H,t)\,f(H)}{N \cdot f_\mu}\right)\right)(1-p)^{b(H)}$$

- Schema theorem: if $N$ new chains are generated in generation $t$, then the mean number of chains containing $H$ in generation $t + 1$ is

  $$\langle O(H, t+1) \rangle = N \cdot W' \geq N \cdot P \cdot \bigl(1 - P_{div}(1 - P)\bigr)(1-p)^{b(H)}$$

  with

  $$P = \frac{O(H,t)\,f(H)}{N \cdot f_\mu} \quad\text{and}\quad P_{div} = \frac{d(H)-1}{l-1}$$
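To get a feeling for the sizes involved, the lower bound on $W'$ can be multiplied by $N$ and evaluated numerically. The function below is a sketch with hypothetical parameter names, and the numbers in the example are made up for illustration.

```python
def schema_bound(o, f_h, f_mu, n, d, l, b, p_mut):
    """N times the lower bound on W', i.e. a lower bound on <O(H, t+1)>.

    o: O(H, t), f_h: f(H), f_mu: mean fitness, n: population size N,
    d: diameter d(H), l: chain length, b: fixed bits b(H),
    p_mut: mutation probability p per bit.
    """
    p = o * f_h / (n * f_mu)                # selection probability P
    p_div = (d - 1) / (l - 1)               # probability of being cut by crossover
    w_prime = p * (1 - p_div * (1 - p)) * (1 - p_mut) ** b
    return n * w_prime

# Hypothetical numbers: 10 of 100 chains contain H, H is 20% fitter than average.
short = schema_bound(o=10, f_h=1.2, f_mu=1.0, n=100, d=3, l=10, b=2, p_mut=0.01)
long_ = schema_bound(o=10, f_h=1.2, f_mu=1.0, n=100, d=9, l=10, b=2, p_mut=0.01)
print(short, long_)   # the schema with the larger diameter gets the smaller bound
```

With $d(H) = 1$ and $p = 0$ the bound reduces to $O(H,t) \cdot f(H) / f_\mu$, the pure selection effect.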


Conclusions from the schema theorem

- schemas with higher fitness and small diameter are favored and reproduce with higher probability
- too high a mutation rate $p$ destroys every schema
- if a schema $H$ is well represented in a population, it reproduces better even at medium fitness
- if $f(H) = f_\mu$, then

  $$1 - P_{div}(1 - P) = 1 - P_{div}\left(1 - \frac{O(H, t)}{N}\right)$$

  i.e. schemata with high $O(H, t)/N$ are hardly cut during crossover and survive; hence the algorithm converges to these patterns with mean fitness $\Rightarrow$ genetic drift (corresponds to a random walk in search space)
- genetic drift loses bit patterns, which can be countered by higher mutation rates
- the correct balance between mutation and crossover is important, but depends on the problem


Genetic programming

- generalization of genetic algorithms to automatically develop computer programs (CP)
- each structure of the population represents a single CP
- properties of CPs:
  - hierarchical operations
  - alternative calculation using conditions
  - iterative calculations
  - manipulation of different data types
  - procedural calculations


Genetic programming: coding

representation of a CP by a tree, e.g.

    if a > b then x := 3

can be represented by the tree

           if
         ↙    ↘
       >        :=
     ↙  ↘     ↙  ↘
    a    b    x    3

- the search for a CP that solves a given problem can be performed using principles of genetic algorithms
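The tree above can be stored and executed with very little machinery. The following sketch uses nested tuples `(operator, left, right)` with variable names and integer constants as leaves; this encoding and the tiny interpreter `run` are assumptions made for illustration, since the slides only show the tree picture.

```python
# Program tree for: if a > b then x := 3
program = ("if", (">", "a", "b"), (":=", "x", 3))

def run(node, env):
    """Evaluate a program tree in the variable environment `env` (a dict)."""
    if isinstance(node, int):            # constant leaf
        return node
    if isinstance(node, str):            # variable leaf
        return env[node]
    op, left, right = node
    if op == ">":
        return run(left, env) > run(right, env)
    if op == ":=":
        env[left] = run(right, env)      # left is a variable name
        return env[left]
    if op == "if":
        return run(right, env) if run(left, env) else None
    raise ValueError(f"unknown operator {op!r}")

env = {"a": 5, "b": 2, "x": 0}
run(program, env)
print(env["x"])  # 3, because a > b holds
```

A fitness function for genetic programming would run such a tree on test inputs and score the resulting environment.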


Genetic programming: crossover operator

parents generation

        A                  G
      ↙   ↘              ↙   ↘
     B     D            H     I
     ↓    ↙ ↘                ↙ ↘
     C   E   F              J   K
                               ↙ ↘
                              L   M

children generation (the subtrees rooted at B and K are exchanged)

        A                  G
      ↙   ↘              ↙   ↘
     K     D            H     I
    ↙ ↘   ↙ ↘                ↙ ↘
   L   M E   F              J   B
                                ↓
                                C
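The exchange shown in the diagram can be reproduced with a short subtree-swap routine. In this sketch a tree is a nested list `[label, child, child, ...]` and a node is addressed by a tuple of child indices; both conventions and the function names are assumptions for illustration. The crossover points are fixed here to match the slide, whereas a genetic programming run would choose them at random.

```python
import copy

def get(tree, path):
    """Return the subtree reached by following the child indices in `path`."""
    for k in path:
        tree = tree[k + 1]               # index 0 holds the node label
    return tree

def replace(tree, path, subtree):
    """Return a copy of `tree` with the subtree at `path` replaced."""
    if not path:
        return copy.deepcopy(subtree)
    new = copy.deepcopy(tree)
    parent = get(new, path[:-1])
    parent[path[-1] + 1] = copy.deepcopy(subtree)
    return new

def crossover(t1, path1, t2, path2):
    """Swap the subtree of t1 at path1 with the subtree of t2 at path2."""
    return replace(t1, path1, get(t2, path2)), replace(t2, path2, get(t1, path1))

parent1 = ["A", ["B", ["C"]], ["D", ["E"], ["F"]]]
parent2 = ["G", ["H"], ["I", ["J"], ["K", ["L"], ["M"]]]]

# B is the first child of A; K is the second child of the second child of G.
child1, child2 = crossover(parent1, (0,), parent2, (1, 1))
print(child1)  # ['A', ['K', ['L'], ['M']], ['D', ['E'], ['F']]]
print(child2)  # ['G', ['H'], ['I', ['J'], ['B', ['C']]]]
```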


Genetic programming: mutation operator

a node is chosen randomly and replaced by a randomly generated subtree:

        A                  A
      ↙   ↘              ↙   ↘
     B     D            B     D
     ↓    ↙ ↘           ↓    ↙ ↘
     C   E   F          C   E   K
                               ↙ ↘
                              L   M
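A subtree mutation of this kind can be sketched as follows. Trees are nested lists `[label, child, ...]` as an illustrative encoding; the random subtree generator, its label alphabet, the depth limit and the stopping probability 0.3 are all assumptions, since the lecture does not specify how the new subtree is generated.

```python
import random

def random_tree(rng, labels, depth):
    """Grow a random binary tree with at most `depth` levels (illustrative generator)."""
    label = rng.choice(labels)
    if depth <= 1 or rng.random() < 0.3:
        return [label]
    return [label,
            random_tree(rng, labels, depth - 1),
            random_tree(rng, labels, depth - 1)]

def nodes(tree, path=()):
    """Yield the path (tuple of child indices) of every node in the tree."""
    yield path
    for k, child in enumerate(tree[1:]):
        yield from nodes(child, path + (k,))

def mutate(tree, rng, labels="KLMNOP", depth=2):
    """Return a copy of `tree` with one random node replaced by a random subtree."""
    target = rng.choice(list(nodes(tree)))

    def rebuild(node, path):
        if path == target:
            return random_tree(rng, labels, depth)
        return [node[0]] + [rebuild(c, path + (k,)) for k, c in enumerate(node[1:])]

    return rebuild(tree, ())

tree = ["A", ["B", ["C"]], ["D", ["E"], ["F"]]]
mutant = mutate(tree, random.Random(1))
print(mutant)
```

In the slide's example the chosen node is the leaf F and the generated subtree is K with children L and M.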


Examples

- continuous example
  - global optimization of a continuous function $f : [a, b] \to \mathbb{R}$
- binary example
  - genetic Mastermind
  - select optimal guess using GA
- example from our research
  - perform overcomplete blind source separation by sparse component analysis
  - key problem: hyperplane detection
  - solution: optimize cost function using GAs


2d-function optimization

[Figure, two panels. Left: the function "multipeak" plotted over x from −10 to 10, vertical axis from 5 to 50. Right: f performance (optimal individual and mean), horizontal axis from 0 to 100, vertical axis from 20 to 50.]


Genetic Mastermind


Hyperplane detection

[Figure: plots with all axes ranging from −1 to 1]

- perform overcomplete blind source separation by sparse component analysis [Georgiev et al., 2004, Theis et al., 2004]
- key problem: hyperplane detection
- solution: optimize cost function using GAs


Conclusions

- genetic algorithms perform global optimization
- they mimic nature by letting a population evolve according to the fitness of its individuals
- algorithm
  - selection
  - reproduction: by crossover and mutation
- simple applicability in real-world situations


Resources

- books: [Goldberg, 1989, Schoneburg et al., 1994]
- Matlab GA optimization toolbox: http://www.ie.ncsu.edu/mirage/GAToolBox/gaot
- details and papers on my website: http://fabian.theis.name
- This research was supported by the DFG and BMBF.

References

- P. Georgiev, F. Theis, and A. Cichocki. Sparse component analysis and blind source separation of underdetermined mixtures. IEEE Trans. on Neural Networks, in print, 2004.
- D. Goldberg. Genetic Algorithms in Search, Optimization and Machine Learning. Addison Wesley Publishing, 1989.
- E. Schoneburg, F. Heinzmann, and S. Feddersen. Genetische Algorithmen und Evolutionsstrategien. Addison Wesley Publishing, 1994.
- F. Theis, P. Georgiev, and A. Cichocki. Robust overcomplete matrix recovery for sparse sources using a generalized hough transform. In Proc. ESANN 2004, pages 343–348, Bruges, Belgium, 2004. d-side, Evere, Belgium. URL http://homepages.uni-regensburg.de/~thf11669/publications/theis04houghSCA ESANN04.pdf.
