CSM6120 Introduction to Intelligent Systems


Page 1: CSM6120 Introduction to Intelligent Systems

[email protected]

CSM6120 Introduction to Intelligent Systems

Evolutionary and Genetic Algorithms

Page 2: CSM6120 Introduction to Intelligent Systems

Informal biological terminology

• Genes: encoding rules that describe how an organism is built up from the tiny building blocks of life
• Chromosomes: long strings formed by connecting genes together
• Recombination: the process of two organisms mating, producing offspring that may end up sharing genes of their parents

Page 3: CSM6120 Introduction to Intelligent Systems

Basic ideas of EAs

• An EA is an iterative procedure which evolves a population of individuals
• Each individual is a candidate solution to a given problem
• Each individual is evaluated by a fitness function, which measures the quality of its candidate solution
• At each iteration (generation):
  • The best individuals are selected
  • Genetic operators are applied to the selected individuals in order to produce new individuals (offspring)
  • New individuals are evaluated by the fitness function

Page 4: CSM6120 Introduction to Intelligent Systems

Taxonomy

[Diagram: tree of search techniques. Node labels: Search Techniques; Informed; Uninformed; BFS; DFS; A*; Hill Climbing; Simulated Annealing; Evolutionary Algorithms; Genetic Programming; Genetic Algorithms; Swarm Intelligence; Evolutionary Strategies]

Page 5: CSM6120 Introduction to Intelligent Systems

The Genetic Algorithm

• Directed search algorithms based on the mechanics of biological evolution
• Developed by John Holland, University of Michigan (1970s)
  • To understand the adaptive processes of natural systems
  • To design artificial systems software that retains the robustness of natural systems
• Provide efficient, effective techniques for optimization and machine learning applications

Page 6: CSM6120 Introduction to Intelligent Systems

Some GA applications

Domain: Application types
Control: gas pipeline, pole balancing, missile evasion, pursuit
Design: semiconductor layout, aircraft design, keyboard configuration, communication networks
Scheduling: manufacturing, facility scheduling, resource allocation
Robotics: trajectory planning
Machine Learning: designing neural networks, improving classification algorithms, classifier systems
Signal Processing: filter design
Game Playing: poker, checkers, prisoner’s dilemma
Combinatorial Optimization: set covering, travelling salesman, routing, bin packing, graph colouring and partitioning

Page 7: CSM6120 Introduction to Intelligent Systems

Application: function optimisation (1)

[Figure: plots of three example functions]

f(x) = x^2

g(x) = sin(x) - 0.1 x + 2

h(x,y) = x.sin(4x) - y.sin(4y+ ) + 1

Page 8: CSM6120 Introduction to Intelligent Systems

Application: function optimisation (2)

• Conventional approaches: often require knowledge of derivatives or some other specific mathematical technique
• Evolutionary algorithm approach: requires only a measure of solution quality (fitness function)

Page 9: CSM6120 Introduction to Intelligent Systems

Components of a GA

A problem to solve, and ...

• Encoding technique (gene, chromosome)
• Initialization procedure (creation)
• Evaluation function (environment)
• Selection of parents (reproduction)
• Genetic operators (mutation, recombination)
• Parameter settings (practice and art)

Page 10: CSM6120 Introduction to Intelligent Systems

GA terminology

• Population: the collection of potential solutions (i.e. all the chromosomes)
• Parents/Children: both are chromosomes; children are generated from the parent chromosomes
• Generations: number of iterations/cycles through the GA process

Page 11: CSM6120 Introduction to Intelligent Systems

Simple GA

initialize population;
evaluate population;
while TerminationCriteriaNotSatisfied
{
    select parents for reproduction;
    perform recombination and mutation;
    evaluate population;
}
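The pseudocode above can be sketched in Python as follows. This is a minimal illustration, not the course's reference implementation: the function name `simple_ga`, the bit-string encoding, binary tournament selection, one-point crossover, bit-flip mutation, the default parameter values and the fixed generation count are all assumptions made for the example.

```python
import random

def simple_ga(fitness, chrom_len=20, pop_size=30, generations=50,
              crossover_rate=0.9, mutation_rate=0.02, seed=0):
    """Minimal generational GA on bit strings; maximises `fitness`."""
    rng = random.Random(seed)
    # initialize population
    pop = [[rng.randint(0, 1) for _ in range(chrom_len)] for _ in range(pop_size)]
    # evaluate population
    scores = [fitness(c) for c in pop]
    best = pop[scores.index(max(scores))]

    def pick():
        # select a parent: binary tournament (an assumed selection scheme)
        a, b = rng.randrange(pop_size), rng.randrange(pop_size)
        return pop[a] if scores[a] >= scores[b] else pop[b]

    for _ in range(generations):  # termination: fixed number of generations
        children = []
        while len(children) < pop_size:
            p1, p2 = pick(), pick()
            # recombination: one-point crossover, applied with some probability
            if rng.random() < crossover_rate:
                cut = rng.randrange(1, chrom_len)
                c1, c2 = p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]
            else:
                c1, c2 = p1[:], p2[:]
            # mutation: flip each bit with a small probability
            for child in (c1, c2):
                children.append([1 - g if rng.random() < mutation_rate else g
                                 for g in child])
        pop = children[:pop_size]            # generational replacement
        scores = [fitness(c) for c in pop]   # evaluate population
        gen_best = pop[scores.index(max(scores))]
        if fitness(gen_best) > fitness(best):
            best = gen_best
    return best
```

Calling `simple_ga(sum)` runs it on the toy problem of maximising the number of 1s in the string; any function that scores a list of bits can be passed instead.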

Page 12: CSM6120 Introduction to Intelligent Systems

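The GA cycle

[Diagram: the GA cycle. Selection takes parents from the population; recombination of the chosen parents produces children; modification gives modified children; evaluation gives evaluated children, which rejoin the population; deleted members of the population are discarded]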

Page 13: CSM6120 Introduction to Intelligent Systems

Population

Chromosomes could be:

Bit strings (0101 ... 1100)

Real numbers (43.2 -33.1 ... 0.0 89.2)

Permutations of elements (E11 E3 E7 ... E1 E15)

Lists of rules (R1 R2 R3 ... R22 R23)

Program elements (genetic programming)

... any data structure ...

Page 14: CSM6120 Introduction to Intelligent Systems

Example: Discrete representation

An individual can be represented using discrete values (binary, integer, or any other system with a discrete set of values)

The following is an example of binary representation:

CHROMOSOME: 1 0 1 0 0 0 1 1

GENE: one position (here, one bit) of the chromosome

Page 15: CSM6120 Introduction to Intelligent Systems

Example: Discrete representation

Genotype (8 bits): 1 0 1 0 0 0 1 1

Phenotype:

• Integer

• Real Number

• Schedule

• ...

• Anything?

Page 16: CSM6120 Introduction to Intelligent Systems

Example: Discrete representation

Phenotype could be integer numbers

Genotype: 1 0 1 0 0 0 1 1

Phenotype: 1*2^7 + 0*2^6 + 1*2^5 + 0*2^4 + 0*2^3 + 0*2^2 + 1*2^1 + 1*2^0 = 128 + 32 + 2 + 1 = 163
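The weighted sum on this slide is ordinary base-2 decoding, most significant bit first. A short sketch of the genotype-to-phenotype step is given below; the function name `bits_to_int` is an assumption made for the example.

```python
def bits_to_int(bits):
    """Decode a list of 0/1 genes (most significant bit first) to an integer."""
    value = 0
    for b in bits:
        value = value * 2 + b  # shift left by one bit, then add the new bit
    return value

# The bit pattern whose weights appear in the slide's sum (2^7, 2^5, 2^1, 2^0).
genotype = [1, 0, 1, 0, 0, 0, 1, 1]
phenotype = bits_to_int(genotype)  # 128 + 32 + 2 + 1
```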

Page 17: CSM6120 Introduction to Intelligent Systems

Example: Discrete representation

Phenotype could be real numbers, e.g. a number between 2.5 and 20.5 using 8 binary digits

Genotype: 1 0 1 0 0 0 1 1

Phenotype: x = 2.5 + (163/256) * (20.5 - 2.5) = 13.9609
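The same 8 bits can be scaled into a real interval. The sketch below follows the slide's arithmetic, dividing the integer value by 2^8 = 256, which means the upper bound itself is never reached; the function name `bits_to_real` and its parameter names are assumptions made for the example.

```python
def bits_to_real(bits, low, high):
    """Map a bit string linearly onto the interval [low, high)."""
    n = 0
    for b in bits:
        n = n * 2 + b                      # integer value of the bit string
    fraction = n / (2 ** len(bits))        # in [0, 1); 163/256 for the slide's genotype
    return low + fraction * (high - low)

x = bits_to_real([1, 0, 1, 0, 0, 0, 1, 1], 2.5, 20.5)  # about 13.9609
```

Using more bits gives a finer grid over the same interval, at the cost of a longer chromosome.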

Page 18: CSM6120 Introduction to Intelligent Systems

Example: Discrete representation

Phenotype could be a schedule, e.g. 8 jobs, 2 time steps

Genotype: 1 0 1 0 0 0 1 1

Phenotype:

Job:       1 2 3 4 5 6 7 8
Time step: 2 1 2 1 1 1 2 2

Page 19: CSM6120 Introduction to Intelligent Systems

Example: Real-valued representation

A very natural encoding: if the solution we are looking for is a list of real-valued numbers, then encode it as a list of real-valued numbers (i.e. not as a string of 1s and 0s)!

Lots of applications, e.g. parameter optimisation

Page 20: CSM6120 Introduction to Intelligent Systems

Representation

Task: how to represent the travelling salesman problem (TSP)?

Find a tour of a given set of cities so that

• Each city is visited only once
• The total distance travelled is minimised

Page 21: CSM6120 Introduction to Intelligent Systems

Representation

One possibility: an ordered list of city numbers (this is known as an order-based GA)

1) London  2) Venice  3) Dunedin  4) Singapore
5) Beijing  6) Phoenix  7) Tokyo  8) Victoria

Chromosome 1 (3 5 7 2 1 6 4 8)
Chromosome 2 (2 5 7 6 8 1 3 4)
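An order-based chromosome is only meaningful if it is a permutation of the city numbers. A small sketch of creating and checking such chromosomes is shown below; the names `CITIES`, `random_tour` and `is_valid_tour` are assumptions made for the example.

```python
import random

# City numbers and names as listed on the slide.
CITIES = {1: "London", 2: "Venice", 3: "Dunedin", 4: "Singapore",
          5: "Beijing", 6: "Phoenix", 7: "Tokyo", 8: "Victoria"}

def random_tour(rng=random):
    """Create a random chromosome: a shuffled list of all city numbers."""
    tour = list(CITIES)
    rng.shuffle(tour)
    return tour

def is_valid_tour(tour):
    """A chromosome is valid if every city appears exactly once."""
    return sorted(tour) == sorted(CITIES)

chromosome_1 = [3, 5, 7, 2, 1, 6, 4, 8]
print([CITIES[c] for c in chromosome_1])  # the tour as city names
```

A check like `is_valid_tour` is useful when testing crossover and mutation operators, which can easily break the permutation property.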

Page 22: CSM6120 Introduction to Intelligent Systems

Selection

[Diagram: selection takes parents from the population]

Page 23: CSM6120 Introduction to Intelligent Systems

Selection

• Need to choose which chromosomes to use based on their ‘fitness’
• Why not choose the best chromosomes?
  • We want a balance between exploration and exploitation

Page 24: CSM6120 Introduction to Intelligent Systems

Roulette wheel selection
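The figure for this slide is not in the transcript. Roulette wheel selection is usually described as fitness-proportionate selection: each individual gets a slice of the wheel proportional to its fitness. The sketch below assumes maximisation, non-negative fitness values and a positive total; the function name `roulette_select` is an assumption made for the example.

```python
import random

def roulette_select(population, fitnesses, rng=random):
    """Pick one individual with probability proportional to its fitness."""
    total = sum(fitnesses)
    spin = rng.random() * total   # where the wheel stops, in [0, total)
    running = 0.0
    for individual, fit in zip(population, fitnesses):
        running += fit            # right-hand edge of this individual's slice
        if spin < running:
            return individual
    return population[-1]         # guard against floating-point round-off
```

A single very fit individual can take over most of the wheel, which is one of the problems that rank-based selection is meant to avoid.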

Page 25: CSM6120 Introduction to Intelligent Systems

Rank-based selection

• 1st step
  • Sort (rank) individuals according to fitness
  • Ascending or descending order (minimization or maximization)
• 2nd step
  • Select individuals with probability proportional to their rank only (ignoring the fitness value)
  • The better the rank, the higher the probability of being selected
• It avoids most of the problems associated with roulette-wheel selection, but still requires global sorting of individuals, reducing the potential for parallel processing
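The two steps can be sketched as below. Linear rank weights (worst individual gets weight 1, best gets weight n) are an assumption, as are the function name `rank_select` and the `maximise` flag; other rank-to-probability mappings are possible.

```python
import random

def rank_select(population, fitnesses, rng=random, maximise=True):
    """Pick one individual with probability proportional to its rank."""
    # Step 1: sort indices from worst to best.
    order = sorted(range(len(population)),
                   key=lambda i: fitnesses[i],
                   reverse=not maximise)
    # Step 2: weight by rank only (1 = worst, n = best); fitness values are ignored.
    ranks = list(range(1, len(order) + 1))
    chosen = rng.choices(order, weights=ranks, k=1)[0]
    return population[chosen]
```

With three individuals the selection probabilities are 1/6, 2/6 and 3/6 regardless of how large the gap in raw fitness is.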

Page 26: CSM6120 Introduction to Intelligent Systems

Tournament selection

• A number of “tournaments” are run
  • Several chromosomes are chosen at random
  • The chromosome with the highest fitness is selected each time
• Larger tournament size means that weak chromosomes are less likely to be selected
• Advantages
  • It is efficient to code
  • It works on parallel architectures
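A single tournament can be sketched in a few lines, which illustrates why the method is cheap to code. The function name `tournament_select`, the default tournament size `k=3` and sampling without replacement are assumptions made for the example.

```python
import random

def tournament_select(population, fitnesses, k=3, rng=random):
    """Run one tournament: sample k individuals, return the fittest."""
    contenders = rng.sample(range(len(population)), k)   # k distinct indices
    winner = max(contenders, key=lambda i: fitnesses[i])
    return population[winner]
```

Each tournament only looks at k individuals, so no global sort or fitness total is needed and tournaments can run independently of each other.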

Page 27: CSM6120 Introduction to Intelligent Systems

The GA cycle

[Diagram: the GA cycle, repeated from page 12]

Page 28: CSM6120 Introduction to Intelligent Systems

Crossover: recombination

Parents:
P1 (0 1 1 0 1 0 1 1)
P2 (1 1 0 1 1 0 0 1)

Children:
C1 (1 1 0 1 1 0 1 1)
C2 (0 1 1 0 1 0 0 1)

Crossover is a critical feature of GAs:

• It greatly accelerates search early in the evolution of a population
• It leads to effective combination of sub-solutions on different chromosomes

Several methods for crossover exist…
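The children on this slide are consistent with one-point crossover, where both parents are cut at the same position and the tails are exchanged. The sketch below assumes that method and a cut after the fourth gene; the function name `one_point_crossover` is an assumption made for the example.

```python
def one_point_crossover(p1, p2, cut):
    """Swap the tails of two equal-length parents at position `cut`."""
    child_a = p1[:cut] + p2[cut:]   # head of p1, tail of p2
    child_b = p2[:cut] + p1[cut:]   # head of p2, tail of p1
    return child_a, child_b

P1 = [0, 1, 1, 0, 1, 0, 1, 1]
P2 = [1, 1, 0, 1, 1, 0, 0, 1]
# With cut=4, child_a matches C2 and child_b matches C1 on the slide.
child_a, child_b = one_point_crossover(P1, P2, cut=4)
```

In a GA the cut position would normally be chosen at random for each pair of parents.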

Page 29: CSM6120 Introduction to Intelligent Systems

Crossover

How would we implement crossover for TSPs?

Parent 1 (3 5 7 2 1 6 4 8)
Parent 2 (2 5 7 6 8 1 3 4)

Page 30: CSM6120 Introduction to Intelligent Systems

Crossover

Parent 1 (3 5 7 2 1 6 4 8)
Parent 2 (2 5 7 6 8 1 3 4)

Child 1 (3 5 7 6 8 1 3 4)
Child 2 (2 5 7 2 1 6 4 8)
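Note that the children above are not valid tours: Child 1 visits city 3 twice and never visits city 2, and Child 2 does the reverse. One common repair, not named in these slides, is an order-preserving crossover. The sketch below is a simplified variant of order crossover (OX): it copies a slice from one parent and fills the remaining positions, left to right, with the missing cities in the order they appear in the other parent. The function name `order_crossover` and the slice positions are assumptions made for the example.

```python
def order_crossover(p1, p2, start, end):
    """Child keeps p1[start:end]; other positions follow the city order of p2."""
    child = [None] * len(p1)
    child[start:end] = p1[start:end]          # slice inherited from parent 1
    kept = set(p1[start:end])
    remaining = iter(c for c in p2 if c not in kept)  # parent 2's order, minus kept cities
    for i in range(len(child)):
        if child[i] is None:
            child[i] = next(remaining)
    return child

parent_1 = [3, 5, 7, 2, 1, 6, 4, 8]
parent_2 = [2, 5, 7, 6, 8, 1, 3, 4]
child = order_crossover(parent_1, parent_2, start=3, end=6)
```

Because every city is placed exactly once, the child is always a valid permutation.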

Page 31: CSM6120 Introduction to Intelligent Systems

Mutation: local modification

Before: (1 0 1 1 0 1 1 0)
After: (0 1 1 0 0 1 1 0)

Before: (1.38 -69.4 326.44 0.1)
After: (1.38 -67.5 326.44 0.1)

• Causes movement in the search space (local or global)
• Restores lost information to the population
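The two before/after pairs correspond to mutating a bit string and a real-valued chromosome. A possible sketch of each is given below; per-gene mutation with probability `rate`, Gaussian noise with standard deviation `sigma`, and the function names are assumptions made for the example.

```python
import random

def bit_flip_mutation(bits, rate, rng=random):
    """Flip each bit independently with probability `rate`."""
    return [1 - b if rng.random() < rate else b for b in bits]

def gaussian_mutation(values, rate, sigma, rng=random):
    """Add zero-mean Gaussian noise to each gene with probability `rate`."""
    return [v + rng.gauss(0.0, sigma) if rng.random() < rate else v
            for v in values]
```

A small `sigma` gives mostly local moves in the search space, while a larger one allows occasional long jumps.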

Page 32: CSM6120 Introduction to Intelligent Systems

Mutation

Given the representation for TSPs, how could we achieve mutation?

Page 33: CSM6120 Introduction to Intelligent Systems

Mutation

Mutation involves reordering of the list (the cities in the two positions marked * on the slide, 7 and 6, are swapped):

Before: (5 8 7 2 1 6 3 4)

After: (5 8 6 2 1 7 3 4)
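The reordering shown here exchanges two cities, which keeps the chromosome a valid permutation. A minimal sketch is given below; the function names `swap_mutation` and `random_swap_mutation` are assumptions made for the example, and positions are 0-based.

```python
import random

def swap_mutation(tour, i, j):
    """Return a copy of the tour with the cities at positions i and j exchanged."""
    mutated = list(tour)
    mutated[i], mutated[j] = mutated[j], mutated[i]
    return mutated

def random_swap_mutation(tour, rng=random):
    """Swap two distinct, randomly chosen positions."""
    i, j = rng.sample(range(len(tour)), 2)
    return swap_mutation(tour, i, j)

before = [5, 8, 7, 2, 1, 6, 3, 4]
after = swap_mutation(before, 2, 5)  # exchanges cities 7 and 6, as on the slide
```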

Page 34: CSM6120 Introduction to Intelligent Systems

Note

• Both mutation and crossover are applied based on user-supplied probabilities
• We usually use a fairly high crossover rate and a fairly low mutation rate
  • Why do you think this is?

Page 35: CSM6120 Introduction to Intelligent Systems

Evaluation of fitness

• The evaluator decodes a chromosome and assigns it a fitness measure
• The evaluator is the only link between a classical GA and the problem it is solving

[Diagram: evaluation turns modified children into evaluated children]

Page 36: CSM6120 Introduction to Intelligent Systems

Fitness functions

• Evaluate the ‘goodness’ of chromosomes (how well they solve the problem)
• Critical to the success of the GA
• Often difficult to define well
• Must be fairly fast, as each chromosome must be evaluated each generation (iteration)

Page 37: CSM6120 Introduction to Intelligent Systems

Fitness functions

Fitness function for the TSP?

(3 5 7 2 1 6 4 8)

As we’re minimizing the distance travelled, the fitness is the total distance travelled in the journey defined by the chromosome
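A sketch of this fitness function is shown below. The slides do not give city coordinates, so the `coords` mapping here is made up for illustration; straight-line (Euclidean) distance and returning to the starting city at the end of the tour are also assumptions.

```python
import math

def tour_length(tour, coords):
    """Total distance of the closed tour; lower values are fitter."""
    total = 0.0
    # Pair each city with the next one, wrapping round to the start.
    for a, b in zip(tour, tour[1:] + tour[:1]):
        total += math.dist(coords[a], coords[b])
    return total

# Hypothetical coordinates: four cities on the corners of a unit square.
coords = {1: (0, 0), 2: (0, 1), 3: (1, 1), 4: (1, 0)}
around_the_edge = tour_length([1, 2, 3, 4], coords)
across_the_diagonals = tour_length([1, 3, 2, 4], coords)
```

Because the GA is minimising here, selection has to prefer smaller values, for example by ranking in ascending order of tour length.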

Page 38: CSM6120 Introduction to Intelligent Systems

Deletion

• Generational GA: the entire population is replaced with each iteration
• Steady-state GA: a few members are replaced each generation

[Diagram: deleted members of the population are discarded]
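The two deletion policies can be contrasted in code. In the sketch below, discarding the least fit members in the steady-state case is an assumption (other policies, such as replacing random or oldest members, are also used), as are the function names; `fitness` is maximised and there are assumed to be no more children than population members.

```python
def generational_replacement(population, children):
    """Generational GA: the children replace the whole population."""
    return list(children)

def steady_state_replacement(population, children, fitness):
    """Steady-state GA: children replace the least fit members only."""
    keep = len(population) - len(children)
    survivors = sorted(population, key=fitness, reverse=True)[:keep]
    return survivors + list(children)
```

For instance, with a population `[1, 5, 3, 9]`, one child `7` and the identity function as fitness, the steady-state version discards `1` and keeps the other three alongside the child.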

Page 39: CSM6120 Introduction to Intelligent Systems

The GA cycle

[Diagram: the GA cycle, repeated from page 12]

Page 40: CSM6120 Introduction to Intelligent Systems

Stopping!

The GA cycle continues until

• The system has ‘converged’; or
• A specified number of iterations (‘generations’) has been performed
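Both stopping conditions can be combined in one check. The slides do not define ‘converged’; the sketch below assumes it means that the best fitness has not improved over the last `patience` generations. The function name `should_stop`, the parameter names and a maximising GA are assumptions made for the example.

```python
def should_stop(best_history, max_generations, patience=20, tol=1e-9):
    """best_history holds the best fitness of each generation so far."""
    if len(best_history) >= max_generations:
        return True                      # generation budget used up
    if len(best_history) <= patience:
        return False                     # too early to judge convergence
    recent = max(best_history[-patience:])
    earlier = max(best_history[:-patience])
    return recent <= earlier + tol       # no improvement in the recent window
```

Other notions of convergence, such as loss of diversity in the population, would need a different test.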

Page 41: CSM6120 Introduction to Intelligent Systems

An abstract example

Distribution of Individuals in Generation 0

Distribution of Individuals in Generation N

Page 42: CSM6120 Introduction to Intelligent Systems

Good demo of the GA components

http://www.obitko.com/tutorials/genetic-algorithms/example-function-minimum.php

Page 43: CSM6120 Introduction to Intelligent Systems

TSP example: 30 cities

[Figure: plot of the 30 cities; x axis 0 to 100, y axis 0 to 120]

Page 44: CSM6120 Introduction to Intelligent Systems

TSP30 (Performance = 941)

[Figure: tour plotted on the same axes; x 0 to 100, y 0 to 120]

Page 45: CSM6120 Introduction to Intelligent Systems

TSP30 (Performance = 800)

[Figure: tour plotted on the same axes; x 0 to 100, y 0 to 120]

Page 46: CSM6120 Introduction to Intelligent Systems

TSP30 (Performance = 652)

[Figure: tour plotted on the same axes; x 0 to 100, y 0 to 120]

Page 47: CSM6120 Introduction to Intelligent Systems

TSP30 Solution (Performance = 420)

[Figure: tour plotted on the same axes; x 0 to 100, y 0 to 120]

Page 48: CSM6120 Introduction to Intelligent Systems

Overview of performance

[Figure: TSP30 - Overview of Performance. Best, Worst and Average distance (y axis "Distance", 0 to 1800) plotted against x axis "Generations (1000)", ticks 1 to 31]

Page 49: CSM6120 Introduction to Intelligent Systems

Example: n-queens

Put n queens on an n × n board with no two queens on the same row, column, or diagonal
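One way to turn this into a GA problem, sketched below under assumptions, is to store one queen per column (gene c holds the row of the queen in column c) and to count the pairs of queens that attack each other; a solution has a count of zero. The encoding and the function name `attacking_pairs` are assumptions made for the example, not taken from the slides.

```python
from itertools import combinations

def attacking_pairs(queens):
    """Count pairs of queens sharing a row or a diagonal.

    queens[c] is the row of the queen in column c, so columns never clash.
    """
    clashes = 0
    for (c1, r1), (c2, r2) in combinations(enumerate(queens), 2):
        same_row = r1 == r2
        same_diagonal = abs(r1 - r2) == abs(c1 - c2)
        if same_row or same_diagonal:
            clashes += 1
    return clashes
```

If the chromosome is also kept as a permutation of the rows, row clashes are impossible too and only diagonals remain to be optimised.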

Page 50: CSM6120 Introduction to Intelligent Systems

Examples

• Eaters
  • http://math.hws.edu/xJava/GA/
• TSP
  • http://www.heatonresearch.com/articles/65/page1.html
  • http://www.ads.tuwien.ac.at/raidl/tspga/TSPGA.html

Page 51: CSM6120 Introduction to Intelligent Systems

Exercise: The Card Problem

You have 10 cards numbered from 1 to 10. You have to choose a way of dividing them into 2 piles, so that the cards in Pile0 *sum* to a number as close as possible to 36, and the remaining cards in Pile1 *multiply* to a number as close as possible to 360

• Encoding: each card can be in Pile0 or Pile1, so there are 1024 possible ways of sorting them into 2 piles, and you have to find the best. Think of a sensible way of encoding any possible solution.
• Fitness: some of these chromosomes will be closer to the target than others. Think of a sensible way of evaluating any chromosome and scoring it with a fitness measure.
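One possible answer to the two questions, offered as a sketch rather than the intended solution: use a 10-bit chromosome where bit i says which pile card i+1 goes in (2^10 = 1024 chromosomes), and score it by the combined relative error against the two targets, to be minimised. The function names and the choice of relative error are assumptions made for the example.

```python
from math import prod

def decode(bits):
    """Bit i is 0 if card i+1 goes in Pile0, 1 if it goes in Pile1."""
    pile0 = [card for card, b in enumerate(bits, start=1) if b == 0]
    pile1 = [card for card, b in enumerate(bits, start=1) if b == 1]
    return pile0, pile1

def card_error(bits):
    """Combined relative error of both piles; 0 means both targets are hit."""
    pile0, pile1 = decode(bits)
    sum_error = abs(sum(pile0) - 36) / 36
    product_error = abs(prod(pile1) - 360) / 360   # prod([]) is 1
    return sum_error + product_error
```

Dividing each error by its target keeps the two terms on a comparable scale, so that the product term does not swamp the sum term.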

Page 52: CSM6120 Introduction to Intelligent Systems

Issues for GA practitioners

• Choosing basic implementation issues:
  • Representation
  • Population size, mutation rate, ...
  • Selection, deletion policies
  • Crossover, mutation operators
• Termination criteria
• Performance, scalability
• Solution is only as good as the fitness function (often hardest part)

Your assignment will be to code a GA for a given task! Be aware of the above issues…

Page 53: CSM6120 Introduction to Intelligent Systems

Benefits of GAs

• Concept is easy to understand
• Supports multi-objective optimization
• Good for “noisy” environments
• Always an answer; answer gets better with time
• Inherently parallel; easily distributed