CSM6120 Introduction to Intelligent Systems
Informal biological terminology
• Genes: encoding rules that describe how an organism is built up from the tiny building blocks of life
• Chromosomes: long strings formed by connecting genes together
• Recombination: the process of two organisms mating, producing offspring that may end up sharing genes of their parents
Basic ideas of EAs
• An EA is an iterative procedure which evolves a population of individuals
• Each individual is a candidate solution to a given problem
• Each individual is evaluated by a fitness function, which measures the quality of its candidate solution
• At each iteration (generation):
  • The best individuals are selected
  • Genetic operators are applied to the selected individuals in order to produce new individuals (offspring)
  • New individuals are evaluated by the fitness function
Taxonomy
[Diagram: Search Techniques, divided into Informed and Uninformed. Techniques shown: BFS, DFS, A*, Hill Climbing, Simulated Annealing, Evolutionary Algorithms, Genetic Programming, Genetic Algorithms, Swarm Intelligence, Evolutionary Strategies]
The Genetic Algorithm
• Directed search algorithms based on the mechanics of biological evolution
• Developed by John Holland, University of Michigan (1970s)
  • To understand the adaptive processes of natural systems
  • To design artificial systems software that retains the robustness of natural systems
• Provide efficient, effective techniques for optimization and machine learning applications
Some GA applications
Domain: Application types
Control: gas pipeline, pole balancing, missile evasion, pursuit
Design: semiconductor layout, aircraft design, keyboard configuration, communication networks
Scheduling: manufacturing, facility scheduling, resource allocation
Robotics: trajectory planning
Machine Learning: designing neural networks, improving classification algorithms, classifier systems
Signal Processing: filter design
Game Playing: poker, checkers, prisoner’s dilemma
Combinatorial Optimization: set covering, travelling salesman, routing, bin packing, graph colouring and partitioning
Application: function optimisation (1)
[Plots of three example functions]
f(x) = x^2
g(x) = sin(x) - 0.1x + 2
h(x,y) = x.sin(4x) - y.sin(4y+ ) + 1
Application: function optimisation (2)
• Conventional approaches: often require knowledge of derivatives or some other specific mathematical technique
• Evolutionary algorithm approach: requires only a measure of solution quality (fitness function)
Components of a GA
A problem to solve, and ...
• Encoding technique (gene, chromosome)
• Initialization procedure (creation)
• Evaluation function (environment)
• Selection of parents (reproduction)
• Genetic operators (mutation, recombination)
• Parameter settings (practice and art)
GA terminology
• Population: the collection of potential solutions (i.e. all the chromosomes)
• Parents/Children: both are chromosomes; children are generated from the parent chromosomes
• Generations: number of iterations/cycles through the GA process
Simple GA
initialize population;
evaluate population;
while TerminationCriteriaNotSatisfied {
    select parents for reproduction;
    perform recombination and mutation;
    evaluate population;
}
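The loop above can be sketched in Python roughly as follows. This is a minimal illustration under stated assumptions, not the course's reference implementation: the toy fitness (counting 1 bits), the use of tournament selection and one-point crossover, and every parameter value are choices made here for brevity.

```python
import random

def fitness(chromosome):
    # Assumed toy problem: the more 1 bits, the fitter the chromosome.
    return sum(chromosome)

def tournament(population, k=3):
    # Pick k random individuals and return the fittest of them.
    return max(random.sample(population, k), key=fitness)

def crossover(p1, p2):
    # One-point crossover at a random cut position.
    cut = random.randint(1, len(p1) - 1)
    return p1[:cut] + p2[cut:]

def mutate(chromosome, rate):
    # Flip each bit independently with probability `rate`.
    return [1 - g if random.random() < rate else g for g in chromosome]

def simple_ga(n_bits=20, pop_size=30, generations=50,
              crossover_rate=0.9, mutation_rate=0.02, seed=0):
    random.seed(seed)
    # initialize population
    population = [[random.randint(0, 1) for _ in range(n_bits)]
                  for _ in range(pop_size)]
    best = max(population, key=fitness)
    # termination criterion: a fixed number of generations
    for _ in range(generations):
        children = []
        while len(children) < pop_size:
            # select parents for reproduction
            p1, p2 = tournament(population), tournament(population)
            # perform recombination and mutation
            child = crossover(p1, p2) if random.random() < crossover_rate else p1[:]
            children.append(mutate(child, mutation_rate))
        population = children  # generational replacement
        # evaluate population and remember the best individual seen so far
        best = max(population + [best], key=fitness)
    return best

best = simple_ga()
print(best, fitness(best))
```

Swapping in a different problem only requires changing `fitness` and, if the encoding changes, the operators.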
The GA cycle
[Diagram: cycle linking population, selection, (chosen) parents, recombination, children, modification, modified children, evaluation, evaluated children, discard, deleted members]
Population
Chromosomes could be:
• Bit strings (0101 ... 1100)
• Real numbers (43.2 -33.1 ... 0.0 89.2)
• Permutations of elements (E11 E3 E7 ... E1 E15)
• Lists of rules (R1 R2 R3 ... R22 R23)
• Program elements (genetic programming)
• ... any data structure ...
An individual can be represented using discrete values (binary, integer, or any other system with a discrete set of values)
The following is an example of binary representation (diagram labels: CHROMOSOME, GENE)
Example: Discrete representation
Genotype (8 bits): 1 0 1 0 0 0 1 1
Phenotype:
• Integer
• Real Number
• Schedule
• ...
• Anything?
Example: Discrete representation
Phenotype could be integer numbers
Genotype: 1 0 1 0 0 0 1 1
Phenotype: 1*2^7 + 0*2^6 + 1*2^5 + 0*2^4 + 0*2^3 + 0*2^2 + 1*2^1 + 1*2^0 = 128 + 32 + 2 + 1 = 163
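The binary expansion above can be checked with a few lines of Python. This is only a sketch; the function name `bits_to_int` is made up for illustration, and the bit list is read off the expansion (most significant bit first).

```python
def bits_to_int(bits):
    # Most significant bit first: double the running value, then add the next bit.
    value = 0
    for b in bits:
        value = value * 2 + b
    return value

genotype = [1, 0, 1, 0, 0, 0, 1, 1]
print(bits_to_int(genotype))  # 163
```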
Example: Discrete representation
Phenotype could be real numbers, e.g. a number between 2.5 and 20.5 using 8 binary digits
Genotype: 1 0 1 0 0 0 1 1
Phenotype: x = 2.5 + (163/256) * (20.5 - 2.5) = 13.9609
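A hedged sketch of the same decoding in Python. It assumes the mapping x = low + (n / 2^bits) * (high - low), which reproduces the value 13.9609 given above for the integer 163; the name `bits_to_real` is illustrative.

```python
def bits_to_real(bits, low, high):
    # Interpret the bits as an unsigned integer (most significant bit first).
    n = int("".join(str(b) for b in bits), 2)
    # Scale it into the interval [low, high); 2 ** len(bits) is 256 for 8 bits.
    return low + n / 2 ** len(bits) * (high - low)

x = bits_to_real([1, 0, 1, 0, 0, 0, 1, 1], 2.5, 20.5)
print(round(x, 4))  # 13.9609
```

With this scaling the all-ones string maps just below 20.5; dividing by 2^bits - 1 instead would make the upper bound reachable.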
Example: Discrete representation
Phenotype could be a schedule, e.g. 8 jobs, 2 time steps
Genotype: 1 0 1 0 0 0 1 1
Phenotype:
Job:       1 2 3 4 5 6 7 8
Time step: 2 1 2 1 1 1 2 2
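The schedule reading can be sketched the same way. The rule used here (bit 0 means time step 1, bit 1 means time step 2) is an assumption inferred from the job/time-step table; `bits_to_schedule` is an illustrative name.

```python
def bits_to_schedule(bits):
    # Position i holds the bit for job i + 1; a 0 bit means time step 1, a 1 bit time step 2.
    return [bit + 1 for bit in bits]

print(bits_to_schedule([1, 0, 1, 0, 0, 0, 1, 1]))  # [2, 1, 2, 1, 1, 1, 2, 2]
```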
Example: Real-valued representation
• A very natural encoding: if the solution we are looking for is a list of real-valued numbers, then encode it as a list of real-valued numbers (i.e., not as a string of 1s and 0s)
• Lots of applications, e.g. parameter optimisation
Representation
Task: how to represent the travelling salesman problem (TSP)?
Find a tour of a given set of cities so that
• Each city is visited only once
• The total distance travelled is minimised
Representation
One possibility: an ordered list of city numbers (this is known as an order-based GA)
1) London  2) Venice  3) Dunedin  4) Singapore  5) Beijing  6) Phoenix  7) Tokyo  8) Victoria
Chromosome 1: (3 5 7 2 1 6 4 8)
Chromosome 2: (2 5 7 6 8 1 3 4)
Selection
[Diagram: population, selection, parents]
• Need to choose which chromosomes to use based on their ‘fitness’
• Why not choose the best chromosomes?
• We want a balance between exploration and exploitation
Roulette wheel selection
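Roulette wheel selection is commonly implemented as fitness-proportionate selection: each chromosome gets a slice of the wheel proportional to its fitness, and one spin picks a parent. The sketch below assumes that reading, with non-negative fitness values where larger is better; the function name and the `rng` parameter are illustrative.

```python
import random

def roulette_wheel_select(population, fitnesses, rng=random):
    # Assumes non-negative fitnesses (maximisation) with a positive total.
    total = sum(fitnesses)
    if total <= 0:
        raise ValueError("total fitness must be positive")
    spin = rng.random() * total  # a point on the wheel, in [0, total)
    running = 0.0
    for individual, f in zip(population, fitnesses):
        running += f
        if spin < running:
            return individual
    return population[-1]  # guard against floating-point round-off

parent = roulette_wheel_select(["a", "b", "c"], [1.0, 3.0, 6.0])
print(parent)
```

One very fit chromosome can dominate the wheel, which is one of the issues the rank-based scheme addresses.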
Rank-based selection
1st step
• Sort (rank) individuals according to fitness
• Ascending or descending order (minimization or maximization)
2nd step
• Select individuals with probability proportional to their rank only (ignoring the fitness value)
• The better the rank, the higher the probability of being selected
It avoids most of the problems associated with roulette-wheel selection, but still requires global sorting of individuals, reducing potential for parallel processing
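The two steps can be sketched as below. The linear weighting (worst individual gets weight 1, best gets weight N) is one common choice and an assumption here, as are the function name and the maximisation setting.

```python
import random

def rank_select(population, fitnesses, rng=random):
    # 1st step: sort indices from worst to best (maximisation assumed).
    order = sorted(range(len(population)), key=lambda i: fitnesses[i])
    # 2nd step: weight by rank only; the fitness magnitudes are ignored.
    weights = list(range(1, len(population) + 1))
    chosen = rng.choices(order, weights=weights, k=1)[0]
    return population[chosen]

print(rank_select(["worst", "mid", "best"], [1.0, 50.0, 1000.0]))
```

Here "best" is picked with probability 3/6 even though its fitness is 1000 times that of "worst", which would get almost no slice on a roulette wheel.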
Tournament selection
• A number of “tournaments” are run
  • Several chromosomes chosen at random
  • The chromosome with the highest fitness is selected each time
• Larger tournament size means that weak chromosomes are less likely to be selected
• Advantages
  • It is efficient to code
  • It works on parallel architectures
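A minimal sketch of a single tournament, assuming maximisation and contestants drawn without replacement; the names `tournament_select`, `size`, and `rng` are illustrative.

```python
import random

def tournament_select(population, fitness, size=3, rng=random):
    # Draw `size` distinct contestants at random and keep the fittest.
    contestants = rng.sample(population, size)
    return max(contestants, key=fitness)

population = [[0, 1, 1, 0], [1, 1, 1, 1], [0, 0, 0, 1], [1, 0, 1, 0]]
print(tournament_select(population, fitness=sum, size=2))
```

No global sort or fitness total is needed, which is why each tournament can run independently of the others.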
Crossover: recombination
Parents: P1 (0 1 1 0 1 0 1 1), P2 (1 1 0 1 1 0 0 1)
Children: C1 (1 1 0 1 1 0 1 1), C2 (0 1 1 0 1 0 0 1)
Crossover is a critical feature of GAs:
• It greatly accelerates search early in the evolution of a population
• It leads to effective combination of sub-solutions on different chromosomes
Several methods for crossover exist…
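One of those methods, one-point crossover, reproduces the P1/P2 example above. In this sketch the cut position 4 is an assumption (a cut after position 4, 5, or 6 gives the same children for these parents), and the function name is illustrative.

```python
def one_point_crossover(p1, p2, point):
    # Swap the tails of the two parents after position `point`.
    return p1[:point] + p2[point:], p2[:point] + p1[point:]

p1 = [0, 1, 1, 0, 1, 0, 1, 1]
p2 = [1, 1, 0, 1, 1, 0, 0, 1]
c2, c1 = one_point_crossover(p1, p2, 4)
print(c1)  # [1, 1, 0, 1, 1, 0, 1, 1]
print(c2)  # [0, 1, 1, 0, 1, 0, 0, 1]
```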
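Crossover
How would we implement crossover for TSPs?
Parent 1 (3 5 7 2 1 6 4 8)
Parent 2 (2 5 7 6 8 1 3 4)
Exchanging the tails after the third city gives:
Child 1 (3 5 7 6 8 1 3 4)
Child 2 (2 5 7 2 1 6 4 8)
Neither child is a valid tour: Child 1 visits city 3 twice and never visits city 2; Child 2 visits city 2 twice and never visits city 3.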
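Splicing the two parents at a single cut, as in the children shown above, can repeat one city and drop another. The slides do not say which operator to use instead; the sketch below shows one well-known option, a simplified form of order crossover (OX), purely as an illustration. The slice positions and helper names are assumptions.

```python
def is_valid_tour(tour, n_cities):
    # A valid tour visits every city 1..n exactly once.
    return sorted(tour) == list(range(1, n_cities + 1))

def order_crossover(p1, p2, start, end):
    # Keep p1[start:end] in place, then fill the remaining slots left to right
    # with the missing cities in the order in which they appear in p2.
    kept = p1[start:end]
    fill = [city for city in p2 if city not in kept]
    return fill[:start] + kept + fill[start:]

p1 = [3, 5, 7, 2, 1, 6, 4, 8]
p2 = [2, 5, 7, 6, 8, 1, 3, 4]

naive = p1[:3] + p2[3:]           # plain one-point crossover
print(is_valid_tour(naive, 8))    # False: city 3 appears twice, city 2 is missing

child = order_crossover(p1, p2, 3, 6)
print(child)                      # [5, 7, 8, 2, 1, 6, 3, 4]
print(is_valid_tour(child, 8))    # True
```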
Mutation: local modification
Before: (1 0 1 1 0 1 1 0)
After: (0 1 1 0 0 1 1 0)
Before: (1.38 -69.4 326.44 0.1)
After: (1.38 -67.5 326.44 0.1)
• Causes movement in the search space (local or global)
• Restores lost information to the population
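Two common ways to produce changes like the before/after pairs above are bit-flip mutation for bit strings and a small random perturbation for real values. The slides show only the results, so both operators, the Gaussian noise, and the parameter names below are assumptions.

```python
import random

def bit_flip_mutation(bits, rate, rng=random):
    # Flip each bit independently with probability `rate`.
    return [1 - b if rng.random() < rate else b for b in bits]

def gaussian_mutation(values, rate, sigma, rng=random):
    # Add zero-mean Gaussian noise to each gene with probability `rate`.
    return [v + rng.gauss(0, sigma) if rng.random() < rate else v
            for v in values]

print(bit_flip_mutation([1, 0, 1, 1, 0, 1, 1, 0], rate=0.1))
print(gaussian_mutation([1.38, -69.4, 326.44, 0.1], rate=0.25, sigma=2.0))
```

A small `sigma` gives local moves in the search space; a large one gives occasional long jumps.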
Mutation
Given the representation for TSPs, how could we achieve mutation?
Mutation involves reordering of the list:
Before: (5 8 7 2 1 6 3 4)
After: (5 8 6 2 1 7 3 4)
(* marks the two exchanged positions, the third and the sixth)
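The reordering in the TSP example is a swap of two positions, which keeps the chromosome a valid permutation. A minimal sketch, with an illustrative function name and optional explicit positions so the example can be reproduced:

```python
import random

def swap_mutation(tour, i=None, j=None, rng=random):
    # Exchange the cities at two positions; pick two distinct random
    # positions when none are given.
    if i is None or j is None:
        i, j = rng.sample(range(len(tour)), 2)
    mutated = list(tour)
    mutated[i], mutated[j] = mutated[j], mutated[i]
    return mutated

print(swap_mutation([5, 8, 7, 2, 1, 6, 3, 4], 2, 5))  # [5, 8, 6, 2, 1, 7, 3, 4]
```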
Note
• Both mutation and crossover are applied based on user-supplied probabilities
• We usually use a fairly high crossover rate and a fairly low mutation rate
• Why do you think this is?
Evaluation of fitness
• The evaluator decodes a chromosome and assigns it a fitness measure
• The evaluator is the only link between a classical GA and the problem it is solving
[Diagram: modified children, evaluation, evaluated children]
Fitness functions
• Evaluate the ‘goodness’ of chromosomes (how well they solve the problem)
• Critical to the success of the GA
• Often difficult to define well
• Must be fairly fast, as each chromosome must be evaluated each generation (iteration)
Fitness functions
Fitness function for the TSP?
(3 5 7 2 1 6 4 8)
As we’re minimizing the distance travelled, the fitness is the total distance travelled in the journey defined by the chromosome (so a lower value is better)
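A sketch of such a fitness function in Python. The coordinates are hypothetical (the slides give none), Euclidean distance is assumed, and the tour is assumed to return to its starting city.

```python
import math

def tour_length(tour, coords):
    # Sum the leg distances, including the return leg to the first city.
    total = 0.0
    for a, b in zip(tour, tour[1:] + tour[:1]):
        total += math.dist(coords[a], coords[b])
    return total

# Hypothetical coordinates: four cities on the corners of a unit square.
coords = {1: (0, 0), 2: (0, 1), 3: (1, 1), 4: (1, 0)}
print(tour_length([1, 2, 3, 4], coords))  # 4.0
print(tour_length([1, 3, 2, 4], coords))  # longer: uses both diagonals
```

Because lower is better here, a selection scheme written for maximisation needs the value negated or inverted first.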
Deletion
• Generational GA: entire population replaced with each iteration
• Steady-state GA: a few members replaced each generation
[Diagram: population, discard, deleted members]
Stopping!
The GA cycle continues until
• The system has ‘converged’; or
• A specified number of iterations (‘generations’) has been performed
An abstract example
[Figures: Distribution of Individuals in Generation 0; Distribution of Individuals in Generation N]
Good demo of the GA components: http://www.obitko.com/tutorials/genetic-algorithms/example-function-minimum.php
TSP example: 30 cities
[Plot: the 30 cities, x against y]
[Plot: TSP30 (Performance = 941)]
[Plot: TSP30 (Performance = 800)]
[Plot: TSP30 (Performance = 652)]
[Plot: TSP30 Solution (Performance = 420)]
Overview of performance
[Plot: TSP30 - Overview of Performance. Distance (best, worst, average) against generations (1000)]
Example: n-queens
Put n queens on an n × n board with no two queens on the same row, column, or diagonal
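One possible GA formulation, offered as an assumption rather than the course's own: store one queen per column as a list of row numbers, and use the number of attacking pairs as a cost to minimise (0 means solved).

```python
def attacking_pairs(rows):
    # rows[c] is the row of the queen in column c, so columns never clash.
    n = len(rows)
    clashes = 0
    for c1 in range(n):
        for c2 in range(c1 + 1, n):
            same_row = rows[c1] == rows[c2]
            same_diagonal = abs(rows[c1] - rows[c2]) == c2 - c1
            if same_row or same_diagonal:
                clashes += 1
    return clashes

print(attacking_pairs([1, 3, 0, 2]))  # 0: a solution for n = 4
print(attacking_pairs([0, 1, 2, 3]))  # 6: all queens on one diagonal
```

If the chromosome is kept as a permutation of 0..n-1, row clashes cannot occur either, and permutation-preserving operators such as swap mutation apply.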
Examples
• Eaters
  • http://math.hws.edu/xJava/GA/
• TSP
  • http://www.heatonresearch.com/articles/65/page1.html
  • http://www.ads.tuwien.ac.at/raidl/tspga/TSPGA.html
Exercise: The Card Problem
You have 10 cards numbered from 1 to 10. You have to choose a way of dividing them into 2 piles, so that the cards in Pile0 *sum* to a number as close as possible to 36, and the remaining cards in Pile1 *multiply* to a number as close as possible to 360
Encoding
Each card can be in Pile0 or Pile1, so there are 1024 possible ways of sorting them into 2 piles, and you have to find the best. Think of a sensible way of encoding any possible solution.
Fitness
Some of these chromosomes will be closer to the target than others. Think of a sensible way of evaluating any chromosome and scoring it with a fitness measure.
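One possible answer, sketched under assumptions and not the official solution to the exercise: use a 10-bit chromosome where bit i says which pile card i + 1 goes in, and score it by the combined relative error against the two targets (lower is better). The names `decode` and `error` and the choice to add the two relative errors are illustrative.

```python
from math import prod

def decode(bits):
    # Bit i is the pile (0 or 1) of card i + 1.
    pile0 = [card for card, b in enumerate(bits, start=1) if b == 0]
    pile1 = [card for card, b in enumerate(bits, start=1) if b == 1]
    return pile0, pile1

def error(bits):
    pile0, pile1 = decode(bits)
    # Relative errors keep the sum and product targets on comparable scales.
    sum_error = abs(sum(pile0) - 36) / 36
    prod_error = abs(prod(pile1) - 360) / 360
    return sum_error + prod_error

chromosome = [0, 1, 0, 1, 0, 1, 0, 1, 0, 1]
print(decode(chromosome), error(chromosome))
```

Using absolute errors instead would let the product term swamp the sum term, since products of several cards can be far from 360.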
Issues for GA practitioners
• Choosing basic implementation issues:
  • Representation
  • Population size, mutation rate, ...
  • Selection, deletion policies
  • Crossover, mutation operators
• Termination criteria
• Performance, scalability
• Solution is only as good as the fitness function (often hardest part)
Your assignment will be to code a GA for a given task! Be aware of the above issues…
Benefits of GAs
• Concept is easy to understand
• Supports multi-objective optimization
• Good for “noisy” environments
• Always an answer; answer gets better with time
• Inherently parallel; easily distributed