More on coevolution and learning Jing Xiao April, 2008.


Coevolution

Evaluation of individuals is based on interactions with other evolving individuals.

A potential for open-ended evolution can arise.

Ranking of individuals is dependent on the co-evolving individuals.

Inspired by biological coevolution: reciprocal evolutionary change in interacting species.

Competitive Coevolution

Competitive coevolution: individuals compete (e.g., evolution of game players).

Test-based problem: the quality of a candidate solution is determined by its performance on a set of tests.

Typically, two types of individuals are involved: candidate solutions (learners) and tests (e.g., predators and prey).

Tests can also be evolved.
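
A minimal sketch of test-based evaluation, assuming a hypothetical interaction function `passes(learner, test)`. Here a learner is scored by the number of tests it passes and a test by the number of learners it defeats; this scoring scheme and the toy integer encoding are illustrative assumptions, not taken from the slides.

```python
def evaluate(learners, tests, passes):
    """Score learners and tests against each other.

    passes(learner, test) -> bool is a hypothetical interaction function.
    """
    # A learner's score: how many tests it passes.
    learner_scores = [sum(passes(l, t) for t in tests) for l in learners]
    # A test's score: how many learners fail it.
    test_scores = [sum(not passes(l, t) for l in learners) for t in tests]
    return learner_scores, test_scores


# Toy interaction (assumed for illustration): a learner passes a test
# when its number is at least as large as the test's number.
learners = [1, 4, 7]
tests = [2, 5]
learner_scores, test_scores = evaluate(learners, tests, lambda l, t: l >= t)
print(learner_scores)  # [0, 1, 2]
print(test_scores)     # [1, 2]
```

Because both scores depend on the other population, the ranking of a learner changes whenever the set of tests changes.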

Pareto-Coevolution

Test-based Problems as Multi-objective Problems

Each test represents an objective.

Solutions are non-dominated learners.

Complete Evaluation Set (De Jong04)

A test that makes a distinction between two learners belongs to CES

Ideal evaluation is based on CES
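
A small sketch of the multi-objective view, assuming a binary outcome matrix in which `outcomes[i][t]` is 1 when learner `i` passes test `t`. The helper names are hypothetical, and "makes a distinction" is simplified here to "at least two learners get different outcomes on the test".

```python
def dominates(a, b):
    """True if outcome vector a is at least as good as b on every test
    and strictly better on at least one."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))


def non_dominated(outcomes):
    """Indices of learners that no other learner dominates."""
    return [i for i, a in enumerate(outcomes)
            if not any(dominates(b, a)
                       for j, b in enumerate(outcomes) if j != i)]


def distinguishing_tests(outcomes):
    """Indices of tests on which at least two learners differ."""
    n_tests = len(outcomes[0])
    return [t for t in range(n_tests)
            if len({row[t] for row in outcomes}) > 1]


outcomes = [
    [1, 0, 1],  # learner 0
    [0, 1, 1],  # learner 1
    [0, 0, 1],  # learner 2
]
print(non_dominated(outcomes))         # [0, 1]
print(distinguishing_tests(outcomes))  # [0, 1]
```

Test 2 is passed by every learner, so it makes no distinction and does not affect which learners are non-dominated.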

Cooperative or Compositional Coevolution

Solution requires coordinated combination of all types of components, where each component is a type of individual.

A static fitness function is used, as in standard EAs.

An example of this type of problem is to find components (e.g. partial neural networks) that together form an adequate whole.

Another example is the identification of combinations of strategies that work well together (such as a team of robots)

Competitive (test-based) vs. Cooperative (compositional)

A lot of debate on the distinctions between the two

Some consider the classification subjective (because the underlying algorithms can be similar)

Some problems may involve both elements (e.g. a team sport)

Substance is more important than the classification

Example 1

Potter, M. and De Jong, K. (1994). A Cooperative Coevolutionary Approach to Function Optimization.

Characterized by the following:

Initialize a separate population (species) for each numerical function variable.

A complete solution is composed of a representative from each species.

Fitness of an individual in a species is determined from the function value of a complete solution (or the values of several solutions) involving this individual and individuals of other species, i.e., in terms of how well an individual “collaborates” with other species.

Each species is evolved by a GA (or any evolutionary algorithm) using the best values of the other species.
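
The steps above can be sketched as follows. This is a simplified illustration, not the algorithm from the paper: the GA is replaced by Gaussian mutation with truncation selection, and the objective `sphere`, the bounds, and all parameter values are assumptions.

```python
import random


def sphere(x):
    """Toy objective to minimize (assumed; not one of the paper's functions)."""
    return sum(v * v for v in x)


def cooperative_coevolution(f, n_vars, pop_size=10, generations=30, seed=0):
    rng = random.Random(seed)
    # One subpopulation (species) per function variable.
    species = [[rng.uniform(-5, 5) for _ in range(pop_size)]
               for _ in range(n_vars)]
    # Representatives: start with the first member of each species.
    best = [pop[0] for pop in species]
    initial_value = f(best)

    for _ in range(generations):
        for i in range(n_vars):
            def fitness(v, i=i):
                # Function value of the complete solution formed with the
                # current best representatives of the other species.
                return f(best[:i] + [v] + best[i + 1:])

            # Variation: Gaussian mutation of every member of species i.
            children = [v + rng.gauss(0, 0.5) for v in species[i]]
            # Truncation selection over parents and children (minimization).
            ranked = sorted(species[i] + children, key=fitness)
            species[i] = ranked[:pop_size]
            # The old representative is still in the pool, so the new
            # representative is never worse than the old one.
            best[i] = ranked[0]
    return best, initial_value, f(best)


best, initial_value, final_value = cooperative_coevolution(sphere, n_vars=3)
print(initial_value, final_value)
```

Each variable is optimized with the others held at their current best values, which is why strong interdependency between variables can reduce the benefit of this decomposition.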

Example 1 (cont’d)

The performance of this coevolution approach was compared to that of a standard GA on four multimodal functions.

In all cases the coevolution approach significantly outperformed the standard GA both in minimum value found and in the speed of convergence.

For one of the functions, the advantage of the coevolution approach was smaller because of the interdependency of variables.

Example 2

Potter, M., De Jong, K. and Grefenstette, J. (1995). A Coevolutionary Approach to Learning Sequential Decision Rules.

Evolve complex behaviors of simulated robots.

Example 2 (cont’d)

Task environment:

A robot (called “friendly robot”) must maneuver itself around an obstacle-free room where food pellets appear at random locations and times.

The robot must consume these food pellets to replace lost energy and stay alive. Energy loss is a function of the speed and turning rate of the robot. There is a small energy drain even if the robot is stationary.

Another robot (“enemy robot”) is controlled by a fixed set of hand-crafted rules to compete with the friendly robot for food pellets.

Example 2 (cont’d)

Each robot has 10 sensors and 2 effectors (for speed and turning rate).

The sensors consist of

a food pellet detector,

a clock indicating the amount of time passed since the appearance of the last food pellet,

the robot’s current energy level, heading, and speed,

the range and bearing to the center of the room,

the range and bearing to the other robot,

the range and bearing to the food pellet if it is present

Example 2 (cont’d)

The goal is to evolve a set of behaviors (rules) to enable the friendly robot to survive indefinitely.

Fitness of a rule set is the average energy level of the robot over a large number of training episodes for food gathering

Example 2 (cont’d)

The hand-crafted rules for the enemy robot make it move with a random speed in a slightly wandering fashion towards food pellets when they appear and towards the room center otherwise.

Example 2 (cont’d)

Two subsets of rules are considered for the friendly robot:

Rules for the situation when food is present

Rules for the situation when food is absent

Initial subpopulations are seeded by dividing initial rules corresponding to a standard non-coevolutionary case (called the Samuel system) into the corresponding subsets.

Example 2 Results

Example 3

E. Uchibe and M. Asada (2006). Incremental coevolution with competitive and cooperative tasks in a multi-robot environment. Proceedings of the IEEE, Vol. 94, Issue 7, pages 1412-1424.

Evolutionary Learning

Two major approaches:

Michigan approach: each individual represents a single rule. The whole population represents the complete system.

Pitt approach: each individual represents a complete system

Credit assignment is difficult in the Michigan approach

Evolving Rule-Based Systems

Encoding rules: many different ways

Example: given a set of reactive rules:

IF c11, c12, …, c1n THEN a1;

IF c21, c22, …, c2n THEN a2;

…

Encode them in a single chromosome:

c11 c12 … c1n a1 c21 c22 … c2n a2 …

Order is not essential (set-based)
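
One possible way to realize this encoding, assuming every rule has the same number of conditions and exactly one action. The function names `encode` and `decode` and the string placeholders are hypothetical.

```python
def encode(rules):
    """Flatten [(conditions, action), ...] into one list (the chromosome)."""
    chromosome = []
    for conditions, action in rules:
        chromosome.extend(conditions)
        chromosome.append(action)
    return chromosome


def decode(chromosome, n_conditions):
    """Split a chromosome back into rules of n_conditions conditions
    followed by one action."""
    step = n_conditions + 1
    return [(chromosome[i:i + n_conditions], chromosome[i + n_conditions])
            for i in range(0, len(chromosome), step)]


rules = [(["c11", "c12", "c13"], "a1"), (["c21", "c22", "c23"], "a2")]
chromosome = encode(rules)
print(chromosome)
# ['c11', 'c12', 'c13', 'a1', 'c21', 'c22', 'c23', 'a2']
print(decode(chromosome, n_conditions=3) == rules)  # True
```

Because the rule set is unordered, two chromosomes that list the same rules in a different order describe the same system.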

Evolving Rule-Based Systems (cont’d)

Recombination:

Two-parent recombination: randomly select two subsets of rules from two parents and swap them

Global discrete recombination: the rules in an offspring are selected randomly from all individuals

Mutation:

Swap subsets of conditions

Change subsets of conditions or actions
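
A hedged sketch of these operators, assuming each individual is a list of `(conditions, action)` tuples. The function names, the parameter `k`, and the choice to mutate only one action are illustrative assumptions; other variants fit the descriptions above equally well.

```python
import random


def two_parent_recombination(p1, p2, k, rng):
    """Swap k randomly chosen rules between copies of two parents."""
    c1, c2 = list(p1), list(p2)
    idx1 = rng.sample(range(len(c1)), k)
    idx2 = rng.sample(range(len(c2)), k)
    for i, j in zip(idx1, idx2):
        c1[i], c2[j] = c2[j], c1[i]
    return c1, c2


def global_discrete_recombination(population, n_rules, rng):
    """Build one offspring, drawing each rule from a random individual."""
    return [rng.choice(rng.choice(population)) for _ in range(n_rules)]


def mutate_action(rule_set, actions, rng):
    """Replace the action of one randomly chosen rule."""
    mutant = list(rule_set)
    i = rng.randrange(len(mutant))
    conditions, _ = mutant[i]
    mutant[i] = (conditions, rng.choice(actions))
    return mutant


rng = random.Random(0)
p1 = [(("c11", "c12"), "a1"), (("c21", "c22"), "a2")]
p2 = [(("d11", "d12"), "b1"), (("d21", "d22"), "b2")]
c1, c2 = two_parent_recombination(p1, p2, k=1, rng=rng)
child = global_discrete_recombination([p1, p2], n_rules=2, rng=rng)
mutant = mutate_action(p1, actions=["a1", "a2", "a3"], rng=rng)
```

The swap keeps the combined pool of rules unchanged, while global discrete recombination can duplicate a rule or drop it from the offspring.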

Evolving Rule-Based Systems (cont’d)

Fitness evaluation:

Based on training error

Based on training error plus complexity

Based on a separate validation set
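
The three options can be illustrated with a minimal sketch. It assumes a rule system is exposed as a hypothetical `predict(x)` function, measures complexity by the number of rules, and uses an arbitrary penalty `weight`; none of these choices come from the slides.

```python
def training_error(predict, examples):
    """Fraction of (input, target) examples the rule system gets wrong."""
    wrong = sum(predict(x) != y for x, y in examples)
    return wrong / len(examples)


def penalized_fitness(predict, n_rules, examples, weight=0.01):
    """Error plus a complexity penalty (lower is better).

    weight is an assumed constant trading accuracy against rule count.
    """
    return training_error(predict, examples) + weight * n_rules


# Toy rule system (assumed): predicts True for positive inputs.
predict = lambda x: x > 0
train = [(1, True), (-1, False), (2, False), (3, True)]
validation = [(5, True), (-2, False)]

print(training_error(predict, train))       # 0.25
print(penalized_fitness(predict, 5, train))
print(training_error(predict, validation))  # 0.0
```

Evaluating on a separate validation set uses the same error measure, applied to examples that were not used during evolution.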