More on coevolution and learning Jing Xiao April, 2008.

Coevolution

Evaluation of individuals is based on interactions with other evolving individuals

A potential for open-ended evolution can arise

Ranking of individuals is dependent on the co-evolving individuals

Inspired by biological coevolution: reciprocal evolutionary change in interacting species

Competitive Coevolution

Competitive coevolution: individuals compete (e.g., evolution of game players).

Test-based problem: the quality of a candidate solution is determined by its performance on a set of tests.

Typically, two types of individuals are involved: candidate solutions (learners) and tests (e.g., predators and prey).

Tests can also be evolved.

Pareto-Coevolution

Test-based Problems as Multi-objective Problems

Each test represents an objective.

Solutions are non-dominated learners.

Complete Evaluation Set (CES) (De Jong04)

A test that makes a distinction between two learners belongs to the CES

Ideal evaluation is based on the CES
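
The multi-objective view can be illustrated with a minimal sketch. It assumes an outcome matrix in which `outcomes[i][t]` is the score of learner `i` on test `t` (higher is better). The function names and the data are hypothetical, and `distinguishing_tests` is only a simplified reading of the "makes a distinction between two learners" criterion, not De Jong's construction.

```python
from typing import List, Sequence

def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is at least as good as b on every test and strictly better on one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def non_dominated(outcomes: List[Sequence[float]]) -> List[int]:
    """Indices of learners that no other learner dominates."""
    return [
        i for i, a in enumerate(outcomes)
        if not any(dominates(b, a) for j, b in enumerate(outcomes) if j != i)
    ]

def distinguishing_tests(outcomes: List[Sequence[float]]) -> List[int]:
    """Indices of tests on which at least two learners obtain different outcomes."""
    n_tests = len(outcomes[0])
    return [t for t in range(n_tests) if len({row[t] for row in outcomes}) > 1]

# Hypothetical outcomes: 3 learners x 3 tests, 1 = the learner passes the test.
outcomes = [
    [1, 0, 1],
    [0, 1, 1],
    [0, 0, 1],
]
front = non_dominated(outcomes)               # learners kept as solutions
informative = distinguishing_tests(outcomes)  # tests that separate learners
```

In this toy data the third test is passed by every learner, so it makes no distinction and would not contribute to the evaluation under this reading.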

Cooperative or Compositional Coevolution

A solution requires a coordinated combination of all types of components, where each component is a type of individual.

A static fitness function is used, as in standard EAs.

An example of this type of problem is to find components (e.g. partial neural networks) that together form an adequate whole.

Another example is the identification of combinations of strategies that work well together (such as a team of robots)

Competitive (test-based) vs. Cooperative (compositional)

A lot of debate on the distinctions between the two

Some consider the classification subjective (because the underlying algorithms can be similar)

Some problems may involve both elements (e.g. a team sport)

Substance is more important than the classification

Example 1

Potter, M. and De Jong, K. (1994). A Cooperative Coevolutionary Approach to Function Optimization.

Characterized by the following:

Initialize a separate population (species) for each numerical function variable

A complete solution is composed of a representative from each species

Fitness of an individual in a species is determined from the function value of a complete solution (or the values of several solutions) involving this individual and individuals of other species, i.e., in terms of how well an individual “collaborates” with other species

Each species is evolved by a GA (or any evolutionary algorithm) using the best values of the other species
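
The scheme above can be sketched as follows. This is a minimal illustration, not the algorithm from the paper: the test function `sphere`, the truncation selection, the Gaussian mutation, and all parameter values are assumptions chosen to keep the example short.

```python
import random
from typing import Callable, List

def sphere(x: List[float]) -> float:
    """Hypothetical function to minimize (not one of the paper's test functions)."""
    return sum(v * v for v in x)

def coevolve(f: Callable[[List[float]], float], n_vars: int, pop_size: int = 20,
             generations: int = 50, sigma: float = 0.3, seed: int = 0) -> List[float]:
    rng = random.Random(seed)
    # One population (species) per function variable.
    species = [[rng.uniform(-5.0, 5.0) for _ in range(pop_size)]
               for _ in range(n_vars)]
    # Representatives: start with the first individual of each species.
    best = [pop[0] for pop in species]

    def fitness(i: int, value: float) -> float:
        # Insert the individual into the complete solution formed with the
        # current representatives of the other species.
        trial = best[:i] + [value] + best[i + 1:]
        return f(trial)

    for _ in range(generations):
        for i in range(n_vars):
            # Simplified evolutionary step (assumption): keep the better half,
            # refill with Gaussian-mutated copies of the survivors.
            ranked = sorted(species[i], key=lambda v: fitness(i, v))
            survivors = ranked[: pop_size // 2]
            children = [rng.choice(survivors) + rng.gauss(0.0, sigma)
                        for _ in range(pop_size - len(survivors))]
            species[i] = survivors + children
            # Replace the representative only if the new one is no worse.
            candidate = min(species[i], key=lambda v: fitness(i, v))
            if fitness(i, candidate) <= fitness(i, best[i]):
                best[i] = candidate
    return best

solution = coevolve(sphere, n_vars=3)
```

Because each species is evaluated only against the representatives of the others, this decomposition is most natural when the variables interact weakly; strongly coupled variables make a single representative a poor context for evaluation.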

Example 1 (cont’d)

The performance of this coevolution approach was compared to that of a standard GA on four multimodal functions.

In all cases the coevolution approach significantly outperformed the standard GA both in minimum value found and in the speed of convergence.

For one of the functions, the advantage of the coevolution approach was smaller because of interdependency among the variables.

Example 2

Potter, M., De Jong, K. and Grefenstette, J. (1995). A Coevolutionary Approach to Learning Sequential Decision Rules.

Evolve complex behaviors of simulated robots.

Example 2 (cont’d)

Task environment:

A robot (called “friendly robot”) must maneuver itself around an obstacle-free room where food pellets appear at random locations and times.

The robot must consume these food pellets to replace lost energy and stay alive. Energy loss is a function of the speed and turning rate of the robot. There is a small energy drain even if the robot is stationary.

Another robot (“enemy robot”) is controlled by a fixed set of hand-crafted rules to compete with the friendly robot for food pellets.

Example 2 (cont’d)

Each robot has 10 sensors and 2 effectors (for speed and turning rate).

The sensors consist of

a food pellet detector,

a clock indicating the amount of time passed since the appearance of the last food pellet,

the robot’s current energy level, heading, and speed,

the range and bearing to the center of the room,

the range and bearing to the other robot,

the range and bearing to the food pellet if it is present

Example 2 (cont’d)

The goal is to evolve a set of behaviors (rules) to enable the friendly robot to survive indefinitely.

Fitness of a rule set is the average energy level of the robot over a large number of training episodes for food gathering

Example 2 (cont’d)

The hand-crafted rules for the enemy robot make it move with a random speed in a slightly wandering fashion towards food pellets when they appear and towards the room center otherwise.

Example 2 (cont’d)

Two subsets of rules are considered for the friendly robot:

Rules for the situation when food is present

Rules for the situation when food is absent

Initial subpopulations are seeded by dividing initial rules corresponding to a standard non-coevolutionary case (called the Samuel system) into the corresponding subsets.

Example 2 Results

Example 3

E. Uchibe and M. Asada. (2006) Incremental coevolution with competitive and cooperative tasks in a multi-robot environment. Proceedings of the IEEE, Vol. 94, Issue 7, pages 1412--1424, 2006.

Evolutionary Learning

Two major approaches:

Michigan approach: each individual represents a single rule. The whole population represents the complete system

Pitt approach: each individual represents a complete system

Credit assignment is difficult in the Michigan approach

Evolving Rule-Based Systems

Encoding rules: many different ways

Example: given a set of reactive rules:

IF c11, c12, …, c1n THEN a1;

IF c21, c22, …, c2n THEN a2;

…

Encode them in a single chromosome:

c11 c12 … c1n a1 c21 c22 … c2n a2 …

Order is not essential (set-based)
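
The flat encoding above could look like this in code. This is a sketch under assumptions: conditions and actions are represented as plain strings, every rule has the same number of conditions, and the names `encode` and `decode` are hypothetical.

```python
from typing import List, Tuple

Rule = Tuple[Tuple[str, ...], str]  # (conditions, action)

def encode(rules: List[Rule]) -> List[str]:
    """Concatenate each rule's conditions followed by its action."""
    chromosome: List[str] = []
    for conditions, action in rules:
        chromosome.extend(conditions)
        chromosome.append(action)
    return chromosome

def decode(chromosome: List[str], n_conditions: int) -> List[Rule]:
    """Split a flat chromosome back into rules of n_conditions + 1 genes each."""
    width = n_conditions + 1
    if len(chromosome) % width != 0:
        raise ValueError("chromosome length is not a multiple of the rule width")
    return [
        (tuple(chromosome[i:i + n_conditions]), chromosome[i + n_conditions])
        for i in range(0, len(chromosome), width)
    ]

rules = [(("c11", "c12"), "a1"), (("c21", "c22"), "a2")]
chromosome = encode(rules)
```

Since the rules form a set, decoding a chromosome with the rules in a different order yields the same rule system; comparing `set(decode(...))` values reflects that.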

Evolving Rule-Based Systems (cont’d)

Recombination:

Two-parent recombination: randomly select two subsets of rules from the two parents and swap them

Global discrete recombination: the rules in an offspring are selected randomly from all individuals

Mutation:

Swap subsets of conditions

Change subsets of conditions or actions
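
A rough sketch of these operators follows, assuming a Pitt-style individual stored as a list of rules with integer-coded conditions and actions. All names and parameters are hypothetical, and `mutate` implements only the simplest "change" variant (one gene of one rule).

```python
import random
from typing import List, Tuple

Rule = Tuple[Tuple[int, ...], int]  # (conditions, action), integer-coded
RuleSet = List[Rule]

def two_parent_recombination(p1: RuleSet, p2: RuleSet, k: int,
                             rng: random.Random) -> Tuple[RuleSet, RuleSet]:
    """Swap k randomly chosen rules between copies of the two parents."""
    c1, c2 = list(p1), list(p2)
    idx1 = rng.sample(range(len(c1)), k)
    idx2 = rng.sample(range(len(c2)), k)
    for i, j in zip(idx1, idx2):
        c1[i], c2[j] = c2[j], c1[i]
    return c1, c2

def global_discrete_recombination(population: List[RuleSet], n_rules: int,
                                  rng: random.Random) -> RuleSet:
    """Build an offspring whose rules are each drawn from a random individual."""
    return [rng.choice(rng.choice(population)) for _ in range(n_rules)]

def mutate(rule_set: RuleSet, n_values: int, rng: random.Random) -> RuleSet:
    """Resample one condition or the action of one randomly chosen rule."""
    child = list(rule_set)
    r = rng.randrange(len(child))
    conditions, action = child[r]
    pos = rng.randrange(len(conditions) + 1)
    if pos < len(conditions):
        # The new value may coincide with the old one.
        conditions = conditions[:pos] + (rng.randrange(n_values),) + conditions[pos + 1:]
    else:
        action = rng.randrange(n_values)
    child[r] = (conditions, action)
    return child

rng = random.Random(0)
a = [((0, 1), 0), ((1, 1), 1), ((2, 0), 0)]
b = [((3, 3), 1), ((4, 0), 1), ((5, 5), 0)]
c1, c2 = two_parent_recombination(a, b, k=2, rng=rng)
child = global_discrete_recombination([a, b], n_rules=3, rng=rng)
mutant = mutate(a, n_values=6, rng=rng)
```

Operating on whole rules rather than on raw gene positions keeps every offspring a well-formed rule set, which fits the set-based encoding.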

Evolving Rule-Based Systems (cont’d)

Fitness evaluation:

Based on training error

Based on training error plus complexity

Based on a separate validation set
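
The three options can be compared in a small sketch. It assumes the rule system is exposed as a function from inputs to an action, that complexity is measured by the number of rules, and that the penalty weight is a free parameter; none of these choices come from the slides.

```python
from typing import Callable, List, Sequence, Tuple

Example = Tuple[Sequence[int], int]        # (inputs, expected action)
Classifier = Callable[[Sequence[int]], int]

def error_rate(classify: Classifier, data: List[Example]) -> float:
    """Fraction of examples on which the rule system gives the wrong action."""
    return sum(classify(x) != y for x, y in data) / len(data)

def penalized_fitness(classify: Classifier, n_rules: int,
                      train: List[Example], weight: float = 0.01) -> float:
    """Training error plus a complexity penalty (lower is better)."""
    return error_rate(classify, train) + weight * n_rules

# Hypothetical rule system and data.
def classify(x: Sequence[int]) -> int:
    return int(x[0] > 0)

train = [((1,), 1), ((0,), 0), ((2,), 0), ((3,), 1)]
validation = [((5,), 1), ((0,), 0)]

train_error = error_rate(classify, train)             # option 1
penalized = penalized_fitness(classify, 3, train)     # option 2
validation_error = error_rate(classify, validation)   # option 3
```

The complexity term discourages rule sets that grow only to fit the training data, while the validation option estimates how the rule system behaves on cases it was not evolved on.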
