Genetic Programming Seminar, Andrew Williams, 22nd November 2006.


Programme

1. What is genetic programming?
   • Tree-based GP
   • Linear GP (Stack GP)
2. My (TB) GP system
   • Lawnmower
   • Roshambo
3. More challenging games
   • Chess
   • Hexapawn
4. Game tree searching
5. Application to hexapawn
6. Results

What is genetic programming?

• Genetic programming is a way of building programs which can solve problems expressed in high-level terms
• GP seeks to overcome some of the limitations of genetic algorithms
  • Fixed-length representations of solutions to a problem
  • The need to encode the problem
• Genetic algorithms struggle with problems which involve an analysis of history
  • Prisoner's dilemma

What is genetic programming?

• GP starts with a random set of programs
• Each program is run against a problem set (some data)
• We record the fitness of each program
  • Its ability to solve the problem
• Once the whole population has had a turn, we recombine the members based on fitness
  • The fittest survive...

[Flowchart: the GP process]

1. Create random population
2. Run population against environment (problem set)
3. Terminate? If so: Evaluate best
4. Sort best to worst
5. Shaded section, one of:
   • Select one program, perform mutation, insert mutant into the new population
   • Select two programs, perform crossover, insert two new programs into the new population
   • Select one program, clone, insert clone into the new population
6. Repeat the shaded section until the new generation is complete
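The loop in the flowchart can be sketched roughly as follows. This is only an illustration under stated assumptions: a fixed-length bit string stands in for a program tree so that the control flow stays visible, the fitness function is a toy, and the operator probabilities (10% mutation, 80% crossover, 10% clone) and every name here are hypothetical rather than taken from the GP system described in this seminar.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <random>
#include <utility>
#include <vector>

// Stand-in for a GP program (assumption: a real individual would be a tree).
struct Individual {
    std::vector<int> genes;
    int fitness = 0;
};

// Toy fitness (hypothetical problem set): the number of 1 bits.
int evaluate(const Individual& ind) {
    return static_cast<int>(std::count(ind.genes.begin(), ind.genes.end(), 1));
}

// Select from a population sorted best-to-worst, biased towards the front:
// the smaller of two uniformly chosen indices.
std::size_t select_index(std::size_t n, std::mt19937& rng) {
    std::uniform_int_distribution<std::size_t> pick(0, n - 1);
    return std::min(pick(rng), pick(rng));
}

// Runs the generational loop; returns the best fitness in the last generation.
int run_gp(std::size_t pop_size, std::size_t genome_len, int max_generations,
           unsigned seed) {
    assert(pop_size >= 2 && genome_len >= 1 && max_generations >= 0);
    std::mt19937 rng(seed);
    std::uniform_int_distribution<int> bit(0, 1);
    std::uniform_int_distribution<std::size_t> pos(0, genome_len - 1);
    std::uniform_real_distribution<double> unit(0.0, 1.0);

    // Create random population.
    std::vector<Individual> pop(pop_size);
    for (auto& ind : pop) {
        ind.genes.resize(genome_len);
        for (auto& g : ind.genes) g = bit(rng);
    }

    for (int gen = 0;; ++gen) {
        // Run population against the environment, then sort best to worst.
        for (auto& ind : pop) ind.fitness = evaluate(ind);
        std::sort(pop.begin(), pop.end(),
                  [](const Individual& a, const Individual& b) {
                      return a.fitness > b.fitness;
                  });
        // Terminate?
        if (gen == max_generations) return pop.front().fitness;

        // Build the new generation with mutation, crossover and cloning.
        std::vector<Individual> next;
        while (next.size() < pop_size) {
            const double r = unit(rng);
            if (r < 0.1) {  // mutation: flip one gene of one selected program
                Individual child = pop[select_index(pop_size, rng)];
                const std::size_t p = pos(rng);
                child.genes[p] = 1 - child.genes[p];
                next.push_back(child);
            } else if (r < 0.9) {  // crossover: swap the tails of two programs
                Individual a = pop[select_index(pop_size, rng)];
                Individual b = pop[select_index(pop_size, rng)];
                const auto cut = static_cast<std::ptrdiff_t>(pos(rng));
                std::swap_ranges(a.genes.begin() + cut, a.genes.end(),
                                 b.genes.begin() + cut);
                next.push_back(a);
                next.push_back(b);
            } else {  // clone
                next.push_back(pop[select_index(pop_size, rng)]);
            }
        }
        next.resize(pop_size);  // crossover may overshoot by one
        pop = std::move(next);
    }
}
```

A call such as run_gp(50, 16, 30, 1u) runs 30 generations of 50 individuals. The selection scheme here is just one simple choice; the slides only say that members are recombined based on fitness.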

Tree-based GP

(AddInt (AddInt (MulInt a a) (MulInt b b)) (DivInt d c))

Result = a² + b² + d/c
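As an illustration of how such a tree produces its result, here is a minimal sketch of recursive evaluation. The class names and helpers are hypothetical and much simpler than the Node classes described later, and returning 1 on division by zero ("protected division") is an assumption borrowed from common GP practice, not something the slides state.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical minimal node interface: every node can report an int value.
struct Node {
    virtual ~Node() = default;
    virtual int value() const = 0;
};

// Terminal that reads a C++ variable through a non-owning pointer.
struct Variable : Node {
    const int* source;
    explicit Variable(const int* s) : source(s) { assert(s != nullptr); }
    int value() const override { return *source; }
};

enum class Op { Add, Mul, Div };

// Binary function node: evaluates both arguments, then combines them.
struct Function : Node {
    Op op;
    std::unique_ptr<Node> lhs, rhs;
    Function(Op o, std::unique_ptr<Node> l, std::unique_ptr<Node> r)
        : op(o), lhs(std::move(l)), rhs(std::move(r)) {}
    int value() const override {
        const int l = lhs->value();
        const int r = rhs->value();
        switch (op) {
            case Op::Add: return l + r;
            case Op::Mul: return l * r;
            case Op::Div: return r == 0 ? 1 : l / r;  // assumed protected division
        }
        return 0;  // unreachable; keeps compilers quiet
    }
};

std::unique_ptr<Node> var(const int* p) { return std::make_unique<Variable>(p); }

std::unique_ptr<Node> fn(Op op, std::unique_ptr<Node> l, std::unique_ptr<Node> r) {
    return std::make_unique<Function>(op, std::move(l), std::move(r));
}

// Builds (AddInt (AddInt (MulInt a a) (MulInt b b)) (DivInt d c)).
std::unique_ptr<Node> build_slide_tree(const int* a, const int* b,
                                       const int* c, const int* d) {
    return fn(Op::Add,
              fn(Op::Add, fn(Op::Mul, var(a), var(a)), fn(Op::Mul, var(b), var(b))),
              fn(Op::Div, var(d), var(c)));
}
```

With a = 3, b = 4, c = 2 and d = 10, calling value() on the root gives 9 + 16 + 5 = 30. Because the terminals hold pointers, changing a variable in the calling code changes the next result.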

Creating a new generation

• We apply genetic operators
• Mutation
  • Change a terminal (or change a function to another with the same arity)
• Crossover
  • Identify a subset and swap it with a subset from another program
• Truncation
  • Replace a function with a terminal

Mutation

[Figure: mutation of one function node]
Before: (AddInt (MulInt a a) (MulInt b b))
After: (AddInt (MulInt a a) (SubInt b b))

Crossover

[Figure: crossover between two parents]
Parent 1: (AddInt (MulInt a a) (MulInt b b))
Parent 2: (AddInt (AddInt j k) (DivInt x y))
A subtree is detached from each parent and swapped into the other.

Truncation

[Figure: truncation of one function node]
Before: (AddInt (MulInt a a) (MulInt b b))
After: (AddInt (MulInt a a) b)
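The three operators can be sketched on a very small tree type. This is a hedged illustration only: the Tree struct, the helper names and the idea that the caller picks the affected node are all assumptions made for brevity. A real system would choose nodes at random and check arity itself.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical tree: a name plus child subtrees (no children = terminal).
struct Tree {
    std::string name;
    std::vector<std::unique_ptr<Tree>> kids;
};
using TreePtr = std::unique_ptr<Tree>;

TreePtr leaf(std::string name) {
    auto t = std::make_unique<Tree>();
    t->name = std::move(name);
    return t;
}

TreePtr fn(std::string name, TreePtr l, TreePtr r) {
    auto t = leaf(std::move(name));
    t->kids.push_back(std::move(l));
    t->kids.push_back(std::move(r));
    return t;
}

// Prints a tree in the S-expression style used on the slides.
std::string show(const Tree& t) {
    if (t.kids.empty()) return t.name;
    std::string s = "(" + t.name;
    for (const auto& k : t.kids) s += " " + show(*k);
    return s + ")";
}

// Mutation: swap a node's name for another of the same arity.
// The caller is trusted to supply a compatible replacement.
void mutate(Tree& node, const std::string& replacement) {
    node.name = replacement;
}

// Truncation: replace child i (a function subtree) with a terminal.
void truncate(Tree& parent, std::size_t i, const std::string& terminal) {
    assert(i < parent.kids.size());
    parent.kids[i] = leaf(terminal);
}

// Crossover: swap child i of one parent with child j of the other.
void crossover(Tree& p1, std::size_t i, Tree& p2, std::size_t j) {
    assert(i < p1.kids.size() && j < p2.kids.size());
    std::swap(p1.kids[i], p2.kids[j]);
}

// Builds (AddInt (MulInt a a) (MulInt b b)).
TreePtr sum_of_squares() {
    return fn("AddInt", fn("MulInt", leaf("a"), leaf("a")),
              fn("MulInt", leaf("b"), leaf("b")));
}
```

For example, mutate(*t->kids[1], "SubInt") on the sum-of-squares tree yields (AddInt (MulInt a a) (SubInt b b)), and truncate(*t, 1, "b") yields (AddInt (MulInt a a) b).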

Linear GP

R1 = a
R1 = R1 * a
R2 = b
R2 = R2 * b
R3 = R1
R3 = R1 + R2
R4 = d
R4 = R4 / c
R5 = R3
R5 = R5 + R4

• The papers tend to call this “assembly language”
• Nowadays we'd call it a virtual machine
  • And it doesn't need to be so primitive
• Can return > 1 result
• Easy to optimize
• Easy to spot “dead” code
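A register program like the one on this slide can be run by a very small interpreter. The sketch below is an assumption-laden illustration: the instruction format, the names (Instr, Operand, run) and the choice of returning 1 on division by zero are hypothetical, chosen only to show the virtual-machine idea.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

enum class Op { Mov, Add, Mul, Div };

// An operand is either a register or one of the program inputs.
struct Operand {
    bool is_input;
    std::size_t index;
};

// dst = x op y (Mov copies x and ignores y).
struct Instr {
    Op op;
    std::size_t dst;
    Operand x;
    Operand y;
};

constexpr std::size_t kRegisters = 6;  // R0..R5
constexpr std::size_t kInputs = 4;     // a, b, c, d

using Registers = std::array<int, kRegisters>;
using Inputs = std::array<int, kInputs>;

// Executes the program top to bottom and returns every register,
// so a program can expose more than one result.
Registers run(const std::vector<Instr>& program, const Inputs& in) {
    Registers reg{};
    auto read = [&](const Operand& o) -> int {
        assert(o.index < (o.is_input ? kInputs : kRegisters));
        return o.is_input ? in[o.index] : reg[o.index];
    };
    for (const Instr& i : program) {
        assert(i.dst < kRegisters);
        const int x = read(i.x);
        const int y = (i.op == Op::Mov) ? 0 : read(i.y);
        switch (i.op) {
            case Op::Mov: reg[i.dst] = x; break;
            case Op::Add: reg[i.dst] = x + y; break;
            case Op::Mul: reg[i.dst] = x * y; break;
            case Op::Div: reg[i.dst] = (y == 0) ? 1 : x / y; break;  // assumed
        }
    }
    return reg;
}

Operand R(std::size_t n) { return {false, n}; }
Operand In(std::size_t n) { return {true, n}; }

// The ten instructions from the slide, with inputs a, b, c, d at 0..3.
std::vector<Instr> slide_program() {
    const Operand a = In(0), b = In(1), c = In(2), d = In(3);
    return {
        {Op::Mov, 1, a, a},       {Op::Mul, 1, R(1), a},
        {Op::Mov, 2, b, b},       {Op::Mul, 2, R(2), b},
        {Op::Mov, 3, R(1), R(1)}, {Op::Add, 3, R(1), R(2)},
        {Op::Mov, 4, d, d},       {Op::Div, 4, R(4), c},
        {Op::Mov, 5, R(3), R(3)}, {Op::Add, 5, R(5), R(4)},
    };
}
```

With a = 3, b = 4, c = 2 and d = 10, R5 ends up as 30 while R3 still holds a² + b² = 25. "Dead" code is easy to spot in this form: "R3 = R1" is overwritten by the next instruction before anything reads it.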

My GP System - Overview

• C++
  • Very object-oriented
• Intended to be easy to use (dual use: research + teaching AI)
  • Very inefficient
• Consists of two parts:
  1. GP classes, which are used in all applications
  2. Problem-specific classes, which belong to specific applications
• Abstract class GPApplication links the two

My GP System (GP classes)

GP classes

• A Program is made up of several Nodes
• Terminal
  • ConstantInt
  • VariableInt
    • Hooked to C++ by pointers
    • Change it in C++ and it changes in the GP program
• FunctionInt
  • Function arguments are Nodes

My GP System (GP classes)

GP classes (continued)

• Population
  • Consists of several (maybe 500-1000) Programs
  • Represents one generation
  • Can get extremely large
• A new Population (generation) is created by applying the genetic operators to the previous generation
  • This is done in a copy constructor
  • REMEMBER to delete the previous generation!

Controlling the GP process

control.h sets various parameters:

#define RANDOMPROGRAMDEPTH 4

#define INITIALMAXIMUMDEPTH 6.0f

#define MAXIMUMDEPTHMULTIPLIER 1.002f

#define MAXGENERATIONS 250

#define POPULATIONSIZE 500
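The slides do not say how MAXIMUMDEPTHMULTIPLIER is applied. One plausible reading, offered purely as an assumption, is that the permitted tree depth starts at INITIALMAXIMUMDEPTH and is multiplied by the factor once per generation. The hypothetical helper below works that arithmetic through.

```cpp
#include <cassert>
#include <cmath>

// Assumed rule (not confirmed by the slides):
//   limit(g) = INITIALMAXIMUMDEPTH * MAXIMUMDEPTHMULTIPLIER^g
double max_depth_at(int generation, double initial = 6.0,
                    double multiplier = 1.002) {
    assert(generation >= 0);
    return initial * std::pow(multiplier, generation);
}
```

Under that assumption the limit would grow slowly from 6 at generation 0 to roughly 9.9 by generation 250 (MAXGENERATIONS), so programs could get deeper only gradually.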

GPApplication

class GPApplication {
public:
    SetOfFunctions *fS;
    SetOfTerminals *tS;
    Population *generations[MAXGENERATIONS];
    virtual void add_functions()=0;
    virtual void add_terminals()=0;
    virtual int run_GP_simulation()=0;
    GPApplication();
    ~GPApplication();
};

Problem-specific classes

The list of functions provided is:

void LawnMowerGPApp::add_functions() {
    fS->add_function(new MowerLeft(NULL, lawn));
    fS->add_function(new MowerRight(NULL, lawn));
    fS->add_function(new MowerMove(NULL, lawn));
    fS->add_function(new IfBlocked(NULL, lawn));
    fS->add_function(new Seq2(NULL, lawn));
    fS->print_functions();
    printf("\n");
}

These functions operate on the Lawn class, which contains the playfield

A typical function (edited highlights)

class AddInt : public FunctionInt {
public:
    AddInt(Node *parent, Node *op0, Node *op1);
    AddInt(Node *parent);
    int calculate_result();
    Node *clone();
};

AddInt::AddInt(Node *parent, Node *op0, Node *op1) :
    FunctionInt(parent, 2, "AddInt") {
    set_argument(0, op0);
    set_argument(1, op1);
}

AddInt::AddInt(Node *parent) : FunctionInt(parent, 2, "AddInt") {}

int AddInt::calculate_result() {
    return get_argument(0)->get_value() + get_argument(1)->get_value();
}

Node *AddInt::clone() {
    AddInt *newAI = new AddInt(NULL);
    this->duplicate_function(newAI);
    return newAI;
}

Problem-specific classes

Case Study – The Lawnmower

# # # # # # # # # # # # # # #
# : : : : : : : : : : : : : #
# : : : : : : : : : : : * : #
# : : : : : : : : : : : * * #
# : : * : : : : * : : : * M #
# : * * * : : : : : : : : #
# : : * : : : : : : : * : #
# : : * * : : * : * : : : #
# : : : : : : : : : #
# : * : : : : : : : * : : * #
# : : : : : : : : : : * : : #
# : : : : : : * : : : : : * #
# : : : : : : * : * : : : : #
# : * : : : : : : : : : : : #
# : : : : : : : : : : : : : #
# # # # # # # # # # # # # # #

(Seq2 (MowerMove) (MowerLeft))

Roshambo

• The BBC recently covered the “World Championship” of Roshambo
  • It was won by a Briton!
• Has anyone here ever played?
• There's a website for serious players
  • With an agreed set of rules
  • Information about good sportsmanship
  • Everything that a real sport should have
• It's virtually impossible to tell if it's serious or not

Roshambo

• Computer roshambo is interesting
  • Two world championships were held at Computer Olympiads in the late 1990s
• A classic example of a history-based problem
  • Genetic algorithms would fail
• Why not just play (pseudo-)randomly?
  • Very hard for human players, but trivial for computers

Roshambo Functions

void RoshamboGPApp::add_functions() {
    fS->add_function(new WhatBeats(NULL));
    fS->add_function(new Random(NULL));
    fS->add_function(new RandomRP(NULL));
    fS->add_function(new RandomRS(NULL));
    fS->add_function(new RandomPS(NULL));
    fS->add_function(new PastGP(NULL, history, genePlayer->who_am_i()));
    fS->add_function(new PastOpp(NULL, history, genePlayer->who_am_i()));
    fS->add_function(new IfWonLast(NULL, history, genePlayer->who_am_i()));
    fS->add_function(new IfDrewLast(NULL, history, genePlayer->who_am_i()));
    fS->add_function(new IfLostLast(NULL, history, genePlayer->who_am_i()));
    fS->print_functions();
    printf("\n");
}

Roshambo – Thought Experiment

• Why not play randomly?
• Suppose there are three players:
  • R plays randomly
  • S plays suboptimally (predictably)
  • A plays adaptively – i.e. randomly until a pattern is detected, then plays to beat the pattern
• R vs S is a draw (more or less)
• R vs A is a draw (because A can't find a pattern, so he continues to play randomly)
• A vs S is a big win for A
• So A wins the tournament

Roshambo – Question

• Could a genetic programming system beat single-strategy Roshambo players?
• Approach:
  • Devise a set of functions for roshambo playing
    • E.g. What opponent played last
    • E.g. What beats X
    • E.g. Did I win the last game?
  • Create a roshambo player for the strategy we want to test against
  • “Breed” the population to try to beat the strategy
  • Play an extended match (50000 games) to determine success

Results

• BeatLast plays to beat the genetic player's last hand
  • Result: BL 0 - 50000 GP
  • Program: (IfDrewLast 2 (PastOpp 0))
• KeepWinning plays the same hand again if it wins, otherwise it chooses randomly between the other two hands
  • Result: KeepWinning=12608 DRAW=12398 gene=24994
  • Program: (IfLostLast (IfWonLast 2 1) (IfLostLast 1 (IfLostLast 2 (IfLostLast 1 2))))

Results

• RepeatLast – play the same as GP played last time
  • Result: RLR 0 – 50000 GP
  • Program: (WhatBeats (PastGP 0))
• Frequency – play to beat what GP plays most
  • Result: Freq 0 – 50000 GP
  • Program: (IfDrewLast 2 (PastOpp 0))
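The RepeatLast result is easy to check by hand: if the opponent copies GP's previous hand and GP plays whatever beats its own previous hand, GP wins every game. The sketch below simulates that match. It assumes that (PastGP 0) means "GP's most recent hand", uses its own hand encoding, and seeds the history with an arbitrary hand before the first game; all three are assumptions, and the names are hypothetical.

```cpp
#include <cassert>

// Encoding chosen for this sketch only (the real system's encoding may differ).
enum Hand { Rock = 0, Paper = 1, Scissors = 2 };

// Paper beats Rock, Scissors beats Paper, Rock beats Scissors.
Hand what_beats(Hand h) { return static_cast<Hand>((h + 1) % 3); }

// +1 if `mine` beats `theirs`, -1 if it loses, 0 for a draw.
int outcome(Hand mine, Hand theirs) {
    if (mine == theirs) return 0;
    return what_beats(theirs) == mine ? 1 : -1;
}

struct Score {
    int gp = 0;
    int opponent = 0;
    int draws = 0;
};

// Plays `games` rounds of RepeatLast against (WhatBeats (PastGP 0)).
// `seed_hand` stands in for GP's "previous hand" before the first game.
Score play_match(int games, Hand seed_hand) {
    assert(games >= 0);
    Score score;
    Hand gp_last = seed_hand;
    for (int i = 0; i < games; ++i) {
        const Hand opp = gp_last;             // RepeatLast
        const Hand gp = what_beats(gp_last);  // (WhatBeats (PastGP 0))
        const int r = outcome(gp, opp);
        if (r > 0) ++score.gp;
        else if (r < 0) ++score.opponent;
        else ++score.draws;
        gp_last = gp;
    }
    return score;
}
```

Under these assumptions play_match(50000, Rock) reproduces the 0 – 50000 score line, whichever hand seeds the history.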

Results

• Random – simply select one at random
  • Result: Random 16589 – 16788 GP
  • Program: 1 (i.e. scissors)

Roshambo - Problem

• The key to successful Roshambo is adaptation
  • Play randomly until you spot a pattern
• The Roshambo GP App simply learns to play against one opponent
• So how should it adapt?
  • Play randomly, checking each specialised GP against the history
  • When one seems to be making progress, switch to that one while keeping a check on the others
  • Change again if the opponent changes
• Rather tough for a GP system to work all that out!

Roshambo Problem (continued)

• What is required is a structure which drives the Roshambo GP application
  • This structure would not be GP derived
• So is this cheating?
• Certainly in the world of GP research, this is not considered cheating

More complicated games

• Chess
  • M. Sipper, Y. Azaria, A. Hauptman, and Y. Shichel, "Attaining human-competitive game playing with genetic programming," IEEE Transactions on Systems, Man and Cybernetics – Part C
  • http://www.cs.bgu.ac.il/~sipper/papabs/gpgames.pdf

A game tree for a 2-person, perfect information game

[Figure: game tree. Key: Max, Min, Leaf]

Max (root): 3
• Min: 3 (leaves 5, 9, 3)
• Min: 1 (leaves 1, 2, 11)
• Min: -6 (leaves 4, 9, -6)
• Min: -8 (leaves -2, -8, 6)

Two ways to apply GP to this problem:

• Reduce the number of lines

or

• Improve the quality of the green circles
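The values in the game tree follow from plain minimax: each Min node takes the smallest of its leaves and the Max root takes the largest of those. The sketch below rebuilds the tree from the figure and backs the values up. The GameNode type and helper names are hypothetical, and no pruning is attempted.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical game-tree node: leaves carry a static score.
struct GameNode {
    int score = 0;                   // meaningful only at leaves
    std::vector<GameNode> children;  // empty means leaf
};

GameNode leaf(int score) {
    GameNode n;
    n.score = score;
    return n;
}

GameNode inner(std::vector<GameNode> children) {
    assert(!children.empty());
    GameNode n;
    n.children = std::move(children);
    return n;
}

// Plain minimax: Max levels take the largest child value, Min levels the
// smallest, alternating on the way down.
int minimax(const GameNode& node, bool maximizing) {
    if (node.children.empty()) return node.score;
    int best = minimax(node.children.front(), !maximizing);
    for (std::size_t i = 1; i < node.children.size(); ++i) {
        const int v = minimax(node.children[i], !maximizing);
        best = maximizing ? std::max(best, v) : std::min(best, v);
    }
    return best;
}

// The tree from the figure: a Max root over four Min nodes of three leaves.
GameNode slide_tree() {
    return inner({
        inner({leaf(5), leaf(9), leaf(3)}),
        inner({leaf(1), leaf(2), leaf(11)}),
        inner({leaf(4), leaf(9), leaf(-6)}),
        inner({leaf(-2), leaf(-8), leaf(6)}),
    });
}
```

Here minimax(slide_tree(), true) returns 3. In terms of the two options above, "reducing the number of lines" corresponds to visiting fewer nodes in this recursion, and "improving the quality" corresponds to better leaf scores.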

Chess

• Sipper et al sought to improve the quality of the information in the green circles
• They used crafty, a very strong chess program
• They added a number of functions to crafty
• Then they played their new version against the original and against another program

Chess - results

• Results are slightly strange
  • They adjudicated some games ???
  • 70% of games drawn in a 150 game tournament
• Crafty is “fast and dumb”
  • They added “slow and smart” evaluation terms
• This feels odd
  • It is out of line with the “philosophy” of crafty
  • With that sort of evaluation term, it should do better
  • “Slow and smart” uses more time, so this probably hindered their results

The chess part of their paper is very hard for me to understand

Hexapawn

Used in a student assignment

Students had to improve my program for their assignment

Competition with other students

Hexapawn GP App

• Given the Sipper et al example, found it hard to think of appropriate evaluation terms
  • Didn’t want to “cheat” by adding sophisticated terms which are not available to the GP version’s opponent
• Found it extremely hard to create a problem set
• So far, not a successful implementation

Problem set solution

// First clear the board
for(sq=0; sq < SQUARES; sq++) {
    bd[sq] = EMPTY;
}

// Must be at least one!
sq = rand() % (SIZE*2);
bd[sq] = WPAWN;
bd[SQUARES-1-sq] = BPAWN;

// Now put on the WPAWN pieces
for(sq=0; sq < SIZE*2; sq++) {
    if((rand() % 256) > 64) {
        bd[sq] = WPAWN;
        bd[SQUARES-1-sq] = BPAWN;
    }
}

Probably the only decent thing about this application!

Hexapawn results

• Result: GP 25 – 5 Opponent
• Program: (DivInt (AddInt (DivInt (SubInt (SubInt (SubInt 1 (DistBonus)) (DivInt (DivInt (DistBonus) 1) 1)) (DistBonus)) (DistBonus)) (AddInt (DivInt 1 (AddInt (DistBonus) (SubInt 1 (DistBonus)))) (AddInt (DistBonus) (DistBonus)))) (SubInt (DivInt 1 (MulInt (AddInt 1 (DistBonus)) 1)) (DistBonus)))
• There are lots of problems here
  • This took > 21 hours on a 7x7 board with a depth 2 search
  • Not enough games in the grand match at the end
  • The winning program is obvious nonsense

Hexapawn results (continued)

• One interesting thing did emerge – one of the key terms in the evaluation that I have used for more than two years (DistBonus(), which rewards pawns which move forward) turns out to be unhelpful
  • At least for shallow searches
  • I doubt if I would have discovered this independently

Future work

• Work to date has been exploratory
• Now focusing in on two things:
  • Using genetic programming for search control (reducing the number of lines)
  • A more detailed analysis of hexapawn evaluation
• (Perhaps first) Improvements in the GP system’s efficiency
  • Especially reducing duplicate chromosomes

Any Questions?
