Genetic Programming Seminar, Andrew Williams, 22nd November 2006



Genetic Programming Seminar

Andrew Williams

22nd November 2006


Programme

1. What is genetic programming?
   • Tree-based GP
   • Linear GP (Stack GP)
2. My (TB) GP system
   • Lawnmower
   • Roshambo
3. More challenging games
   • Chess
   • Hexapawn
4. Game tree searching
5. Application to hexapawn
6. Results


What is genetic programming?

• Genetic programming is a way of building programs which can solve problems expressed in high-level terms
• GP seeks to overcome some of the limitations of genetic algorithms
  • Fixed-length representations of solutions to a problem
  • The need to encode the problem
• Genetic algorithms struggle with problems which involve an analysis of history
  • Prisoner's dilemma


What is genetic programming?

• GP starts with a random set of programs
• Each program is run against a problem set (some data)
• We record the fitness of each program
  • Its ability to solve the problem
• Once the whole population has had a turn, we recombine the members based on fitness
  • The fittest survive...


[Flowchart: the GP cycle]

1. Create random population
2. Run population against environment (problem set)
3. Terminate? If so, evaluate best
4. Otherwise, sort best to worst
5. Build the new population (the shaded section), using one of:
   • Select one program, perform mutation, insert mutant into the new population
   • Select two programs, perform crossover, insert two new programs into the new population
   • Select one program, clone, insert clone into the new population
6. Repeat the shaded section until the new generation is complete, then run the new population against the environment again
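The cycle in the flowchart can be sketched as a generic loop. This is only an illustration, not the seminar's code: `GpHooks`, `evolve`, the best-of-two selection rule and the 10% / 80% / 10% split between mutation, crossover and cloning are all assumptions made for the sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <random>
#include <utility>
#include <vector>

// Hypothetical problem hooks; none of these names come from the slides.
template <typename Program>
struct GpHooks {
    std::function<Program(std::mt19937&)> random_program;
    std::function<double(const Program&)> fitness;  // higher is better
    std::function<Program(const Program&, std::mt19937&)> mutate;
    std::function<std::pair<Program, Program>(const Program&, const Program&,
                                              std::mt19937&)> crossover;
};

template <typename Program>
Program evolve(const GpHooks<Program>& hooks, std::size_t pop_size,
               int max_generations, double target_fitness, std::mt19937& rng) {
    assert(pop_size >= 2);

    // Create random population.
    std::vector<Program> population;
    for (std::size_t i = 0; i < pop_size; ++i)
        population.push_back(hooks.random_program(rng));

    std::uniform_int_distribution<std::size_t> index(0, pop_size - 1);
    std::uniform_int_distribution<int> percent(0, 99);

    for (int generation = 0;; ++generation) {
        // Run population against the environment, then sort best to worst.
        std::vector<std::pair<double, Program>> ranked;
        for (const Program& p : population)
            ranked.emplace_back(hooks.fitness(p), p);
        std::sort(ranked.begin(), ranked.end(),
                  [](const std::pair<double, Program>& a,
                     const std::pair<double, Program>& b) {
                      return a.first > b.first;
                  });

        // Terminate? If so, hand back the best program found.
        if (ranked.front().first >= target_fitness ||
            generation >= max_generations)
            return ranked.front().second;

        // Assumed selection rule: the better of two random picks.
        auto select = [&]() -> const Program& {
            return ranked[std::min(index(rng), index(rng))].second;
        };

        // The "shaded section": fill the new population.
        std::vector<Program> next;
        while (next.size() < pop_size) {
            const int roll = percent(rng);
            if (roll < 10) {
                next.push_back(hooks.mutate(select(), rng));
            } else if (roll < 90 && next.size() + 2 <= pop_size) {
                auto children = hooks.crossover(select(), select(), rng);
                next.push_back(std::move(children.first));
                next.push_back(std::move(children.second));
            } else {
                next.push_back(select());  // clone (also used for a last odd slot)
            }
        }
        population = std::move(next);
    }
}
```

A caller supplies the four hooks for its own program type and calls `evolve` with a seeded `std::mt19937`; replacing `population` each generation also releases the previous one.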


Tree based GP

(AddInt (AddInt (MulInt a a) (MulInt b b)) (DivInt d c))

Result = a^2 + b^2 + d/c
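To make the tree concrete, here is a minimal sketch of how such a tree could be evaluated recursively. The types `Expr`, `Leaf` and `Binary` are hypothetical stand-ins and are much simpler than the seminar's own classes; returning 0 on division by zero is an assumed convention, not something the slides state.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical minimal node types, for illustration only.
struct Expr {
    virtual ~Expr() = default;
    virtual int value() const = 0;
};
using ExprPtr = std::unique_ptr<Expr>;

struct Leaf : Expr {
    int v;
    explicit Leaf(int v_) : v(v_) {}
    int value() const override { return v; }
};

struct Binary : Expr {
    char op;
    ExprPtr lhs, rhs;
    Binary(char op_, ExprPtr l, ExprPtr r)
        : op(op_), lhs(std::move(l)), rhs(std::move(r)) {}
    int value() const override {
        const int l = lhs->value();
        const int r = rhs->value();
        switch (op) {
            case '+': return l + r;
            case '*': return l * r;
            case '/': return r == 0 ? 0 : l / r;  // assumed: x / 0 yields 0
            default:  assert(false && "unknown operator"); return 0;
        }
    }
};

ExprPtr leaf(int v) { return std::make_unique<Leaf>(v); }
ExprPtr node(char op, ExprPtr l, ExprPtr r) {
    return std::make_unique<Binary>(op, std::move(l), std::move(r));
}

// Builds the tree from the slide and evaluates it: a*a + b*b + d/c.
int slide_tree(int a, int b, int c, int d) {
    ExprPtr root = node('+',
                        node('+', node('*', leaf(a), leaf(a)),
                                  node('*', leaf(b), leaf(b))),
                        node('/', leaf(d), leaf(c)));
    return root->value();
}
```

With a = 3, b = 4, c = 2, d = 10 the tree evaluates to 9 + 16 + 5 = 30.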


Creating a new generation

• We apply genetic operators
  • Mutation
    • Change a terminal (or change a function to another with the same arity)
  • Crossover
    • Identify a subset and swap it with a subset from another program
  • Truncation
    • Replace a function with a terminal


Mutation

[Figure: a function node is mutated]

Before: (AddInt (MulInt a a) (MulInt b b))
After:  (AddInt (MulInt a a) (SubInt b b))


Crossover

[Figure: two parents exchange subtrees]

Parents:
  (AddInt (MulInt a a) (MulInt b b))
  (AddInt (AddInt j k) (DivInt x y))

Subtrees exchanged: (MulInt a a) and (AddInt j k)

Offspring:
  (AddInt (AddInt j k) (MulInt b b))
  (AddInt (MulInt a a) (DivInt x y))


Truncation

[Figure: a function subtree is replaced by a terminal]

Before: (AddInt (MulInt a a) (MulInt b b))
After:  (AddInt (MulInt a a) b)
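The three operators (mutation, crossover, truncation) can be sketched on a deliberately simple tree. Everything here is hypothetical: `Tree`, `Path` and the `*_at` helpers are illustration-only names, the node to change is chosen by the caller rather than at random, and checking that a replacement function has the same arity is left to the caller. The member `std::vector<Tree>` relies on C++17's support for vectors of incomplete types.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical label-only tree; a terminal has no children.
struct Tree {
    std::string label;
    std::vector<Tree> kids;
};

using Path = std::vector<std::size_t>;  // child indices, starting at the root

Tree& subtree_at(Tree& root, const Path& path) {
    Tree* cur = &root;
    for (std::size_t i : path) {
        assert(i < cur->kids.size());
        cur = &cur->kids[i];
    }
    return *cur;
}

// Mutation: relabel a terminal, or swap a function for another one.
// The arguments are left untouched, so the arity must match.
void mutate_at(Tree& root, const Path& path, const std::string& new_label) {
    subtree_at(root, path).label = new_label;
}

// Crossover: exchange one subtree of x with one subtree of y.
// x and y must be two distinct trees.
void crossover_at(Tree& x, const Path& px, Tree& y, const Path& py) {
    std::swap(subtree_at(x, px), subtree_at(y, py));
}

// Truncation: replace a whole function subtree with a single terminal.
void truncate_at(Tree& root, const Path& path, const std::string& terminal) {
    subtree_at(root, path) = Tree{terminal, {}};
}

// Prints in the same prefix notation the slides use for programs.
std::string to_sexpr(const Tree& t) {
    if (t.kids.empty()) return t.label;
    std::string s = "(" + t.label;
    for (const Tree& k : t.kids) s += " " + to_sexpr(k);
    return s + ")";
}
```

Applied to `(AddInt (MulInt a a) (MulInt b b))`, `mutate_at(t, {1}, "SubInt")` and `truncate_at(t, {1}, "b")` reproduce the mutation and truncation figures, and `crossover_at(x, {0}, y, {0})` reproduces the crossover figure.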


Linear GP

R1 = a
R1 = R1 * a
R2 = b
R2 = R2 * b
R3 = R1
R3 = R1 + R2
R4 = d
R4 = R4 / c
R5 = R3
R5 = R5 + R4

• The papers tend to call this “assembly language”
• Nowadays we'd call it a virtual machine
• And it doesn't need to be so primitive
• Can return > 1 result
• Easy to optimize
• Easy to spot “dead” code
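As a rough illustration of the "virtual machine" view, the register program on this slide can be run by a tiny interpreter. The instruction format, the register layout (inputs held in extra read-only registers) and the rule that division by zero yields 0 are all assumptions of this sketch, not details from the seminar.

```cpp
#include <array>
#include <cassert>
#include <vector>

enum class Op { Mov, Add, Mul, Div };

// dst = x (Mov), otherwise dst = x op y. All operands are register indices.
struct Instr {
    Op op;
    int dst;
    int x;
    int y;  // ignored by Mov
};

// Assumed layout: R1..R5 are scratch registers, InA..InD hold the inputs.
enum Reg { R1, R2, R3, R4, R5, InA, InB, InC, InD, RegCount };
using Registers = std::array<int, RegCount>;

Registers run(const std::vector<Instr>& program, Registers regs) {
    for (const Instr& in : program) {
        const int x = regs[in.x];
        const int y = regs[in.y];
        switch (in.op) {
            case Op::Mov: regs[in.dst] = x; break;
            case Op::Add: regs[in.dst] = x + y; break;
            case Op::Mul: regs[in.dst] = x * y; break;
            case Op::Div: regs[in.dst] = (y == 0) ? 0 : x / y; break;
        }
    }
    return regs;
}

// The ten instructions from the slide, in order; the result is left in R5.
int slide_program(int a, int b, int c, int d) {
    const std::vector<Instr> program = {
        {Op::Mov, R1, InA, InA}, {Op::Mul, R1, R1, InA},
        {Op::Mov, R2, InB, InB}, {Op::Mul, R2, R2, InB},
        {Op::Mov, R3, R1, R1},   {Op::Add, R3, R1, R2},
        {Op::Mov, R4, InD, InD}, {Op::Div, R4, R4, InC},
        {Op::Mov, R5, R3, R3},   {Op::Add, R5, R5, R4},
    };
    Registers regs{};
    regs[InA] = a;
    regs[InB] = b;
    regs[InC] = c;
    regs[InD] = d;
    return run(program, regs)[R5];
}
```

Because `run` returns the whole register file, a caller could read more than one register afterwards, which is one way to get more than one result out of a linear program.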


My GP System - Overview

• C++
• Very object-oriented
• Intended to be easy to use (Dual use: research + teaching AI)
• Very inefficient
• Consists of two parts:
  1. GP classes, which are used in all applications
  2. Problem-specific classes which belong to specific applications
• Abstract class GPApplication links the two


My GP System (GP classes)

• GP Classes
  • A Program is made up of several Nodes
    • Terminal
      • ConstantInt
      • VariableInt
        • Hooked to C++ by pointers
        • Change it in C++ and it changes in the GP program
    • FunctionInt
      • Function arguments are Nodes
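One way to picture "hooked to C++ by pointers" is a terminal that stores the address of an ordinary C++ variable and reads it on every evaluation. The class names below echo the slide, but the bodies, the `get_value` signature and the `sketch` namespace are guesses made for illustration.

```cpp
#include <cassert>

namespace sketch {  // hypothetical bodies; only the class names echo the slide

struct Terminal {
    virtual ~Terminal() = default;
    virtual int get_value() const = 0;
};

// A terminal whose value is fixed when it is created.
class ConstantInt : public Terminal {
public:
    explicit ConstantInt(int value) : value_(value) {}
    int get_value() const override { return value_; }
private:
    int value_;
};

// A terminal that reads a C++ variable through a pointer each time it is
// evaluated. The variable must outlive the terminal.
class VariableInt : public Terminal {
public:
    explicit VariableInt(const int* source) : source_(source) {
        assert(source != nullptr);
    }
    int get_value() const override { return *source_; }
private:
    const int* source_;
};

}  // namespace sketch
```

If the host program changes the variable between two evaluations, the second evaluation sees the new value without the tree being rebuilt.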


My GP System (GP classes)

• GP Classes (continued)
  • Population
    • Consists of several (maybe 500-1000) Programs
    • Represents one generation
    • Can get extremely large
  • A new Population (generation) is created by applying the genetic operators to the previous generation
    • This is done in a copy constructor
    • REMEMBER to delete the previous generation!


Controlling the GP process

control.h sets various parameters:

#define RANDOMPROGRAMDEPTH 4

#define INITIALMAXIMUMDEPTH 6.0f

#define MAXIMUMDEPTHMULTIPLIER 1.002f

#define MAXGENERATIONS 250

#define POPULATIONSIZE 500
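The slide does not say how the two depth parameters interact. One plausible reading, and it is only a guess, is that the depth limit starts at INITIALMAXIMUMDEPTH and is multiplied by MAXIMUMDEPTHMULTIPLIER once per generation. Under that assumption the limit grows slowly, as this sketch shows; `max_depth_for_generation` and the `k...` constants are hypothetical names.

```cpp
#include <cassert>

// Values copied from control.h on the slide.
constexpr float kInitialMaximumDepth = 6.0f;
constexpr float kMaximumDepthMultiplier = 1.002f;

// Assumed rule: the limit is multiplied once per generation and truncated
// to a whole number of tree levels when it is applied.
int max_depth_for_generation(int generation) {
    assert(generation >= 0);
    float limit = kInitialMaximumDepth;
    for (int g = 0; g < generation; ++g) limit *= kMaximumDepthMultiplier;
    return static_cast<int>(limit);
}
```

Under this reading the limit is 6 at generation 0, reaches 7 by generation 100, and is 9 at generation 250 (6 × 1.002^250 ≈ 9.89).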


GPApplication

class GPApplication {
public:
    SetOfFunctions *fS;
    SetOfTerminals *tS;
    Population *generations[MAXGENERATIONS];
    virtual void add_functions()=0;
    virtual void add_terminals()=0;
    virtual int run_GP_simulation()=0;
    GPApplication();
    ~GPApplication();
};


Problem-specific classes

The list of functions provided is:

void LawnMowerGPApp::add_functions() {
    fS->add_function(new MowerLeft(NULL, lawn));
    fS->add_function(new MowerRight(NULL, lawn));
    fS->add_function(new MowerMove(NULL, lawn));
    fS->add_function(new IfBlocked(NULL, lawn));
    fS->add_function(new Seq2(NULL, lawn));
    fS->print_functions();
    printf("\n");
}

These functions operate on the Lawn class, which contains the playfield


A typical function (edited highlights)

class AddInt : public FunctionInt {
public:
    AddInt(Node *parent, Node *op0, Node *op1);
    AddInt(Node *parent);
    int calculate_result();
    Node *clone();
};

AddInt::AddInt(Node *parent, Node *op0, Node *op1) :
    FunctionInt(parent, 2, "AddInt") {
    set_argument(0, op0);
    set_argument(1, op1);
}

AddInt::AddInt(Node *parent) : FunctionInt(parent, 2, "AddInt") {}

int AddInt::calculate_result() {
    return get_argument(0)->get_value() + get_argument(1)->get_value();
}

Node *AddInt::clone() {
    AddInt *newAI = new AddInt(NULL);
    this->duplicate_function(newAI);
    return newAI;
}


Problem-specific classes

Case Study – The Lawnmower

# # # # # # # # # # # # # # #
# : : : : : : : : : : : : : #
# : : : : : : : : : : : * : #
# : : : : : : : : : : : * * #
# : : * : : : : * : : : * M #
# : * * * : : : : : : : : #
# : : * : : : : : : : * : #
# : : * * : : * : * : : : #
# : : : : : : : : : #
# : * : : : : : : : * : : * #
# : : : : : : : : : : * : : #
# : : : : : : * : : : : : * #
# : : : : : : * : * : : : : #
# : * : : : : : : : : : : : #
# : : : : : : : : : : : : : #
# # # # # # # # # # # # # # #

(Seq2 (MowerMove) (MowerLeft))


Roshambo

• The BBC recently covered the “World Championship” of Roshambo
  • It was won by a Briton!
• Has anyone here ever played?
• There's a website for serious players
  • With an agreed set of rules
  • Information about good sportsmanship
  • Everything that a real sport should have
• It's virtually impossible to tell if it's serious or not


Roshambo

• Computer roshambo is interesting
  • Two world championships were held at Computer Olympiads in the late 1990s
  • A classic example of a history-based problem
    • Genetic algorithms would fail
• Why not just play (pseudo-)randomly?
  • Very hard for human players, but trivial for computers


Roshambo Functions

void RoshamboGPApp::add_functions() {
    fS->add_function(new WhatBeats(NULL));
    fS->add_function(new Random(NULL));
    fS->add_function(new RandomRP(NULL));
    fS->add_function(new RandomRS(NULL));
    fS->add_function(new RandomPS(NULL));
    fS->add_function(new PastGP(NULL, history, genePlayer->who_am_i()));
    fS->add_function(new PastOpp(NULL, history, genePlayer->who_am_i()));
    fS->add_function(new IfWonLast(NULL, history, genePlayer->who_am_i()));
    fS->add_function(new IfDrewLast(NULL, history, genePlayer->who_am_i()));
    fS->add_function(new IfLostLast(NULL, history, genePlayer->who_am_i()));
    fS->print_functions();
    printf("\n");
}


Roshambo – Thought Experiment

• Why not play randomly?
• Suppose there are three players:
  • R plays randomly
  • S plays suboptimally (predictably)
  • A plays adaptively – ie randomly until a pattern is detected, then plays to beat the pattern
• R vs S is a draw (more or less)
• R vs A is a draw (because A can't find a pattern, so he continues to play randomly)
• A vs S is a big win for A
• So A wins the tournament
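The "A vs S is a big win for A" step can be checked with a small simulation. This sketch makes several simplifying assumptions that are not in the slides: S always plays one fixed hand, A skips its random phase and simply plays whatever beats the hand S has shown most often, and the hand encoding (Rock = 0, Paper = 1, Scissors = 2) is chosen for convenience and need not match the encoding used in the seminar's system.

```cpp
#include <array>
#include <cassert>

enum Hand { Rock = 0, Paper = 1, Scissors = 2 };  // assumed encoding

// Paper beats Rock, Scissors beats Paper, Rock beats Scissors.
Hand what_beats(Hand h) { return static_cast<Hand>((h + 1) % 3); }

// +1 if `mine` wins, 0 for a draw, -1 if `theirs` wins.
int outcome(Hand mine, Hand theirs) {
    if (mine == theirs) return 0;
    return what_beats(theirs) == mine ? 1 : -1;
}

struct Tally {
    int a_wins = 0;
    int draws = 0;
    int s_wins = 0;
};

// A (adaptive, simplified) plays what beats S's most frequent hand so far.
// S (predictable) always plays `s_hand`.
Tally adaptive_vs_predictable(Hand s_hand, int games) {
    std::array<int, 3> seen{};  // how often S has played each hand
    Tally tally;
    for (int g = 0; g < games; ++g) {
        int likely = 0;  // ties go to the lowest-numbered hand
        for (int h = 1; h < 3; ++h)
            if (seen[h] > seen[likely]) likely = h;
        const Hand a_hand = what_beats(static_cast<Hand>(likely));
        const int result = outcome(a_hand, s_hand);
        if (result > 0) ++tally.a_wins;
        else if (result < 0) ++tally.s_wins;
        else ++tally.draws;
        ++seen[s_hand];
    }
    return tally;
}
```

Against a fixed Scissors player over 1000 games, A loses the first game (it has no history yet) and wins the remaining 999. A purely random player R would be expected to win only about a third of the games against the same opponent.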


Roshambo – Question

• Could a genetic programming system beat single-strategy Roshambo players?
• Approach:
  • Devise a set of functions for roshambo playing
    • Eg What opponent played last
    • Eg What beats X
    • Eg Did I win the last game?
  • Create a roshambo player for the strategy we want to test against
  • “Breed” the population to try to beat the strategy
  • Play an extended match (50000 games) to determine success


Results

• BeatLast plays to beat the genetic player's last hand
  • Result: BL 0 - 50000 GP
  • Program: (IfDrewLast 2 (PastOpp 0))
• KeepWinning plays the same hand again if it wins, otherwise it chooses randomly between the other two hands
  • Result: KeepWinning=12608 DRAW=12398 gene=24994
  • Program: (IfLostLast (IfWonLast 2 1) (IfLostLast 1 (IfLostLast 2 (IfLostLast 1 2))))


Results

• RepeatLast – play the same as GP played last time
  • Result: RLR 0 – 50000 GP
  • Program: (WhatBeats (PastGP 0))
• Frequency – Play to beat what GP plays most
  • Result: Freq 0 – 50000 GP
  • Program: (IfDrewLast 2 (PastOpp 0))


Results

• Random – simply select one at random:
  • Result: Random 16589 – 16788 GP
  • Program: 1 (ie scissors)


Roshambo - Problem

• The key to successful Roshambo is adaptation
  • Play randomly until you spot a pattern
• The Roshambo GP App simply learns to play against one opponent
• So how should it adapt?
  • Play randomly, checking each specialised GP against the history
  • When one seems to be making progress, switch to that one while keeping a check on the others
  • Change again if the opponent changes
• Rather tough for a GP system to work all that out!


Roshambo Problem (continued)

• What is required is a structure which drives the Roshambo GP application
  • This structure would not be GP derived
• So is this cheating?
• Certainly in the world of GP research, this is not considered cheating


More complicated games

• Chess
  • M. Sipper, Y. Azaria, A. Hauptman, and Y. Shichel, "Attaining human-competitive game playing with genetic programming," IEEE Transactions on Systems, Man and Cybernetics – Part C
  • http://www.cs.bgu.ac.il/~sipper/papabs/gpgames.pdf


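A game tree for a 2-person, perfect information game

[Figure: game tree. Key: Max, Min, Leaf]

Max (root): 3
  Min: 3    Leaves: 5, 9, 3
  Min: 1    Leaves: 1, 2, 11
  Min: -6   Leaves: 4, 9, -6
  Min: -8   Leaves: -2, -8, 6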


A game tree for a 2-person, perfect information game

[Figure: the game tree from the previous slide, repeated. Key: Max, Min, Leaf]

Two ways to apply GP to this problem:

• Reduce the number of lines

or

• Improve the quality of the green circles
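For reference, the values in the figure follow from plain minimax: each Min node takes the smallest of its leaves and the Max root takes the largest of those minima. The sketch below rebuilds the tree and checks this. `GameNode`, `minimax` and `slide_game_tree` are hypothetical names, and reading the leaf rows as 5 9 3 / 1 2 11 / 4 9 -6 / -2 -8 6 is an interpretation of the figure, supported by the Min values 3, 1, -6 and -8 shown on it.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical node type: `score` is only meaningful for a leaf.
struct GameNode {
    int score;
    std::vector<GameNode> children;
};

// Plain minimax: Max and Min levels alternate down the tree.
int minimax(const GameNode& node, bool maximising) {
    if (node.children.empty()) return node.score;
    int best = minimax(node.children.front(), !maximising);
    for (std::size_t i = 1; i < node.children.size(); ++i) {
        const int v = minimax(node.children[i], !maximising);
        best = maximising ? std::max(best, v) : std::min(best, v);
    }
    return best;
}

// The tree from the slide: a Max root over four Min nodes of three leaves.
GameNode slide_game_tree() {
    auto leaf = [](int s) { return GameNode{s, {}}; };
    auto min_node = [&leaf](int a, int b, int c) {
        return GameNode{0, {leaf(a), leaf(b), leaf(c)}};
    };
    return GameNode{0, {min_node(5, 9, 3), min_node(1, 2, 11),
                        min_node(4, 9, -6), min_node(-2, -8, 6)}};
}
```

Calling `minimax(slide_game_tree(), true)` returns 3, the value shown at the root. Plain minimax visits every line in the tree; search control aims to visit fewer of them without changing that result.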


Chess

• Sipper et al sought to improve the quality of the information in the green circles
• They used Crafty, a very strong chess program
• They added a number of functions to Crafty
• Then they played their new version against the original and against another program


Chess - results

• Results are slightly strange
  • They adjudicated some games ???
  • 70% of games drawn in a 150 game tournament
• Crafty is “fast and dumb”
  • They added “slow and smart” evaluation terms
  • This feels odd
    • It is out of line with the “philosophy” of Crafty
    • With that sort of evaluation term, it should do better
    • “Slow and smart” uses more time, so this probably hindered their results
• The chess part of their paper is very hard for me to understand


Hexapawn

Used in a student assignment

Students had to improve my program for their assignment

Competition with other students


Hexapawn GP App

• Given the Sipper et al example, found it hard to think of appropriate evaluation terms
  • Didn’t want to “cheat” by adding sophisticated terms which are not available to the GP version’s opponent
• Found it extremely hard to create a problem set
• So far, not a successful implementation


Problem set solution

// First clear the board
for(sq=0; sq < SQUARES; sq++) {
    bd[sq] = EMPTY;
}

// Must be at least one!
sq = rand() % (SIZE*2);
bd[sq] = WPAWN;
bd[SQUARES-1-sq] = BPAWN;

// Now put on the WPAWN pieces
for(sq=0; sq < SIZE*2; sq++) {
    if((rand() % 256) > 64) {
        bd[sq] = WPAWN;
        bd[SQUARES-1-sq] = BPAWN;
    }
}

Probably the only decent thing about this application!
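The fragment above leaves `bd`, `SIZE` and `SQUARES` undeclared, so here is a self-contained sketch of the same idea. It assumes a square board with SQUARES = SIZE × SIZE, swaps `rand()` for `<random>` so that the generator can be seeded, and uses the hypothetical function name `random_start`. The piece names are the ones on the slide.

```cpp
#include <cassert>
#include <random>
#include <vector>

enum Square { EMPTY, WPAWN, BPAWN };

// Builds a random but mirror-symmetric starting position: every white pawn
// in the first two ranks has a black pawn on the mirrored square.
std::vector<Square> random_start(int size, std::mt19937& rng) {
    assert(size >= 4);  // keeps the white and black ranks from overlapping
    const int squares = size * size;  // assumed: the board is square
    std::vector<Square> bd(squares, EMPTY);  // first clear the board

    std::uniform_int_distribution<int> first_two_ranks(0, size * 2 - 1);
    std::uniform_int_distribution<int> byte(0, 255);

    // Must be at least one pawn each.
    int sq = first_two_ranks(rng);
    bd[sq] = WPAWN;
    bd[squares - 1 - sq] = BPAWN;

    // Fill each remaining square of the first two ranks with probability
    // 191/256, mirroring every white pawn with a black one.
    for (sq = 0; sq < size * 2; ++sq) {
        if (byte(rng) > 64) {
            bd[sq] = WPAWN;
            bd[squares - 1 - sq] = BPAWN;
        }
    }
    return bd;
}
```

Because every placement is mirrored, both sides always start with the same number of pawns in corresponding positions.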


Hexapawn results

• Result: GP 25 – 5 Opponent
• Program: (DivInt (AddInt (DivInt (SubInt (SubInt (SubInt 1 (DistBonus)) (DivInt (DivInt (DistBonus) 1) 1)) (DistBonus)) (DistBonus)) (AddInt (DivInt 1 (AddInt (DistBonus) (SubInt 1 (DistBonus)))) (AddInt (DistBonus) (DistBonus)))) (SubInt (DivInt 1 (MulInt (AddInt 1 (DistBonus)) 1)) (DistBonus)))
• There are lots of problems here
  • This took > 21 hours on a 7x7 board with a depth 2 search
  • Not enough games in the grand match at the end
  • The winning program is obvious nonsense


Hexapawn results (continued)

• One interesting thing did emerge – one of the key terms in the evaluation that I have used for more than two years (DistBonus(), which rewards pawns which move forward) turns out to be unhelpful
  • At least for shallow searches
  • I doubt if I would have discovered this independently


Future work

• Work to date has been exploratory
• Now focusing in on two things:
  • Using genetic programming for search control (reducing the number of lines)
  • A more detailed analysis of hexapawn evaluation
• (Perhaps first) Improvements in the GP system’s efficiency
  • Especially reducing duplicate chromosomes


Any Questions?