Genetic Programming Seminar, Andrew Williams, 22nd November 2006
Programme
1. What is genetic programming?
  • Tree-based GP
  • Linear GP (Stack GP)
2. My (TB) GP system
  • Lawnmower
  • Roshambo
3. More challenging games
  • Chess
  • Hexapawn
4. Game tree searching
5. Application to hexapawn
6. Results
What is genetic programming?
• Genetic programming is a way of building programs which can solve problems expressed in high-level terms
• GP seeks to overcome some of the limitations of genetic algorithms
  • Fixed-length representations of solutions to a problem
  • The need to encode the problem
• Genetic algorithms struggle with problems which involve an analysis of history
  • Prisoner's dilemma
What is genetic programming?
• GP starts with a random set of programs
• Each program is run against a problem set (some data)
• We record the fitness of each program
  • Its ability to solve the problem
• Once the whole population has had a turn, we recombine the members based on fitness
  • The fittest survive...
[Flowchart: the GP generation loop]
1. Create random population
2. Run population against environment (problem set)
3. Terminate? If so, evaluate best
4. Sort best to worst
5. Repeat the shaded section until the new generation is complete, choosing one of:
  • Select one program, perform mutation, insert mutant into the new population
  • Select two programs, perform crossover, insert two new programs into the new population
  • Select one program, clone, insert clone into the new population
6. Run the new population against the environment again (step 2)
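The loop in the flowchart can be sketched in C++ roughly as follows. This is a minimal illustration, not the seminar's GP system: the GpCallbacks bundle, the evolve function and the rank-biased selection rule are all assumptions made for the example, and the three operators are chosen with equal probability only for simplicity.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <random>
#include <utility>
#include <vector>

// Hypothetical bundle of problem-specific callbacks; these names are
// illustrative and do not come from the seminar's GP system.
template <typename Program>
struct GpCallbacks {
    std::function<Program(std::mt19937&)> random_program;
    std::function<double(const Program&)> fitness;  // higher is fitter
    std::function<Program(const Program&, std::mt19937&)> mutate;
    std::function<std::pair<Program, Program>(const Program&, const Program&,
                                              std::mt19937&)> crossover;
};

template <typename Program>
Program evolve(const GpCallbacks<Program>& cb, std::size_t population_size,
               int max_generations, double target_fitness, std::mt19937& rng) {
    assert(population_size >= 2 && max_generations >= 1);

    // Create random population.
    std::vector<Program> population;
    for (std::size_t i = 0; i < population_size; ++i)
        population.push_back(cb.random_program(rng));

    for (int generation = 1;; ++generation) {
        // Run population against the environment, then sort best to worst.
        std::vector<std::pair<double, Program>> scored;
        for (const Program& p : population)
            scored.emplace_back(cb.fitness(p), p);
        std::stable_sort(scored.begin(), scored.end(),
                         [](const auto& x, const auto& y) { return x.first > y.first; });

        // Terminate? If so, hand back the best program found.
        if (scored.front().first >= target_fitness || generation >= max_generations)
            return scored.front().second;

        // Rank-biased selection (an assumption): the better of two random ranks.
        std::uniform_int_distribution<std::size_t> rank(0, population_size - 1);
        auto select = [&]() -> const Program& {
            return scored[std::min(rank(rng), rank(rng))].second;
        };

        // The "shaded section": fill the new generation one operator at a time.
        std::uniform_int_distribution<int> choose_operator(0, 2);
        std::vector<Program> next;
        while (next.size() < population_size) {
            switch (choose_operator(rng)) {
                case 0:  // mutation
                    next.push_back(cb.mutate(select(), rng));
                    break;
                case 1: {  // crossover produces two new programs
                    const Program& mother = select();
                    const Program& father = select();
                    auto children = cb.crossover(mother, father, rng);
                    next.push_back(std::move(children.first));
                    if (next.size() < population_size)
                        next.push_back(std::move(children.second));
                    break;
                }
                default:  // clone
                    next.push_back(select());
            }
        }
        population = std::move(next);
    }
}
```

A caller supplies the four callbacks for its own program representation; the loop itself never needs to know whether a Program is a tree or a linear instruction list.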
Tree based GP
(AddInt (AddInt (MulInt a a) (MulInt b b)) (DivInt d c))
Result = a^2 + b^2 + d/c
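A program tree of this shape can be evaluated by a recursive walk. The sketch below builds the tree AddInt(AddInt(MulInt(a, a), MulInt(b, b)), DivInt(d, c)) and evaluates it. The class names (ExprNode, Variable, Binary) are hypothetical and deliberately simpler than the Node and FunctionInt classes described later in the talk. Returning 1 on division by zero is the usual GP "protected division" convention, assumed here rather than taken from the slides.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// The four inputs used on the slide.
struct Inputs { int a, b, c, d; };

struct ExprNode {
    virtual ~ExprNode() = default;
    virtual int value(const Inputs& in) const = 0;
};
using ExprPtr = std::unique_ptr<ExprNode>;

// Terminal: reads one of the inputs through a pointer-to-member.
struct Variable : ExprNode {
    int Inputs::*field;
    explicit Variable(int Inputs::*f) : field(f) {}
    int value(const Inputs& in) const override { return in.*field; }
};

// Function of arity 2: '+', '*' or '/'.
struct Binary : ExprNode {
    char op;
    ExprPtr left, right;
    Binary(char o, ExprPtr l, ExprPtr r)
        : op(o), left(std::move(l)), right(std::move(r)) {}
    int value(const Inputs& in) const override {
        const int x = left->value(in);
        const int y = right->value(in);
        switch (op) {
            case '+': return x + y;
            case '*': return x * y;
            default:  return y == 0 ? 1 : x / y;  // protected division (assumption)
        }
    }
};

ExprPtr var(int Inputs::*f) { return std::make_unique<Variable>(f); }
ExprPtr node(char op, ExprPtr l, ExprPtr r) {
    return std::make_unique<Binary>(op, std::move(l), std::move(r));
}

// The tree from the slide: (a*a + b*b) + d/c.
ExprPtr slide_tree() {
    return node('+',
                node('+', node('*', var(&Inputs::a), var(&Inputs::a)),
                          node('*', var(&Inputs::b), var(&Inputs::b))),
                node('/', var(&Inputs::d), var(&Inputs::c)));
}
```

With a = 3, b = 4, c = 2, d = 10, calling slide_tree()->value(Inputs{3, 4, 2, 10}) gives 9 + 16 + 5 = 30.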
Creating a new generation
• We apply genetic operators
  • Mutation
    • Change a terminal (or change a function to another with the same arity)
  • Crossover
    • Identify a subset and swap it with a subset from another program
  • Truncation
    • Replace a function with a terminal
Mutation
(AddInt (MulInt a a) (MulInt b b)) becomes (AddInt (MulInt a a) (SubInt b b))
Crossover
Parents: (AddInt (MulInt a a) (MulInt b b)) and (AddInt (AddInt j k) (DivInt x y))
The subtrees (MulInt a a) and (AddInt j k) are exchanged
Offspring: (AddInt (AddInt j k) (MulInt b b)) and (AddInt (MulInt a a) (DivInt x y))
Truncation
(AddInt (MulInt a a) (MulInt b b)) becomes (AddInt (MulInt a a) b)
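The three operators illustrated on these slides can be sketched on a small value-type tree. Everything here is hypothetical (the Tree struct, the Path type and the function names); a real system would pick the target node at random, whereas this sketch takes the position as an explicit path of child indices so that the slide examples can be reproduced exactly.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

struct Tree {
    std::string label;
    std::vector<Tree> children;  // empty for terminals
};

// A path is the sequence of child indices leading from the root to a node.
using Path = std::vector<std::size_t>;

Tree terminal(const std::string& name) { return Tree{name, {}}; }

Tree function2(const std::string& name, Tree left, Tree right) {
    Tree t{name, {}};
    t.children.push_back(std::move(left));
    t.children.push_back(std::move(right));
    return t;
}

Tree& subtree_at(Tree& root, const Path& path) {
    Tree* current = &root;
    for (std::size_t index : path) {
        assert(index < current->children.size());
        current = &current->children[index];
    }
    return *current;
}

// Mutation: relabel a node; the caller supplies a label of the same arity.
void mutate_label(Tree& root, const Path& path, const std::string& new_label) {
    subtree_at(root, path).label = new_label;
}

// Crossover: exchange a subtree of one program with a subtree of another.
void crossover(Tree& first, const Path& in_first, Tree& second, const Path& in_second) {
    std::swap(subtree_at(first, in_first), subtree_at(second, in_second));
}

// Truncation: replace a whole function subtree with a single terminal.
void truncate_at(Tree& root, const Path& path, const std::string& name) {
    subtree_at(root, path) = terminal(name);
}

// Print in the bracketed notation used elsewhere in these slides.
std::string to_sexpr(const Tree& t) {
    if (t.children.empty()) return t.label;
    std::string out = "(" + t.label;
    for (const Tree& child : t.children) out += " " + to_sexpr(child);
    return out + ")";
}

// (AddInt (MulInt a a) (MulInt b b)), the starting tree on all three slides.
Tree example_parent() {
    return function2("AddInt",
                     function2("MulInt", terminal("a"), terminal("a")),
                     function2("MulInt", terminal("b"), terminal("b")));
}
```

For example, mutate_label(tree, {1}, "SubInt") turns the right-hand MulInt into SubInt, and truncate_at(tree, {1}, "b") collapses the same subtree to the terminal b.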
Linear GP
R1 = a
R1 = R1 * a
R2 = b
R2 = R2 * b
R3 = R1
R3 = R1 + R2
R4 = d
R4 = R4 / c
R5 = R3
R5 = R5 + R4
• The papers tend to call this “assembly language”
• Nowadays we'd call it a virtual machine
  • And it doesn't need to be so primitive
• Can return > 1 result
• Easy to optimize
• Easy to spot “dead” code
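A register machine for a linear program like the one on this slide takes only a few lines. The sketch below is an illustration under stated assumptions: the Instruction layout, the Opcode set and the helper names are invented for the example, and division by zero yields 1 (protected division), which the slide does not specify.

```cpp
#include <array>
#include <cassert>
#include <vector>

// An operand is either a register (R1..R5) or a program input (a, b, c, d).
struct Operand {
    enum Kind { Register, Input } kind;
    int index;
};

enum class Opcode { Copy, Add, Mul, Div };

// dst = lhs            (Copy; rhs is ignored)
// dst = lhs <op> rhs   (Add, Mul, Div)
struct Instruction {
    int dst;
    Opcode op;
    Operand lhs;
    Operand rhs;
};

constexpr int kRegisters = 6;  // index 0 is unused so that R1..R5 map to 1..5
using Registers = std::array<int, kRegisters>;

Registers run(const std::vector<Instruction>& program, const std::vector<int>& inputs) {
    Registers regs{};  // every register starts at zero
    auto read = [&](const Operand& o) -> int {
        if (o.kind == Operand::Register) {
            assert(o.index > 0 && o.index < kRegisters);
            return regs[o.index];
        }
        return inputs.at(o.index);
    };
    for (const Instruction& ins : program) {
        assert(ins.dst > 0 && ins.dst < kRegisters);
        const int x = read(ins.lhs);
        const int y = ins.op == Opcode::Copy ? 0 : read(ins.rhs);
        switch (ins.op) {
            case Opcode::Copy: regs[ins.dst] = x; break;
            case Opcode::Add:  regs[ins.dst] = x + y; break;
            case Opcode::Mul:  regs[ins.dst] = x * y; break;
            case Opcode::Div:  regs[ins.dst] = (y == 0) ? 1 : x / y; break;  // assumption
        }
    }
    return regs;
}

Operand R(int n) { return {Operand::Register, n}; }
Operand In(int n) { return {Operand::Input, n}; }

// The ten instructions from the slide; the result ends up in R5.
std::vector<Instruction> slide_program() {
    const int a = 0, b = 1, c = 2, d = 3;  // positions in the input vector
    return {
        {1, Opcode::Copy, In(a), In(a)},  // R1 = a
        {1, Opcode::Mul, R(1), In(a)},    // R1 = R1 * a
        {2, Opcode::Copy, In(b), In(b)},  // R2 = b
        {2, Opcode::Mul, R(2), In(b)},    // R2 = R2 * b
        {3, Opcode::Copy, R(1), R(1)},    // R3 = R1
        {3, Opcode::Add, R(1), R(2)},     // R3 = R1 + R2
        {4, Opcode::Copy, In(d), In(d)},  // R4 = d
        {4, Opcode::Div, R(4), In(c)},    // R4 = R4 / c
        {5, Opcode::Copy, R(3), R(3)},    // R5 = R3
        {5, Opcode::Add, R(5), R(4)},     // R5 = R5 + R4
    };
}
```

Because the whole register file is returned, a caller can read more than one result, and an instruction whose destination is never read again is straightforward to flag as dead code.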
My GP System - Overview
• C++
  • Very object-oriented
• Intended to be easy to use (dual use: research + teaching AI)
  • Very inefficient
• Consists of two parts:
  1. GP classes, which are used in all applications
  2. Problem-specific classes which belong to specific applications
• Abstract class GPApplication links the two
My GP System (GP classes)
• GP Classes
  • A Program is made up of several Nodes
  • Terminal
    • ConstantInt
    • VariableInt
      • Hooked to C++ by pointers
      • Change it in C++ and it changes in the GP program
  • FunctionInt
    • Function arguments are Nodes
My GP System (GP classes)
• GP Classes (continued)
  • Population
    • Consists of several (maybe 500-1000) Programs
    • Represents one generation
    • Can get extremely large
  • A new Population (generation) is created by applying the genetic operators to the previous generation
    • This is done in a copy constructor
    • REMEMBER to delete the previous generation!
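The reminder about deleting the previous generation suggests one possible alternative: let a smart pointer own each generation so that the old one is freed automatically. The sketch below is not the seminar's implementation. It uses a stand-in Population with a hypothetical breed_from factory in place of the breeding copy constructor, and a live-object counter purely to show that no more than two generations exist at once.

```cpp
#include <algorithm>
#include <cassert>
#include <memory>
#include <utility>

// Stand-in for the real Population class; only object lifetime is modelled.
struct Population {
    static int live;  // number of Population objects currently alive
    int generation;

    explicit Population(int g) : generation(g) { ++live; }
    Population(const Population&) = delete;
    Population& operator=(const Population&) = delete;
    ~Population() { --live; }

    // Hypothetical replacement for the breeding copy constructor: a real
    // version would apply mutation, crossover and cloning to `previous`.
    static std::unique_ptr<Population> breed_from(const Population& previous) {
        return std::make_unique<Population>(previous.generation + 1);
    }
};
int Population::live = 0;

// Runs the generation loop and returns the peak number of live populations.
int run_generations(int max_generations) {
    auto current = std::make_unique<Population>(0);
    int peak = Population::live;
    for (int g = 1; g < max_generations; ++g) {
        auto next = Population::breed_from(*current);
        peak = std::max(peak, Population::live);
        current = std::move(next);  // the previous generation is deleted here
    }
    return peak;
}
```

The trade-off is that earlier generations are no longer available for inspection; a system that wants to keep them would store the pointers instead of overwriting them, at the memory cost the slide warns about.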
Controlling the GP process
control.h sets various parameters:
#define RANDOMPROGRAMDEPTH 4
#define INITIALMAXIMUMDEPTH 6.0f
#define MAXIMUMDEPTHMULTIPLIER 1.002f
#define MAXGENERATIONS 250
#define POPULATIONSIZE 500
GPApplication
class GPApplication {
public:
    SetOfFunctions *fS;
    SetOfTerminals *tS;
    Population *generations[MAXGENERATIONS];
    virtual void add_functions() = 0;
    virtual void add_terminals() = 0;
    virtual int run_GP_simulation() = 0;
    GPApplication();
    ~GPApplication();
};
Problem-specific classes
The list of functions provided is:
void LawnMowerGPApp::add_functions() {
    fS->add_function(new MowerLeft(NULL, lawn));
    fS->add_function(new MowerRight(NULL, lawn));
    fS->add_function(new MowerMove(NULL, lawn));
    fS->add_function(new IfBlocked(NULL, lawn));
    fS->add_function(new Seq2(NULL, lawn));
    fS->print_functions();
    printf("\n");
}
These functions operate on the Lawn class, which contains the playfield
A typical function (edited highlights)
class AddInt : public FunctionInt {
public:
    AddInt(Node *parent, Node *op0, Node *op1);
    AddInt(Node *parent);
    int calculate_result();
    Node *clone();
};

AddInt::AddInt(Node *parent, Node *op0, Node *op1)
    : FunctionInt(parent, 2, "AddInt") {
    set_argument(0, op0);
    set_argument(1, op1);
}

AddInt::AddInt(Node *parent) : FunctionInt(parent, 2, "AddInt") {}

int AddInt::calculate_result() {
    return get_argument(0)->get_value() + get_argument(1)->get_value();
}

Node *AddInt::clone() {
    AddInt *newAI = new AddInt(NULL);
    this->duplicate_function(newAI);
    return newAI;
}
Problem-specific classes
Case Study – The Lawnmower
# # # # # # # # # # # # # # #
# : : : : : : : : : : : : : #
# : : : : : : : : : : : * : #
# : : : : : : : : : : : * * #
# : : * : : : : * : : : * M #
# : * * * : : : : : : : : #
# : : * : : : : : : : * : #
# : : * * : : * : * : : : #
# : : : : : : : : : #
# : * : : : : : : : * : : * #
# : : : : : : : : : : * : : #
# : : : : : : * : : : : : * #
# : : : : : : * : * : : : : #
# : * : : : : : : : : : : : #
# : : : : : : : : : : : : : #
# # # # # # # # # # # # # # #
(Seq2 (MowerMove) (MowerLeft))
Roshambo
• The BBC recently covered the “World Championship” of Roshambo
  • It was won by a Briton!
• Has anyone here ever played?
• There's a website for serious players
  • With an agreed set of rules
  • Information about good sportsmanship
  • Everything that a real sport should have
• It's virtually impossible to tell if it's serious or not
Roshambo
• Computer roshambo is interesting
  • Two world championships were held at Computer Olympiads in the late 1990s
• A classic example of a history-based problem
  • Genetic algorithms would fail
• Why not just play (pseudo-)randomly?
  • Very hard for human players, but trivial for computers
Roshambo Functions
void RoshamboGPApp::add_functions() {
    fS->add_function(new WhatBeats(NULL));
    fS->add_function(new Random(NULL));
    fS->add_function(new RandomRP(NULL));
    fS->add_function(new RandomRS(NULL));
    fS->add_function(new RandomPS(NULL));
    fS->add_function(new PastGP(NULL, history, genePlayer->who_am_i()));
    fS->add_function(new PastOpp(NULL, history, genePlayer->who_am_i()));
    fS->add_function(new IfWonLast(NULL, history, genePlayer->who_am_i()));
    fS->add_function(new IfDrewLast(NULL, history, genePlayer->who_am_i()));
    fS->add_function(new IfLostLast(NULL, history, genePlayer->who_am_i()));
    fS->print_functions();
    printf("\n");
}
Roshambo – Thought Experiment
• Why not play randomly?
• Suppose there are three players:
  • R plays randomly
  • S plays suboptimally (predictably)
  • A plays adaptively, i.e. randomly until a pattern is detected, then plays to beat the pattern
• R vs S is a draw (more or less)
• R vs A is a draw (because A can't find a pattern, so he continues to play randomly)
• A vs S is a big win for A
• So A wins the tournament
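The thought experiment can be tried out with a small simulation. The players below are deliberately crude stand-ins chosen for illustration: S always plays Rock, and A treats "the opponent's last three hands were identical" as its pattern detector. The hand encoding is also an assumption; the slides only say that 1 means scissors.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Encoding assumed for this sketch; the slides only state that 1 is scissors.
enum Hand { Rock = 0, Scissors = 1, Paper = 2 };

// Rock is beaten by Paper, Scissors by Rock, Paper by Scissors.
Hand what_beats(Hand h) { return static_cast<Hand>((h + 2) % 3); }

struct Tally {
    int first_wins = 0;
    int second_wins = 0;
    int draws = 0;
};

void record(Tally& tally, Hand first, Hand second) {
    if (first == second) ++tally.draws;
    else if (first == what_beats(second)) ++tally.first_wins;
    else ++tally.second_wins;
}

// R: plays uniformly at random.
Hand play_random(std::mt19937& rng) {
    std::uniform_int_distribution<int> pick(0, 2);
    return static_cast<Hand>(pick(rng));
}

// S: predictable; in this sketch it simply always plays Rock.
Hand play_predictable() { return Rock; }

// A: random until the opponent repeats a hand three times, then beats it.
Hand play_adaptive(const std::vector<Hand>& opponent_history, std::mt19937& rng) {
    const std::size_t n = opponent_history.size();
    if (n >= 3 && opponent_history[n - 1] == opponent_history[n - 2] &&
        opponent_history[n - 2] == opponent_history[n - 3])
        return what_beats(opponent_history[n - 1]);
    return play_random(rng);
}

Tally adaptive_vs_predictable(int games, unsigned seed) {
    std::mt19937 rng(seed);
    std::vector<Hand> s_history;
    Tally tally;
    for (int g = 0; g < games; ++g) {
        const Hand a = play_adaptive(s_history, rng);
        const Hand s = play_predictable();
        record(tally, a, s);
        s_history.push_back(s);
    }
    return tally;
}

Tally random_vs_predictable(int games, unsigned seed) {
    std::mt19937 rng(seed);
    Tally tally;
    for (int g = 0; g < games; ++g)
        record(tally, play_random(rng), play_predictable());
    return tally;
}
```

Against this S, the adaptive player is guaranteed to win every game after the third, whereas the random player can only expect to win about a third of its games however long the match runs.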
Roshambo – Question
• Could a genetic programming system beat single-strategy Roshambo players?
• Approach:
  • Devise a set of functions for roshambo playing
    • E.g. what opponent played last
    • E.g. what beats X
    • E.g. did I win the last game?
  • Create a roshambo player for the strategy we want to test against
  • “Breed” the population to try to beat the strategy
  • Play an extended match (50000 games) to determine success
Results
• BeatLast plays to beat the genetic player's last hand
  • Result: BL 0 - 50000 GP
  • Program: (IfDrewLast 2 (PastOpp 0))
• KeepWinning plays the same hand again if it wins, otherwise it chooses randomly between the other two hands
  • Result: KeepWinning=12608 DRAW=12398 gene=24994
  • Program: (IfLostLast (IfWonLast 2 1) (IfLostLast 1 (IfLostLast 2 (IfLostLast 1 2))))
Results
• RepeatLast: play the same as GP played last time
  • Result: RLR 0 – 50000 GP
  • Program: (WhatBeats (PastGP 0))
• Frequency: play to beat what GP plays most
  • Result: Freq 0 – 50000 GP
  • Program: (IfDrewLast 2 (PastOpp 0))
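The RepeatLast result is easy to reason about: the opponent copies GP's previous hand, and the evolved program (WhatBeats (PastGP 0)) plays whatever beats that same hand. The sketch below replays such a match under several assumptions that are not in the slides: the hand encoding (only 1 = scissors is stated), reading PastGP 0 as "GP's most recent hand", and treating an empty history as Rock for both players.

```cpp
#include <cassert>
#include <vector>

// Encoding assumed for this sketch; the slides only state that 1 is scissors.
enum Hand { Rock = 0, Scissors = 1, Paper = 2 };

// Rock is beaten by Paper, Scissors by Rock, Paper by Scissors.
Hand what_beats(Hand h) { return static_cast<Hand>((h + 2) % 3); }

// Assumed meaning of (PastGP 0): GP's most recent hand, Rock if none yet.
Hand past_gp(const std::vector<Hand>& gp_history) {
    return gp_history.empty() ? Rock : gp_history.back();
}

// The evolved program from the slide: (WhatBeats (PastGP 0)).
Hand evolved_program(const std::vector<Hand>& gp_history) {
    return what_beats(past_gp(gp_history));
}

// RepeatLast: play the same hand that GP played last time.
Hand repeat_last(const std::vector<Hand>& gp_history) {
    return gp_history.empty() ? Rock : gp_history.back();
}

struct MatchResult {
    int opponent_wins = 0;
    int draws = 0;
    int gp_wins = 0;
};

MatchResult play_match(int games) {
    std::vector<Hand> gp_history;
    MatchResult result;
    for (int g = 0; g < games; ++g) {
        const Hand opponent = repeat_last(gp_history);
        const Hand gp = evolved_program(gp_history);
        if (gp == opponent) ++result.draws;
        else if (gp == what_beats(opponent)) ++result.gp_wins;
        else ++result.opponent_wins;
        gp_history.push_back(gp);
    }
    return result;
}
```

Under these assumptions play_match(50000) gives GP all 50000 games, in line with the reported 0 – 50000 score, although how the real system treats the very first game is not stated.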
Results
• Random: simply select one at random
  • Result: Random 16589 – 16788 GP
  • Program: 1 (i.e. scissors)
Roshambo - Problem
• The key to successful Roshambo is adaptation
  • Play randomly until you spot a pattern
• The Roshambo GP App simply learns to play against one opponent
• So how should it adapt?
  • Play randomly, checking each specialised GP against the history
  • When one seems to be making progress, switch to that one while keeping a check on the others
  • Change again if the opponent changes
• Rather tough for a GP system to work all that out!
Roshambo Problem (continued)
• What is required is a structure which drives the Roshambo GP application
  • This structure would not be GP derived
• So is this cheating?
• Certainly in the world of GP research, this is not considered cheating
More complicated games
• Chess
  • M. Sipper, Y. Azaria, A. Hauptman, and Y. Shichel, "Attaining human-competitive game playing with genetic programming," IEEE Transactions on Systems, Man and Cybernetics – Part C
  • http://www.cs.bgu.ac.il/~sipper/papabs/gpgames.pdf
A game tree for a 2-person, perfect information game
Max: 3
  Min: 3 (leaves: 5, 9, 3)
  Min: 1 (leaves: 1, 2, 11)
  Min: -6 (leaves: 4, 9, -6)
  Min: -8 (leaves: -2, -8, 6)
Key: Max, Min, Leaf
Two ways to apply GP to this problem:
• Reduce the number of lines
or
• Improve the quality of the green circles
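For reference, the values in the game tree figure follow from plain minimax. The sketch below reads the figure as a Max root over four Min nodes with leaves (5, 9, 3), (1, 2, 11), (4, 9, -6) and (-2, -8, 6), which is an interpretation of the flattened figure. The GameNode type and helper names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

struct GameNode {
    int value;                       // only meaningful for leaves
    std::vector<GameNode> children;  // empty for leaves
};

GameNode leaf_node(int value) { return GameNode{value, {}}; }
GameNode inner_node(std::vector<GameNode> children) {
    return GameNode{0, std::move(children)};
}

// Plain minimax: Max nodes take the largest child value, Min the smallest.
int minimax(const GameNode& node, bool maximizing) {
    if (node.children.empty()) return node.value;  // leaf: static evaluation
    int best = minimax(node.children.front(), !maximizing);
    for (std::size_t i = 1; i < node.children.size(); ++i) {
        const int v = minimax(node.children[i], !maximizing);
        best = maximizing ? std::max(best, v) : std::min(best, v);
    }
    return best;
}

// The tree as read from the figure: a Max root over four Min nodes.
GameNode figure_tree() {
    return inner_node({
        inner_node({leaf_node(5), leaf_node(9), leaf_node(3)}),
        inner_node({leaf_node(1), leaf_node(2), leaf_node(11)}),
        inner_node({leaf_node(4), leaf_node(9), leaf_node(-6)}),
        inner_node({leaf_node(-2), leaf_node(-8), leaf_node(6)}),
    });
}
```

Here minimax(figure_tree(), true) returns 3: the Min nodes evaluate to 3, 1, -6 and -8, and the root takes the maximum. In these terms, "reducing the number of lines" means visiting fewer edges (as alpha-beta pruning does), while "improving the green circles" means a better leaf evaluation.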
Chess
• Sipper et al sought to improve the quality of the information in the green circles
• They used crafty, a very strong chess program
• They added a number of functions to crafty
• Then they played their new version against the original and against another program
Chess - results
• Results are slightly strange
  • They adjudicated some games ???
  • 70% of games drawn in a 150 game tournament
• Crafty is “fast and dumb”
  • They added “slow and smart” evaluation terms
• This feels odd
  • It is out of line with the “philosophy” of crafty
  • With that sort of evaluation term, it should do better
  • “Slow and smart” uses more time, so this probably hindered their results
• The chess part of their paper is very hard for me to understand
Hexapawn
Used in a student assignment
Students had to improve my program for their assignment
Competition with other students
Hexapawn GP App
• Given the Sipper et al example, found it hard to think of appropriate evaluation terms
  • Didn’t want to “cheat” by adding sophisticated terms which are not available to the GP version’s opponent
• Found it extremely hard to create a problem set
• So far, not a successful implementation
Problem set solution
// First clear the board
for(sq=0; sq < SQUARES; sq++) {
    bd[sq] = EMPTY;
}
// Must be at least one!
sq = rand() % (SIZE*2);
bd[sq] = WPAWN;
bd[SQUARES-1-sq] = BPAWN;
// Now put on the WPAWN pieces
for(sq=0; sq < SIZE*2; sq++) {
    if((rand() % 256) > 64) {
        bd[sq] = WPAWN;
        bd[SQUARES-1-sq] = BPAWN;
    }
}
Probably the only decent thing about this application!
Hexapawn results
• Result: GP 25 – 5 Opponent
• Program: (DivInt (AddInt (DivInt (SubInt (SubInt (SubInt 1 (DistBonus)) (DivInt (DivInt (DistBonus) 1) 1)) (DistBonus)) (DistBonus)) (AddInt (DivInt 1 (AddInt (DistBonus) (SubInt 1 (DistBonus)))) (AddInt (DistBonus) (DistBonus)))) (SubInt (DivInt 1 (MulInt (AddInt 1 (DistBonus)) 1)) (DistBonus)))
• There are lots of problems here
  • This took > 21 hours on a 7x7 board with a depth 2 search
  • Not enough games in the grand match at the end
  • The winning program is obvious nonsense
Hexapawn results (continued)
• One interesting thing did emerge: one of the key terms in the evaluation that I have used for more than two years (DistBonus(), which rewards pawns which move forward) turns out to be unhelpful
  • At least for shallow searches
• I doubt if I would have discovered this independently
Future work
• Work to date has been exploratory
• Now focusing in on two things:
  • Using genetic programming for search control (reducing the number of lines)
  • A more detailed analysis of hexapawn evaluation
• (Perhaps first) Improvements in the GP system’s efficiency
  • Especially reducing duplicate chromosomes
Any Questions?