Genetic Programming

Genetic Programming CSCE155 Fall 2004 Leen-Kiat Soh Department of Computer Science and Engineering University of Nebraska


Page 1: Genetic Programming

Genetic Programming
CSCE155, Fall 2004

Leen-Kiat Soh
Department of Computer Science and Engineering

University of Nebraska

Page 2: Genetic Programming

Acknowledgments

• The materials in this presentation are based on
– http://www.genetic-programming.org
– http://www.genetic-programming.com/gpanimatedtutorial.html

Page 3: Genetic Programming

Introduction

• One of the central challenges of computer science is to get a computer to do what needs to be done, without telling it how to do it

• Genetic programming addresses this challenge by providing a method for automatically creating a working computer program from a high-level statement of the problem
– Automatic programming (a.k.a. program synthesis or program induction)

Page 4: Genetic Programming

Basic Steps

• GP
– A domain-independent method
– Iteratively transforms a population of computer programs into a new generation of programs
– Two sets of steps:
  • Preparatory steps
  • Executional steps

Page 5: Genetic Programming

Preparatory Steps

• The human user communicates the high-level statement of the problem to the genetic programming system by performing certain well-defined preparatory steps:
– The set of terminals
– The set of primitive functions
– The fitness measure
– Certain parameters for controlling the run
– The termination criterion and method for designating the result of the run

Page 6: Genetic Programming

Preparatory Steps

• The first two preparatory steps specify the ingredients that are available to create the computer programs
– A run of GP is a competitive search among a diverse population of programs composed of the available functions and terminals

[Figure: the five preparatory inputs (Terminal Set, Function Set, Fitness Measure, Parameters, Termination Criterion & Result Designation) feed into GP, which produces a Computer Program]

Page 7: Genetic Programming

Preparatory Steps: Terminal and Function Sets

• The identification of the function set and terminal set for a particular problem is usually a straightforward process
– The function set may consist of merely the arithmetic functions (+, -, *, /) and a conditional branching operator
– The terminal set may consist of the program’s external inputs (independent variables) and numerical constants
– Together, the two sets define the search space
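A minimal sketch of how a function set and a terminal set might be written down in Python. The nested-tuple program representation, the names `FUNCTION_SET`, `TERMINAL_SET`, `evaluate`, and `if_positive`, and the "protected" division that returns 1.0 on a zero divisor are all assumptions made for illustration; the slides do not prescribe a representation.

```python
import operator


def protected_div(a, b):
    """Division that returns 1.0 instead of failing on a zero divisor (assumed convention)."""
    return a / b if b != 0 else 1.0


def if_positive(condition, then_value, else_value):
    """A simple conditional branching operator (hypothetical): pick a branch by sign."""
    return then_value if condition > 0 else else_value


# Function set: the primitive operations a program may be built from.
FUNCTION_SET = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": protected_div,
    "if>0": if_positive,
}

# Terminal set: the external input x and one numerical constant.
TERMINAL_SET = ("x", 1.0)


def evaluate(tree, x):
    """Evaluate a program tree such as ("+", "x", 1.0) for a given input x."""
    if tree == "x":
        return x
    if isinstance(tree, (int, float)):
        return float(tree)
    name, *args = tree
    return FUNCTION_SET[name](*(evaluate(arg, x) for arg in args))
```

Every program that can be assembled from these two sets is a point in the search space, so enlarging either set enlarges the space that a run has to explore.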

Page 8: Genetic Programming

Preparatory Steps: Terminal and Function Sets

• Floor-mopping robot example
– Function set: moving, turning, swishing the mop, etc.
• Controller example
– Function set: signal processing functions that operate on time-domain signals, including integrators, differentiators, leads, lags, gains, adders, subtractors, etc.
– Terminal set: reference signal and plant output
• Analog electrical circuit synthesis example
– Function set: building transistors, capacitors, resistors, etc.
– Terminal set: wire, a circuit’s placement and routing, etc.

Page 9: Genetic Programming

Preparatory Steps: Fitness Measure

• Specifies what needs to be done
– The primary mechanism for communicating the high-level statement of the problem’s requirements to the GP system
– E.g., if the goal is to get GP to automatically synthesize an amplifier, the fitness function is the mechanism for telling GP that amplifying an incoming signal is what gets rewarded
– Defines the search’s desired goal

Page 10: Genetic Programming

Preparatory Steps: Control Parameters

• Specifies the control parameters for the run
– Population size, probabilities of performing the genetic operations, the maximum size for programs, etc.
– Defines the search’s administrative details

Page 11: Genetic Programming

Preparatory Steps: Termination

• Specifies the termination criterion and the method of designating the result of the run
– Termination criterion: a maximum number of generations to be run, a problem-specific success predicate, etc.
  • E.g., when the fitness values of numerous successive best-of-generation individuals appear to have reached a plateau
– The single best-so-far individual is then harvested and designated as the result of the run
– Defines the search’s administrative details
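The three kinds of termination criterion listed above could be combined in one predicate, sketched below. The function name `should_terminate`, its parameters, and all default thresholds are hypothetical, and fitness is assumed to be an error value where lower is better.

```python
def should_terminate(generation, best_errors, max_generations=50,
                     target_error=0.01, plateau_length=10, tolerance=1e-9):
    """Decide whether a run should stop.

    best_errors holds the best-of-generation error for each generation so far
    (lower is better, an assumption of this sketch).
    """
    # Criterion 1: the generation budget is used up.
    if generation >= max_generations:
        return True
    # Criterion 2: a problem-specific success predicate, here "error small enough".
    if best_errors and best_errors[-1] <= target_error:
        return True
    # Criterion 3: the best-of-generation error has stopped changing (plateau).
    if len(best_errors) >= plateau_length:
        recent = best_errors[-plateau_length:]
        if max(recent) - min(recent) <= tolerance:
            return True
    return False
```

Whichever criterion fires, the result of the run is the best-so-far individual, so the caller would need to keep track of that individual separately.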

Page 12: Genetic Programming

Executional Steps

• GP typically
– Starts with a population of randomly generated computer programs composed of the available programmatic ingredients (function and terminal sets)
– Iteratively transforms a population of programs into a new generation of the population by applying analogs of naturally occurring genetic operations
  • Operations are applied to individual(s) selected from the population
  • Individual(s) are probabilistically selected to participate in the genetic operations based on their fitness

Page 13: Genetic Programming

Executional Steps

• Steps are:
– Randomly create an initial population (generation 0) of individual computer programs composed of the available functions and terminals
– Iteratively perform the “genetic evolution” sub-steps (called a generation) on the population until the termination criterion is satisfied
– After the termination criterion is satisfied, harvest the single best program in the population produced during the run (the best-so-far individual) and designate it as the result of the run
  • If the run is successful, the result may be a solution (or approximate solution) to the problem
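The executional steps can be sketched as a short driver loop. This is a skeleton under stated assumptions rather than a full GP system: `run_gp` and its parameters are hypothetical names, the caller supplies the population creator, the error measure (lower is better), and the breeding step, and termination is reduced to a generation limit plus an error target.

```python
def run_gp(create_population, error_of, breed, max_generations=50, target_error=0.0):
    """Run the generational loop and return the best-so-far program and its error."""
    population = create_population()          # generation 0
    best, best_error = None, float("inf")
    for generation in range(max_generations + 1):
        # Execute every program and ascertain its fitness (here: an error value).
        errors = [error_of(program) for program in population]
        # Track the best-so-far individual across the whole run.
        for program, err in zip(population, errors):
            if err < best_error:
                best, best_error = program, err
        # Termination criterion: success predicate or generation limit.
        if best_error <= target_error or generation == max_generations:
            break
        # Otherwise produce the next generation from the current one.
        population = breed(population, errors)
    return best, best_error
```

Because the best-so-far individual is tracked across generations, the harvested result can never be worse than the best individual of generation 0, even if a later generation happens to lose it.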

Page 14: Genetic Programming

Executional Steps

• “Genetic Evolution” steps are:
– Execute each program in the population and ascertain its fitness using the problem’s fitness measure
– Select one or two individual program(s) from the population with a probability based on fitness (with re-selection allowed) to participate in the genetic operations
– Create new individual program(s) using genetic operations
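The selection sub-step above can be illustrated with fitness-proportionate sampling. In this sketch the name `select` is hypothetical, fitness is assumed to be an error (lower is better), and the weight 1 / (1 + error) is one assumed way of turning an error into a selection weight; other schemes, such as tournament selection, would fit the description equally well.

```python
import random


def select(population, errors, k=1, rng=random):
    """Pick k individuals with probability based on fitness, re-selection allowed."""
    # Smaller error -> larger weight; an error of 0 gets the maximum weight of 1.
    weights = [1.0 / (1.0 + err) for err in errors]
    # random.choices samples with replacement, so one individual can be picked repeatedly.
    return rng.choices(population, weights=weights, k=k)
```

Sampling is probabilistic, so the best individual is favored but not guaranteed to be chosen, and the worst individual still has a small chance of taking part.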

Page 15: Genetic Programming

Genetic Operations

• Reproduction Operation
– Simply allows the selected program to survive to the next generation without any changes
– Reproduction is typically performed quite frequently (say, 10%-15% during each generation of the run)

Page 16: Genetic Programming

Genetic Operations

• Mutation Operation
– Only one parental program is needed
– A mutation point is randomly chosen for the selected program, the subtree rooted at that point is deleted, and a new subtree is grown using the same random growth process that was used to generate the initial population
– This asexual mutation is typically performed sparingly (say, 1% during each generation of the run)
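Subtree mutation as described above might look like the following sketch. Programs are assumed to be nested tuples such as `("+", "x", 1.0)`; the primitive names, the 0.3 chance of stopping at a terminal, and the helper names `grow`, `paths`, `replace_at`, and `mutate` are illustrative choices, not part of the slides.

```python
import random

FUNCTIONS = {"+": 2, "-": 2, "*": 2, "%": 2}   # primitive name -> number of arguments
TERMINALS = ["x", 1.0]


def grow(max_depth, rng):
    """Randomly grow a program tree, as when creating generation 0."""
    if max_depth == 0 or rng.random() < 0.3:
        return rng.choice(TERMINALS)
    name = rng.choice(sorted(FUNCTIONS))
    return (name,) + tuple(grow(max_depth - 1, rng) for _ in range(FUNCTIONS[name]))


def paths(tree, prefix=()):
    """Yield the path (a tuple of child indices) to every node of the tree."""
    yield prefix
    if isinstance(tree, tuple):
        for i, child in enumerate(tree[1:], start=1):
            yield from paths(child, prefix + (i,))


def replace_at(tree, path, new_subtree):
    """Return a copy of tree with the subtree at path replaced by new_subtree."""
    if not path:
        return new_subtree
    i = path[0]
    return tree[:i] + (replace_at(tree[i], path[1:], new_subtree),) + tree[i + 1:]


def mutate(tree, rng, max_depth=3):
    """Pick a random mutation point and regrow the subtree rooted there."""
    point = rng.choice(list(paths(tree)))
    return replace_at(tree, point, grow(max_depth, rng))
```

Because the replacement subtree comes from the same growth routine and the same primitive sets, the offspring is always a syntactically valid program.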

Page 17: Genetic Programming

Genetic Operations

• Crossover (Sexual Recombination) Operation
– Two parental programs are needed
– A crossover point is randomly chosen in the first parent and a crossover point is randomly chosen in the second parent. Then the subtree rooted at the crossover point of the first, or receiving, parent is deleted and replaced by the subtree from the second, or contributing, parent
– Crossover is the predominant operation in GP (say, 85% to 90%)
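A sketch of the one-offspring crossover described above, again assuming programs are nested tuples such as `("+", "x", 1.0)`. The helper names are hypothetical. A two-offspring variant would apply the same replacement in the other direction as well.

```python
import random


def paths(tree, prefix=()):
    """Yield the path (a tuple of child indices) to every node of the tree."""
    yield prefix
    if isinstance(tree, tuple):
        for i, child in enumerate(tree[1:], start=1):
            yield from paths(child, prefix + (i,))


def get_at(tree, path):
    """Return the subtree rooted at path."""
    for i in path:
        tree = tree[i]
    return tree


def replace_at(tree, path, new_subtree):
    """Return a copy of tree with the subtree at path replaced by new_subtree."""
    if not path:
        return new_subtree
    i = path[0]
    return tree[:i] + (replace_at(tree[i], path[1:], new_subtree),) + tree[i + 1:]


def crossover_at(receiving, point_r, contributing, point_c):
    """Replace the subtree at point_r in the receiving parent with the
    subtree at point_c taken from the contributing parent."""
    return replace_at(receiving, point_r, get_at(contributing, point_c))


def crossover(receiving, contributing, rng=random):
    """Choose both crossover points at random, then recombine."""
    point_r = rng.choice(list(paths(receiving)))
    point_c = rng.choice(list(paths(contributing)))
    return crossover_at(receiving, point_r, contributing, point_c)
```

For example, `crossover_at(("+", "x", 1.0), (2,), ("*", "x", "x"), ())` replaces the constant in `x + 1` with the whole contributing parent and yields the tree for `x + x*x`.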

Page 18: Genetic Programming

Genetic Operations

• Architecture-Altering Operations
– Based on gene duplication and gene deletion in nature
– For problems involving computer programs:
  • Dynamically add and delete subroutines, arguments, iterations, loops, recursions, and memory, and also different hierarchical arrangements of these elements
– Programs with architectures that are well suited to the problem at hand will tend to grow and prosper in the competitive evolutionary process, while inadequate ones wither away
– These operations are applied sparingly during the run (say, 0.5% to 1% in each generation)
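The rates quoted for the genetic operations suggest a weighted choice of operation each time a new individual is created. In the sketch below the specific rates are assumptions picked from within the quoted ranges so that they sum to 1, and `OPERATION_RATES` and `choose_operation` are hypothetical names.

```python
import random

# Assumed rates, chosen from within the ranges quoted on the slides.
OPERATION_RATES = {
    "crossover": 0.88,
    "reproduction": 0.10,
    "mutation": 0.01,
    "architecture_altering": 0.01,
}


def choose_operation(rng=random):
    """Pick which genetic operation creates the next offspring."""
    names = list(OPERATION_RATES)
    weights = [OPERATION_RATES[name] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]
```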

Page 19: Genetic Programming

Genetic Operations

• Architecture-Altering Operations, Cont’d
– Subroutine duplication
  • Duplicates a pre-existing subroutine in an individual program, gives a new name to the copy, and randomly divides the pre-existing calls to the old subroutine between the two
  • Broadens the hierarchy and may lead to later divergence of the two subroutines, sometimes yielding specialization
– Argument duplication
  • Duplicates one argument of a subroutine, randomly divides internal references to it, and preserves overall program semantics by adjusting all calls to the subroutine
  • Enlarges the dimensionality of the subspace on which the subroutine operates
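Subroutine duplication can be sketched on a toy program made of a main branch and named subroutines. The dictionary layout, the branch names `"main"`, `"adf0"`, and `"adf1"`, and the 50/50 split of calls are assumptions for illustration only.

```python
import random


def rename_calls(tree, old, new, rng):
    """Randomly redirect calls to `old` so that roughly half of them call `new`."""
    if not isinstance(tree, tuple):
        return tree
    name, *args = tree
    if name == old and rng.random() < 0.5:
        name = new
    return (name,) + tuple(rename_calls(arg, old, new, rng) for arg in args)


def duplicate_subroutine(program, old, new, rng=random):
    """Copy subroutine `old` under the name `new` and split the existing calls."""
    result = dict(program)
    result[new] = program[old]                       # identical body, new name
    result["main"] = rename_calls(program["main"], old, new, rng)
    return result
```

Right after duplication both subroutines have the same body, so the program computes exactly what it did before; the two copies can only diverge through later crossover or mutation.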

Page 20: Genetic Programming

Genetic Operations

• Architecture-Altering Operations, Cont’d
– Subroutine creation
  • Creates a new subroutine from part of a main result-producing branch
  • Deepens the hierarchy of references in the overall program
– Subroutine deletion
  • Deletes a pre-existing subroutine
  • Narrows or makes shallower the hierarchy of subroutines
– Argument deletion
  • Deletes an argument from a subroutine
  • Reduces the amount of information available to the subroutine
– Generalization

Page 21: Genetic Programming

Flowchart

Page 22: Genetic Programming

Tidbits

• Each individual program in the population is executed so that each can be measured in terms of how well it performs the task at hand
– This translates into a single explicit numerical value, called fitness
– E.g., the amount of error between an individual program’s output and the desired output, the amount of time, the accuracy, the number of lines, the payoff that a game-playing program produces, etc.
• The creation of the initial random population is a blind random search of the search space of the problem
– Typically, the individual programs in generation 0 all have exceedingly poor fitness, but some are (usually) more fit than others and are selected for the next generation

Page 23: Genetic Programming

Tidbits

• With probabilistic selection, better individuals are favored over inferior individuals
– The best individual in the population is not necessarily selected
– The worst individual in the population is not necessarily passed over
• After each generation, the population of offspring replaces the now-old generation
• All programs in the initial random population (generation 0) of a run of GP are syntactically valid, executable programs
– The genetic operations that are performed are also designed to produce offspring that are syntactically valid, executable programs

Page 24: Genetic Programming

Example of a GP Run: Symbolic Regression of a Quadratic Polynomial

• Goal: automatically create a computer program whose output is equal to the values of the quadratic polynomial x*x + x + 1 in the range from -1 to 1
• Preparatory Steps:
– Terminal set: independent variable x
– Function set: flexible, say +, -, *, and % (protected division)
– Fitness measure: compare the result of an individual program with the result of x*x + x + 1
  • A fitness (error) of zero would indicate a perfect fit

Page 25: Genetic Programming

Example of a GP Run: Symbolic Regression of a Quadratic Polynomial

• Executional Steps:

Figure 1 Initial population of four randomly created individuals of generation 0

Page 26: Genetic Programming

Example of a GP Run: Symbolic Regression of a Quadratic Polynomial

• Executional Steps:

Figure 2 The fitness of each of the four randomly created individuals of generation 0 is equal to the area between two curves: (a) 0.67, (b) 1.0, (c) 1.67, and (d) 2.67
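The area-between-curves fitness named in the caption can be approximated numerically. The sketch below uses a midpoint-rule sum over the range from -1 to 1; the function names, the step count, and the use of plain Python callables to stand in for evolved programs are assumptions. The candidate `x + 1` is a hypothetical individual, not necessarily one of those shown in the figure.

```python
def target(x):
    """The quadratic polynomial the run is trying to match."""
    return x * x + x + 1


def area_error(program, lo=-1.0, hi=1.0, steps=1000):
    """Approximate the area between program(x) and target(x) on [lo, hi]
    with the midpoint rule; 0.0 means a perfect fit."""
    width = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * width            # midpoint of the i-th slice
        total += abs(program(x) - target(x)) * width
    return total


# A hypothetical candidate: x + 1 differs from the target by x*x,
# whose integral over [-1, 1] is 2/3 (about 0.67).
candidate_error = area_error(lambda x: x + 1)
```

A candidate that reproduces `x*x + x + 1` exactly scores 0.0, which is the perfect fit the fitness measure is defined to reward.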

Page 27: Genetic Programming

Example of a GP Run: Symbolic Regression of a Quadratic Polynomial

• Executional Steps:

Figure 3 Population of generation 1 (after one reproduction, one mutation, and one two-offspring crossover operation)

Page 28: Genetic Programming

Human-Competitive Results

• An automatically created result is “human-competitive” if it satisfies one or more of the eight criteria below:
– (A) The result was patented as an invention in the past, is an improvement over a patented invention, or would qualify today as a patentable new invention
– (B) The result is equal to or better than a result that was accepted as a new scientific result at the time when it was published in a peer-reviewed scientific journal
– (C) The result is equal to or better than a result that was placed into a database or archive of results maintained by an internationally recognized panel of scientific experts
– (D) The result is publishable in its own right as a new scientific result, independent of the fact that the result was mechanically created

Page 29: Genetic Programming

Human-Competitive Results

• An automatically created result is “human-competitive” if it satisfies one or more of the eight criteria below, cont’d:
– (E) The result is equal to or better than the most recent human-created solution to a long-standing problem for which there has been a succession of increasingly better human-created solutions
– (F) The result is equal to or better than a result that was considered an achievement in its field at the time it was first discovered
– (G) The result solves a problem of indisputable difficulty in its field
– (H) The result holds its own or wins a regulated competition involving human contestants (in the form of either live human players or human-written computer programs)

Page 30: Genetic Programming

36 Instances of GP-Generated Human-Competitive Results

• 15 instances where GP has created an entity that either infringes or duplicates the functionality of a previously patented 20th-century invention

• 6 instances where GP has done the same with respect to a 21st-century invention

• 2 instances where GP has created a patentable new invention

• Fields include
– Computational molecular biology, cellular automata, sorting networks, and the synthesis of the design of both the topology and component sizing for complex structures, such as analog electrical circuits, controllers, and antennas

Page 31: Genetic Programming

36 Instances of GP-Generated Human-Competitive Results

No. | Claimed instance | Basis for claim of human-competitiveness | Reference
1 | Creation of a better-than-classical quantum algorithm for the Deutsch-Jozsa “early promise” problem | B, F | Spector, Barnum, and Bernstein 1998
2 | Creation of a better-than-classical quantum algorithm for Grover’s database search problem | B, F | Spector, Barnum, and Bernstein 1999
3 | Creation of a quantum algorithm for the depth-two AND/OR query problem that is better than any previously published result | D | Spector, Barnum, Bernstein, and Swamy 1999; Barnum, Bernstein, and Spector 2000
4 | Creation of a quantum algorithm for the depth-one OR query problem that is better than any previously published result | D | Barnum, Bernstein, and Spector 2000
5 | Creation of a protocol for communicating information through a quantum gate that was previously thought not to permit such communication | D | Spector and Bernstein 2003
6 | Creation of a novel variant of quantum dense coding | D | Spector and Bernstein 2003
7 | Creation of a soccer-playing program that won its first two games in the RoboCup 1997 competition | H | Luke 1998

Page 32: Genetic Programming

36 Instances of GP-Generated Human-Competitive Results

No. | Claimed instance | Basis for claim of human-competitiveness | Reference
8 | Creation of a soccer-playing program that ranked in the middle of the field of 34 human-written programs in the RoboCup 1998 competition | H | Andre and Teller 1999
9 | Creation of four different algorithms for the transmembrane segment identification problem for proteins | B, E | Sections 18.8 and 18.10 of Genetic Programming II and sections 16.5 and 17.2 of Genetic Programming III
10 | Creation of a sorting network for seven items using only 16 steps | A, D | Sections 21.4.4, 23.6, and 57.8.1 of Genetic Programming III
11 | Rediscovery of the Campbell ladder topology for lowpass and highpass filters | A, F | Section 25.15.1 of Genetic Programming III and section 5.2 of Genetic Programming IV
12 | Rediscovery of the Zobel “M-derived half section” and “constant K” filter sections | A, F | Section 25.15.2 of Genetic Programming III
13 | Rediscovery of the Cauer (elliptic) topology for filters | A, F | Section 27.3.7 of Genetic Programming III
14 | Automatic decomposition of the problem of synthesizing a crossover filter | A, F | Section 32.3 of Genetic Programming III

Page 33: Genetic Programming

36 Instances of GP-Generated Human-Competitive Results

No. | Claimed instance | Basis for claim of human-competitiveness | Reference
15 | Rediscovery of a recognizable voltage gain stage and a Darlington emitter-follower section of an amplifier and other circuits | A, F | Section 42.3 of Genetic Programming III
16 | Synthesis of 60 and 96 decibel amplifiers | A, F | Section 45.3 of Genetic Programming III
17 | Synthesis of analog computational circuits for squaring, cubing, square root, cube root, logarithm, and Gaussian functions | A, D, G | Section 47.5.3 of Genetic Programming III
18 | Synthesis of a real-time analog circuit for time-optimal control of a robot | G | Section 48.3 of Genetic Programming III
19 | Synthesis of an electronic thermometer | A, G | Section 49.3 of Genetic Programming III
20 | Synthesis of a voltage reference circuit | A, G | Section 50.3 of Genetic Programming III
21 | Creation of a cellular automata rule for the majority classification problem that is better than the Gacs-Kurdyumov-Levin (GKL) rule and all other known rules written by humans | D, E | Andre, Bennett, and Koza 1996 and section 58.4 of Genetic Programming III
22 | Creation of motifs that detect the D–E–A–D box family of proteins and the manganese superoxide dismutase family | C | Section 59.8 of Genetic Programming III

Page 34: Genetic Programming

36 Instances of GP-Generated Human-Competitive Results

No. | Claimed instance | Basis for claim of human-competitiveness | Reference
23 | Synthesis of topology for a PID-D2 (proportional, integrative, derivative, and second derivative) controller | A, F | Section 3.7 of Genetic Programming IV
24 | Synthesis of an analog circuit equivalent to Philbrick circuit | A, F | Section 4.3 of Genetic Programming IV
25 | Synthesis of a NAND circuit | A, F | Section 4.4 of Genetic Programming IV
26 | Simultaneous synthesis of topology, sizing, placement, and routing of analog electrical circuits | A, F, G | Chapter 5 of Genetic Programming IV
27 | Synthesis of topology for a PID (proportional, integrative, and derivative) controller | A, F | Section 9.2 of Genetic Programming IV
28 | Rediscovery of negative feedback | A, E, F, G | Chapter 14 of Genetic Programming IV
29 | Synthesis of a low-voltage balun circuit | A | Section 15.4.1 of Genetic Programming IV
30 | Synthesis of a mixed analog-digital variable capacitor circuit | A | Section 15.4.2 of Genetic Programming IV
31 | Synthesis of a high-current load circuit | A | Section 15.4.3 of Genetic Programming IV
32 | Synthesis of a voltage-current conversion circuit | A | Section 15.4.4 of Genetic Programming IV

Page 35: Genetic Programming

36 Instances of GP-Generated Human-Competitive Results

No. | Claimed instance | Basis for claim of human-competitiveness | Reference
33 | Synthesis of a cubic function generator | A | Section 15.4.5 of Genetic Programming IV
34 | Synthesis of a tunable integrated active filter | A | Section 15.4.6 of Genetic Programming IV
35 | Creation of PID tuning rules that outperform the Ziegler-Nichols and Åström-Hägglund tuning rules | A, B, D, E, F, G | Chapter 12 of Genetic Programming IV
36 | Creation of three non-PID controllers that outperform a PID controller using the Ziegler-Nichols or Åström-Hägglund tuning rules | A, B, D, E, F, G | Chapter 13 of Genetic Programming IV

Page 36: Genetic Programming

Web and Literature

• The home page of Genetic Programming Inc. at www.genetic-programming.com
• For information about the field of genetic programming in general, visit www.genetic-programming.org
• The home page of John R. Koza at Genetic Programming Inc. (including online versions of most papers) and the home page of John R. Koza at Stanford University

• Information about the 1992 book Genetic Programming: On the Programming of Computers by Means of Natural Selection, the 1994 book Genetic Programming II: Automatic Discovery of Reusable Programs, the 1999 book Genetic Programming III: Darwinian Invention and Problem Solving, and the 2003 book Genetic Programming IV: Routine Human-Competitive Machine Intelligence.

Page 37: Genetic Programming

Web and Literature

• For information on 3,198 papers (many on-line) on genetic programming (as of June 27, 2003) by over 900 authors, see William Langdon’s bibliography on genetic programming.
• For information on the Genetic Programming and Evolvable Machines journal published by Kluwer Academic Publishers
• Important Conferences:
– Genetic and Evolutionary Computation Conference (GECCO)
– NASA/DoD Conference on Evolvable Hardware (EH)
– Euro-Genetic-Programming Conference