EVA I - Department of Theoretical Computer Science and...

Source: ktiml.mff.cuni.cz/~neruda/eva1-16en.pdf
EVA I, NAIL025 – 2016/17, Roman Neruda, ENGLISH VERSION – 13-01-2017

INTRODUCTION
Topics, sources, outlines.

ROMAN NERUDA: EVA1 - 2013/14

Literature

•  Mitchell, M.: An Introduction to Genetic Algorithms. MIT Press, 1996.
•  Eiben, A. E. and Smith, J. E.: Introduction to Evolutionary Computing. Springer, 2007.
•  Michalewicz, Z.: Genetic Algorithms + Data Structures = Evolution Programs (3rd ed.). Springer, 1996.
•  Holland, J.: Adaptation in Natural and Artificial Systems (2nd ed.). MIT Press, 1992.
•  Goldberg, D.: Genetic Algorithms in Search, Optimization and Machine Learning. Addison-Wesley, 1989.


Topics

•  Evolution models, population, recombination.
•  Genetic algorithms: encoding, operators, selection, crossover, mutation.
•  Natural selection, simulation, objective function, roulette wheel, tournament, elitism.
•  Representational schemata, schema theorem, building blocks hypothesis.
•  Prisoner's dilemma, strategies, equilibria, evolutionary stability.
•  Evolution strategies, cooperation, meta-parameters.
•  Differential evolution, CMA-ES.
•  EA and combinatorial problems, NP-hard tasks, TSP, ...
•  Machine learning and data mining, evolution of rule-based systems, Michigan vs. Pittsburgh.
•  Learning classifier systems, bucket brigade algorithm, Q-learning.


EVOLUTIONARY ALGORITHMS
Biological motivation, basic parts


Darwin evolution theory

•  1859 – On the Origin of Species
•  Limited environment resources
•  Reproduction is the key to life
•  Better fitted (adapted) individuals have bigger chances to reproduce
•  Successful phenotype traits are reproduced, modified, recombined


Mendel genetics

•  1866 – Versuche über Pflanzenhybriden (experiments begun in 1856)
•  Gene as a basic hereditary unit
•  Every diploid individual has a pair of alleles for each gene; one of them is transmitted to offspring, independently of the other genes.
•  It's complicated:
    –  Polygeny – more genes influence one trait
    –  Pleiotropy – one gene influences more traits
    –  Mitochondrial DNA
    –  Epigenetics


DNA

•  1953 – Watson & Crick – double helix structure of DNA
•  Molecular-biological view:
    –  How is the genetic information stored in a living organism
    –  How is it inherited
•  DNA consists of 4 nucleotides / bases – adenine, guanine, cytosine, thymine
•  Codon – a triplet of nucleotides encoding 1 out of 20 standard amino acids (64 codons, hence redundancy)
•  These amino acids are the basic building blocks of proteins in all living organisms


Molecular genetics

•  Crossover
•  Mutation
•  Transcription: DNA -> RNA
•  Translation: RNA -> protein
•  GENOTYPE -> PHENOTYPE
•  One-direction, complex mapping
•  Lamarckism:
    –  There is an inverse mapping from phenotype to genotype
    –  Acquired traits can be inherited


EA - summary

•  Natural evolution: environment, individuals, fitness
•  Artificial evolution: problem, candidate solutions, quality of a solution measure
•  EAs are population-based stochastic search algorithms
•  Recombination and mutation create variability
•  Selection leads the search in the right direction


General EA

•  EAs are robust meta-algorithms
•  No free lunch theorem – there is no one best algorithm
•  It pays to create domain-specific variants of EAs
    –  Representation
    –  Operators


General EA

•  Create initial population P(0) at random
•  In a cycle create P(t+1) from P(t):
    –  Parental selection
    –  Recombination, and mutation
    –  New individuals P'(t+1) are created
    –  Environmental selection chooses P(t+1) based on P(t) and P'(t+1)


Genetic algorithms

•  1975 - Holland
•  Binary encoded individuals
•  Roulette-wheel selection
•  1-point crossover
•  Bitwise mutations
•  Inversion
•  Schemata theory to explain the mechanism of how GAs work


Evolutionary programming

•  1965 – Fogel, Owens and Walsh
•  Evolution of finite automata
•  No distinction between genotype and phenotype
•  Focus on mutations
•  No crossover, usually
•  Tournament selection


Evolutionary strategies

•  1964 - Rechenberg, Schwefel
•  Optimization of real number vectors in difficult computational math problems
•  Floating point encoding of individuals
•  Mutation is the basic operator
•  The mutation step is heuristically controlled or undergoes an adaptation (evolving)
•  Deterministic environmental selection


Genetic programming

•  1992 – Koza
•  Evolution of individuals representing (LISP) trees
•  Used (not only) to evolve computer programs
•  Specific operators of crossover, mutation, initialization
•  Further applications (neuroevolution, evolving hw, …)


SIMPLE GENETIC ALGORITHM
Holland SGA, binary representation, operators and their variants


GA

•  Genetic algorithms – 70s USA, Holland, DeJong, Goldberg, …
•  The original proposal is nowadays called SGA (simple GA)
    –  Minimal set of operators, the simplest individual encoding, research of theoretical properties
•  Gradually, the SGA has been enriched by – or transformed to – further operators, encodings, ways of dealing with populations, etc.


SGA - basics

•  t = 0; generate at random an initial population P(0) of n l-bit genomes (individuals)
•  Step from P(t) to P(t+1):
    –  Compute f(x) for each x from P(t)
    –  Repeat n/2 times:
        •  Select a pair x, y from P(t)
        •  Crossover x, y with probability pC
        •  Mutate every bit of x and y with probability pM
        •  Insert x, y to P(t+1)
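The SGA step can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not Holland's original implementation: individuals are lists of bits, selection is fitness-proportional, and the helper names (`roulette`, `sga_step`, `one_max`), the OneMax test function and the parameter values are chosen here purely for illustration.

```python
import random

def roulette(population, fitnesses, rng):
    """Fitness-proportional choice of one individual (fitness values must be >= 0)."""
    total = sum(fitnesses)
    if total == 0:
        return rng.choice(population)
    pick = rng.uniform(0, total)
    running = 0.0
    for individual, fit in zip(population, fitnesses):
        running += fit
        if pick <= running:
            return individual
    return population[-1]

def sga_step(population, fitness, p_c, p_m, rng):
    """One step P(t) -> P(t+1); assumes an even population size and l >= 2."""
    fitnesses = [fitness(x) for x in population]
    next_population = []
    for _ in range(len(population) // 2):
        x = list(roulette(population, fitnesses, rng))
        y = list(roulette(population, fitnesses, rng))
        if rng.random() < p_c:                     # 1-point crossover
            point = rng.randrange(1, len(x))
            x[point:], y[point:] = y[point:], x[point:]
        for child in (x, y):                       # bitwise mutation
            for i in range(len(child)):
                if rng.random() < p_m:
                    child[i] ^= 1
            next_population.append(child)
    return next_population

def one_max(bits):
    """Illustrative fitness: the number of ones in the string."""
    return sum(bits)

rng = random.Random(0)
population = [[rng.randint(0, 1) for _ in range(20)] for _ in range(30)]
for _ in range(40):
    population = sga_step(population, one_max, p_c=0.7, p_m=1 / 20, rng=rng)
best = max(population, key=one_max)
```

The mutation rate `1 / 20` corresponds to changing one bit per individual on average for 20-bit individuals.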


Selection

•  Roulette wheel selection:
    –  Selection mechanism is based on the individual fitness value
    –  The expected number of selections of an individual should be proportional to the ratio of its fitness to the average fitness of the population
    –  Roulette wheel selection: each individual has an allocated slice of a roulette wheel corresponding to its fitness; the wheel is spun n times


Crossover

•  In GA, crossover is the main operator
•  It recombines the properties of the parents
•  We hope that recombination leads to better fitness
•  One-point crossover:
    –  choose a crossover point at random,
    –  swap the corresponding parts of the individuals
    –  Probability pC is typically in the range of tenths


Mutation

•  In simple GA, the mutation operator is less important; it acts as a mechanism against getting stuck in local extrema
•  (On the contrary, in EP or early ES, mutation is the only source of variability)
•  Bit-string mutation:
    –  With probability pM, every bit of the individual is changed
    –  pM is small (e.g. chosen so as to change 1 bit of an individual on average)


Inversion and other

•  The original Holland's SGA proposal contains another genetic operator – inversion
•  Inversion
    –  Reversing a part of the bit string
    –  BUT with keeping the meaning of bits
    –  More complicated technically
    –  Inspiration in nature
    –  Did not prove to be beneficial


SCHEMA THEORY
Schema theorem, building blocks hypothesis, implicit parallelism, k-armed bandit


Schemata

•  An individual is a word in the alphabet {0,1}
•  A schema is a word in the alphabet {0,1,*}
    –  (* = don't care)
•  A schema represents a set of individuals
•  A schema with r symbols * represents $2^r$ individuals
•  An individual of length m is represented by $2^m$ schemata
•  There are $3^m$ schemata of length m
•  In a population of n individuals there are between $2^m$ and $n \cdot 2^m$ schemata represented


Properties of schemata

•  Order of schema S: o(S)
    –  Number of 0s and 1s (fixed positions)
•  Defining length of schema S: d(S)
    –  Distance between the first and the last fixed position
•  Fitness of the schema S: F(S)
    –  Average fitness of the individuals in a population that correspond to the schema S
    –  Note that the fitness of S depends on the context of a population.
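The three properties translate directly into code. The following is a small sketch, assuming schemata and individuals are stored as Python strings over `0`, `1`, `*`; the function names are illustrative only.

```python
def matches(schema, individual):
    """True if the individual is an instance of the schema."""
    return len(schema) == len(individual) and all(
        s in ("*", b) for s, b in zip(schema, individual)
    )

def order(schema):
    """o(S): number of fixed (non-*) positions."""
    return sum(c != "*" for c in schema)

def defining_length(schema):
    """d(S): distance between the first and the last fixed position."""
    fixed = [i for i, c in enumerate(schema) if c != "*"]
    return fixed[-1] - fixed[0] if fixed else 0

def schema_fitness(schema, population, fitness):
    """F(S): average fitness of the schema's instances in this population."""
    members = [x for x in population if matches(schema, x)]
    if not members:
        return None   # the schema is not represented in the population
    return sum(fitness(x) for x in members) / len(members)
```

For example, the schema `1**0*` has order 2 and defining length 3, and `schema_fitness` returns a different value for a different population, which is the context dependence noted above.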


The schema theorem

•  Short (w.r.t. defining length), above-average (w.r.t. fitness), low-order schemata increase exponentially in successive generations of GA. (Holland)
•  Building blocks hypothesis:
    –  GA seeks a near-optimal solution of the given problem by recombination of short, low-order, above-average schemata (called building blocks).
    –  “just as a child creates magnificent fortress through arrangement of simple blocks of wood, so does a GA seek near optimal performance...”


Proof of TST

•  Population P(t), P(t+1), ... n individuals of length m
•  What happens to a particular schema S during:
    –  Selection
    –  Crossover
    –  Mutation
•  C(S,t) ... number of individuals representing schema S in population P(t)
•  We will estimate C(S,t+1) in three steps


Proof of TST

•  Selection:
    –  The probability of selecting an individual $v$ is $p_s(v) = F(v)/F(t)$, where $F(t) = \sum_{u \in P(t)} F(u)$
    –  Probability of selecting schema $S$: $p_s(S) = F(S)/F(t)$
    –  Thus: $C(S,t+1) = C(S,t)\, n\, p_s(S)$
    –  Or equivalently: $C(S,t+1) = C(S,t)\, F(S)/\bar{F}(t)$, where $\bar{F}(t) = F(t)/n$ is the average fitness in $P(t)$

Proof of TST

•  ... Still selection:
    –  So, we have: $C(S,t+1) = C(S,t)\, F(S)/\bar{F}(t)$
    –  If the schema stays above average by a fraction $e$:
    –  $F(S,t) = \bar{F}(t) + e\,\bar{F}(t)$, for $t = 0, 1, \dots$
    –  $C(S,t+1) = C(S,t)(1+e)$
    –  $C(S,t) = C(S,0)(1+e)^t$
    –  I.e. the number of individuals representing an above-average schema grows exponentially (in consecutive populations, and with selection only).

Proof of TST

•  Crossover:
    –  Probability that a schema will be destroyed / will survive a crossover:
    –  $p_d(S) = d(S)/(m-1)$, $p_s(S) = 1 - d(S)/(m-1)$
    –  Crossing over with probability $p_c$:
    –  $p_s(S) \ge 1 - p_c\, d(S)/(m-1)$
•  Selection and crossover together:
    –  $C(S,t+1) \ge C(S,t)\, \frac{F(S)}{\bar{F}(t)} \left[ 1 - p_c \frac{d(S)}{m-1} \right]$

Proof of TST

•  Mutation:
    –  1 bit will not survive: $p_m$
    –  1 bit will survive: $1 - p_m$
    –  A schema will survive ($p_m \ll 1$):
    –  $p_s(S) = (1 - p_m)^{o(S)}$
    –  $p_s(S) \approx 1 - p_m\, o(S)$, for small $p_m$
•  Selection, crossover and mutation together:
•  $C(S,t+1) \ge C(S,t)\, \frac{F(S)}{\bar{F}(t)} \left[ 1 - p_c \frac{d(S)}{m-1} - p_m\, o(S) \right]$
•  QED.
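To get a feeling for the final inequality, the lower bound can be evaluated numerically. The sketch below only computes the right-hand side of the combined bound; the function name and the sample numbers are made up for illustration.

```python
def schema_growth_bound(count, schema_fitness, avg_fitness, d, o, m, p_c, p_m):
    """Lower bound on C(S, t+1) from the schema theorem.

    count          C(S, t), number of instances of S in P(t)
    schema_fitness F(S)
    avg_fitness    average fitness of P(t)
    d, o           defining length and order of S
    m              length of the individuals
    """
    survival = 1 - p_c * d / (m - 1) - p_m * o
    return count * schema_fitness / avg_fitness * survival

# A short schema (d = 1) that is 20 % above average is guaranteed to grow ...
short = schema_growth_bound(10, 1.2, 1.0, d=1, o=3, m=11, p_c=0.7, p_m=0.01)
# ... while a long one (d = 8) with the same fitness gets no such guarantee.
long_ = schema_growth_bound(10, 1.2, 1.0, d=8, o=3, m=11, p_c=0.7, p_m=0.01)
```

With these numbers the bound is about 10.8 for the short schema and about 4.9 for the long one, which is why the theorem speaks of short, low-order schemata.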


Consequences of TST and BBH

•  Encoding matters
•  Size matters
•  Premature convergence harms
•  When GA sucks:
    –  (111*******), (********11) are above-average
    –  But F(111*****11) << F(000*****00)
    –  Ideal is (1111111111); GA has a hard time finding it
    –  The selection condition might be improved


Implicit parallelism

•  GA works with individuals, but implicitly it evolves many more schemata: $2^m$ to $n \cdot 2^m$.
•  But how many schemata are processed efficiently:
    –  Holland (and others): (under certain circumstances, such as n = 2m, schemata stay above-average, ...) the number of schemata that really grow exponentially is in the order of $n^3$.
•  It was jokingly commented as the only case where combinatorial explosion is on our side.


Exploration vs. Exploitation

•  Holland's original motivation: GA is an “adaptive plan” looking for an equilibrium between:
    –  exploration (finding new areas for search)
    –  exploitation (utilizing current knowledge)
•  Just exploration: random walks, not utilizing previous knowledge
•  Just exploitation: getting stuck in local optima, rigidity


1-armed bandit


2-armed bandit

•  N coins, 2-armed bandit (the arms' payoffs have expected values m1, m2 and variances s1, s2). N-n coins are allocated to the better arm, n coins to the worse one.
•  Goal: to maximize the outcome / to minimize the loss.
•  Analytical solution: to allocate exponentially more trials to the currently winning arm
•  $N - n^* = O(\exp(c\, n^*))$;
    –  c depends on m1, m2, s1, s2; and $n^*$ is the optimal value


Bandit and SGA

•  GA also allocates exponentially more trials (slots in population) to the more successful schemata
•  It thus solves the exploration vs. exploitation problem in the optimal way
•  Schemata play many multi-armed bandit games
    –  The winning prize is the number of slots in the population
    –  It is hard to estimate the fitness of a schema
    –  At first people thought that SGA plays a $3^m$-armed bandit,
    –  where all schemata are competing arms…


…but it's complicated

•  Actually, many more games are played in parallel
•  Schemata “compete” for “conflicting” fixed positions in a gene
•  Schemata of order k always compete for those k fixed positions – they play a $2^k$-armed bandit
•  So, the best of those games get exponentially many slots in the population
•  But it depends on whether we can estimate the fitness of a schema in a particular population well (which can be a problem)


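Thus, a bad task for SGA is...

$$f(x) = \begin{cases} 2 & \text{for } x \sim 111{*}\ldots{*} \\ 1 & \text{for } x \sim 0{*}\ldots{*} \\ 0 & \text{otherwise} \end{cases}$$

•  For schemata we now have:
    –  F(1*...*) = 1/2;
    –  F(0*...*) = 1
•  But the SGA estimates F(1*...*) ~ 2,
•  because instances of the schema 111*...* will be much more common in a population
•  SGA here does not sample schemata independently, so it does not estimate their real fitness.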
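The gap between the real (static) schema fitness and the fitness observed in a biased population can be checked by brute force. This is a sketch for 6-bit strings; the length, the helper names and the sample population are assumptions made for the illustration.

```python
from itertools import product

def f(x):
    """The 'evil' fitness function from the slide, for bit strings."""
    if x.startswith("111"):
        return 2
    if x.startswith("0"):
        return 1
    return 0

def matches(schema, x):
    return all(s in ("*", b) for s, b in zip(schema, x))

def static_fitness(schema):
    """Average of f over ALL instances of the schema (uniform sampling)."""
    strings = ("".join(bits) for bits in product("01", repeat=len(schema)))
    values = [f(x) for x in strings if matches(schema, x)]
    return sum(values) / len(values)

def observed_fitness(schema, population):
    """Average of f over the instances present in a given population."""
    values = [f(x) for x in population if matches(schema, x)]
    return sum(values) / len(values)

# A population after some convergence: the prefix 111 dominates.
converged = ["111000"] * 9 + ["100000"]
```

Here `static_fitness("1*****")` is 0.5 and `static_fitness("0*****")` is 1.0, yet `observed_fitness("1*****", converged)` is 1.8, so the schema 1*...* looks better than 0*...* to the GA.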


Problems

•  The arms in the bandit are independent, but the SGA does not sample schemata independently
•  Selection does not work ideally, as in the TST; it is dynamic and it has statistical errors.
•  SGA maximizes its on-line performance, so it should be suitable for adaptive tasks (it is a pity to stop a running SGA ;-)
•  (Paradoxically, maybe) the most common application of GAs is to let them “only” find the one best solution.


Static BBH

•  Grefenstette, 91: People assume that a GA converges to solutions with high actual static average fitness; and not (as it really happens) to those that exist in populations, i.e. with the best observed fitness
•  Then, people can be disappointed:
    –  Collateral convergence
    –  Large fitness variance


Collateral convergence

•  When the GA converges somewhere, the schemata are no longer sampled uniformly, but with a bias
•  If, e.g., a schema 111***...* is good, it will spread in a population after a few generations, i.e. almost all individuals will have this prefix.
•  But then, almost all samples of a schema ***000...* are also samples of a schema 111000*...*.
•  Thus, the GA will not estimate F(***000*...*) correctly.


Large fitness variance

•  The GA will not estimate the fitness of a schema well if its static average fitness has a large variance.
•  Such as the schema 1*...* from our evil example.
•  The variance of its fitness is large, so the GA will probably converge to those parts of the search space where the fitness is big.
•  Which in turn will bias further sampling of the schema. So, the static fitness is not estimated well, again.


REPRESENTATION AND OPERATORS
Integer and floating point representations, operators, selection

ROMAN NERUDA: EVA1 - 2014/15

Encoding

•  Binary
    –  Classic (Holland)
    –  There are nice theoretical results (better than schemata theory, we will see next semester)
    –  Holland's argument: binary strings of length 100 are better than decimal strings of length 30 because they encode roughly the same information but have more schemata ($2^{100} > 2^{30}$).
    –  But we know schemata are not as important as Holland thought
    –  The important factor is that binary encoding is sometimes unnatural for a given problem.


Other encodings

•  Alphabets with more symbols
•  Integers
•  Floating point
•  Yet other examples:
    –  Permutations,
    –  Trees (programs),
    –  Matrices,
    –  Neural networks (different ways),
    –  Finite automata,
    –  Graphs,
    –  A-life agents…


Selection - overview

•  Roulette-wheel selection
    –  traditional, fitness-proportional
•  SUS (stochastic universal sampling)
    –  Just one random position on the roulette wheel; the other positions are obtained by shifting it by multiples of 1/n of the wheel
    –  „more fair roulette“ – why?
•  Tournament
    –  k-tournament – compare k randomly selected individuals, the winner is selected
    –  Typically, k is a small number, like 2, 3, 5
    –  Can be used in cases where fitness is not explicitly given (a game is played, or a simulation is involved)
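Both SUS and tournament selection are short enough to sketch. The code below is an illustration under the assumption of non-negative fitness values with a positive sum; the function names are hypothetical.

```python
import random

def sus(population, fitnesses, n, rng):
    """Stochastic universal sampling: one spin, n equally spaced pointers."""
    step = sum(fitnesses) / n
    start = rng.uniform(0, step)            # the single random position
    selected, i, running = [], 0, fitnesses[0]
    for j in range(n):
        pointer = start + j * step
        # Advance to the individual whose slice contains the pointer.
        while running < pointer and i < len(population) - 1:
            i += 1
            running += fitnesses[i]
        selected.append(population[i])
    return selected

def tournament(population, fitness, k, rng):
    """k-tournament: the best of k randomly chosen individuals wins."""
    return max(rng.sample(population, k), key=fitness)

picked = sus(["a", "b", "c"], [1, 1, 2], 4, random.Random(0))
```

With fitness values 1, 1, 2 and n = 4 the expected numbers of copies are 1, 1, 2, and SUS returns exactly that. In general each individual receives either the floor or the ceiling of its expected count, which is one way to answer the "why more fair" question.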


Integer encoding

•  Mutation:
    –  „unbiased“ – new random value from the whole domain
    –  „biased“ – new value represents a random shift (normal distribution) from the original value
•  Crossover:
    –  One-point, multiple-point, …
    –  Uniform – in every gene we throw a coin to decide from which parent the value is chosen
    –  Beware of ordinal representations in cases where the order does not make sense (then, probably, the biased mutation does not make sense)


Floating point encoding

•  Historically, the first attempts were encoding real numbers into bit-string representations
•  Not used often today, except for the cases when a limited precision makes good sense (compression of a search space, explicit control over the accuracy of the representation)
•  Common practice today is to encode real values in a floating point representation, and the operators take this into account


Floating point operators

•  Mutation
    –  Biased
    –  Unbiased
•  Crossover
    –  Structural
        •  One-point, uniform, ...
    –  Arithmetic
        •  Combination of values


Arithmetic crossover

•  Simple average of the parents' values
•  Variants:
    –  Some other convex combination:
        •  z = a*x + (1-a)*y, where 0 < a < 1
    –  How many values from an individual to cross:
        •  Typically all of them
        •  Sometimes just one chosen at random
        •  Sometimes a combination with 1-point crossover


EVOLUTION OF COOPERATION
Prisoners and their dilemma, Nash, von Neumann, Axelrod, Dawkins


Altruism vs. darwinism?

•  Darwinism is inherently competitive – survival of the fittest
    –  social darwinism – backing the laissez-faire („let it be“) capitalism
    –  Andrew Carnegie, The Gospel of Wealth, 1900: While the law of competition may be sometimes hard for the individual, it is best for the race, because it ensures the survival of the fittest in every department. We accept and welcome, therefore, as conditions to which we must accommodate ourselves, great inequality of environment; the concentration of business, industrial and commercial, in the hands of the few; and the law of competition between these, as being not only beneficial, but essential to the future progress of the race.
•  But there is a lot of cooperation both in nature and society
•  The main problem of evolutionary (social) biology:
•  How can altruistic behavior be evolved, when it (by definition) decreases the fitness of an individual?


Theories of evolution of altruism

•  Group selection
    –  Evolution can work on groups of individuals (Darwin)
    –  How to explain individuals who cheat and do not help
•  Kin selection
    –  Preservation of almost identical genes in close relatives
    –  How to explain altruism towards strangers, even other species
•  Dawkins, selfish gene
    –  The unit of evolution is a gene, not an individual
    –  Wilson: „the organism is only DNA's way of making more DNA.“
•  Trivers, 1971: reciprocal altruism
    –  Mutual benefits for both organisms (even of different species)
    –  Shadow of the future, parallel with the iterated prisoner's dilemma


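Prisoner's dilemma

(C = cooperate, D = defect; columns: action of agent i, rows: action of agent j; entries: payoff of i / payoff of j)

i/j    D      C
D      2/2    0/5
C      5/0    3/3

i/j    D      C
D      P/P    S/T
C      T/S    R/R

•  Temptation > Reward > Penalty > Sucker's payoff
•  R > P: mutual cooperation is better than mutual defection
•  T > R and P > S: defection is a dominant strategy for both players
•  (50s - RAND corp.)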

Nash

•  A strategy s is dominant for agent i if it gives a better or the same result than any other strategy of agent i against all strategies of agent j
•  Strategies si and sj are in Nash equilibrium, if:
    –  If agent i plays strategy si, agent j does best with strategy sj
    –  If j plays sj, i does best with si
•  Or, si and sj are the best mutual answers to each other
•  This is a Nash equilibrium of pure strategies
•  But not every game has a Nash equilibrium in pure strategies
•  And some games have more Nash equilibria


Nash and Pareto

•  Mixed strategies – random selection among pure strategies
    –  Nash theorem: every game with a finite number of strategies has a Nash equilibrium in mixed strategies.
•  The solution is Pareto-optimal / efficient
    –  if there is no other strategy which would improve one agent's outcome without worsening some other agent's outcome
    –  The solution is not Pareto-efficient if the outcome of one agent can be improved without decreasing another agent's outcome


Thus…

•  For rational agents there is no dilemma / or is there?
    –  DD is a Nash equilibrium
    –  DD is the only solution that is not Pareto-optimal
    –  CC is a solution maximizing the common outcome
•  Tragedy of the commons
•  What is rational, and are people rational?
•  Shadow of the future – iterated version – Axelrod


Iterated prisoner's dilemma

•  Players play more games, they remember the results / actions of the opponent, and can modify their strategies according to the history
•  T > R > P > S,
•  2R > T + S – it does not pay off to alternate C and D
•  If the game is played N times (and the players know N), it can be proved by induction that the best strategy is „defect all the time“.


Axelrod tournaments

•  The first tournament:
    –  14 strategies plus RANDOM, 200 games, everybody played with everybody (including itself), repeated 5x
•  TFT = Tit For Tat strategy
    –  Start with cooperation, then copy the opponent's moves
•  The second tournament:
    –  62 strategies – everybody knew the results of the previous tournament – TFT wins again
•  The third „ecological“ tournament
    –  Resembling the generations of a GA; the initial population was the second tournament strategies, there were 1000 generations
    –  The number of individuals in the next generation was proportional to the number of victories in the previous generation
    –  Aaaaand, the TFT wins again!
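A single match of the iterated game is easy to simulate. The following sketch uses the payoffs from the prisoner's dilemma slide (T = 5, R = 3, P = 2, S = 0) and 200 rounds as in the first tournament; the function names and the strategy interface are assumptions of this example, not Axelrod's original code.

```python
# Payoff of "me" for (my move, opponent's move); C = cooperate, D = defect.
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 2}

def tit_for_tat(my_history, opponent_history):
    """Start with cooperation, then copy the opponent's last move."""
    return opponent_history[-1] if opponent_history else "C"

def all_d(my_history, opponent_history):
    """Always defect."""
    return "D"

def play(strategy_a, strategy_b, rounds=200):
    """Return the total scores of both strategies after `rounds` games."""
    history_a, history_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = strategy_a(history_a, history_b)
        move_b = strategy_b(history_b, history_a)
        score_a += PAYOFF[(move_a, move_b)]
        score_b += PAYOFF[(move_b, move_a)]
        history_a.append(move_a)
        history_b.append(move_b)
    return score_a, score_b
```

TFT never beats its direct opponent (it loses 398 to 403 against ALL-D), but it collects 600 points against itself while ALL-D against itself gets only 400. This hints at why it does well in a round-robin tournament.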


What does it mean for strategies?

•  4 important properties of successful strategies:
    –  Niceness – do not defect first
    –  Provocability – quickly punish defection
    –  Forgiveness – but quickly calm down
    –  Clarity – be simple, so others understand you
•  There is not a single strategy that would win against all strategies
•  It is necessary to be successful against very diverse strategies (ALL-D, TFTT, RANDOM, TRIGGER)
•  It is also good to learn to play well against oneself
•  Attempts to beat TFT by more defection did not help


What does it mean for cooperation?

•  In environments that support cooperation…
    –  Payoffs favor cooperation,
    –  There is a big probability of iterated PD (shadow of the future)
•  …the cooperation is usually evolved
    –  But not always, such as in the ALL-D world
•  Rationality, intelligence, consciousness, … are not necessary for cooperation, just bigger fitness values
•  Initial cooperation can emerge at random, and then it can survive


Twenty years after

•  In environments with noise, the Pavlov strategy (win-stay, lose-shift) is successful
    •  If the payoff was R or P => C,
    •  if T or S => D
•  After 20 years the tournament was repeated with more strategies from each team
•  The winning strategies were cooperating as a team
•  A few moves (10) to recognize the opponent, then all strategies helped one „father“ strategy from the team to get a better score
•  The teams were even fighting the organizers (false teams to get more slots in the tournament…)


EVOLUTIONARY STRATEGIES
Motivation, population cycle, floating point mutations, meta-evolution


Evolutionary strategies

•  Rechenberg, Schwefel, 60s
•  Optimization of a real function of many parameters
•  'evolution of evolution'
•  Evolved individual:
    –  Genetic parameters - affecting the behavior
    –  Strategic parameters - affecting evolution
•  A new individual is accepted only if it is better
•  More individuals as parents
•  Today's most successful (and complex) variant is CMA-ES (covariance matrix adaptation ES)


ES notation

•  Important parameters:
    –  M number of individuals in the population
    –  L number of new individuals
    –  R number of 'parents'
•  Special selection-related notation:
    –  (M+L) ES – M individuals for the new generation are selected from the M+L old and new individuals
    –  (M,L) ES – M individuals for the new generation are selected only from the L new individuals
•  Usually, the (M,L) strategies are more robust – less prone to getting stuck in local optima
•  The individual: C(i) = [Gn(i), Sk(i)], k = 1, or n, or 2n


ES population cycle

•  n = 0; initialize at random a population Pn of M individuals
•  Evaluate the fitness values of individuals in Pn
•  Until the solution is good enough:
    –  Repeat L times:
        •  Choose R parents,
        •  Cross them over, mutate, evaluate the new individual
    –  Choose M new individuals (depending on the ES type)
    –  ++n
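The population cycle can be sketched as follows. This is a simplified illustration, not a reference ES: it assumes minimization, k = 1 (one standard deviation per individual), R = 2 parents with arithmetic crossover, and a log-normal update of the standard deviation, which is one common choice. The function name, the learning rate `tau` and the initialization range are assumptions of this example.

```python
import math
import random

def evolution_strategy(fitness, n, M, L, plus, generations, rng):
    """(M+L)-ES if `plus` is True, otherwise (M,L)-ES (needs L >= M >= 2)."""
    # Individual = (genetic parameters G, one strategic parameter sigma).
    population = [([rng.uniform(-5, 5) for _ in range(n)], 1.0) for _ in range(M)]
    tau = 1 / math.sqrt(n)                  # assumed learning rate for sigma
    best_per_generation = []
    for _ in range(generations):
        offspring = []
        for _ in range(L):
            (g1, s1), (g2, s2) = rng.sample(population, 2)     # R = 2 parents
            sigma = (s1 + s2) / 2 * math.exp(tau * rng.gauss(0, 1))
            genes = [(a + b) / 2 + sigma * rng.gauss(0, 1) for a, b in zip(g1, g2)]
            offspring.append((genes, sigma))
        # Environmental selection depends on the ES type.
        pool = population + offspring if plus else offspring
        population = sorted(pool, key=lambda ind: fitness(ind[0]))[:M]
        best_per_generation.append(fitness(population[0][0]))
    return population[0], best_per_generation

def sphere(genes):
    return sum(x * x for x in genes)

(best_genes, best_sigma), history = evolution_strategy(
    sphere, n=3, M=5, L=35, plus=True, generations=50, rng=random.Random(0)
)
```

With `plus=True` the best fitness can never get worse from one generation to the next, because the parents compete with their offspring; with `plus=False` it can, which is the price paid for the better ability to leave local optima.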


ES individual and mutation

•  C(i) = [Gn(i), Sk(i)]
•  Sk are standard deviations of biased floating point mutations
•  k = 1:
    –  One common std dev for all evolved parameters G
•  k = n:
    –  Non-correlated mutations, n individual normal distributions
    –  Each parameter has its own std dev
    –  Geometrically, the mutations are within an ellipse parallel to the axes
•  k = 2n:
    –  Rotations are also included, the ellipse is not parallel to the axes
    –  Correlated mutations, they correspond to mutations from an n-dimensional normal distribution
    –  n parameters for rotations, n for std devs, 2n in total


ES mutations

•  Genetic parameters:
    –  Adding a random number from a normal distribution with the corresponding deviation, and rotation, respectively
•  Standard deviations:
    –  Increase or decrease according to the success of the mutation
    –  Originally, the so-called 1/5 rule (heuristic, „the best case is when the mutation has a 20% success rate“; thus, the std dev is increased when the success rate is higher, and decreased when it is lower)
    –  More common now is to add a random number drawn from N(0,1)
•  Rotation:
    –  Add a random number drawn from N(0,1)


ES crossover

•  Uniform
•  „Gangbang“ of more parents
    –  Local (R = 2)
    –  Global (R = M)
•  Two versions:
    –  Discrete
    –  Arithmetic (average)


DIFFERENTIAL EVOLUTION
Alternative, geometrically motivated EVA


DE – scheme and initialization

•  Initialization: random parameter values
•  Mutation: „shift“ according to the others
•  Crossover: uniform „with a safeguard“
•  Selection: comparison and possible replacement by a better offspring



Mutation

•  Every individual in a population undergoes mutation, crossover, and selection
•  For an individual $x_{i,p}$ we choose three different individuals $x_{a,p}$, $x_{b,p}$, $x_{c,p}$ at random
•  Define a donor $v$: $v_{i,p+1} = x_{a,p} + F \cdot (x_{b,p} - x_{c,p})$
•  $F$ is a mutation parameter, a value from the interval $[0, 2]$


Crossover

•  Uniform crossover of the original individual with a donor
•  Parameter $C$ controls the probability of a change
•  At least one element must come from the donor
•  Trial vector $u_{i,p+1}$:
•  $u_{j,i,p+1} = v_{j,i,p+1}$ iff $rand_{ji} \le C$ or $j = I_{rand}$
•  $u_{j,i,p+1} = x_{j,i,p}$ iff $rand_{ji} > C$ and $j \ne I_{rand}$
•  $rand_{ji}$ is a pseudorandom number from $[0, 1]$
•  $I_{rand}$ is a pseudorandom integer from $\{1, 2, \dots, D\}$


Selection

•  Compare the fitness of $x$ and $u$, select the better one:
    –  $x_{i,p+1} = u_{i,p+1}$ iff $f(u_{i,p+1}) \le f(x_{i,p})$
    –  $x_{i,p+1} = x_{i,p}$ otherwise
    –  for $i = 1, 2, \dots, N$
•  Mutation, crossover, and selection are repeated until some termination criterion is satisfied (typically, the fitness of the best individual is good enough)
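One generation of DE, following the three slides above (donor, trial vector, selection), might look like this. It is a minimal sketch assuming minimization, individuals stored as lists of floats and a population of at least four individuals; the function name `de_step` and the parameter values are chosen for illustration.

```python
import random

def de_step(population, f, F, C, rng):
    """One generation of differential evolution (minimization)."""
    D = len(population[0])
    new_population = []
    for i, x in enumerate(population):
        others = [p for j, p in enumerate(population) if j != i]
        a, b, c = rng.sample(others, 3)                    # three different individuals
        donor = [a[j] + F * (b[j] - c[j]) for j in range(D)]
        i_rand = rng.randrange(D)                          # the "safeguard" position
        trial = [
            donor[j] if rng.random() <= C or j == i_rand else x[j]
            for j in range(D)
        ]
        # Selection: the trial vector replaces x only if it is not worse.
        new_population.append(trial if f(trial) <= f(x) else x)
    return new_population

def sphere(x):
    return sum(v * v for v in x)

rng = random.Random(1)
population = [[rng.uniform(-5, 5) for _ in range(3)] for _ in range(10)]
before = [sphere(x) for x in population]
for _ in range(50):
    population = de_step(population, sphere, F=0.8, C=0.9, rng=rng)
after = [sphere(x) for x in population]
```

Because every slot of the population is only ever replaced by a vector that is at least as good, no individual's fitness gets worse over time.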


PARTICLE SWARM OPTIMIZATION
Individual is a particle floating in a swarm in the fitness landscape


PSO

•  Population-based search heuristic
•  Eberhart, Kennedy, 1995
•  Inspired by swarms of insects / fish
•  An individual is typically a floating point vector
•  It is called a particle
•  No crossover
•  No mutation as we know it
•  Individuals are moving in a swarm through their parameter space
•  The algorithm is using local and global memory:
    –  pBest – each particle remembers the position with its best fitness
    –  gBest – the best pBest among all particles


PSO algorithm

•  Initialize each particle
•  Do
    •  For each particle
        •  Compute the fitness of the particle
        •  If the fitness is better than the best fitness seen so far (pBest)
            •  pBest := current position (and its fitness)
    •  End
    •  Set gBest to the best pBest
    •  For each particle
        •  Compute the speed of the particle by equation (a)
        •  Update the position of the particle by equation (b)
    •  End
•  While maximum iterations or minimum error not satisfied


PSO movement equations

•  v := v + c1 * rand() * (pbest - present) + c2 * rand() * (gbest - present)   (a)
•  present := present + v   (b)
•  v is the particle speed, present is the particle position
•  pbest is the best position of the particle in history
•  gbest is the best global position in history
•  rand() is a random number from (0,1).
•  c1, c2 are constants (learning rates), often c1 = c2 = 2.
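Equations (a) and (b) combined with the algorithm outline give the following sketch. It assumes minimization and adds a velocity clamp `v_max`, which is not on the slides but is a common way to keep the swarm from exploding with c1 = c2 = 2; the function name, the initialization range and the default values are assumptions of this example.

```python
import random

def pso(f, dim, n_particles, iterations, rng, c1=2.0, c2=2.0, v_max=1.0):
    """Basic particle swarm optimization (minimization), iterations >= 1."""
    xs = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(n_particles)]
    vs = [[0.0] * dim for _ in range(n_particles)]
    pbest = [x[:] for x in xs]
    pbest_f = [f(x) for x in xs]
    gbest, gbest_f = None, float("inf")
    for _ in range(iterations):
        for i, x in enumerate(xs):                 # update local memory
            fx = f(x)
            if fx <= pbest_f[i]:
                pbest[i], pbest_f[i] = x[:], fx
        best = min(range(n_particles), key=lambda i: pbest_f[i])
        gbest, gbest_f = pbest[best][:], pbest_f[best]     # update global memory
        for x, v, p in zip(xs, vs, pbest):
            for d in range(dim):
                v[d] += (c1 * rng.random() * (p[d] - x[d])          # equation (a)
                         + c2 * rng.random() * (gbest[d] - x[d]))
                v[d] = max(-v_max, min(v_max, v[d]))                # assumed clamp
                x[d] += v[d]                                        # equation (b)
    return gbest, gbest_f

def sphere(x):
    return sum(v * v for v in x)

gbest, gbest_f = pso(sphere, dim=2, n_particles=15, iterations=100, rng=random.Random(0))
```

Note that information flows only through `pbest` and `gbest`: a particle never reads the current position of another particle.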


PSO discussion

•  Common with GA:
    –  Start with a random configuration, have a fitness, use stochastic update methods
•  Different from GA:
    –  No genetic operators
    –  Particles have memories
    –  The exchange of information goes only from the better particles to the rest


EVOLUTIONARY MACHINE LEARNING
Michigan vs. Pittsburgh, machine learning, reinforcement learning


Machine learning – a subset

•  Learn rules based on the training examples
    –  Data mining
    –  Expert systems
    –  Agent, robot learning (reinforcement learning)
•  Basic evolutionary approaches:
    –  Michigan (Holland): an individual is one rule
        •  Holland LCS: learning classifier systems
    –  Pittsburgh: an individual is a set of rules


Michigan

•  Holland in the 80s: learning classifier systems
•  The individual is a rule
•  The whole population works as an expert or control system
•  The rules are simple:
    –  Left-hand side: feature is true / not / don't care (0/1/*)
    –  Right-hand side: action code or classification category
•  Rules have weights (reflecting their success)
•  The weight makes their fitness
•  The evolution does not have to be generational

Michigan - LCS

•  Evolution happens only from time to time and/or on a part of the population
•  The problem of reactiveness (lack of inner memory)
    –  The right-hand side of the rule contains – besides the action / classification code – other inner features, called „messages“
    –  The left-hand side of the rule has special features to intercept the messages, called „receptors“
    –  The system has a buffer of messages and it has to realize an algorithm to distribute a reward among chains of rules


LCS – bucket brigade

•  Only some rules lead to actions that trigger a reward from the environment,
•  The reward should be distributed to the chain of successful rules leading to the reward
•  Rules have to give up part of their strength (like paying money to take part in the action) if they compete for a chance to be applied
•  The technical way it is done is called the bucket brigade algorithm
•  In practice it is difficult to balance the economy of rules; hardly used today



[Figure: LCS architecture. Environment; detectors and effectors; matching rules and action set; population of rules with GA and bucket brigade; message buffer (external states | internal messages | actions); action selection; reward.]

Z(ero)CS

•  (Wilson, 1994) simplified LCS
    –  No internal messages
    –  No complicated mechanism of reward redistribution
•  Rules are just bitmap (and *) representations:
    –  IF (inputs) THEN (outputs)
•  Cover operator:
    –  If there is no rule for the current situation / example, it is generated ad hoc
    –  Some * are added at random and a random output is selected


ZCS contd.

•  How the reward is distributed / the strength of rules is modified:
    –  Rules not applicable to the given situation: nothing
    –  Rules applicable to the input but with a different output: decrease the strength by multiplying by a constant 0 < T < 1
    –  All rules' strengths are decreased by a small constant B
    –  This amount is distributed uniformly among the rules that answered correctly in the previous step (decreased by a factor 0 < G < 1)
    –  Finally, the answer of the system is decreased by B and uniformly distributed among the rules that answered correctly in this step


XCS – improved ZCS

•  Cons of ZCS:
    –  ZCS does not tend to evolve a complete rule system covering all cases
    –  Rules at the beginning of the chains are seldom rewarded and they do not survive
    –  Rules leading to actions with small rewards can die off too, although they are important
•  XCS:
    –  Separate fitness from the expected outcome / reward of the rule
    –  Base fitness on the specificity of the rule


Pitt

•  Individuals are sets of rules, complete systems
•  The evaluation is more complicated
    –  Rule priorities, conflicts
    –  False positives, false negatives
•  Genetic operators are more complicated
    –  Typically, a dozen or more operators working on sets of rules, individual rules, terms in the rules, ...
•  Emphasis on rich domain representation (sets, enumerations, intervals, ...)


GIL, example of the Pitt approach

•  Binary classification tasks
•  The individual classifies implicitly to one class (no right-hand side of the rules)
•  Each individual is a disjunction of complexes
•  A complex is a conjunction of selectors (from 1 variable)
•  A selector is a disjunction of values from the variable domain
•  Representation by a bitmap:
•  ((X=A1) AND (Z=C3)) OR ((X=A2) AND (Y=B2))
•  [001|11|0011 OR 010|10|1111]


GIL contd.

•  Operators on the individual level:
    –  Swap of rules, copy of rules, generalization of a rule, deletion of a rule, specialization of a rule, inclusion of one positive example into the rule
•  Operators on the complex level:
    –  Split of a complex on 1 selector, generalization of a selector (replacing by 11...1), specialization of a generalized selector, inclusion of one negative example
•  Operators on selectors:
    –  Mutation 0<->1, extension 0->1, reduction 1->0


MULTI-OBJECTIVE OPTIMIZATION
Multi-Objective Evolutionary Algorithms (MOEA), Pareto front, NSGA II


Problem

•  Instead of one fitness (objective function), there is a vector of them $f_i$, $i = 1 \dots n$
•  For the sake of simplicity, we consider the minimization case, so we try to achieve minimal values of all $f_i$, which is difficult
•  Definitions of dominance (of an individual, or a solution):
    –  Individual $x$ weakly dominates individual $y$ iff $f_i(x) \le f_i(y)$ for $i = 1 \dots n$
    –  $x$ dominates $y$ iff it weakly dominates it, and there exists $j$: $f_j(x) < f_j(y)$
    –  $x$ and $y$ are incomparable when neither $x$ dominates $y$, nor $y$ dominates $x$
    –  $x$ does not dominate $y$ if either $y$ weakly dominates $x$, or they are incomparable


Pareto front

•  The Pareto front is the set of individuals not dominated by any other individual
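The dominance definitions and the Pareto front can be written down almost literally. This is a small sketch for the minimization case, assuming each individual is represented directly by its tuple of objective values; the function names are illustrative.

```python
def weakly_dominates(fx, fy):
    """fx is at least as good as fy in every objective (minimization)."""
    return all(a <= b for a, b in zip(fx, fy))

def dominates(fx, fy):
    """fx weakly dominates fy and is strictly better in some objective."""
    return weakly_dominates(fx, fy) and any(a < b for a, b in zip(fx, fy))

def pareto_front(points):
    """All points that are not dominated by any other point."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

points = [(1, 4), (2, 2), (4, 1), (3, 3), (4, 4)]
front = pareto_front(points)
```

For these five points the front consists of (1, 4), (2, 2) and (4, 1); the point (3, 3) is dominated by (2, 2), and (4, 4) is dominated by (2, 2), (4, 1) and (3, 3), while it is only weakly dominated by (1, 4)... which, being strictly better in the first objective, dominates it as well.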


The simple way

•  How to solve MOEA in a simple (simplistic?) way:
•  Aggregate the fitness:
    –  i.e. a weighted sum of all $f_i$, resulting in one value of $f$
    –  And solve it as a standard one-objective optimization
    –  This one is sometimes, in the context of MOEA, called SOEA (single objective EA), but it is nothing new to us, actually we were doing only SOEA so far
•  Nevertheless, we do not know how to set the weights for the individual $f_i$.


VEGA (Vector Evaluated GA)

•  One of the first MOEAs, 1985
•  Idea:
    –  A population of N individuals is sorted according to each of the n objective functions
    –  For each i we select the N/n best individuals w.r.t. $f_i$
    –  These are crossed over, mutated and selected to the next generation
•  This approach, in fact, has lots of disadvantages:
    –  It is difficult to preserve the diversity of the population
    –  It tends to converge to optimal solutions for the individual objectives $f_i$


NSGA (non-dominated sorting GA)

•  1994, an idea of dominance is used for fitness
•  This still does not guarantee a sufficient spread of the population, it must be dealt with in some other way (niching)
•  Algorithm:
    –  Population P is divided into consecutively constructed fronts F1, F2, ...
        •  F1 is the set of all non-dominated individuals from P
        •  F2 is the set of all non-dominated individuals from P - F1
        •  F3 … from P - (F1 ∪ F2)
        •  ...
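The division into fronts can be sketched by repeatedly peeling off the non-dominated individuals. This is the straightforward version for the minimization case, assuming individuals are given as tuples of objective values; it makes no attempt at the faster bookkeeping used in later algorithms, and the function names are illustrative.

```python
def dominates(fx, fy):
    """Minimization: fx is nowhere worse and somewhere strictly better than fy."""
    return all(a <= b for a, b in zip(fx, fy)) and any(a < b for a, b in zip(fx, fy))

def non_dominated_fronts(points):
    """Split the population into fronts F1, F2, ... as on the slide."""
    remaining = list(points)
    fronts = []
    while remaining:
        front = [p for p in remaining if not any(dominates(q, p) for q in remaining)]
        fronts.append(front)
        remaining = [p for p in remaining if p not in front]
    return fronts

fronts = non_dominated_fronts([(1, 4), (2, 2), (4, 1), (3, 3), (4, 4)])
```

For this population the result is F1 = {(1, 4), (2, 2), (4, 1)}, F2 = {(3, 3)} and F3 = {(4, 4)}.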


NSGA contd.

•  For each individual we compute a niching factor, as a sum of $sh(i,j)$ over all individuals $j$ from the same front, where:
    –  $sh(i,j) = 1 - [d(i,j)/d_{share}]^2$, for $d(i,j) < d_{share}$
    –  $sh(i,j) = 0$ otherwise
        •  $d(i,j)$ is the distance of $i$ from $j$
        •  $d_{share}$ is a parameter of the algorithm
•  Individuals from the first front receive some „dummy“ fitness, that is divided by the niching factor
•  Individuals from the second front receive a dummy fitness smaller than the fitness of the worst individual from the first front, and it is again divided by their niching factor
•  ... For all fronts


NSGA II

•  2000, repairing some drawbacks of NSGA:
    –  Necessity to set the right $d_{share}$ value
    –  Non-existence of elitism
•  Niching
    –  $d_{share}$ and the niche count are replaced by a crowding distance:
    –  This is a sum of distances to the nearest neighbours
    –  The best individuals w.r.t. each $f_i$ have the crowding distance set to infinity
•  Elitism
    –  Old and new populations are joined, sorted, the better part goes to the next generation


NSGA II contd.

•  Fitness:
    –  Each individual has the number of the non-dominated front it is in, and a crowding distance
    –  When comparing two individuals, first the front is considered (smaller is better), and in case of the same front, their crowding distance is considered (bigger is better)
    –  And in fact, no fitness is really computed, just these two numbers are compared in a tournament selection
•  And now we have an improvement – NSGA III

COMBINATORIAL OPTIMIZATION
EVA solves NP-hard problems, TSP, permutation representations


EVA solves hard tasks

•  0-1 knapsack problem
    –  Simple encoding
    –  Problematic fitness
    –  Standard operators
•  Travelling Salesman problem (TSP)
    –  Simple fitness
    –  Problematic encoding and operators (crossover, really)
•  Scheduling, planning, transportation problems...


Knapsack

•  Given:
    –  A knapsack of capacity CMAX
    –  N items,
    –  each has a price v(i)
    –  and a volume c(i)
•  The task is to choose items such that:
    –  We maximize the sum of v(i)
    –  At the same time we squeeze them into the knapsack, i.e. the sum of c(i) <= CMAX


Knapsack

•  Encoding: a bitmap
   –  0110010: take items 2, 3 and 6
   –  Almost trivial
   –  But the individuals might not satisfy the CMAX condition

•  Operators:
   –  Simple crossover, mutation, selection

•  Fitness has two parts:
   –  max [sum of v(i)] vs. min [CMAX - sum of c(i)]


Knapsack

•  So, we have a multi-objective optimization:
   –  Either weight the two objectives and add them
   –  Or use your favourite MOEA from the previous chapter
   –  Or change the encoding in a clever way:
      •  1 means: PUT the item in the knapsack UNLESS the capacity would be exceeded
      •  This gives the nice property that, with such a decoder, all strings in fact represent a valid solution
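A minimal sketch of such a decoder. Items are scanned left to right, and a 1 bit packs the item only if it still fits; the scan order, the prices, the volumes and the capacity are hypothetical choices for illustration.

```python
from typing import List, Sequence

def decode(bits: Sequence[int], volumes: Sequence[int], c_max: int) -> List[int]:
    """Indices (0-based) of the items actually packed; the result is always feasible."""
    chosen, used = [], 0
    for i, bit in enumerate(bits):
        if bit and used + volumes[i] <= c_max:
            chosen.append(i)
            used += volumes[i]
    return chosen

def fitness(bits: Sequence[int], prices: Sequence[int],
            volumes: Sequence[int], c_max: int) -> int:
    """Single-objective fitness: total price of the decoded (feasible) selection."""
    return sum(prices[i] for i in decode(bits, volumes, c_max))

# Hypothetical instance with 7 items
prices = [10, 40, 30, 50, 35, 25, 5]
volumes = [5, 4, 6, 3, 7, 2, 1]
c_max = 10
bits = [0, 1, 1, 0, 0, 1, 0]  # the string 0110010 from the encoding slide

print(decode(bits, volumes, c_max))           # → [1, 2]  (item 6 no longer fits)
print(fitness(bits, prices, volumes, c_max))  # → 70
```

With this decoder the capacity constraint disappears from the fitness, at the price of a many-to-one mapping from strings to solutions.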


Travelling salesman

•  N cities, tour them with minimal cost
•  Fitness is clear: the cost of the trip
•  There are many representations
   –  Variants of vertex-based
   –  Edge-based, ...

•  Operators are heavily dependent on the representation
   –  Crossover allows us to use heuristics we might have for solving the TSP


Adjacency representation
•  A path is a list of cities; city j is at position i iff there is an edge from i to j

•  Ex:
   –  (248397156) corresponds to 1-2-4-3-8-5-9-6-7

•  Each path has exactly 1 representation; some lists do not generate valid paths

•  Not very intuitive
•  Classical crossover does not work
•  But schemata do:
   –  E.g. (*3*...) means all paths with the 2-3 edge

•  Do not use it.
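To make the encoding concrete, here is a small sketch of converting between an adjacency list and a path, with cities numbered from 1 as on the slide. The function names are hypothetical; the decoder raises an error for lists that do not form a single tour.

```python
from typing import List

def adjacency_to_path(adj: List[int], start: int = 1) -> List[int]:
    """adj[i-1] == j means the tour contains the edge i -> j."""
    path, seen, city = [], set(), start
    while city not in seen:
        seen.add(city)
        path.append(city)
        city = adj[city - 1]
    # A valid list visits every city once and returns to the start
    if len(path) != len(adj) or city != start:
        raise ValueError("list does not encode a single tour")
    return path

def path_to_adjacency(path: List[int]) -> List[int]:
    adj = [0] * len(path)
    for k, city in enumerate(path):
        adj[city - 1] = path[(k + 1) % len(path)]  # successor, wrapping around
    return adj

print(adjacency_to_path([2, 4, 8, 3, 9, 7, 1, 5, 6]))  # → [1, 2, 4, 3, 8, 5, 9, 6, 7]
```

A list such as (2 1 4 3) splits into two sub-cycles and is rejected, which is the kind of invalid individual a naive crossover tends to produce.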


Ordinal (or buffer) representation

•  The motivation was to be able to use the standard 1-point crossover
   –  Let us have a buffer of vertices, maybe just ordered; the encoding is in fact the position of a city in this buffer
   –  When a city is used, it is deleted from the buffer

•  Ex:
   –  Buffer (123456789), and path 1-2-4-3-8-5-9-6-7 is represented as (112141311)

•  Do not use it either.
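A short sketch of the buffer encoding and decoding, assuming the buffer is the ordered list 1..N. It also shows why the standard 1-point crossover always yields a valid path here: position k of the code only needs a value between 1 and N-k+1, and crossover keeps values at their positions. Function names are hypothetical.

```python
from typing import List

def ordinal_to_path(code: List[int]) -> List[int]:
    buffer = list(range(1, len(code) + 1))
    # Each gene is a 1-based position in the shrinking buffer
    return [buffer.pop(pos - 1) for pos in code]

def path_to_ordinal(path: List[int]) -> List[int]:
    buffer = sorted(path)
    code = []
    for city in path:
        pos = buffer.index(city)
        code.append(pos + 1)
        buffer.pop(pos)
    return code

print(ordinal_to_path([1, 1, 2, 1, 4, 1, 3, 1, 1]))  # → [1, 2, 4, 3, 8, 5, 9, 6, 7]

# 1-point crossover of two codes decodes to a valid permutation
a = path_to_ordinal([1, 2, 4, 3, 8, 5, 9, 6, 7])
b = path_to_ordinal([5, 1, 7, 8, 9, 4, 6, 2, 3])
child = a[:4] + b[4:]
print(ordinal_to_path(child))
```

The child is valid, but the cities after the crossover point usually differ from both parents, which is one reason for the advice not to use this representation.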


Path (or permutation) representation

•  Probably the first idea of most people
•  The permutation representation is important because it is natural for many other tasks as well.
   –  path 5-1-7-8-9-4-6-2-3 is represented as (517894623)

•  The standard crossover does not work
•  So, the main problem with this representation is to propose a crossover operator that produces correct individuals and captures some idea of what a good solution should look like.
   –  PMX, CX, OX, ...


PMX
•  Partially mapped crossover (Goldberg)
•  Preserve as many cities at their positions from the parents as you can.

•  2-point
•  (123|4567|89) PMX (452|1876|93):
   –  Swap the middle segments: (...|1876|..) (...|4567|..)
   –  and a mapping 1-4, 8-5, 7-6, 6-7
   –  Cities that cause no conflict can be added: (.23|1876|.9) (..2|4567|93)
   –  The remaining positions are filled according to the mapping:
      •  (423|1876|59) (182|4567|93)
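The PMX steps above can be sketched as one function that builds a single child; calling it with the parents swapped gives the other child. The function name and the explicit cut indices (0-based, half-open) are assumptions for illustration.

```python
from typing import List

def pmx(p1: List[int], p2: List[int], cut1: int, cut2: int) -> List[int]:
    """Child = p1 with p2[cut1:cut2] copied in place; conflicts repaired by the mapping."""
    child = list(p1)
    child[cut1:cut2] = p2[cut1:cut2]
    # City placed by the segment -> city of p1 it displaced at the same position
    mapping = {p2[k]: p1[k] for k in range(cut1, cut2)}
    for k in list(range(cut1)) + list(range(cut2, len(p1))):
        city = p1[k]
        while city in mapping:  # follow the mapping until the city is free
            city = mapping[city]
        child[k] = city
    return child

p1 = [1, 2, 3, 4, 5, 6, 7, 8, 9]
p2 = [4, 5, 2, 1, 8, 7, 6, 9, 3]
print(pmx(p1, p2, 3, 7))  # → [4, 2, 3, 1, 8, 7, 6, 5, 9]
print(pmx(p2, p1, 3, 7))  # → [1, 8, 2, 4, 5, 6, 7, 9, 3]
```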


OX

•  Order crossover (Davis)
•  Preserve the relative order of cities in the parents
•  (123|4567|89) OX (452|1876|93):
   –  The segments: (...|1876|..) (...|4567|..)
   –  Read the second parent starting after the second crossover point: 9-3-4-5-2-1-8-7-6
   –  Delete the cities of the segment 4567; what remains: 9-3-2-1-8
   –  Fill the child holding 4567, again starting after the second crossover point: (218|4567|93)
   –  Similarly, the other child: (345|1876|92)
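A sketch of OX following the steps above: one parent contributes the fixed segment, the other contributes the order of the remaining cities. The parameter names `keep` and `order` and the 0-based, half-open cut indices are assumptions.

```python
from typing import List

def ox(keep: List[int], order: List[int], cut1: int, cut2: int) -> List[int]:
    """Child keeps keep[cut1:cut2]; the rest follows the order of `order` after cut2."""
    n = len(keep)
    segment = set(keep[cut1:cut2])
    rotated = order[cut2:] + order[:cut2]           # read from the second cut point
    fill = [c for c in rotated if c not in segment]  # drop cities already in the segment
    child = list(keep)
    positions = list(range(cut2, n)) + list(range(cut1))  # fill from the second cut, wrapping
    for pos, city in zip(positions, fill):
        child[pos] = city
    return child

p1 = [1, 2, 3, 4, 5, 6, 7, 8, 9]
p2 = [4, 5, 2, 1, 8, 7, 6, 9, 3]
print(ox(p1, p2, 3, 7))  # → [2, 1, 8, 4, 5, 6, 7, 9, 3]
print(ox(p2, p1, 3, 7))  # → [3, 4, 5, 1, 8, 7, 6, 9, 2]
```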


CX

•  Cyclic crossover (Oliver)
•  Preserve the absolute position in the path
•  (123456789) CX (412876935)
   –  Choose the first position at random, say from the first parent: child P1 = (1........),
   –  Now we have to take 4, P1 = (1..4.....), then 8, 3 and 2
   –  P1 = (1234...8.), the cycle is closed and we cannot continue, so we fill the rest from the second parent
   –  P1 = (123476985)
   –  Similarly P2 = (412856739)
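The cycle-following step can be sketched as below. The child takes one cycle from the first argument and every other position from the second; starting at position 0 is an assumption that matches the example, and the function name is hypothetical.

```python
from typing import List, Optional

def cx(p1: List[int], p2: List[int], start: int = 0) -> List[int]:
    """Cycle crossover: positions of one cycle come from p1, all others from p2."""
    child: List[Optional[int]] = [None] * len(p1)
    pos_in_p1 = {city: k for k, city in enumerate(p1)}
    k = start
    while child[k] is None:
        child[k] = p1[k]
        k = pos_in_p1[p2[k]]  # the city p2 has here must keep its p1 position
    return [p2[i] if c is None else c for i, c in enumerate(child)]

p1 = [1, 2, 3, 4, 5, 6, 7, 8, 9]
p2 = [4, 1, 2, 8, 7, 6, 9, 3, 5]
print(cx(p1, p2))  # → [1, 2, 3, 4, 7, 6, 9, 8, 5]
print(cx(p2, p1))  # → [4, 1, 2, 8, 5, 6, 7, 3, 9]
```

Every city in the child sits at a position it occupied in one of the parents, which is the property the operator is designed to preserve.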


ER

•  Edge recombination (Whitley et al.)
•  Observation: all previous crossovers preserve only about 60% of the edges from both parents

•  ER tries to preserve as many edges as possible.
   –  For each city, make a list of its edges
   –  Start somewhere (the first city),
   –  Choose the city with the fewest edges,
   –  In case of the same number of edges, choose randomly


(123456789) ER (412876935)

•  1: 9 2 4
•  2: 1 3 8
•  3: 2 4 9 5
•  4: 3 5 1
•  5: 4 6 3
•  6: 5 7 9
•  7: 6 8
•  8: 7 9 2
•  9: 8 1 6 3

•  Start in 1; its successors are 9, 2, 4
•  9 loses, it has 4 successors; from 2 and 4 we choose at random: 4
•  The successors of 4 are 3 and 5; take 5,
•  Now we have (145......), and continue
•  ... (145678239)
•  It is possible that no edge can be chosen and the algorithm fails, but this is very rare (1-1.5% of cases)
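A sketch of the edge lists and the ER walk for the example above. On a dead end, the slide says the algorithm fails; continuing from a random unused city instead is an assumption made here so that the function always returns a tour. All names are hypothetical.

```python
import random
from typing import Dict, List, Set

def edge_map(p1: List[int], p2: List[int]) -> Dict[int, Set[int]]:
    """For each city, the set of its neighbours in either (cyclic) parent tour."""
    edges: Dict[int, Set[int]] = {c: set() for c in p1}
    for tour in (p1, p2):
        n = len(tour)
        for k, c in enumerate(tour):
            edges[c].update((tour[k - 1], tour[(k + 1) % n]))
    return edges

def edge_recombination(p1: List[int], p2: List[int], rng: random.Random) -> List[int]:
    edges = edge_map(p1, p2)
    current = p1[0]
    child = [current]
    while len(child) < len(p1):
        for nbrs in edges.values():
            nbrs.discard(current)          # a used city is no longer a valid successor
        candidates = edges.pop(current)
        if candidates:
            fewest = min(len(edges[c]) for c in candidates)
            pool = sorted(c for c in candidates if len(edges[c]) == fewest)
        else:
            pool = sorted(edges)           # dead end: assumed fallback, any unused city
        current = rng.choice(pool)
        child.append(current)
    return child

p1 = [1, 2, 3, 4, 5, 6, 7, 8, 9]
p2 = [4, 1, 2, 8, 7, 6, 9, 3, 5]
print(edge_map(p1, p2)[3])   # neighbours of city 3: 2, 4, 5 and 9
print(edge_recombination(p1, p2, random.Random(0)))
```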


(123456789) ER2 (412876935)

•  1: 9 #2 4
•  2: #1 3 8
•  3: 2 4 9 5
•  4: 3 #5 1
•  5: #4 6 3
•  6: 5 #7 9
•  7: #6 #8
•  8: #7 9 2
•  9: 8 1 6 3

•  ER2: an improvement of ER
•  Preserves more common edges
•  Edges that occur in both parents are marked with #
•  They are prioritized when choosing where to go.


Initialization for TSP

•  Nearest neighbours:
   –  Start with a random city,
   –  Choose the next one as the closest of the cities not chosen yet

•  Edge insertion:
   –  To a path T (start with an edge), choose the nearest city c not in T
   –  Find an edge k-j in T that minimizes the difference between k-c-j and k-j
   –  Delete k-j, insert k-c and c-j into T
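Both initialization heuristics can be sketched for cities given as 2D points with Euclidean distances. The coordinates, the function names, the index-based tie-breaking and the reading of "nearest city" as nearest to any city already in T are assumptions.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]

def tour_length(tour: List[int], coords: Sequence[Point]) -> float:
    """Length of the closed tour (the edge back to the start is included)."""
    return sum(math.dist(coords[tour[k - 1]], coords[tour[k]]) for k in range(len(tour)))

def nearest_neighbour_tour(coords: Sequence[Point], start: int = 0) -> List[int]:
    unvisited = set(range(len(coords))) - {start}
    tour = [start]
    while unvisited:
        last = coords[tour[-1]]
        nxt = min(unvisited, key=lambda c: (math.dist(last, coords[c]), c))
        tour.append(nxt)
        unvisited.remove(nxt)
    return tour

def insertion_tour(coords: Sequence[Point], a: int = 0, b: int = 1) -> List[int]:
    tour = [a, b]                                   # start with a single edge
    unvisited = set(range(len(coords))) - {a, b}
    while unvisited:
        # nearest city c not in T (distance to the closest city of T)
        c = min(unvisited,
                key=lambda u: (min(math.dist(coords[u], coords[t]) for t in tour), u))
        def extra(i: int) -> float:
            k, j = tour[i], tour[(i + 1) % len(tour)]
            return (math.dist(coords[k], coords[c]) + math.dist(coords[c], coords[j])
                    - math.dist(coords[k], coords[j]))
        i = min(range(len(tour)), key=extra)        # edge k-j with the cheapest detour
        tour.insert(i + 1, c)                       # replace k-j by k-c and c-j
        unvisited.remove(c)
    return tour

coords = [(0, 0), (0, 1), (1, 1), (1, 0), (5, 5)]   # hypothetical cities
print(nearest_neighbour_tour(coords))                # → [0, 1, 2, 3, 4]
print(insertion_tour(coords))
```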


Mutation for TSP

•  Inversion (!)
•  Insert a city into a path
•  Shift a subpath
•  Swap 2 cities
•  Swap subpaths
•  Heuristics such as 2-opt etc.
   •  Take two edges (four cities) and choose the other two edges connecting these 4 cities
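Inversion and the 2-opt move are closely related: reversing a subpath removes two edges and reconnects the same four cities with the other two edges. The sketch below assumes 2D points with Euclidean distances; the names and coordinates are hypothetical, and the local search is a simple repeat-until-no-improvement loop.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]

def tour_length(tour: List[int], coords: Sequence[Point]) -> float:
    return sum(math.dist(coords[tour[k - 1]], coords[tour[k]]) for k in range(len(tour)))

def invert(tour: List[int], i: int, j: int) -> List[int]:
    """Inversion mutation: reverse the subpath tour[i..j] (inclusive)."""
    return tour[:i] + tour[i:j + 1][::-1] + tour[j + 1:]

def two_opt(tour: List[int], coords: Sequence[Point]) -> List[int]:
    """Apply improving inversions until no single inversion shortens the tour."""
    best = list(tour)
    improved = True
    while improved:
        improved = False
        for i in range(1, len(best) - 1):
            for j in range(i + 1, len(best)):
                cand = invert(best, i, j)
                if tour_length(cand, coords) < tour_length(best, coords) - 1e-12:
                    best, improved = cand, True
    return best

print(invert([1, 2, 3, 4, 5], 1, 3))          # → [1, 4, 3, 2, 5]

square = [(0, 0), (1, 1), (1, 0), (0, 1)]      # visiting 0-1-2-3 crosses the diagonals
fixed = two_opt([0, 1, 2, 3], square)
print(fixed, tour_length(fixed, square))
```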


Other approaches

•  (Binary) matrix representation:
   •  Either a 1 at position (i,j) means an edge from i to j
   •  Or it means that i is before j in the path (more common)
   –  Specific matrix crossover operators:
      •  Conjunction: bitwise AND and random insertion of edges
      •  Disjunction: dissect into quadrants, delete 2 of them, remove contradictions, insert edges at random

•  Combination with local heuristics
   –  An evolutionary strategy which improves paths by "smart mutations": heuristics like 2-opt, 3-opt


Other tasks - scheduling

•  Scheduling is NP-hard:
   –  An individual is a schedule, direct matrix encoding
      •  Rows are teachers, columns are classes, values are codes of subjects
      •  Mutation: mix the subjects
      •  Crossover: swap the better rows between individuals
   –  Fitness
      •  Fitness of a row (how satisfied a teacher is)
      •  Other soft criteria and constraints on the schedule quality
   –  Hard constraints
      •  Must be respected in the operators, otherwise too many inadmissible solutions are generated
      •  Teachers' constraints: when, where, what to teach, …


Other tasks - job shop scheduling
•  Production planning

•  Products o1…oN are made from parts p1…pK; for each part there are several plans for producing it on machines m1…mM; machines have different setup times for switching to a different product

•  Fitness: production time
•  Encoding is critical:
   –  Permutation: the plan is just a permutation of the product order. The decoder must choose plans for the parts. A simple representation that can use TSP-inspired crossovers. But it turns out not to be very efficient: the decoder solves the complicated part, and TSP operators are not suitable.
   –  Direct representation of the individual as the complete plan: specialized and complex evolutionary operators.


NEUROEVOLUTION
Just an introduction, the cool approaches are in EVA II


Learn neural networks by EVA

•  First experiments in the 80s
•  Learn the parameters (weights)
•  Learn the structure (architecture, connections)
•  Learn weights and structure together
•  Reinforcement learning tasks: when there is no supervised algorithm (robotics)

•  Hybrid methods: combination of EVA with local search etc.


Learn the weights
•  Direct:
   –  Encode the weights into a (floating point) vector,
   –  floating point GA, standard operators
   –  Evolutionary strategies, …

•  Usually slower than specialized gradient-based local algorithms, but can be robust

•  Use minibatches
•  Can be parallelized easily
•  Can be used for reinforcement learning where there is no gradient (robotics)
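As a toy illustration of the direct encoding, the sketch below evolves the 9 weights of a tiny 2-2-1 network on XOR with a (mu + lambda) evolution strategy and fixed-sigma Gaussian mutation. The network shape, the task, the parameter values and all names are assumptions for illustration; no gradient is used anywhere.

```python
import math
import random
from typing import List, Tuple

XOR = [((0.0, 0.0), 0.0), ((0.0, 1.0), 1.0), ((1.0, 0.0), 1.0), ((1.0, 1.0), 0.0)]

def forward(w: List[float], x: Tuple[float, float]) -> float:
    """2-2-1 network with tanh hidden units; w is the flat weight vector (the genome)."""
    h1 = math.tanh(w[0] * x[0] + w[1] * x[1] + w[2])
    h2 = math.tanh(w[3] * x[0] + w[4] * x[1] + w[5])
    return w[6] * h1 + w[7] * h2 + w[8]

def loss(w: List[float]) -> float:
    """Mean squared error on XOR; lower is better."""
    return sum((forward(w, x) - y) ** 2 for x, y in XOR) / len(XOR)

def evolve(generations: int = 200, mu: int = 5, lam: int = 20,
           sigma: float = 0.3, seed: int = 0):
    rng = random.Random(seed)
    pop = [[rng.gauss(0, 1) for _ in range(9)] for _ in range(mu)]
    history = [min(loss(w) for w in pop)]
    for _ in range(generations):
        # Each child is a Gaussian perturbation of a randomly chosen parent
        children = [[g + rng.gauss(0, sigma) for g in rng.choice(pop)] for _ in range(lam)]
        pop = sorted(pop + children, key=loss)[:mu]  # (mu + lambda): parents may survive
        history.append(loss(pop[0]))
    return pop[0], history

best, history = evolve(generations=50)
print(history[0], history[-1])
```

Because parents compete with their children, the best loss never gets worse from one generation to the next; evaluating the children is the expensive part and is independent per individual, which is what makes parallelization easy.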


Learn the structure

•  Fitness = build the network, initialize at random, train, several times

•  Direct encoding
   –  Represent the structure as a binary matrix
   –  Linearize the relevant part of the matrix into a binary vector

•  Grammatical encoding, Kitano
   –  An individual represents a 2D formal grammar that acts as a program creating the binary matrix that represents the structure of the neural net
   –  A bold but too heavy-weight solution, not used in practice

