Modeling with IRENE
Modeling with IRENE
Integrated R-code for Engineered Neural Evolution
Trevor Grant and Olcay Akman
Department of Mathematics
Illinois State University

NOTE: N is for Neuroevolution. Pretty sure we need neuroevolution in the title somewhere.
I like this title more, the more I think about it. It works on multiple levels. It casts traditional methods of statistical analysis as a sort of theocracy, which this method diverges from: the mainstream worships one god, the linear model, while this method has multiple gods, an entire rainbow of non-linear functions.

Overview
Neural Evolution
  What is a Neural Network?
  Using genetic algorithms to find optimal parameters to nonlinear functions
  Neural evolution
Special Population Attributes
  Jump Connections
  User defined libraries and learning functions
  Mutating learning functions
Engineered Genetic Algorithms

What is a Neural Network?
Starting out simple
We begin by modeling the data with a simple linear model. We then look at the sum of the squared residuals (SSR). A value is assigned to the model based on this SSR.
[Figure: network diagram with inputs (X1, X2, ..., Xn) and output (Y)]

Example
1974 Statistics regarding Income
Income: per capita income (1974)
Life Exp: life expectancy in years (1969-71)
Murder: murder and non-negligent manslaughter rate per 100,000 population (1976)
HS Grad: percent high-school graduates (1970)
Frost: mean number of days with minimum temperature below freezing (1931-1960) in capital or large city
[Figure: network diagram with inputs (X1, X2, ..., Xn) and output (Y), labeled Income, Life Expectancy, Murder Rate, HS Grad %, Frost]
Residuals
The difference between the observed value and the fitted value is known as the residual.
[Figure: Sum of Squared Residuals; axes Height and Age]
A linear model is estimated which minimizes the sum of squared residuals (SSR), the squared distances between the estimates and the actual data points.

Relationship
Linear
Traditionally we estimate linear relationships.
Nonlinear
The true relationship may be (and often is) nonlinear.
Sometimes we know the relationship and can use nonlinear regression methods such as:
  Neural networks
  Nonlinear least squares
Sometimes we don't know the functional form of the relationship. IRENE explores functional forms while estimating parameters.
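The contrast between a linear and a nonlinear fit can be sketched numerically. This is a small Python/NumPy illustration, not IRENE's own R code; the helper name `ssr_of_fit` and the toy data are assumptions made up for the example.

```python
import numpy as np

def ssr_of_fit(features, y):
    """Fit y on the given features (plus an intercept) by least squares; return the SSR."""
    design = np.column_stack([np.ones(len(y)), features])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ coef
    return float(residuals @ residuals)

# Toy data (hypothetical): the true relationship is quadratic, not linear.
x = np.linspace(-2.0, 2.0, 21)
y = 1.0 + x ** 2

ssr_linear = ssr_of_fit(x, y)          # straight line in x
ssr_nonlinear = ssr_of_fit(x ** 2, y)  # same data after a nonlinear transform of x
```

On this toy data the straight line leaves a large SSR, while the model that uses the right functional form drives it to essentially zero.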
[Figure: Sum of Squared Residuals; axes Height and Age]
A nonlinear model reduces the sum of squared residuals and better models the actual data.

Anatomy of a neural network
Layers
Nodes

What's in a node?
A node contains a learning function.
The learning function takes input and parameters and converts them to output.
A model has parameter values, indexed 11, 12, 13, 14.
Let's pretend the first observation contains these values:
[Figure: input values for the first observation; digits as extracted: 222-14.1]
Now say a model has these parameters:
[Figure: the same inputs with the model's parameters; digits as extracted: 222-14.15-4.122]
And the learning function on this node is exponential.
[Figure: node h1 fed by the inputs and parameters]
So the value for node h1 for the first observation is .1108.

This is repeated for each observation.
Each model has its own unique set of parameters. The fitted values of the output are functions of those parameters.

Observation   h1
1             .1108
2             1.524
3             .529
4             1.011
n             1.752

After this is complete a linear model is estimated. The output is regressed on the values of the nodes in the last layer. The sum of the squared residuals is assigned as the model's value.

The linear model estimated
The sum of the squared residuals of the model (SSR) is referred to as the value of the model. We want a model that minimizes the sum of squared residuals (or value).
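The evaluation of a single model can be sketched as follows. This is a Python/NumPy illustration, not IRENE's R code. It assumes that a node applies its learning function to a weighted sum of its inputs, which the slides do not spell out, and the names `node_values` and `model_value` are hypothetical.

```python
import numpy as np

def node_values(X, beta, learning_fn=np.exp):
    """Value of one node for every observation (assumed form: learning_fn(X @ beta))."""
    return learning_fn(X @ beta)

def model_value(X, y, betas, learning_fn=np.exp):
    """SSR of the final linear model that regresses y on the last-layer nodes."""
    hidden = np.column_stack([node_values(X, b, learning_fn) for b in betas])
    design = np.column_stack([np.ones(len(y)), hidden])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ coef
    return float(residuals @ residuals)

# Hypothetical data: 50 observations, 3 inputs, one layer with 2 nodes.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = rng.normal(size=50)
betas = [rng.normal(scale=0.5, size=3) for _ in range(2)]
value = model_value(X, y, betas)  # lower is better
```

Only the node parameters are searched by the genetic algorithm in this sketch; the coefficients of the final linear model come from ordinary least squares each time a model is evaluated.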
Linear model estimated in a more complex neural network
[Figure: network with nodes h11, h12, h13 in the first layer and h21, h22 in the final layer]
NOTE: h11, h12, h13 are not included in the final linear model. Only the nodes in the final layer are included in the linear model.

Optimizing Parameters with Genetic Algorithms
Step 1: A population of models is created, each with randomly assigned parameters.
Step 2: Models mate in the hope of creating children models with better value (lower SSR).
From now on we will refer to each unique set of parameters in a model as a creature. A collection of creatures, models with identical topology but different parameters, is referred to as a species.

Copy this model 200 times; each copy has randomly assigned parameter values.
Each individual collection of parameters is referred to as a creature. The collection of creatures for a given topology (arrangement of layers and nodes) is referred to as a species.
[Figure: one network labeled Creature, the set of copies labeled Species]
NOTE: Talk about how they are the same species because they have the same topology / learning function. Make a slide with a different learning function color and say this is a different species.

Species
A species has a unique arrangement of nodes, layers and learning functions. Even though these creatures have the same arrangement of layers and nodes, they have a different learning function and so they are different species.
[Figure: two networks with the same layout, one labeled Sigmoid Learning Function and one labeled Exponential Learning Function]

Then each creature has a different computed value (SSR) and an assigned ID #; this is saved in a table.

Model ID    Sum Squared Resid. (SSR)
ID # 001    41,240
ID # 002    215,635
ID # 003    3,612

Two creatures are selected with probability weighted according to model fitness.

Each creature can be represented by DNA.
DNA: 2.512, .105, 51.25, -15.2
[Figure: model structure with parameters 11, 12, 13, 14]

Two methods of mating
Average: Each parameter in the child's DNA is the average of the corresponding parameters in the mother's and father's DNA.
Crossover: A cut point is randomly determined. Every parameter before the cut point is inherited from the father; every parameter after the cut point is inherited from the mother.

DNA is selected from the two creatures chosen to mate.

Parameter   Mother   Father
11          2.512    3.613
12          .105     26.252
13          51.25    -25.12
14          -15.2    104.4

Average Method
11 = (3.613 + 2.512)/2 = 3.0625
12 = (26.252 + .105)/2 = 13.1785
13 = (-25.12 + 51.25)/2 = 13.065
14 = (104.4 + (-15.2))/2 = 44.6

Parameter   Father   Mother   Child
11          3.613    2.512    3.0625
12          26.252   .105     13.1785
13          -25.12   51.25    13.065
14          104.4    -15.2    44.6

Crossover Method
A random number between one and the length of the parameter sequence is chosen. This is the cut point. The child inherits parameters from the father before this point, and from the mother after.
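Selection and the two mating methods can be sketched in a few lines. This is a Python/NumPy illustration, not IRENE's R code. The function names are hypothetical, and two details are assumptions because the slides do not give them: selection weights are taken as proportional to 1/SSR, and the two parents are required to be distinct. The parent DNA is taken from the slides; a cut point of two is used for the crossover call.

```python
import numpy as np

def select_parents(ssr, rng):
    """Pick two distinct creatures; lower SSR gives a higher chance of selection.
    Assumption: weights proportional to 1/SSR (the slides do not give the formula)."""
    weights = 1.0 / np.asarray(ssr, dtype=float)
    return rng.choice(len(ssr), size=2, replace=False, p=weights / weights.sum())

def mate_average(father, mother):
    """Average method: each child parameter is the mean of the parents' values."""
    return (np.asarray(father) + np.asarray(mother)) / 2.0

def mate_crossover(father, mother, cut):
    """Crossover method: first `cut` parameters from the father, the rest from the mother."""
    return np.concatenate([np.asarray(father)[:cut], np.asarray(mother)[cut:]])

father = [3.613, 26.252, -25.12, 104.4]
mother = [2.512, 0.105, 51.25, -15.2]

child_avg = mate_average(father, mother)       # approx. [3.0625, 13.1785, 13.065, 44.6]
child_cut = mate_crossover(father, mother, 2)  # [3.613, 26.252, 51.25, -15.2]

rng = np.random.default_rng(0)
i, j = select_parents([41240, 215635, 3612], rng)  # indices of the two chosen creatures
```

In practice the cut point would itself be drawn at random, for example with `rng.integers(1, len(father))`, which is again only one possible choice.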
Crossover Method: Cut point at position two

Parameter   Father   Mother   Child
11          3.613    2.512    3.613
12          26.252   .105     26.252
13          -25.12   51.25    51.25
14          104.4    -15.2    -15.2

The least fit creatures are killed to make room for the new children.

Model ID    Sum Squared Resid. (SSR)
ID # 001    41,240
ID # 002    3,289
ID # 003    215,635

After the least fit creature is removed:

Model ID    Sum Squared Resid. (SSR)
ID # 001    41,240
ID # 002    3,289

[Figure: the child's model structure, with 11 = 3.0625, 12 = 13.1785, 13 = 13.065, 14 = 44.6]

The children are assigned new ID numbers and their value (SSR) is computed.

Model ID    Sum Squared Resid. (SSR)
ID # 001    41,240
ID # 002    3,289
ID # 004    6,755

This process repeats several times. Successive rosters:
ID # 005: 4,242; ID # 002: 3,289; ID # 004: 6,755
ID # 005: 4,242; ID # 002: 3,289; ID # 007: 3,111
ID # 008: 4,841; ID # 002: 3,289; ID # 007: 3,111

Eventually there is convergence at an optimum (either local or global).

Model ID    Sum Squared Resid. (SSR)
ID # 239    2,015
ID # 159    2,015
ID # 412    2,015

At convergence we kill all the extra creatures in the species (to free up memory).

What is neural evolution?
Neural evolution: simultaneously explore new topologies while optimizing existing topologies. New species are born out of old species.
Growing new nodes (we don't always wait for convergence to add new layers and nodes).
We call each arrangement of layers, nodes and learning functions a species.

Who lives? Who dies?
After each generation a roster of all creatures is created and ordered according to value.
Species ID   Creature ID   Value (SSR)
003          043           12123
003          021           12552
002          231           13241
003          054           15125
001          152           20150
005          024           25124
003          122           35102
002          105           53039
001          412           [digits as extracted: 124310151]

Who lives? Who dies?
If there is at least one creature of a species in the top 60%* of a list of all creatures, the species survives. Otherwise the entire species is eradicated.
*60% is arbitrary. We can set that to other proportions. We'll talk about this more in engineered genetic algorithms.

Example:
Species of each creature, in roster order: 2, 2, 3, 3, 2, 2, 3, 1, 1, 3
The top 60% is the first six creatures, which belong to Species 2 and Species 3.
Survivors: No creature of Species 1 is among them.

While each species searches for optima, new ones are born and others die out. We could search forever, but we stop our search based on time or generations elapsed.

Special Population Attributes
Jump connections, user defined libraries and learning functions, and mutating functional forms.

Jump Connections
With jump connections, the output is regressed on all nodes and inputs. In a standard neural network, the output is regressed only on the nodes in the final layer.
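The species survival rule from the "Who lives? Who dies?" slides can be sketched as follows. This is a Python illustration, not IRENE's R code. The function name `surviving_species` is hypothetical, and rounding the cutoff up to a whole creature is an assumption. The roster is the ten-creature example from the slides.

```python
import math

def surviving_species(ranked_species, top_fraction=0.6):
    """ranked_species: species ID of each creature, ordered best (lowest SSR) first.
    A species survives if at least one of its creatures is inside the top fraction."""
    # Assumption: the cutoff is rounded up; the small epsilon guards against float error.
    cutoff = math.ceil(top_fraction * len(ranked_species) - 1e-9)
    return set(ranked_species[:cutoff])

roster = [2, 2, 3, 3, 2, 2, 3, 1, 1, 3]
survivors = surviving_species(roster)  # {2, 3}; Species 1 is eradicated
```

Raising `top_fraction` makes the search more tolerant of species that are currently behind; lowering it prunes them sooner.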
Jump Connections
[Figure: network with inputs x1, x2, x3, x4 and nodes h11, h12, h21, h22, h23]
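The difference between the two final models comes down to which columns enter the last regression. The sketch below is a Python/NumPy illustration, not IRENE's R code. The function name `final_design`, the random weights, and the exponential learning function are all assumptions; the layer sizes follow the diagram (four inputs, two nodes, then three nodes).

```python
import numpy as np

def final_design(X, hidden_layers, jump=False):
    """Regressors for the final linear model.
    hidden_layers: list of (n_obs, n_nodes) arrays, first layer first."""
    if jump:
        blocks = [X] + list(hidden_layers)  # inputs and every node
    else:
        blocks = [hidden_layers[-1]]        # nodes of the final layer only
    return np.column_stack([np.ones(X.shape[0])] + blocks)

rng = np.random.default_rng(1)
X = rng.normal(size=(30, 4))                                  # x1..x4
layer1 = np.exp(X @ rng.normal(scale=0.3, size=(4, 2)))       # h11, h12
layer2 = np.exp(layer1 @ rng.normal(scale=0.3, size=(2, 3)))  # h21, h22, h23

standard = final_design(X, [layer1, layer2])           # intercept + 3 columns
jumped = final_design(X, [layer1, layer2], jump=True)  # intercept + 4 + 2 + 3 columns
```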
Collinearity
If jump connections are used and the learning function is linear, then the final linear model will have perfect collinearity. (The computer won't be able to estimate the final model. This is bad, and a failsafe is built in to prevent it from happening.)
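The collinearity problem is easy to reproduce. The sketch below is a Python/NumPy illustration with made-up data, and the function name `jump_design` is hypothetical. A rank check is used only to expose the problem; it is not a description of the failsafe IRENE actually uses.

```python
import numpy as np

def jump_design(X, beta, learning_fn):
    """Intercept, raw inputs, and one node h = learning_fn(X @ beta)."""
    h = learning_fn(X @ beta)
    return np.column_stack([np.ones(len(X)), X, h])

rng = np.random.default_rng(2)
X = rng.normal(size=(40, 3))
beta = np.array([0.5, -1.0, 2.0])

linear = jump_design(X, beta, lambda z: z)  # linear learning function
expo = jump_design(X, beta, np.exp)         # exponential learning function

rank_linear = np.linalg.matrix_rank(linear)  # one less than the number of columns
rank_expo = np.linalg.matrix_rank(expo)      # full column rank
```

With a linear learning function the node is an exact linear combination of inputs that are already in the regression, so the design matrix loses a rank and the coefficients are not uniquely determined.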
Libraries of Learning Functions
Each time a node is creat