Supplement to DARPA Quarterly Report Q2
Task 1.2 Biologically motivated spike-timing-dependent plasticity (STDP) learning

1. Spiking Neural Network Model and STDP Rules
We first re-implemented the spiking neural network (SNN) model for MNIST handwritten digit classification from [1]. We tried models of various sizes (100 and 400 neurons), and found that classification results were similar to those reported in [1]. To learn the synaptic weights of the SNN, we implemented four different spike-timing-dependent plasticity (STDP) rules. All were implemented in an "online" fashion, where we only keep track of a "trace", or a memory, of the most recent spike (temporal nearest neighbor) which has arrived at the post- and presynaptic neurons. The first was online STDP without presynaptic weight modification, the second added an exponential dependence on the previous synaptic weight, the third included a rule for decreasing synaptic weight on presynaptic spikes, and the fourth was a combination of the second and third rules together.
The SNN model from [1] is illustrated below (Figure 1). The training of the model is as follows: at first, all neuron activations in the excitatory layer are set below their resting potentials. For 350 ms (or 700 computation steps, with a simulation time step of 0.5 ms), a single training image is presented to the network. For the MNIST digit dataset, each data sample is a 28x28 grayscale pixelated image of a hand-drawn digit from 0 to 9. The grayscale value of each pixel of the image determines the mean firing rate of the Poisson spike train it corresponds to. We include a synapse from all input Poisson spike trains to each neuron in the excitatory layer. There are then 784 x n synapses from input to the excitatory layer, if there are n neurons in the excitatory layer. The goal of the training procedure is to learn a setting of these weights that will cause distinct excitatory neurons to fire for distinct digit classes. Those neurons which fire most for a certain digit in the training phase we label with that digit's value. In the test phase, we classify new data examples by taking the majority vote of the labels of those neurons which fire for a test data sample.
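The rate coding described above can be sketched as follows. The maximum firing rate used here (63.75 Hz, i.e. pixel intensity divided by 4, as in Diehl & Cook) is an assumption rather than a value stated in this report, and `poisson_spike_train` is a hypothetical helper name:

```python
import numpy as np

def poisson_spike_train(pixel, n_steps=700, dt=0.5e-3, max_rate=63.75, rng=None):
    """Convert a grayscale pixel (0-255) into a Bernoulli approximation of a
    Poisson spike train: at each 0.5 ms step, spike with probability rate*dt."""
    rng = np.random.default_rng() if rng is None else rng
    rate = max_rate * (pixel / 255.0)       # mean firing rate in Hz
    return rng.random(n_steps) < rate * dt  # boolean spike array, one entry per step
```

A zero-intensity pixel never spikes; a full-intensity pixel spikes with probability about 0.032 at each 0.5 ms step.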
2. Spiking Neural Network for MNIST Handwritten Digit Classification

2.1 - Description of ETH SNN Model

The excitatory neurons are connected in a one-to-one fashion with an equally-sized layer of inhibitory neurons, each of which laterally inhibits all excitatory neurons but the one from which it receives its synaptic connection. The lateral inhibition mechanism typically causes a single neuron to fire strongly for a given data sample in a "winner-take-all" fashion, allowing the neuron which fires to adjust its weights to remember data samples of a similar shape and pixel intensity.
Figure 1: ETH Spiking Neural Network Model for MNIST Digit Recognition

2.2 - Online STDP Rules
To implement STDP in an online fashion, a pre- and postsynaptic trace is kept as a brief record of spike activity. On a spike from a presynaptic neuron, the presynaptic trace, $x_{pre}$, is set to 1, and on a spike from a postsynaptic neuron, the postsynaptic trace, $x_{post}$, is set to 1. Both exponentially decay towards zero otherwise, as given by:

$$\tau_{pre} \frac{dx_{pre}}{dt} = -x_{pre}, \qquad \tau_{post} \frac{dx_{post}}{dt} = -x_{post},$$

where $\tau_{pre}$, $\tau_{post}$ are time constants given as 1 ms and 2 ms, respectively.
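The trace dynamics described above can be sketched as a per-step update; the exact exponential-decay step (rather than a literal Euler step) and the vectorized `np.where` reset are implementation choices, not details from this report:

```python
import numpy as np

def update_traces(x_pre, x_post, pre_spiked, post_spiked,
                  dt=0.5, tau_pre=1.0, tau_post=2.0):
    """Decay both traces toward zero, then reset to 1 wherever a spike occurred.
    Times are in ms: dt = 0.5 ms, tau_pre = 1 ms, tau_post = 2 ms as in the text."""
    x_pre = x_pre * np.exp(-dt / tau_pre)
    x_post = x_post * np.exp(-dt / tau_post)
    x_pre = np.where(pre_spiked, 1.0, x_pre)
    x_post = np.where(post_spiked, 1.0, x_post)
    return x_pre, x_post
```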
The online STDP rule without presynaptic weight modification is applied on each postsynaptic spike and is given by the following equation:

$$\Delta w = \eta \, x_{pre} \, (w_{max} - w)^{\mu},$$

where $\Delta w$ denotes the relative weight change on the affected synapse, $\eta$ denotes a learning rate (typically set to 0.01), $w_{max}$ indicates the maximum value a weight can attain, and $\mu$ denotes a weight dependence term, which is set to 1 for best results. This learning rule is illustrated below (Figure 2).
Figure 2: STDP without presynaptic modification
The online STDP rule without presynaptic weight modification but with exponential weight dependence is given by

$$\Delta w = \eta \, x_{pre} \, e^{-\beta w},$$

where $\beta$ determines the strength of the weight dependence, and is typically chosen to be 1. The online STDP rule with presynaptic weight modification is given by the following two rules:

$$\Delta w = \eta_{post} \, x_{pre} \, (w_{max} - w)^{\mu} \quad \text{(on a postsynaptic spike)}, \qquad \Delta w = -\eta_{pre} \, x_{post} \quad \text{(on a presynaptic spike)},$$

where $\eta_{post}$ and $\eta_{pre}$ are post- and presynaptic learning rates typically set to 0.01 and 0.0001, respectively. Finally, the online STDP rule which is a combination of the second and third STDP rules is given by the following two rules:

$$\Delta w = \eta_{post} \, x_{pre} \, e^{-\beta w} \quad \text{(on a postsynaptic spike)}, \qquad \Delta w = -\eta_{pre} \, x_{post} \, e^{-\beta (w_{max} - w)} \quad \text{(on a presynaptic spike)}.$$
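As a sketch, the four rule variants can be expressed as a pair of update functions, applied on postsynaptic and presynaptic spikes respectively. The exponential weight-dependence form used here is an assumption where the report's rendered equations were lost, and the function names are hypothetical:

```python
import numpy as np

def stdp_post(w, x_pre, eta=0.01, w_max=1.0, mu=1.0, rule="standard", beta=1.0):
    """Weight change applied to a synapse when its postsynaptic neuron spikes."""
    if rule == "standard":                  # rules 1 and 3: soft bound (w_max - w)^mu
        return eta * x_pre * (w_max - w) ** mu
    return eta * x_pre * np.exp(-beta * w)  # rules 2 and 4: exponential dependence

def stdp_pre(w, x_post, eta_pre=1e-4):
    """Weight decrease applied on a presynaptic spike (rules 3 and 4)."""
    return -eta_pre * x_post
```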
After each training step, the synaptic weights of each excitatory neuron are normalized so that they do not grow so large as to fire indiscriminately for every possible input sample. Furthermore, the network is run for 150 ms (300 computation steps with a 0.5 ms simulation time step) after each input sample to allow the activations of the excitatory neurons to decay back to their resting potential. We also implement an adaptive membrane threshold potential for each excitatory neuron, which stipulates that the neuron's threshold is not determined only by a fixed threshold voltage $v_{thresh}$, but by the sum $v_{thresh} + \theta_i$, where $\theta_i$ increases by a small fixed constant each time the $i$-th neuron fires, and decays exponentially otherwise.
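The adaptive threshold update can be sketched as follows; the increment and decay time constant here are placeholder values, not parameters taken from this report:

```python
import math

def update_theta(theta, spiked, dt=0.5, tau_theta=1e4, theta_plus=0.05):
    """Decay theta exponentially, then add a fixed increment if the neuron spiked.
    tau_theta and theta_plus are illustrative placeholder values."""
    theta = theta * math.exp(-dt / tau_theta)
    if spiked:
        theta += theta_plus
    return theta

def effective_threshold(v_thresh, theta):
    """The neuron fires when its membrane potential exceeds v_thresh + theta."""
    return v_thresh + theta
```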
On the next page, we show a plot of the activations of a spiking neural network as above, with only 4 excitatory and inhibitory neurons, and a plot of the change in the weights of the synapses from the input to the "0"-indexed excitatory neuron, both over a single input presentation and rest period (500 ms, or 1,000 time steps at a 0.5 ms simulation time step). As you may observe, during this iteration, only the "0"-th neuron ever reaches its threshold and spikes, laterally inhibiting all other excitatory neurons in the process.
Notice that the increase in synaptic weight corresponds in time to the firing of the excitatory neuron (in this case, the postsynaptic neuron), and is larger when the difference between presynaptic and postsynaptic spike times is smaller. Those synaptic weights which do not change correspond to presynaptic neurons that haven't fired; i.e., their mean firing rate is either zero or close enough to zero that they did not emit a spike during the training iteration. Thanks to the adaptive membrane potential mechanism, one can also observe the increase in the threshold parameter each time the "0"-th neuron fires a spike. After the 350 ms training portion (or 700 time steps), the 150 ms resting portion (or 300 time steps) allows the activations of each of the excitatory neurons to settle to their resting potential.
Figure 3: Neuron activations (left) and corresponding weight changes (right)

We proposed the training techniques of input smoothing, in which the mean firing rate of each of the input Poisson spike trains was averaged with each of its neighbors' to the immediate left, right, top, and bottom, and weight smoothing, in which, during the training phase, after each input example was presented to the network and network weights had been modified, we averaged the weights from the input to a single neuron in the excitatory layer using the neighborhood smoothing described above.
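The input smoothing operation can be sketched as a five-point average over the rate image. Replicating edge values at the border is an assumption, since the report does not specify boundary handling:

```python
import numpy as np

def smooth_rates(rates):
    """Average each mean firing rate with its immediate left/right/top/bottom
    neighbors. rates: 2-D array (e.g. 28x28). Border pixels reuse edge values."""
    padded = np.pad(rates, 1, mode="edge")
    return (padded[1:-1, 1:-1] + padded[:-2, 1:-1] + padded[2:, 1:-1]
            + padded[1:-1, :-2] + padded[1:-1, 2:]) / 5.0
```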
To compare the four STDP rules and the smoothing operations, we trained a 400 excitatory-inhibitory neuron SNN on the MNIST handwritten digit training dataset (60K examples), and tested it on the corresponding test dataset (10K examples), for all possible combinations of STDP rule and smoothing operation. The results are below (Table 1).
Table 1: Classification results (test accuracy, %): ETH model and modifications

                          Standard       Exponential        With Presynaptic
                          Online STDP    Weight Dependence  Weight Modification  Both
Diehl & Cook Algorithm    89.35          89.28              90.54                88.51
With Input Smoothing      88.42          88.75              90.23                89.23
With Weight Smoothing     88.90          89.85              90.02                89.54
3. Novel Approaches: Convolutional Spiking Neural Networks

3.1 - Proposed Network Architecture
To extend the spiking neural network (SNN) model from [1], we first implemented a single layer of convolution "features" or "patches". This layer is adopted from the widely used convolutional neural network originating in deep learning research [2]. To accomplish this, we use convolution windows of size $k \times k$, with a stride of size $s$. In our case, a convolution window corresponds to a function which maps a section of input to a single excitatory neuron. A single convolution feature maps regularly spaced subsections of the input to a "patch" of excitatory neurons, all via shared synaptic weights (a single set of weights is learned for a single convolution feature). We use multiple of these convolution features, mapping to distinct excitatory neuron patches, in order to learn multiple different features of the input data. Our goal is to learn a setting of these weights such that we may separate the input data (in the current study, the MNIST handwritten digit dataset) into salient and frequent categories of features, as evidenced by the firing pattern of the excitatory neurons, so that we might later compose them in order to classify the digits with their respective categorical labels.
As with the SNN from [1], the weights from input (convolution windows) to excitatory layer (subdivided into convolution patches) are learned in an unsupervised manner using spike-timing-dependent plasticity (STDP) in the training phase. In the test phase, these weights are held constant, and we use them to predict the labels of a held-out test dataset. The labels of new data are determined first by assigning labels to the excitatory neurons based on their spiking activity on training data, and then using these labels in a majority "vote" for the classification of new data samples.
The number of excitatory neurons per convolutional patch can be determined via the constants $k$ (window size) and $s$ (stride), and is given by

$$n_{patch} = \left( \frac{\sqrt{d} - k}{s} + 1 \right)^2,$$
where $d$ is the dimensionality of the input, assuming $d$ is a square number. The number of convolution features is the network designer's choice, which we denote by $F$, but with more features come greater representational capacity and computational cost. As a first pass, we chose to use exactly $F$ inhibitory neurons, each of which corresponds to a single convolution patch. Each neuron in a convolution patch connects to this inhibitory neuron, which is connected to all other neurons of all other patches but the one it receives its connections from. This has the effect of allowing a particular convolution patch to spike while damping out the activity of all others, creating competition to represent certain features of the input data. Below, we have provided a diagram of the network architecture (Figure 4).
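The patch-size formula above can be checked numerically for the window/stride settings used in the experiments; `patch_size` is a hypothetical helper name:

```python
def patch_size(d, k, s):
    """Number of excitatory neurons in one convolutional patch for a
    d-dimensional (square) input, window size k, and stride s."""
    side = int(d ** 0.5)
    assert side * side == d, "input dimensionality must be a square number"
    per_side = (side - k) // s + 1  # windows fitting along one side
    return per_side * per_side

# MNIST (d = 784): 27x27 windows at stride 1 give a 2x2 patch (4 neurons);
# 18x18 at stride 2 give 6x6 (36 neurons); 10x10 at stride 6 give 4x4 (16 neurons).
```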
Figure 4: Convolutional Spiking Neural Network (Single Convolutional Layer)

3.2 - Experiments
We experiment with different choices of $k$ and $s$ and observe how the classification performance of the network degrades from that of the SNN model from [1]. We use the same training and test procedure from [1], but use only a single iteration through the entire training dataset (60,000 examples), because training the network is rather slow. We chose to set $k = 27$ and $s = 1$ for our first experiments, causing the convolution windows to cover almost the entire input space, and we therefore expect that the classification performance of the network will not degrade too much from the model from [1] given sufficient $F$. We then gradually decreased $k$, keeping $s$ and $F$ constant, expecting to see a gradual degradation in classification performance. The results are below in Figure 5; for reference, the test classification performance of the SNN model from [1] with 100 excitatory neurons is also plotted.

Figure 5: Classification accuracy by convolution size, number of convolution patches
3.3 - Weight Smoothing
We experimented with a weight smoothing technique, which can be described as follows: after each training sample has been run through the network (for the 500 ms, or 1,000 time steps at 0.5 ms each), for each convolution patch, we take a weighted sum of the weights of the neuron which spiked the most during the training iteration and its neighbors in a horizontal/vertical lattice in the convolution patch, and copy this across all neurons in the same patch. For example, if the $i$-th neuron is the "winner" (having the most spikes within its patch), we compute

$$\tilde{w} = \alpha \, w_i + \frac{1 - \alpha}{|N(i)|} \sum_{j \in N(i)} w_j,$$

where $w_i$ denotes the set of weights connecting a convolution window (a portion of the input) to the $i$-th neuron in the convolution patch, and $N(i)$ is the set of indices of neighboring neurons of the $i$-th neuron in the lattice, in the same convolution patch. See Figure 6 for an illustration of this procedure. This approach does not appear to improve the algorithm's performance; indeed, it often harms the classification performance of the original algorithm, so we do not report performance results from the networks we tested.
Figure 6: Weight smoothing after each training example
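The winner-based smoothing step can be sketched as follows; the blend coefficient `alpha` and the uniform weighting of neighbors are assumptions, since the report's weighted-sum coefficients were not preserved:

```python
import numpy as np

def smooth_patch_weights(W, winner, neighbors, alpha=0.5):
    """Blend the winning neuron's weight vector with the mean of its lattice
    neighbors' vectors, then copy the result to every neuron in the patch.
    W: (n_neurons, window_size) weights for one convolution patch."""
    blended = alpha * W[winner] + (1 - alpha) * W[neighbors].mean(axis=0)
    return np.tile(blended, (W.shape[0], 1))
```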
3.4 - Convolutional Spiking Neural Network with Between-Patch Connectivity
We describe a modification to our convolutional spiking neural network (SNN) model which includes a lattice connectivity between pairs of neighboring convolutional patches. The weights connecting pairs of convolutional patches are learned via the same STDP rule which is used to learn the weights from input to the patches. We also describe a new procedure for inhibiting excitatory neurons and assigning labels to new data which allows the network to quickly learn features of input data.
Added Connectivity
Suppose there are $F$ convolutional patches (or "features") of $n \times n$ convolution neurons, and let $P_i$ be the patch indexed by the integer $i$. If $i$ is even, then we connect $P_i$ to $P_{i+1}$; otherwise, we connect it to $P_{i-1}$. The connectivity pattern is as follows: for every neuron at position $(x, y)$ in a patch, we connect it to all neurons at positions $(x \pm 1, y)$ and $(x, y \pm 1)$ in the neighboring patch, as long as the indices remain within the $n \times n$ patch. This gives a horizontal and vertical lattice connectivity pattern between neighboring convolution patches. Note that there are only $F/2$ such lattice connections; $P_0$ and $P_1$ are connected, $P_2$ and $P_3$ are connected, up to $P_{F-2}$ and $P_{F-1}$. The connectivity pattern is illustrated, for a single neuron in a single pair of patches, in Figure 7. The rest of the synapses are omitted for ease of visualization.
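The patch pairing and lattice neighborhood can be sketched as follows. The exact neighborhood set was elided in the report, so horizontal/vertical adjacency within the patch is assumed here, and the function names are hypothetical:

```python
def patch_pairs(F):
    """Even-indexed patch i is connected to patch i+1, giving F/2 pairs."""
    assert F % 2 == 0, "assumes an even number of patches"
    return [(i, i + 1) for i in range(0, F, 2)]

def lattice_neighbors(x, y, n):
    """Horizontally and vertically adjacent positions of (x, y) that remain
    inside an n-by-n patch."""
    candidates = [(x - 1, y), (x + 1, y), (x, y - 1), (x, y + 1)]
    return [(a, b) for a, b in candidates if 0 <= a < n and 0 <= b < n]
```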
Figure 7: Connectivity between convolutional patches ($i$ is even)
This is a first attempt at adding connectivity between patches. We may experiment with different lattice neighborhoods (to encompass more of the nearby input space), and with a different overarching connectivity pattern; e.g., instead of connecting only certain neighboring patches, we could connect all neighboring patches, or connect each patch to all others.
Inhibition
Since we wish to learn weights between neighboring neurons in neighboring patches, it no longer makes sense for a single inhibitory neuron to be fired by all neurons in a single patch, which would then inhibit all other patches totally. Instead, to allow for the learning of correlations between the patches, each excitatory neuron projects to its own inhibitory neuron. This inhibitory neuron, when it reaches its threshold and spikes, will inhibit the excitatory neuron in all other excitatory patches which resides in the same spatial location as the neuron from which it receives its synapse. This inhibition scheme is depicted in Figure 8.
Figure 8: Convolutional SNN with between-patch connectivity

Labeling New Data
For this architecture modification, we implemented a different labeling scheme for the classification of new input data. Recall that, for the original SNN model, we label a new input datum with the label which corresponds to the highest number of spikes from the excitatory layer. The neurons, in turn, are assigned digit labels based on the digit class for which they spike the most. Since each convolution patch learns a single "feature" of the input (the weights on the connections from input to all neurons in the same patch are identical), we choose to include the counts of spikes on a new input datum from only the neuron in each patch which spiked the most while it was presented to the network. The justification for this is that many of the neurons in the network spike infrequently on any given input, but only a few neurons spike consistently. Removing the counts from the infrequently spiking neurons allows us to remove the "junk" counts from the labeling scheme.
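The filtering step of this labeling scheme, keeping only each patch's most active neuron, can be sketched as:

```python
import numpy as np

def filtered_counts(spike_counts):
    """Zero out every spike count except the per-patch maximum.
    spike_counts: (n_patches, n_neurons_per_patch) counts for one input."""
    keep = np.zeros_like(spike_counts)
    winners = spike_counts.argmax(axis=1)           # most active neuron per patch
    rows = np.arange(spike_counts.shape[0])
    keep[rows, winners] = spike_counts[rows, winners]
    return keep
```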
3.5 - Preliminary Results

We are currently running a number of simulations of the above-described network, for a variety of settings of convolution size, stride size, and number of convolution patches. We include in Figures 9, 10, and 11 plots of the accuracy of the network with the added pairwise lattice connectivity, evaluated every 100 training examples. We set the labels of the neurons using the spikes resulting from the last 100 examples, and use these labels to classify the next 100 examples. Since we are evaluating on so few data at each iteration, the accuracy curve exhibits high variance. Each of these networks has 50 convolution features.
Figure 9: 27x27 convolutions, stride 1, 50 features (62.3% average accuracy vs. 61.4% without lattice connections)

Figure 10: 18x18 convolutions, stride 2, 50 features (50.3% average accuracy vs. 43.25% without lattice connections)

Figure 11: 10x10 convolutions, stride 6, 50 features (40.1% average accuracy vs. 21.08% without lattice connections)
4. Discussion on SNN Computational Complexity

We describe the computational requirements of our spiking neural network models and compare them to those required by a convolutional neural network from the deep learning literature. Our spiking neural network models are programmed in Python, using the BRIAN neural network simulator [3].

4.1 - Spiking Neural Network Computation - MNIST
We consider a single training "iteration" of the spiking neural network model from [1] (see Figure 1). Recall that, for each input example, we "present" the input to the network for 350 ms and allow the network to relax back to equilibrium for 150 ms, for a total of 500 ms per iteration (or 1,000 integration steps with a 0.5 ms time step). The input is encoded as Poisson spike trains [4], as described in the document "Spiking Neural Network Model and STDP Rules", and each is connected to every neuron in an arbitrarily-sized layer of excitatory neurons. These Poisson spike trains emit a spike with a certain probability at each time step, which increases with the time elapsed since the train last emitted a spike. So, for the MNIST digit dataset, we consider 28*28 = 784 independent Poisson processes, each of which must be solved in each time step of each training iteration.
The neurons in the excitatory and inhibitory layers of the network are modeled as leaky integrate-and-fire neurons, whose membrane potentials are described by

$$\tau \frac{dV}{dt} = (E_{rest} - V) + g_e (E_{exc} - V) + g_i (E_{inh} - V),$$

where $E_{rest}$ is the resting potential of the neuron, $E_{exc}$ and $E_{inh}$ are the equilibrium potentials of excitatory and inhibitory synapses, respectively, $\tau$ is a biologically plausible time constant, and $g_e$ and $g_i$ are the conductances of excitatory and inhibitory synapses, respectively. The excitatory and inhibitory conductance update rule is given by

$$\tau_{g_e} \frac{dg_e}{dt} = -g_e, \qquad \tau_{g_i} \frac{dg_i}{dt} = -g_i,$$

where $\tau_{g_e}$, $\tau_{g_i}$ are biologically plausible time constants. For each excitatory and inhibitory neuron, we solve the above equations at each time step. Assuming there are $n_e$ excitatory neurons and $n_i$ inhibitory neurons, we must solve on the order of $784 + 3(n_e + n_i)$ equations at each time step, accounting for all input Poisson spike trains' spikes and the excitatory and inhibitory neurons' membrane potentials and synapse conductances.
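One forward-Euler integration step of these dynamics might look as follows; the numerical constants are illustrative placeholders, not the report's exact parameter values:

```python
import numpy as np

def lif_step(v, g_e, g_i, dt=0.5, tau=100.0, e_rest=-65.0,
             e_exc=0.0, e_inh=-100.0, tau_ge=1.0, tau_gi=2.0):
    """One Euler step of the leaky integrate-and-fire membrane equation,
    followed by exponential decay of the synaptic conductances."""
    dv = ((e_rest - v) + g_e * (e_exc - v) + g_i * (e_inh - v)) / tau
    v = v + dt * dv
    g_e = g_e * np.exp(-dt / tau_ge)  # conductances decay toward zero
    g_i = g_i * np.exp(-dt / tau_gi)
    return v, g_e, g_i
```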
At the beginning of the training phase, the weights on the synapses from input to the layer of excitatory neurons are chosen uniformly at random in the range [0.01, 0.3], and throughout the training phase, these are updated using our chosen STDP rule. For each input Poisson spike train and excitatory neuron, a "trace" tells us how recently the neuron has fired. Its general form is given by

$$\tau_x \frac{dx}{dt} = -x,$$
where $x$ is the trace, which is set to 1 when the neuron fires and updates according to the equation above otherwise, and $\tau_x$ is a time constant, which may be chosen to be different for presynaptic (input Poisson spike train) or postsynaptic (excitatory layer) neurons. When an input spike train emits a spike, we update the synapse conductance of the neurons it is connected to (all excitatory neurons) by the weight along its connection to each. When an excitatory neuron emits a spike, we apply the STDP update rule to the connections from the input layer to itself, and update the synapse conductance of the inhibitory neuron to which it connects. Since there are approximately 15 spikes emitted from the excitatory layer per training iteration, and each such spike updates all 784 input synapses of the spiking neuron, there are approximately $15 \times 784 \approx 12{,}000$ weight updates per iteration.
All together, over one training iteration (in which a single training example is presented to the network), there are on the order of $10^7$ operations to compute (we only calculate weight updates during the active input period; the network is at "rest" for the last 300 time steps). For example, with a network of 400 excitatory and inhibitory neurons (which gives approximately 90% classification accuracy, depending on the STDP rule utilized), this amounts to approximately 10.6 million operations per training example.
Since the STDP learning rule is applied only locally (i.e., between any two given neurons), we do not have to wait to propagate signals through the network before computing a backward pass, as in gradient-based deep learning methods. Additionally, very little memory is required to simulate these networks. We keep track only of matrices of weights, membrane potentials, synapse conductances, and synaptic traces, and even with large networks, the memory overhead is manageable on laptop machines.
4.2 - Convolutional Neural Network Computation - MNIST
We estimate the computational burden imposed by a simple convolutional network used to classify the MNIST handwritten digit dataset. The described network achieves approximately 99% classification accuracy on the test dataset after training on some 500,000 images (we subsample 50 images from the training dataset 10,000 times). All convolutional layers of the network utilize zero-padding of size 2 on all edges of the images. The network consists of a convolutional layer with patch size 5x5 and stride 1, with 32 convolution features (giving 28x28x32 dimensionality); then a max-pooling layer with patch size 2x2 and stride 2 (giving 14x14x32 dimensionality); another convolutional layer, again with patch size 5x5 and stride 1, with 64 convolution features (giving 14x14x64 dimensionality); another max-pooling layer, again with patch size 2x2 and stride 2 (giving 7x7x64 dimensionality); a fully-connected layer of size 1024; and a logistic regression classifier of dimensionality 10, from which we extract the digit label using the softmax function.
For a single training example, the first convolutional layer of this network requires the computation of 28*28*32 ReLU (rectified linear unit) functions, each on the sum of 5*5 entries in the input layer multiplied by their connection weights, for a total of on the order of 627,200 operations. The first max-pooling layer simply requires computing the max operation over some 14*14*32 2x2 matrices, for a total of 6,272 operations. Similarly, the second convolutional layer and the second max-pooling layer require some 313,600 and 3,136 operations, respectively. The fully-connected layer requires computing 1,024 ReLU functions, each on the sum of products of the entries in the previous layer by their edge weights, resulting in on the order of 7*7*64*1,024 = 3,211,264 operations. Finally, the logistic regression requires another 1024*10 operations, and then simply takes the argmax() of the output 10-dimensional vector. Therefore, the forward pass of the network takes approximately 4,171,712 operations. The backward pass, on the other hand, requires computing partial derivatives for each parameter in each of these functions, but its complexity is the same as that of the forward pass. As a rough estimate, we say that the backward pass requires, again, 4,171,712 operations, and so this network requires approximately 8.3 million operations per training example.
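The per-layer tallies above can be reproduced with simple arithmetic:

```python
# Forward-pass operation tally for the convolutional network described above.
ops = {
    "conv1": 28 * 28 * 32 * (5 * 5),  # 627,200 multiply-adds into ReLUs
    "pool1": 14 * 14 * 32,            # 6,272 max operations over 2x2 blocks
    "conv2": 14 * 14 * 64 * (5 * 5),  # 313,600
    "pool2": 7 * 7 * 64,              # 3,136
    "fc":    7 * 7 * 64 * 1024,       # 3,211,264
    "logit": 1024 * 10,               # 10,240
}
forward = sum(ops.values())           # 4,171,712 operations
total = 2 * forward                   # ~8.3 million with the backward pass
```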
4.3 - Classification Results

We include a plot of the performance of the spiking neural network model from [1] on the MNIST handwritten digit test dataset in Figure 12. This plot gives the classification accuracy of the model as a function of the number of excitatory and inhibitory neurons in the network. Each network was trained using a single pass through the entire 60,000-digit training dataset. We have also included the classification accuracy of the above-described convolutional neural network, also on the test dataset. We trained and tested the SNN model using excitatory/inhibitory layer sizes of 25, 50, 100, 200, 400, 800, 1600, 3200, and 6400. The classification accuracies for these particular runs increase monotonically, as can be seen in the figure.

Figure 12: Classification accuracy of ETH SNN model by network size
4.4 - Possible Improvements - Spiking Neural Network

One simple way to reduce the time complexity of the spiking neural network model from [1] is to reduce the number of time steps used per iteration of the training and test phases. Currently, we "present" the network with an input for 350 ms (700 time steps at a 0.5 ms time increment), and then "turn off" the input for 150 ms (300 time steps), allowing the membrane potentials and synapse conductances of the excitatory and inhibitory neurons to decay back towards their resting values. Reducing the number of time steps would require adjusting the parameters of the equations which govern the neurons in the network so that the neurons respond to the input data more quickly.
A current severe limitation of the BRIAN neural network simulation software (we use version 1.4.3) is that it works most efficiently as a single-core process. Upgrading to version 2.2.x should allow us to take advantage of multi-core and GPU processing in order to speed up our experiments, but we must be careful to replicate the code correctly and consider all the parallel processing options available.

5. References
[1] Diehl, P. & Cook, M. (2015). Unsupervised learning of digit recognition using spike-timing-dependent plasticity. Frontiers in Computational Neuroscience, 9. doi:10.3389/fncom.2015.00099
[2] Krizhevsky, A., Sutskever, I., & Hinton, G. (2012). ImageNet Classification with Deep Convolutional Neural Networks. Advances in Neural Information Processing Systems 25.
[3] Goodman, D. F. & Brette, R. (2009). The Brian simulator. Frontiers in Neuroscience. doi:10.3389/neuro.01.026.2009
[4] http://www.cns.nyu.edu/~david/handouts/poisson.pdf