Artificial Intelligence: Perceptrons
Instructors: David Suter and Qince Li
Course delivered @ Harbin Institute of Technology. [Many slides adapted from those created by Dan Klein and Pieter Abbeel for CS188 Intro to AI at UC Berkeley. Some others from colleagues at Adelaide University.]
Error-Driven Classification

What to Do About Errors
§ Problem: there's still spam in your inbox
§ Need more features – words aren't enough!
  § Have you emailed the sender before?
  § Have 1M other people just gotten the same email?
  § Is the sending information consistent?
  § Is the email in ALL CAPS?
  § Do inline URLs point where they say they point?
  § Does the email address you by (your) name?
§ Naïve Bayes models can incorporate a variety of features, but tend to do best in homogeneous cases (e.g. all features are word occurrences)
Linear Classifiers

Feature Vectors
§ Email example: "Hello, Do you want free printr cartriges? Why pay more when you can get them ABSOLUTELY FREE! Just ..."
  § Features: # free : 2, YOUR_NAME : 0, MISSPELLED : 2, FROM_FRIEND : 0, ... → label: SPAM (+)
§ Digit-image example: PIXEL-7,12 : 1, PIXEL-7,13 : 0, ..., NUM_LOOPS : 1, ... → label: "2"
Some (Simplified) Biology
§ Very loose inspiration: human neurons
Linear Classifiers
§ Inputs are feature values
§ Each feature has a weight
§ Sum is the activation: activation_w(x) = Σ_i w_i · f_i(x) = w · f(x)
§ If the activation is:
  § Positive, output +1
  § Negative, output -1
[Diagram: inputs f1, f2, f3 multiplied by weights w1, w2, w3, summed (Σ), then thresholded (>0?)]
Weights
§ Binary case: compare features to a weight vector
§ Learning: figure out the weight vector from examples

Weight vector w:   # free : 4, YOUR_NAME : -1, MISSPELLED : 1, FROM_FRIEND : -3, ...
Example (spam):    # free : 2, YOUR_NAME : 0, MISSPELLED : 2, FROM_FRIEND : 0, ...
Example (ham):     # free : 0, YOUR_NAME : 1, MISSPELLED : 1, FROM_FRIEND : 1, ...

§ Dot product positive means the positive class
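The dot-product rule can be sketched in a few lines of Python. The weight and feature values below are the ones from the slide; the function names are my own:

```python
# Weight vector for SPAM (+1) vs. HAM (-1), values from the slide.
weights = {"free": 4, "YOUR_NAME": -1, "MISSPELLED": 1, "FROM_FRIEND": -3}

def dot(weights, features):
    """Dot product over the features present in the example."""
    return sum(weights.get(name, 0) * value for name, value in features.items())

def classify(weights, features):
    """Positive dot product -> positive class (+1), otherwise -1."""
    return +1 if dot(weights, features) > 0 else -1

spam_email = {"free": 2, "YOUR_NAME": 0, "MISSPELLED": 2, "FROM_FRIEND": 0}
ham_email  = {"free": 0, "YOUR_NAME": 1, "MISSPELLED": 1, "FROM_FRIEND": 1}

print(classify(weights, spam_email))  # 4*2 + 1*2 = 10 > 0  -> +1 (SPAM)
print(classify(weights, ham_email))   # -1 + 1 - 3 = -3 < 0 -> -1 (HAM)
```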
Decision Rules

Binary Decision Rule
§ In the space of feature vectors:
  § Examples are points
  § Any weight vector is a hyperplane
  § One side corresponds to Y = +1
  § Other corresponds to Y = -1
w:  BIAS : -3, free : 4, money : 2, ...
[Plot: feature space with axes "free" and "money" (ticks 0, 1, 2); the line w · f(x) = 0 separates +1 = SPAM from -1 = HAM]
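One way to read the BIAS weight is as the weight of an always-on feature, which shifts the hyperplane away from the origin. A minimal sketch using the slide's weight vector (the two test emails are made up for illustration):

```python
# Slide's weight vector; BIAS acts as a feature that is always 1.
weights = {"BIAS": -3, "free": 4, "money": 2}

def classify(weights, features):
    """+1 = SPAM, -1 = HAM: sign of the activation w . f(x)."""
    activation = sum(weights[name] * features.get(name, 0) for name in weights)
    return +1 if activation > 0 else -1

# Hypothetical emails described by counts of "free" and "money".
print(classify(weights, {"BIAS": 1, "free": 1, "money": 1}))  # -3+4+2 =  3 -> +1
print(classify(weights, {"BIAS": 1, "free": 0, "money": 1}))  # -3+0+2 = -1 -> -1
```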
Weight Updates
Learning: Binary Perceptron
§ Start with weights = 0
§ For each training instance:
  § Classify with current weights
  § If correct (i.e., y = y*), no change!
  § If wrong: adjust the weight vector by adding or subtracting the feature vector (subtract if y* is -1): w ← w + y* · f(x)
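The update loop above can be sketched directly; the tiny training set below is made up for illustration, with a BIAS feature folded in as in the earlier slide:

```python
def perceptron_predict(weights, features):
    """Sign of the activation w . f(x)."""
    activation = sum(weights.get(f, 0) * v for f, v in features.items())
    return +1 if activation > 0 else -1

def perceptron_train(data, epochs=10):
    """Binary perceptron. data: list of (feature_dict, label), label in {+1, -1}."""
    weights = {}
    for _ in range(epochs):
        for features, y_star in data:
            y = perceptron_predict(weights, features)
            if y != y_star:  # wrong: add (y* = +1) or subtract (y* = -1) f(x)
                for f, v in features.items():
                    weights[f] = weights.get(f, 0) + y_star * v
    return weights

# Tiny separable toy set (hypothetical counts of "free" plus a bias feature).
data = [
    ({"free": 2, "BIAS": 1}, +1),
    ({"free": 0, "BIAS": 1}, -1),
    ({"free": 3, "BIAS": 1}, +1),
]
w = perceptron_train(data)
print(all(perceptron_predict(w, f) == y for f, y in data))  # True
```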
Examples: Perceptron
§ Separable Case

Real Data
Multiclass Decision Rule
§ If we have multiple classes:
  § A weight vector for each class: w_y
  § Score (activation) of a class y: w_y · f(x)
  § Prediction: highest score wins: y = argmax_y w_y · f(x)
§ Binary = multiclass where the negative class has weight zero
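The argmax decision rule is a one-liner over per-class weight vectors. The weights below are hypothetical, purely to show the mechanics:

```python
def score(w, features):
    """Activation of one class: w_y . f(x)."""
    return sum(w.get(f, 0) * v for f, v in features.items())

def predict(weights_by_class, features):
    """Highest-scoring class wins."""
    return max(weights_by_class, key=lambda y: score(weights_by_class[y], features))

# Hypothetical per-class weight vectors (values made up for illustration).
# Setting one class's weights to all zeros recovers the binary case.
weights_by_class = {
    "spam": {"free": 4, "money": 2},
    "ham":  {"BIAS": 1},
}

print(predict(weights_by_class, {"free": 1, "BIAS": 1}))  # scores: spam 4, ham 1 -> "spam"
print(predict(weights_by_class, {"BIAS": 1}))             # scores: spam 0, ham 1 -> "ham"
```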
Learning: Multiclass Perceptron
§ Start with all weights = 0
§ Pick up training examples one by one
§ Predict with current weights: y = argmax_y w_y · f(x)
§ If correct, no change!
§ If wrong: lower score of wrong answer, raise score of right answer: w_y ← w_y − f(x); w_{y*} ← w_{y*} + f(x)
§ The concept of having a separate set of weights (one for each class) can be thought of as having separate "neurons" – a layer of neurons – one for each class: a single-layer network. Rather than taking the class "max" over the weights, one can train to learn a coding vector…
E.g. learning digits 0, 1, …, 9
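The multiclass update ("lower the wrong answer, raise the right answer") can be sketched on a toy digit task; the shape features below are invented purely for illustration:

```python
def score(w, features):
    return sum(w.get(f, 0) * v for f, v in features.items())

def multiclass_predict(weights, classes, features):
    # Highest activation wins (ties broken by class order).
    return max(classes, key=lambda c: score(weights[c], features))

def multiclass_perceptron_train(data, classes, epochs=10):
    """data: list of (feature_dict, true_class). One weight dict per class."""
    weights = {c: {} for c in classes}
    for _ in range(epochs):
        for features, y_star in data:
            y = multiclass_predict(weights, classes, features)
            if y != y_star:
                for f, v in features.items():
                    weights[y][f] = weights[y].get(f, 0) - v            # lower wrong answer
                    weights[y_star][f] = weights[y_star].get(f, 0) + v  # raise right answer
    return weights

# Toy "digit" examples with made-up shape features (purely illustrative).
classes = ["0", "1", "8"]
data = [
    ({"round": 1, "BIAS": 1}, "0"),
    ({"tall":  1, "BIAS": 1}, "1"),
    ({"curvy": 1, "BIAS": 1}, "8"),
]
weights = multiclass_perceptron_train(data, classes)
print(all(multiclass_predict(weights, classes, f) == y for f, y in data))  # True
```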
Properties of Perceptrons
§ Separability: true if some parameters get the training set perfectly correct
§ Convergence: if the training data is separable, the perceptron will eventually converge (binary case)
[Illustrations: a linearly separable dataset vs. a non-separable dataset]
Improving the Perceptron

Problems with the Perceptron
§ Noise: if the data isn't separable, weights might thrash
  § Averaging weight vectors over time can help (averaged perceptron)
§ Mediocre generalization: finds a "barely" separating solution
§ Overtraining: test/held-out accuracy usually rises, then falls
  § Overtraining is a kind of overfitting
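One common fix mentioned above is the averaged perceptron: return the average of the weight vector over every training step instead of the final (possibly thrashing) one. A minimal sketch on a made-up toy set:

```python
def predict(weights, features):
    activation = sum(weights.get(f, 0) * v for f, v in features.items())
    return +1 if activation > 0 else -1

def averaged_perceptron_train(data, epochs=10):
    """Binary averaged perceptron: standard updates, but the returned weights
    are the average over all steps, which smooths out thrashing on noisy data."""
    weights, totals, steps = {}, {}, 0
    for _ in range(epochs):
        for features, y_star in data:
            if predict(weights, features) != y_star:
                for f, v in features.items():
                    weights[f] = weights.get(f, 0) + y_star * v
            # Accumulate the current weights at every step, not just on mistakes.
            for f, v in weights.items():
                totals[f] = totals.get(f, 0) + v
            steps += 1
    return {f: total / steps for f, total in totals.items()}

# Hypothetical two-example training set with a bias feature.
data = [
    ({"free": 2, "BIAS": 1}, +1),
    ({"friend": 2, "BIAS": 1}, -1),
]
avg_w = averaged_perceptron_train(data)
print(all(predict(avg_w, f) == y for f, y in data))  # True
```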
Fixing the Perceptron
§ Lots of literature on changing the step size, averaging weight updates, etc. …
Linear Separators
§ Which of these linear separators is optimal?
Support Vector Machines
§ Maximizing the margin: good according to intuition, theory, practice
§ Only support vectors matter; other training examples are ignorable
§ Support vector machines (SVMs) find the separator with max margin
§ Basically, SVMs are MIRA where you optimize over all examples at once
[Illustration: SVM max-margin separator]
Classification: Comparison
§ Naïve Bayes
  § Builds a model of the training data
  § Gives prediction probabilities
  § Strong assumptions about feature independence
  § One pass through data (counting)
§ Perceptrons/SVMs
  § Make fewer assumptions about data (? – linear separability is a big assumption! But kernel SVMs etc. weaken that assumption)
  § Mistake-driven learning
  § Multiple passes through data (prediction)
  § Often more accurate