CS343: Artificial Intelligence
Perceptrons

Prof. Scott Niekum, The University of Texas at Austin
[These slides based on those of Dan Klein and Pieter Abbeel for CS188 Intro to AI at UC Berkeley. All CS188 materials are available at http://ai.berkeley.edu.]
Error-Driven Classification

Errors, and What to Do

▪ Examples of errors

Dear GlobalSCAPE Customer,
GlobalSCAPE has partnered with ScanSoft to offer you the latest version of OmniPage Pro, for just $99.99* - the regular list price is $499! The most common question we've received about this offer is - Is this genuine? We would like to assure you that this offer is authorized by ScanSoft, is genuine and valid. You can get the . . .
. . . To receive your $30 Amazon.com promotional certificate, click through to
http://www.amazon.com/apparel
and see the prominent link for the $30 offer. All details are there. We hope you enjoyed receiving this message. However, if you'd rather not receive future e-mails announcing new store launches, please click . . .
What to Do About Errors

▪ Problem: there’s still spam in your inbox
▪ Need more features – words aren’t enough!
  ▪ Have you emailed the sender before?
  ▪ Have 1M other people just gotten the same email?
  ▪ Is the sending information consistent?
  ▪ Is the email in ALL CAPS?
  ▪ Do inline URLs point where they say they point?
  ▪ Does the email address you by (your) name?
▪ Naïve Bayes models can incorporate a variety of features, but tend to do best in homogeneous cases (e.g. all features are word occurrences)
Linear Classifiers

Feature Vectors

Hello,

Do you want free printr cartriges? Why pay more when you can get them ABSOLUTELY FREE! Just

# free : 2
YOUR_NAME : 0
MISSPELLED : 2
FROM_FRIEND : 0
...

→ SPAM or +

PIXEL-7,12 : 1
PIXEL-7,13 : 0
...
NUM_LOOPS : 1
...

→ “2”
Some (Simplified) Biology

▪ Very loose inspiration: human neurons
Linear Classifiers

▪ Inputs are feature values
▪ Each feature has a weight
▪ Sum is the activation: activation_w(x) = Σ_i w_i · f_i(x) = w · f(x)
▪ If the activation is:
  ▪ Positive, output +1
  ▪ Negative, output -1

[Diagram: inputs f1, f2, f3, weighted by w1, w2, w3, feed a sum Σ, followed by a “> 0?” test]
Weights

▪ Binary case: compare features to a weight vector
▪ Learning: figure out the weight vector from examples

# free : 2
YOUR_NAME : 0
MISSPELLED : 2
FROM_FRIEND : 0
...

# free : 4
YOUR_NAME : -1
MISSPELLED : 1
FROM_FRIEND : -3
...

# free : 0
YOUR_NAME : 1
MISSPELLED : 1
FROM_FRIEND : 1
...

Dot product positive means the positive class
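The dot-product decision rule can be sketched in a few lines of Python, using the weight vector and the two example feature vectors shown on the slide (a minimal illustration, not part of the original deck):

```python
def classify(w, f):
    """Binary linear classifier: positive dot product -> +1 (SPAM), else -1 (HAM)."""
    activation = sum(w[k] * f.get(k, 0) for k in w)
    return 1 if activation > 0 else -1

# Weight vector and feature vectors from the slide
w = {"free": 4, "YOUR_NAME": -1, "MISSPELLED": 1, "FROM_FRIEND": -3}
email1 = {"free": 2, "YOUR_NAME": 0, "MISSPELLED": 2, "FROM_FRIEND": 0}  # dot product 10 -> SPAM
email2 = {"free": 0, "YOUR_NAME": 1, "MISSPELLED": 1, "FROM_FRIEND": 1}  # dot product -3 -> HAM
```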
Decision Rules

Binary Decision Rule

▪ In the space of feature vectors
  ▪ Examples are points
  ▪ Any weight vector is a hyperplane
  ▪ One side corresponds to Y = +1
  ▪ The other corresponds to Y = -1

BIAS : -3
free : 4
money : 2
...

[Plot: feature space with axes “free” and “money”; the weight vector’s hyperplane separates the +1 = SPAM region from the -1 = HAM region]
Weight Updates

Learning: Binary Perceptron

▪ Start with weights = 0
▪ For each training instance:
  ▪ Classify with current weights
  ▪ If correct (i.e., y = y*), no change!
  ▪ If wrong: adjust the weight vector by adding or subtracting the feature vector. Subtract if y* is -1.
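The binary perceptron loop above can be sketched directly in Python (a minimal illustration; the toy spam-like feature vectors and labels below are invented for the example):

```python
def perceptron_train(data, epochs=10):
    """Binary perceptron: data is a list of (feature_dict, label), label in {+1, -1}."""
    w = {}  # start with weights = 0 (missing keys default to 0)
    for _ in range(epochs):
        for f, y_star in data:
            # classify with current weights
            activation = sum(w.get(k, 0.0) * v for k, v in f.items())
            y = 1 if activation > 0 else -1
            if y != y_star:
                # if wrong: w <- w + y* * f  (add or subtract the feature vector)
                for k, v in f.items():
                    w[k] = w.get(k, 0.0) + y_star * v
    return w

# Toy (invented) separable data: "free" indicates spam (+1)
data = [({"BIAS": 1, "free": 2}, 1),
        ({"BIAS": 1, "friend": 1}, -1),
        ({"BIAS": 1, "free": 1, "friend": 1}, 1)]
w = perceptron_train(data)
```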
Examples: Perceptron

▪ Separable Case
Multiclass Decision Rule

▪ If we have multiple classes:
  ▪ A weight vector for each class: w_y
  ▪ Score (activation) of a class y: w_y · f(x)
  ▪ Prediction: highest score wins, y = argmax_y w_y · f(x)

Binary = multiclass where the negative class has weight zero
Learning: Multiclass Perceptron

▪ Start with all weights = 0
▪ Pick up training examples one by one
▪ Predict with current weights: y = argmax_y w_y · f(x)
▪ If correct, no change!
▪ If wrong: lower the score of the wrong answer (w_y ← w_y - f(x)), raise the score of the right answer (w_{y*} ← w_{y*} + f(x))
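The multiclass update can be sketched the same way (a minimal illustration; the two toy “win the vote” / “win the game” examples below are invented to mirror the slide’s running example):

```python
def multiclass_perceptron_train(data, classes, epochs=10):
    """Multiclass perceptron: one weight dict per class, start all at zero."""
    w = {c: {} for c in classes}

    def score(c, f):
        return sum(w[c].get(k, 0.0) * v for k, v in f.items())

    for _ in range(epochs):
        for f, y_star in data:
            y = max(classes, key=lambda c: score(c, f))  # predict: highest score wins
            if y != y_star:
                for k, v in f.items():
                    w[y][k] = w[y].get(k, 0.0) - v            # lower wrong answer
                    w[y_star][k] = w[y_star].get(k, 0.0) + v  # raise right answer
    return w

# Toy (invented) examples in the spirit of the slide
data = [({"BIAS": 1, "win": 1, "the": 1, "vote": 1}, "politics"),
        ({"BIAS": 1, "win": 1, "the": 1, "game": 1}, "sports")]
w = multiclass_perceptron_train(data, ["sports", "politics"])
```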
Example: Multiclass Perceptron

“win the vote”

BIAS : 1   win : 0   game : 0   vote : 0   the : 0 ...
BIAS : 0   win : 0   game : 0   vote : 0   the : 0 ...
BIAS : 0   win : 0   game : 0   vote : 0   the : 0 ...
Properties of Perceptrons

▪ Separability: true if some parameters get the training set perfectly correct
▪ Convergence: if the training data is separable, the perceptron will eventually converge (binary case)
▪ Mistake Bound: the maximum number of mistakes (binary case) is related to the margin, or degree of separability

[Figures: a separable point set and a non-separable point set]
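The mistake bound referenced above is usually stated as follows (a standard result, not spelled out in the extracted text; here δ is the margin of the best separator and k bounds the size of the feature vectors, assuming the data is separable):

```latex
\text{mistakes} \;<\; \frac{k}{\delta^{2}}
```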
Examples: Perceptron

▪ Non-Separable Case
Improving the Perceptron

Problems with the Perceptron

▪ Noise: if the data isn’t separable, weights might thrash
  ▪ Averaging weight vectors over time can help (averaged perceptron)
▪ Mediocre generalization: finds a “barely” separating solution
▪ Overtraining: test/held-out accuracy usually rises, then falls
  ▪ Overtraining is a kind of overfitting
Fixing the Perceptron

▪ Idea: adjust the weight update to mitigate these effects
▪ MIRA*: choose an update size that fixes the current mistake…
▪ …but minimizes the change to w
▪ The +1 helps to generalize

*Margin Infused Relaxed Algorithm
Minimum Correcting Update

The min is not at τ = 0, or w would not have made an error, so the min will be where equality holds
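The minimum correcting update can be written out explicitly. The following is a reconstruction of the standard MIRA derivation (the slide’s equations were lost in extraction), with f the feature vector of the current example, y the wrong guess, and y* the correct class:

```latex
% Smallest change to the weights that fixes the mistake with margin 1:
\min_{w'} \; \tfrac{1}{2} \sum_{y} \lVert w'_y - w_y \rVert^{2}
\quad \text{s.t.} \quad w'_{y^{*}} \cdot f \;\ge\; w'_y \cdot f + 1

% The update has the form
w'_{y^{*}} = w_{y^{*}} + \tau f, \qquad w'_y = w_y - \tau f,

% and solving for the minimal \tau where equality holds gives
\tau = \frac{(w_y - w_{y^{*}}) \cdot f + 1}{2\, f \cdot f}
```

The +1 in the numerator is the margin term the slide refers to.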
Maximum Step Size

▪ In practice, it’s also bad to make updates that are too large
  ▪ Example may be labeled incorrectly
  ▪ You may not have enough features
  ▪ Solution: cap the maximum possible value of τ with some constant C
▪ Corresponds to an optimization that assumes non-separable data
▪ Usually converges faster than perceptron
▪ Usually better, especially on noisy data
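A capped MIRA step for the binary case can be sketched as follows (a minimal illustration assuming the standard binary formulation, where the single weight vector is updated by τ·y*·f and τ is the smallest step that fixes the mistake with margin 1, capped at C):

```python
def mira_update(w, f, y_star, C=0.01):
    """One binary MIRA step: if misclassified, take the smallest step that
    would classify the example correctly with margin 1, capped at C.
    w and f are equal-length lists of floats; y_star is +1 or -1."""
    act = sum(wi * fi for wi, fi in zip(w, f))
    y = 1 if act > 0 else -1
    if y == y_star:
        return w                        # correct: no change
    norm_sq = sum(fi * fi for fi in f)
    tau = (1 - y_star * act) / norm_sq  # minimum correcting step size
    tau = min(tau, C)                   # cap the maximum step size
    return [wi + tau * y_star * fi for wi, fi in zip(w, f)]
```

With a large cap, the updated weights put the example exactly on the margin; with a small cap, the step is truncated, which protects against mislabeled examples.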
Linear Separators

▪ Which of these linear separators is optimal?

Support Vector Machines

▪ Maximizing the margin: good according to intuition, theory, practice
▪ Only support vectors matter; other training examples are ignorable
▪ Support vector machines (SVMs) find the separator with max margin
▪ Basically, SVMs are MIRA where you optimize over all examples at once
MIRA (one example): minimize ½‖w − w'‖², subject to the current example being classified correctly with margin 1

SVM (all examples): minimize ½‖w‖², subject to every example i and every wrong class y satisfying w_{y*_i} · f(x_i) ≥ w_y · f(x_i) + 1
Classification: Comparison

▪ Naïve Bayes
  ▪ Builds a model of the training data
  ▪ Gives prediction probabilities
  ▪ Strong assumptions about feature independence
  ▪ One pass through data (counting)
▪ Perceptrons / MIRA:
  ▪ Makes fewer assumptions about data
  ▪ Mistake-driven learning
  ▪ Multiple passes through data (prediction)
  ▪ Often more accurate
Apprenticeship

Pacman Apprenticeship!

▪ Examples are states s
▪ Candidates are pairs (s, a)
▪ “Correct” actions: those taken by the expert, a*
▪ Features defined over (s, a) pairs: f(s, a)
▪ Score of a q-state (s, a) given by: w · f(s, a)
▪ How is this VERY different from reinforcement learning?

Video of Pacman Apprentice
Web Search

Extension: Web Search

▪ Information retrieval:
  ▪ Given information needs, produce information
  ▪ Includes, e.g., web search, question answering, and classic IR
▪ Web search: not exactly classification, but rather ranking

x = “Apple Computers”

Feature-Based Ranking

x = “Apple Computer”

[Candidate result pages, each paired with the query as a feature vector f(x, y)]
Perceptron for Ranking

▪ Inputs: queries x
▪ Candidates: results y
▪ Many feature vectors: f(x, y)
▪ One weight vector: w
▪ Prediction: y = argmax_y w · f(x, y)
▪ Update (if wrong): w = w + f(x, y*) - f(x, y)
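The ranking perceptron above can be sketched in the same style as the earlier classifiers (a minimal illustration; the candidate pages and their features below are invented for the example):

```python
def score(w, f):
    return sum(w.get(k, 0.0) * v for k, v in f.items())

def rank_predict(w, feats):
    """feats maps each candidate y to its feature vector f(x, y); highest score wins."""
    return max(feats, key=lambda y: score(w, feats[y]))

def rank_update(w, feats, y_star, y):
    """If the prediction was wrong: w <- w + f(x, y*) - f(x, y)."""
    if y != y_star:
        for k, v in feats[y_star].items():
            w[k] = w.get(k, 0.0) + v
        for k, v in feats[y].items():
            w[k] = w.get(k, 0.0) - v
    return w

# Toy query with two candidate pages (invented features); page_b is "correct"
feats = {"page_a": {"title_match": 1, "spammy": 1},
         "page_b": {"title_match": 1, "popular": 1}}
w = {}
y = rank_predict(w, feats)          # at w = 0 all scores tie; ties broken arbitrarily
w = rank_update(w, feats, "page_b", y)
```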