Active Learning
Transcript of Active Learning
-
7/31/2019 Active Learn
1/45
Active Learning
Aarti Singh
Machine Learning 10-601
Dec 6, 2011
Slides Courtesy: Burr Settles, Rui Castro, Rob Nowak
-
Semi-supervised learning: Design a predictor based on iid unlabeled and few randomly labeled examples.
Learning algorithm
Assumption: Knowledge of the marginal density can simplify prediction, e.g. similar data points have similar labels.
Learning from unlabeled data
-
Active learning: Design a predictor based on iid unlabeled and selectively labeled examples.
Learning algorithm
Selective labeling
Assumption: Some unlabeled examples are more informative than others for prediction.
Learning from unlabeled data
-
[Figure: clusters of handwritten digit images, with labels 3, 5, 8, 7, 9, 4, 2, 1]
Many unlabeled data plus a few labeled examples.
Knowledge of the clusters plus a few labels in each is sufficient to design a good predictor. Semi-supervised learning.
Example: Hand-written digit recognition
-
Not all examples are created equal.
Labeled examples near the boundaries of clusters are much more informative. Active learning.
Example: Hand-written digit recognition
-
Passive Learning
-
Semi-supervised Learning
-
Active Learning
-
The eyes focus on the interesting and relevant features, and do not sample all the regions in the scene in the same way.
Feedback-driven learning
-
Feedback-driven learning
-
Is the person wearing a hat?
Does the person have blue eyes?
Active learning works very well in simple conditions.
The Twenty Questions game
Focus on the most informative questions.
-
Thought Experiment
Suppose you're the leader of an Earth convoy sent to colonize planet Mars.
People who ate the round Martian fruits found them tasty!
People who ate the spiked Martian fruits died!
-
Poison vs. Yummy Fruits
Problem: there's a range of spiky-to-round fruit shapes on Mars.
You need to learn the threshold of roundness where the fruits go from poisonous to safe.
And you need to determine this risking as few colonists' lives as possible!
-
Testing Fruit Safety
This is just a binary bisection search.
Your first active learning algorithm!
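The bisection search on this slide can be sketched in a few lines. Assuming a noiseless 0/1 labeling (0 = safe below the threshold, 1 = poisonous above), each query halves the interval that contains the change-point, so n queries pin it down to within 2^-n. Function and variable names are illustrative, not from the slides.

```python
def bisection_learn(query, lo=0.0, hi=1.0, tol=1e-3):
    """Locate the 0 -> 1 change-point of a noiseless step function.

    `query(x)` returns the label at x; each call is one label query
    (on Mars: one colonist tastes a fruit of roundness x).
    """
    n_queries = 0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        n_queries += 1
        if query(mid) == 1:      # poisonous: the threshold is at or below mid
            hi = mid
        else:                    # safe: the threshold is above mid
            lo = mid
    return (lo + hi) / 2.0, n_queries

# Example: true threshold at 0.37; 10 queries reach 1e-3 accuracy,
# versus about 1000 samples for a passive uniform grid.
theta_hat, n = bisection_learn(lambda x: int(x >= 0.37))
```

The loop keeps the invariant that the change-point lies inside (lo, hi], which is exactly why the final midpoint is within `tol` of the truth.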
-
Active Learning
Key idea: the learner can choose training data on the fly.
On Mars: whether a fruit was poisonous/safe. In general: the true label of some instance.
Goal: reduce the training costs.
On Mars: the number of lives at risk. In general: the number of queries.
-
Goal: Given a budget of n samples, learn the threshold (a step function) as accurately as possible.
Learning a change-point
Locate a change-point or threshold (poisonous/yummy fruit, contamination boundary).
-
Sample locations must be chosen before any observations are made.
Passive Learning
-
Sample locations must be chosen before any observations are made.
Too many wasted samples. Learning is limited by the sampling resolution.
Passive Learning
-
Sample locations are chosen based on previous observations.
Active Learning
-
Sample locations are chosen based on previous observations.
The error decays much faster than in the passive scenario. No wasted samples. Exponential improvement!
Works even when labels are noisy, though the improvement depends on the amount of noise.
Active Learning
-
Practical Learning Curves
Text classification: baseball vs. hockey
[Figure: learning curves comparing active learning and passive learning; the active curve is better]
-
Probabilistic Binary Bisection
Let's try generalizing our binary search method using a probabilistic classifier.
[Figure: predicted probabilities 0.0, 0.5, 1.0 along the interval]
-
Uncertainty Sampling
Query instances the learner is most uncertain about.
400 instances sampled from 2-class Gaussians.
Random sampling, 30 labeled instances (accuracy = 0.7).
Active learning, 30 labeled instances (accuracy = 0.9).
[Lewis & Gale, SIGIR '94]
Using logistic regression
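The Lewis & Gale setup above can be sketched as a pool-based loop: fit a probabilistic classifier on the labeled set, then query the pool instance whose predicted probability is closest to 0.5. The tiny gradient-descent logistic regression below is a stand-in for whatever classifier is actually used; all names, constants, and the seeding scheme are illustrative assumptions, not from the slides.

```python
import numpy as np

def sigmoid(z):
    # clipped for numerical stability
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30.0, 30.0)))

def fit_logistic(X, y, lr=0.5, steps=500):
    """Plain gradient-descent logistic regression (illustrative classifier)."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        p = sigmoid(X @ w)
        w -= lr * X.T @ (p - y) / len(y)
    return w

def uncertainty_sampling(X_pool, oracle, n_labels, seed=0):
    """Query the instance with predicted probability closest to 0.5."""
    rng = np.random.default_rng(seed)
    X = np.c_[np.ones(len(X_pool)), X_pool]               # bias column
    labeled = list(rng.choice(len(X), size=2, replace=False))  # tiny seed set
    y = {i: oracle(i) for i in labeled}
    while len(labeled) < n_labels:
        w = fit_logistic(X[labeled], np.array([y[i] for i in labeled]))
        p = sigmoid(X @ w)
        pool_idx = [i for i in range(len(X)) if i not in y]
        i_star = min(pool_idx, key=lambda i: abs(p[i] - 0.5))  # most uncertain
        labeled.append(i_star)
        y[i_star] = oracle(i_star)                         # one label query
    return fit_logistic(X[labeled], np.array([y[i] for i in labeled]))
```

On well-separated two-class Gaussians this tends to spend its label budget near the decision boundary, mirroring the slide's random-vs-active comparison.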
-
Generalizing to Multi-Class Problems
Least confident [Culotta & McCallum, AAAI '05]
Smallest margin [Scheffer et al., CAIDA '01]
Entropy [Dagan & Engelson, ICML '95]
Note: for binary tasks, these are equivalent.
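The three measures can be written directly in terms of the model's predicted class-probability vector. This is a hedged sketch; the cited papers apply the measures in richer settings (e.g. sequence models), and the function names are illustrative.

```python
import numpy as np

def least_confident(p):
    """1 - P(most likely label); query the LARGEST value."""
    return 1.0 - np.max(p)

def smallest_margin(p):
    """P(best) - P(second best); query the SMALLEST value."""
    top2 = np.sort(p)[-2:]
    return top2[1] - top2[0]

def entropy(p):
    """Shannon entropy of the label distribution; query the LARGEST value."""
    nz = p[p > 0]
    return -np.sum(nz * np.log(nz))
```

For a two-class vector p = (q, 1 - q), each measure is a monotone function of |q - 0.5|, which is why they rank instances identically in the binary case.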
-
Query-By-Committee
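One common instantiation of query-by-committee scores each pool instance by the entropy of the committee's votes and queries the biggest disagreement. This is a hedged sketch with illustrative names (committees are typically built by resampling or randomizing the labeled set, which is omitted here):

```python
import numpy as np

def vote_entropy(votes, n_classes):
    """Disagreement of one instance's committee votes (vote entropy)."""
    freq = np.bincount(votes, minlength=n_classes) / len(votes)
    nz = freq[freq > 0]
    return -np.sum(nz * np.log(nz))

def qbc_select(committee_votes, n_classes):
    """committee_votes: (committee_size, pool_size) matrix of predicted
    labels. Returns the pool index the committee disagrees on most."""
    scores = [vote_entropy(committee_votes[:, j], n_classes)
              for j in range(committee_votes.shape[1])]
    return int(np.argmax(scores))
```

Instances every committee member labels the same way score zero, so queries concentrate where the version space is still undecided.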
-
Version Space Examples
-
QBC Example
-
QBC Guarantees
[Freund et al., '97] theoretical guarantees; d = VC dimension of the committee classifiers.
Under some mild conditions, the QBC algorithm achieves prediction error ε w.h.p., with:
# unlabeled examples generated: O(d/ε)
# labels queried: O(log₂(d/ε))
Exponential improvement!!
-
Batch-based active learning
Wireless sensor networks / mobile sensing
Active sensing
-
Coarse sampling (low variance, bias limited)
Refine sampling (low variance, low bias)
Batch-based active learning
-
Active Learning for Terrain Mapping
-
When does active learning work?
Passive = Active (in some settings).
Active learning is useful if the complexity of the target function is localized, i.e. labels of some data points are more informative than others.
[Figure: passive vs. active learning rates for 1-D and 2-D examples, Castro et al., '05]
-
Active vs. Semi-Supervised
Active learning:
Uncertainty sampling: query instances the model is least confident about.
Query-by-committee (QBC): use ensembles to rapidly reduce the version space.
Semi-supervised learning:
Generative model / expectation-maximization (EM): propagate confident labelings among unlabeled data.
Co-training / multi-view learning: use ensembles with multiple views to constrain the version space.
Both try to attack the same problem: making the most of unlabeled data.
-
Problem: Outliers
An instance may be uncertain or controversial (for QBC) simply because it's an outlier.
Querying outliers is not likely to help us reduce error on more typical data.
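The outlier problem motivates the density weighting listed in the summary slide: scale each candidate's uncertainty by its average similarity to the rest of the pool, so isolated points score low. A hedged numpy sketch; the Gaussian similarity, the `beta` exponent, and all names are illustrative choices, not from the slides.

```python
import numpy as np

def density_weighted_scores(X_pool, uncertainty, beta=1.0, bandwidth=1.0):
    """uncertainty: (pool_size,) array, e.g. 1 - max class probability.
    Returns uncertainty scores down-weighted for low-density points."""
    # average Gaussian similarity of each point to the whole pool
    d2 = ((X_pool[:, None, :] - X_pool[None, :, :]) ** 2).sum(-1)
    density = np.exp(-d2 / (2 * bandwidth ** 2)).mean(axis=1)
    return uncertainty * density ** beta   # outliers are down-weighted
```

Even if an outlier is the single most uncertain instance, its near-zero density keeps it from being queried ahead of uncertain points in dense regions.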
-
Solution 2: Estimated Error Reduction
Minimize the risk R(x) of a query candidate: the expected uncertainty over U if x is added to L.
[Roy & McCallum, ICML '01; Zhu et al., ICML-WS '03]
R(x) = Σ_y P(y|x) Σ_{u ∈ U} H(Y_u ; L ∪ {(x, y)})
(expectation over possible labelings of x; sum over unlabeled instances; H is the uncertainty of u after retraining with x)
-
Text Classification Examples
[Roy & McCallum, ICML '01]
-
Active Learning Scenarios
Query synthesis: construct the desired query/question.
Stream-based selective sampling: unlabeled data are presented in a stream; decide whether or not to query each instance's label.
Pool-based active learning: given a pool of unlabeled data, select one and query its label.
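The pool-based scenario is the one used throughout these slides; its generic loop can be sketched as follows, with `score` standing in for any of the criteria above (uncertainty, vote entropy, negative risk). All names are illustrative.

```python
def pool_based_active_learning(pool, seed_labels, oracle, score, fit, budget):
    """Generic pool-based loop: fit, score every candidate, query the
    top-scoring one, repeat until the label budget is spent."""
    labeled = dict(seed_labels)                 # index -> label
    for _ in range(budget):
        model = fit(labeled)
        candidates = [i for i in range(len(pool)) if i not in labeled]
        if not candidates:
            break
        i_star = max(candidates, key=lambda i: score(model, pool[i]))
        labeled[i_star] = oracle(i_star)        # one label query per round
    return fit(labeled), labeled
```

Stream-based selective sampling differs only in that `candidates` is a single arriving instance and the decision is query-or-discard; query synthesis replaces the `max` over a pool with constructing an input directly.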
-
Alternate Settings
So far we focused on querying labels for unlabeled data.
Other query types:
Active feature acquisition: deciding whether or not to obtain a particular feature, e.g. features such as gene expressions might be correlated.
Multiple-instance active learning: one label for a bag of instances, e.g. a label for a document (bag of instances), but one can query passages (instances); coarse-scale labels are cheaper.
Other settings:
Cost-sensitive active learning: some labels may be more expensive than others, e.g. collecting patient vitals vs. complex and expensive medical procedures for diagnosis.
Multi-task active learning: if each label provides information for multiple tasks, which instances should be queried so as to be maximally informative across all tasks? E.g. an image can be labeled as art/photo, nature/man-made objects, contains a face or not.
-
Active Learning Summary
Binary bisection, Uncertainty sampling, Query-by-committee, Density weighting, Estimated error reduction.
Extensions: Active feature acquisition, Multiple-instance active learning, Cost-sensitive active learning, Multi-task active learning.
Active learning is a powerful tool if the complexity of the target function is localized, i.e. labels of some data points are more informative than others.