Active Learning
Transcript of Active Learning
-
7/31/2019 Active Learn
1/45
Active Learning
Aarti Singh
Machine Learning 10-601
Dec 6, 2011
Slides Courtesy: Burr Settles, Rui Castro, Rob Nowak
-
Semi-supervised learning: Design a predictor based on iid unlabeled and few randomly labeled examples.
Learning algorithm
Assumption: Knowledge of the marginal density can simplify prediction, e.g. similar data points have similar labels.
Learning from unlabeled data
-
Active learning: Design a predictor based on iid unlabeled and selectively labeled examples.
Learning algorithm
Selective labeling
Assumption: Some unlabeled examples are more informative than others for prediction.
Learning from unlabeled data
-
[Figure: clusters of handwritten digit images, with labels 3, 5, 8, 7, 9, 4, 2, 1]
Many unlabeled data plus a few labeled examples.
Knowledge of the clusters plus a few labels in each is sufficient to design a good predictor. Semi-supervised learning.
Example: Hand-written digit recognition
-
Not all examples are created equal.
Labeled examples near the boundaries of clusters are much more informative. Active learning.
Example: Hand-written digit recognition
-
Passive Learning
-
Semi-supervised Learning
-
Active Learning
-
The eyes focus on the interesting and relevant features, and do not sample all the regions in the scene in the same way.
Feedback-driven learning
-
Feedback-driven learning
-
Is the person wearing a hat?
Does the person have blue eyes?
Active learning works very well in simple conditions.
The Twenty Questions game
Focus on the most informative questions.
-
Thought Experiment
Suppose you're the leader of an Earth convoy sent to colonize planet Mars.
People who ate the round Martian fruits found them tasty!
People who ate the spiked Martian fruits died!
-
Poison vs. Yummy Fruits
Problem: there's a range of spiky-to-round fruit shapes on Mars.
You need to learn the threshold of roundness where the fruits go from poisonous to safe.
And you need to determine this risking as few colonists' lives as possible!
-
Testing Fruit Safety
This is just a binary bisection search.
Your first active learning algorithm!
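The bisection search on this slide can be sketched in a few lines. Assuming a noiseless 0/1 labeling (0 = safe below the threshold, 1 = poisonous above), each query halves the interval that contains the change-point, so n queries pin it down to within 2^-n. Function and variable names are illustrative, not from the slides.

```python
def bisection_learn(query, lo=0.0, hi=1.0, tol=1e-3):
    """Locate the 0 -> 1 change-point of a noiseless step function.

    `query(x)` returns the label at x; each call is one label query
    (on Mars: one colonist tastes a fruit of roundness x).
    """
    n_queries = 0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        n_queries += 1
        if query(mid) == 1:      # poisonous: the threshold is at or below mid
            hi = mid
        else:                    # safe: the threshold is above mid
            lo = mid
    return (lo + hi) / 2.0, n_queries

# Example: true threshold at 0.37; 10 queries reach 1e-3 accuracy,
# versus about 1000 samples for a passive uniform grid.
theta_hat, n = bisection_learn(lambda x: int(x >= 0.37))
```

The loop keeps the invariant that the change-point lies inside (lo, hi], which is exactly why the final midpoint is within `tol` of the truth.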
-
Active Learning
Key idea: the learner can choose training data on the fly.
On Mars: whether a fruit was poisonous/safe. In general: the true label of some instance.
Goal: reduce the training costs.
On Mars: the number of lives at risk. In general: the number of queries.
-
Goal: Given a budget of n samples, learn the threshold (a step function) as accurately as possible.
Learning a change-point
Locate a change-point or threshold (poisonous/yummy fruit, contamination boundary).
-
Sample locations must be chosen before any observations are made.
Passive Learning
-
Sample locations must be chosen before any observations are made.
Too many wasted samples. Learning is limited by the sampling resolution.
Passive Learning
-
Sample locations are chosen based on previous observations.
Active Learning
-
Sample locations are chosen based on previous observations.
The error decays much faster than in the passive scenario. No wasted samples. Exponential improvement!
Works even when labels are noisy, though the improvement depends on the amount of noise.
Active Learning
-
Practical Learning Curves
Text classification: baseball vs. hockey
[Figure: learning curves comparing active learning and passive learning; the active curve is better]
-
Probabilistic Binary Bisection
Let's try generalizing our binary search method using a probabilistic classifier.
[Figure: predicted probabilities 0.0, 0.5, 1.0 along the interval]
-
Uncertainty Sampling
Query instances the learner is most uncertain about.
400 instances sampled from 2-class Gaussians.
Random sampling, 30 labeled instances (accuracy = 0.7).
Active learning, 30 labeled instances (accuracy = 0.9).
[Lewis & Gale, SIGIR '94]
Using logistic regression
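The Lewis & Gale setup above can be sketched as a pool-based loop: fit a probabilistic classifier on the labeled set, then query the pool instance whose predicted probability is closest to 0.5. The tiny gradient-descent logistic regression below is a stand-in for whatever classifier is actually used; all names, constants, and the seeding scheme are illustrative assumptions, not from the slides.

```python
import numpy as np

def sigmoid(z):
    # clipped for numerical stability
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30.0, 30.0)))

def fit_logistic(X, y, lr=0.5, steps=500):
    """Plain gradient-descent logistic regression (illustrative classifier)."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        p = sigmoid(X @ w)
        w -= lr * X.T @ (p - y) / len(y)
    return w

def uncertainty_sampling(X_pool, oracle, n_labels, seed=0):
    """Query the instance with predicted probability closest to 0.5."""
    rng = np.random.default_rng(seed)
    X = np.c_[np.ones(len(X_pool)), X_pool]               # bias column
    labeled = list(rng.choice(len(X), size=2, replace=False))  # tiny seed set
    y = {i: oracle(i) for i in labeled}
    while len(labeled) < n_labels:
        w = fit_logistic(X[labeled], np.array([y[i] for i in labeled]))
        p = sigmoid(X @ w)
        pool_idx = [i for i in range(len(X)) if i not in y]
        i_star = min(pool_idx, key=lambda i: abs(p[i] - 0.5))  # most uncertain
        labeled.append(i_star)
        y[i_star] = oracle(i_star)                         # one label query
    return fit_logistic(X[labeled], np.array([y[i] for i in labeled]))
```

On well-separated two-class Gaussians this tends to spend its label budget near the decision boundary, mirroring the slide's random-vs-active comparison.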
-
Generalizing to Multi-Class Problems
Least confident [Culotta & McCallum, AAAI '05]
Smallest margin [Scheffer et al., CAIDA '01]
Entropy [Dagan & Engelson, ICML '95]
Note: for binary tasks, these are equivalent.
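The three measures can be written directly in terms of the model's predicted class-probability vector. This is a hedged sketch; the cited papers apply the measures in richer settings (e.g. sequence models), and the function names are illustrative.

```python
import numpy as np

def least_confident(p):
    """1 - P(most likely label); query the LARGEST value."""
    return 1.0 - np.max(p)

def smallest_margin(p):
    """P(best) - P(second best); query the SMALLEST value."""
    top2 = np.sort(p)[-2:]
    return top2[1] - top2[0]

def entropy(p):
    """Shannon entropy of the label distribution; query the LARGEST value."""
    nz = p[p > 0]
    return -np.sum(nz * np.log(nz))
```

For a two-class vector p = (q, 1 - q), each measure is a monotone function of |q - 0.5|, which is why they rank instances identically in the binary case.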
-
Query-By-Committee
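One common instantiation of query-by-committee scores each pool instance by the entropy of the committee's votes and queries the biggest disagreement. This is a hedged sketch with illustrative names (committees are typically built by resampling or randomizing the labeled set, which is omitted here):

```python
import numpy as np

def vote_entropy(votes, n_classes):
    """Disagreement of one instance's committee votes (vote entropy)."""
    freq = np.bincount(votes, minlength=n_classes) / len(votes)
    nz = freq[freq > 0]
    return -np.sum(nz * np.log(nz))

def qbc_select(committee_votes, n_classes):
    """committee_votes: (committee_size, pool_size) matrix of predicted
    labels. Returns the pool index the committee disagrees on most."""
    scores = [vote_entropy(committee_votes[:, j], n_classes)
              for j in range(committee_votes.shape[1])]
    return int(np.argmax(scores))
```

Instances every committee member labels the same way score zero, so queries concentrate where the version space is still undecided.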
-
Version Space Examples
-
QBC Example
-
QBC Guarantees
[Freund et al., '97] theoretical guarantees; d = VC dimension of the committee classifiers.
Under some mild conditions, the QBC algorithm achieves prediction error ε w.h.p., with:
# unlabeled examples generated: O(d/ε)
# labels queried: O(log₂(d/ε))
Exponential improvement!!
-
Batch-based active learning
Wireless sensor networks / mobile sensing
Active sensing
-
Coarse sampling (low variance, bias limited)
Refine sampling (low variance, low bias)
Batch-based active learning
-
Active Learning for Terrain Mapping
-
When does active learning work?
Passive = Active (in some settings).
Active learning is useful if the complexity of the target function is localized, i.e. labels of some data points are more informative than others.
[Figure: passive vs. active learning rates for 1-D and 2-D examples, Castro et al., '05]
-
Active vs. Semi-Supervised
Active learning:
Uncertainty sampling: query instances the model is least confident about.
Query-by-committee (QBC): use ensembles to rapidly reduce the version space.
Semi-supervised learning:
Generative model / expectation-maximization (EM): propagate confident labelings among unlabeled data.
Co-training / multi-view learning: use ensembles with multiple views to constrain the version space.
Both try to attack the same problem: making the most of unlabeled data.
-
Problem: Outliers
An instance may be uncertain or controversial (for QBC) simply because it's an outlier.
Querying outliers is not likely to help us reduce error on more typical data.
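The outlier problem motivates the density weighting listed in the summary slide: scale each candidate's uncertainty by its average similarity to the rest of the pool, so isolated points score low. A hedged numpy sketch; the Gaussian similarity, the `beta` exponent, and all names are illustrative choices, not from the slides.

```python
import numpy as np

def density_weighted_scores(X_pool, uncertainty, beta=1.0, bandwidth=1.0):
    """uncertainty: (pool_size,) array, e.g. 1 - max class probability.
    Returns uncertainty scores down-weighted for low-density points."""
    # average Gaussian similarity of each point to the whole pool
    d2 = ((X_pool[:, None, :] - X_pool[None, :, :]) ** 2).sum(-1)
    density = np.exp(-d2 / (2 * bandwidth ** 2)).mean(axis=1)
    return uncertainty * density ** beta   # outliers are down-weighted
```

Even if an outlier is the single most uncertain instance, its near-zero density keeps it from being queried ahead of uncertain points in dense regions.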
-
Solution 2: Estimated Error Reduction
Minimize the risk R(x) of a query candidate: the expected uncertainty over U if x is added to L.
[Roy & McCallum, ICML '01; Zhu et al., ICML-WS '03]
R(x) = Σ_y P(y|x) Σ_{u ∈ U} H(Y_u ; L ∪ {(x, y)})
(expectation over possible labelings of x; sum over unlabeled instances; H is the uncertainty of u after retraining with x)
-
Text Classification Examples
[Roy & McCallum, ICML '01]
-
Active Learning Scenarios
Query synthesis: construct the desired query/question.
Stream-based selective sampling: unlabeled data are presented in a stream; decide whether or not to query each instance's label.
Pool-based active learning: given a pool of unlabeled data, select one and query its label.
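The pool-based scenario is the one used throughout these slides; its generic loop can be sketched as follows, with `score` standing in for any of the criteria above (uncertainty, vote entropy, negative risk). All names are illustrative.

```python
def pool_based_active_learning(pool, seed_labels, oracle, score, fit, budget):
    """Generic pool-based loop: fit, score every candidate, query the
    top-scoring one, repeat until the label budget is spent."""
    labeled = dict(seed_labels)                 # index -> label
    for _ in range(budget):
        model = fit(labeled)
        candidates = [i for i in range(len(pool)) if i not in labeled]
        if not candidates:
            break
        i_star = max(candidates, key=lambda i: score(model, pool[i]))
        labeled[i_star] = oracle(i_star)        # one label query per round
    return fit(labeled), labeled
```

Stream-based selective sampling differs only in that `candidates` is a single arriving instance and the decision is query-or-discard; query synthesis replaces the `max` over a pool with constructing an input directly.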
-
Alternate Settings
So far we focused on querying labels for unlabeled data.
Other query types:
Active feature acquisition: deciding whether or not to obtain a particular feature, e.g. features such as gene expressions might be correlated.
Multiple-instance active learning: one label for a bag of instances, e.g. a label for a document (bag of instances), but one can query passages (instances); coarse-scale labels are cheaper.
Other settings:
Cost-sensitive active learning: some labels may be more expensive than others, e.g. collecting patient vitals vs. complex and expensive medical procedures for diagnosis.
Multi-task active learning: if each label provides information for multiple tasks, which instances should be queried so as to be maximally informative across all tasks? E.g. an image can be labeled as art/photo, nature/man-made objects, contains a face or not.
-
Active Learning Summary
Binary bisection, Uncertainty sampling, Query-by-committee, Density weighting, Estimated error reduction.
Extensions: Active feature acquisition, Multiple-instance active learning, Cost-sensitive active learning, Multi-task active learning.
Active learning is a powerful tool if the complexity of the target function is localized, i.e. labels of some data points are more informative than others.