Active Learning and Selective Sensing


Page 1: Active Learning and Selective Sensingweb.cse.ohio-state.edu/mlss09/mlss09_talks/4.june-THURS/MLSS_Nowak_talk.pdfConclusions MinimaxBounds selective sampling/querying can accelerate

Active Learning and Selective Sensing

Page 2

Can’t Learn Without You

[Figure: Sensing and Computing blocks; “Closing the Loop”]

Page 3

Learning to Discover

Sequential approach: select new samples/experiments that are predicted to be maximally informative in discriminating hypotheses

[Figure: loop of select sensing action, sample/sense, observe/infer]

Page 4

Laplace

Discovery !

Decided to make new astronomical measurements when “the discrepancy between prediction and observation [was] large enough to give a high probability that there is something new to be found.” Jaynes (1986)

[Figure: selective sensing, observe/infer]

Page 5

Learning a decision hyperplane in

[Figure: points labeled + and -]

Selective sampling yields exponential speed-up in learning !

Y. Freund, H. S. Seung, E. Shamir, and N. Tishby. Selective sampling using the query by committee algorithm. Machine Learning, 28(2-3):133–168, 1997.

Page 6

Now you see it, now you don’t !

Weak signals/patterns are imperceptible without selective sensing !

[Figure: sparse signal; noise]

J. Haupt, R. Castro, and R. Nowak, "Distilled sensing: selective sampling for sparse signal recovery," in Proceedings of AISTATS 2009, pp 216-223.

Page 7

Outline

1. Active Learning: selective sampling for binary prediction problems

2. Distilled Sensing: selective sensing for sparse signal recovery

Common theme: feedback between data analysis and data collection can be crucial for effective learning and inference

Page 8

Binary Search

[Figure: hypothesis space, with the questions “Does the person have blue eyes ?” and “Is the person wearing a hat ?”]

“Binary Search” works very well in simple conditions

Page 9

Binary Search and Threshold Functions

Where is it shady vs. sunny ?

[Figure: threshold function y versus x, with axis ticks 0 and 1]

Page 10

Binary Search and Threshold Functions

Where is it shady vs. sunny ?

[Figure: labels y versus x on the unit interval; asterisks mark the sampled points]

Each binary-search query reveals one more bit of the binary expansion of the threshold (here 1/3):

query at 1/2: bit 0, so 1/3 = 0…
query at 1/4: bit 1, so 1/3 = 01…
query at 3/8: bit 0, so 1/3 = 010…
query at 5/16: bit 1, so 1/3 = 0101…

active learning: sequentially select points for labeling
n samples ⇒ n bits

passive learning: all points are labeled
n samples ⇒ effectively log n bits
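The threshold example on this slide can be reproduced with a short bisection sketch. It assumes a noise-free oracle that returns label 1 at or above the threshold and 0 below it; the function names (`oracle`, `active_threshold_search`, `passive_interval`) are illustrative and do not come from the talk. Exact fractions are used so that the query points 1/2, 1/4, 3/8, 5/16 appear exactly.

```python
from fractions import Fraction

def oracle(x, threshold):
    """Noise-free label (assumed convention): 1 at or above the threshold, else 0."""
    return 1 if x >= threshold else 0

def active_threshold_search(threshold, n_queries):
    """Bisect [0, 1]; return query points, recovered threshold bits, final interval."""
    lo, hi = Fraction(0), Fraction(1)
    queries, bits = [], []
    for _ in range(n_queries):
        mid = (lo + hi) / 2
        queries.append(mid)
        if oracle(mid, threshold):   # label 1: threshold is at or below mid
            hi = mid
            bits.append(0)
        else:                        # label 0: threshold is above mid
            lo = mid
            bits.append(1)
    return queries, bits, (lo, hi)

def passive_interval(threshold, n_samples):
    """Label an evenly spaced grid; the threshold is only localized to one grid cell."""
    grid = [Fraction(i, n_samples + 1) for i in range(1, n_samples + 1)]
    lo = max([x for x in grid if not oracle(x, threshold)], default=Fraction(0))
    hi = min([x for x in grid if oracle(x, threshold)], default=Fraction(1))
    return lo, hi

queries, bits, (a_lo, a_hi) = active_threshold_search(Fraction(1, 3), 4)
p_lo, p_hi = passive_interval(Fraction(1, 3), 4)
print([str(q) for q in queries], bits)
print("active interval width:", a_hi - a_lo)
print("passive interval width:", p_hi - p_lo)
```

With the same four labels, the bisection narrows the threshold to an interval of width 2^-4, while the fixed grid only narrows it to one grid cell, which is the "n bits versus log n bits" contrast on the slide.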

Page 11

Binary Search and Threshold Functions

[Figure: the threshold example again, with queries at 1/2, 1/4, 3/8 and 5/16 giving bits 0, 1, 0, 1]

Page 12

Bounded and Unbounded Noise

“bounded noise”: strictly more/less probably 1 at all locations

[Figure: regions marked “more probably 0” and “more probably 1”]

“unbounded noise”: like the toss of a fair coin at threshold

Page 13

Learning Rates for Multidimensional Thresholds

Compare with passive learning

Active Learning: Theorem (R. Castro and RN ’07)

Page 14

[Figure: a Learner and an oracle, with the query space and the hypothesis space]

Learner:
consider hypotheses
select sample/query that is highly discriminative
query oracle
eliminate or discount inconsistent hypotheses

"With every mistake, we must surely be learning." G. Harrison

Page 15

Generalized Binary Search (aka Splitting Algorithm)

Selects a query for which disagreement among viable hypotheses is maximal

[Figure: hypothesis space, query space, oracle]
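A minimal sketch of the selection rule on this slide, assuming a finite query set, a finite hypothesis set given as tuples of +1/-1 predictions, and a noise-free oracle. Disagreement is measured here by how close the sum of the viable hypotheses' predictions is to zero. The function name and data layout are illustrative assumptions, not the talk's notation.

```python
def generalized_binary_search(hypotheses, oracle):
    """hypotheses: list of tuples of +1/-1 predictions, one entry per query index.
    oracle: function mapping a query index to the true label (+1 or -1).
    Returns (indices of hypotheses still viable, list of queries asked)."""
    n_queries = len(hypotheses[0])
    viable = list(range(len(hypotheses)))
    asked = []
    while len(viable) > 1:
        def imbalance(q):
            # 0 means the viable hypotheses split evenly on query q
            return abs(sum(hypotheses[h][q] for h in viable))
        candidates = [q for q in range(n_queries) if q not in asked]
        if not candidates:
            break
        q = min(candidates, key=imbalance)       # maximal disagreement
        if imbalance(q) == len(viable):
            break                                # remaining hypotheses agree everywhere
        y = oracle(q)
        asked.append(q)
        # eliminate hypotheses inconsistent with the observed label
        viable = [h for h in viable if hypotheses[h][q] == y]
    return viable, asked

# Usage: one-dimensional thresholds on 16 query points (17 hypotheses).
n_points = 16
hyps = [tuple(+1 if q >= i else -1 for q in range(n_points))
        for i in range(n_points + 1)]
true_index = 11
viable, asked = generalized_binary_search(hyps, lambda q: hyps[true_index][q])
print(viable, asked)
```

On ordered thresholds this reduces to classic binary search. The sketch only eliminates inconsistent hypotheses; the "discount" variant for noisy oracles mentioned in the talk is not covered.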

Page 16

Example: Two-Dimensional Thresholds

[Figure: regions labeled -1 and +1]

Page 17

How well does GBS work ?

Page 18

Geometry of GBS

Classic Binary Search is possible because the hypotheses are ordered with respect to queries. We need a similar structure for more general hypothesis spaces.

To that end, note that the hypotheses induce a partition of the query space into equivalence sets

Page 19

[Figure: sets A and A’]

Page 20

Geometric Condition for GBS Convergence

Page 21

Classic Binary Search Revisited

[Figure: labels 0 0 0 0 0 0 0 0 1 1 1 1 1 1 along the query points; unknown correct threshold at i*/n; markers -1 and 1/2]

Page 22

Theorem 1 Proof Sketch:

Page 23

Theorem 1 Proof Sketch:

‘good’ situation:

‘bad’ situation:

Page 24

Theorem 1 Proof Sketch:

‘good’ situation:

‘bad’ situation:

[Figure: ‘bad’ situation, with labels x, x’, +c and -c]

Page 25

Page 26

Ex. Linear Thresholds in Two Dimensions

maximally informative queries

Page 27

Linear Thresholds in

Page 28

Page 29

Page 30

KimiParker

Example

Suppose we have a sensor network observing a binary activation pattern with a linear boundary. How many sensors must be queried to determine the pattern?

[Figure panels: number of hypotheses vs. queries; log number of hypotheses vs. queries]

100 sensors, 9900 possible linear boundaries

Correct boundary determined after querying 12 sensors
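As a rough sanity check on these numbers: if every query split the viable hypotheses exactly in half, identifying one of 9900 boundaries would take about log2(9900) queries. The snippet below only does that arithmetic, using the figures quoted on the slide; it does not simulate the sensor network.

```python
import math

n_sensors = 100
n_hypotheses = 9900       # count quoted on the slide
queries_reported = 12     # queries quoted on the slide

# Bits needed to name one hypothesis out of 9900 under perfect halving.
bits_needed = math.log2(n_hypotheses)

print(f"log2({n_hypotheses}) = {bits_needed:.2f}")
print(f"queries reported: {queries_reported} of {n_sensors} sensors")
```

The reported 12 queries is of the same order as log2(9900), which is a little above 13. A single run can finish slightly below that figure because an answer that lands on the minority side of a split removes more than half of the remaining hypotheses.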

Page 31

Conclusions

Minimax Bounds
selective sampling/querying can accelerate the learning of threshold functions

Generalized Binary Search
multidimensional threshold functions can be learned at the optimal rate by selecting maximally discriminative queries

R. Castro and RN, “Minimax Bounds for Active Learning,” IEEE Trans. Info. Theory, pp. 2339–2353, 2008

RN, “Generalized Binary Search,” Proceedings of the Forty-Sixth Annual Allerton Conference on Communication, 2008

Page 32

Detection/Estimation of Sparse Signals

How reliably can we determine sparsity patterns ?

Distilled Sensing

Page 33

Detection/Estimation of Sparse Signals

fMRI

Astrophysics

Genomics

How reliably can we infer sparse patterns ?

Page 34

Sparse Signal Model

signal support set

Example:

In this talk we will assume .

Page 35

Noisy Observation Model

Suppose we want to locate just one signal component:

Because of noise, even if no signal is present

How small can µ* be so that we can still reliably locate the signal components from the observations?

Page 36

Sparse Support Recovery

When testing a large number of hypotheses simultaneously we are bound to make errors

Approaches:

Control the probability of perfect localization of the support (Bonferroni correction): very conservative

Adaptive control of the relative proportion of errors (Benjamini & Hochberg ’95)
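The two approaches named on this slide can be compared with a short sketch. It implements the standard Bonferroni rule and the standard Benjamini-Hochberg step-up rule on a list of p-values. The helper `one_sided_p_value` assumes unit-variance Gaussian noise, which is an assumption made here because the slide's noise model did not survive extraction; all function names are illustrative.

```python
import math

def one_sided_p_value(x):
    """P(N(0,1) >= x), assuming unit-variance Gaussian noise under the null."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def bonferroni(p_values, alpha):
    """Reject when p <= alpha / n: controls the chance of any false discovery."""
    n = len(p_values)
    return [i for i, p in enumerate(p_values) if p <= alpha / n]

def benjamini_hochberg(p_values, alpha):
    """Step-up rule: find the largest rank k with p_(k) <= k * alpha / n,
    then reject the k smallest p-values."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    k_max = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= rank * alpha / n:
            k_max = rank
    return sorted(order[:k_max])

# Illustrative p-values (made up for this example, not taken from the talk).
p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
print("Bonferroni rejects:", bonferroni(p, 0.05))
print("Benjamini-Hochberg rejects:", benjamini_hochberg(p, 0.05))
```

On these p-values the Benjamini-Hochberg rule rejects a superset of what Bonferroni rejects, which illustrates why the slide calls Bonferroni very conservative.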

Page 37

False Discovery Rate Control

Recall the definition of the signal support set

Goal: Estimate the support as well as possible. Let be the outcome of a support estimation procedure.

False Discovery Proportion = (# falsely discovered components) / (# discovered components)

Non Discovery Proportion = (# missed components) / (# true components)

Since n is typically very large it makes sense to study asymptotic performance, as n→∞.
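The two proportions defined on this slide are easy to compute once the true and estimated supports are given as index sets. The sketch below is illustrative: the function name `fdp_ndp` and the convention of returning 0 for an empty denominator are assumptions, not part of the talk.

```python
def fdp_ndp(true_support, estimated_support):
    """Return (false discovery proportion, non-discovery proportion)."""
    true_set, est_set = set(true_support), set(estimated_support)
    false_discoveries = len(est_set - true_set)   # discovered but not true
    missed = len(true_set - est_set)              # true but not discovered
    # Convention assumed here: a proportion with empty denominator is 0.
    fdp = false_discoveries / len(est_set) if est_set else 0.0
    ndp = missed / len(true_set) if true_set else 0.0
    return fdp, ndp

# Usage: 4 true components, 3 discovered, of which 1 is false.
fdp, ndp = fdp_ndp({1, 2, 3, 4}, {3, 4, 5})
print(fdp, ndp)
```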

Page 38

Known Results (Jin & Donoho ’03)

Assume the signal is very sparse:

Example: β = 3/4

n = 10000 ⇒ |I_s| = 10

n = 1000000 ⇒ |I_s| = 32

Number of signal components
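The sparsity formula itself did not survive extraction, but both examples on this slide are consistent with the number of signal components scaling as n^(1-β). The check below assumes that scaling; treat it as a reconstruction, not as the slide's stated formula.

```python
def support_size(n, beta):
    """Number of signal components under the assumed scaling n**(1 - beta)."""
    return n ** (1.0 - beta)

for n in (10_000, 1_000_000):
    print(n, round(support_size(n, 0.75)))
```

Rounding gives 10 and 32 components, matching the two values quoted on the slide.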

Page 39

Known Results (Jin & Donoho ’03)

[Figure: signal strength versus sparsity, divided into “Estimable” and “Non-estimable” regions]

These asymptotic results tell us how strong the signals need to be for reliable signal localization

Page 40

A Generalization of the Sensing Model

Allow multiple observations…

…subject to a sampling energy budget

are called the sensing vectors.

(Note: in the previous work a single observation was considered, where )

Page 41

Distilled Sensing

Proceeding in this fashion, gradually focus on the signal subspace

Page 42

Enhanced Sensitivity by Selectivity

Theorem 2 (J. Haupt, R. Castro and RN ’08)

Furthermore if one does not allow active sensing, then the previous results (equivalent to k=0) cannot be improved.

Page 43

Distilled Sensing Example

[Figure panels: original signal (~0.8% non-zero components); noisy version of the image (k=0); noisy versions of the image (k=5); passive sensing recovery (FDR = 0.01); distilled sensing recovery (FDR = 0.01)]

Page 44

Thresholds of Perceptibility

[Figure: signal strength versus sparsity; region marked “recovery possible from passive sensing”]

These results suggest we might be able to estimate signal with amplitudes growing slower than

Page 45

Universal Perceptibility

Theorem 3

(J. Haupt, R. Castro and RN ‘09)

Corollary

Page 46

Proof Sketch (Theorem 2)

With high probability each distillation keeps almost all the non-zero components and rejects about half of the non-signal components.

Energy in signal subspace doubles at each step
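The mechanism in this proof sketch can be simulated in a few lines. The sketch rests on several assumptions that are not stated on the slides: non-negative signal amplitudes, unit-variance Gaussian noise, observations of the form gain × x_i + noise, a total energy budget split equally across the k + 1 rounds and spread uniformly over the retained indices, and a distillation rule that keeps an index when its observation is positive. All names are illustrative.

```python
import math
import random

def distilled_sensing(x, k, total_energy, rng):
    """Run k distillation rounds plus one final observation round.
    Returns (retained indices, final observations by index, final gain)."""
    keep = list(range(len(x)))
    per_round = total_energy / (k + 1)            # assumed: equal energy per round
    for _ in range(k):
        gain = math.sqrt(per_round / max(len(keep), 1))
        # Keep an index only if its noisy observation is positive: a zero
        # component survives with probability 1/2, a strong one almost always.
        keep = [i for i in keep if gain * x[i] + rng.gauss(0.0, 1.0) > 0.0]
    gain = math.sqrt(per_round / max(len(keep), 1))
    y = {i: gain * x[i] + rng.gauss(0.0, 1.0) for i in keep}
    return keep, y, gain

rng = random.Random(0)
n, n_signal, mu = 2**14, 32, 4.0                  # illustrative sizes and amplitude
x = [mu] * n_signal + [0.0] * (n - n_signal)
keep, y, gain = distilled_sensing(x, k=3, total_energy=float(n), rng=rng)
kept_signal = sum(1 for i in keep if i < n_signal)
print(len(keep), kept_signal, round(gain, 2))
```

Because roughly half of the zero components are discarded in each round while the per-round energy stays fixed, the energy per retained index roughly doubles from round to round, which is the doubling described above. A single non-adaptive observation with the whole budget spread over all n indices would have gain 1 in this parameterization.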

Page 47

Now you see it, now you don’t !

Weak signals/patterns are imperceptible without selective sensing !

[Figure: sparse signal; noise]

J. Haupt, R. Castro and RN, “Distilled Sensing: Selective Sampling for Sparse Signal Recovery,” AISTATS 2009