BREAK POINTS - Bloch€¦ · If not data set of size can be shattered by , then is a break point...

17
Matthieu R Bloch Tuesday, March 3, 2020 BREAK POINTS BREAK POINTS 1

Transcript of BREAK POINTS - Bloch€¦ · If not data set of size can be shattered by , then is a break point...

Page 1: BREAK POINTS - Bloch€¦ · If not data set of size can be shattered by , then is a break point for If is a break point then Examples Positiv e rays: break point Postiv e Inter vals:

Matthieu R Bloch Tuesday, March 3, 2020

BREAK POINTSBREAK POINTS

1

Page 2: BREAK POINTS - Bloch€¦ · If not data set of size can be shattered by , then is a break point for If is a break point then Examples Positiv e rays: break point Postiv e Inter vals:

LOGISTICSLOGISTICSTAs and Office hours

Tuesday: Dr. Bloch (College of Architecture Cafe) - 11am - 11:55amTuesday: TJ (VL C449 Cubicle D) - 1:30pm - 2:45pmThursday: Hossein (VL C449 Cubicle B): 10:45pm - 12:00pmFriday: Brighton (TSRB 523a) - 12pm-1:15pm

ProjectsProposals due Friday March 13, 2020

Midterm Thursday March 5thSample midterm posted (do not share)Open notes but not open electronics (no space on the desks :()

2

Page 3: BREAK POINTS - Bloch€¦ · If not data set of size can be shattered by , then is a break point for If is a break point then Examples Positiv e rays: break point Postiv e Inter vals:

RECAP: DICHOTOMYRECAP: DICHOTOMYThe union bound naturally depends on the size of , but this is not what matters

We will consider the number of hypotheses that lead to distinct labeling for a dataset

Definition. (Dichotomy)

For a dataset and set of hypotheses , the set of dichotomies generated by on is

By definition and in general

Definition. (Growth function)

For a set of hypotheses , the growth function of is

The growth function does not depend on the datapoints and

H

D ≜ {xi}N

i=1 H H D

H({ ) ≜ {{h( ) : h ∈ H}xi}N

i=1 xi }N

i=1

|H({ )| ≤xi}N

i=1 2N |H({ )| ≪ |H|xi}N

i=1

H H

(N) ≜ |H({ )|mH max{xi}

N

i=1

xi}N

i=1

{xi}N

i=1 (N) ≤mH 2N

3

Page 4: BREAK POINTS - Bloch€¦ · If not data set of size can be shattered by , then is a break point for If is a break point then Examples Positiv e rays: break point Postiv e Inter vals:

RECAP: EXAMPLESRECAP: EXAMPLESPositive rays:

Positive intervals:

H ≜ {h : R → {±1} : x ↦ sgn (x − a)|a ∈ R}

= N + 1mH

H ≜ {h : R → {±1} : x ↦ 1{x ∈ [a; b]} − 1{x ∉ [a; b]}|a < b ∈ R}

= ( ) + 1 = + N + 1mH

N + 1

2

1

2N

2 1

2

4

Page 5: BREAK POINTS - Bloch€¦ · If not data set of size can be shattered by , then is a break point for If is a break point then Examples Positiv e rays: break point Postiv e Inter vals:

EXAMPLESEXAMPLESConvex sets:

because we can generate all dichotomies

Definition. (Shattering)

If can generate all dichotomies on , we say that shatters

H ≜ {h : → {±1}|{x ∈ : h(x) = +1} is convex}R2 R2

(N) =mH 2N

H {xi}N

i=1 H {xi}N

i=1

5

Page 6: BREAK POINTS - Bloch€¦ · If not data set of size can be shattered by , then is a break point for If is a break point then Examples Positiv e rays: break point Postiv e Inter vals:

6

Page 7: BREAK POINTS - Bloch€¦ · If not data set of size can be shattered by , then is a break point for If is a break point then Examples Positiv e rays: break point Postiv e Inter vals:

EXAMPLESEXAMPLESLinear classifiers:

The growth function is a worst case measure, hence

4 points cannot always be shattered and

H ≜ {h : → {±1} : x ↦ sgn ( x + b)|w ∈ , b ∈ R}R2w

⊺ R2

(3) = 8mH

(4) = 14 <mH 24

7

Page 8: BREAK POINTS - Bloch€¦ · If not data set of size can be shattered by , then is a break point for If is a break point then Examples Positiv e rays: break point Postiv e Inter vals:

8

Page 9: BREAK POINTS - Bloch€¦ · If not data set of size can be shattered by , then is a break point for If is a break point then Examples Positiv e rays: break point Postiv e Inter vals:

WISHFUL THINKINGWISHFUL THINKINGThe growth function can sometimes be smaller than

What if we could use instead of in our analysis?

We know that with probability at least

What if we could show that with probability at least

Our ability to generalize then depend on the behavior of the growth functionIf is exponential in , we can’t say muchIf is polynomial in , then for large enough

mH |H|

mH |H|

1 − δ

R( ) ≤ ( ) +h∗

R̂N h∗ log

1

2N

2|H|

δ

− −−−−−−−−−√

1 − δ

R( ) ≤ ( ) + ?h∗

R̂N h∗ log

1

2N

2 (N)mH

δ

− −−−−−−−−−−−−√

(N)mH N

(N)mH N R( ) ≈ ( )h∗

R̂N h∗

N

9

Page 10: BREAK POINTS - Bloch€¦ · If not data set of size can be shattered by , then is a break point for If is a break point then Examples Positiv e rays: break point Postiv e Inter vals:

10

Page 11: BREAK POINTS - Bloch€¦ · If not data set of size can be shattered by , then is a break point for If is a break point then Examples Positiv e rays: break point Postiv e Inter vals:

BREAK POINTSBREAK POINTSDefinition. (Break point)

If not data set of size can be shattered by , then is a break point for

If is a break point then

ExamplesPositive rays: break point Postive Intervals: break point Convex sets: break point Linear classifiers: break point

We will spend some time proving the following result

Proposition.

If there exists any break point for , then is polynomial in

If there is no break point for , then

k H k H

k (k) <mH 2k

k = 2k = 3

k = ∞k = 4

H (N)mH N

H (N) =mH 2N

11

Page 12: BREAK POINTS - Bloch€¦ · If not data set of size can be shattered by , then is a break point for If is a break point then Examples Positiv e rays: break point Postiv e Inter vals:

BREAK POINTSBREAK POINTSDefinition.

Assume has break point . is the maximum number of dichotomies of points such thatno subset of size can be shattered by the dichotomies.

is a purely combinatorial quantity, does not depend on

Example.

Assume has break point . How many dichotomies can we generate of set of size 3?

By definition, if is a break point for , then

H k B(N, k) N

k

B(N, k) H

H 2

k H (N) ≤ B(N, k)mH

12

Page 13: BREAK POINTS - Bloch€¦ · If not data set of size can be shattered by , then is a break point for If is a break point then Examples Positiv e rays: break point Postiv e Inter vals:

13

Page 14: BREAK POINTS - Bloch€¦ · If not data set of size can be shattered by , then is a break point for If is a break point then Examples Positiv e rays: break point Postiv e Inter vals:

SAUER’S LEMMASAUER’S LEMMALemma.

, for ,

Lemma (Sauer's lemma)

is polynomial in of degree

Conclusion: if has a break point, is polynomial in

B(N, 1) = 1 B(1, k) = 2 k > 1∀k > 1 B(N, k) ≤ B(N − 1, k) + B(N − 1, k − 1)

B(N, k) ≤ ( )∑i=0

k−1N

i

B(N, k) N k − 1

H (N)mH N

14

Page 15: BREAK POINTS - Bloch€¦ · If not data set of size can be shattered by , then is a break point for If is a break point then Examples Positiv e rays: break point Postiv e Inter vals:

15

Page 16: BREAK POINTS - Bloch€¦ · If not data set of size can be shattered by , then is a break point for If is a break point then Examples Positiv e rays: break point Postiv e Inter vals:

16

Page 17: BREAK POINTS - Bloch€¦ · If not data set of size can be shattered by , then is a break point for If is a break point then Examples Positiv e rays: break point Postiv e Inter vals:

17