Introduction to Machine Learning - CmpE WEB · Title: Introduction to Machine Learning Author:...

26
INTRODUCTION TO MACHINE LEARNING 3RD EDITION ETHEM ALPAYDIN © The MIT Press, 2014 [email protected] http://www.cmpe.boun.edu.tr/~ethem/i2ml3e Lecture Slides for

Transcript of Introduction to Machine Learning - CmpE WEB · Title: Introduction to Machine Learning Author:...

Page 1: Introduction to Machine Learning - CmpE WEB · Title: Introduction to Machine Learning Author: ethem Created Date: 7/9/2014 3:28:29 PM

INTRODUCTION

TO

MACHINE

LEARNING 3RD EDITION

ETHEM ALPAYDIN

© The MIT Press, 2014

[email protected]

http://www.cmpe.boun.edu.tr/~ethem/i2ml3e

Lecture Slides for

Page 2: Introduction to Machine Learning - CmpE WEB · Title: Introduction to Machine Learning Author: ethem Created Date: 7/9/2014 3:28:29 PM

CHAPTER 12:

LOCAL MODELS

Page 3: Introduction to Machine Learning - CmpE WEB · Title: Introduction to Machine Learning Author: ethem Created Date: 7/9/2014 3:28:29 PM

Introduction 3

Divide the input space into local regions and learn simple (constant/linear) models in each patch

Unsupervised: Competitive, online clustering

Supervised: Radial-basis functions, mixture of experts

Page 4: Introduction to Machine Learning - CmpE WEB · Title: Introduction to Machine Learning Author: ethem Created Date: 7/9/2014 3:28:29 PM

Competitive Learning

ijtj

ti

ij

t

ij

t

ti

t

tti

i

lt

li

tti

t i itt

i

k

ii

mxbm

Em

k

b

bk

b

bE

:means- Online

:means- Batch

otherwise

min if

xm

mxmx

mxm

0

1

1X

4

Page 5: Introduction to Machine Learning - CmpE WEB · Title: Introduction to Machine Learning Author: ethem Created Date: 7/9/2014 3:28:29 PM

5

Winner-take-all

network

Page 6: Introduction to Machine Learning - CmpE WEB · Title: Introduction to Machine Learning Author: ethem Created Date: 7/9/2014 3:28:29 PM

Adaptive Resonance Theory

Incremental; add a new cluster if

not covered; defined by vigilance,

ρ

otherwise

if

min

it

i

it

k

lt

k

li

tti

b

b

mxm

xm

mxmx

1

1

6

(Carpenter and Grossberg, 1988)

Page 7: Introduction to Machine Learning - CmpE WEB · Title: Introduction to Machine Learning Author: ethem Created Date: 7/9/2014 3:28:29 PM

Self-Organizing Maps 7

2

2

22

1

ilile

ile lt

l

exp,

, mxm

Units have a neighborhood defined; mi is “between”

mi-1 and mi+1, and are all updated together

One-dim map: (Kohonen, 1990)

Page 8: Introduction to Machine Learning - CmpE WEB · Title: Introduction to Machine Learning Author: ethem Created Date: 7/9/2014 3:28:29 PM

Radial-Basis Functions

Locally-tuned units:

0

1

wpwyH

h

thh

t

2

2

2 h

ht

th

sp

mxexp

8

Page 9: Introduction to Machine Learning - CmpE WEB · Title: Introduction to Machine Learning Author: ethem Created Date: 7/9/2014 3:28:29 PM

Local vs Distributed Representation 9

Page 10: Introduction to Machine Learning - CmpE WEB · Title: Introduction to Machine Learning Author: ethem Created Date: 7/9/2014 3:28:29 PM

Training RBF 10

Hybrid learning:

First layer centers and spreads:

Unsupervised k-means

Second layer weights:

Supervised gradient-descent

Fully supervised

(Broomhead and Lowe, 1988; Moody and Darken,

1989)

Page 11: Introduction to Machine Learning - CmpE WEB · Title: Introduction to Machine Learning Author: ethem Created Date: 7/9/2014 3:28:29 PM

Regression 11

3

2

2

0

1

2

2

1

h

ht

th

t iih

ti

tih

h

hjtjt

ht i

ihti

tihj

t

th

ti

tiih

i

H

h

thih

ti

t i

ti

tihiihhh

spwyrs

s

mxpwyrm

pyrw

wpwy

yrwsE

mx

m

X|,,,

Page 12: Introduction to Machine Learning - CmpE WEB · Title: Introduction to Machine Learning Author: ethem Created Date: 7/9/2014 3:28:29 PM

Classification 12

k h kthkh

h ithiht

i

t i

ti

tihiihhh

wpw

wpwy

yrwsE

0

0

exp

exp

log |,,,

Xm

Page 13: Introduction to Machine Learning - CmpE WEB · Title: Introduction to Machine Learning Author: ethem Created Date: 7/9/2014 3:28:29 PM

Rules and Exceptions 13

0

1

vpwy tTH

h

thh

t

xv

Default

rule Exceptions

Page 14: Introduction to Machine Learning - CmpE WEB · Title: Introduction to Machine Learning Author: ethem Created Date: 7/9/2014 3:28:29 PM

Rule-Based Knowledge 14

10

2

1022

10

22

3

2

32

12

2

2

2

2

1

2

11

321

.

.

.

ws

cxp

ws

bx

s

axp

ycxbxax

withexp

withexpexp

THEN OR AND IF

Incorporation of prior knowledge (before training)

Rule extraction (after training) (Tresp et al., 1997)

Fuzzy membership functions and fuzzy rules

Page 15: Introduction to Machine Learning - CmpE WEB · Title: Introduction to Machine Learning Author: ethem Created Date: 7/9/2014 3:28:29 PM

Normalized Basis Functions 15

2

1

22

22

1

2

2

h

hjtj

t i

th

tiih

ti

tihj

th

t

ti

tiih

H

h

thih

ti

l llt

hht

H

l

tl

tht

h

s

mxgywyrm

gyrw

gwy

s

s

p

pg

/exp

/exp

mx

mx

Page 16: Introduction to Machine Learning - CmpE WEB · Title: Introduction to Machine Learning Author: ethem Created Date: 7/9/2014 3:28:29 PM

Competitive Basis Functions 16

H

h

ttttt hphpp1

xrxxr ,||| Mixture model:

l llt

l

hht

hth

l

t

tt

sa

sag

lplp

hphphp

22

22

2

2

/exp

/exp

|

||

mx

mx

x

xx

Page 17: Introduction to Machine Learning - CmpE WEB · Title: Introduction to Machine Learning Author: ethem Created Date: 7/9/2014 3:28:29 PM

Regression

222

1

ti

ti

i

tt yrp exp|xr

l

l i

til

ti

tl

i

tih

ti

tht

h

t h

hjtjt

hthhj

t

th

tih

tiih

ihtih

t h i

tih

ti

thhiihhh

lplp

hphphp

yrg

yrgf

s

mxgfmfyrw

wy

yrgws

xrx

xrxxr

m

,

,,

/

/

,,,

||

|||

exp

exp

fit constant the is

explog|

2

2

2

2

21

21

2

1

XL

17

Page 18: Introduction to Machine Learning - CmpE WEB · Title: Introduction to Machine Learning Author: ethem Created Date: 7/9/2014 3:28:29 PM

Classification 18

l i

til

ti

tl

i

tih

ti

tht

h

k kh

ihtih

t h i

tih

ti

th

t h i

rtih

thhiihhh

yrg

yrgf

w

wy

yrg

ygwsti

log exp

log exp

exp

exp

log exp log

log|,,,

XL m

Page 19: Introduction to Machine Learning - CmpE WEB · Title: Introduction to Machine Learning Author: ethem Created Date: 7/9/2014 3:28:29 PM

EM for RBF (Supervised EM) 19

tth hpf xr ,|

E-step:

M-step:

t

th

t

ti

th

ih

t

th

T

ht

t htt

h

h

t

th

t

tth

h

f

rfw

f

fs

f

f

mxmx

xm

Page 20: Introduction to Machine Learning - CmpE WEB · Title: Introduction to Machine Learning Author: ethem Created Date: 7/9/2014 3:28:29 PM

Learning Vector Quantization 20

otherwise

)label)label( if

it

i

it

it

i

mxm

mxmxm

(

H units per class prelabeled (Kohonen, 1990)

Given x, mi is the closest:

x

mi mj

Page 21: Introduction to Machine Learning - CmpE WEB · Title: Introduction to Machine Learning Author: ethem Created Date: 7/9/2014 3:28:29 PM

Mixture of Experts

In RBF, each local fit is a

constant, wih, second

layer weight

In MoE, each local fit is

a linear function of x, a

local expert: ttih

tihw xv

21

(Jacobs et al., 1991)

Page 22: Introduction to Machine Learning - CmpE WEB · Title: Introduction to Machine Learning Author: ethem Created Date: 7/9/2014 3:28:29 PM

MoE as Models Combined

Radial gating:

Softmax gating:

22

l llt

hht

th

s

sg

22

22

2

2

/exp

/exp

mx

mx

l

tTl

tTht

hgxm

xm

exp

exp

Page 23: Introduction to Machine Learning - CmpE WEB · Title: Introduction to Machine Learning Author: ethem Created Date: 7/9/2014 3:28:29 PM

Cooperative MoE 23

tj

th

t

ti

tih

tih

tihj

t

tth

tih

tiih

t i

ti

tihiihhh

xgywyrm

gyr

yrwsE

xv

m2

2

1X|,,

,

Regression

Page 24: Introduction to Machine Learning - CmpE WEB · Title: Introduction to Machine Learning Author: ethem Created Date: 7/9/2014 3:28:29 PM

Competitive MoE: Regression 24

t

t

th

thh

tth

t

tih

tiih

tihih

tih

t h i

tih

ti

thhiihhh

gf

fyr

wy

yrgws

xm

xv

xv

m

exp log|,,2

,2

1XL

Page 25: Introduction to Machine Learning - CmpE WEB · Title: Introduction to Machine Learning Author: ethem Created Date: 7/9/2014 3:28:29 PM

Competitive MoE: Classification 25

k

tkh

tih

k kh

ihtih

t h i

tih

ti

th

t h i

rtih

thhiihhh

w

wy

yrg

ygwsti

xv

xv

m

exp

exp

exp

exp

log exp log

log|,,,

XL

Page 26: Introduction to Machine Learning - CmpE WEB · Title: Introduction to Machine Learning Author: ethem Created Date: 7/9/2014 3:28:29 PM

Hierarchical Mixture of Experts 26

Tree of MoE where each MoE is an expert in a

higher-level MoE

Soft decision tree: Takes a weighted (gating)

average of all leaves (experts), as opposed to

using a single path and a single leaf

Can be trained using EM (Jordan and Jacobs,

1994)