Classification
Learning From Data, Chapter 8

Outline
- 8.1 Statistical Learning Theory Formulation
- 8.2 Classical Formulation
- 8.3 Methods for Classification
- 8.4 Summary
Classification

- An input sample $\mathbf{x} = (x_1, x_2, \ldots, x_d)$ is classified to one (and only one) of $J$ groups.
- Concerned with the relationship between the class-membership label and the feature vector.
- Goal is to estimate the mapping $\mathbf{x} \to y$ (decision rule) using labeled training data $(\mathbf{x}_i, y_i)$.
Two-Class Problem

- Output of the system: takes on values $y = \{0, 1\}$.
- Learning machine: implements a set of indicator functions $f(\mathbf{x}, \omega)$.
- Loss function:

  $$L(y, f(\mathbf{x}, \omega)) = \begin{cases} 0 & \text{if } y = f(\mathbf{x}, \omega) \\ 1 & \text{if } y \neq f(\mathbf{x}, \omega) \end{cases}$$
Risk Functional

- Learning is the problem of finding the function $f(\mathbf{x}, \omega_0)$ that minimizes the probability of misclassification using only the training data:

  $$R(\omega) = \int L(y, f(\mathbf{x}, \omega))\, p(\mathbf{x}, y)\, d\mathbf{x}\, dy$$
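Under the 0/1 loss, the empirical version of this risk is simply the fraction of training samples the decision rule misclassifies. A minimal sketch (NumPy; the linear rule `f` and the toy data are illustrative stand-ins for any indicator function):

```python
import numpy as np

def f(X, w, b=0.0):
    """A hypothetical linear indicator function f(x, w) with values in {0, 1}."""
    return (X @ w + b > 0).astype(int)

def empirical_risk(X, y, w):
    """R_emp(w) = (1/n) sum_i L(y_i, f(x_i, w)) = misclassification rate."""
    return np.mean(y != f(X, w))

# Toy data: two Gaussian clusters in 2-D.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-1, 1, (50, 2)), rng.normal(+1, 1, (50, 2))])
y = np.repeat([0, 1], 50)
print(empirical_risk(X, y, w=np.array([1.0, 1.0])))
```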
- Implementation of methods using SRM requires:
  - Specification of a nested structure on a set of indicator approximating functions.
  - Minimization of the empirical risk (misclassification error) for a given element of the structure.
  - Estimation of the prediction risk.
Classical Formulation of Classification

- Conditional density, prior probability, posterior probability:

  $$p(\mathbf{x} \mid y = 0) = p_0(\mathbf{x}, \alpha^*), \qquad p(\mathbf{x} \mid y = 1) = p_1(\mathbf{x}, \beta^*)$$

- Decision rule: applies the ERM inductive principle indirectly, by first estimating the densities.
- Conceptually flawed: it estimates the decision boundary via density estimation.

  $$f(\mathbf{x}) = \begin{cases} 1 & \text{if } p_1(\mathbf{x}, \beta^*)\, P(y=1) \ge p_0(\mathbf{x}, \alpha^*)\, P(y=0) \\ 0 & \text{otherwise} \end{cases}$$
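A minimal sketch of this plug-in decision rule, assuming one-dimensional Gaussian class-conditional densities whose parameters (playing the role of $\alpha^*$, $\beta^*$) are fitted by maximum likelihood; all names and data are illustrative:

```python
import numpy as np

def gauss_pdf(x, mu, sigma):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

def fit_plugin(x0, x1):
    """Estimate priors and Gaussian density parameters from each class sample."""
    n0, n1 = len(x0), len(x1)
    prior0, prior1 = n0 / (n0 + n1), n1 / (n0 + n1)
    params0 = (x0.mean(), x0.std())   # alpha* for p0(x)
    params1 = (x1.mean(), x1.std())   # beta*  for p1(x)
    return prior0, prior1, params0, params1

def decide(x, prior0, prior1, params0, params1):
    """f(x) = 1 iff p1(x, beta*) P(y=1) >= p0(x, alpha*) P(y=0)."""
    return (gauss_pdf(x, *params1) * prior1 >=
            gauss_pdf(x, *params0) * prior0).astype(int)

rng = np.random.default_rng(1)
x0, x1 = rng.normal(0, 1, 200), rng.normal(2, 1, 100)
print(decide(np.array([0.5, 1.5]), *fit_plugin(x0, x1)))
```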
8.1 Statistical Learning Theory Formulation

- Goal is to estimate an indicator function or decision boundary $f(\mathbf{x}, \omega_0)$.
- Use the SRM inductive principle.
- Constructive methods should select an element $S_m$ of a structure

  $$S_1 \subset S_2 \subset \cdots \subset S_m \subset \cdots, \qquad h_1 \le h_2 \le \cdots \le h_m \le \cdots$$

  (where $h_m$ is the VC dimension of $S_m$) and an indicator function $f(\mathbf{x}, \omega_0^m)$ within it.
- Bound on the prediction risk:

  $$R(\omega_0^m) \le R_{emp}(\omega_0^m) + \Phi(h_m / n)$$
SLT Formulation (Cont'd)

- Two strategies for minimizing the bound:
- Strategy 1: keep the confidence interval fixed and minimize the empirical risk.
  - Includes all statistical and neural network methods using a dictionary representation.
  - With high-dimensional data, first project the data onto a low-dimensional subspace (i.e., $m$ features) and then perform modeling in this subspace.
- Strategy 2: keep the value of the empirical risk fixed (small) and minimize the confidence interval.
  - Requires a special structure such that the value of the empirical risk is kept small for all approximating functions.
SLT Formulation (Cont'd)

- Binary classification problem: the goal is to minimize the misclassification error.
- Perceptron algorithm (see the sketch after this list):
  - Simple optimization procedure for finding $f(\mathbf{x}, \omega^*)$ providing zero misclassification error if the training data are linearly separable.
  - Indicator function: $f(\mathbf{x}, \mathbf{w}) = I(\mathbf{w} \cdot \mathbf{x} + b > 0)$.
  - Prerequisite: the training data are linearly separable.
  - Weight update rule: for a misclassified sample, $\mathbf{w}(k+1) = \mathbf{w}(k) + \eta\, [y(k) - f(\mathbf{x}(k), \mathbf{w}(k))]\, \mathbf{x}(k)$.
  - When the data are not separable and/or the optimal decision boundary is nonlinear, the perceptron does not provide an optimal solution.
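A minimal sketch of the perceptron procedure above, using the update $\mathbf{w} \leftarrow \mathbf{w} + \eta (y - f(\mathbf{x}, \mathbf{w}))\, \mathbf{x}$ with $y \in \{0, 1\}$; the learning rate and epoch count are illustrative choices:

```python
import numpy as np

def perceptron(X, y, eta=1.0, epochs=100):
    """Train a perceptron; reaches zero training error only if the
    training data are linearly separable."""
    Xb = np.hstack([X, np.ones((len(X), 1))])   # absorb bias into weights
    w = np.zeros(Xb.shape[1])
    for _ in range(epochs):
        errors = 0
        for xi, yi in zip(Xb, y):
            pred = int(xi @ w > 0)              # indicator function f(x, w)
            if pred != yi:
                w += eta * (yi - pred) * xi     # weight update rule
                errors += 1
        if errors == 0:                         # separable case: stop at zero error
            break
    return w

rng = np.random.default_rng(2)
X = np.vstack([rng.normal(-2, 1, (50, 2)), rng.normal(2, 1, (50, 2))])
y = np.repeat([0, 1], 50)
w = perceptron(X, y)
print(np.mean((np.hstack([X, np.ones((100, 1))]) @ w > 0).astype(int) == y))
```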
SLT Formulation (Cont'd)

- MLP classifier:
  - Can form flexible nonlinear decision boundaries.
  - Approximates the indicator function by a well-behaved sigmoid function.
  - Empirical risk functional: squared error between the sigmoid outputs $s(g(\mathbf{x}_i, \mathbf{w}, V))$ and the 0/1 targets $y_i$.
  - Sigmoid activations of the hidden units enable the construction of a flexible nonlinear decision boundary.
  - The output sigmoid approximates the discontinuous indicator function.
- In neural networks, a common procedure for classification decisions is to use the sigmoid output.
- The output of the trained network is interpreted as an estimate of the posterior probability:

  $$f(\mathbf{x}) = I[s(g(\mathbf{x}, \mathbf{w}^*, V^*)) - \theta], \qquad \hat{P}(y = 1 \mid \mathbf{x}) = s(g(\mathbf{x}, \mathbf{w}^*, V^*))$$
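A small sketch of this output step, assuming a single-hidden-layer network $g(\mathbf{x}, \mathbf{w}, V)$ with tanh hidden units (the architecture and random parameters are illustrative): the sigmoid output is read as $\hat{P}(y=1 \mid \mathbf{x})$ and thresholded at $\theta = 0.5$.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def g(x, w, V):
    """One-hidden-layer net: V maps inputs to hidden units, w to the output."""
    return np.tanh(V @ x) @ w

def posterior(x, w, V):
    """Sigmoid output interpreted as the posterior estimate P(y=1 | x)."""
    return sigmoid(g(x, w, V))

def classify(x, w, V, theta=0.5):
    """f(x) = I[s(g(x, w*, V*)) - theta]."""
    return int(posterior(x, w, V) >= theta)

rng = np.random.default_rng(3)
V, w = rng.normal(size=(5, 2)), rng.normal(size=5)
x = np.array([0.3, -0.7])
print(posterior(x, w, V), classify(x, w, V))
```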
SLT Formulation (Cont'd)

- General prescription for implementing constructive methods:
  (1) Specify a (flexible) class of approximating functions for constructing a (nonlinear) decision boundary.
  (2) Choose a nonlinear optimization method for selecting the best function from class (1), i.e., the function providing the smallest empirical risk.
  (3) Select a continuous error functional suitable for the optimization method chosen in (2).
  (4) Select the best predictive model from the class of functions (1) using the first strategy for minimizing the SLT bound.
SLT Formulation (Cont'd)

- Classification methods use a continuous error functional that only approximates the true one.
- A classification method using such an approximation will be successful only if minimizing the error functional selected in (3) also minimizes the true empirical risk.
8.2 Classical Formulation

- Based on statistical decision theory:
  - Provides the foundation for constructing optimal decision rules minimizing risk.
  - Strictly applies only when all distributions are known.
  - Concerned with constructing decision rules.
- Estimate the required distributions from the data and use them within the framework of statistical decision theory.
Classical Formulation (Cont'd)

- When $\mathbf{x}$ is not observed: decide the class with the largest prior probability $P(y = j)$.
- When $\mathbf{x}$ is observed: decide the class with the largest posterior probability $P(y = j \mid \mathbf{x})$.
Classical Formulation (Cont'd)

- Misclassification with unequal costs: let $C_{ij}$ be the cost of deciding class $i$ when the true class is $j$, with $C_{00} = C_{11} = 0$.
- The expected cost of assigning $\mathbf{x}$ to class $i$ is

  $$q_i(\mathbf{x}) = \sum_j C_{ij}\, P(y = j \mid \mathbf{x})$$

- The overall risk of a rule with decision regions $R_i$ is

  $$R = \sum_i \int_{R_i} \sum_j C_{ij}\, p(\mathbf{x} \mid y = j)\, P(y = j)\, d\mathbf{x}$$

- The risk is minimized if each region $R_i$ is defined such that $\mathbf{x} \in R_i$ whenever $q_i(\mathbf{x}) \le q_k(\mathbf{x})$ for all $k$; for two classes this gives

  $$f(\mathbf{x}) = \begin{cases} 1 & \text{if } C_{01}\, p(\mathbf{x} \mid y = 1)\, P(y = 1) > C_{10}\, p(\mathbf{x} \mid y = 0)\, P(y = 0) \\ 0 & \text{otherwise} \end{cases}$$
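As an illustrative numerical check (the numbers are assumptions, not from the slides): with $C_{10} = 1$, $C_{01} = 5$, and $P(y=1) = 0.2$, the rule decides class 1 whenever the likelihood ratio $p(\mathbf{x} \mid y=1)/p(\mathbf{x} \mid y=0)$ exceeds $C_{10}\, P(y=0) / [C_{01}\, P(y=1)] = (1 \times 0.8)/(5 \times 0.2) = 0.8$, i.e., the costly-to-miss class is chosen even when it is slightly the less likely one.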
Posterior Distributions

- Two strategies:
  - Estimate the prior probabilities and class-conditional densities and plug them into Bayes' rule.
  - Estimate the posterior densities directly using training data from all the classes.
- Two approaches for the two strategies:
  - Parametric methods.
  - Adaptive flexible methods.
Posterior… (Cont'd)

- Direct estimation:
  - Estimation of posterior densities can be done using regression methods.
  - For the two-class case, the posterior equals a conditional expectation: $P(y = 1 \mid \mathbf{x}) = E[y \mid \mathbf{x}]$.
- Parametric regression:
  - Linear regression can be used for classification for non-Gaussian distributions.
  - The Fisher linear discriminant provides a biased approximation of the posterior probability, yet it can still yield an accurate classification rule.
  - May provide a poor decision boundary.
- Adaptive regression:
  - Accurate regression does not guarantee a low classification risk.
  - Use the data to estimate the conditional expectation.
  - For two-class problems, regressing the 0/1 labels on $\mathbf{x}$ estimates $E[y \mid \mathbf{x}] = P(y = 1 \mid \mathbf{x})$.
8.3 Methods for Classification

- 8.3.1 Regression-Based Methods
  - Methods based on continuous numerical optimization can be cast in the form of multiple-response regression.
  - The most popular approach to classification.
- 8.3.2 Tree-Based Methods
  - Based on a greedy optimization strategy.
  - CART is an example.
- 8.3.3 Nearest-Neighbor and Prototype Methods
  - Local methods for classification.
  - Estimate the decision boundary locally.
  - Examples: nearest-neighbor classification and Kohonen's learning vector quantization.
MLP Classifiers

- Use 1-of-J output encoding and sigmoid output units.
- Practical hints and implementation issues (see the sketch after this list):
  - Pre-scaling of input variables:
    - Scale input data to the range [-0.5, 0.5].
    - Typically pre-scaled to zero mean, unit variance.
    - Helps to avoid premature saturation and speeds up training.
  - Alternative target output values:
    - Training outputs are set to the values 0.1 and 0.9.
    - Avoids long training times and extremely large weights during training.
  - Initialization:
    - Network parameters are initialized to small random values.
    - The choice of initialization range has a subtle regularization effect.
  - Stopping rules:
    - During training: training should proceed as long as the decreasing continuous loss function reduces the empirical misclassification error.
    - Early stopping: used as a form of complexity control (model selection).
  - Multiple local minima:
    - The main factor complicating empirical risk minimization as well as model selection.
    - Use the misclassification error rather than the squared-error loss during model selection.
  - Learning rate and momentum term:
    - Affect the local minima found by backpropagation training.
    - The optimal choice of these is problem-dependent.
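A minimal sketch of the pre-scaling and soft-target conventions mentioned above (pure NumPy; the data are placeholders):

```python
import numpy as np

def prescale(X):
    """Zero mean, unit variance per input variable, then squeeze into [-0.5, 0.5]."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    return Z / (2 * np.abs(Z).max(axis=0))

def soft_targets(y):
    """Replace 0/1 targets with 0.1/0.9 to avoid saturating the output sigmoid."""
    return np.where(y == 1, 0.9, 0.1)

rng = np.random.default_rng(4)
X, y = rng.normal(5, 3, (100, 4)), rng.integers(0, 2, 100)
print(prescale(X).min(), prescale(X).max(), soft_targets(y)[:5])
```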
RBF Classifiers

- Use multiple-output regression to build a decision boundary.
- Form of the discriminant functions: each output is a linear combination of basis functions,

  $$g_j(\mathbf{x}) = \sum_{k=1}^{m} w_{jk}\, H_k(\mathbf{x}), \qquad j = 1, \ldots, J$$

RBF… (Cont'd)

- Characteristics (see the sketch after this list):
  - Implements local decision boundaries.
  - For fixed values of the basis function parameters, the weights $w_{jk}$ are estimated via linear least squares.
  - Complexity can be controlled by the number of basis functions $m$.
  - Possible to use resampling techniques to estimate the prediction risk for model selection.
  - Typically uses normalized basis functions, $H_k(\mathbf{x}) = K_k(\mathbf{x}) / \sum_l K_l(\mathbf{x})$, so that the basis outputs sum to one at every $\mathbf{x}$.
  - This allows the classifier to be interpreted as a type of density mixture model.
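A minimal sketch of the least-squares step, assuming normalized Gaussian basis functions with fixed, illustrative centers and width; only the output weights are fitted:

```python
import numpy as np

def design_matrix(X, centers, width):
    """Normalized Gaussian basis functions evaluated at each sample."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
    H = np.exp(-d2 / (2 * width ** 2))
    return H / H.sum(axis=1, keepdims=True)     # normalization: rows sum to 1

def fit_rbf_weights(X, Y, centers, width):
    """For fixed basis parameters, solve for the weights W by least squares."""
    H = design_matrix(X, centers, width)
    W, *_ = np.linalg.lstsq(H, Y, rcond=None)
    return W

rng = np.random.default_rng(5)
X = rng.normal(size=(200, 2))
Y = np.eye(2)[(X[:, 0] + X[:, 1] > 0).astype(int)]    # 1-of-J targets
centers = X[rng.choice(200, 10, replace=False)]       # m = 10 fixed centers
W = fit_rbf_weights(X, Y, centers, width=1.0)
pred = design_matrix(X, centers, 1.0) @ W
print(np.mean(pred.argmax(1) == Y.argmax(1)))
```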
Tree-Based Methods

- Adaptively split the input space into disjoint regions in order to construct a decision boundary.
- Based on a greedy optimization procedure.
- The splitting process can be represented as a binary tree.
- Following the growth of the tree, pruning occurs as a form of model selection.
- The growing-and-pruning strategy provides better classification accuracy than growing alone.

Tree-Based Methods

- The pruning criterion provides an estimate of the prediction risk, while the growing criterion roughly reflects the empirical risk.
- The resulting classifier has a binary tree representation.
CART

- Popular approach to constructing a binary-tree-based classifier.
- The greedy search employs a recursive partitioning strategy.
- Cost function:
  - Measures node impurity.
  - Gives a measurement of how homogeneous a node $t$ is with respect to the class labels of the training data in the region of node $t$.
CART (Cont'd)

- The impurity function

  $$Q(t) = Q(p(1 \mid t), p(2 \mid t), \ldots, p(J \mid t)), \qquad p(t) = \frac{n(t)}{n}, \qquad p(j \mid t) = \frac{n_j(t)}{n(t)}$$

  must meet these criteria:
  - $Q$ is at its maximum only for the probabilities $(1/J, \ldots, 1/J)$.
  - $Q$ is at its minimum only for the probabilities $(1, 0, \ldots, 0), (0, 1, 0, \ldots, 0), \ldots, (0, 0, \ldots, 0, 1)$.
  - $Q$ is a symmetric function of its arguments.
CART (Cont'd)

- Examples:
  - The Gini and entropy functions are used in practical implementations.
  - The Gini and entropy functions do not measure the classification risk directly.
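A small sketch of these two impurity functions; both satisfy the criteria above (maximum at the uniform distribution, minimum at the vertices, symmetry):

```python
import numpy as np

def gini(p):
    """Gini impurity Q(t) = 1 - sum_j p(j|t)^2."""
    p = np.asarray(p, dtype=float)
    return 1.0 - np.sum(p ** 2)

def entropy(p):
    """Entropy impurity Q(t) = -sum_j p(j|t) log p(j|t), with 0 log 0 := 0."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return -np.sum(p * np.log(p))

print(gini([0.5, 0.5]), gini([1.0, 0.0]))        # max at uniform, min at a vertex
print(entropy([0.5, 0.5]), entropy([1.0, 0.0]))
```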
CART (Cont'd)

- Two difficulties when using the empirical misclassification cost:
  - There are cases where the misclassification cost does not decrease for any candidate split, leading to early halting in a poor local minimum.
  - The misclassification cost does not favor splits that tend to provide a lower misclassification cost in future splits.
CART (Cont'd)

- Decrease in impurity: the variable $x_k$ and the split point $v$ are selected to maximize the decrease in node impurity,

  $$\Delta Q(s, t) = Q(t) - p_L\, Q(t_L) - p_R\, Q(t_R)$$

  where $t_L, t_R$ are the daughter nodes produced by split $s$, and $p_L, p_R$ are the fractions of the node's samples sent to each daughter.
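A minimal sketch of the exhaustive split search maximizing this impurity decrease, using the Gini function (brute force over every variable and every midpoint between sorted values; names are illustrative):

```python
import numpy as np

def gini(p):
    return 1.0 - np.sum(np.asarray(p, float) ** 2)

def class_probs(y, J):
    return np.bincount(y, minlength=J) / max(len(y), 1)

def best_split(X, y, J):
    """Exhaustive search for the (variable, split point) pair maximizing
    dQ(s, t) = Q(t) - p_L Q(t_L) - p_R Q(t_R)."""
    Q_t = gini(class_probs(y, J))
    best_k, best_v, best_dQ = None, None, -np.inf
    for k in range(X.shape[1]):
        xs = np.sort(X[:, k])
        for v in (xs[:-1] + xs[1:]) / 2:          # candidate knot points
            left = X[:, k] <= v
            if left.all() or not left.any():       # skip degenerate splits
                continue
            p_L = left.mean()
            dQ = Q_t - p_L * gini(class_probs(y[left], J)) \
                     - (1 - p_L) * gini(class_probs(y[~left], J))
            if dQ > best_dQ:
                best_k, best_v, best_dQ = k, v, dQ
    return best_k, best_v, best_dQ

rng = np.random.default_rng(6)
X = rng.normal(size=(100, 3))
y = (X[:, 1] > 0.2).astype(int)
print(best_split(X, y, J=2))
```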
CART (Cont'd)

- Pruning:
  - Performed after growing is completed.
  - Implements model selection.
  - Based on minimizing the penalized empirical risk

    $$R_{pen}(T) = R_{emp}(T) + \lambda |T|$$

    where $R_{emp}$ is the misclassification rate for the training data and $|T|$ is the number of terminal nodes.
  - Performed in a greedy search strategy.
  - Every pair of sibling leaf nodes is recombined in order to find a pair that reduces this criterion.
CART (Cont'd)

- Procedure:
  - Initialization:
    - The root node consists of the whole input space.
    - Estimate the proportions of the classes via $p(j \mid t = 0) = n_j(0)/n$.
  - Tree growing: repeat until the stopping criterion is satisfied.
    - Perform an exhaustive search over all valid nodes in the tree, all split variables, and all valid knot points.
    - Incorporate into the tree the daughters that result in the largest decrease in impurity, using the Gini or entropy cost function.
CART (Cont'd)

- Tree pruning: repeat until no more pruning occurs.
  - Perform an exhaustive search over all sibling leaf nodes in the tree, measuring the change in the model selection criterion.
  - Delete the pair that leads to the largest decrease of the model selection criterion.
- Disadvantages:
  - Sensitive to coordinate rotations, because CART partitions the space into axis-oriented subregions.
  - Can be alleviated by splitting on linear combinations of input variables.
Nearest-Neighbor and Prototype Methods

- Goal of local methods for classification: construction of local decision boundaries.
- In the SLT view, local methods for classification follow the framework of local risk minimization.
- In classical decision theory, local methods are interpreted as local posterior density estimation.
Nearest-Neighbor Classification

- Classifies an object based on the classes of the $k$ data points nearest to the estimation point $\mathbf{x}_0$.
- Nearness is most commonly measured using the Euclidean distance metric in x-space.
- The local decision rule is constructed using the procedure of local risk minimization.
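A minimal sketch of k-nearest-neighbor classification with the Euclidean metric ($k$ and the data are illustrative choices):

```python
import numpy as np

def knn_classify(X_train, y_train, x0, k=5):
    """Majority vote among the k training points nearest to x0."""
    dist = np.linalg.norm(X_train - x0, axis=1)   # Euclidean distances in x-space
    nearest = np.argsort(dist)[:k]
    return np.bincount(y_train[nearest]).argmax()

rng = np.random.default_rng(7)
X = np.vstack([rng.normal(-1, 1, (50, 2)), rng.normal(1, 1, (50, 2))])
y = np.repeat([0, 1], 50)
print(knn_classify(X, y, x0=np.array([0.8, 0.5])))
```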
Nearest… (Cont'd)

- Example: two-class problem.
- Empirical risk: minimizing the local empirical risk (0/1 loss over the $k$ nearest samples) with a locally constant decision rule yields the majority vote among the neighbors.
Nearest… (Cont'd)

- Reasons for the method's success:
  - Practical problems often have a low intrinsic dimensionality even though they may have many input variables.
  - The effect of the curse of dimensionality is not as severe, due to the nature of the classification problem.
  - Accurate estimates of conditional probabilities are not necessary for accurate classification.
Learning Vector Quantization (LVQ)

- Outline:
  - Use vector quantization methods to determine the initial locations of $m$ prototype vectors.
  - Assign class labels to these prototypes.
  - Adjust the locations using a heuristic strategy.
  - Reduce the empirical misclassification risk.
- Variants: LVQ1, LVQ2, and LVQ3, by Kohonen.
LVQ (Cont'd)

- LVQ1: a stochastic approximation method.
- Given a data point $(\mathbf{x}(k), y(k))$, prototype centers $\mathbf{c}_j(k)$, and prototype labels $w_j$:
  - Determine the prototype center nearest to the data point.
  - Update the location of that nearest prototype $\mathbf{c}_j$:
    - If correctly classified ($w_j = y(k)$): $\mathbf{c}_j(k+1) = \mathbf{c}_j(k) + \beta(k)\,[\mathbf{x}(k) - \mathbf{c}_j(k)]$
    - Else: $\mathbf{c}_j(k+1) = \mathbf{c}_j(k) - \beta(k)\,[\mathbf{x}(k) - \mathbf{c}_j(k)]$
  - Set $k = k + 1$.
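A minimal sketch of the LVQ1 iteration above; for brevity, the vector-quantization initialization is replaced by random sampling from the data, and the decaying rate $\beta(k)$ is an illustrative choice:

```python
import numpy as np

def lvq1(X, y, m=6, epochs=20, beta0=0.3, seed=0):
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(X), m, replace=False)
    C, labels = X[idx].copy(), y[idx].copy()      # prototype centers, labels w_j
    k = 0
    for _ in range(epochs):
        for i in rng.permutation(len(X)):
            beta = beta0 / (1 + 0.01 * k)          # decaying learning rate beta(k)
            j = np.argmin(np.linalg.norm(C - X[i], axis=1))  # nearest prototype
            sign = 1.0 if labels[j] == y[i] else -1.0        # attract or repel
            C[j] += sign * beta * (X[i] - C[j])
            k += 1
    return C, labels

rng = np.random.default_rng(8)
X = np.vstack([rng.normal(-1, 1, (100, 2)), rng.normal(1, 1, (100, 2))])
y = np.repeat([0, 1], 100)
C, labels = lvq1(X, y)
print(labels[np.argmin(np.linalg.norm(C - np.array([1.0, 1.0]), axis=1))])
```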
Empirical Comparisons (1)

- Mixture of Gaussians (Ripley, 1994).
Empirical Comparisons (2)

- Linearly separable problem:
  - Training data sets are generated according to the distribution $\mathbf{x} \sim N(0, I)$, $\mathbf{x} \in \Re^{10}$.
  - Ten training sets are generated, and each data set contains 200 samples.
  - Prediction risk is estimated for each individual classifier using a large test set (2,000 samples).
Empirical Comparisons (3)

- Waveform data (a generator sketch follows below):
  - First used in Breiman et al. (1984).
  - 21 input variables that correspond to 21 discrete time samples taken from a randomly generated waveform.
  - Each waveform is generated using a random linear combination of two out of three possible component waveforms $h_1, h_2, h_3$, plus noise:

    $$\text{Class 1: } x_{ij} = u_i\, h_1(j) + (1 - u_i)\, h_2(j) + \varepsilon_{ij}$$
    $$\text{Class 2: } x_{ij} = u_i\, h_1(j) + (1 - u_i)\, h_3(j) + \varepsilon_{ij}$$
    $$\text{Class 3: } x_{ij} = u_i\, h_2(j) + (1 - u_i)\, h_3(j) + \varepsilon_{ij}$$

  - Ten training sets are generated, and each data set contains 300 samples.
  - Prediction risk is estimated using a large test set (2,000 samples).
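A minimal sketch of the waveform generator, using triangular component waveforms as commonly implemented for the Breiman et al. (1984) data (peaks at positions 11, 15, and 7; treat the exact shapes as an assumption, since the slides do not specify them):

```python
import numpy as np

J = np.arange(1, 22)                               # 21 discrete time samples
h1 = np.maximum(6 - np.abs(J - 11), 0)             # triangular components
h2 = np.maximum(6 - np.abs(J - 15), 0)
h3 = np.maximum(6 - np.abs(J - 7), 0)
PAIRS = {1: (h1, h2), 2: (h1, h3), 3: (h2, h3)}    # class -> component pair

def waveform_sample(cls, rng):
    """x_j = u * h_a(j) + (1 - u) * h_b(j) + eps_j, u ~ U(0,1), eps ~ N(0,1)."""
    ha, hb = PAIRS[cls]
    u = rng.uniform()
    return u * ha + (1 - u) * hb + rng.normal(size=21)

rng = np.random.default_rng(9)
X = np.array([waveform_sample(c, rng) for c in rng.integers(1, 4, 300)])
print(X.shape)                                      # (300, 21) training set
```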
Summary

- Understanding classification methods requires a clear separation between the conceptual procedure, based on the SRM inductive principle, and its technical implementation.
- Conceptual procedure: two necessary things:
  - Minimize the empirical classification error (via nonlinear optimization).
  - Accurately estimate the future classification error (model selection).
Summary (Cont'd)

- Technical implementation:
  - Complicated by the discontinuous misclassification error functional, which prevents direct minimization of the empirical risk.
  - All practical methods use a suitable continuous loss function that approximates the misclassification error.
  - In model selection, one should use the classification error loss.
- Accurate estimation of the posterior probabilities is not necessary for accurate classification.

  "Good probability estimates are not necessary for good classification; similarly, low classification error does not imply that the corresponding class probabilities are being estimated (even remotely) accurately." - Friedman (1997)
Summary (Cont'd)

- SLT's explanation of the empirical evidence that simple methods behave well:
  - Simple classification methods may not require nonlinear optimization, so the empirical classification error is minimized directly.
  - Often simple methods provide the same level of empirical misclassification error in the minimization step as more complex methods, so there is no need to use more complex methods.
  - Classification problems are inherently less sensitive than regression to optimal model selection.