PRML 2.4-2.5: The Exponential Family & Nonparametric Methods

PRML 2.4-2.5 The exponential family & Nonparametric methods June 11, 2014 by Shinichi TAMURA

Transcript of PRML 2.4-2.5: The Exponential Family & Nonparametric Methods

Page 1: PRML 2.4-2.5: The Exponential Family & Nonparametric Methods

PRML 2.4-2.5

The exponential family & Nonparametric methods

June 11, 2014

by Shinichi TAMURA

Page 2: PRML 2.4-2.5: The Exponential Family & Nonparametric Methods

NONPARAMETRIC METHODS THE EXPONENTIAL FAMILY

Today's topics

1. The exponential family
   1. What is the exponential family?
   2. Maximum likelihood for EF
   3. How to decide priors for EF

2. Nonparametric methods
   1. What is the point of nonparametric methods?
   2. Kernel density estimator
   3. Nearest-neighbour methods

June 11, 2014 PRML 2.4-2.5 Shinichi TAMURA

Page 12: PRML 2.4-2.5: The Exponential Family & Nonparametric Methods

The Exponential Family

Almost all of the distributions we have studied so far belong to a single class, namely the exponential family.

[Diagram: within the parametric distributions, the exponential family contains the Bernoulli, multinomial, Gaussian, beta, gamma, von Mises, etc.; Gaussian mixtures etc. lie outside it.]

Page 16: PRML 2.4-2.5: The Exponential Family & Nonparametric Methods

The Exponential Family

The exponential family over $x$ given $\eta$ is the class of distributions of the form

$$p(x \mid \eta) = h(x)\, g(\eta) \exp\{\eta^{\mathrm{T}} u(x)\}$$

$\eta$: natural parameter

$g(\eta)$: normalizing constant

$\eta^{\mathrm{T}} u(x)$: the term where $x$ and $\eta$ come across each other

Page 18: PRML 2.4-2.5: The Exponential Family & Nonparametric Methods

The Exponential Family

E.g. 1) The Bernoulli Distribution

$$p(x \mid \eta) = \mu^{x}(1-\mu)^{1-x} = \sigma(-\eta)\exp(\eta x)$$

where

$$\eta = \ln\left(\frac{\mu}{1-\mu}\right), \qquad u(x) = x, \qquad h(x) = 1, \qquad g(\eta) = \sigma(-\eta)$$
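
As a quick numerical sanity check of the Bernoulli rewrite, the following Python sketch compares the usual form $\mu^x(1-\mu)^{1-x}$ with the exponential-family form $\sigma(-\eta)\exp(\eta x)$, taking $\sigma$ to be the logistic sigmoid. The function names (`sigmoid`, `bernoulli_standard`, `bernoulli_expfam`) are illustrative and do not come from the slides.

```python
import math

def sigmoid(a):
    """Logistic sigmoid: 1 / (1 + exp(-a))."""
    return 1.0 / (1.0 + math.exp(-a))

def bernoulli_standard(x, mu):
    """Bernoulli pmf in its usual form: mu^x (1 - mu)^(1 - x)."""
    return mu ** x * (1.0 - mu) ** (1 - x)

def bernoulli_expfam(x, mu):
    """Same pmf in exponential-family form: sigmoid(-eta) * exp(eta * x)."""
    eta = math.log(mu / (1.0 - mu))  # natural parameter (log-odds)
    return sigmoid(-eta) * math.exp(eta * x)

# The two forms should agree for every mu in (0, 1) and x in {0, 1}.
for mu in (0.1, 0.5, 0.9):
    for x in (0, 1):
        assert math.isclose(bernoulli_standard(x, mu), bernoulli_expfam(x, mu))
```

The check works because $\sigma(-\eta) = 1 - \mu$ and $\exp(\eta) = \mu / (1 - \mu)$.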

Page 21: PRML 2.4-2.5: The Exponential Family & Nonparametric Methods

The Exponential Family

E.g. 2) The Multinomial Distribution

$$p(x \mid \eta) = \prod_{k=1}^{M} \mu_k^{x_k} = \exp(\eta^{\mathrm{T}} x)$$

where

$$\eta = (\ln\mu_1, \ldots, \ln\mu_M)^{\mathrm{T}}, \qquad u(x) = x, \qquad h(x) = 1, \qquad g(\eta) = 1$$

subject to the constraint

$$\sum_{k} \exp(\eta_k) = \sum_{k} \mu_k = 1$$

It's inconvenient!

Page 24: PRML 2.4-2.5: The Exponential Family & Nonparametric Methods

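The Exponential Family

E.g. 2) The Multinomial Distribution

Remove the constraint by

$$\mu_M = 1 - \sum_{k=1}^{M-1}\mu_k, \qquad x_M = 1 - \sum_{k=1}^{M-1} x_k$$

Therefore...

$$\begin{aligned}
p(x \mid \mu) &= \exp\left\{ \sum_{k=1}^{M-1} x_k \ln\mu_k + \left(1 - \sum_{k=1}^{M-1} x_k\right) \ln\left(1 - \sum_{k=1}^{M-1}\mu_k\right) \right\} \\
&= \exp\left\{ \sum_{k=1}^{M-1} x_k \ln\left(\frac{\mu_k}{1 - \sum_{j=1}^{M-1}\mu_j}\right) + \ln\left(1 - \sum_{k=1}^{M-1}\mu_k\right) \right\} \\
&= \left(1 - \sum_{k=1}^{M-1}\mu_k\right) \exp\left\{ \sum_{k=1}^{M-1} x_k \ln\left(\frac{\mu_k}{1 - \sum_{j=1}^{M-1}\mu_j}\right) \right\}.
\end{aligned}$$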

Page 25: PRML 2.4-2.5: The Exponential Family & Nonparametric Methods

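The Exponential Family

E.g. 2') The Multinomial Distribution w/o constraint

$$p(x \mid \eta) = \prod_{k=1}^{M}\mu_k^{x_k} = \left(1 + \sum_{k=1}^{M-1}\exp(\eta_k)\right)^{-1}\exp(\eta^{\mathrm{T}} x)$$

where

$$\eta = \left(\ln\left(\frac{\mu_1}{1-\sum_{j=1}^{M-1}\mu_j}\right), \ldots, \ln\left(\frac{\mu_{M-1}}{1-\sum_{j=1}^{M-1}\mu_j}\right), 0\right)^{\mathrm{T}}$$

$$u(x) = x, \qquad h(x) = 1, \qquad g(\eta) = \left(1+\sum_{k=1}^{M-1}\exp(\eta_k)\right)^{-1}$$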
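
The unconstrained form can be checked numerically. The sketch below is a minimal illustration, assuming a 1-of-M coded $x$ and hypothetical helper names (`natural_params`, `multinomial_expfam`). It maps probabilities to $\eta$ with the last component fixed to 0, then confirms that $g(\eta)\exp(\eta^{\mathrm{T}} x)$ recovers each $\mu_k$.

```python
import math

def natural_params(mu):
    """Map probabilities mu_1..mu_M to eta, with eta_M fixed to 0."""
    mu_last = 1.0 - sum(mu[:-1])  # mu_M = 1 - sum_{j<M} mu_j
    return [math.log(m / mu_last) for m in mu[:-1]] + [0.0]

def multinomial_expfam(x, eta):
    """g(eta) * exp(eta^T x) with g(eta) = (1 + sum_{k<M} exp(eta_k))^-1."""
    g = 1.0 / (1.0 + sum(math.exp(e) for e in eta[:-1]))
    return g * math.exp(sum(e * xk for e, xk in zip(eta, x)))

mu = [0.2, 0.5, 0.3]
eta = natural_params(mu)
for k in range(len(mu)):
    x = [1 if i == k else 0 for i in range(len(mu))]  # 1-of-M coding
    assert math.isclose(multinomial_expfam(x, eta), mu[k])
```

Inverting the map gives $\mu_k = \exp(\eta_k) / (1 + \sum_j \exp(\eta_j))$, which is the softmax function.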

Page 27: PRML 2.4-2.5: The Exponential Family & Nonparametric Methods

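The Exponential Family

E.g. 3) The Gaussian Distribution

$$\begin{aligned}
p(x \mid \eta) &= \frac{1}{(2\pi\sigma^2)^{1/2}} \exp\left\{-\frac{1}{2\sigma^2}(x-\mu)^2\right\} \\
&= (2\pi)^{-1/2}(-2\eta_2)^{1/2}\exp\left(\frac{\eta_1^2}{4\eta_2}\right)\exp\left\{\begin{pmatrix}\eta_1 & \eta_2\end{pmatrix}\begin{pmatrix}x \\ x^2\end{pmatrix}\right\}
\end{aligned}$$

where

$$\eta = \left(\frac{\mu}{\sigma^2}, -\frac{1}{2\sigma^2}\right)^{\mathrm{T}}, \qquad u(x) = (x, x^2)^{\mathrm{T}}, \qquad h(x) = (2\pi)^{-1/2}, \qquad g(\eta) = (-2\eta_2)^{1/2}\exp\left(\frac{\eta_1^2}{4\eta_2}\right)$$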

Page 30: PRML 2.4-2.5: The Exponential Family & Nonparametric Methods

Maximum likelihood for EF

OK, we know what the EF looks like. Then how do we estimate the parameter? Maximize the likelihood! (The frequentist way.)

Page 33: PRML 2.4-2.5: The Exponential Family & Nonparametric Methods

Maximum likelihood for EF

Suppose we have i.i.d. data $X = \{x_1, \ldots, x_N\}$. The log-likelihood of $\eta$ is

$$\begin{aligned}
\ln p(X \mid \eta) &= \ln\left[\prod_{n=1}^{N} p(x_n \mid \eta)\right] \\
&= \ln\left[\prod_{n=1}^{N} h(x_n)\, g(\eta) \exp\{\eta^{\mathrm{T}} u(x_n)\}\right] \\
&= \sum_{n=1}^{N} \ln h(x_n) + N \ln g(\eta) + \eta^{\mathrm{T}} \sum_{n=1}^{N} u(x_n).
\end{aligned}$$

$$\therefore\ \nabla_{\eta} \ln p(X \mid \eta) = N \nabla_{\eta} \ln g(\eta) + \sum_{n=1}^{N} u(x_n).$$

Set this gradient to zero.

Page 34: PRML 2.4-2.5: The Exponential Family & Nonparametric Methods

Maximum likelihood for EF

Therefore

$$-\nabla \ln g(\eta_{\mathrm{ML}}) = \frac{1}{N}\sum_{n=1}^{N} u(x_n).$$

Here, $\eta_{\mathrm{ML}}$ is determined only through $\sum_n u(x_n)$, so $\sum_n u(x_n)$ is called the “sufficient statistic”. We need to store only $\sum_n u(x_n)$ for estimation.

Page 35: PRML 2.4-2.5: The Exponential Family & Nonparametric Methods

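Maximum likelihood for EF

E.g.) Gaussian distribution

By $g(\eta) = (-2\eta_2)^{1/2}\exp\left(\eta_1^2 / 4\eta_2\right)$ and $u(x) = (x, x^2)^{\mathrm{T}}$,

$$-\nabla \ln g(\eta) = \begin{pmatrix} -\dfrac{\eta_1}{2\eta_2} \\[2mm] -\dfrac{1}{2\eta_2} + \dfrac{\eta_1^2}{4\eta_2^2} \end{pmatrix} = \begin{pmatrix} \mu \\ \sigma^2 + \mu^2 \end{pmatrix}.$$

$$\therefore\ \mu_{\mathrm{ML}} = \frac{1}{N}\sum_n x_n, \qquad \sigma^2_{\mathrm{ML}} = \frac{1}{N}\sum_n x_n^2 - \left(\frac{1}{N}\sum_n x_n\right)^2.$$

That's what we already know.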
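
To illustrate that only the sufficient statistics need to be stored, here is a small Python sketch that computes the Gaussian maximum likelihood estimates from $N$, $\sum_n x_n$ and $\sum_n x_n^2$ alone. The function name `gaussian_ml_from_sufficient_stats` and the toy data are assumptions made for this example.

```python
def gaussian_ml_from_sufficient_stats(n, sum_x, sum_x2):
    """Return (mu_ML, sigma2_ML) given N, sum_n x_n and sum_n x_n^2."""
    mean_u1 = sum_x / n      # (1/N) sum x_n   -> mu
    mean_u2 = sum_x2 / n     # (1/N) sum x_n^2 -> sigma^2 + mu^2
    return mean_u1, mean_u2 - mean_u1 ** 2

data = [1.0, 2.0, 3.0, 4.0]          # toy data, for illustration only
n = len(data)
sum_x = sum(data)                    # 10.0
sum_x2 = sum(x * x for x in data)    # 30.0

# After this point the raw data is no longer needed.
mu_ml, sigma2_ml = gaussian_ml_from_sufficient_stats(n, sum_x, sum_x2)
print(mu_ml, sigma2_ml)  # 2.5 1.25
```

In a streaming setting the two running sums can be updated one observation at a time, although subtracting two large, nearly equal numbers can lose precision when the variance is small relative to the mean.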

Page 36: PRML 2.4-2.5: The Exponential Family & Nonparametric Methods

Maximum likelihood for EF

By the way, we want to know the relation between $\eta$ and $\eta_{\mathrm{ML}}$.

Page 38: PRML 2.4-2.5: The Exponential Family & Nonparametric Methods

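Maximum likelihood for EF

Taking the gradient of the normalization condition

$$\int h(x)\, g(\eta) \exp\{\eta^{\mathrm{T}} u(x)\}\, dx = 1$$

with respect to $\eta$ gives

$$\nabla g(\eta) \int h(x) \exp\{\eta^{\mathrm{T}} u(x)\}\, dx + \int h(x)\, g(\eta) \exp\{\eta^{\mathrm{T}} u(x)\}\, u(x)\, dx = 0.$$

The first integral equals $1/g(\eta)$ and the second term is $\mathrm{E}[u(x)]$, so

$$\therefore\ -\nabla \ln g(\eta) = \mathrm{E}[u(x)].$$

Similar to

$$-\nabla \ln g(\eta_{\mathrm{ML}}) = \frac{1}{N}\sum_{n=1}^{N} u(x_n).$$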

Page 41: PRML 2.4-2.5: The Exponential Family & Nonparametric Methods

Maximum likelihood for EF

According to the law of large numbers (LLN), the sample mean converges to the expectation, so $\eta_{\mathrm{ML}}$ converges to $\eta$:

$$-\nabla \ln g(\eta_{\mathrm{ML}}) = \frac{1}{N}\sum_{n=1}^{N} u(x_n) \;\longrightarrow\; \mathrm{E}[u(x)] = -\nabla \ln g(\eta) \qquad (N \to \infty)$$
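
The convergence can be observed in a small simulation. The sketch below uses the Bernoulli case, where $g(\eta) = \sigma(-\eta)$ and $u(x) = x$, so the maximum likelihood condition reduces to $\eta_{\mathrm{ML}} = \ln\{m / (1 - m)\}$ with $m$ the sample mean. The seed, the true parameter `mu_true` and the sample sizes are arbitrary choices for illustration.

```python
import math
import random

def eta_ml_bernoulli(samples):
    """Solve -d/d(eta) ln g(eta) = mean(u(x)) for the Bernoulli, where
    g(eta) = sigmoid(-eta) and u(x) = x, giving eta_ML = logit(sample mean)."""
    m = sum(samples) / len(samples)
    return math.log(m / (1.0 - m))

rng = random.Random(0)          # fixed seed so the run is repeatable
mu_true = 0.3
eta_true = math.log(mu_true / (1.0 - mu_true))

for n in (100, 10_000, 1_000_000):
    samples = [1 if rng.random() < mu_true else 0 for _ in range(n)]
    # The absolute error should tend to shrink as n grows.
    print(n, abs(eta_ml_bernoulli(samples) - eta_true))
```

The estimate is undefined when the sample mean is exactly 0 or 1, which becomes vanishingly unlikely at these sample sizes.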

Page 44: PRML 2.4-2.5: The Exponential Family & Nonparametric Methods

Priors for EF

If you want to use Bayesian inference, a prior distribution is needed. Then how do we decide on it if we don't know anything about the parameter?

Page 48: PRML 2.4-2.5: The Exponential Family & Nonparametric Methods

Priors for EF

Three candidates:
1. Conjugate priors ... Easy to handle
2. Uniform distributions ... Principle of indifference
3. Noninformative priors ... Keep the effect of the prior small

Page 55: PRML 2.4-2.5: The Exponential Family & Nonparametric Methods

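Priors for EF – Conjugate priors

Distributions of the EF have the factor $g(\eta)\exp(\eta^{\mathrm{T}} u)$, so the conjugate prior is

$$p(\eta \mid \chi, \nu) = f(\chi, \nu)\left\{g(\eta)\exp(\eta^{\mathrm{T}}\chi)\right\}^{\nu} = f(\chi, \nu)\, g(\eta)^{\nu} \exp\{\nu\, \eta^{\mathrm{T}}\chi\},$$

where $f(\chi, \nu)$ is the normalizing constant.

It gives the posterior as follows:

$$\begin{aligned}
p(\eta \mid X, \chi, \nu) &\propto \prod_{n=1}^{N} h(x_n)\, g(\eta) \exp\{\eta^{\mathrm{T}} u(x_n)\} \times g(\eta)^{\nu} \exp\{\nu\, \eta^{\mathrm{T}}\chi\} \\
&\propto g(\eta)^{N+\nu} \exp\left\{\eta^{\mathrm{T}}\left(\sum_{n=1}^{N} u(x_n) + \nu\chi\right)\right\}
\end{aligned}$$

The posterior has the same functional form as the prior (they correspond).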
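
Matching the posterior against the prior form $g(\eta)^{\nu'}\exp\{\nu'\,\eta^{\mathrm{T}}\chi'\}$ gives the hyperparameter update $\nu' = \nu + N$ and $\chi' = (\sum_n u(x_n) + \nu\chi) / (\nu + N)$. The Python sketch below implements this update under those assumptions. The function name `conjugate_update` and the example prior are hypothetical.

```python
def conjugate_update(chi, nu, sufficient_stats):
    """Update conjugate-prior hyperparameters (chi, nu) given a list of
    sufficient-statistic vectors u(x_n).

    Prior:     g(eta)^nu       * exp(nu * eta^T chi)
    Posterior: g(eta)^(nu + N) * exp(eta^T (sum_n u(x_n) + nu * chi))
    """
    n = len(sufficient_stats)
    dim = len(chi)
    total = [sum(u[d] for u in sufficient_stats) for d in range(dim)]
    nu_post = nu + n
    chi_post = [(total[d] + nu * chi[d]) / nu_post for d in range(dim)]
    return chi_post, nu_post

# Bernoulli example: u(x) = x, so each statistic is a 1-element vector.
chi0, nu0 = [0.5], 2.0             # hypothetical prior: 2 pseudo-observations, mean 0.5
observations = [[1], [1], [0], [1]]
chi_n, nu_n = conjugate_update(chi0, nu0, observations)
print(chi_n, nu_n)  # [0.6666666666666666] 6.0
```

Read this way, $\nu$ acts like an effective number of prior pseudo-observations, each contributing the sufficient statistic $\chi$.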

Page 58: PRML 2.4-2.5: The Exponential Family & Nonparametric Methods

Priors for EF – Uniform distributions

The uniform distribution is a common choice for a discrete, bounded variable. C.f.: the principle of insufficient reason (or principle of indifference).

But two problems arise when it is applied to continuous variables:
1. The normalization problem
2. The transformation problem

Page 60: PRML 2.4-2.5: The Exponential Family & Nonparametric Methods

Priors for EF – Uniform distributions

1. Normalization problem

If the parameter is unbounded,

$$\int_{-\infty}^{\infty} p(\lambda)\, d\lambda = \int_{-\infty}^{\infty} \mathrm{const}\; d\lambda \to \infty.$$

These priors are called “improper”.

Note that these priors can still give proper posteriors, because the posterior is then proportional to the likelihood, which can often be normalized.

Page 63: PRML 2.4-2.5: The Exponential Family & Nonparametric Methods

Priors for EF – Uniform distributions

2. Transformation problem

A non-linear transformation gives a non-constant prior.

E.g.) With $p(\lambda) = 1$ and $\eta = \sqrt{\lambda}$,

$$p(\eta) = p(\lambda)\left|\frac{d\lambda}{d\eta}\right| = 2\eta.$$

Not constant for $\eta$! Think "constant for what?"

(Sometimes, the posteriors are not sensitive to the difference.)
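
The transformation problem can be seen by simulation. The sketch below assumes $\lambda$ is uniform on $[0, 1)$ and $\eta = \sqrt{\lambda}$, so the change of variables predicts $p(\eta) = 2\eta$ and hence $P(a \le \eta < b) = b^2 - a^2$. The seed, sample size and the helper name `empirical_prob` are arbitrary choices for illustration.

```python
import random

rng = random.Random(0)
n = 200_000
# lambda ~ Uniform(0, 1), i.e. p(lambda) = 1; then eta = sqrt(lambda).
etas = [rng.random() ** 0.5 for _ in range(n)]

def empirical_prob(a, b):
    """Fraction of the eta samples that fall in [a, b)."""
    return sum(a <= e < b for e in etas) / n

# If p(eta) = 2 * eta on [0, 1], then P(a <= eta < b) = b^2 - a^2,
# so the upper half-interval should hold about three times as much mass.
for a, b in [(0.0, 0.5), (0.5, 1.0)]:
    print(f"[{a}, {b}): empirical {empirical_prob(a, b):.3f}, "
          f"predicted {b * b - a * a:.3f}")
```

A prior that is flat in $\lambda$ therefore puts about three quarters of its mass on $\eta \ge 0.5$, so "uniform" depends on the parameterization.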

Page 64: PRML 2.4-2.5: The Exponential Family & Nonparametric Methods

Priors for EF – Uniform distributions

Keep these problems in mind:
1. The normalization problem
2. The transformation problem

Page 66: PRML 2.4-2.5: The Exponential Family & Nonparametric Methods

Priors for EF – Noninformative priors

Two examples of noninformative priors:
1. Priors for location parameters
2. Priors for scale parameters

These are constructed to affect the posterior as little as possible, so that the inference is as objective as possible.

Page 69: PRML 2.4-2.5: The Exponential Family & Nonparametric Methods

Priors for EF – Noninformative priors

1. Priors for location parameters

If the density has the form

$$p(x \mid \mu) = f(x - \mu),$$

a constant shift $\hat{x} = x + c$ gives a density of the same form, with $\hat{\mu} = \mu + c$:

$$p(\hat{x} \mid \hat{\mu}) = f(\hat{x} - \hat{\mu}).$$

This property is called “translation invariance”, and such a parameter is called a “location parameter”.

Page 71: PRML 2.4-2.5: The Exponential Family & Nonparametric Methods

Priors for EF – Noninformative priors

1. Priors for location parameters

To reflect the translation invariance, the prior should satisfy

$$\int_{A}^{B} p(\mu)\, d\mu = \int_{A}^{B} p(\mu - c)\, d\mu \quad \text{for all } A, B.$$

$$\Longleftrightarrow\ p(\mu) = p(\mu - c). \qquad \Longleftrightarrow\ p(\mu) = \text{constant}.$$

We obtained uniform distributions after all. But unlike before, we know when to use them.

Page 74: PRML 2.4-2.5: The Exponential Family & Nonparametric Methods

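Priors for EF – Noninformative priors

1. Priors for location parameters

E.g.) The mean of a Gaussian

$$p(x \mid \mu) = \frac{1}{(2\pi\sigma^2)^{1/2}} \exp\left\{-\frac{1}{2\sigma^2}(x - \mu)^2\right\}$$

This has the form $f(x - \mu)$.

This prior is also obtained as a limit of the conjugate prior:

$$p(\mu) = \mathcal{N}(\mu \mid \mu_0, \sigma_0^2) \xrightarrow{\ \sigma_0^2 \to \infty\ } \text{const.},$$

$$\mu_N = \frac{\sigma^2}{N\sigma_0^2 + \sigma^2}\,\mu_0 + \frac{N\sigma_0^2}{N\sigma_0^2 + \sigma^2}\,\mu_{\mathrm{ML}} \to \mu_{\mathrm{ML}},$$

$$\frac{1}{\sigma_N^2} = \frac{1}{\sigma_0^2} + \frac{N}{\sigma^2} \to \frac{N}{\sigma^2}.$$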

Page 77: PRML 2.4-2.5: The Exponential Family & Nonparametric Methods

Priors for EF – Noninformative priors

2. Priors for scale parameters

If the density has the form

$$p(x \mid \sigma) = \frac{1}{\sigma} f\left(\frac{x}{\sigma}\right),$$

a constant scaling $\hat{x} = cx$ gives a density of the same form, with $\hat{\sigma} = c\sigma$:

$$p(\hat{x} \mid \hat{\sigma}) = \frac{1}{\hat{\sigma}} f\left(\frac{\hat{x}}{\hat{\sigma}}\right).$$

This property is called “scale invariance”, and such a parameter is called a “scale parameter”.

Page 78: PRML 2.4-2.5: The Exponential Family & Nonparametric Methods

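Priors for EF – Noninformative priors

2. Priors for scale parameters

To reflect the scale invariance, the prior should satisfy

$$\int_{A}^{B} p(\sigma)\, d\sigma = \int_{A}^{B} p\left(\frac{1}{c}\sigma\right) \left|\frac{d\sigma}{d(c\sigma)}\right| d\sigma \quad \text{for all } A, B.$$

$$\Longleftrightarrow\ p(\sigma) = \frac{1}{c}\, p\left(\frac{1}{c}\sigma\right).$$

$$\Longleftrightarrow\ p(\sigma) \propto \frac{1}{\sigma}.$$

$$\Longleftrightarrow\ p(\ln \sigma) = \text{const}.$$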

Page 81: PRML 2.4-2.5: The Exponential Family & Nonparametric Methods

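Priors for EF – Noninformative priors

2. Priors for scale parameters

E.g.) The standard deviation of a Gaussian

$$p(\tilde{x} \mid \sigma) = \frac{1}{(2\pi\sigma^2)^{1/2}} \exp\left\{-\frac{1}{2\sigma^2}\tilde{x}^2\right\}, \qquad \tilde{x} = x - \mu$$

This has the form $\frac{1}{\sigma} f\left(\frac{\tilde{x}}{\sigma}\right)$.

This prior is also obtained as a limit of the conjugate prior, written in terms of the precision $\lambda = 1/\sigma^2$:

$$p(\lambda) = \mathrm{Gam}(\lambda \mid a_0, b_0) \xrightarrow{\ a_0, b_0 \to 0\ } \frac{\text{const}}{\lambda},$$

$$a_N = a_0 + \frac{N}{2} \to \frac{N}{2},$$

$$b_N = b_0 + \frac{N}{2}\sigma^2_{\mathrm{ML}} \to \frac{N}{2}\sigma^2_{\mathrm{ML}}.$$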

Page 82: PRML 2.4-2.5: The Exponential Family & Nonparametric Methods

Priors for EF – Noninformative priors

Two examples of noninformative priors:

1. Priors for location parameters

$$p(x \mid \mu) = f(x - \mu) \ \Longrightarrow\ p(\mu) = \text{const}.$$

2. Priors for scale parameters

$$p(x \mid \sigma) = \frac{1}{\sigma} f\left(\frac{x}{\sigma}\right) \ \Longrightarrow\ p(\sigma) \propto \frac{1}{\sigma}$$
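
As a brief check (a sketch, not taken from the slides), substituting each prior into its invariance condition, $p(\mu) = p(\mu - c)$ for location and $p(\sigma) = \frac{1}{c}\, p(\sigma / c)$ for scale, confirms both results. Here $k$ denotes an arbitrary positive constant.

```latex
\begin{align*}
\text{location: } & p(\mu) = k
  \;\Rightarrow\; p(\mu - c) = k = p(\mu), \\
\text{scale: }    & p(\sigma) = \frac{k}{\sigma}
  \;\Rightarrow\; \frac{1}{c}\, p\!\left(\frac{\sigma}{c}\right)
  = \frac{1}{c} \cdot \frac{k\,c}{\sigma}
  = \frac{k}{\sigma} = p(\sigma).
\end{align*}
```

Both priors are improper when the parameter ranges over an unbounded domain, so the normalization problem noted earlier still applies.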

Page 86: PRML 2.4-2.5: The Exponential Family & Nonparametric Methods

NONPARAMETRIC METHODS THE EXPONENTIAL FAMILY

Nonparametric methods

We learned “parametric approach”

June 11, 2014 PRML 2.4-2.5 Shinichi TAMURA June 11, 2014 PRML 2.4-2.5 Shinichi TAMURA

Page 87: PRML 2.4-2.5: The Exponential Family & Nonparametric Methods

NONPARAMETRIC METHODS THE EXPONENTIAL FAMILY

Nonparametric methods

We learned “parametric approach” vs.

We will learn “nonparametric approach”

June 11, 2014 PRML 2.4-2.5 Shinichi TAMURA

Page 88: PRML 2.4-2.5: The Exponential Family & Nonparametric Methods

Nonparametric methods

So far we have learned the “parametric approach”
vs.
Now we will learn the “nonparametric approach”

What is the difference?

Page 90: PRML 2.4-2.5: The Exponential Family & Nonparametric Methods

Nonparametric methods

| Parametric | Nonparametric |
|---|---|
| Assumes a specific form of the distribution | Makes few assumptions about the form of the distribution |
| Simple | Complex (depends on data size) |
| Poor | Rich / Flexible |
| Efficient | Inefficient |

Page 91: PRML 2.4-2.5: The Exponential Family & Nonparametric Methods

Today's topics

1. The exponential family
   1. What is the exponential family?
   2. Maximum likelihood for EF
   3. How to decide priors for EF

2. Nonparametric methods
   1. What is the point of nonparametric methods?
   2. Kernel density estimator
   3. Nearest-neighbour methods

Page 92: PRML 2.4-2.5: The Exponential Family & Nonparametric Methods

Nonparametric methods

We will learn:
1. Histogram methods
2. Kernel density estimators
3. Nearest-neighbour methods

Page 95: PRML 2.4-2.5: The Exponential Family & Nonparametric Methods

Nonparametric methods

1. Histogram methods
Split the space into grids (or bins), and count the data points in each bin:

$$p(x) = p_i = \frac{n_i}{N \Delta_i} \quad (x \in i\text{-th bin}),$$

where
$\Delta_i$ = width of the i-th bin (usually the same for all i),
$n_i$ = number of observations assigned to the i-th bin,
$N$ = total number of observations.

This estimate is piecewise constant, hence discontinuous at the bin edges.
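The histogram formula can be sketched in a few lines. This is a minimal illustration, assuming equal-width bins in one dimension; the function name `histogram_density`, the `origin` parameter and the sample values are hypothetical.

```python
import math

def histogram_density(data, x, delta, origin=0.0):
    """Histogram estimate p(x) = n_i / (N * delta) with equal-width bins of width delta."""
    def bin_of(v):
        return math.floor((v - origin) / delta)    # index of the bin containing v
    i = bin_of(x)
    n_i = sum(1 for v in data if bin_of(v) == i)   # observations assigned to bin i
    return n_i / (len(data) * delta)

data = [0.05, 0.12, 0.18, 0.31, 0.33, 0.38, 0.72, 0.91]  # hypothetical sample
p = histogram_density(data, x=0.35, delta=0.25)           # bin [0.25, 0.5) holds 3 points
print(p)
```

Every x in the same bin receives the same value, which is the piecewise-constant behaviour noted on the slide.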

Page 101: PRML 2.4-2.5: The Exponential Family & Nonparametric Methods

Nonparametric methods

1. Histogram methods – Example

[Figure: histogram density estimates of the same data set for three bin widths, Δ = 0.04, Δ = 0.08 and Δ = 0.25; horizontal axis from 0 to 1, vertical axis from 0 to 5]

Δ = 0.04: Too narrow to catch enough points. Too spiky (noisy).
Δ = 0.08: Good intermediate value.
Δ = 0.25: Too wide to express the data. Too smooth (less information).

Finding a good value of Δ is very important!

With M bins per dimension, the # of bins in D dimensions is $M^D$ (curse of dimensionality).

Page 104: PRML 2.4-2.5: The Exponential Family & Nonparametric Methods

Nonparametric methods

Lessons from histogram methods

Estimate the density at a particular point from the data points in a small local region around it. The regions are defined by a “smoothing parameter”, which controls the model complexity relative to the data size.

Other problems
•  Discontinuity
•  Not scalable (curse of dimensionality)

Page 108: PRML 2.4-2.5: The Exponential Family & Nonparametric Methods

Nonparametric methods

Lessons from histogram methods
Let's consider a small local region $R$. Then

$$\Pr(K \text{ out of } N \text{ data points} \in R) = \frac{N!}{K!(N-K)!} P^K (1-P)^{N-K},$$

where $P = \int_R p(x)\,dx$. Suppose that

1.  K is large enough (smoother not too small)
2.  p(x) is approximately constant over $R$ (smoother small enough)

These two requirements are contradictory, and how well they can be balanced depends on the data size.

Page 109: PRML 2.4-2.5: The Exponential Family & Nonparametric Methods

Nonparametric methods

Lessons from histogram methods
Let's consider a small local region $R$. Then

$$\Pr(K \text{ out of } N \text{ data points} \in R) = \frac{N!}{K!(N-K)!} P^K (1-P)^{N-K},$$

where $P = \int_R p(x)\,dx$. If

1.  K is large enough (smoother not too small), so that $K \simeq NP$
2.  p(x) is approximately constant over $R$ (smoother small enough), so that $P \simeq p(x)V$, where $V$ is the volume of $R$

then

$$p(x) = \frac{K}{NV}.$$
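The estimate p(x) = K/(NV) can be checked by simulation. The sketch below is an illustration under stated assumptions: the data are drawn from a standard Gaussian, the region R is a one-dimensional interval around x, and the names `local_count_estimate` and `half_width` are hypothetical.

```python
import math
import random

def local_count_estimate(data, x, half_width):
    """Estimate p(x) as K / (N V) with R = [x - half_width, x + half_width]."""
    k = sum(1 for v in data if abs(v - x) <= half_width)  # K: points that fall in R
    volume = 2.0 * half_width                             # V: length of R in one dimension
    return k / (len(data) * volume)

rng = random.Random(0)                                    # fixed seed for repeatability
data = [rng.gauss(0.0, 1.0) for _ in range(100_000)]      # N samples from N(0, 1)
estimate = local_count_estimate(data, x=0.0, half_width=0.1)
true_density = 1.0 / math.sqrt(2.0 * math.pi)             # density of N(0, 1) at x = 0
print(estimate, true_density)
```

Shrinking `half_width` reduces the error from treating p(x) as constant over R but lowers K, while enlarging it does the opposite, which is the trade-off between the two conditions.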

Page 110: PRML 2.4-2.5: The Exponential Family & Nonparametric Methods

Today's topics

1. The exponential family
   1. What is the exponential family?
   2. Maximum likelihood for EF
   3. How to decide priors for EF

2. Nonparametric methods
   1. What is the point of nonparametric methods?
   2. Kernel density estimator
   3. Nearest-neighbour methods

Page 113: PRML 2.4-2.5: The Exponential Family & Nonparametric Methods

Kernel density estimators

Fix a region (e.g., a hypercube centred on x with side h) and count the data points in it with a kernel function k(u) (Parzen window).

$$k(u) = \begin{cases} 1, & |u_i| \le 1/2 \quad (i = 1, \dots, D), \\ 0, & \text{otherwise.} \end{cases}$$

k(u) itself describes a cube centred on the origin with side 1.

Page 114: PRML 2.4-2.5: The Exponential Family & Nonparametric Methods

Kernel density estimators

Fix a region (e.g., a hypercube centred on x with side h) and count the data points in it with a kernel function k(u) (Parzen window).

$$k(u) = \begin{cases} 1, & |u_i| \le 1/2 \quad (i = 1, \dots, D), \\ 0, & \text{otherwise.} \end{cases}$$

This is a discontinuous kernel.

Page 115: PRML 2.4-2.5: The Exponential Family & Nonparametric Methods

Kernel density estimators

Fix a region (e.g., a hypercube centred on x with side h) and count the data points in it with a kernel function k(u) (Parzen window).

$$k(u) = \begin{cases} 1, & |u_i| \le 1/2 \quad (i = 1, \dots, D), \\ 0, & \text{otherwise.} \end{cases}$$

With this kernel,

$$K = \sum_{n=1}^{N} k\left(\frac{x - x_n}{h}\right), \qquad V = h^D,$$

$$\implies p(x) = \frac{1}{N} \sum_{n=1}^{N} \frac{1}{h^D}\, k\left(\frac{x - x_n}{h}\right).$$
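The Parzen-window estimate above can be sketched directly from the two formulas. This is a minimal illustration; the function names `cube_kernel` and `parzen_density` and the two-dimensional sample are assumptions made for the example.

```python
def cube_kernel(u):
    """k(u) = 1 if every |u_i| <= 1/2, else 0 (unit cube centred on the origin)."""
    return 1.0 if all(abs(ui) <= 0.5 for ui in u) else 0.0

def parzen_density(data, x, h):
    """p(x) = (1/N) * sum_n (1/h^D) * k((x - x_n) / h)."""
    d = len(x)
    # K: number of data points inside the cube of side h centred on x
    k_count = sum(
        cube_kernel([(xi - xni) / h for xi, xni in zip(x, xn)]) for xn in data
    )
    return k_count / (len(data) * h ** d)          # K / (N V) with V = h^D

data = [(0.1, 0.2), (0.15, 0.25), (0.8, 0.9), (0.4, 0.1)]  # hypothetical 2-D points
p = parzen_density(data, x=(0.1, 0.2), h=0.2)               # two points lie in the cube
print(p)
```

Because the kernel is an indicator function, the estimate jumps whenever a data point enters or leaves the cube as x moves.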

Page 117: PRML 2.4-2.5: The Exponential Family & Nonparametric Methods

Kernel density estimators

The symmetry of k(u) lets us re-interpret the result.

Original view: count how many of the N data points fall in the single cube centred on x.

Equivalent view: place N cubes, one centred on each data point $x_n$, and count how many of them contain x.

Page 119: PRML 2.4-2.5: The Exponential Family & Nonparametric Methods

Kernel density estimators

Other choice of k(u): Gaussian

$$k(u) = \frac{1}{(2\pi)^{D/2}} \exp\left\{ -\frac{\|u\|^2}{2} \right\}.$$

This kernel gives a continuous density.

Page 120: PRML 2.4-2.5: The Exponential Family & Nonparametric Methods

Kernel density estimators

Other choice of k(u): Gaussian

$$k(u) = \frac{1}{(2\pi)^{D/2}} \exp\left\{ -\frac{\|u\|^2}{2} \right\}.$$

You can use any kernel as long as it satisfies

$$k(u) \ge 0, \qquad \int k(u)\,du = 1.$$

Page 121: PRML 2.4-2.5: The Exponential Family & Nonparametric Methods

Kernel density estimators

Example
Again, we can see that the smoothing parameter h controls the outcome of the estimation.

[Figure: Gaussian kernel density estimates of the same data set for h = 0.005, h = 0.07 and h = 0.2; horizontal axis from 0 to 1, vertical axis from 0 to 5]
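The effect of h can be reproduced with a minimal sketch of the Gaussian-kernel estimator in one dimension. The sample below is hypothetical (it is not the data behind the figure), and the function name `gaussian_kde` is an assumption for this example.

```python
import math

def gaussian_kde(data, x, h):
    """1-D estimate p(x) = (1/N) * sum_n N(x | x_n, h^2) with a Gaussian kernel."""
    norm = 1.0 / math.sqrt(2.0 * math.pi * h * h)
    return sum(norm * math.exp(-(x - xn) ** 2 / (2.0 * h * h)) for xn in data) / len(data)

data = [0.2, 0.25, 0.3, 0.6, 0.65]             # hypothetical sample
step = 0.001
grid = [-1.0 + i * step for i in range(3001)]  # evaluation points covering [-1, 2]
for h in (0.005, 0.07, 0.2):
    density = [gaussian_kde(data, x, h) for x in grid]
    area = sum(density) * step                 # Riemann sum of the estimate
    peak = max(density)                        # small h gives tall, narrow spikes
    print(f"h={h}: area={area:.3f}, peak={peak:.2f}")
```

The area stays close to one for every h because the kernel is itself a normalized density, while the peak height shows how strongly a small h concentrates mass on individual data points.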

Page 122: PRML 2.4-2.5: The Exponential Family & Nonparametric Methods

Today's topics

1. The exponential family
   1. What is the exponential family?
   2. Maximum likelihood for EF
   3. How to decide priors for EF

2. Nonparametric methods
   1. What is the point of nonparametric methods?
   2. Kernel density estimator
   3. Nearest-neighbour methods

Page 125: PRML 2.4-2.5: The Exponential Family & Nonparametric Methods

Nearest-neighbour methods

As the region, use a sphere that is centred on x and contains exactly K data points (K is fixed):

$$p(x) = \frac{K}{N V(x)},$$

where V(x) denotes the volume of the sphere.
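A minimal one-dimensional sketch of this estimator follows. In one dimension the "sphere" is an interval, so V(x) is twice the distance to the K-th nearest neighbour; the function name `knn_density` and the sample values are hypothetical.

```python
def knn_density(data, x, k):
    """1-D K-nearest-neighbour estimate p(x) = K / (N * V(x))."""
    distances = sorted(abs(x - xn) for xn in data)
    radius = distances[k - 1]        # distance from x to its K-th nearest neighbour
    volume = 2.0 * radius            # the 1-D "sphere" is the interval [x - r, x + r]
    # A zero radius (x exactly on a data point with K = 1) raises ZeroDivisionError,
    # which mirrors the divergence of the estimate at that point.
    return k / (len(data) * volume)

data = [0.1, 0.2, 0.4, 0.7, 0.9]     # hypothetical sample
p = knn_density(data, x=0.3, k=2)    # the two nearest points are both 0.1 away
print(p)
```

Unlike the kernel estimator, the region adapts to the data: it is small where points are dense and large where they are sparse.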

Page 126: PRML 2.4-2.5: The Exponential Family & Nonparametric Methods

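Nearest-neighbour methods

Note that this density cannot be normalized. For x beyond a point $x^*$ that is far away from all data points, the radius $r(x)$ of the sphere grows linearly with x, so the density decays only as $1/r(x)$ and the integral diverges. In one dimension,

$$\int_{-\infty}^{\infty} \frac{dx}{r(x)} \ge \int_{x^*}^{\infty} \frac{dx}{r(x)} = \int_{x^*}^{\infty} \frac{dx}{x - x^\dagger} = \infty,$$

where $x^\dagger$ is the data point that determines the radius for $x \ge x^*$. In D dimensions,

$$\int_{\mathbb{R}^D} \frac{K}{N V(x)}\,dx \propto \int_{\mathbb{R}^D} \frac{dx}{r(x)^D} = \infty.$$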

Page 128: PRML 2.4-2.5: The Exponential Family & Nonparametric Methods

Nearest-neighbour estimators

Example
Here again, the smoothing parameter K controls the outcome of the estimation.

[Figure: K-nearest-neighbour density estimates of the same data set for K = 1, K = 5 and K = 30; horizontal axis from 0 to 1, vertical axis from 0 to 5]

Furthermore, in the K = 1 case we can observe that $p(x) \to \infty$ at each data point, where the radius of the sphere shrinks to zero.

Page 129: PRML 2.4-2.5: The Exponential Family & Nonparametric Methods

Nonparametric methods

Another problem of kernels and NNs
These methods need all the observed data for estimation, so both the time and the space cost of evaluating the density grow as O(N). This is very inefficient for large data sets. On that point, parametric methods are quite efficient (cf. sufficient statistics). Histograms are also efficient, because the data can be discarded once the bin counts have been computed.

Page 130: PRML 2.4-2.5: The Exponential Family & Nonparametric Methods

Nonparametric methods

|                | Histograms | Kernels    | NNs       |
|----------------|------------|------------|-----------|
| K              | Not fixed  | Not fixed  | Fixed     |
| V              | Not fixed  | Fixed      | Not fixed |
| Smoother       | Δ          | h          | K         |
| Continuity     | No         | It depends | Yes*      |
| Dimensionality | Suffer     | Scalable   | Scalable  |
| Normalization  | Proper     | Proper     | Improper  |
| Data set       | Discard    | Keep       | Keep      |

* If K=1, not continuous

Page 136: PRML 2.4-2.5: The Exponential Family & Nonparametric Methods

Nearest-neighbour methods

Use NNs as a classifier
To do this, use the sphere centred on x that contains K points irrespective of their class. Then

$$p(x|C_k) = \frac{K_k}{N_k V}, \qquad p(x) = \frac{K}{NV},$$

where $K_k$ is the number of points in the sphere that belong to class $C_k$, and $N_k$ is the total number of points in class $C_k$. The class priors are $p(C_k) = N_k/N$, so

$$p(C_k|x) = \frac{p(x|C_k)\,p(C_k)}{p(x)} = \frac{K_k}{K}.$$

Page 138: PRML 2.4-2.5: The Exponential Family & Nonparametric Methods

Nearest-neighbour methods

Use NNs as a classifier
Therefore, x is assigned to the class that has the most representatives among its K nearest neighbours. If K = 1, this is called the “nearest-neighbour rule”.
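The classification rule can be sketched as follows. This is a minimal illustration using squared Euclidean distance; the function name `knn_classify`, the sample points and the labels are hypothetical, and ties are broken by whichever class is encountered first among the neighbours, which is an arbitrary choice of this sketch.

```python
from collections import Counter

def knn_classify(points, labels, x, k):
    """Return the majority class among the K nearest neighbours of x and the posteriors K_k / K."""
    # Indices of the training points sorted by squared Euclidean distance to x
    order = sorted(
        range(len(points)),
        key=lambda n: sum((a - b) ** 2 for a, b in zip(points[n], x)),
    )
    counts = Counter(labels[n] for n in order[:k])        # K_k for each class in the sphere
    posteriors = {c: kk / k for c, kk in counts.items()}  # p(C_k | x) = K_k / K
    return counts.most_common(1)[0][0], posteriors

points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (1.0, 1.0), (0.9, 1.1)]  # hypothetical data
labels = ["a", "a", "a", "b", "b"]
predicted, posteriors = knn_classify(points, labels, x=(0.3, 0.3), k=3)
print(predicted, posteriors)
```

With K equal to the size of the data set, the posteriors reduce to the class priors N_k/N, so every query point would receive the same label.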

Page 139: PRML 2.4-2.5: The Exponential Family & Nonparametric Methods

Nearest-neighbour methods

Use NNs as a classifier – Example
As in the discussion so far, K acts here as the smoothing parameter.

[Figure: K-nearest-neighbour classification of a data set plotted on axes x6 (horizontal) and x7 (vertical), both from 0 to 2, for K = 1, K = 3 and K = 31]

Page 140: PRML 2.4-2.5: The Exponential Family & Nonparametric Methods

Today's topics

1. The exponential family
   1. What is the exponential family?
   2. Maximum likelihood for EF
   3. How to decide priors for EF

2. Nonparametric methods
   1. What is the point of nonparametric methods?
   2. Kernel density estimator
   3. Nearest-neighbour methods
