Kernels for Dynamic Textures - Purdue Universityvishy/talks/Dynamic.pdf · 2009. 8. 22. · Dynamic...

S.V.N. Vishwanathan: Kernels for Dynamic Textures, Page 1 Kernels for Dynamic Textures S.V.N. Vishwanathan [email protected] National ICT Australia and Australian National University Joint work with Alex Smola and René Vidal


Kernels for Dynamic Textures

S.V.N. Vishwanathan

[email protected]

web.anu.edu.au/~vishy

National ICT Australia and Australian National University

Joint work with Alex Smola and René Vidal

Roadmap

Introduction to Kernel Methods

Why kernels?

Kernels on Dynamical Systems

Trajectories, Noise Models

Computation

Dynamical Textures

ARMA Models

Approximate Solutions

Kernel Computation

Experiments

Outlook and Conclusion

Classification

Data:

Pairs of observations $(x_i, y_i)$

Underlying distribution $P(x, y)$

Examples: (blood status, cancer), (transactions, fraud)

Task:

Find a function $f(x)$ which predicts $y$ given $x$

The function $f(x)$ must generalize well

Optimal Separating Hyperplane

Minimize $\frac{1}{2}\|w\|^2$ subject to $y_i(\langle w, x_i \rangle + b) \ge 1$ for all $i$

Kernels and Nonlinearity

Problem: Linear functions are often too simple to provide good estimators

Idea 1: Map to a higher dimensional feature space via $\Phi : x \to \Phi(x)$ and solve the problem there. Replace every $\langle x, x' \rangle$ by $\langle \Phi(x), \Phi(x') \rangle$

Idea 2: Instead of computing $\Phi(x)$ explicitly use a kernel function $k(x, x') := \langle \Phi(x), \Phi(x') \rangle$. A large class of functions are admissible as kernels

Non-vectorial data can be handled if we can compute meaningful $k(x, x')$
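As a small illustration of Idea 2, the sketch below checks numerically that a degree-2 polynomial kernel on R^2 equals the dot product of an explicit feature map. The kernel choice and the names `phi` and `poly2_kernel` are assumptions made for this example; the slides do not single out a particular kernel.

```python
import numpy as np

def phi(x):
    """Explicit feature map for the kernel <x, x'>^2 on R^2 (illustrative choice)."""
    x1, x2 = x
    return np.array([x1 * x1, np.sqrt(2.0) * x1 * x2, x2 * x2])

def poly2_kernel(x, xp):
    """Kernel evaluation that never builds the feature vectors."""
    return float(np.dot(x, xp)) ** 2

x = np.array([1.0, 2.0])
xp = np.array([3.0, -1.0])

# Both routes give the same number; the kernel route works in the input space only.
print(poly2_kernel(x, xp), float(phi(x) @ phi(xp)))
```

The kernel route costs one dot product in the input space, whereas the explicit route grows with the dimension of the feature space.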


The Basic Idea

Key Observation:

Trajectories are easily observable

Similar trajectories ⇒ similar systems

Restrict attention to interesting cases

Average over noise models

Kernels Using Dynamical Systems:

Simulate system for both inputs

Similar time evolution ⇒ similar inputs

Kernels on Dynamical Systems:

Restrict to interesting initial conditions

Simulate both the systems

Similar time evolution ⇒ similar systems

Notation

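$\mathcal{X}$ - state space (Hilbert space)

$\mathcal{A}$ - time evolution operators

$\mathcal{T}$ - time of measurement

$\mu$ - nice probability measure on $\mathcal{T}$

Discounting Factors:

For some $\lambda > 0$

$$\mu(t) = \lambda^{-1} e^{-\lambda t} \quad \text{for } \mathcal{T} = \mathbb{R}_0^+$$

$$\mu(t) = \frac{e^{-\lambda t}}{1 - e^{-\lambda}} \quad \text{for } \mathcal{T} = \mathbb{N}_0$$

Time Evolution:

We study

$$x_A(t) := A(t)\,x \quad \text{for } A \in \mathcal{A}$$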

Trajectories and Kernels

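Comparing Trajectories:

Using the dot product on $\mathcal{X}$ we define a dot product on $\mathcal{X}^{\mathcal{T}}$

$$\langle \theta, \theta' \rangle := \mathbf{E}_{\mu}\left[\langle \theta(t), \theta'(t) \rangle\right] \quad \text{for } \theta, \theta' \in \mathcal{X}^{\mathcal{T}}$$

Extending to Dynamical Systems:

Identify a dynamical system with its trajectory and define

$$k((x, A), (\tilde{x}, \tilde{A})) := \mathbf{E}_{\mu}\left[\langle A(t)\,x, \tilde{A}(t)\,\tilde{x} \rangle\right]$$

Other Ideas:

A nicely decaying measure required for convergence

Modify the dot product in $\mathcal{X}$

Covariance matrices?

Rational kernels and transducers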
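The definition of the kernel between two dynamical systems can be tried out directly by simulating both trajectories and accumulating discounted dot products. This is a minimal sketch under several assumptions that are not on the slide: time is discrete, both systems are linear and noise-free with $A(t) = A^t$, the plain Euclidean dot product is used, the normalizing constant of $\mu$ is dropped, and the infinite sum is truncated at a hypothetical `horizon`.

```python
import numpy as np

def trajectory_kernel(x, A, x_other, A_other, lam, horizon=500):
    """Truncated sum over t of exp(-lam * t) * <A^t x, A_other^t x_other>."""
    u = np.asarray(x, dtype=float)
    v = np.asarray(x_other, dtype=float)
    total = 0.0
    for t in range(horizon):
        total += np.exp(-lam * t) * float(u @ v)
        u = A @ u              # advance the first system by one step
        v = A_other @ v        # advance the second system by one step
    return total

A = np.array([[0.5, 0.1], [0.0, 0.4]])
A_other = np.array([[0.45, 0.0], [0.1, 0.5]])
x = np.array([1.0, 0.0])
print(trajectory_kernel(x, A, x, A_other, lam=0.1))
```

The loop only converges when the discount decays faster than the trajectories grow, which is the role of the "nicely decaying measure" listed above.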

Special Cases

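Kernels on Dynamical Systems:

Restrict attention to $x = \tilde{x}$

Compare trajectory for identical initial conditions

Take expectation if interested in a range of $x$

$$k(A, \tilde{A}) := \mathbf{E}_{x}\left[k((x, A), (x, \tilde{A}))\right]$$

More generally

$$k(A, \tilde{A}) := \mathbf{E}_{A}\,\mathbf{E}_{\tilde{A}}\,\mathbf{E}_{x}\left[k((x, A), (x, \tilde{A}))\right]$$

Kernels Using Dynamical Systems:

Restrict attention to a particular dynamical system

As before we can take expectations over $A$

$$k(x, \tilde{x}) := \mathbf{E}_{x}\,\mathbf{E}_{\tilde{x}}\,\mathbf{E}_{A}\left[k((x, A), (\tilde{x}, A))\right]$$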

Discrete Linear Systems

Linear Systems:

We assume time propagation occurs as

$$x_A(t + 1) = A\,x_A(t) + a_t + \xi_t$$

In closed form

$$x_A(t) = A^t x_0 + \sum_{i=0}^{t} A^{t-i}\xi_i + A^{t-i} a_t$$

To avoid messy math assume $a_t = 0$ and hence

$$x_A(t) = A^t x_0 + \sum_{i=0}^{t} A^{t-i}\xi_i$$

Contribution to kernel due to $A$ as well as noise

Continuous Linear Systems

Linear Systems:

System dynamics here are described by

$$\frac{d}{dt} x_A(t) = A\,x_A(t) + a(t) + \xi(t)$$

Here $\xi(t)$ with $\mathbf{E}[\xi(t)] = 0$ is a stochastic process and

$$x_A(t) = \exp(At)\,x_0 + \int_0^t \exp(A(t - \tau))\,(a(\tau) + \xi(\tau))\,d\tau$$

As before we assume $a(t) = 0$

We even assume $\xi(\tau) = 0$ (avoids messy math again!)

$$x_A(t) = \exp(At)\,x_0$$

Kernel contribution only due to $A$

Convergence Criterion

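Discrete Case:

Let $A$, $B$ and $W$ be linear operators

The matrix norms obey $0 \le \|A\|, \|B\| \le \Lambda$

For suitable $\lambda$ with $e^{\lambda} > \Lambda^2$ and $W \succeq 0$

$$M := \sum_{t=0}^{\infty} e^{-\lambda t} A^t W B^t$$

Sylvester equation $e^{-\lambda} A M B + W = M$

Continuous Case:

We define

$$M := \int_0^{\infty} e^{-\lambda t} \exp(At)^{\top} W \exp(Bt)\,dt$$

Sylvester equation $\left(A^{\top} + \tfrac{\lambda}{2}\mathbf{1}\right) M + M \left(B + \tfrac{\lambda}{2}\mathbf{1}\right) = -W$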
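For the discrete case, the claim that the discounted series solves the linear equation $e^{-\lambda} A M B + W = M$ can be checked numerically. The sketch below is one possible way to do that, not the solver used in the talk: it rewrites the equation with the identity vec(AMB) = (Bᵀ ⊗ A) vec(M) and calls a dense linear solve. The function names are hypothetical, and this Kronecker route costs far more than the cubic time a dedicated Sylvester solver would need, so it is only suitable for small matrices.

```python
import numpy as np

def solve_discounted_sylvester(A, B, W, lam):
    """Solve exp(-lam) * A @ M @ B + W = M for M (small dense problems only)."""
    n, m = W.shape
    # Column-major vectorization: vec(A M B) = kron(B^T, A) vec(M).
    K = np.kron(B.T, A)
    rhs = W.reshape(-1, order="F")
    vec_m = np.linalg.solve(np.eye(n * m) - np.exp(-lam) * K, rhs)
    return vec_m.reshape((n, m), order="F")

def truncated_series(A, B, W, lam, terms=200):
    """Partial sum of exp(-lam * t) * A^t W B^t for t = 0 .. terms - 1."""
    M = np.zeros_like(W, dtype=float)
    A_pow = np.eye(A.shape[0])
    B_pow = np.eye(B.shape[0])
    for t in range(terms):
        M += np.exp(-lam * t) * A_pow @ W @ B_pow
        A_pow = A_pow @ A
        B_pow = B_pow @ B
    return M

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3)); A /= 2.0 * np.linalg.norm(A, 2)
B = rng.standard_normal((4, 4)); B /= 2.0 * np.linalg.norm(B, 2)
W = rng.standard_normal((3, 4))
M = solve_discounted_sylvester(A, B, W, lam=0.5)
print(np.max(np.abs(M - truncated_series(A, B, W, lam=0.5))))
```

Scaling `A` and `B` to spectral norm 0.5 keeps the example well inside the convergence condition $e^{\lambda} > \Lambda^2$.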

Gory Details

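Contribution due to $A$:

$$p \sum_{t=0}^{\infty} e^{-\lambda t} \langle A^t x, \tilde{A}^t \tilde{x} \rangle := p \cdot x^{\top} \left[ \sum_{t=0}^{\infty} e^{-\lambda t} (A^t)^{\top} W \tilde{A}^t \right] \tilde{x} = p \cdot x^{\top} M \tilde{x}$$

Contribution due to noise:

$$p \sum_{t=0}^{\infty} \sum_{j,j'=0}^{t} e^{-\lambda t} \langle A^{t-j}\xi_j, \tilde{A}^{t-j'}\xi_{j'} \rangle = p\,\mathrm{tr}\left( C_{\xi} \left[ \sum_{t=0}^{\infty} e^{-\lambda t} (A^t)^{\top} M \tilde{A}^t \right] \right) := p\,\mathrm{tr}(C_{\xi} \tilde{M})$$

In the above equations $p$ is a normalizing term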

Delving Deeper

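More on $M$ and $\tilde{M}$:

The matrices $M$ and $\tilde{M}$ look like

$$M := \sum_{t=0}^{\infty} e^{-\lambda t} (A^t)^{\top} W \tilde{A}^t \quad \text{and} \quad \tilde{M} := \sum_{t=0}^{\infty} e^{-\lambda t} (A^t)^{\top} M \tilde{A}^t$$

Sylvester Equation:

Both $M$ and $\tilde{M}$ satisfy a Sylvester equation

$$e^{-\lambda} A^{\top} M \tilde{A} + W = M \quad \text{and} \quad e^{-\lambda} A^{\top} \tilde{M} \tilde{A} + M = \tilde{M}$$

Both can be solved in cubic time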

Discrete Kernel

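Discrete Case:

Putting it all together

$$k((A, x), (\tilde{A}, \tilde{x})) = p\left[ x^{\top} M \tilde{x} + \mathrm{tr}(C_{\xi} \tilde{M}) \right]$$

Note that $C_{\xi}$ is the covariance matrix of $\xi_t$

Can assume different noise models per time step

Initial Conditions:

Let $C$ be the covariance matrix of the initial conditions

If we set $x = \tilde{x}$ then

$$k((A, x), (\tilde{A}, x)) = p\left[ \mathrm{tr}(C M) + \mathrm{tr}(C_{\xi} \tilde{M}) \right]$$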
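The discrete kernel can be assembled in two solves: first the matrix for the deterministic part, then the matrix for the noise part, which uses the first as its right-hand side. The sketch below is an illustration under stated assumptions rather than the authors' implementation: the helper names are hypothetical, the normalizer `p` is left as a free parameter, and the equations are solved with a dense Kronecker system that is only practical for small state dimensions.

```python
import numpy as np

def solve_discounted(A, A_other, W, lam):
    """Solve exp(-lam) * A^T M A_other + W = M via column-major vectorization."""
    n, m = W.shape
    K = np.kron(A_other.T, A.T)   # vec(A^T M A_other) = kron(A_other^T, A^T) vec(M)
    vec_m = np.linalg.solve(np.eye(n * m) - np.exp(-lam) * K,
                            W.reshape(-1, order="F"))
    return vec_m.reshape((n, m), order="F")

def discrete_kernel(x, A, x_other, A_other, W, C_xi, lam, p=1.0):
    """p * [x^T M x_other + tr(C_xi M_noise)], following the slide's formula."""
    M = solve_discounted(A, A_other, W, lam)        # deterministic part
    M_noise = solve_discounted(A, A_other, M, lam)  # same equation with M as input
    return p * (float(x @ M @ x_other) + float(np.trace(C_xi @ M_noise)))

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 3)); A /= 2.0 * np.linalg.norm(A, 2)
A_other = rng.standard_normal((3, 3)); A_other /= 2.0 * np.linalg.norm(A_other, 2)
x, x_other = rng.standard_normal(3), rng.standard_normal(3)
print(discrete_kernel(x, A, x_other, A_other, np.eye(3), 0.1 * np.eye(3), lam=0.3))
```

Setting `C_xi` to zero switches off the noise term, which gives a convenient check against a directly simulated noise-free trajectory sum.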

Continuous Kernel

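Contribution due to $A$:

Since we assumed $a(t) = \xi(t) = 0$ we get

$$k((x, A), (\tilde{x}, \tilde{A})) = \lambda^{-1} \int_0^{\infty} e^{-\lambda t} \langle \exp(At)\,x, \exp(\tilde{A}t)\,\tilde{x} \rangle\,dt$$

The Final Form:

The kernel can be expressed as

$$k((x, A), (\tilde{x}, \tilde{A})) = \lambda^{-1} x^{\top} M \tilde{x}$$

where

$$\left(A^{\top} + \tfrac{\lambda}{2}\mathbf{1}\right) M + M \left(\tilde{A} + \tfrac{\lambda}{2}\mathbf{1}\right) = -W$$

Solution in cubic time by solving Sylvester equation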

Special Cases

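Snapshot:

If we consider only the snapshot at time instance $T$

$$k((x, A), (\tilde{x}, \tilde{A})) = \lambda^{-1}\, x \exp(AT)\, W \exp(\tilde{A}T)^{\top} \tilde{x}^{\top}$$

Initial Conditions:

Fix $A = \tilde{A}$

Now we just solve

$$M = -\frac{1}{2}\left(A + \frac{\lambda}{2}\mathbf{1}\right)^{-1} W$$

Dynamical Systems:

Fix $x = \tilde{x}$ to get $k(A, \tilde{A}) = \lambda^{-1}\,\mathrm{tr}(MC)$

Here $C$ is the covariance matrix of initial conditions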

Graph Kernels

Graph Laplacian:

Let $E$ be the adjacency matrix and $D := \mathrm{diag}(E\mathbf{1})$

$L := E - D$ and $\tilde{L} := D^{-1/2} L D^{-1/2}$

Diffusion Process:

We can define a diffusion process by

$$\frac{d}{dt} x(t) = L\,x(t)$$

Diffusion Kernel (Kondor and Lafferty, 2002):

If we measure overlap at time instance $T$ we get

$$K = \exp(LT)^{\top} \exp(LT)$$

$K_{ij}$ is the probability that the same state $l$ is reached from $i$ and from $j$
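The diffusion kernel is easy to evaluate for a small undirected graph. The following sketch assumes a symmetric adjacency matrix, so that the matrix exponential can be taken through an eigendecomposition; the function names and the 4-node path graph are choices made for this example only. It uses the slide's sign convention $L := E - D$.

```python
import numpy as np

def graph_laplacian(E):
    """L := E - D with D = diag(E 1), the sign convention used on the slide."""
    return E - np.diag(E.sum(axis=1))

def expm_symmetric(S):
    """Matrix exponential of a symmetric matrix via its eigendecomposition."""
    w, V = np.linalg.eigh(S)
    return (V * np.exp(w)) @ V.T

def diffusion_kernel(E, T):
    """K = exp(L T)^T exp(L T) for the graph with adjacency matrix E."""
    P = expm_symmetric(graph_laplacian(E) * T)
    return P.T @ P

# Path graph on four nodes: 0 - 1 - 2 - 3.
E = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
K = diffusion_kernel(E, T=0.5)
print(np.round(K, 3))
```

Because $L$ is symmetric here, the product collapses to a single exponential, $\exp(2LT)$, which can be confirmed with `expm_symmetric(2 * 0.5 * graph_laplacian(E))`.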

Graph Kernels

Undirected Graphs (Kondor and Lafferty, 2002):

Here $L$ is symmetric and hence yields

$$K = \exp(2LT)$$

Labeled Graphs (Gärtner, 2002):

If $W$ acts as an indicator for node labels

Say $W_{ij} = 1$ if two nodes have same label

For other fancy weights see (Kashima et al., 2003)

Averaged Graph Laplacian:

If we average over a range of $T$ values

$$K = \frac{1}{2}\left(L + \frac{\lambda}{2}\mathbf{1}\right)^{-1}$$


ARMA Models

ARMA Model:

An auto-regressive moving average model is

$$x(t + 1) = A\,x(t) + B\,v(t)$$

$$y(t) = \phi(x(t)) + w(t)$$

$x(t)$ is a hidden variable

$v(t)$ and $w(t)$ are IID random noise

Linear Gaussian Model:

If $\phi$ is linear and the noise is white Gaussian:

$$x(t + 1) = A\,x(t) + v(t), \quad v(t) \sim \mathcal{N}(0, Q)$$

$$y(t) = C\,x(t) + w(t), \quad w(t) \sim \mathcal{N}(0, R)$$

Fix scaling by demanding that $C^{\top} C = \mathbf{1}$
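To make the linear Gaussian model concrete, the sketch below draws a short sequence from it. Everything beyond the two model equations is an assumption for illustration: the dimensions, the noise levels, the helper names, and the use of a QR factorization to obtain a $C$ with orthonormal columns so that $C^{\top} C = \mathbf{1}$ holds.

```python
import numpy as np

def orthonormal_columns(m, n, rng):
    """Random m x n matrix with orthonormal columns, so that C^T C = I."""
    q, _ = np.linalg.qr(rng.standard_normal((m, n)))
    return q

def simulate_lds(A, C, Q, R, x0, tau, rng):
    """Sample x(t+1) = A x(t) + v(t), y(t) = C x(t) + w(t) for tau steps."""
    n, m = A.shape[0], C.shape[0]
    X = np.zeros((n, tau))
    Y = np.zeros((m, tau))
    x = np.asarray(x0, dtype=float)
    for t in range(tau):
        X[:, t] = x
        Y[:, t] = C @ x + rng.multivariate_normal(np.zeros(m), R)  # observation noise w(t)
        x = A @ x + rng.multivariate_normal(np.zeros(n), Q)        # state noise v(t)
    return X, Y

rng = np.random.default_rng(0)
n, m, tau = 2, 5, 10
A = 0.9 * np.eye(n)
C = orthonormal_columns(m, n, rng)
X, Y = simulate_lds(A, C, 0.01 * np.eye(n), 0.01 * np.eye(m), np.ones(n), tau, rng)
print(X.shape, Y.shape)
```

Each column of `Y` plays the role of one observed frame and each column of `X` the corresponding hidden state.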

Dynamic Textures

Image Model:

$y(t) \in \mathbb{R}^m$ are the observed noisy images

$x(t) \in \mathbb{R}^n$ ($n < m$) are hidden variables

Modeling:

A sequence of images $\{y(1), \ldots, y(\tau)\}$ is observed

Ideally we want to solve

$$A(\tau), C(\tau), Q(\tau), R(\tau) = \arg\max_{A, C, Q, R}\; p(y(1), \ldots, y(\tau))$$

Exact Solution:

n4sid in MATLAB solves above problem

Does not scale well if $m$ is large

Impractical for images where $m \sim 10^5$

Approximate Solution

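Problem To Solve:

For any variable $z(t)$ define $Z_i^{\tau} := [z(i), \ldots, z(\tau)]$

We are solving

$$Y_1^{\tau} = C X_1^{\tau} + W_1^{\tau} \quad \text{with } C^{\top} C = \mathbf{1}$$

Solving By SVD:

Solving for $\arg\min_{C, X_1^{\tau}} \|W\|$ yields

$$C(\tau) = U \quad \text{and} \quad X(\tau) = \Sigma V^{\top} \quad \text{where } Y_1^{\tau} = U \Sigma V^{\top}$$

Solving for $\arg\min_A \|X_2^{\tau} - A X_1^{\tau - 1}\|$ yields

$$A(\tau) = \Sigma V^{\top} D_1 V \left(V^{\top} D_2 V\right)^{-1} \Sigma^{-1}$$

Here

$$D_1 = \begin{bmatrix} 0 & 0 \\ \mathbf{1}_{\tau - 1} & 0 \end{bmatrix} \quad \text{and} \quad D_2 = \begin{bmatrix} \mathbf{1}_{\tau - 1} & 0 \\ 0 & 0 \end{bmatrix}$$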
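The two least-squares steps above can be sketched in a few lines. This is an illustration under assumptions, not the exact closed form on the slide: the SVD is truncated to a hypothetical state dimension `n`, and the transition matrix is obtained with a pseudo-inverse of the shifted state sequence, which targets the same objective $\|X_2^{\tau} - A X_1^{\tau-1}\|$ without forming $D_1$ and $D_2$.

```python
import numpy as np

def fit_dynamic_texture(Y, n):
    """Estimate (A, C, X) from frames Y (one column per frame), state dimension n."""
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    C = U[:, :n]                       # orthonormal columns, so C^T C = I
    X = np.diag(s[:n]) @ Vt[:n, :]     # hidden states, X = Sigma V^T
    X_past, X_next = X[:, :-1], X[:, 1:]
    A = X_next @ np.linalg.pinv(X_past)  # least-squares fit of X_next ~ A X_past
    return A, C, X

# Noise-free toy sequence generated from a known system.
rng = np.random.default_rng(0)
A_true = np.array([[0.9, 0.1], [0.0, 0.7]])
C_true, _ = np.linalg.qr(rng.standard_normal((6, 2)))
x, frames = np.array([1.0, 1.0]), []
for _ in range(25):
    frames.append(C_true @ x)
    x = A_true @ x
A_hat, C_hat, X_hat = fit_dynamic_texture(np.stack(frames, axis=1), n=2)
print(np.sort(np.linalg.eigvals(A_hat).real))
```

The estimated `A_hat` is only determined up to a change of basis of the hidden state, so comparing eigenvalues with those of `A_true` is more meaningful than comparing entries.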

Dynamic Texture Kernel

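Kernel Definition:

Estimate model and compute kernels between models

If we average out the noise then for some $W \succeq 0$

$$k((x_0, A, C), (x_0', A', C')) := \mathbf{E}_{v,w}\left[ \sum_{t=1}^{\infty} e^{-\lambda t}\, y_t^{\top} W y_t' \right]$$

Kernel Computation:

The kernel can be computed as

$$k = x_0^{\top} M x_0' + \left(e^{\lambda} - 1\right)^{-1} \mathrm{tr}\left[ Q \tilde{M} + W R \right]$$

The matrices $M$ and $\tilde{M}$ satisfy

$$M = e^{-\lambda} A^{\top} C^{\top} W C' A' + e^{-\lambda} A^{\top} M A'$$

$$\tilde{M} = C^{\top} W C' + e^{-\lambda} A^{\top} \tilde{M} A'$$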
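The first of the two matrix equations can also be solved by simply iterating it, which is easy to follow even if it is not the cubic-time route. The sketch below covers only the noise-free part $x_0^{\top} M x_0'$ of the kernel; the trace term is left out. The function names, tolerance and iteration cap are assumptions, and the iteration converges only when $e^{-\lambda}\|A\|\|A'\| < 1$.

```python
import numpy as np

def solve_fixed_point(S, A, A_prime, lam, tol=1e-12, max_iter=10_000):
    """Iterate M <- S + exp(-lam) * A^T M A' until successive iterates agree."""
    M = S.copy()
    for _ in range(max_iter):
        M_next = S + np.exp(-lam) * A.T @ M @ A_prime
        if np.max(np.abs(M_next - M)) < tol:
            return M_next
        M = M_next
    raise RuntimeError("fixed-point iteration did not converge")

def texture_kernel_noise_free(x0, A, C, x0_prime, A_prime, C_prime, W, lam):
    """Deterministic part x0^T M x0' of the dynamic texture kernel."""
    S = np.exp(-lam) * A.T @ C.T @ W @ C_prime @ A_prime
    M = solve_fixed_point(S, A, A_prime, lam)
    return float(x0 @ M @ x0_prime)

rng = np.random.default_rng(2)
n, m = 2, 4
A = rng.standard_normal((n, n)); A *= 0.9 / np.linalg.norm(A, 2)
A_prime = rng.standard_normal((n, n)); A_prime *= 0.9 / np.linalg.norm(A_prime, 2)
C, _ = np.linalg.qr(rng.standard_normal((m, n)))
C_prime, _ = np.linalg.qr(rng.standard_normal((m, n)))
x0, x0_prime = rng.standard_normal(n), rng.standard_normal(n)
print(texture_kernel_noise_free(x0, A, C, x0_prime, A_prime, C_prime, np.eye(m), lam=0.1))
```

The same routine solves the second equation if it is called with $S = C^{\top} W C'$ instead.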

Experimental Setup

Typical Textures:

Some sample textures

A long clip was cut into shorter clips of 120 frames each

Freak Textures:

We also collected some freak textures

Results

Kernel Induced Metric:

Clips closer on an axis are from the same master clip

We plot the kernel induced metric for $\lambda = 0.9$ and $0.1$

Results fairly independent of the choice of $\lambda$

Notice the block diagonal structure of the metric matrix
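A kernel induces a distance through $d(a, b)^2 = k(a, a) - 2k(a, b) + k(b, b)$, which is the standard construction and presumably what the plotted metric matrix shows; the slides do not spell out the formula, so treat that reading as an assumption. The sketch below turns any Gram matrix into the corresponding distance matrix, with a hypothetical function name.

```python
import numpy as np

def kernel_induced_metric(K):
    """Distance matrix with entries sqrt(K_ii - 2 K_ij + K_jj)."""
    d = np.diag(K)
    squared = d[:, None] - 2.0 * K + d[None, :]
    # Clip tiny negative values caused by floating-point rounding.
    return np.sqrt(np.maximum(squared, 0.0))

# With a linear kernel the induced metric is the Euclidean distance.
points = np.array([[0.0, 0.0], [3.0, 4.0], [3.0, 0.0]])
K = points @ points.T
print(kernel_induced_metric(K))
```

Clips that come from the same master clip should then show up as blocks of small distances along the diagonal when the clips are ordered by source.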


Conclusion

A new method to embed dynamical systems

Analytical solutions for linear systems

Many graph kernels are special cases

Analytical solutions require cubic time

Are better solutions possible for special cases?

Extensions to nonlinear systems?

Application to dynamical textures

Works with approximate model parameters

Picks out clips from the same master clip

Close relations to rational kernels of Cortes et al.

More information at http://mlg.anu.edu.au/~vishy


Questions?