Marcel Lüthi - unibas.chshapemodelling.cs.unibas.ch/pmm2018/slides/odds-and-ends.pdf•Younes,...

36
> DEPARTMENT OF MATHEMATICS AND COMPUTER SCIENCE More on kernels Marcel Lüthi Graphics and Vision Research Group Department of Mathematics and Computer Science University of Basel

Transcript of Marcel Lüthi - unibas.chshapemodelling.cs.unibas.ch/pmm2018/slides/odds-and-ends.pdf•Younes,...

Page 1: Marcel Lüthi - unibas.chshapemodelling.cs.unibas.ch/pmm2018/slides/odds-and-ends.pdf•Younes, Laurent: Shapes and diffeomorphisms, Springer 2010 3 > DEPARTMENT OF MATHEMATICS AND

> DEPARTMENT OF MATHEMATICS AND COMPUTER SCIENCE

More on kernels

Marcel Lüthi

Graphics and Vision Research GroupDepartment of Mathematics and Computer Science

University of Basel

Page 2: Marcel Lüthi - unibas.chshapemodelling.cs.unibas.ch/pmm2018/slides/odds-and-ends.pdf•Younes, Laurent: Shapes and diffeomorphisms, Springer 2010 3 > DEPARTMENT OF MATHEMATICS AND

> DEPARTMENT OF MATHEMATICS AND COMPUTER SCIENCE

Kernels everywhere

Integral and differential equations• Aronszajn, Nachman. "Theory of reproducing kernels." Transactions of the American mathematical

society (1950): 337-404.

Numerical analysis, Approximation and Interpolation theory• Wahba, Grace. Spline models for observational data. Vol. 59. Siam, 1990.• Schaback, Robert, and Holger Wendland. "Kernel techniques: From machine learning to meshless

methods." Acta Numerica 15 (2006): 543-639.• Hennig, Philipp, and Osborn, Michael: Probabilistic numerics

• Geostatistics (Gaussian processes)• Stein, Michael L. Interpolation of spatial data: some theory for kriging. Springer Science & Business

Media, 1999.

2

Page 3: Marcel Lüthi - unibas.chshapemodelling.cs.unibas.ch/pmm2018/slides/odds-and-ends.pdf•Younes, Laurent: Shapes and diffeomorphisms, Springer 2010 3 > DEPARTMENT OF MATHEMATICS AND

> DEPARTMENT OF MATHEMATICS AND COMPUTER SCIENCE

Kernels everywhere

• Learning Theory / Machine learning• Vapnik, Vladimir. Statistical learning theory. Vol. 1. New York: Wiley, 1998.

• Hofmann, Thomas, Bernhard Schölkopf, and Alexander J. Smola. "Kernel methods in machine learning." The annals of statistics (2008): 1171-1220.

• Shape modelling / Image analysis• Grenander, Ulf, and Michael I. Miller. "Computational anatomy: An emerging discipline."

Quarterly of applied mathematics 56.4 (1998): 617-694.

• Younes, Laurent: Shapes and diffeomorphisms, Springer 2010

3

Page 4: Marcel Lüthi - unibas.chshapemodelling.cs.unibas.ch/pmm2018/slides/odds-and-ends.pdf•Younes, Laurent: Shapes and diffeomorphisms, Springer 2010 3 > DEPARTMENT OF MATHEMATICS AND

> DEPARTMENT OF MATHEMATICS AND COMPUTER SCIENCE

What do they have in common?

• Solution space has a rich structure to be able to:• Predict unseen values

• Deal with noisy or incomplete data

• Capture a pattern

• Kernels ideally suited to define such structure• The resulting space of functions is

mathematically “nice”.

4

ML

StatisticsImage analysis

NumericsDifferential equations

Page 5: Marcel Lüthi - unibas.chshapemodelling.cs.unibas.ch/pmm2018/slides/odds-and-ends.pdf•Younes, Laurent: Shapes and diffeomorphisms, Springer 2010 3 > DEPARTMENT OF MATHEMATICS AND

> DEPARTMENT OF MATHEMATICS AND COMPUTER SCIENCE

Back to basics: Scalar-valued GPs

Vector-valued (this course)

• Samples u are deformation fields:

𝑢:𝒳 → ℝ𝑑

Scalar-valued (more common)

• Samples f are real-valued functions

𝑓 ∶ 𝒳 → ℝ

Page 6: Marcel Lüthi - unibas.chshapemodelling.cs.unibas.ch/pmm2018/slides/odds-and-ends.pdf•Younes, Laurent: Shapes and diffeomorphisms, Springer 2010 3 > DEPARTMENT OF MATHEMATICS AND

> DEPARTMENT OF MATHEMATICS AND COMPUTER SCIENCE

Scalar-valued Gaussian processes

Vector-valued (this course)

𝑢 ∼ 𝐺𝑃 Ԧ𝜇, 𝒌Ԧ𝜇:𝒳 → ℝ𝑑

𝒌:𝒳 ×𝒳 → ℝ𝑑×𝑑

Scalar-valued (more common)

𝑓 ∼ 𝐺𝑃 𝜇, 𝑘𝜇:𝒳 → ℝ𝑘:𝒳 × 𝒳 → ℝ

Page 7: Marcel Lüthi - unibas.chshapemodelling.cs.unibas.ch/pmm2018/slides/odds-and-ends.pdf•Younes, Laurent: Shapes and diffeomorphisms, Springer 2010 3 > DEPARTMENT OF MATHEMATICS AND

> DEPARTMENT OF MATHEMATICS AND COMPUTER SCIENCE

A connection

Matrix-valued kernels can be reinterpreted as scalar-valued kernels:

Matrix valued kernel: 𝒌:𝒳 ×𝒳 → ℝ𝒅×𝒅

Scalar valued kernel: 𝑘:𝒳 × 1. . 𝑑 × 𝒳 × 1. . 𝑑 → ℝ

Bijection: Define𝑘( 𝑥, 𝑖 , 𝑥′, 𝑗 = 𝒌 𝑥′, 𝑥′ 𝑖,𝑗

Page 8: Marcel Lüthi - unibas.chshapemodelling.cs.unibas.ch/pmm2018/slides/odds-and-ends.pdf•Younes, Laurent: Shapes and diffeomorphisms, Springer 2010 3 > DEPARTMENT OF MATHEMATICS AND

> DEPARTMENT OF MATHEMATICS AND COMPUTER SCIENCE

Vector/scalar valued kernel matrices

8

𝑲 =

𝑘11 𝑥1, 𝑥1 𝑘12 𝑥1, 𝑥1𝑘21 𝑥1, 𝑥1 𝑘22 𝑥1, 𝑥1

…𝑘11 𝑥1, 𝑥𝑛 𝑘12 𝑥1, 𝑥𝑛𝑘21 𝑥1, 𝑥𝑛 𝑘22 𝑥1, 𝑥𝑛

⋮ ⋮𝑘11 𝑥𝑛, 𝑥1 𝑘12 𝑥𝑛, 𝑥1𝑘21 𝑥𝑛, 𝑥1 𝑘22 𝑥𝑛, 𝑥1

…𝑘11 𝑥𝑛, 𝑥𝑛 𝑘12 𝑥𝑛, 𝑥𝑛𝑘21 𝑥𝑛, 𝑥𝑛 𝑘22 𝑥𝑛, 𝑥𝑛

𝐾 =

𝑘 (𝑥1, 1), (𝑥1, 1) 𝑘 (𝑥1, 1), (𝑥1, 2)

𝑘 𝑥1, 2 , (𝑥1, 1) 𝑘 𝑥1, 2 , (𝑥1, 2)…

𝑘 (𝑥1, 1), (𝑥𝑛, 1) 𝑘 (𝑥1, 1), (𝑥𝑛, 2)

𝑘 𝑥1, 2 , (𝑥𝑛, 1) 𝑘 𝑥1, 2 , (𝑥𝑛, 2)

⋮ ⋮𝑘 (𝑥𝑛, 1), (𝑥1, 1) 𝑘 (𝑥𝑛, 1), (𝑥1, 2)

𝑘 𝑥𝑛, 2 , (𝑥1, 1) 𝑘 𝑥𝑛, 2 , (𝑥1, 2)…

𝑘 (𝑥𝑛, 1), (𝑥𝑛, 1) 𝑘 (𝑥𝑛, 1), (𝑥𝑛, 2)

𝑘 𝑥𝑛, 2 , (𝑥𝑛, 1) 𝑘 𝑥𝑛, 2 , (𝑥𝑛, 2)

Page 9: Marcel Lüthi - unibas.chshapemodelling.cs.unibas.ch/pmm2018/slides/odds-and-ends.pdf•Younes, Laurent: Shapes and diffeomorphisms, Springer 2010 3 > DEPARTMENT OF MATHEMATICS AND

> DEPARTMENT OF MATHEMATICS AND COMPUTER SCIENCE

A connection

Matrix-valued kernels can be reinterpreted as scalar-valued kernels:

Matrix valued kernel: 𝒌:𝒳 ×𝒳 → ℝ𝒅×𝒅

Scalar valued kernel: 𝑘:𝒳 × 1. . 𝑑 × 𝒳 × 1. . 𝑑 → ℝ

Bijection: Define𝑘( 𝑥, 𝑖 , 𝑥′, 𝑗 = 𝒌 𝑥′, 𝑥′ 𝑖,𝑗

All the theory developed for the scalar-valued GPs holds also for vector-valued GPs!

Page 10: Marcel Lüthi - unibas.chshapemodelling.cs.unibas.ch/pmm2018/slides/odds-and-ends.pdf•Younes, Laurent: Shapes and diffeomorphisms, Springer 2010 3 > DEPARTMENT OF MATHEMATICS AND

> DEPARTMENT OF MATHEMATICS AND COMPUTER SCIENCE

The sampling space

Page 11: Marcel Lüthi - unibas.chshapemodelling.cs.unibas.ch/pmm2018/slides/odds-and-ends.pdf•Younes, Laurent: Shapes and diffeomorphisms, Springer 2010 3 > DEPARTMENT OF MATHEMATICS AND

> DEPARTMENT OF MATHEMATICS AND COMPUTER SCIENCE

The space of samples

Sampling from 𝐺𝑃 𝜇, 𝑘 is done using the corresponding normal distribution 𝑁( Ԧ𝜇, K)

Algorithm (slightly inefficient)

1. Do an SVD: K = 𝑈𝐷2𝑈𝑇

2. Draw a normal vector 𝛼 ∼ 𝑁 0, 𝐼𝑛×𝑛3. Compute Ԧ𝜇 + 𝑈𝐷𝛼

11

Page 12: Marcel Lüthi - unibas.chshapemodelling.cs.unibas.ch/pmm2018/slides/odds-and-ends.pdf•Younes, Laurent: Shapes and diffeomorphisms, Springer 2010 3 > DEPARTMENT OF MATHEMATICS AND

> DEPARTMENT OF MATHEMATICS AND COMPUTER SCIENCE

The space of samples

• From K = 𝑈𝐷2𝑈𝑇(using that 𝑈𝑇𝑈 = 𝐼) we have thatK𝑈𝐷−1 = 𝑈𝐷

• A sample 𝑠 = Ԧ𝜇 + 𝑈𝐷𝛼 = Ԧ𝜇 + K𝑈𝐷−1𝛼

corresponds to linear combinations of the columns of K.

12

• K is symmetric → rows/columns can be used interchangeably

Page 13: Marcel Lüthi - unibas.chshapemodelling.cs.unibas.ch/pmm2018/slides/odds-and-ends.pdf•Younes, Laurent: Shapes and diffeomorphisms, Springer 2010 3 > DEPARTMENT OF MATHEMATICS AND

> DEPARTMENT OF MATHEMATICS AND COMPUTER SCIENCE

Example: Squared exponential

13

σ = 1

𝑘 𝑥, 𝑥′ = exp −𝑥 − 𝑥′ 2

𝜎2

Page 14: Marcel Lüthi - unibas.chshapemodelling.cs.unibas.ch/pmm2018/slides/odds-and-ends.pdf•Younes, Laurent: Shapes and diffeomorphisms, Springer 2010 3 > DEPARTMENT OF MATHEMATICS AND

> DEPARTMENT OF MATHEMATICS AND COMPUTER SCIENCE

Example: Squared exponential

14

𝑘 𝑥, 𝑥′ = exp −𝑥 − 𝑥′ 2

𝜎2

σ = 3

Page 15: Marcel Lüthi - unibas.chshapemodelling.cs.unibas.ch/pmm2018/slides/odds-and-ends.pdf•Younes, Laurent: Shapes and diffeomorphisms, Springer 2010 3 > DEPARTMENT OF MATHEMATICS AND

> DEPARTMENT OF MATHEMATICS AND COMPUTER SCIENCE

Multi-scale signals

• k x, x′ = exp − 𝑥 −𝑥′

1

2

+ 0.1 exp − 𝑥 −𝑥′

0.1

2

15

Page 16: Marcel Lüthi - unibas.chshapemodelling.cs.unibas.ch/pmm2018/slides/odds-and-ends.pdf•Younes, Laurent: Shapes and diffeomorphisms, Springer 2010 3 > DEPARTMENT OF MATHEMATICS AND

> DEPARTMENT OF MATHEMATICS AND COMPUTER SCIENCE

Periodic kernels

• Define 𝑢 𝑥 =cos 𝑥sin(𝑥)

• 𝑘 𝑥, 𝑥′ = exp(−‖(𝑢 𝑥 − 𝑢 𝑥′ ‖2= exp(−4 sin2‖𝑥 −𝑥′‖

𝜎2)

16

Page 17: Marcel Lüthi - unibas.chshapemodelling.cs.unibas.ch/pmm2018/slides/odds-and-ends.pdf•Younes, Laurent: Shapes and diffeomorphisms, Springer 2010 3 > DEPARTMENT OF MATHEMATICS AND

> DEPARTMENT OF MATHEMATICS AND COMPUTER SCIENCE

Symmetric kernels

• Enforce that f(x) = f(-x)

• 𝑘 𝑥, 𝑥′ = 𝑘 −𝑥, 𝑥′ + 𝑘(𝑥, 𝑥′)

17

Page 18: Marcel Lüthi - unibas.chshapemodelling.cs.unibas.ch/pmm2018/slides/odds-and-ends.pdf•Younes, Laurent: Shapes and diffeomorphisms, Springer 2010 3 > DEPARTMENT OF MATHEMATICS AND

> DEPARTMENT OF MATHEMATICS AND COMPUTER SCIENCE

Changepoint kernels

• 𝑘 𝑥, 𝑥′ = 𝑠 𝑥 𝑘1 𝑥, 𝑥′ 𝑠 𝑥′ + (1 − 𝑠 𝑥 )𝑘2(𝑥, 𝑥′)(1 − 𝑠 𝑥′ )

• s 𝑥 =1

1+exp( −𝑥)

18

Page 19: Marcel Lüthi - unibas.chshapemodelling.cs.unibas.ch/pmm2018/slides/odds-and-ends.pdf•Younes, Laurent: Shapes and diffeomorphisms, Springer 2010 3 > DEPARTMENT OF MATHEMATICS AND

> DEPARTMENT OF MATHEMATICS AND COMPUTER SCIENCE

f x = x

19

Combining existing functions

𝑘 𝑥, 𝑥′ = 𝑓 𝑥 𝑓 𝑥′

Page 20: Marcel Lüthi - unibas.chshapemodelling.cs.unibas.ch/pmm2018/slides/odds-and-ends.pdf•Younes, Laurent: Shapes and diffeomorphisms, Springer 2010 3 > DEPARTMENT OF MATHEMATICS AND

> DEPARTMENT OF MATHEMATICS AND COMPUTER SCIENCE

20

Combining existing functions

𝑘 𝑥, 𝑥′ = 𝑓 𝑥 𝑓 𝑥′

f x = sin(x)

Page 21: Marcel Lüthi - unibas.chshapemodelling.cs.unibas.ch/pmm2018/slides/odds-and-ends.pdf•Younes, Laurent: Shapes and diffeomorphisms, Springer 2010 3 > DEPARTMENT OF MATHEMATICS AND

> DEPARTMENT OF MATHEMATICS AND COMPUTER SCIENCE

{f1 x = x, f2 x = sin(x)}

21

𝑘 𝑥, 𝑥′ =

𝑖

𝑓𝑖 𝑥 𝑓𝑖(𝑥′)

Combining existing functions

Page 22: Marcel Lüthi - unibas.chshapemodelling.cs.unibas.ch/pmm2018/slides/odds-and-ends.pdf•Younes, Laurent: Shapes and diffeomorphisms, Springer 2010 3 > DEPARTMENT OF MATHEMATICS AND

> DEPARTMENT OF MATHEMATICS AND COMPUTER SCIENCE

Reproducing Kernel Hilbert Space

• Define the space of functions

𝐻 = {𝑓|𝑓 𝑥 =

𝑖=1

𝑁

𝛼𝑖𝑘 𝑥, 𝑥𝑖 , 𝑛 ∈ ℕ, 𝑥𝑖 ∈ 𝑋, 𝛼𝑖 ∈ ℝ}

For 𝑓 𝑥 = σ𝑖 𝛼𝑖𝑘 𝑥𝑖 , 𝑥 and 𝑔 𝑥 = σ𝑗 𝛼𝑗′𝑘(𝑥𝑗, 𝑥) we define the

inner product

𝑓, 𝑔 𝑘 =

𝑖,𝑗

𝛼𝑖𝛼𝑗′𝑘(𝑥𝑖 , 𝑥𝑗)

The space H called a Reproducing Kernel Hilbert Space (RKHS).

Page 23: Marcel Lüthi - unibas.chshapemodelling.cs.unibas.ch/pmm2018/slides/odds-and-ends.pdf•Younes, Laurent: Shapes and diffeomorphisms, Springer 2010 3 > DEPARTMENT OF MATHEMATICS AND

> DEPARTMENT OF MATHEMATICS AND COMPUTER SCIENCE

Two differnet basis for the RKHS

• Kernel basis • Eigenbasis (KL-Basis)

𝑘 𝑥, 𝑥′ = exp −𝑥 − 𝑥′ 2

9

Page 24: Marcel Lüthi - unibas.chshapemodelling.cs.unibas.ch/pmm2018/slides/odds-and-ends.pdf•Younes, Laurent: Shapes and diffeomorphisms, Springer 2010 3 > DEPARTMENT OF MATHEMATICS AND

> DEPARTMENT OF MATHEMATICS AND COMPUTER SCIENCE

Gaussian process regression

Page 25: Marcel Lüthi - unibas.chshapemodelling.cs.unibas.ch/pmm2018/slides/odds-and-ends.pdf•Younes, Laurent: Shapes and diffeomorphisms, Springer 2010 3 > DEPARTMENT OF MATHEMATICS AND

> DEPARTMENT OF MATHEMATICS AND COMPUTER SCIENCE

Gaussian process regression

• Given :Observations: {(𝑥1, 𝑦1), … , 𝑥𝑛, 𝑦𝑛 }

• Goal:compute p(𝑦∗|𝑥∗, 𝑥1, … , 𝑥𝑛, 𝑦1, … , 𝑦𝑛)

25𝑥1 𝑥2 𝑥𝑛𝑥∗

𝑦∗

Page 26: Marcel Lüthi - unibas.chshapemodelling.cs.unibas.ch/pmm2018/slides/odds-and-ends.pdf•Younes, Laurent: Shapes and diffeomorphisms, Springer 2010 3 > DEPARTMENT OF MATHEMATICS AND

> DEPARTMENT OF MATHEMATICS AND COMPUTER SCIENCE

Gaussian process regression

• Solution given by posterior process 𝐺𝑃 𝜇𝑝, 𝑘𝑝 with

𝜇𝑝(𝑥∗) = 𝐾 𝑥∗, 𝑋 𝐾 𝑋, 𝑋 + 𝜎2𝐼 −1𝑦

𝑘𝑝 𝑥∗, 𝑥∗′ = 𝑘 𝑥∗, 𝑥∗′ − 𝐾 𝑥∗, 𝑋 𝐾 𝑋, 𝑋 + 𝜎2𝐼 −1𝐾 𝑋, 𝑥∗′

• We can sample from the posterior.

26

Page 27: Marcel Lüthi - unibas.chshapemodelling.cs.unibas.ch/pmm2018/slides/odds-and-ends.pdf•Younes, Laurent: Shapes and diffeomorphisms, Springer 2010 3 > DEPARTMENT OF MATHEMATICS AND

> DEPARTMENT OF MATHEMATICS AND COMPUTER SCIENCE

Examples

27

Page 28: Marcel Lüthi - unibas.chshapemodelling.cs.unibas.ch/pmm2018/slides/odds-and-ends.pdf•Younes, Laurent: Shapes and diffeomorphisms, Springer 2010 3 > DEPARTMENT OF MATHEMATICS AND

> DEPARTMENT OF MATHEMATICS AND COMPUTER SCIENCE

Examples

28

Gaussian kernel (𝜎 = 1)

Page 29: Marcel Lüthi - unibas.chshapemodelling.cs.unibas.ch/pmm2018/slides/odds-and-ends.pdf•Younes, Laurent: Shapes and diffeomorphisms, Springer 2010 3 > DEPARTMENT OF MATHEMATICS AND

> DEPARTMENT OF MATHEMATICS AND COMPUTER SCIENCE

Examples

29

Gaussian kernel (𝜎 = 5)

Page 30: Marcel Lüthi - unibas.chshapemodelling.cs.unibas.ch/pmm2018/slides/odds-and-ends.pdf•Younes, Laurent: Shapes and diffeomorphisms, Springer 2010 3 > DEPARTMENT OF MATHEMATICS AND

> DEPARTMENT OF MATHEMATICS AND COMPUTER SCIENCE

Examples

30

Periodic kernel

Page 31: Marcel Lüthi - unibas.chshapemodelling.cs.unibas.ch/pmm2018/slides/odds-and-ends.pdf•Younes, Laurent: Shapes and diffeomorphisms, Springer 2010 3 > DEPARTMENT OF MATHEMATICS AND

> DEPARTMENT OF MATHEMATICS AND COMPUTER SCIENCE

Examples

31

Changepoint kernel

Page 32: Marcel Lüthi - unibas.chshapemodelling.cs.unibas.ch/pmm2018/slides/odds-and-ends.pdf•Younes, Laurent: Shapes and diffeomorphisms, Springer 2010 3 > DEPARTMENT OF MATHEMATICS AND

> DEPARTMENT OF MATHEMATICS AND COMPUTER SCIENCE

Examples

32

Symmetric kernel

Page 33: Marcel Lüthi - unibas.chshapemodelling.cs.unibas.ch/pmm2018/slides/odds-and-ends.pdf•Younes, Laurent: Shapes and diffeomorphisms, Springer 2010 3 > DEPARTMENT OF MATHEMATICS AND

> DEPARTMENT OF MATHEMATICS AND COMPUTER SCIENCE

Examples

33

Linear kernel

Page 34: Marcel Lüthi - unibas.chshapemodelling.cs.unibas.ch/pmm2018/slides/odds-and-ends.pdf•Younes, Laurent: Shapes and diffeomorphisms, Springer 2010 3 > DEPARTMENT OF MATHEMATICS AND

> DEPARTMENT OF MATHEMATICS AND COMPUTER SCIENCE

Observations about the solution

• The covariance is independent of the value at the training points

38

𝑘𝑝 𝑥∗, 𝑥∗′ = 𝑘 𝑥∗, 𝑥∗′ − 𝐾 𝑥∗, 𝑋 𝐾 𝑋, 𝑋 + 𝜎2𝐼 −1𝐾 𝑋, 𝑥∗′

Page 35: Marcel Lüthi - unibas.chshapemodelling.cs.unibas.ch/pmm2018/slides/odds-and-ends.pdf•Younes, Laurent: Shapes and diffeomorphisms, Springer 2010 3 > DEPARTMENT OF MATHEMATICS AND

> DEPARTMENT OF MATHEMATICS AND COMPUTER SCIENCE

Kernels and associated structures

39

Page 36: Marcel Lüthi - unibas.chshapemodelling.cs.unibas.ch/pmm2018/slides/odds-and-ends.pdf•Younes, Laurent: Shapes and diffeomorphisms, Springer 2010 3 > DEPARTMENT OF MATHEMATICS AND

> DEPARTMENT OF MATHEMATICS AND COMPUTER SCIENCE

An enlightening paper

40