
Computing Unit 6: Optimization and Root Finding

Kurt Hornik

November 12, 2020


Outline

- Basics

- Root finding

- Optimization

Slide 2


Basics

Root finding: for f: R^n → R^m, find roots x* such that f(x*) = 0.

Optimization: for f: R^n → R and X ⊆ R^n, find x* ∈ X such that f(x*) = max_{x ∈ X} f(x) (if a max is sought) or f(x*) = min_{x ∈ X} f(x) (if a min is sought).

Simplest case where max/min are attained: life can be more complicated.

If X = R^n: unconstrained optimization problem.

If x* is a root of f, then x* minimizes ‖f(x)‖² over the domain of f.

If x* is a local optimum of f and f is smooth, then x* is a critical point of f, i.e., ∇f(x*) = 0.

Can find roots via optimization, and optimize via root finding!

Note that the f for optimization is always a scalar function!

Slide 3


Outline

- Basics

- Root finding

- Optimization

Slide 4


Root finding

- Solve systems of linear equations ⇒ numerical linear algebra (2 more weeks).

- Find roots of polynomials: polyroot().

- Find roots of continuous univariate functions: uniroot(). This is based on the bisection method.

- Already encountered Newton's method for root finding, which is a fixed-point iteration method. Interestingly, no function for this in base R.

Slide 5
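As a quick illustration (not from the slides), polyroot() expects the polynomial's coefficients in increasing order of powers and returns all (complex) roots:

R> ## roots of 1 - 3 x + 2 x^2, i.e., of (1 - x)(1 - 2 x)
R> polyroot(c(1, -3, 2))   ## roots are 0.5 and 1, returned as complex numbers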


Bisection method

Suppose f is continuous with a sign change in [a, b] (i.e., f(a) and f(b) have opposite signs).

Then f must have at least one root in [a, b].

It can be found by the following iteration: starting with a_0 = a and b_0 = b,

- Take the current interval [a_n, b_n].

- If b_n − a_n is small enough, we are done. Otherwise, compute the midpoint x_n = (a_n + b_n)/2.

- If f has a sign change in [a_n, x_n], take this as the next interval; otherwise take [x_n, b_n].

Slide 6
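A minimal sketch of this iteration (the helper name bisect and its defaults are made up for illustration; in practice one uses uniroot()):

R> bisect <- function(f, a, b, tol = 1e-8) {
+     if (f(a) * f(b) > 0) stop("f(a) and f(b) must have opposite signs")
+     while (b - a > tol) {
+         m <- (a + b) / 2
+         ## keep the half-interval in which f changes sign
+         if (f(a) * f(m) <= 0) b <- m else a <- m
+     }
+     (a + b) / 2
+ }
R> bisect(function(x) cos(x) - x, 0, pi / 2)   ## about 0.739085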


Bisection method

Each step halves the interval length.

After n steps, the interval length is (b − a)/2^n.

So the interval converges to a root of f "exponentially fast".

Hence the computational complexity is logarithmic in the precision (interval length) sought.

Slide 7


Example

The function f(x) = cos(x) − x has a root in [0, π/2]:

R> f <- function(x) cos(x) - x
R> plot(f, 0, pi / 2); abline(h = 0)

[Figure: plot of f(x) = cos(x) − x over [0, π/2], crossing zero near x ≈ 0.74; axes x and f(x)]

Slide 8


Example

R> uniroot(f, c(0, pi / 2))

$root
[1] 0.7390839

$f.root
[1] 2.059443e-06

$iter
[1] 5

$init.it
[1] NA

$estim.prec
[1] 6.103516e-05

Slide 9
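The default convergence tolerance of uniroot() is fairly loose; if more precision is needed, it can be tightened via the tol argument (a minimal illustration, not from the slides):

R> uniroot(f, c(0, pi / 2), tol = 1e-10)$root   ## closer to the true root, at the cost of a few more iterations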


Fixed-point iteration

Suppose g is continuous and the iteration

x_{n+1} = g(x_n)

gives a sequence x_n with (finite) limit x*. What can we say about x*?

By continuity,

x* = lim_n x_{n+1} = lim_n g(x_n) = g(lim_n x_n) = g(x*).

Thus x* must be a fixed point of g, i.e., solve x* = g(x*).

Can we use this to find roots of f via fixed points of g(x) = x + f(x)? Not quite so simple . . .

Slide 10
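A minimal sketch of such an iteration (the helper name fixed_point and its defaults are made up for illustration):

R> fixed_point <- function(g, x0, tol = 1e-8, maxit = 1000) {
+     x <- x0
+     for (i in seq_len(maxit)) {
+         x_new <- g(x)
+         if (abs(x_new - x) < tol) return(x_new)   ## successive values agree: stop
+         x <- x_new
+     }
+     warning("fixed-point iteration did not converge")
+     x
+ }

Whether (and how fast) this converges depends on g, as the next slides explain.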


Fixed-point iteration

If g is smooth about x*,

x_{n+1} − x* = g(x_n) − g(x*) ≈ g'(x*)(x_n − x*),

so convergence somehow needs g to be attractive at x*, i.e., |g'(x*)| < 1.

Otherwise, the approximation error x_n − x* would not get reduced.

In the attractive case, the approximation errors go to zero exponentially fast.

Slide 11
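For example, g(x) = cos(x) has the root of cos(x) − x from above as its fixed point, with |g'(x*)| = |sin(x*)| ≈ 0.67 < 1 there, so the plain iteration converges (using the fixed_point() sketch from above):

R> fixed_point(cos, 1)   ## about 0.739085, the same value uniroot() found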


Fixed-point iteration

Prime example: Newton's method. This uses

g(x) = x − f(x)/f'(x).

Clearly,

x = g(x) ⇒ f(x)/f'(x) = 0 ⇒ f(x) = 0.

Also,

g'(x) = 1 − (f'(x)² − f(x)f''(x))/f'(x)² = f(x)f''(x)/f'(x)².

Slide 12
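Since base R has no ready-made Newton root finder, here is a minimal sketch (the name newton is made up for illustration); for the running example f(x) = cos(x) − x we supply f'(x) = −sin(x) − 1 by hand:

R> newton <- function(f, df, x0, tol = 1e-12, maxit = 100) {
+     x <- x0
+     for (i in seq_len(maxit)) {
+         x_new <- x - f(x) / df(x)   ## the fixed-point map g(x) = x - f(x)/f'(x)
+         if (abs(x_new - x) < tol) return(x_new)
+         x <- x_new
+     }
+     warning("Newton iteration did not converge")
+     x
+ }
R> newton(function(x) cos(x) - x, function(x) -sin(x) - 1, 1)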


Fixed-point iteration

Hence, if x* is a fixed point of g: g'(x*) = 0!

In a sense, such fixed points are "very attractive".

Convergence to x* is actually "locally quadratic" in the sense that

|x_{n+1} − x*| ≤ const · |x_n − x*|².

(Note the confusing terminology: approximation errors go to zero faster than exponentially fast!)

Slide 13


Outline

- Basics

- Root finding

- Optimization

Slide 14


Optimization with base R

- optimize() optimizes a univariate function based on golden section search.

- nlm() optimizes functions via Newton-type algorithms (i.e., via finding critical points).

- optim() provides several optimization methods: Nelder-Mead, quasi-Newton and conjugate-gradient algorithms, and simulated annealing.

- nls() solves non-linear least squares problems.

- constrOptim() minimizes a function subject to linear inequality constraints using an adaptive barrier algorithm.

Much more in add-on packages . . . and other courses.

Here, we really only show the tip of the iceberg, and some basics.

Slide 15


Golden section search

A function f is unimodal (over some interval [a, b]) if there is an m in the interval such that f is increasing for a ≤ x < m and decreasing for m < x ≤ b.

In this case, clearly f has its maximum at m, and no other local maxima.

f does not have to be continuous. Example:

Slide 16


Golden section search

[Figure: a unimodal but discontinuous function f on [0, 3]; axes x and f(x)]

Slide 17


Golden section search

Golden section search is based on "trisection".

One starts with an interval [a, b] containing the maximum, and adds two more points x_1 and x_2 to cut it into three subintervals:

a < x_1 < x_2 < b.

If f(x_1) > f(x_2), the max cannot be in [x_2, b], and the new interval is [a, x_2].

If f(x_1) < f(x_2), the max cannot be in [a, x_1], and the new interval is [x_1, b].

Slide 18


Golden section search

[Figure: the unimodal example function on [0, 3] with the points a, x1, x2, b marked on the x axis]

Slide 19


Golden section search

Golden section search is based on the following idea:

- Intervals are always partitioned in constant proportions:

  x_1 = a + γ_1(b − a),   x_2 = a + γ_2(b − a)

- The γ_i are chosen in a way that one of the previous points (and hence its function value f(x_i)) can be re-used.

Why? Function evaluation can be very expensive.

Slide 20


Golden section search

If f(x_1) > f(x_2), the new interval is [a' = a, b' = x_2], and the new points are

x'_1 = a + γ_1(x_2 − a),   x'_2 = a + γ_2(x_2 − a)

One of these must be x_1, and clearly x'_1 < x_1, so we must have x'_2 = x_1.

Thus,

a + γ_1(b − a) = x_1 = x'_2 = a + γ_2(x_2 − a) = a + γ_2²(b − a),

from which γ_1 = γ_2².

Slide 21


Golden section search

If f(x_1) < f(x_2), the new interval is [a' = x_1, b' = b], and the new points are

x'_1 = x_1 + γ_1(b − x_1),   x'_2 = x_1 + γ_2(b − x_1)

One of these must be x_2 = (1 − γ_2)a + γ_2 b. As clearly x'_2 = (1 − γ_2)x_1 + γ_2 b > x_2, we must have x_2 = x'_1. Thus:

a + γ_2(b − a) = x_1 + γ_1(b − x_1)
             = (1 − γ_1)x_1 + γ_1 b
             = a + (1 − γ_1)(x_1 − a) + γ_1(b − a)
             = a + (1 − γ_1)γ_1(b − a) + γ_1(b − a),

from which γ_2 = 2γ_1 − γ_1².

Slide 22


Golden section search

Looks complicated? We have

γ_2² = γ_1,   γ_1² = 2γ_1 − γ_2.

Subtracting:

γ_2² − γ_1² = γ_1 − (2γ_1 − γ_2) = γ_2 − γ_1.

As γ_1 < γ_2 and γ_2² − γ_1² = (γ_2 − γ_1)(γ_2 + γ_1), we get

γ_2 + γ_1 = 1.

Slide 23


Golden section search

Substituting now gives γ_2² = γ_1 = 1 − γ_2, so that γ_2 solves

γ_2² + γ_2 − 1 = 0.

Looks familiar?

The roots are −1/2 ± √(5/4), so that

γ_2 = (√5 − 1)/2 = ((√5 − 1)/2) · ((√5 + 1)/(√5 + 1)) = 2/(√5 + 1) = 1/ϕ ≈ 0.61803

is the reciprocal value of the golden ratio ϕ!

Similarly,

γ_1 = 1 − γ_2 = 1 − 1/ϕ = 1 − (ϕ − 1) = 2 − ϕ ≈ 0.38197.

Slide 24
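A quick numerical check of these proportions (not from the slides):

R> gamma2 <- (sqrt(5) - 1) / 2   ## 1 / golden ratio, about 0.618
R> gamma1 <- 1 - gamma2          ## about 0.382
R> all.equal(gamma2^2, gamma1)   ## gamma2^2 = gamma1, as derived above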


Golden section search

Good to know (so that we can impress friends and family).

One can show that using the golden section proportions provides optimal shrinkage of the interval length.

The textbook and lecture notes contain a simple implementation (a sketch in that spirit follows after this slide).

In real life, just use optimize().

With the function we used as example:

R> f <- function(x) - (x - 0.5) * (x - 2.25) + ifelse(x < 1, -1, 0)

Slide 25
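A simple maximizing version in the spirit of the derivation above (the name golden is made up; this is a sketch, not the textbook's code):

R> golden <- function(f, a, b, tol = 1e-8) {
+     g2 <- (sqrt(5) - 1) / 2            ## 1 / golden ratio
+     g1 <- 1 - g2
+     x1 <- a + g1 * (b - a); f1 <- f(x1)
+     x2 <- a + g2 * (b - a); f2 <- f(x2)
+     while (b - a > tol) {
+         if (f1 > f2) {                 ## max cannot be in [x2, b]
+             b <- x2
+             x2 <- x1; f2 <- f1         ## re-use x1 as the new x2
+             x1 <- a + g1 * (b - a); f1 <- f(x1)
+         } else {                       ## max cannot be in [a, x1]
+             a <- x1
+             x1 <- x2; f1 <- f2         ## re-use x2 as the new x1
+             x2 <- a + g2 * (b - a); f2 <- f(x2)
+         }
+     }
+     (a + b) / 2
+ }
R> golden(f, 0, 3)   ## should be close to the 1.375 that optimize() reports below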


Golden section search

R> optimize(f, c(0, 3), maximum = TRUE)

$maximum
[1] 1.375

$objective
[1] 0.765625

Slide 26


Golden section search

[Figure: the example function on [0, 3] with the maximizer x* ≈ 1.375 marked; axes x and f(x)]

Slide 27


Golden section search

Things to note:

- Golden section search is only guaranteed to work if f is unimodal (for a max).

- By default, optimize() does minimization and not maximization.

- optimize() only works for functions f: R → R. It does not need derivatives: blessing or curse?

Slide 28


Example: MLE for the Poisson distribution

The density of the Poisson distribution with parameter λ at a (non-negative integer) x is

(λ^x / x!) e^{−λ}

Suppose we observe a sample x_1, . . . , x_n from a Poisson distribution with unknown λ. How can we estimate λ from that sample?

We’ll learn how to do this in the Statistics 2 course.

Slide 29


Example: MLE for the Poisson distribution

The preferred method is maximum likelihood estimation (MLE): one forms the likelihood function

L(λ | x_1, . . . , x_n) = ∏_{i=1}^n (λ^{x_i} / x_i!) e^{−λ}

and maximizes this over λ.

Usually, one actually takes the log-likelihood because this is more convenient to work with.

Slide 30


Example: MLE for the Poisson distribution

Up to an additive constant, the log-likelihood is

LL(λ | x_1, . . . , x_n) = ∑_{i=1}^n (x_i log(λ) − λ) = s(x_1, . . . , x_n) log(λ) − nλ

with s(x_1, . . . , x_n) = ∑_{i=1}^n x_i.

So for the MLE we have to find max_{λ>0} s log(λ) − nλ.

The MLE is obviously given by λ̂ = s/n = x̄ (set the derivative s/λ − n to zero).

Nice, as we can do the math to find a simple formula.

Slide 31


Example: MLE for the Poisson distribution

Numerically, we can simply do

R> mle_pois <- function(x) {
+     LL <- function(lambda) {
+         sum(x) * log(lambda) - length(x) * lambda
+     }
+     optimize(LL, lower = 0, upper = max(x), maximum = TRUE)
+ }

(Some finite upper bound is needed.)

Slide 32


Example: MLE for the Poisson distribution

To illustrate:

R> x <- rpois(100, 3.2)
R> mle_pois(x)

$maximum
[1] 3.660001

$objective
[1] 108.8715

R> ## Compare to
R> mean(x)

[1] 3.66

Slide 33


Multivariate optimization

Newton-type algorithms use the following idea.

Suppose f: X → R with X ⊆ R^n is twice continuously differentiable on the interior of X and attains its max there at some x*.

Then x* must be a critical point:

∇f(x*) = 0,

where

∇f(x) = (∂f/∂ξ_1(x), . . . , ∂f/∂ξ_n(x))',   x = (ξ_1, . . . , ξ_n)'

is the gradient (and the prime denotes transposition).

Slide 34


Multivariate optimization

We can try finding roots of the gradient using a multivariate version of Newton's method.

First, close to x_n we have

∇f(x) ≈ L_n(x) = ∇f(x_n) + H_f(x_n)(x − x_n),

where

H_f(x) = [∂²f/(∂ξ_i ∂ξ_j)(x)]_{1≤i,j≤n}

is the Hessian of f at x, i.e., the matrix of all second partial derivatives of f.

Details in the math course.

Slide 35


Multivariate optimization

Now solve L_n(x) = 0. Mathematically, this gives the linear system

H_f(x_n)(x − x_n) = −∇f(x_n)

which has solution

x − x_n = −H_f(x_n)^{−1} ∇f(x_n).

Thus, one gets the iteration

x_{n+1} = x_n − H_f(x_n)^{−1} ∇f(x_n).

That's the basic idea. Things are really more complicated, but fortunately we can simply use functions like optim() or nlm().

Slide 36
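A minimal sketch of this iteration with gradient and Hessian supplied by hand (the name newton_opt is made up for illustration; solve() is used instead of forming the inverse explicitly):

R> newton_opt <- function(grad, hess, x0, tol = 1e-10, maxit = 100) {
+     x <- x0
+     for (i in seq_len(maxit)) {
+         step <- solve(hess(x), grad(x))   ## solves H_f(x) %*% step = grad f(x)
+         x <- x - step
+         if (sqrt(sum(step^2)) < tol) break
+     }
+     x
+ }
R> ## e.g., the critical point of f(x, y) = (x - 1)^2 + (y - 2)^2:
R> newton_opt(function(p) c(2 * (p[1] - 1), 2 * (p[2] - 2)),
+             function(p) diag(c(2, 2)),
+             c(0, 0))   ## a single Newton step already gives c(1, 2)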


Multivariate optimization

Things to note:

- This needs f smooth and the first and second partials of f. We can either provide these in addition to f, or have R approximate these numerically.

- This can only find some local optimum of f, which may not be the global one.

- Things will always be fine if we maximize a concave function.

Slide 37


Example: MLE for the normal distribution

The density of the normal distribution with parameters μ and σ² at x is given by

(1/(√(2π) σ)) exp(−(x − μ)² / (2σ²))

Suppose we observe a sample x_1, . . . , x_n from a normal distribution with unknown μ and σ². How can we estimate μ and σ² from that sample?

Slide 38


Example: MLE for the normal distribution

The likelihood of a sample x_1, . . . , x_n is

L(μ, σ² | x_1, . . . , x_n) = ∏_{i=1}^n (1/(√(2π) σ)) exp(−(x_i − μ)² / (2σ²))

Up to an additive constant, the log-likelihood is

LL(μ, σ²) = −(n/2) log(σ²) − (1/(2σ²)) ∑_{i=1}^n (x_i − μ)².

This has partials

∂LL/∂μ = (1/σ²) ∑_{i=1}^n (x_i − μ),   ∂LL/∂σ² = −(n/2)(1/σ²) + (1/(2σ⁴)) ∑_{i=1}^n (x_i − μ)².

Slide 39


Example: MLE for the normal distribution

This has a unique critical point given by

μ̂ = x̄,   σ̂² = (1/n) ∑_{i=1}^n (x_i − x̄)²

where x̄ is the mean of the x_i, which thus gives the MLEs.

Note that the MLE of σ² is not the sample variance!

Nice, as we can do the math to find a simple formula.

Slide 40


Example: MLE for the normal distribution

Numerically, we can simply minimize twice the negative log-likelihood:

R> mle_norm <- function(x, p0) {
+     nLL <- function(p) {
+         mu <- p[1]
+         sigmasq <- p[2]
+         length(x) * log(sigmasq) + sum((x - mu) ^ 2) / sigmasq
+     }
+     optim(p0, nLL)
+ }

This needs some initial estimate of (μ, σ2).

Slide 41


Example: MLE for the normal distribution I

To illustrate (note that in R the normal distribution is parametrized by the standard deviation σ and not the variance σ²):

R> x <- rnorm(100, 0.5, 2)
R> mle_norm(x, c(0, 1))

$par
[1] 0.6005534 4.5651644

$value
[1] 251.826

$counts
function gradient
      53       NA

$convergence

Slide 42


Example: MLE for the normal distribution II

[1] 0

$message
NULL

R> ## Compare to
R> c(mean(x), (1 - 1 / length(x)) * var(x))

[1] 0.6003242 4.5642757

Slide 43


Example: MLE for the normal distribution

By default, this actually uses Nelder-Mead, which uses only function values and is robust but relatively slow. It will work reasonably well for non-differentiable functions.

Could also do quasi-Newton: optim(method = "BFGS"), which uses function values and gradients to build up a picture of the surface to be optimized.

(No explicit computation of second partials.)

Slide 44
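For the normal MLE this could look as follows (a sketch, not from the slides: the name mle_norm_bfgs is made up, and the gradient just codes the partials from Slide 39 for the scaled negative log-likelihood; a more robust variant would optimize over log(σ²) to keep the variance positive):

R> mle_norm_bfgs <- function(x, p0) {
+     nLL <- function(p) length(x) * log(p[2]) + sum((x - p[1])^2) / p[2]
+     gr  <- function(p) c(-2 * sum(x - p[1]) / p[2],
+                          length(x) / p[2] - sum((x - p[1])^2) / p[2]^2)
+     optim(p0, nLL, gr, method = "BFGS")
+ }
R> mle_norm_bfgs(x, c(0, 1))$par   ## close to c(mean(x), (1 - 1/length(x)) * var(x))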


Example: MLE for the normal distribution

Alternatively, we can use nlm(), which employs a Newton-type method (and always minimizes), without our explicitly providing gradients and Hessians:

R> nLL2 <- function(p, x) {
+     mu <- p[1]
+     sigmasq <- p[2]
+     length(x) * log(sigmasq) + sum((x - mu) ^ 2) / sigmasq
+ }

Starting from (0,1) again:

Slide 45


Example: MLE for the normal distribution

R> nlm(nLL2, c(0, 1), x)

$minimum
[1] 251.826

$estimate
[1] 0.6003227 4.5642620

$gradient
[1] -4.237677e-05 -5.506546e-05

$code
[1] 1

$iterations
[1] 15

Slide 46