Data Mining - Mathematics - univ-angers.frricher/dm/data_mining_0_maths.pdf · 2018. 5. 25. · Dr....
Data Mining - Mathematics
Dr. Jean-Michel RICHER
Dr. Jean-Michel RICHER Data Mining - Mathematics 1 / 102
Outline
1. Introduction
2. Matrices and vectors
3. Derivative
4. Gradient
5. Lagrangian
6. Exercises
1. Introduction
What we will cover
Mathematical background needed to understand Machine Learning techniques:
- matrix and vector operations
- derivative
- gradient
- Lagrangian
Difficulty of mathematics
- use of symbols that represent expressions
- some symbols have different meanings depending on the context
- the syntax is sometimes strict and sometimes not
Example of a prime number
n is a prime number if it has exactly two divisors, 1 and itself (which implies the restriction $n \neq 1$)
- this implies that we deal with integers
- this implies the notion of divisibility: $\forall\, n, p, q, r \in \mathbb{N}$ with $q \geq 1$ and $0 \leq r < q$,

$$n = p \times q + r$$

n is divisible by q if (and only if) $r = 0$ and $p \geq 1$
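The divisibility test above can be turned into a primality check by trial division. The following C++ sketch is only an illustration: the function names `is_divisible` and `is_prime` are assumptions made here and do not come from the slides.

```cpp
#include <cassert>

// n is divisible by q when the remainder r in n = p * q + r is zero.
bool is_divisible(unsigned n, unsigned q) {
    assert(q != 0);  // division by zero is not defined
    return n % q == 0;
}

// Trial division: n is prime if n >= 2 and no q with 2 <= q and q * q <= n divides it.
bool is_prime(unsigned n) {
    if (n < 2) return false;  // 0 and 1 are not prime
    for (unsigned q = 2; q <= n / q; ++q) {  // q <= n / q is q * q <= n without overflow
        if (is_divisible(n, q)) return false;
    }
    return true;
}
```

Only divisors up to the square root of n need to be tried, because any larger divisor is paired with a smaller one.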
2. Matrices and vectors
Vector
- a series of values that can be identified by their index in an array of length p: $x(p)$ or simply $x \in \mathbb{R}^p$
- for computer scientists, a 1D array of length p:

$$x(p) = (x_1, x_2, \dots, x_p) = [x_1, x_2, \dots, x_p], \quad x_i \in \mathbb{R}$$

- can also be represented vertically (for mathematicians):

$$x(p) = x = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_p \end{pmatrix}, \qquad x^T = [x_1, x_2, \dots, x_p]$$

in this case $x^T$ (T for Transposition) is the horizontal representation
Operations on vectors
- vectors must have the same length
- element-wise operations: +, −, ×, /
- the dot (or scalar) product of two vectors:

$$xy = x \cdot y = x^T y = \sum_{i=1}^{p} x_i \times y_i$$

- the norm (or length) of a vector: $\|x\| = \sqrt{x \cdot x}$
Example of operations on vectors
$$x = [1, -2, 3], \qquad y = [-1, 4, -7]$$

$$x + y = [1 + (-1),\ -2 + 4,\ 3 + (-7)] = [0, 2, -4]$$

$$x \times y = [1 \times (-1),\ -2 \times 4,\ 3 \times (-7)] = [-1, -8, -21]$$

$$xy = (1 \times (-1)) + (-2 \times 4) + (3 \times (-7)) = -30$$

$$x^T y = [1, -2, 3] \cdot \begin{pmatrix} -1 \\ 4 \\ -7 \end{pmatrix} = -30$$
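Norm of a vector in 2D, for $x = [2, 3]$:

$$x \cdot x = (2 \times 2) + (3 \times 3) = 4 + 9 = 13$$

$$\|x\| = \sqrt{13} \approx 3.6055$$

[Figure: the vector x drawn in the plane, with its components 2 and 3 marked on the axes]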
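The dot product and the norm can be sketched in a few lines of C++. This is a minimal illustration, not code from the course: the function names `dot` and `norm` and the use of `std::vector<double>` are assumptions made here.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Dot product: sum of x[i] * y[i]; both vectors must have the same length.
double dot(const std::vector<double>& x, const std::vector<double>& y) {
    assert(x.size() == y.size());
    double sum = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i) {
        sum += x[i] * y[i];
    }
    return sum;
}

// Norm (length) of a vector: square root of the dot product of x with itself.
double norm(const std::vector<double>& x) {
    return std::sqrt(dot(x, x));
}
```

Called on the vectors of the examples above, `dot` should reproduce the value −30 and `norm` the value √13.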
Matrix
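- $X(p) = (x_1, x_2, \dots, x_p)$ where $x_i \in \mathbb{R}^n$, a notation that can lead to confusion
- for computer scientists, a 2D array defined by:
  - n rows
  - and p columns
- can be seen as an array of vectors or a vector of vectors

$$X(n,p) = \begin{pmatrix} x_1^1 & x_1^2 & \dots & x_1^p \\ x_2^1 & x_2^2 & \dots & x_2^p \\ \vdots & \vdots & \ddots & \vdots \\ x_n^1 & x_n^2 & \dots & x_n^p \end{pmatrix} = \left( \begin{pmatrix} x_1^1 \\ x_2^1 \\ \vdots \\ x_n^1 \end{pmatrix} \begin{pmatrix} x_1^2 \\ x_2^2 \\ \vdots \\ x_n^2 \end{pmatrix} \cdots \begin{pmatrix} x_1^p \\ x_2^p \\ \vdots \\ x_n^p \end{pmatrix} \right)$$

$x_i^j$ is the element in row i and column j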
Properties of matrices
Square matrices
- a square matrix is such that n = p
  - the diagonal is a separation line
  - called lower triangular if all the entries above the main diagonal are zero
  - called upper triangular if all the entries below the main diagonal are zero
- $I$ or $I_n$ is the identity matrix

$$I_n = \begin{pmatrix} 1 & 0 & \dots & 0 \\ 0 & 1 & \ddots & \vdots \\ \vdots & \ddots & 1 & 0 \\ 0 & \dots & 0 & 1 \end{pmatrix}$$
Operations on matrices
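Matrix sum
- matrices must have the same dimensions

$$X(n,p) = \begin{pmatrix} x_1^1 & x_1^2 & \dots & x_1^p \\ x_2^1 & x_2^2 & \dots & x_2^p \\ \vdots & \vdots & \ddots & \vdots \\ x_n^1 & x_n^2 & \dots & x_n^p \end{pmatrix} + Y(n,p) = \begin{pmatrix} y_1^1 & y_1^2 & \dots & y_1^p \\ y_2^1 & y_2^2 & \dots & y_2^p \\ \vdots & \vdots & \ddots & \vdots \\ y_n^1 & y_n^2 & \dots & y_n^p \end{pmatrix}$$

$$= Z(n,p) = \begin{pmatrix} x_1^1 + y_1^1 & x_1^2 + y_1^2 & \dots & x_1^p + y_1^p \\ x_2^1 + y_2^1 & x_2^2 + y_2^2 & \dots & x_2^p + y_2^p \\ \vdots & \vdots & \ddots & \vdots \\ x_n^1 + y_n^1 & x_n^2 + y_n^2 & \dots & x_n^p + y_n^p \end{pmatrix}$$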
Matrix product

$$X(n,p) \times Y(p,q) = Z(n,q)$$

- the number of columns of X must equal the number of rows of Y
- the element in row i and column j of Z is

$$z_i^j = \sum_{k=1}^{p} x_i^k \times y_k^j$$
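// a holds X (n rows, p columns), b holds Y (p rows, q columns), c receives Z (n rows, q columns)
for (int i = 0; i < n; ++i) {
    for (int j = 0; j < q; ++j) {
        // c[i][j] is the dot product of row i of a with column j of b
        double sum = 0;
        for (int k = 0; k < p; ++k) {
            sum += a[i][k] * b[k][j];
        }
        c[i][j] = sum;
    }
}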
Note that, in general, the matrix product is not commutative:

$$AB \neq BA$$
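A small example makes this concrete. The two 2×2 matrices below are chosen here purely for illustration and do not come from the slides:

```latex
A = \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}, \quad
B = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}
\\
AB = \begin{pmatrix} 2 & 1 \\ 4 & 3 \end{pmatrix}, \quad
BA = \begin{pmatrix} 3 & 4 \\ 1 & 2 \end{pmatrix}
```

Multiplying by B on the right swaps the columns of A, whereas multiplying by B on the left swaps its rows, so the two products differ.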
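Example of matrix product

$$X(3,2) = \begin{pmatrix} 11 & 12 \\ 21 & 22 \\ 31 & 32 \end{pmatrix} \times Y(2,3) = \begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{pmatrix}$$

$$Z(3,3) = \begin{pmatrix} 11 \times 1 + 12 \times 4 & 11 \times 2 + 12 \times 5 & 11 \times 3 + 12 \times 6 \\ 21 \times 1 + 22 \times 4 & 21 \times 2 + 22 \times 5 & 21 \times 3 + 22 \times 6 \\ 31 \times 1 + 32 \times 4 & 31 \times 2 + 32 \times 5 & 31 \times 3 + 32 \times 6 \end{pmatrix} = \begin{pmatrix} 59 & 82 & 105 \\ 109 & 152 & 195 \\ 159 & 222 & 285 \end{pmatrix}$$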
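Example of matrix and vector product

$$X(3,2) = \begin{pmatrix} 11 & 12 \\ 21 & 22 \\ 31 & 32 \end{pmatrix} \times y(2) = \begin{pmatrix} 1 \\ 2 \end{pmatrix}$$

$$z(3) = \begin{pmatrix} 11 \times 1 + 12 \times 2 \\ 21 \times 1 + 22 \times 2 \\ 31 \times 1 + 32 \times 2 \end{pmatrix} = \begin{pmatrix} 35 \\ 65 \\ 95 \end{pmatrix}$$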
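The matrix and vector product can be sketched in C++ as follows. This is an illustrative sketch under assumptions made here: the alias `Matrix`, the function name `mat_vec` and the row-by-row storage in nested `std::vector` are not part of the slides.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Product of an (n x p) matrix with a vector of length p; the result has length n.
std::vector<double> mat_vec(const Matrix& x, const std::vector<double>& y) {
    std::vector<double> z(x.size(), 0.0);
    for (std::size_t i = 0; i < x.size(); ++i) {
        assert(x[i].size() == y.size());  // columns of x must match the length of y
        for (std::size_t k = 0; k < y.size(); ++k) {
            z[i] += x[i][k] * y[k];  // row i of x times the vector y
        }
    }
    return z;
}
```

Each entry of the result is the dot product of one row of the matrix with the vector, which is the matrix product with q = 1.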
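Matrix transpose
- to make it simple: exchange the values on both sides of the diagonal of the matrix, so that rows become columns

$$X(n,p) = \begin{pmatrix} x_1^1 & x_1^2 & \dots & x_1^p \\ x_2^1 & x_2^2 & \dots & x_2^p \\ \vdots & \vdots & \ddots & \vdots \\ x_n^1 & x_n^2 & \dots & x_n^p \end{pmatrix} \qquad X^T(p,n) = \begin{pmatrix} x_1^1 & x_2^1 & \dots & x_n^1 \\ x_1^2 & x_2^2 & \dots & x_n^2 \\ \vdots & \vdots & \ddots & \vdots \\ x_1^p & x_2^p & \dots & x_n^p \end{pmatrix}$$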
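Examples of matrix transpose

$$X(3,3) = \begin{pmatrix} 11 & 12 & 13 \\ 21 & 22 & 23 \\ 31 & 32 & 33 \end{pmatrix} \qquad X^T(3,3) = \begin{pmatrix} 11 & 21 & 31 \\ 12 & 22 & 32 \\ 13 & 23 & 33 \end{pmatrix}$$

$$X(3,5) = \begin{pmatrix} 11 & 12 & 13 & 14 & 15 \\ 21 & 22 & 23 & 24 & 25 \\ 31 & 32 & 33 & 34 & 35 \end{pmatrix} \qquad X^T(5,3) = \begin{pmatrix} 11 & 21 & 31 \\ 12 & 22 & 32 \\ 13 & 23 & 33 \\ 14 & 24 & 34 \\ 15 & 25 & 35 \end{pmatrix}$$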
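The transpose can also be written as a short C++ function. The sketch below is only an illustration: the alias `Matrix` and the function name `transpose` are assumptions made here, and the matrix is assumed to be stored row by row with rows of equal length.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Transpose of an (n x p) matrix: element (i, j) of x becomes element (j, i) of the result.
Matrix transpose(const Matrix& x) {
    const std::size_t n = x.size();
    const std::size_t p = (n == 0) ? 0 : x[0].size();
    Matrix t(p, std::vector<double>(n, 0.0));  // the result has p rows and n columns
    for (std::size_t i = 0; i < n; ++i) {
        assert(x[i].size() == p);  // every row must have p columns
        for (std::size_t j = 0; j < p; ++j) {
            t[j][i] = x[i][j];
        }
    }
    return t;
}
```

Applied to a matrix with 3 rows and 5 columns, the function returns a matrix with 5 rows and 3 columns.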
Inverse of a matrix
The inverse matrix $A^{-1}$ of a square matrix $A$, when it exists, is such that

$$A^{-1} \times A = I$$

It is used, for example, to solve systems of linear equations:

$$A \times x = b \quad \text{then} \quad x = A^{-1} \times b$$

where x and b are vectors
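Example of the inverse of a matrix

$$A(3,3) = \begin{pmatrix} 1 & 2 & 3 \\ -1 & 2 & -3 \\ -7 & 6 & -5 \end{pmatrix} \qquad A^{-1}(3,3) = \begin{pmatrix} 0.125 & 0.4375 & -0.1875 \\ 0.25 & 0.25 & 0 \\ 0.125 & -0.3125 & 0.0625 \end{pmatrix}$$

$$A \times \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = b = \begin{pmatrix} 14 \\ -6 \\ -10 \end{pmatrix} \qquad A^{-1} b = \begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix}$$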
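The result can be checked by writing out the product row by row. The expansion below is added here as a verification and is not part of the slides:

```latex
A^{-1} b =
\begin{pmatrix}
0.125 \times 14 + 0.4375 \times (-6) + (-0.1875) \times (-10) \\
0.25 \times 14 + 0.25 \times (-6) + 0 \times (-10) \\
0.125 \times 14 + (-0.3125) \times (-6) + 0.0625 \times (-10)
\end{pmatrix}
=
\begin{pmatrix}
1.75 - 2.625 + 1.875 \\
3.5 - 1.5 + 0 \\
1.75 + 1.875 - 0.625
\end{pmatrix}
=
\begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix}
```

Substituting back confirms the solution: the first row of A gives 1 + 4 + 9 = 14, the second −1 + 4 − 9 = −6 and the third −7 + 12 − 15 = −10.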
3. Derivative
Derivative
Derivative of f(x)
- the slope of the tangent to the curve at a given point
- if positive: the curve increases
- if negative: the curve decreases
- if zero: the curve neither increases nor decreases at that point
[Figure: curve of f, with the points x and x + h on the horizontal axis and the values f(x) and f(x + h) on the vertical axis]
More formally, the derivative can be defined as

$$\lim_{h \to 0} \frac{f(x+h) - f(x)}{(x+h) - x} = \lim_{h \to 0} \frac{f(x+h) - f(x)}{h}$$

Notations: $f'(x)$ or $\frac{df(x)}{dx}$

$\frac{df(x)}{dx}$ means the variation of f(x) if we increase x by a small value
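The definition suggests a simple numerical approximation: evaluate the quotient for a small but nonzero h. The C++ sketch below is an illustration only; the function names `derivative` and `square` and the default step `1e-6` are assumptions made here.

```cpp
#include <cassert>
#include <functional>

// Forward-difference approximation of f'(x): (f(x + h) - f(x)) / h for a small, nonzero h.
double derivative(const std::function<double(double)>& f, double x, double h = 1e-6) {
    assert(h != 0.0);
    return (f(x + h) - f(x)) / h;
}

// Example function: f(x) = x * x, whose exact derivative is 2 * x.
double square(double x) {
    return x * x;
}
```

For instance, `derivative(square, 3.0)` should be close to 6. The step h cannot be made arbitrarily small in floating-point arithmetic, because the subtraction f(x + h) − f(x) then loses precision.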
Properties of the derivative
Property of addition and product of the derivative

$$(f + g)' = f' + g'$$

$$(f \times g)' = f' \times g + f \times g'$$

As an exercise, you could try to prove the result for $(f \times g)'$
Property of the composition

$$(g \circ f)' = (g' \circ f) \times f'$$

in other words:

$$(g(f(x)))' = g'(f(x)) \times f'(x) \qquad \text{(DerivComp)}$$
Example: find the derivative of

$$h(x) = \sin(3x^2 + 2)$$

- let $g(x) = \sin(x)$, then $g'(x) = \cos(x)$
- let $f(x) = 3x^2 + 2$, then $f'(x) = 6x$

So

$$h'(x) = (g(f(x)))' = g'(f(x)) \times f'(x)$$

$$h'(x) = \cos(3x^2 + 2) \times 6x$$
Property of the inverse function

Let $f^{-1}(x)$ be the inverse function of $f(x)$, i.e. $f^{-1}(f(x)) = x$

$$[f^{-1}(f(x))]' = (x)' = 1$$

$$(f^{-1})'(f(x)) \times f'(x) = 1 \qquad \text{from (DerivComp)}$$

$$(f^{-1})'(f(x)) = \frac{1}{f'(x)} \qquad \text{(DerivInv)}$$
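As an illustration of (DerivInv), which is not part of the original slides, take $f(x) = x^2$ on $x > 0$, whose inverse is the square root. The sketch assumes the standard result $(x^2)' = 2x$:

```latex
f(x) = x^2 \ (x > 0), \qquad f^{-1}(y) = \sqrt{y}, \qquad f'(x) = 2x
\\
(f^{-1})'(f(x)) = \frac{1}{f'(x)} = \frac{1}{2x}
\\
\text{with } y = x^2 \text{ and } x = \sqrt{y}: \qquad (\sqrt{y})' = \frac{1}{2\sqrt{y}}
```

The derivative of the inverse is obtained without differentiating the square root directly.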
Derivative of $x^n$
$f(x) = x^n$
The derivative should have the following behaviour, with n even ($2k$) or odd ($2k + 1$), $k \geq 1$:

| x | −∞ | 0 | +∞ |
|---|---|---|---|
| $x^{2k}$ | − | 0 | + |
| $x^{2k+1}$ | + | 0 | + |
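Expansion of $(x + a)^n$

Remember that $x^0 = 1$

$$(x + a)^2 = x^2 + 2ax + a^2$$

$$(x + a)^3 = x^3 + 3ax^2 + 3a^2x + a^3$$

$$(x + a)^4 = x^4 + 4ax^3 + 6a^2x^2 + 4a^3x + a^4$$

$$\vdots$$

$$(x + a)^n = \alpha_{0,n}\, a^0 x^n + \dots + \alpha_{i,j}\, a^i x^j + \dots + \alpha_{n,0}\, a^n x^0$$

$$(x + a)^n = \sum_{i=0,\ j=n-i}^{i=n} \alpha_{i,j}\, a^i x^j$$

the coefficients $\alpha_{i,j}$ of $a^i x^j$ are given by Pascal's triangle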
A bit of history
Blaise Pascal [fr] (1623-1662)
- was a French mathematician, physicist, inventor, writer and Catholic theologian
- wrote a significant treatise on the subject of projective geometry at the age of 16
- worked on the principles of hydraulic fluids (hydraulic press and the syringe)
- wrote a theological work, referred to posthumously as the Pensées (Thoughts)
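Derivative of $x^n$

Pascal's triangle

| n | $x^n$ | $a x^{n-1}$ | $a^2 x^{n-2}$ | $a^3 x^{n-3}$ | $a^4 x^{n-4}$ |
|---|---|---|---|---|---|
| 0 | 1 | 0 | | | |
| 1 | 1 | 1 | 0 | | |
| 2 | 1 | 2 | 1 | 0 | |
| 3 | 1 | 3 | 3 | 1 | 0 |
| 4 | 1 | 4 | 6 | 4 | 1 |

Obviously, the coefficient $\alpha_{1,n-1}$ of $a x^{n-1}$ is n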
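$$f(x + h) - f(x) = (x + h)^n - x^n$$

$$= (x^n + nhx^{n-1} + \alpha_{2,n-2}\, h^2 x^{n-2} + \dots) - x^n$$

$$= nhx^{n-1} + \alpha_{2,n-2}\, h^2 x^{n-2} + \dots$$

$$\frac{f(x + h) - f(x)}{h} = \frac{nhx^{n-1} + \alpha_{2,n-2}\, h^2 x^{n-2} + \dots}{h} = nx^{n-1} + \underbrace{\alpha_{2,n-2}\, h x^{n-2} + \dots}_{\to\, 0}$$

$$\lim_{h \to 0} \frac{f(x + h) - f(x)}{h} = nx^{n-1}$$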
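The same steps can be followed on a concrete case. The worked instance below, for n = 3, is added here as an illustration and uses the coefficients 1, 3, 3, 1 of Pascal's triangle:

```latex
(x + h)^3 - x^3 = 3hx^2 + 3h^2x + h^3
\\
\frac{(x + h)^3 - x^3}{h} = 3x^2 + 3hx + h^2
\\
\lim_{h \to 0} \frac{(x + h)^3 - x^3}{h} = 3x^2
```

Every term other than $3x^2$ still contains a factor h after the division, so it vanishes in the limit.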
Function 1/x
Derivative of 1/x
$f(x) = 1/x$

The derivative should have the following behaviour:

| x | −∞ | 0 | +∞ |
|---|---|---|---|
| $1/x$ | − | NA | − |
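$$f(x + h) - f(x) = \frac{1}{x + h} - \frac{1}{x} = \frac{x - (x + h)}{x(x + h)} = \frac{-h}{x^2 + hx}$$

$$\frac{f(x + h) - f(x)}{h} = \frac{\frac{-h}{x^2 + hx}}{h} = \frac{-h}{h(x^2 + hx)} = \frac{-1}{x^2 + hx}$$

$$\lim_{h \to 0} \frac{f(x + h) - f(x)}{h} = -\frac{1}{x^2}$$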
Function log(x)
- written ln(x) or log(x)
- the natural logarithm of x is the power to which $e = 2.718281\ldots$ would have to be raised to equal x
- for example $\ln(7.5) = 2.0149\ldots$, because $e^{2.0149\ldots} = 7.5$
- used to replace products by sums
- other functions (logarithm in base n):

$$\log_n(x) = \frac{\ln(x)}{\ln(n)}$$
Derivative of log(x)
$f(x) = \log(x)$

The derivative should have the following behaviour:

| x | −∞ | 0 | +∞ |
|---|---|---|---|
| $\log(x)$ | NA | −∞ | + |
Properties of the function log(x)
- $\log(1) = 0$
- $\log(e) = 1$
- $\log(x \times y) = \log(x) + \log(y)$
- $\log(x / y) = \log(x) - \log(y)$
- $\log(x^n) = n \times \log(x)$
Derivative of log(x)
By definition

$$\log(a) = \int_1^a \frac{1}{x}\, dx$$

so the derivative of $\log(x)$ is $\frac{1}{x}$
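The same result follows from the definition of the derivative:

$$f(x + h) - f(x) = \log(x + h) - \log(x) = \log\left(\frac{x + h}{x}\right) = \log\left(1 + \frac{h}{x}\right)$$

$$\frac{f(x + h) - f(x)}{h} = \frac{1}{h} \times \log\left(1 + \frac{h}{x}\right) = \log\left(\left(1 + \frac{h}{x}\right)^{\frac{1}{h}}\right)$$

$$\lim_{h \to 0} \frac{f(x + h) - f(x)}{h} = \lim_{h \to 0} \log\left(\left(1 + \frac{h}{x}\right)^{\frac{1}{h}}\right) = \log\left(\lim_{h \to 0} \left(1 + \frac{h}{x}\right)^{\frac{1}{h}}\right) = \log\left(e^{\frac{1}{x}}\right) = \frac{1}{x}$$

The last limit uses $\lim_{n \to \infty} \left(1 + \frac{u}{n}\right)^n = e^u$ with $u = \frac{1}{x}$ and $n = \frac{1}{h}$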
Function exp(x)
(Jakob Bernoulli [ch], 1654-1705)
- the exponential function, also known as the antilogarithm
- $\exp(x) = e^x = \lim_{n \to \infty} \left(1 + \frac{x}{n}\right)^n$
- $e^1 = 2.718281\ldots$
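The limit can be explored numerically by evaluating the expression for a large n. The C++ sketch below is an illustration only: the function name `exp_limit` and the choice of n as a power of two (so that the power can be computed by repeated squaring) are assumptions made here.

```cpp
#include <cassert>

// Approximates exp(x) by (1 + x / n)^n with n = 2^k, computed by squaring k times.
double exp_limit(double x, int k) {
    assert(k >= 0 && k < 60);
    const double n = static_cast<double>(1ULL << k);
    double y = 1.0 + x / n;
    for (int i = 0; i < k; ++i) {
        y *= y;  // after i + 1 squarings, y = (1 + x / n)^(2^(i + 1))
    }
    return y;
}
```

With k = 20, that is n = 1048576, the value returned for x = 1 agrees with e to about five decimal places. The convergence is slow, which is why library implementations of exp do not use this formula.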
A bit of history
Jakob Bernoulli [ch] (1654-1705)
- his family, of Belgian origin, were refugees fleeing from persecution by the Spanish rulers of the Netherlands
- Swiss mathematician and astronomer
- theory of permutations and combinations (Bernoulli numbers), by which he derived the exponential series
- law of large numbers, in statistics, 1713
Derivative of exp(x)
f(x) = exp(x)
The derivative should have the following behaviour:

| x | −∞ | | 0 | | +∞ |
|---|---|---|---|---|---|
| f′(x) | | + | 1 | + | |
Properties of the function $e^x$
- $e^0 = 1$
- $e^x > 0 \ \forall x$
- $e^{-x} = \frac{1}{e^x}$
- $e^{x+y} = e^x \times e^y$
- $e^{x-y} = \frac{e^x}{e^y}$
- $e^{x \times y} = (e^x)^y$
Derivative of $e^x$
By definition $\log(e^x) = x$. So we compute the derivative of this last expression:
$$
\begin{aligned}
\log(e^x) &= x \\
(\log(e^x))' &= (x)' \\
\frac{1}{e^x} \times (e^x)' &= 1 \quad \text{by (DerivInv)} \\
(e^x)' &= e^x
\end{aligned}
$$
To sum up
Derivatives
$$
\begin{aligned}
(x^n)' &= n \, x^{n-1} \\
\left(\frac{1}{x}\right)' &= -\frac{1}{x^2} \\
(\log(x))' &= \frac{1}{x} \\
(e^x)' &= e^x
\end{aligned}
$$
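These four rules can be checked numerically with a symmetric difference quotient. The sketch below is not part of the original slides; the helper `central_difference`, the exponent n = 3 and the sample point x = 1.7 are assumptions chosen for illustration.

```python
import math

def central_difference(f, x, h=1e-6):
    """Numerical derivative of f at x by the symmetric difference quotient."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

n = 3    # arbitrary exponent for the power rule
x = 1.7  # arbitrary sample point, x > 0 so that log(x) is defined

# name -> (function, derivative stated in the summary)
rules = {
    "x**n": (lambda t: t ** n, lambda t: n * t ** (n - 1)),
    "1/x": (lambda t: 1.0 / t, lambda t: -1.0 / t ** 2),
    "log(x)": (math.log, lambda t: 1.0 / t),
    "exp(x)": (math.exp, math.exp),
}

for name, (f, df) in rules.items():
    print(name, central_difference(f, x), df(x))
```

For each rule, the numerical value and the closed form should agree to several decimal places.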
4. Gradient
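Definition of the gradient
Given a function of several variables f(x, y, z), the gradient is the vector of the partial derivatives of f:
$$
\nabla f(x, y, z) = \nabla f = \left[ \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}, \frac{\partial f}{\partial z} \right]
$$
The partial derivative $\frac{\partial f}{\partial x}$ is the derivative of f(x, y, z) when y and z are considered as constants:
$$
\frac{\partial f}{\partial x} = \frac{\partial f(x, y, z)}{\partial x} = \left. \frac{d f(x, y, z)}{dx} \right|_{y, z}
$$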
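The definition translates directly into a numerical approximation: vary one coordinate at a time while holding the others fixed. The following is a minimal sketch; `numerical_gradient` and the test function `example` are hypothetical names introduced here, not taken from the slides.

```python
def numerical_gradient(f, point, h=1e-6):
    """Approximate the gradient of f at `point` with central differences.

    Each partial derivative varies one coordinate and keeps the others fixed.
    """
    grad = []
    for i in range(len(point)):
        forward = list(point)
        backward = list(point)
        forward[i] += h
        backward[i] -= h
        grad.append((f(forward) - f(backward)) / (2.0 * h))
    return grad

def example(p):
    """Hypothetical test function f(x, y, z) = x**2 * y + z**3."""
    x, y, z = p
    return x ** 2 * y + z ** 3

# Analytic gradient for comparison: [2*x*y, x**2, 3*z**2]
print(numerical_gradient(example, [1.0, 2.0, 3.0]))
```

At the point (1, 2, 3) the analytic gradient is [4, 1, 27], which the printed values should approximate.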
Property of the gradient
- the gradient ∇f gives the direction in which the value of the function increases most steeply
- conversely, −∇f gives the direction in which the value of the function decreases most steeply
Use of the gradient
Finding the minimum of a function
- the gradient can be used to find the minimum of a function by progressively decreasing the coordinates, subtracting a fraction of the value of the gradient at each step
A convex quadratic function
Consider the following function:
$$
f(x, y) = (x^2 + 8 \times x - 4) + (y^2 + 6 \times y - 3)
$$
[Figure: plot of (x**2+8*x-4)+(y**2+6*y-3) for x and y between -10 and 10]
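A convex quadratic function
The gradient of the function is:
$$
\frac{\partial f}{\partial x} = 2 \times x + 8 \qquad \frac{\partial f}{\partial y} = 2 \times y + 6
$$
The minimum is found for $\frac{\partial f}{\partial x} = 0$ and $\frac{\partial f}{\partial y} = 0$:
$$
\frac{\partial f}{\partial x} = 0 \Rightarrow x = -4 \qquad \frac{\partial f}{\partial y} = 0 \Rightarrow y = -3
$$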
Gradient descent algorithm
```
Data: f(x)
Result: x*: the minimum of the function
initialise vector x;
while not terminate_condition do
    compute gradient ∇f;
    x = x − α × ∇f;
end
```
Algorithm 1: A very simple descent algorithm
- note that the terminate condition can be defined in different ways (improvement, number of iterations)
- α = 0.1 for example; if α is too big, the algorithm won't find the solution
Descent for a convex quadratic function
For the previous convex function we obtain this:
```
x0 = 3, y0 = 5, alpha = 0.1
gradient = (14, 16)
x1 = 1.5999, y1 = 3.4
gradient = (11.2, 12.8)
x2 = 0.48, y2 = 2.1199
gradient = (8.96, 10.2399)
...
x48 = -3.99984389478361, y48 = -2.999821594038412
gradient = (0.0003122104327797359, 0.000356811923175826)
x49 = -3.999875115826888, y49 = -2.9998572752307298
```
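A trace of this kind can be reproduced with a few lines of Python. The sketch below follows the simple descent algorithm with α = 0.1 and the starting point (3, 5); stopping after a fixed 49 iterations is an assumption chosen to match the length of the trace, and the function names are illustrative.

```python
def grad_f(x, y):
    """Gradient of f(x, y) = (x**2 + 8*x - 4) + (y**2 + 6*y - 3)."""
    return 2 * x + 8, 2 * y + 6

def gradient_descent(x, y, alpha=0.1, iterations=49):
    """Repeat the update x = x - alpha * grad f for a fixed number of steps."""
    history = [(x, y)]
    for _ in range(iterations):
        gx, gy = grad_f(x, y)
        x, y = x - alpha * gx, y - alpha * gy
        history.append((x, y))
    return history

history = gradient_descent(3.0, 5.0)
for k in (0, 1, 2, 48, 49):
    print(k, history[k])
```

Each step multiplies the distance to the minimum (-4, -3) by 1 - 2α = 0.8, which explains the steady but slow convergence.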
Difficulty of finding the minimum
It becomes more difficult to find the minimum if
- the function is not convex
- the function has many minima (Rastrigin or Himmelblau functions)
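To illustrate the second point, the sketch below runs plain gradient descent on the Himmelblau function, $f(x, y) = (x^2 + y - 11)^2 + (x + y^2 - 7)^2$, from two different starting points. The starting points, the step size 0.005 and the iteration count are assumptions picked for this illustration.

```python
def himmelblau(x, y):
    """Himmelblau function: four minima, all with value 0."""
    return (x ** 2 + y - 11) ** 2 + (x + y ** 2 - 7) ** 2

def himmelblau_grad(x, y):
    """Analytic gradient of the Himmelblau function."""
    a = x ** 2 + y - 11
    b = x + y ** 2 - 7
    return 4 * x * a + 2 * b, 2 * a + 4 * y * b

def descend(x, y, alpha=0.005, iterations=5000):
    """Plain gradient descent with a fixed step and iteration count."""
    for _ in range(iterations):
        gx, gy = himmelblau_grad(x, y)
        x, y = x - alpha * gx, y - alpha * gy
    return x, y

# Two hypothetical starting points; each run ends in a different minimum.
results = [descend(2.5, 2.5), descend(-3.0, -3.0)]
for x, y in results:
    print(round(x, 4), round(y, 4), himmelblau(x, y))
```

The minimum that is found depends on where the descent starts, so a single run says nothing about the other minima.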
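In Python
```python
import numpy as np
from scipy import optimize

def f(x):
    # objective: the convex quadratic function defined earlier
    return x[0]**2 + 8*x[0] - 4 + x[1]**2 + 6*x[1] - 3

def fprime(x):
    # analytic gradient of f
    return np.array([2*x[0] + 8, 2*x[1] + 6])

# BFGS quasi-Newton minimisation starting from (3, 5)
z = optimize.fmin_bfgs(f, [3, 5], fprime=fprime)
print(z)
```
Output:
```
Optimization terminated successfully.
         Current function value: -32.000000
         Iterations: 2
         Function evaluations: 4
         Gradient evaluations: 4
[-4. -3.]
```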
5. Lagrangian
A bit of history
Joseph-Louis Lagrange, born Giuseppe Lodovico Lagrangia [it, fr] (1736-1813)
- Franco-Italian mathematician and astronomer
- made significant contributions to the fields of analysis, number theory, and both classical and celestial mechanics
- in 1787, at age 51, moved from Berlin to Paris and became a member of the French Academy of Sciences
- remained in France until the end of his life
Lagrangian multipliers
Principle
- you want to minimize or maximize f(x) subject to g(x) = 0
- under certain conditions:
  - f(x) is a quadratic function
  - g(x) are linear constraints
- define the function
$$
L(x, \alpha) = f(x) + \alpha \, g(x)
$$
where α ∈ ℝ is called the Lagrange multiplier (for an equality constraint its sign is not restricted)
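Resolution
- a solution is a point where the gradient of L(x, α) is 0
- so compute and solve
$$
\frac{\partial L(x, \alpha)}{\partial x} = 0 \qquad \frac{\partial L(x, \alpha)}{\partial \alpha} = g(x) = 0
$$
- or substitute the result back into L(x, α)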
Method of Lagrange - example
Statement of the example
Suppose you want to put a fence around a field which has the form of a rectangle with sides x and y, and you want to maximize the area knowing that you have P meters of fence:
$$
\begin{cases}
\text{Max } x \times y \\
\text{such that } P = 2x + 2y
\end{cases}
$$
then $f(x, y) = xy$ and $g(x, y) = P - 2x - 2y = 0$:
$$
\begin{cases}
\text{Max } xy \\
\text{such that } P - 2x - 2y = 0
\end{cases}
$$
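Lagrange formulation
$$
L(x, y, \alpha) = xy + \alpha (P - 2x - 2y)
$$
the derivatives give us
$$
\begin{aligned}
\frac{\partial L(x, y, \alpha)}{\partial x} &= y - 2\alpha = 0 \\
\frac{\partial L(x, y, \alpha)}{\partial y} &= x - 2\alpha = 0 \\
\frac{\partial L(x, y, \alpha)}{\partial \alpha} &= P - 2x - 2y = 0
\end{aligned}
$$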
Resolution
- the first two equations give us y = 2α = x, so x = y; in other words, the field is a square
- the last one gives P = 4x = 4y, so x = y = P/4
- consequently α = P/8, because α = x/2 = y/2
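The result can be cross-checked by brute force: eliminate y with the constraint and scan the remaining variable. This is a sketch under stated assumptions: P = 40 meters and the scan resolution are hypothetical values, and the scan is a plain search, not the Lagrange method itself.

```python
def area(x, perimeter):
    """Area of a rectangle with side x once y is fixed by 2x + 2y = perimeter."""
    y = perimeter / 2.0 - x
    return x * y

P = 40.0      # hypothetical amount of fence, in meters
steps = 4000  # resolution of the scan over x in [0, P/2]

# Scan the feasible side lengths and keep the one with the largest area.
candidates = [i * (P / 2.0) / steps for i in range(steps + 1)]
best_x = max(candidates, key=lambda x: area(x, P))
best_y = P / 2.0 - best_x
alpha = best_x / 2.0  # from y - 2*alpha = 0 and x = y

print(best_x, best_y, area(best_x, P), alpha)
```

The scan should land on x = y = P/4, with α = P/8, in agreement with the analytic resolution.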
6. Exercises
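Matrix - Matrix multiplication
Consider the following matrices:
$$
A = \begin{bmatrix} 1 & 0 & -2 \\ 4 & 5 & -3 \\ 6 & -3 & -2 \end{bmatrix}
\qquad
B = \begin{bmatrix} 0.5 & 7 & 3 \\ 8 & -6 & 2 \\ 1 & -2 & 3 \end{bmatrix}
$$
Perform the products by hand and compare:
$$
A \times B \stackrel{?}{=} B \times A
$$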
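Matrix - Vector multiplication
Consider the following matrix and vector:
$$
A = \begin{bmatrix} 1 & 0 & -2 \\ 4 & 5 & -3 \\ 6 & -3 & -2 \end{bmatrix}
\qquad
x = \begin{bmatrix} 0.5 \\ 8 \\ 1 \end{bmatrix}
$$
Perform the products by hand and compare:
$$
A \times x \stackrel{?}{=} x^T \times A
$$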
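Vector - Vector multiplication
Consider the following vectors:
$$
x = \begin{bmatrix} 1 \\ 4 \\ 6 \end{bmatrix}
\qquad
y = \begin{bmatrix} -3 \\ 8 \\ 1 \end{bmatrix}
$$
Perform the following operations:
$$
x \times y \quad \text{and} \quad x \odot y
$$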
Check results in Python
Write a program in Python to check the results of the different matrix and vector products.
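One possible starting point for this exercise is sketched below in plain Python, with matrices stored as lists of rows. The helper `matmul` is a hypothetical name introduced here; with NumPy arrays the `@` operator would be the more usual choice. Only the matrix-matrix and matrix-vector products are covered.

```python
def matmul(P, Q):
    """Product of two matrices stored as lists of rows."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))]
            for i in range(len(P))]

A = [[1, 0, -2], [4, 5, -3], [6, -3, -2]]
B = [[0.5, 7, 3], [8, -6, 2], [1, -2, 3]]
x = [0.5, 8, 1]

AB = matmul(A, B)
BA = matmul(B, A)
print("A x B =", AB)
print("B x A =", BA)
print("equal?", AB == BA)

# A times the column vector x, and the row vector x^T times A.
Ax = [row[0] for row in matmul(A, [[v] for v in x])]
xTA = matmul([x], A)[0]
print("A x x   =", Ax)
print("x^T x A =", xTA)
```

The two matrix products differ, and so do A × x and xᵀ × A, which is the point of the comparison asked for in the exercises.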
Derivative rules
Remember to use the following rules:
$$
\begin{aligned}
(f + g)' &= f' + g' && \text{(dSum)} \\
(f \times g)' &= f' \times g + f \times g' && \text{(dProd)} \\
(f(g(x)))' &= f'(g(x)) \times g'(x) && \text{(dComp)}
\end{aligned}
$$
Derivatives
Write a Python program to draw the following functions (use matplotlib) and compute their derivatives by hand:
$$
f_1(x) = \frac{1 + \frac{1}{x}}{x - 3}
\qquad
f_2(x) = \frac{1}{x^2 + e^x}
\qquad
f_3(x) = \frac{x \times e^x}{1 + e^x}
$$
Derivatives
The results are the following:
$$
\begin{aligned}
f_1'(x) &= -\frac{x^2 + 2x - 3}{(x - 3)^2 \, x^2} \\
f_2'(x) &= -\frac{e^x + 2x}{(e^x + x^2)^2} \\
f_3'(x) &= \frac{e^x (e^x + x + 1)}{(e^x + 1)^2}
\end{aligned}
$$
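Before working through the derivations by hand, these closed forms can be compared with a numerical derivative. The sketch below uses a symmetric difference quotient; the helper `central_difference` and the sample point 1.5 are assumptions made for illustration.

```python
import math

def f1(x):
    return (1 + 1 / x) / (x - 3)

def f2(x):
    return 1 / (x ** 2 + math.exp(x))

def f3(x):
    return x * math.exp(x) / (1 + math.exp(x))

# Closed forms stated in the results.
def df1(x):
    return -(x ** 2 + 2 * x - 3) / ((x - 3) ** 2 * x ** 2)

def df2(x):
    return -(math.exp(x) + 2 * x) / (math.exp(x) + x ** 2) ** 2

def df3(x):
    return math.exp(x) * (math.exp(x) + x + 1) / (math.exp(x) + 1) ** 2

def central_difference(f, x, h=1e-6):
    """Numerical derivative by the symmetric difference quotient."""
    return (f(x + h) - f(x - h)) / (2 * h)

x0 = 1.5  # arbitrary sample point away from the poles x = 0 and x = 3
for f, df in ((f1, df1), (f2, df2), (f3, df3)):
    print(f.__name__, central_difference(f, x0), df(x0))
```

Agreement at one point is not a proof, but a mismatch would immediately reveal an algebra mistake.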
Derivative of f1(x)
$$
f_1(x) = \left(1 + \frac{1}{x}\right) \times \frac{1}{x - 3} = F(x) \times G(x)
$$
So we need to apply the formula (dProd).
$$
F(x) = 1 + \frac{1}{x} \qquad F'(x) = -\frac{1}{x^2}
$$
$$
G(x) = \frac{1}{x - 3} = H(K(x)) \quad \text{with} \quad H(z) = \frac{1}{z}, \quad K(z) = z - 3
$$
$$
G'(x) = H'(K(x)) \times K'(x) = -\frac{1}{(x - 3)^2} \times 1
$$
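Finally
$$
\begin{aligned}
f_1'(x) &= -\frac{1}{x^2} \times \frac{1}{x - 3} + \left(1 + \frac{1}{x}\right) \times \left(-\frac{1}{(x - 3)^2}\right) \\
&= -\frac{1}{x^2} \times \frac{x - 3}{(x - 3)^2} - \frac{(x + 1) \, x}{x^2} \times \frac{1}{(x - 3)^2} \\
&= -\frac{(x - 3) + (x + 1) \, x}{x^2 (x - 3)^2} \\
&= -\frac{x^2 + 2x - 3}{x^2 (x - 3)^2}
\end{aligned}
$$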
Derivative of f2(x)
$$
f_2(x) = \frac{1}{x^2 + e^x} = F(G(x))
$$
So we need to apply the formula (dComp) with
$$
F(z) = \frac{1}{z} \qquad G(z) = z^2 + e^z
$$
$$
F'(z) = -\frac{1}{z^2} \qquad G'(z) = 2z + e^z \quad \text{(dSum)}
$$
Finally
$$
f_2'(x) = -\frac{1}{(x^2 + e^x)^2} \times (2x + e^x) = -\frac{2x + e^x}{(x^2 + e^x)^2}
$$
Derivatives
- check the results with the derivative calculator
- write a program in Python using sympy to compute the derivatives of the functions
Gradient
- determine where f1(x) is minimum
- using the gradient method, try to determine where f2(x) and f3(x) are maximum or minimum
Gradient/derivative of f1
The derivative of f1 is:
$$
f_1'(x) = -\frac{x^2 + 2x - 3}{x^2 (x - 3)^2}
$$
There is an extremum (maximum or minimum) where $f_1'(x) = 0$.
The derivative is not defined where the denominator $x^2 (x - 3)^2$ is equal to 0:
$$
\begin{cases}
x^2 = 0 \Rightarrow x = 0 \\
(x - 3)^2 = 0 \Rightarrow x = 3
\end{cases}
$$
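The derivative is equal to 0 if:
$$
x^2 + 2x - 3 = 0
$$
$$
\Delta = b^2 - 4ac = 2^2 - 4 \times 1 \times (-3) = 16
$$
$$
x_1 = \frac{-b - \sqrt{\Delta}}{2a} = \frac{-2 - 4}{2 \times 1} = -3
\qquad
x_2 = \frac{-b + \sqrt{\Delta}}{2a} = \frac{-2 + 4}{2 \times 1} = +1
$$
then
$$
x^2 + 2x - 3 = (x - 1)(x + 3)
$$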
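The two roots can be recomputed and classified with a short script. This is a sketch, not part of the original slides: the helper `classify` and its tolerance `eps` are assumptions, and it labels local extrema only, from the sign of the derivative on each side of the root.

```python
import math

def f1(x):
    return (1 + 1 / x) / (x - 3)

def df1(x):
    return -(x ** 2 + 2 * x - 3) / (x ** 2 * (x - 3) ** 2)

# Roots of x**2 + 2*x - 3 = 0 with the discriminant formula.
a, b, c = 1, 2, -3
delta = b ** 2 - 4 * a * c
roots = [(-b - math.sqrt(delta)) / (2 * a), (-b + math.sqrt(delta)) / (2 * a)]

def classify(r, eps=1e-3):
    """Label a stationary point from the sign of f1' just left and right of it."""
    left, right = df1(r - eps), df1(r + eps)
    if left < 0 < right:
        return "minimum"
    if left > 0 > right:
        return "maximum"
    return "undetermined"

for r in roots:
    print(r, f1(r), classify(r))
```

The script reports a local minimum at x = -3 and a local maximum at x = 1. Because f1 has poles at x = 0 and x = 3, these are local extrema only: the value at the local maximum, f1(1) = -1, is lower than the value at the local minimum, f1(-3) = -1/9.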
Gradient/derivative of f2
The derivative of f2 is:
$$
f_2'(x) = -\frac{2x + e^x}{(x^2 + e^x)^2}
$$
The denominator $(x^2 + e^x)^2$ is always positive, so we need to solve:
$$
2x + e^x = 0
$$
which is not possible by analytical methods; we need to find the root by using an approximation method.
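One simple approximation method is bisection, sketched below. It is a different technique from gradient ascent and is shown only as a cross-check; the bracket [-1, 0], the tolerance and the function names are assumptions made for this illustration.

```python
import math

def g(x):
    """Numerator of f2'(x) up to sign: 2*x + exp(x)."""
    return 2 * x + math.exp(x)

def bisect(func, lo, hi, tol=1e-12):
    """Bisection: halve [lo, hi] while keeping a sign change inside it."""
    if func(lo) * func(hi) > 0:
        raise ValueError("func must change sign on [lo, hi]")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if func(lo) * func(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2

# g(-1) < 0 and g(0) > 0, so a root lies in [-1, 0].
root = bisect(g, -1.0, 0.0)
print(root, g(root))
```

Since g is increasing, this root is unique; f2' is positive to its left and negative to its right, so it is a maximum of f2.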
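Gradient/derivative of f2 (1/2)
```python
import math
import numpy as np

def gradient_ascent(x, df):
    """Follow the derivative df uphill from x until the step is below delta."""
    delta = 0.0000001  # stop when two successive iterates are this close
    alpha = 0.1        # step size
    while True:
        x_n = x + alpha * df(x)
        if math.fabs(x - x_n) < delta:
            break
        x = x_n
    return x
```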
Gradient/derivative of f2 (2/2)
```python
def f2(x):
    return 1/(x*x + np.exp(x))

def df2(x):
    # derivative of f2: -(2x + e^x) / (x^2 + e^x)^2
    return (-2*x - np.exp(x))/(x**2 + np.exp(x))**2

# start the ascent at x = -0.5, where df2 is positive
x2_star = gradient_ascent(-0.5, df2)
print("f2: ", x2_star, " => ", f2(np.asarray([x2_star])))
```
Lagrangian
Consider the following problem:
$$
\begin{cases}
\text{Min } x^2 + y^2 + z^2 \\
\text{such that } x + 2y + z = 1 \\
\phantom{\text{such that }} 2x - y - 3z = 4
\end{cases}
$$
Use the method of Lagrange to solve it and determine x, y and z.
Lagrangian resolution
We define
$$
L(x, y, z, \alpha_1, \alpha_2) = f(x, y, z) + \alpha_1 (x + 2y + z - 1) + \alpha_2 (2x - y - 3z - 4)
$$
with $f(x, y, z) = x^2 + y^2 + z^2$
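We need to compute the partial derivatives of L for each variable:
$$
\begin{aligned}
\frac{\partial L(x, y, z, \alpha_1, \alpha_2)}{\partial x} &= 2x + \alpha_1 + 2\alpha_2 = 0 && (1) \\
\frac{\partial L(x, y, z, \alpha_1, \alpha_2)}{\partial y} &= 2y + 2\alpha_1 - \alpha_2 = 0 && (2) \\
\frac{\partial L(x, y, z, \alpha_1, \alpha_2)}{\partial z} &= 2z + \alpha_1 - 3\alpha_2 = 0 && (3) \\
\frac{\partial L(x, y, z, \alpha_1, \alpha_2)}{\partial \alpha_1} &= x + 2y + z - 1 = 0 && (4) \\
\frac{\partial L(x, y, z, \alpha_1, \alpha_2)}{\partial \alpha_2} &= 2x - y - 3z - 4 = 0 && (5)
\end{aligned}
$$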
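Express $\alpha_1$ from x and $\alpha_2$ in (1):
$$
\begin{aligned}
-2x - 2\alpha_2 &= \alpha_1 && (1) \\
2y + 2\alpha_1 - \alpha_2 &= 0 && (2) \\
2z + \alpha_1 - 3\alpha_2 &= 0 && (3) \\
x + 2y + z - 1 &= 0 && (4) \\
2x - y - 3z - 4 &= 0 && (5)
\end{aligned}
$$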
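Then replace in (2) and (3):
$$
\begin{aligned}
-2x - 2\alpha_2 &= \alpha_1 && (1) \\
2y + 2(-2x - 2\alpha_2) - \alpha_2 = 0 &\Rightarrow 2y - 4x - 5\alpha_2 = 0 && (2) \\
2z + (-2x - 2\alpha_2) - 3\alpha_2 = 0 &\Rightarrow 2z - 2x - 5\alpha_2 = 0 && (3) \\
x + 2y + z - 1 &= 0 && (4) \\
2x - y - 3z - 4 &= 0 && (5)
\end{aligned}
$$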
By subtracting (3) from (2) we get $-2x + 2y - 2z = 0$, that is $x - y + z = 0$. And finally we have a system of 3 equations with 3 variables:
$$
\begin{aligned}
x - y + z &= 0 && (6) = (2) - (3) \\
x + 2y + z - 1 &= 0 && (4) \\
2x - y - 3z - 4 &= 0 && (5)
\end{aligned}
$$
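Then we compute (6) − (4) and obtain $y = \frac{1}{3}$.
The rest of the resolution is straightforward and we should get
$$
x = \frac{16}{15} \qquad y = \frac{1}{3} \qquad z = -\frac{11}{15} \qquad \alpha_1 = -\frac{52}{75} \qquad \alpha_2 = -\frac{54}{75}
$$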
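These values can be verified exactly with rational arithmetic. The sketch below plugs them into the left-hand sides of equations (1) to (5); it relies only on the standard `fractions` module, and the variable names are illustrative.

```python
from fractions import Fraction as F

x, y, z = F(16, 15), F(1, 3), F(-11, 15)
a1, a2 = F(-52, 75), F(-54, 75)

# Left-hand sides of equations (1) to (5); each should be exactly 0.
residuals = [
    2 * x + a1 + 2 * a2,      # (1)
    2 * y + 2 * a1 - a2,      # (2)
    2 * z + a1 - 3 * a2,      # (3)
    x + 2 * y + z - 1,        # (4)
    2 * x - y - 3 * z - 4,    # (5)
]
print(residuals)
print("f(x, y, z) =", x ** 2 + y ** 2 + z ** 2)
```

All five residuals are zero, so the stated point is a stationary point of L that satisfies both constraints.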
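In Python:
```python
import numpy as np

# Rows are equations (1) to (5); columns are x, y, z, alpha1, alpha2.
# Signs follow L = f + alpha1*g1 + alpha2*g2 as defined in the resolution.
A = np.asarray([[2, 0, 0, 1, 2], [0, 2, 0, 2, -1],
                [0, 0, 2, 1, -3], [1, 2, 1, 0, 0], [2, -1, -3, 0, 0]])
b = np.asarray([0, 0, 0, 1, 4])
x = np.linalg.solve(A, b)
print(x)
```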
6. End
University of Angers - Faculty of Sciences
UA - Angers
2 Boulevard Lavoisier
49045 Angers Cedex 01
Tel: (+33) (0)2-41-73-50-72