Page 1

Mustafa Jarrar: Lecture Notes on Linear Regression, Birzeit University, 2018

Mustafa Jarrar, Birzeit University

Machine Learning: Linear Regression

Version 1

Page 2

More Online Courses at: http://www.jarrar.info
Course Page: http://www.jarrar.info/courses/AI/

Watch this lecture and download the slides

Acknowledgement: This lecture is based on (but not limited to) Andrew Ng’s course about Machine Learning https://www.youtube.com/channel/UCMoXOGX9mgrYNEwpcIQUcag

Page 3

In this lecture:

→ Part 1: Motivation (Regression Problems)
• Part 2: Linear Regression
• Part 3: The Cost Function
• Part 4: The Gradient Descent Algorithm
• Part 5: The Normal Equation
• Part 6: Linear Algebra overview
• Part 7: Using Octave
• Part 8: Using R


Page 4

Motivation

Given the following housing prices, how much does a house of 1250 ft² cost?

[Scatter plot: Price (in 1000s of dollars) against Size (feet²); a fitted straight line passes near the point (1250, 230)]

We may assume a linear curve fits the data, and so conclude that a house of 1250 ft² costs about $230K.

Page 5

Motivation

Given the following housing prices, i.e., given the right answers for each example in the data (training data):

[Scatter plot: Price (in 1000s of dollars) against Size (feet²)]

This is Supervised Learning, a Regression Problem: predict real-valued output. Remember that classification (not regression) refers to predicting discrete-valued output.

Page 6

Motivation

Given the following training set of housing prices:

Size in feet² (x) | Price ($) in 1000s (y)
2104 | 460
1416 | 232
1534 | 315
852 | 178
… | …

Notation:
m = number of training examples
x's = "input" variable/features
y's = "output" variable/target variable
(x, y): a training example
(xi, yi): the i-th training example

For example: x1 = 2104, y1 = 460.

Our job is to learn from this data how to predict prices.

Page 7

In this lecture:

• Part 1: Motivation (Regression Problems)
→ Part 2: Linear Regression
• Part 3: The Cost Function
• Part 4: The Gradient Descent Algorithm
• Part 5: The Normal Equation
• Part 6: Linear Algebra overview
• Part 7: Using Octave
• Part 8: Using R


Page 8

Linear Regression

[Scatter plot: data points (x, y) with a fitted straight line h(x)]

Training Set → Learning Algorithm → Hypothesis h

h maps x's to y's: Size of house → h → Estimated Price

How to represent h?

h(x) = θ0 + θ1x

• This is a linear function.
• Also called linear regression with one variable.
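As a small illustration (the θ values here are arbitrary, not fitted), this hypothesis can be written directly in Octave, which we use later in this lecture:

theta0 = 1; theta1 = 0.5;        % example parameter values
h = @(x) theta0 + theta1 .* x;   % h(x) = theta0 + theta1*x
h(2)                             % ans = 2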

Page 9

Linear Regression with Multiple Features

Suppose we have the following features:

x1 (Size ft²) | x2 (bedrooms) | x3 (floors) | x4 (Age) | y (Price)
2104 | 5 | 1 | 45 | 460
1416 | 3 | 2 | 40 | 232
1534 | 3 | 2 | 30 | 315
852 | 2 | 1 | 36 | 178
… | … | … | … | …

A hypothesis function h(x) might be:

h(x) = θ0 + θ1x1 + θ2x2 + θ3x3 + … + θnxn

or, with x0 = 1:

h(x) = θ0x0 + θ1x1 + θ2x2 + θ3x3 + … + θnxn

e.g., h(x) = 80 + 0.1·x1 + 0.01·x2 + 3·x3 − 2·x4

Linear regression with multiple features is also called multiple linear regression.
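With multiple features, the hypothesis is conveniently computed as a vector product. A minimal Octave sketch, using the example θs above and the first training row:

theta = [80; 0.1; 0.01; 3; -2];  % [theta0; theta1; theta2; theta3; theta4]
x = [1; 2104; 5; 1; 45];         % x0 = 1, then Size, bedrooms, floors, Age
h = theta' * x                   % 80 + 0.1*2104 + 0.01*5 + 3*1 - 2*45 = 203.45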

Page 10

Polynomial Regression

Suppose we have the same features:

x1 (Size ft²) | x2 (bedrooms) | x3 (floors) | x4 (Age) | y (Price)
2104 | 5 | 1 | 45 | 460
1416 | 3 | 2 | 40 | 232
1534 | 3 | 2 | 30 | 315
852 | 2 | 1 | 36 | 178
… | … | … | … | …

Another hypothesis function h(x) might be polynomial:

h(x) = θ0x0 + θ1x1 + θ2x2² + θ3x3³ + … + θnxnⁿ   (x0 = 1)

Page 11

Polynomial Regression

[Scatter plot: Price (y) against Size (x), with three candidate curves fitted to the data:]

h(x) = θ0 + θ1x + θ2x²
h(x) = θ0 + θ1x + θ2x² + θ3x³
h(x) = θ0 + θ1x + θ2√x

• We may combine features (e.g., size = width × depth).
• We have the option of which features and which models (quadratic, cubic, …) to use.
• Deciding which features and models best fit our data and application is beyond the scope of this course, but there are several algorithms for this.
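One common way to realize such models is to build the powers of a feature as new columns and then reuse ordinary linear regression, since the hypothesis stays linear in the θs. A hedged Octave sketch for the cubic model:

sz = [2104; 1416; 1534; 852];                % the Size feature
X = [ones(length(sz),1), sz, sz.^2, sz.^3];  % columns: x0 = 1, x, x^2, x^3
% h(x) = theta0 + theta1*x + theta2*x^2 + theta3*x^3 is linear in the thetas,
% so gradient descent or the Normal Equation (later parts) apply unchanged.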

Page 12

In this lecture:

• Part 1: Motivation (Regression Problems)
• Part 2: Linear Regression
→ Part 3: The Cost Function
• Part 4: The Gradient Descent Algorithm
• Part 5: The Normal Equation
• Part 6: Linear Algebra overview
• Part 7: Using Octave
• Part 8: Using R


Page 13

Understanding the θ Values

[Scatter plot: data points (x, y) with a fitted line h(x)]

h(x) = θ0 + θ1x

θi's: Parameters

How to choose the θi's?

Page 14

Understanding the θ Values

[Three plots on 0..3 axes, one per parameter choice:]

h(x) = 1 + 0.5x    (θ0 = 1, θ1 = 0.5)
h(x) = 0 + 0.5x    (θ0 = 0, θ1 = 0.5)
h(x) = 1.5 + 0·x   (θ0 = 1.5, θ1 = 0)

Page 15

The Cost Function

[Scatter plot: data points (x, y) with the line h(x)]

Idea: choose θ0, θ1 so that h(x) is close to y for our training examples (x, y).

y is the actual value; h(x) is the estimated value.

J(θ0, θ1) = (1 / (2m)) ∑i=1..m ( h(xi) − yi )²

This cost function is also called a squared error function. The cost function J aims to find the θ0, θ1 that minimize the error.
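A minimal Octave sketch of this cost (the name computeCost is ours, not from the slides):

function J = computeCost(X, y, theta)
  % X: m-by-(n+1) matrix of inputs; y: m-by-1 actual values;
  % theta: (n+1)-by-1 parameters
  m = length(y);
  errors = X * theta - y;            % h(xi) - yi for all i at once
  J = (1 / (2*m)) * sum(errors.^2);  % squared error cost
end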

Page 16

Understanding the Cost Function

Let's try different values of the θs and see how the cost function J(θ0, θ1) behaves. Take the training examples (1,1), (2,2), (3,3), fix θ0 = 0, and vary θ1, so h(x) = θ1·x and m = 3:

J(1) = (1 / (2·3)) [0² + 0² + 0²] = 0
J(0.5) = (1 / (2·3)) [(0.5−1)² + (1−2)² + (1.5−3)²] ≈ 0.58
J(0) = (1 / (2·3)) [1² + 2² + 3²] ≈ 2.3

[Plot: J(θ1) against θ1, a bowl-shaped curve with its minimum at θ1 = 1]

Given our training data, θ1 = 1 is the best value. But this figure plots only θ1; the problem becomes more complicated if we also have θ0, i.e., J(θ0, θ1).
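The three values above can be reproduced with the computeCost sketch above (θ0 = 0 is dropped, so X is just the column of x values):

X = [1; 2; 3];  y = [1; 2; 3];   % the toy training set
computeCost(X, y, 1)             % ans = 0
computeCost(X, y, 0.5)           % ans ≈ 0.58
computeCost(X, y, 0)             % ans ≈ 2.33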

Page 17

Cost Function Intuition II

Hypothesis:  h(x) = θ0 + θ1x

Parameters:  θ0, θ1

Cost Function:  J(θ0, θ1) = (1 / (2m)) ∑i=1..m ( h(xi) − yi )²

Our Goal:  minimize J(θ0, θ1)

Remember that our goal is to find the values of θ0 and θ1 that minimize J.

Page 18

Cost Function Intuition II

Given any dataset, when we plot the cost function J(θ0, θ1), we may get a 3D surface like this:

[3D surface plot of J(θ0, θ1)]

Given a dataset, our goal is to find the minimum of J(θ0, θ1).

Page 19

Cost Function Intuition II

We may also draw the cost function using contour figures:

[Contour plot of J(θ0, θ1)]

Page 20

Cost Function Intuition II

[Contour plot of J(θ0, θ1) with a point marked, and the corresponding line h(x) = 800 + 0.15·x drawn over the data]

Page 21

Cost Function Intuition II

[Contour plot of J(θ0, θ1) with a point marked, and the corresponding line h(x) = 360 + 0·x drawn over the data]

Page 22

Cost Function Intuition II

[Contour plot of J(θ0, θ1) with a point near the minimum]

Is there any way/algorithm to find the θs automatically? Yes, e.g., the Gradient Descent Algorithm.

Page 23

In this lecture:

• Part 1: Motivation (Regression Problems)
• Part 2: Linear Regression
• Part 3: The Cost Function
→ Part 4: The Gradient Descent Algorithm
• Part 5: The Normal Equation
• Part 6: Linear Algebra overview
• Part 7: Using Octave
• Part 8: Using R


Page 24

The Gradient Descent Algorithm

Repeat until convergence {
  temp0 := θ0 − α · ∂/∂θ0 J(θ0, θ1)
  temp1 := θ1 − α · ∂/∂θ1 J(θ0, θ1)
  θ0 := temp0
  θ1 := temp1
}

α is the learning rate; ∂/∂θj is the partial derivative, where j is 0 or 1.

Start with some initial values of θ0 and θ1, and keep changing θ0 and θ1 to reduce J(θ0, θ1), until hopefully we end up at a minimum.
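To make the loop concrete, here is a runnable Octave sketch of the same update pattern on a deliberately simple J(θ0, θ1) = θ0² + θ1² (chosen for illustration; the regression cost comes next):

alpha = 0.1;               % learning rate
theta0 = 3;  theta1 = -2;  % some initial values
for iter = 1:100           % "repeat until convergence" (fixed count here)
  temp0 = theta0 - alpha * (2*theta0);  % d/dtheta0 J = 2*theta0
  temp1 = theta1 - alpha * (2*theta1);  % d/dtheta1 J = 2*theta1
  theta0 = temp0;  theta1 = temp1;      % simultaneous update via the temps
end
[theta0, theta1]           % both approach 0, the minimum of this J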

Page 25

Problems of the Gradient Descent Algorithm

[Two plots of J(θ1): with a small α the descent takes many tiny steps; with a large α the steps overshoot the minimum]

If α is too small, gradient descent can be slow.

If α is too large, gradient descent can overshoot the minimum; it may fail to converge, or even diverge.

Depending on the starting point, gradient descent may converge to the global minimum or to a local minimum.

Page 26

The Gradient Descent for Linear Regression

Repeat until convergence {
  θ0 := θ0 − α · (1/m) · ∑i=1..m ( h(xi) − yi )
  θ1 := θ1 − α · (1/m) · ∑i=1..m ( h(xi) − yi ) · xi
}

This is the simplified version for linear regression, updating θj for j = 0 and 1 simultaneously.

Next, we will use Linear Algebra to numerically find the optimal θs (the Normal Equation), without needing an iterative algorithm like Gradient Descent. However, Gradient Descent scales better for bigger datasets.
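A runnable Octave sketch of exactly these two updates, vectorized, on a tiny made-up dataset:

X = [1 1; 1 2; 1 3];  y = [1; 2; 3];  % x0 = 1 column plus one feature
theta = zeros(2,1);  alpha = 0.1;  m = length(y);
for iter = 1:1500
  errors = X*theta - y;                         % h(xi) - yi
  theta = theta - alpha * (1/m) * (X'*errors);  % theta0 and theta1 together
end
theta                                 % approaches [0; 1], i.e., h(x) = x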

Page 27

In this lecture:

• Part 1: Motivation (Regression Problems)
• Part 2: Linear Regression
• Part 3: The Cost Function
• Part 4: The Gradient Descent Algorithm
→ Part 5: The Normal Equation
• Part 6: Linear Algebra overview
• Part 7: Using Octave
• Part 8: Using R


Page 28

Minimizing θs numerically

Another way to numerically estimate the optimal values of the θs (using linear algebra). Given the following features:

x1 (Size ft²) | x2 (bedrooms) | x3 (floors) | x4 (Age) | y (Price in $1000s)
2104 | 5 | 1 | 45 | 460
1416 | 3 | 2 | 40 | 232
1534 | 3 | 2 | 30 | 315
852 | 2 | 1 | 36 | 178
… | … | … | … | …

Convert the features and the target values into matrices:

X = [ 2104 5 1 45
      1416 3 2 40
      1534 3 2 30
       852 2 1 36 ]

y = [ 460
      232
      315
      178 ]

Page 29

Minimizing θs numerically

X = [ 1 2104 5 1 45
      1 1416 3 2 40
      1 1534 3 2 30
      1  852 2 1 36 ]

y = [ 460
      232
      315
      178 ]

Here we add x0 = 1 to every example, so as to represent θ0.

Page 30

The Normal Equation

To obtain the optimal (minimizing) values of the θs, use the following Normal Equation:

θ = (Xᵀ·X)⁻¹ · Xᵀ · y

In Octave: pinv(X'*X)*X'*y

Where:
θ: the set of θs we want to find
X: the set of features in the training set
y: the output values we try to predict
Xᵀ: the transpose of X
(XᵀX)⁻¹: the inverse of the matrix XᵀX
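A quick Octave check on synthetic data (made up here so the expected answer is known):

X = [1 1; 1 2; 1 3; 1 4];  % x0 = 1 plus one feature
y = [3; 5; 7; 9];          % generated from y = 1 + 2x
theta = pinv(X'*X)*X'*y    % returns [1; 2]: theta0 = 1, theta1 = 2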

Page 31

Gradient Descent vs. Normal Equation

(m training examples, n features)

Gradient Descent:
• Needs to choose the learning rate (α)
• Needs many iterations
• Works well even when n is large

Normal Equation:
• No need to choose (α)
• No need to iterate
• Needs to compute (XᵀX)⁻¹, which takes about O(n³)
• Slow if n is very large
• Some matrices (e.g., singular) are non-invertible

Recommendation: use the Normal Equation if the number of features is less than about 1000; otherwise use Gradient Descent.

Page 32

In this lecture:

• Part 1: Motivation (Regression Problems)
• Part 2: Linear Regression
• Part 3: The Cost Function
• Part 4: The Gradient Descent Algorithm
• Part 5: The Normal Equation
→ Part 6: Linear Algebra overview
• Part 7: Using Octave
• Part 8: Using R


Page 33

Matrix Vector Multiplication

[1 3; 4 0; 2 1] × [1; 5] = [16; 4; 7]

because:
1×1 + 3×5 = 16
4×1 + 0×5 = 4
2×1 + 1×5 = 7

Page 34

Matrix Matrix Multiplication

[1 3 2; 4 0 1] × [1 3; 0 1; 5 2] = [11 10; 9 14]

Multiply by the first column:
[1 3 2; 4 0 1] × [1; 0; 5] = [11; 9]

Multiply by the second column:
[1 3 2; 4 0 1] × [3; 1; 2] = [10; 14]

Page 35

Matrix Inverse

If A is an m×m matrix, and if it has an inverse, then:

A × A⁻¹ = A⁻¹ × A = I

The multiplication of a matrix with its inverse produces an identity matrix:

[3 4; 2 16] × [0.4 −0.1; −0.05 0.075] = [1 0; 0 1]

Some matrices do not have inverses, e.g., if all cells are zeros.

For more, please review Linear Algebra.

Page 36

Matrix Transpose

Let A be an m×n matrix, and let B = Aᵀ. Then B is an n×m matrix, and Bij = Aji.

Example:

A = [1 2 0; 3 5 9]    Aᵀ = [1 3; 2 5; 0 9]
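All of the examples in this part can be checked in Octave (the matrices below are the ones on the slides):

[1 3; 4 0; 2 1] * [1; 5]          % matrix-vector: [16; 4; 7]
[1 3 2; 4 0 1] * [1 3; 0 1; 5 2]  % matrix-matrix: [11 10; 9 14]
A = [3 4; 2 16];
inv(A)                            % [0.4 -0.1; -0.05 0.075]
A * inv(A)                        % the 2-by-2 identity matrix
B = [1 2 0; 3 5 9];
B'                                % transpose: [1 3; 2 5; 0 9]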

Page 37

In this lecture:

• Part 1: Motivation (Regression Problems)
• Part 2: Linear Regression
• Part 3: The Cost Function
• Part 4: The Gradient Descent Algorithm
• Part 5: The Normal Equation
• Part 6: Linear Algebra overview
→ Part 7: Using Octave
• Part 8: Using R


Page 38

Octave

• Download from https://www.gnu.org/software/octave/
• A high-level scientific programming language
• A free alternative to (and compatible with) Matlab
• Helps solve linear and nonlinear problems numerically

Online Tutorial: https://www.youtube.com/playlist?list=PLnnr1O8OWc6aAjSc50lzzPVWgjzucZsSD

Page 39

Compute the normal equation using Octave

Given:

X = [ 1 2104 5 1 45
      1 1416 3 2 40
      1 1534 3 2 30
      1  852 2 1 36 ]

y = [ 460
      232
      315
      178 ]

In Octave:
load X.txt
load y.txt
theta = pinv(X'*X)*X'*y
save theta.txt theta

θ = [ 188.4
        0.4
      −56
      −93
       −3.7 ]

These are the values of the θs we need to use in our hypothesis function h(x):

h(x) = θ0 + θ1x1 + θ2x2 + θ3x3 + θ4x4
h(x) = 188.4 + 0.4·x1 − 56·x2 − 93·x3 − 3.7·x4

Page 40

In this lecture:

• Part 1: Motivation (Regression Problems)
• Part 2: Linear Regression
• Part 3: The Cost Function
• Part 4: The Gradient Descent Algorithm
• Part 5: The Normal Equation
• Part 6: Linear Algebra overview
• Part 7: Using Octave
→ Part 8: Using R


Page 41

R (programming language)

Free software environment for statistical computing and graphics that is supported by the R Foundation for Statistical Computing.

R comes with many functions that can do sophisticated stuff, and the ability to install additional packages to do much more.

Download R: https://cran.r-project.org/

A very good IDE for R is RStudio: https://www.rstudio.com/products/rstudio/download/

R basics tutorial: https://www.youtube.com/playlist?list=PLjgj6kdf_snYBkIsWQYcYtUZiDpam7ygg

Page 42

Compute the normal equation using R

Given:

X = [ 1 2104 5 1 45
      1 1416 3 2 40
      1 1534 3 2 30
      1  852 2 1 36 ]

y = [ 460
      232
      315
      178 ]

In R:
X = as.matrix(read.table("~/x.txt", header=F, sep=","))
Y = as.matrix(read.table("~/y.txt", header=F, sep=","))
thetas = solve( t(X) %*% X ) %*% t(X) %*% Y
write.table(thetas, file="~/thetas.txt", row.names=F, col.names=F)

t(X): the transpose of matrix X.
solve(X): the inverse of matrix X.

Page 43

References

[1] Andrew Ng’s course about Machine Learning https://www.youtube.com/channel/UCMoXOGX9mgrYNEwpcIQUcag

[2] Sami Ghawi, Mustafa Jarrar: Lecture Notes on Introduction to Machine Learning, Birzeit University, 2018

[3] Mustafa Jarrar: Lecture Notes on Decision Tree Machine Learning, Birzeit University, 2018

[4] Mustafa Jarrar: Lecture Notes on Linear Regression Machine Learning, Birzeit University, 2018

Page 44

Extra Example

Given the following melon prices in previous years (with the grape and watermelon prices as the features):

Year | Grape Price | Watermelon Price | Melon Price
2013 | 5 | 3 | 4
2014 | 6 | 2 | 3
2015 | 7 | 2 | 3

X = [ 1 5 3
      1 6 2
      1 7 2 ]

y = [ 4
      3
      3 ]

Page 45

X = [ 1 5 3        Xᵀ = [ 1 1 1
      1 6 2               5 6 7
      1 7 2 ]             3 2 2 ]

X·Xᵀ = [ 35 37 42
         37 41 47
         42 47 54 ]

(X·Xᵀ)⁻¹ = [   5  −24   17
             −24  126  −91
              17  −91   66 ]

Note that the Normal Equation itself uses XᵀX, as in the Octave line:

In Octave: theta = pinv(X'*X) * (X'*y)
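Running that line completes the example (the resulting θs below are our own computation, not shown on the slide):

X = [1 5 3; 1 6 2; 1 7 2];
y = [4; 3; 3];
theta = pinv(X'*X)*X'*y  % theta ≈ [1; 0; 1]
% Here X is square and invertible, so the fit is exact:
% melon price = 1 + 0*(grape price) + 1*(watermelon price) for all three years.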