Nearest Neighbour - University College Londonweb4.cs.ucl.ac.uk › staff › O.Macaodha › ml_intro...

24
Nearest Neighbour

Transcript of Nearest Neighbour - University College Londonweb4.cs.ucl.ac.uk › staff › O.Macaodha › ml_intro...

Page 1: Nearest Neighbour - University College Londonweb4.cs.ucl.ac.uk › staff › O.Macaodha › ml_intro › 1.3_Nearest Neighbour.pdfEach datapoint has some features and a class label

Nearest Neighbour

Page 2: Nearest Neighbour - University College Londonweb4.cs.ucl.ac.uk › staff › O.Macaodha › ml_intro › 1.3_Nearest Neighbour.pdfEach datapoint has some features and a class label

Here is our 2D dataset, with 3 different classes

Page 3: Nearest Neighbour - University College Londonweb4.cs.ucl.ac.uk › staff › O.Macaodha › ml_intro › 1.3_Nearest Neighbour.pdfEach datapoint has some features and a class label

Each datapoint has some features and a class label

(length, width), (species)

Page 4: Nearest Neighbour - University College Londonweb4.cs.ucl.ac.uk › staff › O.Macaodha › ml_intro › 1.3_Nearest Neighbour.pdfEach datapoint has some features and a class label

Given a new datapoint, how can we determine its class?

Page 5: Nearest Neighbour - University College Londonweb4.cs.ucl.ac.uk › staff › O.Macaodha › ml_intro › 1.3_Nearest Neighbour.pdfEach datapoint has some features and a class label

Find its “Nearest Neighbour” in the feature space

Page 6: Nearest Neighbour - University College Londonweb4.cs.ucl.ac.uk › staff › O.Macaodha › ml_intro › 1.3_Nearest Neighbour.pdfEach datapoint has some features and a class label

Computing “similarity” between two points

(l1,w1)

(l2,w2)

Page 7: Nearest Neighbour - University College Londonweb4.cs.ucl.ac.uk › staff › O.Macaodha › ml_intro › 1.3_Nearest Neighbour.pdfEach datapoint has some features and a class label

Euclidean distance in 2D

(l1,w1)

(l2,w2)

b

ac

c2 = a2 + b2 i.e Pythagoras' theorem

Page 8: Nearest Neighbour - University College Londonweb4.cs.ucl.ac.uk › staff › O.Macaodha › ml_intro › 1.3_Nearest Neighbour.pdfEach datapoint has some features and a class label

Euclidean distance in 2D

(l1,w1)(l2,w2)

ba

cc2 = a2 + b2

Page 9: Nearest Neighbour - University College Londonweb4.cs.ucl.ac.uk › staff › O.Macaodha › ml_intro › 1.3_Nearest Neighbour.pdfEach datapoint has some features and a class label

Euclidean distance in 2D

(l1,w1)(l2,w2)

ba

cc2 = a2 + b2

c = √a2 + b2

Page 10: Nearest Neighbour - University College Londonweb4.cs.ucl.ac.uk › staff › O.Macaodha › ml_intro › 1.3_Nearest Neighbour.pdfEach datapoint has some features and a class label

Euclidean distance in 2D

(l1,w1)(l2,w2)

ba

cc2 = a2 + b2

c = √a2 + b2

c = √(l1-l2)2 + (w1-w2)

2

Page 11: Nearest Neighbour - University College Londonweb4.cs.ucl.ac.uk › staff › O.Macaodha › ml_intro › 1.3_Nearest Neighbour.pdfEach datapoint has some features and a class label

Euclidean distance in 2D

In R:p1 <- c(0, 0)p2 <- c(1, 1)distance <- sqrt(sum((p1-p2)^2))

(l1,w1)(l2,w2)

ba

cc2 = a2 + b2

c = √a2 + b2

c = √(l1-l2)2 + (w1-w2)

2

Page 12: Nearest Neighbour - University College Londonweb4.cs.ucl.ac.uk › staff › O.Macaodha › ml_intro › 1.3_Nearest Neighbour.pdfEach datapoint has some features and a class label

Choose the closest point

(l1,w1)

(l2,w2)

(l3,w3)

Page 13: Nearest Neighbour - University College Londonweb4.cs.ucl.ac.uk › staff › O.Macaodha › ml_intro › 1.3_Nearest Neighbour.pdfEach datapoint has some features and a class label

Choose the closest point

(l1,w1)

(l2,w2)

(l3,w3)

c1,2

c1,3

Page 14: Nearest Neighbour - University College Londonweb4.cs.ucl.ac.uk › staff › O.Macaodha › ml_intro › 1.3_Nearest Neighbour.pdfEach datapoint has some features and a class label

Choose the closest point

(l1,w1)

(l2,w2)

(l3,w3)

c1,2

c1,3

c1,3 < c1,2

Page 15: Nearest Neighbour - University College Londonweb4.cs.ucl.ac.uk › staff › O.Macaodha › ml_intro › 1.3_Nearest Neighbour.pdfEach datapoint has some features and a class label

Find its “Nearest Neighbour” in the feature space

Page 16: Nearest Neighbour - University College Londonweb4.cs.ucl.ac.uk › staff › O.Macaodha › ml_intro › 1.3_Nearest Neighbour.pdfEach datapoint has some features and a class label

“Nearest Neighbour” in the feature space

Page 17: Nearest Neighbour - University College Londonweb4.cs.ucl.ac.uk › staff › O.Macaodha › ml_intro › 1.3_Nearest Neighbour.pdfEach datapoint has some features and a class label

Nearest Neighbour Algorithm

Given a test point x

Compute the distance between x and every other datapoint

The class of x is set as the same as the closest datapoint

Page 18: Nearest Neighbour - University College Londonweb4.cs.ucl.ac.uk › staff › O.Macaodha › ml_intro › 1.3_Nearest Neighbour.pdfEach datapoint has some features and a class label

Again our 2D dataset

Page 19: Nearest Neighbour - University College Londonweb4.cs.ucl.ac.uk › staff › O.Macaodha › ml_intro › 1.3_Nearest Neighbour.pdfEach datapoint has some features and a class label

Let’s try a different test point

Page 20: Nearest Neighbour - University College Londonweb4.cs.ucl.ac.uk › staff › O.Macaodha › ml_intro › 1.3_Nearest Neighbour.pdfEach datapoint has some features and a class label

Here is it’s neighbour

Page 21: Nearest Neighbour - University College Londonweb4.cs.ucl.ac.uk › staff › O.Macaodha › ml_intro › 1.3_Nearest Neighbour.pdfEach datapoint has some features and a class label

Resulting Nearest Neighbour classification

For every point in the space we colour it with the class of the datapoint it is closest to.

Page 22: Nearest Neighbour - University College Londonweb4.cs.ucl.ac.uk › staff › O.Macaodha › ml_intro › 1.3_Nearest Neighbour.pdfEach datapoint has some features and a class label

Resulting Nearest Neighbour classification

For every point in the space we colour it with the class of the datapoint it is closest to.

Page 23: Nearest Neighbour - University College Londonweb4.cs.ucl.ac.uk › staff › O.Macaodha › ml_intro › 1.3_Nearest Neighbour.pdfEach datapoint has some features and a class label

Resulting Nearest Neighbour classification

For every point in the space we colour it with the class of the datapoint it is closest to.

Page 24: Nearest Neighbour - University College Londonweb4.cs.ucl.ac.uk › staff › O.Macaodha › ml_intro › 1.3_Nearest Neighbour.pdfEach datapoint has some features and a class label

Practical example

2_nearest_neighbour.R