KNN Tutorial (K-nearest neighbor) - Machine Learning

hmd3214 89 views 13 slides Oct 02, 2024

About This Presentation

KNN Tutorial


Slide Content

K-Nearest Neighbors (KNN) Tutorial Credit: Italo José

What's KNN?
KNN (K-Nearest Neighbors) is one of many supervised learning algorithms used in data mining and machine learning. It is a classifier in which learning is based on how similar one piece of data (a vector) is to the others.

How does it work?
KNN is pretty simple. Imagine that you have data about coloured balls:
- purple balls;
- yellow balls;
- and one ball that you don't know is purple or yellow, although you have all of its data except the colour label.

So, how are you going to know the ball's colour? Imagine you are a machine that only has the ball's characteristics (data) but not the final label. How will you work out the ball's colour (the final label, its class)?

Let's suppose that the data with number 1 (and label R) refer to the purple balls and the data with number 2 (and label A) refer to the yellow balls. This is just to make the explanation easier. Each line refers to a ball and each column refers to one of the ball's characteristics. The last column holds the class (colour) of each ball:
R -> purple
A -> yellow
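As a rough illustration of that layout, the sketch below stores each ball as one row whose last column is the class. The four rows and the three feature columns are made-up values that only follow the "1 -> R, 2 -> A" convention described above; they are not the deck's actual worksheet.

```python
# Hypothetical rows: characteristic columns first, class label in the last column.
rows = [
    [1, 1, 1, "R"],  # a purple ball
    [1, 1, 1, "R"],  # a purple ball
    [2, 2, 2, "A"],  # a yellow ball
    [2, 2, 2, "A"],  # a yellow ball
]

# Split every row into its characteristics and its class.
features = [row[:-1] for row in rows]
labels = [row[-1] for row in rows]

# Map the class codes back to colour names.
colour_names = {"R": "purple", "A": "yellow"}
colours = [colour_names[label] for label in labels]

print(features[0], labels[0], colours[0])
```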


The steps of KNN are:
1. Receive an unclassified piece of data.
2. Measure the distance (Euclidean, Manhattan, Minkowski or weighted) from the new data to all the data that is already classified.
3. Take the K smallest distances (K is a parameter that you define).
4. Look at the classes of those K nearest data and count how many times each class appears.
5. Take as the correct class the one that appeared the most times.
6. Classify the new data with the class chosen in step 5.
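The steps above can be sketched in a few lines of Python. This is only a minimal illustration, not the deck author's code: the function name `knn_classify`, the `(features, label)` pair layout and the two-feature training points are all invented for the example, and Euclidean distance (via `math.dist`, Python 3.8+) is assumed as the metric.

```python
import math
from collections import Counter


def knn_classify(new_point, classified, k):
    """Classify new_point from (features, label) pairs using its K nearest ones."""
    # Step 2: distance from the new point to every classified point.
    distances = [
        (math.dist(new_point, features), label)
        for features, label in classified
    ]
    # Step 3: keep the K smallest distances.
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    # Step 4: count how often each class appears among them.
    votes = Counter(label for _, label in nearest)
    # Steps 5 and 6: the most frequent class becomes the prediction.
    return votes.most_common(1)[0][0]


# Hypothetical two-feature data, invented for this example.
training = [
    ((1.0, 1.0), "R"),
    ((1.5, 1.0), "R"),
    ((4.0, 4.0), "A"),
    ((4.5, 5.0), "A"),
    ((5.0, 4.5), "A"),
]

print(knn_classify((1.2, 0.9), training, k=3))
```

Sorting every distance is the simplest approach and is fine for small data sets; larger ones would usually call for a spatial index or a library implementation.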

The Euclidean distance formula is defined below, for two data p and q with n characteristics each:

d(p, q) = √( (p1 - q1)² + (p2 - q2)² + ... + (pn - qn)² )
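A small Python sketch of the Euclidean distance, shown next to the Manhattan and Minkowski distances that the steps list also names. The function names and the parameter `r` (the Minkowski order) are choices made for this example; the points `(0, 0)` and `(3, 4)` are arbitrary.

```python
import math


def euclidean(p, q):
    # Square root of the sum of squared differences.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def manhattan(p, q):
    # Sum of absolute differences.
    return sum(abs(a - b) for a, b in zip(p, q))


def minkowski(p, q, r):
    # Generalisation: r=1 gives Manhattan, r=2 gives Euclidean.
    return sum(abs(a - b) ** r for a, b in zip(p, q)) ** (1 / r)


p, q = (0, 0), (3, 4)
print(euclidean(p, q))  # 5.0
print(manhattan(p, q))  # 7
```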

Let's use the example from the previous worksheet, but now with 1 unclassified piece of data, whose class is the information we want to discover. We have 5 data (lines) in this example, and each sample (data/line) has its own attributes (characteristics). Let's IMAGINE that all of these are images: each line would be an image and each column would be one pixel of that image.

Let's assume the classified data are named D1, D2, D3 and D4, and the unclassified data is DX. We need to figure out which class/category DX should belong to.

First we subtract each characteristic of D1 from the matching characteristic of DX:
(1 - 2) = -1
Then we square the result:
(-1)² = 1
We do this for all characteristics, then sum the results:
(1-2)² + (1-2)² + (1-2)² + (1-2)² + (1-2)² + (1-2)² + (1-2)² + (1-2)² = 8
Then we take the square root:
√8 ≈ 2.83
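The same arithmetic can be followed step by step in Python. The sketch assumes DX has eight characteristics all equal to 1 and D1 has eight all equal to 2, which is what the eight (1-2)² terms imply; the variable names are illustrative.

```python
import math

dx = [1] * 8  # unclassified sample, values implied by the worked example
d1 = [2] * 8  # classified sample D1

differences = [x - y for x, y in zip(dx, d1)]  # each one is 1 - 2 = -1
squares = [diff ** 2 for diff in differences]  # each one is (-1) squared = 1
total = sum(squares)                           # sum of the eight squares
distance = math.sqrt(total)                    # Euclidean distance DX to D1

print(differences[0], squares[0], total, round(distance, 2))  # -1 1 8 2.83
```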

We also calculate the distance of D2, D3 and D4 from DX in the same way.

Data and colour    Sum of squared differences (before the square root)
D1 (A)             (1-2)² + (1-2)² + (1-2)² + (1-2)² + (1-2)² + (1-2)² + (1-2)² + (1-2)² = 8
D2 (A)             (1-2)² + (1-2)² + (1-2)² + (1-2)² + (1-2)² + (1-2)² + (1-2)² + (1-2)² = 8
D3 (R)             (1-1)² + (1-1)² + (1-1)² + (1-1)² + (1-1)² + (1-1)² + (1-1)² + (1-1)² = 0
D4 (R)             (1-1)² + (1-1)² + (1-1)² + (1-1)² + (1-1)² + (1-1)² + (1-1)² + (1-1)² = 0
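A short Python sketch that computes every row of this comparison in one loop. The feature vectors are inferred from the terms in the sums (all 2s for D1 and D2, all 1s for D3, D4 and DX); the dictionary layout and names are assumptions made for the example.

```python
import math

dx = [1] * 8  # the unclassified sample

# name -> (characteristics, class), values inferred from the sums in the slides
samples = {
    "D1": ([2] * 8, "A"),
    "D2": ([2] * 8, "A"),
    "D3": ([1] * 8, "R"),
    "D4": ([1] * 8, "R"),
}

results = {}
for name, (features, label) in samples.items():
    squared = sum((x - y) ** 2 for x, y in zip(dx, features))
    results[name] = (squared, math.sqrt(squared))
    print(f"{name} ({label}): sum of squares = {squared}, "
          f"distance = {results[name][1]:.2f}")
```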

The data are then sorted by distance.

Before sorting:
Data and colour    Distance
D1 (A)             2.83
D2 (A)             2.83
D3 (R)             0
D4 (R)             0

After sorting:
Data and colour    Distance
D3 (R)             0
D4 (R)             0
D1 (A)             2.83
D2 (A)             2.83

Assuming K=1, we take only the single nearest data, which is of class R, so DX belongs to class R.

Assuming K=3, we use the same sorted distances and take the majority vote among the three nearest data: D3 (R), D4 (R) and one of D1 or D2 (A), which are tied at 2.83. That gives two votes for R against one for A, so DX belongs to class R as well.
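The vote for both values of K can be checked with a few lines of Python. The neighbor list below copies the sorted table from the slides, with distance 0 for D3 and D4 as computed in the sums; the helper name `vote` is made up for this sketch.

```python
from collections import Counter

# (name, class, distance), already sorted by distance as in the slides.
sorted_neighbors = [
    ("D3", "R", 0.0),
    ("D4", "R", 0.0),
    ("D1", "A", 2.83),
    ("D2", "A", 2.83),
]


def vote(neighbors, k):
    # Keep the K nearest entries and return the most frequent class.
    top_k = neighbors[:k]
    counts = Counter(label for _, label, _ in top_k)
    return counts.most_common(1)[0][0]


print(vote(sorted_neighbors, 1))  # R
print(vote(sorted_neighbors, 3))  # R
```

With an even K such as 4, this data would produce a 2-2 tie between R and A, which is one reason an odd K is a common choice for two-class problems.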