K-means clustering is one of the most important and simplest clustering algorithms. Here I share the k-means algorithm and how to evaluate it.
Added: Mar 30, 2020
Slides: 15 pages
Slide Content
K-means Clustering: Algorithm, Evaluation Methods, and Graph
Hello! I am Iffat Firozy. I am here because I love to teach.
We are given a data set of items, each described by values for certain features (a feature vector). The task is to categorize those items into groups. To achieve this, we will use the k-means algorithm, an unsupervised learning algorithm.
The algorithm in pseudocode:
1. Specify the number of clusters K.
2. Initialize the centroids by first shuffling the dataset and then randomly selecting K data points as centroids, without replacement.
3. Keep iterating until the centroids no longer change, i.e. the assignment of data points to clusters is not changing:
   a. Compute the squared distance between each data point and every centroid.
   b. Assign each data point to the closest cluster (centroid).
   c. Recompute the centroid of each cluster as the average of all data points that belong to that cluster.
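The steps above can be sketched in plain Python. This is a minimal illustration, not a reference implementation: the function name `kmeans`, the `max_iter` and `seed` parameters, and the choice to keep the old centroid when a cluster becomes empty are all assumptions made for this example.

```python
import math
import random


def kmeans(points, k, max_iter=100, seed=0):
    """Plain k-means on a list of equal-length numeric tuples (illustrative)."""
    rng = random.Random(seed)
    # Step 2: pick k distinct data points as the starting centroids.
    centroids = rng.sample(points, k)
    labels = None
    for _ in range(max_iter):
        # Steps 3a/3b: assign each point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Stop when no point changes cluster.
        if new_labels == labels:
            break
        labels = new_labels
        # Step 3c: each centroid becomes the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # assumption: an empty cluster keeps its old centroid
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return centroids, labels


# Toy data: two well-separated groups of three points each.
data = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
        (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
centroids, labels = kmeans(data, k=2)
print(centroids, labels)
```

Because the starting centroids are random, the cluster numbers (0 or 1) can swap between seeds, while the grouping itself stays the same for data this well separated.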
[Figure: flowchart of the k-means clustering algorithm]
LET'S SOLVE A PROBLEM
Problem on K-means clustering.
Given are the points A = (1, 2), B = (2, 2), C = (2, 1), D = (-1, 4), E = (-2, -1), F = (-1, -1).
a) Starting from initial clusters Cluster1 = {A}, which contains only the point A, and Cluster2 = {D}, which contains only the point D, run the K-means clustering algorithm and report the final clusters.
b) Draw the points on a 2-D grid and check if the clusters make sense.
Initially:

Point   X    Y
A       1    2
B       2    2
C       2    1
D      -1    4
E      -2   -1
F      -1   -1

Cluster   X    Y    Centroid   Assignment
K1        1    2    (1, 2)     1
K2       -1    4    (-1, 4)    2
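For row B = (2, 2), the Euclidean distance to each centroid is:

$d(B, K_1) = \sqrt{(2 - 1)^2 + (2 - 2)^2} = 1$
$d(B, K_2) = \sqrt{(2 - (-1))^2 + (2 - 4)^2} = \sqrt{13} \approx 3.61$

B is closer to K1, so it is assigned to cluster 1, and K1 is updated by averaging its current centroid with B:

Cluster   X                  Y                Centroid
K1        (1 + 2)/2 = 1.5    (2 + 2)/2 = 2    (1.5, 2)
K2        -1                 4                (-1, 4)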
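The distance step for row B can be checked with a few lines of Python. This is only a sketch; the variable names `k1` and `k2` are chosen here for illustration and stand for the initial centroids A and D from the problem statement.

```python
import math

B = (2, 2)
k1 = (1, 2)    # initial centroid of cluster 1 (point A)
k2 = (-1, 4)   # initial centroid of cluster 2 (point D)

# Euclidean distance from B to each centroid
d1 = math.dist(B, k1)
d2 = math.dist(B, k2)
print(round(d1, 2), round(d2, 2))  # prints: 1.0 3.61
```

Since the distance to `k1` is smaller, B goes to cluster 1.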
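For row C = (2, 1), the distances are:

$d(C, K_1) = \sqrt{(2 - 1.5)^2 + (1 - 2)^2} = \sqrt{1.25} \approx 1.12$
$d(C, K_2) = \sqrt{(2 - (-1))^2 + (1 - 4)^2} = \sqrt{18} \approx 4.24$

C is closer to K1, so it is assigned to cluster 1, and K1 is updated by averaging its current centroid with C:

Cluster   X                       Y                  Centroid
K1        (1.5 + 2)/2 = 1.75      (2 + 1)/2 = 1.5    (1.75, 1.5)
K2        -1                      4                  (-1, 4)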
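For row E = (-2, -1), the distances are:

$d(E, K_1) = \sqrt{(-2 - 1.75)^2 + (-1 - 1.5)^2} = \sqrt{20.3125} \approx 4.51$
$d(E, K_2) = \sqrt{(-2 - (-1))^2 + (-1 - 4)^2} = \sqrt{26} \approx 5.10$

E is closer to K1, so it is assigned to cluster 1, and K1 is updated by averaging its current centroid with E:

Cluster   X                        Y                     Centroid
K1        (1.75 - 2)/2 = -0.125    (1.5 - 1)/2 = 0.25    (-0.125, 0.25)
K2        -1                       4                     (-1, 4)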
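For row F = (-1, -1), the distances are:

$d(F, K_1) = \sqrt{(-1 - (-0.125))^2 + (-1 - 0.25)^2} = \sqrt{2.328125} \approx 1.53$
$d(F, K_2) = \sqrt{(-1 - (-1))^2 + (-1 - 4)^2} = 5$

F is closer to K1, so it is assigned to cluster 1, and K1 is updated by averaging its current centroid with F:

Cluster   X                           Y                        Centroid
K1        (-0.125 - 1)/2 = -0.5625    (0.25 - 1)/2 = -0.375    (-0.5625, -0.375)
K2        -1                          4                        (-1, 4)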
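The row-by-row walkthrough can be replayed in code. The sketch below assumes the procedure used on these slides: points are visited once in the order B, C, E, F (A and D are the seeds), and the winning centroid is replaced by the midpoint of its current value and the new point. That midpoint rule is taken from the arithmetic on the slides and is not the same as averaging all members of the cluster; the dictionary names are hypothetical.

```python
import math

points = {"A": (1, 2), "B": (2, 2), "C": (2, 1),
          "D": (-1, 4), "E": (-2, -1), "F": (-1, -1)}
centroids = {1: points["A"], 2: points["D"]}   # seeds from the problem
assignment = {"A": 1, "D": 2}

for name in ["B", "C", "E", "F"]:
    p = points[name]
    # distance from this point to every current centroid
    dists = {k: math.dist(p, c) for k, c in centroids.items()}
    nearest = min(dists, key=dists.get)
    assignment[name] = nearest
    old = centroids[nearest]
    # update rule as on the slides: midpoint of old centroid and new point
    centroids[nearest] = tuple((o + x) / 2 for o, x in zip(old, p))
    print(name, {k: round(d, 2) for k, d in dists.items()},
          "-> cluster", nearest, "centroid", centroids[nearest])
```

Under these assumptions every visited point lands in cluster 1, and the K1 centroid ends at (-0.5625, -0.375).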
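Final Clustering & Assignments:

Point   X    Y    K1 centroid after this point   Assignment
A       1    2    (1, 2)                          1
B       2    2    (1.5, 2)                        1
C       2    1    (1.75, 1.5)                     1
D      -1    4    (unchanged; D is the K2 seed)   2
E      -2   -1    (-0.125, 0.25)                  1
F      -1   -1    (-0.5625, -0.375)               1

Final clusters: Cluster1 = {A, B, C, E, F} and Cluster2 = {D}.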
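As a cross-check, the same problem can be run with the batch form of k-means from the pseudocode, where each centroid is the mean of all points in its cluster. This is a small sketch under that assumption, hard-coded for two clusters and for this data set; it does not guard against an empty cluster.

```python
import math

points = {"A": (1, 2), "B": (2, 2), "C": (2, 1),
          "D": (-1, 4), "E": (-2, -1), "F": (-1, -1)}
centroids = [points["A"], points["D"]]  # initial centroids from the problem

while True:
    # assignment step: every point goes to its nearest centroid
    clusters = [[], []]
    for name, p in points.items():
        j = min((0, 1), key=lambda i: math.dist(p, centroids[i]))
        clusters[j].append(name)
    # update step: each centroid becomes the mean of its members
    new_centroids = [
        tuple(sum(points[n][d] for n in members) / len(members) for d in range(2))
        for members in clusters
    ]
    if new_centroids == centroids:  # converged: centroids stopped moving
        break
    centroids = new_centroids

print(clusters)   # [['A', 'B', 'C', 'E', 'F'], ['D']]
print(centroids)  # [(0.4, 0.6), (-1.0, 4.0)]
```

The batch run gives the same two clusters. Only the reported centroid of cluster 1 differs, because here it is the mean of all five members.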