Clustering Algorithm in Neural Networks

agarshaSelvaraj 10 views 8 slides Sep 29, 2024

About This Presentation

A clustering algorithm is a machine learning technique used to group a set of objects into clusters, where objects within the same cluster are more similar to each other than to those in other clusters. Clustering is a form of unsupervised learning, meaning that the algorithm learns patterns from un...


Slide Content

>Clustering algorithm
>Steps involved in clustering
>Content Delivery
>Conclusion with applications

Clustering Algorithm
>Clustering is a fundamental technique in unsupervised learning.
>It involves grouping a set of data points into clusters based on their similarities.
>The goal is to partition the data so that points in the same cluster are more similar to each other than to points in other clusters. In other words, intra-cluster similarity is high and inter-cluster similarity is low.
>Clustering is also an important human activity: from early childhood, people use it to distinguish between different kinds of items, such as cars and cats, or animals and plants.

Distance Metrics: Distance metrics quantify the similarity or dissimilarity between
pairs of data points within a dataset. For example, the Euclidean distance measures
the straight-line distance between two points in a multidimensional space. For two
points X = (x_1, ..., x_n) and Y = (y_1, ..., y_n):
Distance(X, Y) = sqrt((x_1 - y_1)^2 + ... + (x_n - y_n)^2)
Cluster Assignment: Cluster assignment is the process of assigning each data point
to a specific cluster based on certain criteria, such as its proximity to cluster centroids
or its similarity to other data points in the cluster.
Centroid: In clustering algorithms like k-means, the centroid represents the
center point of a cluster. It is calculated as the mean of all data points belonging
to that cluster.
Cluster Evaluation: Cluster evaluation metrics assess the quality of clustering results
by quantifying how well the clusters represent the underlying structure of the data.
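The four terms above can be illustrated with a minimal Python sketch. The function names (`euclidean`, `centroid`, `assign`, `wcss`) and the sample points are hypothetical, chosen only for illustration. The within-cluster sum of squares is used here as one possible evaluation measure; the slides do not name a specific one.

```python
import math

def euclidean(x, y):
    """Straight-line distance between two points of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def centroid(points):
    """Component-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))

def assign(point, centroids):
    """Index of the centroid nearest to `point`."""
    return min(range(len(centroids)), key=lambda i: euclidean(point, centroids[i]))

def wcss(points, labels, centroids):
    """Within-cluster sum of squared distances (lower means tighter clusters)."""
    return sum(euclidean(p, centroids[l]) ** 2 for p, l in zip(points, labels))

# Illustrative data: two points on the left, two on the right.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
centroids = [centroid(points[:2]), centroid(points[2:])]
labels = [assign(p, centroids) for p in points]
score = wcss(points, labels, centroids)
```

Here each point ends up assigned to the centroid of its own side, and `score` summarizes how far the points sit from their centroids.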

Simple Clustering: K-means
Works with numeric data only
1) Pick a number (K) of cluster centers (at random)
2) Assign every item to its nearest cluster center (e.g. using Euclidean distance)
3) Move each cluster center to the mean of its assigned items
4) Repeat steps 2 and 3 until convergence (change in cluster assignments less than a threshold)
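The four steps can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not a reference implementation: the function name `kmeans`, the `seed` and `max_iter` parameters, and the sample data are hypothetical; the initial centers are drawn from the data points themselves; and convergence is taken to mean that no assignment changed (a threshold of zero).

```python
import math
import random

def euclidean(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def kmeans(points, k, max_iter=100, seed=0):
    rng = random.Random(seed)
    # Step 1: pick K distinct data points at random as the initial centers.
    centers = rng.sample(points, k)
    labels = None
    for _ in range(max_iter):
        # Step 2: assign every point to its nearest center.
        new_labels = [
            min(range(k), key=lambda i: euclidean(p, centers[i])) for p in points
        ]
        # Step 4: stop once the assignments no longer change.
        if new_labels == labels:
            break
        labels = new_labels
        # Step 3: move each center to the mean of its assigned points.
        for i in range(k):
            members = [p for p, l in zip(points, labels) if l == i]
            if members:  # an empty cluster keeps its previous center
                centers[i] = tuple(sum(c) / len(members) for c in zip(*members))
    return centers, labels

# Illustrative data: two well-separated groups of three points each.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
centers, labels = kmeans(points, k=2)
```

On data this well separated, the first three points should share one label and the last three the other, although which group receives label 0 depends on the random initialization.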


Applications
>Market segmentation
>Social network analysis
>Market basket analysis
>Medical imaging
>Image segmentation
>Anomaly detection

Challenges
Dependency on Initial Guess
When using K-means, we have to start by guessing the initial positions of the cluster
centers. The final clustering results can be affected by this initial guess. Sometimes,
the algorithm may not find the best solution, leading to less accurate clusters.
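A small sketch can make this concrete. It assumes a hypothetical `kmeans_from` helper that accepts explicit starting centers, plus four illustrative points at the corners of a wide rectangle. One starting guess gets stuck in a poor solution and another finds the better one. A common mitigation, shown in `best_of_restarts`, is to run several random starts and keep the result with the lowest within-cluster sum of squares.

```python
import random

def sq_dist(x, y):
    return sum((a - b) ** 2 for a, b in zip(x, y))

def kmeans_from(points, init_centers, max_iter=100):
    """Run k-means from explicit initial centers; return (wcss, centers, labels)."""
    centers = list(init_centers)
    k = len(centers)
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(k), key=lambda i: sq_dist(p, centers[i])) for p in points
        ]
        if new_labels == labels:
            break
        labels = new_labels
        for i in range(k):
            members = [p for p, l in zip(points, labels) if l == i]
            if members:
                centers[i] = tuple(sum(c) / len(members) for c in zip(*members))
    wcss = sum(sq_dist(p, centers[l]) for p, l in zip(points, labels))
    return wcss, centers, labels

def best_of_restarts(points, k, n_restarts=20, seed=0):
    """Try several random initial guesses and keep the lowest-WCSS result."""
    rng = random.Random(seed)
    runs = [kmeans_from(points, rng.sample(points, k)) for _ in range(n_restarts)]
    return min(runs, key=lambda run: run[0])

# Corners of a wide rectangle: the natural clusters are left and right.
points = [(0.0, 0.0), (0.0, 1.0), (4.0, 0.0), (4.0, 1.0)]
bad = kmeans_from(points, [(0.0, 0.0), (0.0, 1.0)])   # both guesses on the left
good = kmeans_from(points, [(0.0, 0.0), (4.0, 0.0)])  # one guess per side
best = best_of_restarts(points, k=2)
```

Starting with both centers on the left edge, the algorithm settles on a top/bottom split with a much higher within-cluster sum of squares than the left/right split. Restarting does not guarantee the best solution, but it makes a poor outcome less likely.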
Sensitivity to Outliers
K-means treats all data points equally and can be sensitive to outliers, which are
unusual or extreme data points. Outliers can distort the clustering process, causing the
algorithm to create less reliable clusters. Handling outliers properly is important to get
better results.
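The effect of a single outlier on a centroid can be shown with a short sketch. The data and function names below are illustrative assumptions. The median-based center is not part of K-means; it is the idea behind the related k-medians variant and appears here only for comparison.

```python
import statistics

def mean_center(points):
    """Centroid as used by K-means: the component-wise mean."""
    return tuple(statistics.mean(coords) for coords in zip(*points))

def median_center(points):
    """Component-wise median (k-medians style), less affected by extremes."""
    return tuple(statistics.median(coords) for coords in zip(*points))

# Three nearby points plus one extreme point in the same cluster.
cluster = [(1.0, 1.0), (2.0, 1.0), (3.0, 1.0), (100.0, 1.0)]

with_outlier = mean_center(cluster)         # pulled far toward the outlier
without_outlier = mean_center(cluster[:3])  # center of the three nearby points
robust = median_center(cluster)             # stays close to the nearby points
```

The mean-based center moves far from the three nearby points once the extreme point is included, while the median-based center stays near them. Removing or down-weighting outliers before clustering is another option.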
Need to Know the Number of Clusters
With K-means, we have to tell the algorithm how many clusters we expect
in the data. Choosing the wrong number of clusters can lead to misleading
results. Methods like the elbow method or silhouette analysis can help
estimate the appropriate number of clusters, but it’s still a challenge.
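The elbow method can be sketched as follows: run K-means for several values of K, record the within-cluster sum of squares (WCSS) for each, and look for the K after which the improvement flattens out. The code is a minimal illustration under assumptions. The names `farthest_first` and `kmeans_wcss` and the sample data are hypothetical, and a deterministic farthest-first initialization replaces random initialization so that the curve is reproducible.

```python
def sq_dist(x, y):
    return sum((a - b) ** 2 for a, b in zip(x, y))

def farthest_first(points, k):
    """Deterministic start: first point, then repeatedly the point farthest
    from all centers chosen so far (an assumption, not from the slides)."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    return centers

def kmeans_wcss(points, k, max_iter=100):
    """Run K-means and return the final within-cluster sum of squares."""
    centers = farthest_first(points, k)
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(k), key=lambda i: sq_dist(p, centers[i])) for p in points
        ]
        if new_labels == labels:
            break
        labels = new_labels
        for i in range(k):
            members = [p for p, l in zip(points, labels) if l == i]
            if members:
                centers[i] = tuple(sum(c) / len(members) for c in zip(*members))
    return sum(sq_dist(p, centers[l]) for p, l in zip(points, labels))

# Illustrative data with three obvious groups along the x-axis.
points = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0), (11.0, 0.0), (20.0, 0.0), (21.0, 0.0)]
curve = {k: kmeans_wcss(points, k) for k in range(1, 5)}
for k, value in curve.items():
    print(k, round(value, 2))
```

For this data the WCSS drops sharply up to K = 3 and only slightly afterwards, which is the "elbow" suggesting three clusters. On real data the bend is often less clear, which is why choosing K remains a judgment call.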

Conclusion
Clustering algorithms offer a powerful means of organizing
complex datasets, aiding in pattern discovery and data
interpretation. They facilitate data compression, anomaly detection,
and informed decision-making across diverse domains. Their
unsupervised nature and versatility make them indispensable tools
in data analysis and machine learning applications.