Clustering in data analytics. This presentation gives a clear overview of clustering in non-Euclidean space in data analytics using R.
goviraj098765
9 views
9 slides
Sep 01, 2025
Slide Content
Clustering in Non-Euclidean Space
Introduction to Clustering in Non-Euclidean Space
Definition of Clustering: Grouping similar data points into clusters based on certain criteria.
Non-Euclidean Space: Refers to spaces where the traditional Euclidean (straight-line) distance does not apply.
Types of Non-Euclidean Spaces:
Manifolds (as studied in manifold learning)
Graph-based data
Discrete spaces
Common Non-Euclidean Measures: Cosine similarity, Jaccard similarity, etc.
Why Non-Euclidean Space?
Limitations of Euclidean Distance: It is not always meaningful for complex, high-dimensional, or non-linear data.
Examples:
Text data (similarity between documents, measured with cosine distance).
Graph data (distance between nodes in networks).
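To make the two measures named above concrete, here is a minimal sketch in base R. The helper names cosine_distance and jaccard_distance and the toy vectors are illustrative assumptions, not part of the original slides.

```r
# Cosine distance between two numeric vectors (e.g., term counts of documents).
# Hypothetical helper: 1 minus the cosine of the angle between a and b.
cosine_distance <- function(a, b) {
  1 - sum(a * b) / (sqrt(sum(a^2)) * sqrt(sum(b^2)))
}

# Jaccard distance between two sets given as vectors of elements.
# Hypothetical helper: 1 minus |intersection| / |union|.
jaccard_distance <- function(a, b) {
  1 - length(intersect(a, b)) / length(union(a, b))
}

# Made-up term-count vectors for three tiny "documents"
doc1 <- c(2, 1, 0)
doc2 <- c(4, 2, 0)  # same direction as doc1, only longer
doc3 <- c(0, 0, 3)  # shares no terms with doc1

d12 <- cosine_distance(doc1, doc2)  # approximately 0: same direction
d13 <- cosine_distance(doc1, doc3)  # 1: orthogonal vectors
dj  <- jaccard_distance(c("a", "b", "c"), c("b", "c", "d"))  # 0.5
```

Note that doc1 and doc2 are far apart in Euclidean terms but have a cosine distance of about zero, which is why cosine distance is often preferred for documents of different lengths.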
Clustering Algorithms in Non-Euclidean Space
K-means in Non-Euclidean Spaces: Requires adapting the distance measure (e.g., cosine distance). Because a cluster mean is not defined for arbitrary distances, variants such as k-medoids are often used instead.
DBSCAN: Density-based clustering that can work with arbitrary distance functions.
Hierarchical Clustering: Can use different distance measures (e.g., Manhattan, cosine).
Spectral Clustering: Often applied to graph data, using similarity matrices.
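As one illustration of the algorithms listed above, hierarchical clustering on a cosine distance matrix might look like the following sketch. It uses only base R (hclust, cutree, as.dist); the data values and the choice of average linkage are assumptions made for this example.

```r
# Made-up data: rows are items, columns are features
x <- matrix(c(1.0, 0.0,
              2.0, 0.1,
              0.0, 1.0,
              0.1, 3.0),
            ncol = 2, byrow = TRUE,
            dimnames = list(c("a", "b", "c", "d"), NULL))

# Pairwise cosine similarity, then convert to a distance object
norms   <- sqrt(rowSums(x^2))
cos_sim <- (x %*% t(x)) / (norms %o% norms)
d       <- as.dist(1 - cos_sim)

# Average linkage works with arbitrary dissimilarities
hc     <- hclust(d, method = "average")
groups <- cutree(hc, k = 2)  # cluster label for each item
groups
```

Items a and b point in nearly the same direction, as do c and d, so they end up paired even though their vector lengths differ. Linkage methods such as Ward or centroid assume Euclidean distances and are a less natural fit here.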
Explanation:
Data: We created a tiny dataset with 3 points and 2 features.
Manhattan Distance: proxy::dist() calculates the Manhattan distance between each pair of data points.
K-Medoids Clustering: The pam() function performs k-medoids clustering with k = 2 clusters.
Output: You will get the clusters assigned to each point and the medoids (representative points) for each cluster.
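The code that this explanation describes did not survive in the slide text, so the following is a hedged reconstruction rather than the presenter's original. It assumes the proxy and cluster packages are installed, and the data values are invented for illustration.

```r
library(proxy)    # proxy::dist() supports many distance measures
library(cluster)  # pam() implements k-medoids

# Tiny dataset: 3 points, 2 features (values are made up)
data <- matrix(c(1, 2,
                 2, 3,
                 8, 9),
               ncol = 2, byrow = TRUE,
               dimnames = list(c("p1", "p2", "p3"), c("x", "y")))

# Manhattan distance between every pair of points
d <- proxy::dist(data, method = "Manhattan")

# K-medoids on the precomputed dissimilarities, k = 2 clusters
fit <- pam(d, k = 2, diss = TRUE)

fit$clustering  # cluster assigned to each point
fit$medoids     # representative point of each cluster
```

With these values, p1 and p2 are close together and p3 is far away, so p1 and p2 share a cluster while p3 forms its own. Because pam() receives a dissimilarity object, it never needs to compute a mean, which is what makes k-medoids suitable for non-Euclidean distances.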
Conclusion:
Clustering in Non-Euclidean Spaces: Essential for non-linear and complex data. It relies on alternative measures such as cosine similarity or Jaccard similarity.
Tools in R: R offers robust tools for clustering in non-Euclidean spaces, including the proxy and cluster packages and the hclust() function from the stats package.
Applications: Natural language processing (NLP), image recognition, graph analysis.