Mahavir Education Trust’s Shah & Anchor Kutchhi Engineering College, Chembur, Mumbai - 400088
(An Autonomous Institute Affiliated to University of Mumbai)
Department of Electronics & Communication (Advanced Communication Technology)
Signals of Progress: Unleash the Power of Advanced Communications
Presentation on Real-Time Digital Filtering in PLC for Process Control
(Subject: Digital Signal Processing (ACCOR1PC201), V Semester)
By
Dignag Pakhare (124BTA2007)
Vinayak Mali (124BTA2008)
Krishna Prajapati (124BTA2014)
Outline
What is PCA?
What is Dimensionality Reduction?
Code Implementation
Data Preprocessing
Feature Scaling
Output
Applications
References
What is PCA?
PCA (Principal Component Analysis) is a linear dimensionality reduction technique.
Converts correlated features into uncorrelated variables called principal components.
Captures directions of maximum variance in the data.
Simplifies large datasets while retaining most of the information.
Commonly used for feature extraction and data visualization.
Figure 1
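As a minimal sketch of the idea, here is PCA applied to a small synthetic array with scikit-learn (the numbers below are illustrative only and are not taken from the slides):

import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Small illustrative dataset: 6 samples, 3 correlated features
X = np.array([
    [50.1, 22.3, 135.0],
    [48.7, 30.5, 128.4],
    [12.4, 21.8,  95.2],
    [35.6, 25.0, 110.7],
    [60.2, 35.1, 142.9],
    [20.5, 24.2, 101.3],
])

# Standardize so each feature contributes on the same scale
X_scaled = StandardScaler().fit_transform(X)

# Project onto the two directions of maximum variance (the principal components)
pca = PCA(n_components=2)
X_pca = pca.fit_transform(X_scaled)

print(X_pca.shape)                    # (6, 2): 3 features reduced to 2 components
print(pca.explained_variance_ratio_)  # fraction of variance captured by each component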
What is Dimensionality Reduction?
Dimensionality Reduction is a technique used in machine learning and data analysis to reduce the number of input variables or features in a dataset while preserving its important information.
In other words, it transforms high-dimensional data (many features) into a lower-dimensional space (fewer features) without losing much of the essential structure or patterns.
Simplifies models: makes computation faster and reduces complexity.
Removes noise: eliminates redundant or irrelevant features.
Avoids overfitting: improves model generalization.
Helps visualization: allows us to visualize data in 2D or 3D.
Figure 2
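To illustrate the "helps visualization" point, here is a minimal sketch that compresses a 10-feature dataset into 2 principal components so it can be plotted; the random data and axis labels are purely illustrative assumptions:

import numpy as np
import matplotlib.pyplot as plt
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Illustrative random dataset: 100 samples, 10 features
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))

# Standardize, then keep only the first 2 principal components
X_2d = PCA(n_components=2).fit_transform(StandardScaler().fit_transform(X))

plt.scatter(X_2d[:, 0], X_2d[:, 1])
plt.xlabel("PC 1")
plt.ylabel("PC 2")
plt.title("10-feature data viewed in 2D")
plt.show()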
Step 2: Load the Dataset
Loads the CSV file that contains each player’s name and performance metrics like:
| Player | Batting Avg | Bowling Avg | Strike Rate | Wickets |
Step 3: Feature Scaling
Only numeric features are selected.
StandardScaler converts all values into the same scale (mean = 0, std = 1).
🧮 Example: If Virat Kohli’s batting average is 59, after scaling it might become +2.1, showing how far above average he is.
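A minimal sketch of Steps 2 and 3, assuming a hypothetical file name players.csv and the column names shown above (both are assumptions, not confirmed by the slides):

import pandas as pd
from sklearn.preprocessing import StandardScaler

# Step 2: load the dataset (file name is hypothetical)
df = pd.read_csv("players.csv")  # columns: Player, Batting Avg, Bowling Avg, Strike Rate, Wickets

# Step 3: keep only the numeric performance metrics and standardize them
features = ["Batting Avg", "Bowling Avg", "Strike Rate", "Wickets"]
X = df[features]

scaler = StandardScaler()          # rescales every column to mean = 0, std = 1
X_scaled = scaler.fit_transform(X)

print(X_scaled[:5])                # e.g. a batting average of 59 may map to roughly +2.1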
Step 4: K-Means Clustering
Creates 3 clusters (batsmen, bowlers, all-rounders).
The algorithm places centroids and assigns each player to the nearest cluster.
Adds a new column showing each player’s cluster number.
Step 5: Hierarchical Clustering
Uses Ward’s linkage (minimizes variance when merging clusters).
Builds a tree structure (dendrogram) showing how players merge step by step.
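A minimal sketch of Steps 4 and 5 with scikit-learn and SciPy, continuing from the df and X_scaled variables in the previous sketch (the cluster numbers come out unlabeled; reading them as batsmen, bowlers, or all-rounders is an interpretation step done afterwards):

from sklearn.cluster import KMeans
from scipy.cluster.hierarchy import linkage

# Step 4: K-Means with 3 clusters
kmeans = KMeans(n_clusters=3, random_state=42, n_init=10)
df["Cluster"] = kmeans.fit_predict(X_scaled)  # new column with each player's cluster number

# Step 5: hierarchical clustering with Ward's linkage;
# each merge joins the pair of clusters that increases within-cluster variance the least
linked = linkage(X_scaled, method="ward")

print(df[["Player", "Cluster"]].head())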
Step 6: Visualization
Shows clusters visually: each color represents a group (batsmen, bowlers, or all-rounders).
Step 7: Dendrogram
The dendrogram visually shows how players combine into clusters.
The height at which branches merge represents the similarity distance.
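A minimal sketch of Steps 6 and 7 with matplotlib and SciPy, again continuing from X_scaled, df, and linked above; plotting the first two scaled features against each other is an assumption about how the cluster plot was drawn:

import matplotlib.pyplot as plt
from scipy.cluster.hierarchy import dendrogram

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 5))

# Step 6: scatter plot of the clusters, one color per cluster number
ax1.scatter(X_scaled[:, 0], X_scaled[:, 1], c=df["Cluster"], cmap="viridis")
ax1.set_xlabel("Batting Avg (scaled)")
ax1.set_ylabel("Bowling Avg (scaled)")
ax1.set_title("K-Means clusters")

# Step 7: dendrogram; the height where branches merge is the similarity distance
dendrogram(linked, labels=df["Player"].tolist(), ax=ax2)
ax2.set_title("Hierarchical clustering (Ward's linkage)")

plt.tight_layout()
plt.show()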
Output
Fig 1
Fig 2
Fig 3
Applications
Player Performance Analysis: Helps identify strengths and weaknesses of players based on statistical data.
Team Selection: Assists coaches in forming balanced teams by grouping similar performance types.
Talent Scouting: Detects potential all-rounders or specialists from large player datasets.
Strategy Planning: Enables data-driven match strategies by analyzing player clusters.
Sports Analytics Research: Supports deeper insights in predictive modeling and sports data mining.