Principal Component Analysis (PCA)


About This Presentation

The Principal Component Analysis (PCA) technique was introduced by the mathematician Karl Pearson in 1901. It works on the principle that when data in a higher-dimensional space is mapped to a lower-dimensional space, the variance of the data in the lower-dimensional space should be maximized.
...


Slide Content

Principal Component Analysis (PCA)

Outline
- What is PCA?
- Dimensionality Reduction
- Why PCA?
- Important Terminologies
- How does PCA Work?
- Applications of PCA
- Advantages and Limitations

Introduction
Principal Component Analysis, commonly referred to as PCA, is a powerful mathematical technique used in data analysis and statistics. At its core, PCA is designed to simplify complex datasets by transforming them into a more manageable form while retaining the most critical information: it reduces the dimensionality of the dataset and increases interpretability without losing important information.

Dimensionality Reduction
Dimensionality reduction refers to techniques that reduce the number of input variables in a dataset. Why DR?
- Fewer dimensions for a given dataset mean less computation or training time
- Redundancy is removed after removing similar entries from the dataset
- Data compression (reduced storage space)
- It helps to find the most significant features and skip the rest
- It leads to better human interpretation

Why PCA?
- Dimensionality reduction
- Noise reduction
- Visualization
- Feature engineering
- Mitigating the overfitting problem
- Data compression
- Faster machine learning processing

Important Terminologies
- Variance
- Covariance
- Eigenvalues
- Eigenvectors
- Principal Components

Important Terminologies (Variance)
Variance is the average of the squared differences between each value and the mean.
Variance (σ²) = (sum of the squared differences from the mean) / (total number of values)
In mathematical notation: σ² = Σ(x − μ)² / n
Here, μ is the mean of the independent feature: Mean (μ) = (sum of all values) / (total number of values)

Important Terminologies (Variance)
The variance is a measure that indicates how much the data scatter around the mean.

Important Terminologies (Variance)
In mathematical notation: σ² = Σ(x − μ)² / n
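
To make the variance formula concrete, here is a minimal NumPy check (an added sketch, not from the original slides; the data values are illustrative):

```python
import numpy as np

x = np.array([2, 4, 4, 4, 5, 5, 7, 9])       # illustrative data

mu = x.sum() / len(x)                        # mean: sum of values / n
var = ((x - mu) ** 2).sum() / len(x)         # sigma^2 = sum((x - mu)^2) / n
print(mu, var)                               # 5.0 4.0

print(np.var(x))                             # same population variance via NumPy
```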

Important Terminologies (Covariance)
Covariance describes the relationship between a pair of random variables: how a change in one variable is associated with a change in the other. It can take any value between −∞ and +∞, where a negative value represents a negative relationship and a positive value represents a positive relationship. It is used for the linear relationship between variables and gives the direction of the relationship.

Important Terminologies (Covariance)
The formula for the covariance (Cov) between two random variables X and Y, each with N data points, is:
Cov(X, Y) = Σ(Xi − X̄)(Yi − Ȳ) / (N − 1)
Where:
- Cov(X, Y) is the covariance between X and Y
- N is the number of data points
- Xi and Yi represent individual data points for X and Y, and X̄ and Ȳ are their means

Important Terminologies (Covariance)
Example data:
X: 10, 12, 14, 8
Y: 40, 48, 56, 21
Covariance matrix (using the N − 1 divisor):
 6.67   37.67
37.67  224.92
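
As a sanity check on the example above, the covariance can be computed directly in NumPy (an added sketch; it assumes the sample divisor N − 1 used elsewhere in this deck):

```python
import numpy as np

X = np.array([10, 12, 14, 8])
Y = np.array([40, 48, 56, 21])

# Cov(X, Y) = sum((Xi - mean_X) * (Yi - mean_Y)) / (N - 1)
cov_xy = ((X - X.mean()) * (Y - Y.mean())).sum() / (len(X) - 1)
print(cov_xy)        # ~37.67

# Full 2x2 covariance matrix: variances on the diagonal, Cov(X, Y) off it
print(np.cov(X, Y))  # np.cov divides by N - 1 by default
```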

Compute Eigenvalues/Eigenvectors
Let A be a square N × N matrix and x a non-zero vector for which Ax = λx for some scalar value λ. Then:
λ = eigenvalue of matrix A
x = eigenvector of matrix A
Eigenvalues are found by solving the characteristic equation det(A − λI) = 0, which returns up to N eigenvalues.

Compute Eigenvalue / Eigenvectors
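As an illustrative stand-in for the worked computation on this slide, the NumPy sketch below (the matrix values are my own) finds the eigenvalues of a small matrix and verifies Ax = λx:

```python
import numpy as np

A = np.array([[4.0, 2.0],
              [1.0, 3.0]])                 # example 2x2 matrix (illustrative)

# Solves det(A - lambda*I) = 0 numerically; here lambda = 5 and 2
eigenvalues, eigenvectors = np.linalg.eig(A)
print(eigenvalues)                         # 5.0 and 2.0 (order may vary)

# Verify A x = lambda x for the first eigenpair
x = eigenvectors[:, 0]
print(np.allclose(A @ x, eigenvalues[0] * x))   # True
```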

How does PCA work?
Step 1: Standardize the data.
Step 2: Calculate the covariance matrix.
Step 3: Compute the eigenvectors and eigenvalues.
Step 4: Select the principal components.
Step 5: Project data onto the new basis.
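
For readers who want the five steps end-to-end in code, here is a minimal scikit-learn sketch (added for illustration; the synthetic dataset and n_components choice are my own, and scikit-learn performs the eigen-analysis internally via SVD, which is equivalent):

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
data = rng.normal(size=(100, 3))                # synthetic: 100 samples, 3 features

scaled = StandardScaler().fit_transform(data)   # Step 1: standardize
pca = PCA(n_components=2)                       # Steps 2-4 happen inside fit
projected = pca.fit_transform(scaled)           # Step 5: project onto the new basis

print(pca.explained_variance_ratio_)            # variance share kept by each PC
print(projected.shape)                          # (100, 2)
```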

Step-By-Step Explanation of PCA (Principal Component Analysis)
Step 1: Standardization
The main aim of this step is to standardize the range of the attributes so that each one of them lies within similar boundaries: x' = (x − μ) / σ
μ is the mean of the independent feature
σ is the standard deviation of the independent feature: σ = √[Σ(x − x̄)² / N] (the worked example below uses the sample version, dividing by n − 1)

Standardization
Dataset: consider a small dataset with two variables, X and Y, represented by the following data points:
X: [2, 3, 5, 7, 10]
Y: [4, 5, 7, 8, 11]
For variable X:
Mean (μX) = (2 + 3 + 5 + 7 + 10) / 5 = 5.4
Standard deviation (σX) = √[Σ(Xi − μX)² / (n − 1)] = √[(11.56 + 5.76 + 0.16 + 2.56 + 21.16) / 4] ≈ 3.21
For variable Y:
Mean (μY) = (4 + 5 + 7 + 8 + 11) / 5 = 7
Standard deviation (σY) = √[Σ(Yi − μY)² / (n − 1)] = √[(9 + 4 + 0 + 1 + 16) / 4] ≈ 2.74
Standardized X: [−1.06, −0.75, −0.12, 0.50, 1.43]
Standardized Y: [−1.10, −0.73, 0.00, 0.37, 1.46]
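
The same standardization in NumPy (an added check, not from the original deck):

```python
import numpy as np

X = np.array([2, 3, 5, 7, 10])
Y = np.array([4, 5, 7, 8, 11])

def standardize(v):
    # z = (x - mu) / sigma, using the sample standard deviation (ddof=1)
    return (v - v.mean()) / v.std(ddof=1)

print(standardize(X).round(2))   # [-1.06 -0.75 -0.12  0.5   1.43]
print(standardize(Y).round(2))   # [-1.1  -0.73  0.    0.37  1.46]
```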

Covariance Matrix Computation
The covariance matrix is used to express the correlation between any two or more attributes in a multidimensional dataset.
Variance is denoted by Var; covariance is denoted by Cov.

Covariance Matrix Computation
The covariance matrix has the form:
Cov(X, X)  Cov(X, Y)
Cov(Y, X)  Cov(Y, Y)
Using the covariance formula on the standardized data:
Cov(X, X) = Σ(standardized X · standardized X) / (n − 1) = (1.12 + 0.56 + 0.02 + 0.25 + 2.05) / 4 = 1.000
Cov(X, Y) = Σ(standardized X · standardized Y) / (n − 1) = (1.16 + 0.55 + 0.00 + 0.18 + 2.09) / 4 ≈ 0.996
Cov(Y, X) = Cov(X, Y) ≈ 0.996
Cov(Y, Y) = Σ(standardized Y · standardized Y) / (n − 1) = (1.20 + 0.53 + 0.00 + 0.13 + 2.13) / 4 = 1.000
Covariance Matrix:
1.000  0.996
0.996  1.000
(Standardizing with the sample standard deviation makes the diagonal entries exactly 1, so this is also the correlation matrix.)
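
A quick NumPy verification of this matrix (an added sketch):

```python
import numpy as np

X = np.array([2, 3, 5, 7, 10])
Y = np.array([4, 5, 7, 8, 11])
zx = (X - X.mean()) / X.std(ddof=1)   # standardized X
zy = (Y - Y.mean()) / Y.std(ddof=1)   # standardized Y

# 2x2 covariance matrix of the standardized data (divisor N - 1 = 4)
print(np.cov(zx, zy).round(3))
# [[1.    0.996]
#  [0.996 1.   ]]
```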

Compute Eigenvalues and Eigenvectors of the Covariance Matrix to Identify Principal Components
For our symmetric 2 × 2 covariance matrix with equal diagonal entries, the eigenvalues are 1 ± 0.996:
Eigenvalue 1 (λ1) = 1.996, Eigenvector 1 (v1) = [0.707, 0.707]
Eigenvalue 2 (λ2) = 0.004, Eigenvector 2 (v2) = [−0.707, 0.707]
Since λ1 / (λ1 + λ2) ≈ 0.998, the first principal component captures about 99.8% of the total variance.
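
This decomposition can be checked with NumPy (an added sketch; numpy.linalg.eigh is the right routine because the covariance matrix is symmetric, and it returns eigenvalues in ascending order):

```python
import numpy as np

C = np.array([[1.0, 0.996],
              [0.996, 1.0]])                  # covariance matrix from the previous step

eigenvalues, eigenvectors = np.linalg.eigh(C)
print(eigenvalues.round(3))                   # [0.004 1.996]
print(eigenvectors.round(3))                  # columns are eigenvectors (signs may flip)

print(eigenvalues.max() / eigenvalues.sum())  # ~0.998: PC1 keeps ~99.8% of the variance
```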

Select the Principal Components
The first principal component is the direction of greatest variability (variance) in the data.
The second is the next orthogonal (uncorrelated) direction of greatest variability.

Project Data onto Principal Components
To transform the data into the new principal component space, take the dot product of each standardized data point with the eigenvectors:
PC1 score = 0.707 · (standardized X) + 0.707 · (standardized Y)
PC2 score = −0.707 · (standardized X) + 0.707 · (standardized Y)
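
In code, the projection is a single matrix product (an added sketch using the rounded eigenvectors from above):

```python
import numpy as np

X = np.array([2, 3, 5, 7, 10])
Y = np.array([4, 5, 7, 8, 11])
Z = np.column_stack([(X - X.mean()) / X.std(ddof=1),
                     (Y - Y.mean()) / Y.std(ddof=1)])   # standardized data, shape (5, 2)

V = np.array([[0.707, -0.707],                # columns are v1 and v2
              [0.707,  0.707]])

scores = Z @ V                                # row i: (PC1 score, PC2 score) of point i
print(scores.round(2))

# Keeping only PC1 reduces the dataset from 2 columns to 1
print(scores[:, 0].round(2))                  # [-1.52 -1.05 -0.09  0.61  2.05]
```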

Applications of PCA
- Netflix movie recommendations
- Grocery shopping
- Fitness trackers
- Car shopping
- Real estate
- Manufacturing and quality control
- Sports analytics
- Renewable energy
- Smart cities

Advantages of PCA
- Prevents overfitting
- Speeds up other machine learning algorithms
- Improves visualization
- Dimensionality reduction
- Noise reduction

Limitations of PCA
- Linearity assumption
- Loss of interpretability
- Loss of information
- Sensitivity to scaling
- Orthogonal components only

Some Mathematical Problem
Given the following data, use PCA to reduce the dimension from 2 to 1 (a code sketch for checking your answer follows the table).

Feature | Example 1 | Example 2 | Example 3 | Example 4
X       | 4         | 8         | 13        | 7
Y       | 11        | 4         | 5         | 14
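
One way to check your answer is the short NumPy sketch below (added for illustration, not an official solution; it mean-centers the data, diagonalizes the covariance matrix, and keeps the top component):

```python
import numpy as np

X = np.array([4.0, 8.0, 13.0, 7.0])
Y = np.array([11.0, 4.0, 5.0, 14.0])

# Mean-center, build the covariance matrix, project onto the top eigenvector
Z = np.column_stack([X - X.mean(), Y - Y.mean()])
C = np.cov(Z, rowvar=False)               # [[14. -11.] [-11. 23.]] with N - 1 divisor
eigenvalues, eigenvectors = np.linalg.eigh(C)

v1 = eigenvectors[:, -1]                  # eigenvector of the largest eigenvalue
reduced = Z @ v1                          # the four examples in 1 dimension
print(eigenvalues.round(2))               # [ 6.62 30.38]
print(reduced.round(2))                   # 1-D coordinates (sign depends on convention)
```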


Thank You Q&A