The Principal Component Analysis (PCA) technique was introduced by the mathematician Karl Pearson in 1901. It works on the condition that when data in a higher-dimensional space is mapped to a lower-dimensional space, the variance of the data in the lower-dimensional space should be maximized.
Principal Component Analysis (PCA) is a statistical procedure that uses an orthogonal transformation to convert a set of correlated variables into a set of uncorrelated variables. PCA is one of the most widely used tools in exploratory data analysis and in machine learning for predictive models.
Principal Component Analysis (PCA) is an unsupervised learning technique used to examine the interrelations among a set of variables. It is also known as general factor analysis, where regression determines a line of best fit.
The main goal of Principal Component Analysis (PCA) is to reduce the dimensionality of a dataset while preserving the most important patterns or relationships between the variables without any prior knowledge of the target variables.
Principal Component Analysis (PCA) is used to reduce the dimensionality of a data set by finding a new set of variables that is smaller than the original set, retains most of the sample's information, and is useful for the regression and classification of data.
Slide Content
Principal Component Analysis (PCA)
Outline: What is PCA? Dimensionality Reduction. Why PCA? Important Terminologies. How does PCA Work? Applications of PCA. Advantages and Limitations.
Introduction Principal Component Analysis, commonly referred to as PCA, is a powerful mathematical technique used in data analysis and statistics. At its core, PCA is designed to simplify complex datasets by transforming them into a more manageable form while retaining the most critical information: reducing the dimensionality of the dataset and increasing interpretability without losing information.
Dimensionality Reduction Dimensionality reduction refers to techniques that reduce the number of input variables in a dataset. Why DR? Fewer dimensions for a given dataset mean less computation or training time; redundancy is removed by dropping similar entries from the dataset; the data are compressed (reduced storage space); it helps to find the most significant features and skip the rest; and it leads to better human interpretation.
Why PCA? Dimensionality Reduction, Noise Reduction, Visualization, Feature Engineering, Overfitting Problem, Data Compression, Machine Learning Processing.
Important Terminologies: Variance, Covariance, Eigenvalues, Eigenvectors, Principal Components.
Important Terminologies (Variance) Variance is the average of the squared differences between each value and the mean. Variance (σ²) = (Sum of the squared differences from the mean) / (Total number of values). In mathematical notation: σ² = Σ(x - μ)² / n, where μ is the mean of the independent features: Mean (μ) = (Sum of all values) / (Total number of values).
Important Terminologies (Variance) The variance is a measure that indicates how much the data scatter around the mean.
Important Terminologies (Variance) In mathematical notation: σ² = Σ(x - μ)² / n.
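Illustration (not part of the original slides): a minimal NumPy sketch of the variance formula above. The array x is hypothetical example data.

```python
import numpy as np

x = np.array([2, 4, 4, 4, 5, 5, 7, 9])   # hypothetical example data

mu = x.mean()                      # Mean μ = Σx / n
var = ((x - mu) ** 2).mean()       # Variance σ² = Σ(x - μ)² / n

print(mu, var)       # 5.0 4.0
print(np.var(x))     # same result with NumPy's built-in (population variance, ddof=0)
```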
Important Terminologies (Covariance) Covariance describes the relationship between a pair of random variables, where a change in one variable is associated with a change in the other. It can take any value between -infinity and +infinity, where a negative value represents a negative relationship and a positive value represents a positive relationship. It measures the linear relationship between variables and gives the direction of that relationship.
Important Terminologies (Covariance) The formula for the covariance (Cov) between two random variables X and Y, each with N data points, is: Cov(X, Y) = Σ(Xi - X̄)(Yi - Ȳ) / N, where Cov(X,Y) is the covariance between X and Y, N is the number of data points, X̄ and Ȳ are the means of X and Y, and Xi and Yi represent individual data points for X and Y, respectively.
Important Terminologies (Covariance) Example data:
X: 10, 12, 14, 8
Y: 40, 48, 56, 21
Covariance matrix (computed from these X and Y values; see the sketch below).
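A short NumPy sketch (added for illustration) computing the covariance of the X and Y values above by the formula from the previous slide; np.cov with bias=True uses the same N denominator, while NumPy's default is the sample form (N - 1).

```python
import numpy as np

X = np.array([10, 12, 14, 8])
Y = np.array([40, 48, 56, 21])

# Cov(X, Y) = Σ(Xi - X̄)(Yi - Ȳ) / N
cov_xy = ((X - X.mean()) * (Y - Y.mean())).mean()
print(cov_xy)                        # 28.25

# Full 2x2 covariance matrix: variances on the diagonal, covariances off it
print(np.cov(X, Y, bias=True))
```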
Compute Eigenvalues / Eigenvectors Let A be a square N×N matrix and x a non-zero vector such that Ax = λx for some scalar λ. Then λ is an eigenvalue of matrix A and x is the corresponding eigenvector of A. The eigenvalues are found by solving det(A - λI) = 0, which yields N eigenvalues (counting multiplicity).
Compute Eigenvalue / Eigenvectors
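A minimal NumPy sketch of the eigen-decomposition described above, using a small symmetric matrix chosen purely for illustration (the matrix A is hypothetical, not taken from the slides):

```python
import numpy as np

# Hypothetical 2x2 symmetric (covariance-like) matrix
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# np.linalg.eigh solves A x = λ x for symmetric matrices;
# it returns eigenvalues in ascending order with eigenvectors as columns
eigenvalues, eigenvectors = np.linalg.eigh(A)

print(eigenvalues)      # [1. 3.]
print(eigenvectors)     # columns are the unit eigenvectors

# Verify A x = λ x for the largest eigenvalue
x = eigenvectors[:, -1]
print(np.allclose(A @ x, eigenvalues[-1] * x))   # True
```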
How does PCA work? Step 1: Standardize the data. Step 2: Calculate the covariance matrix. Step 3: Compute the eigenvectors and eigenvalues. Step 4: Select the principal components. Step 5: Project data onto the new basis.
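For reference (not from the original slides), these five steps are what a library call performs internally. A minimal scikit-learn sketch, using the small X/Y dataset from the standardization example below; note that StandardScaler divides by the population standard deviation (N) rather than N - 1, so its scaled values differ slightly from the hand-computed ones.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA

# Rows = samples, columns = features (X, Y); same values as the worked example below
data = np.array([[2, 4], [3, 5], [5, 7], [7, 8], [10, 11]], dtype=float)

scaled = StandardScaler().fit_transform(data)   # Step 1: standardize
pca = PCA(n_components=2).fit(scaled)           # Steps 2-4: covariance, eigen-decomposition, component selection
projected = pca.transform(scaled)               # Step 5: project onto the new basis

print(pca.explained_variance_ratio_)   # share of total variance captured by each component
print(projected)                        # coordinates of each sample in PC space
```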
Step-By-Step Explanation of PCA (Principal Component Analysis) Step 1: Standardization. The main aim of this step is to standardize the range of the attributes so that each one of them lies within similar boundaries: z = (x - μ) / σ, where μ is the mean of the independent features and σ is their standard deviation, σ = √[ Σ(x - μ)² / N ].
Standardization Dataset: Consider a small dataset with two variables, X and Y, represented by the following data points: X: [2, 3, 5, 7, 10] Y: [4, 5, 7, 8, 11]
For variable X: Mean (μX) = (2 + 3 + 5 + 7 + 10) / 5 = 5.4; Standard Deviation (σX) = √[Σ(Xi - μX)² / (n - 1)] = √[(11.56 + 5.76 + 0.16 + 2.56 + 21.16) / 4] ≈ 3.21
For variable Y: Mean (μY) = (4 + 5 + 7 + 8 + 11) / 5 = 7; Standard Deviation (σY) = √[Σ(Yi - μY)² / (n - 1)] = √[(9 + 4 + 0 + 1 + 16) / 4] ≈ 2.74
Standardized X: [-1.06, -0.75, -0.12, 0.50, 1.43]
Standardized Y: [-1.10, -0.73, 0.00, 0.37, 1.46]
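A small NumPy sketch (added for illustration) reproducing the standardization above; ddof=1 selects the n - 1 denominator used in the slide.

```python
import numpy as np

X = np.array([2, 3, 5, 7, 10], dtype=float)
Y = np.array([4, 5, 7, 8, 11], dtype=float)

# z = (x - μ) / σ, using the sample standard deviation (ddof=1)
z_x = (X - X.mean()) / X.std(ddof=1)
z_y = (Y - Y.mean()) / Y.std(ddof=1)

print(np.round(z_x, 2))   # approx. [-1.06 -0.75 -0.12  0.50  1.43]
print(np.round(z_y, 2))   # approx. [-1.10 -0.73  0.00  0.37  1.46]
```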
Covariance Matrix Computation The covariance matrix is used to express the relationships between any two or more attributes in a multidimensional dataset. Variance is denoted by Var and covariance by Cov; for two variables X and Y the matrix is [[Var(X), Cov(X,Y)], [Cov(Y,X), Var(Y)]].
Compute Eigenvalues and Eigenvectors of Covariance Matrix to Identify Principal Components Let's assume we find two eigenvalues and corresponding eigenvectors: Eigenvalue 1 (λ1) = 1.50 Eigenvector 1 (v1) = [0.707, 0.707] Eigenvalue 2 (λ2) = 1.05 Eigenvector 2 (v2) = [-0.707, 0.707]
Select the Principal Components. The first principal component is the direction of greatest variability (variance) in the data; the second is the next orthogonal (uncorrelated) direction of greatest variability.
Project Data onto Principal Components To transform the data into the new principal component space, we take the dot product of each standardized data point with the eigenvectors: PC1 score = (Standardized X, Standardized Y) · v1 and PC2 score = (Standardized X, Standardized Y) · v2.
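Putting the steps together, a minimal NumPy sketch that projects the slide's X/Y example onto its principal components; here the eigenvalues and eigenvectors are computed from the data rather than assumed.

```python
import numpy as np

X = np.array([2, 3, 5, 7, 10], dtype=float)
Y = np.array([4, 5, 7, 8, 11], dtype=float)

# Step 1: standardize each variable (n - 1 denominator, as in the slides)
Z = np.column_stack([(X - X.mean()) / X.std(ddof=1),
                     (Y - Y.mean()) / Y.std(ddof=1)])

# Step 2: covariance matrix of the standardized data
C = np.cov(Z, rowvar=False)

# Step 3: eigenvalues and eigenvectors (eigh returns them in ascending order)
eigenvalues, eigenvectors = np.linalg.eigh(C)

# Step 4: order components by decreasing eigenvalue
order = np.argsort(eigenvalues)[::-1]
components = eigenvectors[:, order]

# Step 5: project the standardized data onto the principal components
scores = Z @ components
print(np.round(scores, 2))   # column 0 = PC1 scores, column 1 = PC2 scores
```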
Applications of PCA: Netflix Movie Recommendations, Grocery Shopping, Fitness Trackers, Car Shopping, Real Estate, Manufacturing and Quality Control, Sports Analytics, Renewable Energy, Smart Cities.
Advantages of PCA: Prevents Overfitting, Speeds Up Other Machine Learning Algorithms, Improves Visualization, Dimensionality Reduction, Noise Reduction.
Limitations of PCA: Linearity Assumption, Loss of Interpretability, Loss of Information, Sensitivity to Scaling, Orthogonal Components.
Some Mathematical Problem Given the following data, use PCA to reduce the dimension from 2 to 1.
Feature | Example 1 | Example 2 | Example 3 | Example 4
X       | 4         | 8         | 13        | 7
Y       | 11        | 4         | 5         | 14
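The slides do not include a worked solution; the sketch below shows how the exercise could be checked numerically with NumPy, applying the same steps and keeping only the first principal component to reduce the data from 2-D to 1-D.

```python
import numpy as np

# Data from the exercise: four examples with features X and Y
data = np.array([[4, 11], [8, 4], [13, 5], [7, 14]], dtype=float)

# Center the data (the slides standardize; plain centering is used here for simplicity)
Z = data - data.mean(axis=0)

# Covariance matrix and its eigen-decomposition
C = np.cov(Z, rowvar=False)
eigenvalues, eigenvectors = np.linalg.eigh(C)

# Keep only the eigenvector with the largest eigenvalue (2-D -> 1-D)
pc1 = eigenvectors[:, np.argmax(eigenvalues)]
reduced = Z @ pc1

print(np.round(reduced, 2))   # one coordinate per example along the first principal component
```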