ajaysubramanicareer
Aug 30, 2025
LINEAR REGRESSION
Aakash S (310623104002)
Aditya Vaibhav J (310623104006)
Ajay S (310623104009)
Akash A (310623104010)
Bharath D (310623104029)
Chezhian V (310623104032)
Subject: Foundations of Data Science
Class: II CSE-A
Date: 14/08/2024
Table of Contents
01 Introduction
02 Mathematical Foundation of Linear Regression
03 Data Preprocessing and Regression Metrics
04 Model Training and Evaluation
05 Applications and Limitations
Introduction
Linear regression is a statistical method for modeling the relationship between a dependent variable and one or more independent variables. It fits a straight line (or, with several independent variables, a plane) through a set of data points so that the value of one variable can be predicted from the others. Linear regression is also a machine-learning algorithm, specifically a supervised one: it learns from labelled datasets.
Mathematical Foundation
Univariate Linear Regression
Univariate linear regression involves a single independent variable. The relationship between this variable and the dependent variable is modeled with a linear equation. It is the simplest form of linear regression.
Equation: y = β0 + β1x
Here y is the dependent variable, x is the independent variable, β0 is the intercept, and β1 is the slope.
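A minimal sketch of fitting and using the univariate equation, assuming NumPy is available. The data values and the helper name predict are made up for illustration and are not taken from the slides.

```python
import numpy as np

# Illustrative data that lies exactly on y = 50 + 2x (an assumption for the demo)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([52.0, 54.0, 56.0, 58.0, 60.0])

# A degree-1 polynomial fit is a straight line; polyfit returns [slope, intercept]
b1, b0 = np.polyfit(x, y, deg=1)

def predict(x_new):
    """Hypothetical helper: evaluate the fitted line y = b0 + b1*x."""
    return b0 + b1 * x_new

print("intercept:", b0, "slope:", b1)
print("prediction at x = 6:", predict(6.0))
```

Once β0 and β1 are estimated, predicting for a new x is a single evaluation of the line.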
Multivariate Linear Regression
Multivariate linear regression (also called multiple linear regression) extends univariate linear regression to two or more independent variables, all of which are used to model a single dependent variable.
Equation: y = β0 + β1x1 + β2x2 + ... + βnxn
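The same idea with two independent variables can be sketched as a least-squares solve, assuming NumPy. The data is synthetic, and the names X_design and beta are illustrative choices rather than anything from the slides.

```python
import numpy as np

# Synthetic data generated from y = 1 + 2*x1 + 3*x2 with no noise
X = np.array([[1.0, 2.0], [2.0, 0.0], [3.0, 1.0], [4.0, 3.0], [5.0, 5.0]])
y = 1.0 + 2.0 * X[:, 0] + 3.0 * X[:, 1]

# Prepend a column of ones so that beta[0] plays the role of the intercept β0
X_design = np.column_stack([np.ones(len(X)), X])

# Least-squares solution of X_design @ beta ≈ y
beta, residuals, rank, singular_values = np.linalg.lstsq(X_design, y, rcond=None)

print("estimated [β0, β1, β2]:", beta)
```

Because the data here is noise-free, the estimated coefficients should match the ones used to generate it up to floating-point error.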
DATA PREPROCESSING:
1. Data collection
2. Data integration
3. Data cleaning
   i. Handling missing data
   ii. Handling outliers
   iii. Residual analysis
4. Data transformation
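Two of the cleaning steps above, handling missing data and handling outliers, can be sketched as follows. The slides do not say which techniques were used, so median imputation and the 1.5 x IQR rule are assumptions here, and the function names and values are made up.

```python
import numpy as np

def impute_median(values):
    """Replace NaN entries with the median of the observed entries."""
    values = np.asarray(values, dtype=float)
    filled = values.copy()
    filled[np.isnan(filled)] = np.nanmedian(values)
    return filled

def iqr_mask(values, k=1.5):
    """Boolean mask: True where a value lies within k*IQR of the quartiles."""
    values = np.asarray(values, dtype=float)
    q1, q3 = np.percentile(values, [25, 75])
    iqr = q3 - q1
    return (values >= q1 - k * iqr) & (values <= q3 + k * iqr)

# Made-up column with one missing value and one extreme value
raw = np.array([10.0, 12.0, np.nan, 11.0, 13.0, 95.0])

filled = impute_median(raw)      # step i: handle missing data
kept = filled[iqr_mask(filled)]  # step ii: drop values flagged as outliers

print("after imputation:", filled)
print("after outlier removal:", kept)
```

The median is used instead of the mean because the mean would be pulled toward the extreme value before that value is removed.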
Model Training and Evaluation
Splitting data
   Training set (75%)
   Testing set (25%)
Cross-validation
   k-Fold Cross-Validation
   Leave-One-Out Cross-Validation (LOOCV)
Understanding Model Coefficients
   E.g. y = β0 + β1x1 + β2x2
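The split and cross-validation steps above can be sketched with NumPy alone. This is a hand-rolled illustration under assumed synthetic data, and the helper names fit, predict, mse, and k_fold_mse are hypothetical. Libraries such as scikit-learn provide ready-made equivalents.

```python
import numpy as np

def fit(X, y):
    """Least-squares coefficients [β0, β1, ...] for y ≈ β0 + X @ [β1, ...]."""
    A = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return beta

def predict(X, beta):
    """Evaluate the fitted linear model on the rows of X."""
    return beta[0] + X @ beta[1:]

def mse(y_true, y_pred):
    return float(np.mean((y_true - y_pred) ** 2))

def k_fold_mse(X, y, k=5, seed=0):
    """Average held-out MSE over k folds; k = len(y) gives leave-one-out."""
    idx = np.random.default_rng(seed).permutation(len(y))
    scores = []
    for fold in np.array_split(idx, k):
        train = np.setdiff1d(idx, fold)  # every index not in the held-out fold
        beta = fit(X[train], y[train])
        scores.append(mse(y[fold], predict(X[fold], beta)))
    return float(np.mean(scores))

# Synthetic data: y = 4 + 3*x1 - 2*x2 plus small Gaussian noise
rng = np.random.default_rng(42)
X = rng.normal(size=(40, 2))
y = 4.0 + 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.1, size=40)

# 75% / 25% train-test split on shuffled indices
order = rng.permutation(len(y))
n_train = int(0.75 * len(y))
train_idx, test_idx = order[:n_train], order[n_train:]

beta = fit(X[train_idx], y[train_idx])
test_mse = mse(y[test_idx], predict(X[test_idx], beta))
cv_mse = k_fold_mse(X, y, k=5)
loocv_mse = k_fold_mse(X, y, k=len(y))

print("coefficients [β0, β1, β2]:", beta)
print("test MSE:", test_mse, "5-fold MSE:", cv_mse, "LOOCV MSE:", loocv_mse)
```

Each coefficient reads as the expected change in y for a one-unit change in that variable with the other held fixed, which is why the estimates should land near 4, 3, and -2 for this data.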
Analyzing Performance
Overfitting
Underfitting
Evaluation metrics
   Mean absolute error (MAE)
   Mean squared error (MSE)
   Root mean squared error (RMSE)
   R-squared
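The four evaluation metrics follow directly from their standard definitions. The sketch below assumes NumPy, and the function name regression_metrics and the sample values are illustrative only.

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """Return MAE, MSE, RMSE and R-squared for a set of predictions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    errors = y_true - y_pred

    mae = np.mean(np.abs(errors))   # average absolute error
    mse = np.mean(errors ** 2)      # average squared error
    rmse = np.sqrt(mse)             # same units as y
    ss_res = np.sum(errors ** 2)                      # unexplained variation
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)    # total variation
    r2 = 1.0 - ss_res / ss_tot

    return {"MAE": float(mae), "MSE": float(mse), "RMSE": float(rmse), "R2": float(r2)}

# Made-up true values and predictions
metrics = regression_metrics([3.0, 5.0, 7.0, 9.0], [2.0, 5.0, 8.0, 9.0])
print(metrics)
```

Comparing these metrics on the training set and on the testing set is one way to spot overfitting (much better on training data) or underfitting (poor on both).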
Implementing Linear Regression in Python
1. Importing the libraries and dataset
2. Data preprocessing: once the libraries are imported, we preprocess the data.
3. Finding the coefficients: the task is to find the line that best fits the scatter plot of the data, so that we can predict the response for any new feature value.
Estimating Coefficients Function
Plotting Regression Line Function
Main Function
plot_regression_line(x, y, b)
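The code on these slides was not recovered, so the following is only a sketch of what the three functions might look like. The signature plot_regression_line(x, y, b) is from the slide. The names estimate_coef and main, the dataset, and the plot styling are assumptions, and plotting assumes matplotlib is installed.

```python
import numpy as np

def estimate_coef(x, y):
    """Return (b0, b1): least-squares intercept and slope for y ≈ b0 + b1*x."""
    n = np.size(x)
    m_x, m_y = np.mean(x), np.mean(y)
    # Cross-deviation of x and y, and sum of squared deviations of x
    ss_xy = np.sum(y * x) - n * m_y * m_x
    ss_xx = np.sum(x * x) - n * m_x * m_x
    b1 = ss_xy / ss_xx
    b0 = m_y - b1 * m_x
    return b0, b1

def plot_regression_line(x, y, b):
    """Scatter the data and draw the fitted line; b is the pair (b0, b1)."""
    import matplotlib.pyplot as plt  # imported here so fitting works without it
    plt.scatter(x, y, color="m", marker="o", s=30)
    y_pred = b[0] + b[1] * x
    plt.plot(x, y_pred, color="g")
    plt.xlabel("x")
    plt.ylabel("y")
    plt.show()

def main():
    # Illustrative dataset, not the one used in the presentation
    x = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9], dtype=float)
    y = np.array([1, 3, 2, 5, 7, 8, 8, 9, 10, 12], dtype=float)
    b = estimate_coef(x, y)
    print("Estimated coefficients: b0 =", b[0], "b1 =", b[1])
    plot_regression_line(x, y, b)
```

Calling main() fits the line, prints the two coefficients, and opens the plot window.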
OUTPUT:
Assumptions of Simple Linear Regression
1. Linear relationship: There exists a linear relationship between the independent variable, x, and the dependent variable, y.
2. Independence: The residuals are independent. In time series data, there is no correlation between consecutive residuals.
3. Homoscedasticity: The residuals have constant variance at every level of x.
4. Normality: The residuals of the model are normally distributed.
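Some of these assumptions can be probed numerically from the residuals. The sketch below is a rough illustration rather than a full diagnostic suite: it computes the Durbin-Watson statistic for the independence assumption (values near 2 suggest no first-order autocorrelation, and the statistic lies between 0 and 4) and a crude low-x versus high-x variance ratio for homoscedasticity. The function name residual_checks and the synthetic data are assumptions.

```python
import numpy as np

def residual_checks(x, y):
    """Fit y ≈ b0 + b1*x and return simple residual diagnostics."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    b1, b0 = np.polyfit(x, y, deg=1)
    residuals = y - (b0 + b1 * x)

    # Durbin-Watson statistic on residuals taken in the order given
    durbin_watson = np.sum(np.diff(residuals) ** 2) / np.sum(residuals ** 2)

    # Crude homoscedasticity check: residual variance for low x vs high x
    order = np.argsort(x)
    half = len(x) // 2
    var_low = residuals[order[:half]].var()
    var_high = residuals[order[half:]].var()

    return {
        "durbin_watson": float(durbin_watson),
        "variance_ratio": float(var_high / var_low),
        "residual_mean": float(residuals.mean()),
    }

# Synthetic data that satisfies the assumptions by construction
rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 200)
y = 1.5 + 0.8 * x + rng.normal(scale=0.5, size=x.size)

checks = residual_checks(x, y)
print(checks)
```

A variance ratio far from 1 hints at non-constant variance. Normality is usually judged separately, for example with a histogram or Q-Q plot of the residuals.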
Applications of Linear Regression
Sales Forecasting: Predict future sales based on historical sales data, marketing spend, and other relevant factors.
Financial Modeling: Estimate future stock prices or asset values using financial indicators and economic data.
Customer Segmentation: Identify customer groups with similar spending patterns or preferences.
Behavioral Trends: Examine trends like response to promotions to refine segments and improve engagement.
Applications in Business and Finance
Linear regression is a versatile tool with numerous applications in business and finance. It is used to analyze trends, make forecasts, and improve decision-making.
Financial Forecasting: Predict future stock prices, interest rates, or economic growth.
Marketing Analytics: Analyze the effectiveness of advertising campaigns and customer segmentation.
Sales Forecasting: Predict future sales based on historical data, economic indicators, and marketing campaigns.
Risk Management: Estimate the probability of default on loans or other financial instruments.
Limitations and Considerations
While linear regression is a powerful tool, it has limitations that should be considered when applying it.
Non-linear Relationships: A plain linear model cannot capture non-linear relationships between variables unless the features are transformed first.
Data Quality: The fitted coefficients are only as reliable as the data; missing, noisy, or mis-recorded values carry straight through to the results.
Outlier Impact: Because squared errors are minimized, a few outliers can have a significant impact on the regression results.
Overfitting: A model that is too complex may overfit the data, leading to poor predictions on new data.
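The outlier effect is easy to see on a small example. This sketch, assuming NumPy and made-up data, fits the same ten points twice: once as generated and once with a single value corrupted.

```python
import numpy as np

# Ten points lying exactly on y = 1 + 2x
x = np.arange(10, dtype=float)
y_clean = 1.0 + 2.0 * x

# Corrupt one observation: the last value should be 19 but is recorded as 60
y_outlier = y_clean.copy()
y_outlier[-1] = 60.0

# polyfit with deg=1 returns [slope, intercept]
slope_clean, intercept_clean = np.polyfit(x, y_clean, deg=1)
slope_outlier, intercept_outlier = np.polyfit(x, y_outlier, deg=1)

print("slope without outlier:", slope_clean)
print("slope with one outlier:", slope_outlier)
```

One bad point out of ten is enough to change the fitted slope substantially, which is why outlier handling appears in the preprocessing steps.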
Conclusion Linear regression models relationships between variables with simplicity and clarity. It is a cornerstone of data science and machine learning, providing a straightforward method for predictive modeling and relationship analysis. Its applications span various fields, including finance and healthcare, making it invaluable for deriving insights and making informed decisions.
Riddles
1. I draw a line, from low to high,
To predict the points that touch the sky.
Ans: Least Squares
2. I tell you how well your line does fit,
If you should celebrate or throw a fit.
I range from zero to one,
What is my name, under the regression sun?
Ans: R-squared
Find the odd one out:
A) Linear relationship
B) Independence
C) Handling outliers
D) Normality