Laptop_Price_Prediction_Project_Presentation.pptx

govardhansingu1 307 views 44 slides Jul 10, 2024

About This Presentation

1) Laptop Price Prediction Model Using Machine Learning & Python:
Objective: The objective of this project is to develop a reliable predictive machine learning model that assists consumers in predicting the fair market price of laptops based on various features such as specifications, bran...


Slide Content

By Govardhan, Student ID S8244, Odin School

Project Overview: The aim of this project is to develop a robust machine learning model that accurately predicts laptop prices for our client, M/s SmartTech Co. As the laptop market continues to expand with a myriad of brands and specifications, a precise pricing model becomes crucial for both consumers and manufacturers.

Business Objective and Constraints:
"Stay competitive in the market"
Objectives:
Accurate Pricing: Accurately predict laptop prices based on various features, to stay competitive in the market.
Assess the impact of brand on pricing, providing insights into brand perception and market demand.
Constraints:
SmartTech Co. needs to position its laptops strategically in the market.
The company needs to improve its share of the laptop market by assessing how different features and different brands influence laptop pricing.
Insights into the factors influencing laptop prices will empower SmartTech Co. in market positioning and strategy.

CRISP-ML(Q) Methodology: CRISP-ML(Q) (Cross-Industry Standard Process for Machine Learning with Quality Assurance) is one of the standard processes used in industry to solve business problems with the help of machine learning models.

Technical Stacks

Project Architecture: Business Understanding, Data Understanding, Data Collection, EDA, Data Preprocessing, Data Visualization, Model Selection, Model Evaluation Metrics, Model Deployment, Hyper-Parameter Tuning

Data Exploration & Understanding
The dataset consists of 1303 rows and 13 columns:
'Unnamed: 0.1' - Serial number.
'Unnamed: 0' - Serial number.
'Company' - Brand name of the laptop.
'TypeName' - Type of laptop.
'Inches' - Screen size of the laptop.
'ScreenResolution' - Combines 3 features of the laptop.
'Cpu' - Brand and model of the processor.
'Ram' - Random Access Memory.
'Memory' - Combined values for the different types of storage.
'Gpu' - Graphics processing unit (GPU), which helps handle graphics-related work.
'OpSys' - Operating system of the laptop.
'Weight' - Weight of the laptop in kg.
'Price' - Price of the laptop.

Data Exploration & Understanding: Most of the columns in the dataset are noisy, and some columns contain several features mixed together. We split these columns and extract the individual features.

Extracting from Noisy Columns
ScreenResolution: X_Resolution, Y_Resolution, TouchScreen, IPS
Memory: FlashStorage, SSD, HDD, Hybrid
Inches: inserted ppi column
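The extraction above can be sketched with pandas as follows. This is a minimal illustration, not the project's actual code: the two sample rows are made up, the helper `storage_gb` is hypothetical, 1 TB is assumed to count as 1000 GB, and ppi is assumed to be the diagonal pixel count divided by the screen size in inches.

```python
import re
import pandas as pd

# Two made-up rows in the style of the raw columns (not the project's data)
df = pd.DataFrame({
    "Inches": [13.3, 15.6],
    "ScreenResolution": ["IPS Panel Retina Display 2560x1600",
                         "Full HD / Touchscreen 1920x1080"],
    "Memory": ["128GB SSD", "256GB SSD +  1TB HDD"],
})

# Binary flags and pixel dimensions from ScreenResolution
df["Touchscreen"] = df["ScreenResolution"].str.contains("Touchscreen").astype(int)
df["IPS"] = df["ScreenResolution"].str.contains("IPS").astype(int)
res = df["ScreenResolution"].str.extract(r"(\d+)x(\d+)").astype(int)
df["X_Resolution"] = res[0]
df["Y_Resolution"] = res[1]

# Assumed definition: ppi = diagonal resolution in pixels / diagonal size in inches
df["ppi"] = (df["X_Resolution"] ** 2 + df["Y_Resolution"] ** 2) ** 0.5 / df["Inches"]


def storage_gb(memory: str, kind: str) -> int:
    """Hypothetical helper: total GB of every `kind` drive named in a Memory string."""
    total = 0.0
    for size, unit in re.findall(rf"(\d+(?:\.\d+)?)\s*(GB|TB)\s+{kind}", memory):
        total += float(size) * (1000 if unit == "TB" else 1)  # assumes 1 TB = 1000 GB
    return int(total)


# One numeric column per storage type
for kind, col in [("SSD", "SSD"), ("HDD", "HDD"),
                  ("Flash Storage", "FlashStorage"), ("Hybrid", "Hybrid")]:
    df[col] = df["Memory"].apply(lambda m, k=kind: storage_gb(m, k))

print(df[["Touchscreen", "IPS", "X_Resolution", "Y_Resolution", "ppi", "SSD", "HDD"]])
```

Once ppi exists, Inches and the two resolution columns carry little extra information, which matches the later decision to drop them.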

The Price column is right-skewed. Most laptop prices fall between 20,000 and 70,000. The target column is not normally distributed.
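A right-skewed target is commonly handled with a log transform before fitting a regression model. The slides do not say whether this was done here, so the sketch below is only an illustration on synthetic prices (the distribution parameters are arbitrary assumptions): it measures skewness before and after taking the logarithm.

```python
import numpy as np
import pandas as pd

# Synthetic right-skewed prices, for illustration only
rng = np.random.default_rng(0)
price = pd.Series(rng.lognormal(mean=10.8, sigma=0.6, size=1000), name="Price")

skew_raw = price.skew()          # positive for a right-skewed column
skew_log = np.log(price).skew()  # much closer to 0 after the log transform

print(f"skewness of Price:      {skew_raw:.2f}")
print(f"skewness of log(Price): {skew_log:.2f}")
```

If a model is trained on log(Price), its predictions need `np.exp` applied to return to the original price scale.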

Company: The brand name of the laptop. Company is compared against price with a box plot. Acer, HP, Asus, Dell, and Lenovo start from the low budget range, and their average price is also in the budget range.

TypeName: The type of laptop. Compared against price, Notebook and Netbook are in the budget range, whereas Gaming and Workstation are costly. Ultrabook and 2 in 1 Convertible fall in between.

Inches is plotted against Price. Price does not appear to depend much on Inches.

ppi: With the resolution and inches extracted, a new column ppi (pixels per inch) is added. It describes the image quality of the screen, which is an important feature for a laptop. The scatter plot shows that ppi influences the price.

CPU: The Cpu column gives the brand and model of the processor. As each brand has many models, the models are grouped by brand and model family. Intel Core i7 is the costliest processor and Intel Core i5 the next; Intel Core i3, AMD, and other Intel processors are in the lower budget range, below 50,000.

Ram: Random Access Memory. Price increases with RAM capacity; the two are positively correlated.

Gpu_brand: The graphics processing unit (GPU) handles graphics-related work such as graphics, effects, and videos. Cost varies with the GPU brand: AMD is in the budget range and Nvidia is costly. GPU brand influences price.

OpSys: The operating system of the laptop. Windows laptops range from low budget to high end, while Mac laptops start in the high range, from 50,000. OS influences price.

Checking correlation between numerical values: heat map to check the correlation

Numerical Columns

Dropped the unnecessary columns that were split/extracted: ['ScreenResolution', 'Cpu', 'Memory', 'Gpu', 'OpSys']
Dropped the columns Inches, x_resolution, and y_resolution, as ppi is calculated from these 3 columns: ['Inches', 'x_resol', 'y_resol']
Independent Variables:
# Numerical Columns: ['Ram', 'Weight', 'ppi', 'HDD', 'SSD', 'Hybrid', 'Flash_Storage']
# Categorical Columns:
Nominal: ['Company', 'TypeName', 'IPS', 'Touchscreen', 'Cpu_Processor', 'Gpu_brand', 'OS']
Ordinal: No ordinal columns
Target Variable: 'Price'

MACHINE LEARNING MODEL

Gradient Boosting Regression

Metrics
Model_Name                     R-squared (R²)        Mean Absolute Error (MAE)
Linear Regression              0.803047385207909     0.21097822649880946
Random Forest Regression       0.8750357724697907    0.15464693040956579
Gradient Boosting Regressor    0.8773723726767352    0.16359943883063252
Conclusion: Random Forest has the lowest Mean Absolute Error and an R² nearly as high as Gradient Boosting's, hence it is considered for hyperparameter tuning.
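A comparison like the one above can be produced with scikit-learn roughly as follows. This is a sketch under assumptions: the data is synthetic (`make_regression` stands in for the prepared laptop features), the train/test split ratio and model settings are illustrative, and the printed numbers will not match the table.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the prepared features and target
X, y = make_regression(n_samples=300, n_features=8, noise=10.0, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

models = {
    "Linear Regression": LinearRegression(),
    "Random Forest Regression": RandomForestRegressor(n_estimators=100, random_state=42),
    "Gradient Boosting Regressor": GradientBoostingRegressor(random_state=42),
}

# Fit each model on the same split and score it on the held-out rows
results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    results[name] = {
        "r2": r2_score(y_test, pred),
        "mae": mean_absolute_error(y_test, pred),
    }

for name, scores in results.items():
    print(f"{name}: R2={scores['r2']:.3f}, MAE={scores['mae']:.3f}")
```

Scoring every model on the same held-out split keeps the R² and MAE values directly comparable.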

Random Forest Regressor - Hyperparameter Tuning

Random_search

Grid_search

RANDOM FOREST REGRESSION with GRID SEARCH BEST PARAMETERS
RANDOM FOREST REGRESSION with RANDOM SEARCH BEST PARAMETERS
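Random search and grid search over a Random Forest can be sketched with scikit-learn's `RandomizedSearchCV` and `GridSearchCV`. The slides do not list the values that were tried, so the search space, the number of folds, and the scoring metric below are hypothetical, and the data is synthetic.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV, RandomizedSearchCV

# Synthetic stand-in for the prepared training data
X, y = make_regression(n_samples=200, n_features=8, noise=10.0, random_state=0)

# Hypothetical search space (not taken from the slides)
param_space = {
    "n_estimators": [50, 100],
    "max_depth": [None, 5, 10],
    "min_samples_split": [2, 5],
}

# Random search: evaluates only n_iter sampled combinations
random_search = RandomizedSearchCV(
    RandomForestRegressor(random_state=0),
    param_distributions=param_space,
    n_iter=4,
    cv=3,
    scoring="neg_mean_absolute_error",
    random_state=0,
)
random_search.fit(X, y)

# Grid search: evaluates every combination in the space
grid_search = GridSearchCV(
    RandomForestRegressor(random_state=0),
    param_grid=param_space,
    cv=3,
    scoring="neg_mean_absolute_error",
)
grid_search.fit(X, y)

print("Random search best parameters:", random_search.best_params_)
print("Grid search best parameters:  ", grid_search.best_params_)
```

Random search is cheaper because it tries only a sample of the combinations, while grid search is exhaustive over the listed values.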

Questionnaires

Does the brand of the laptop significantly influence its price?
Conclusion: Yes, the brand of the laptop significantly influences its price.
Model prediction for laptops from lesser-known brands: the model has high accuracy.
The accuracy of our lesser-known brands model is 94.0%
Mean Absolute Error (MAE): 0.152989
Mean Squared Error (MSE): 0.0476263
Root Mean Squared Error (RMSE): 0.21823452

How does the model perform on laptops with high-end specifications compared to budget laptops?
Metrics     High End Laptops Model    Budget Laptops Model
Accuracy    71.0 %                    81.0 %
MAE         0.16589435402600275       0.14780626075787437
MSE         0.047716541653714244      0.04162336390317959
RMSE        0.21844116291055182       0.20401804798394574
Conclusion: From the above metrics, the Budget Laptops model performs better than the High-End Specifications model.

Implement a mechanism for the model to make predictions for new laptops entering the market.
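One possible mechanism is to wrap the preprocessing and the model in a single scikit-learn `Pipeline`, so that a new laptop can be passed in as a plain dictionary. The sketch below is an assumption about how this could be done, not the project's implementation: the training rows are made up, only a few of the columns are used, and `predict_price` is a hypothetical helper. `handle_unknown="ignore"` lets the encoder accept a brand it never saw during training.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestRegressor
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

nominal = ["Company", "TypeName"]
numeric = ["Ram", "ppi", "SSD"]

# Tiny made-up training set, for illustration only
train = pd.DataFrame({
    "Company": ["Dell", "HP", "Apple", "Acer"],
    "TypeName": ["Notebook", "Notebook", "Ultrabook", "Gaming"],
    "Ram": [8, 4, 16, 16],
    "ppi": [141.2, 100.5, 227.0, 141.2],
    "SSD": [256, 0, 512, 512],
    "Price": [45000.0, 25000.0, 95000.0, 80000.0],
})

# Encoding and model travel together, so new rows get the same treatment
pipeline = Pipeline([
    ("prep", ColumnTransformer(
        [("onehot", OneHotEncoder(handle_unknown="ignore"), nominal)],
        remainder="passthrough",  # numeric columns pass through unchanged
    )),
    ("model", RandomForestRegressor(n_estimators=50, random_state=0)),
])
pipeline.fit(train[nominal + numeric], train["Price"])


def predict_price(laptop: dict) -> float:
    """Hypothetical helper: predict the price of one laptop given as a dict."""
    row = pd.DataFrame([laptop])[nominal + numeric]
    return float(pipeline.predict(row)[0])


# A brand that does not appear in the training data
new_laptop = {"Company": "Framework", "TypeName": "Ultrabook",
              "Ram": 16, "ppi": 200.0, "SSD": 512}
estimate = predict_price(new_laptop)
print(f"Estimated price: {estimate:.0f}")
```

An unseen brand is encoded as all zeros, so the estimate then rests on the remaining features; retraining on fresh data is still needed as new models and prices appear.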

INSIGHTS
Price is right-skewed, and most laptop prices range between 20,000 and 70,000.
RAM, brand, laptop type, display type (IPS), SSD, and resolution (ppi) are the features that most affect laptop prices.
The brand of a laptop significantly impacts its price, as observed in the box plot.

Limitations and Challenges:
Price Volatility: Laptop prices can change frequently due to market dynamics, new releases, and technological advancements, making it difficult for the model to stay accurate over time.
Encoding categorical variables (e.g., brand, CPU type) can be challenging and may require domain-specific knowledge to handle properly.
Hyperparameter Tuning: Selecting and tuning the hyperparameters of the model is often challenging and time-consuming.
Because there are many different brands with different categories in each specification, some features contain many extreme values that cannot be treated, which affects the price prediction.
Data Size: Insufficient data can limit the model's ability to learn patterns, leading to poor performance.
Data Bias: If the dataset is biased (e.g., over-representation of certain brands or specifications), the model may not generalize well to new data.
Some features, specifications, or laptop configurations cannot be applied.

MY PROJECT WORKING VIDEO

THANK YOU