1) Laptop Price Prediction Model Using Machine Learning & Python :
Objective: The objective of this project is to develop a reliable machine learning model that helps consumers estimate the fair market price of a laptop from its features, such as specifications, brand, and other relevant factors.
Tools Used: Python libraries (Pandas, Matplotlib, Seaborn, scikit-learn), machine learning algorithms (Linear Regression, Random Forest, XGBoost), Jupyter Notebook, dataset
Project Outcome: The primary outcome of the project is a predictive model with a GUI that estimates the price of a laptop from its specifications with about 90% accuracy. The model's performance metrics (such as accuracy and mean absolute error) will be evaluated to gauge its effectiveness. Additionally, the model can provide insights into which features most significantly influence laptop prices.
Learning Outcomes:
Exploratory Data Analysis, Data Preprocessing, Feature Selection, Model Selection, Hyperparameter Tuning, Model Deployment, GUI Interface
Data Preprocessing: Cleaning and preparing the dataset for training, handling null or missing values and outliers, and encoding categorical variables.
Feature Selection: Identifying the most relevant features that impact laptop prices using techniques like correlation analysis or feature importance.
Model Selection and Evaluation: Experimenting with various machine learning algorithms (Linear regression, Random forest, XGBoost) to determine the best-performing model. Evaluating models using appropriate metrics and techniques such as cross-validation.
Hyperparameter Tuning: Optimizing model performance by fine-tuning hyperparameters with techniques such as grid search (GridSearchCV) or randomized search (RandomizedSearchCV).
Deployment Considerations: Understanding considerations for deploying a machine learning model into a real-world application, including scalability, interpretability, and ongoing maintenance.
Societal Impact: Estimating fair laptop prices from objective data empowers consumers to make informed decisions. The project also shows transparently how different features affect laptop pricing, potentially influencing pricing strategies and consumer expectations.
Size: 37.42 MB
Language: en
Added: Jul 10, 2024
Slides: 44 pages
Slide Content
By Govardhan Student Id - S8244 Odin School
Project Overview: The aim of the project is to develop a robust machine learning model that accurately predicts laptop prices for our client, M/s SmartTech Co. As the laptop market continues to expand with a myriad of brands and specifications, a precise pricing model becomes crucial for both consumers and manufacturers.
Business Objective and Constraints: "Stay competitive in the market"
Objectives:
Accurate pricing: Accurately predict laptop prices based on various features, to stay competitive in the market.
Assess the impact of brand on pricing, providing insights into brand perception and market demand.
Constraints:
SmartTech Co. needs to position its laptops strategically in the market.
The company needs to improve its share of the laptop market by assessing how different features and brands influence laptop pricing.
Insights into the factors influencing laptop prices will empower SmartTech Co. in market positioning and strategy.
CRISP-ML(Q) Methodology: CRISP-ML(Q) (Cross-Industry Standard Process for Machine Learning with Quality Assurance) is one of the standard processes used in industry to solve business problems with the help of machine learning models.
Technical Stacks
Project Architecture: Business Understanding, Data Understanding, Data Collection, EDA, Data Preprocessing, Data Visualization, Model Selection, Model Evaluation Metrics, Model Deployment, Hyper-Parameter Tuning
Data Exploration & Understanding
The dataset consists of 1303 rows and 13 columns:
'Unnamed: 0.1' - Serial number only
'Unnamed: 0' - Serial number only
'Company' - Brand name of the laptop
'TypeName' - Type of laptop
'Inches' - Screen size of the laptop
'ScreenResolution' - Screen resolution, mixed with 3 features of the laptop
'Cpu' - Brand and model of the processor
'Ram' - Random Access Memory
'Memory' - Combined values for the different types of storage
'Gpu' - Graphics processing unit (GPU), which handles graphics-related work
'OpSys' - Operating system of the laptop
'Weight' - Weight of the laptop in kg
'Price' - Price of the laptop
Data Exploration & Understanding: Most columns in the dataset are noisy, and some contain several features mixed together. We slice these columns and extract the individual features.
Extracting from Noisy Columns
ScreenResolution: X_Resolution, Y_Resolution, TouchScreen, IPS
Memory: FlashStorage, SSD, HDD, Hybrid
Inserted column: ppi (derived from Inches and the extracted resolution)
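The extraction step above can be sketched with plain string parsing. This is only an illustration, not the project's notebook code: the raw formats ("IPS Panel Full HD / Touchscreen 1920x1080", "128GB SSD +  1TB HDD"), the function names, and the choice of 1 TB = 1000 GB are all assumptions.

```python
import re


def parse_screen_resolution(text):
    """Split a raw ScreenResolution string into four derived features."""
    match = re.search(r"(\d+)\s*x\s*(\d+)", text)
    if match is None:
        raise ValueError(f"no WIDTHxHEIGHT pattern in {text!r}")
    lowered = text.lower()
    return {
        "X_Resolution": int(match.group(1)),
        "Y_Resolution": int(match.group(2)),
        "TouchScreen": int("touchscreen" in lowered),
        "IPS": int("ips" in lowered),
    }


def parse_memory(text):
    """Sum the capacity in GB per storage type found in a raw Memory string."""
    sizes = {"SSD": 0, "HDD": 0, "FlashStorage": 0, "Hybrid": 0}
    for part in text.split("+"):
        match = re.search(r"(\d+(?:\.\d+)?)\s*(GB|TB)", part, flags=re.IGNORECASE)
        if match is None:
            continue
        # Assumption: 1 TB is counted as 1000 GB.
        factor = 1000 if match.group(2).upper() == "TB" else 1
        capacity_gb = int(float(match.group(1)) * factor)
        lowered = part.lower()
        if "ssd" in lowered:
            sizes["SSD"] += capacity_gb
        elif "hdd" in lowered:
            sizes["HDD"] += capacity_gb
        elif "flash" in lowered:
            sizes["FlashStorage"] += capacity_gb
        elif "hybrid" in lowered:
            sizes["Hybrid"] += capacity_gb
    return sizes


print(parse_screen_resolution("IPS Panel Full HD / Touchscreen 1920x1080"))
print(parse_memory("128GB SSD +  1TB HDD"))
```

With pandas, each function could be applied row by row (for example with `Series.apply`) and the resulting dictionaries expanded into new columns.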
The Price column is right-skewed. Most laptop prices fall between 20,000 and 70,000. The target column is not normally distributed.
Company: The brand name of the laptop. A box plot compares Company against price. Acer, HP, Asus, Dell, and Lenovo start in the low budget range, and their average prices are also in the budget range.
TypeName: The type of laptop. Compared against Price, Notebooks and Netbooks are in the budget range, whereas Gaming laptops and Workstations are expensive. Ultrabooks and 2-in-1 Convertibles fall in between.
Inches: The Inches column plotted against Price. Price does not appear to depend much on screen size.
ppi: With the resolution and inches extracted, a new column, ppi (pixels per inch), was added. It describes the image quality of the screen, which is an important feature of a laptop. The scatter plot shows that ppi influences the price.
CPU: The Cpu column gives the brand and model of the processor. As each brand has many models, the models are grouped by brand/model family. Intel Core i7 is the most expensive processor and Intel Core i5 the next; Intel Core i3, AMD, and other Intel processors are in the budget range below 50,000.
Ram: Random Access Memory. As RAM capacity increases, the price increases; the two are positively correlated.
Gpu_brand: The graphics processing unit (GPU) handles graphics-related work such as graphics, effects, and videos. The cost varies with the brand: AMD is in the budget range, while Nvidia is expensive. The GPU influences price.
OpSys: The operating system of the laptop. Windows laptops range from low budget to high end, while Mac laptops start in the high range, from 50,000. The OS influences price.
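The slides do not show how ppi was computed. A common definition, assumed here, divides the diagonal pixel count by the diagonal screen size in inches; the function name is hypothetical.

```python
import math


def pixels_per_inch(x_resolution, y_resolution, inches):
    """Diagonal pixel count divided by the diagonal screen size in inches."""
    if inches <= 0:
        raise ValueError("screen size must be positive")
    return math.hypot(x_resolution, y_resolution) / inches


# A Full HD panel on a 15.6-inch screen comes out at roughly 141 ppi.
print(round(pixels_per_inch(1920, 1080, 15.6), 2))
```

Once ppi exists, the Inches and resolution columns carry little extra information, which matches the decision to drop them later.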
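One way the CPU grouping might look is sketched below. The category labels and the raw strings are assumptions; the "Other Processor" fallback for names that are neither Intel nor AMD is an addition for completeness, not something the slides mention.

```python
def group_cpu(cpu_name):
    """Map a raw Cpu string to a coarse processor family."""
    name = cpu_name.lower()
    if "intel" in name:
        for tier in ("i7", "i5", "i3"):
            if f"core {tier}" in name:
                return f"Intel Core {tier}"
        return "Other Intel Processor"
    if "amd" in name:
        return "AMD Processor"
    # Assumption: anything else falls into a catch-all bucket.
    return "Other Processor"


for raw in ("Intel Core i5 7200U 2.5GHz",
            "Intel Celeron Dual Core N3350 1.1GHz",
            "AMD A9-Series 9420 3GHz"):
    print(raw, "->", group_cpu(raw))
```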
Checking correlation between numerical values: heat map of the correlations
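A correlation check of this kind can be sketched with pandas. The five rows below are made-up illustrative values, not data from the project.

```python
import pandas as pd

# Toy rows for illustration only; the real dataset has 1303 rows.
df = pd.DataFrame({
    "Ram": [4, 8, 8, 16, 32],
    "Weight": [2.2, 1.9, 2.0, 1.4, 2.6],
    "ppi": [100.5, 141.2, 141.2, 220.5, 282.4],
    "Price": [25000, 45000, 52000, 90000, 150000],
})

# Pairwise Pearson correlation between the numerical columns.
corr = df.corr()

# Correlation of each feature with the target, strongest first.
price_corr = corr["Price"].drop("Price").sort_values(ascending=False)
print(price_corr)
```

Passing `corr` to `seaborn.heatmap(corr, annot=True)` would render the matrix as a heat map.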
Numerical Columns
Dropped the unnecessary columns that were split/extracted: ['ScreenResolution', 'Cpu', 'Memory', 'Gpu', 'OpSys']
Dropped the columns Inches, x_resolution, and y_resolution, as ppi is calculated from these 3 columns: ['Inches', 'x_resol', 'y_resol']
Independent Variables:
Numerical columns: ['Ram', 'Weight', 'ppi', 'HDD', 'SSD', 'Price', 'Hybrid', 'Flash_Storage'] ('Price' is numerical but is the target, not an independent variable)
Categorical columns:
Nominal: ['Company', 'TypeName', 'IPS', 'Touchscreen', 'Cpu_Processor', 'Gpu_brand', 'OS']
Ordinal: no ordinal columns
Target Variable: 'Price'
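The column split above maps naturally onto a scikit-learn `ColumnTransformer`. The sketch below assumes one-hot encoding for the nominal columns (the slides do not name the encoder) and leaves 'Price' out of the features because it is the target; the two rows are invented.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder

numeric_cols = ["Ram", "Weight", "ppi", "HDD", "SSD", "Hybrid", "Flash_Storage"]
nominal_cols = ["Company", "TypeName", "IPS", "Touchscreen",
                "Cpu_Processor", "Gpu_brand", "OS"]

# Assumption: nominal columns are one-hot encoded, numeric ones pass through.
preprocessor = ColumnTransformer(
    transformers=[
        ("nominal", OneHotEncoder(handle_unknown="ignore"), nominal_cols),
    ],
    remainder="passthrough",
)

# Two invented rows, only to show the shape of the encoded matrix.
df = pd.DataFrame({
    "Ram": [8, 16], "Weight": [2.1, 1.3], "ppi": [141.2, 226.9],
    "HDD": [1000, 0], "SSD": [0, 512], "Hybrid": [0, 0], "Flash_Storage": [0, 0],
    "Company": ["Dell", "HP"], "TypeName": ["Notebook", "Ultrabook"],
    "IPS": [0, 1], "Touchscreen": [0, 1],
    "Cpu_Processor": ["Intel Core i5", "Intel Core i7"],
    "Gpu_brand": ["Intel", "Nvidia"], "OS": ["Windows", "Windows"],
    "Price": [45000, 95000],
})

X = df[numeric_cols + nominal_cols]
encoded = preprocessor.fit_transform(X)
print(encoded.shape)
```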
MACHINE LEARNING MODEL
Gradient Boosting Regression
Model_Name                     R-squared (R²)        Mean Absolute Error (MAE)
Linear Regression              0.803047385207909     0.21097822649880946
Random Forest Regression       0.8750357724697907    0.15464693040956579
Gradient Boosting Regressor    0.8773723726767352    0.16359943883063252
Metrics Conclusion: The Random Forest model has the lowest Mean Absolute Error (0.155) and an R² (0.875) nearly equal to the highest value (0.877, Gradient Boosting). Hence it is selected for hyperparameter tuning.
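A comparison table like this can be produced with a loop over candidate models. The sketch below uses synthetic data from `make_regression` instead of the laptop dataset and default settings for each model, so its numbers will not match the table.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the preprocessed laptop features and target.
X, y = make_regression(n_samples=300, n_features=8, noise=10.0, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

models = {
    "Linear Regression": LinearRegression(),
    "Random Forest Regression": RandomForestRegressor(n_estimators=50, random_state=42),
    "Gradient Boosting Regressor": GradientBoostingRegressor(random_state=42),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    predictions = model.predict(X_test)
    results[name] = {
        "R2": r2_score(y_test, predictions),
        "MAE": mean_absolute_error(y_test, predictions),
    }

for name, scores in results.items():
    print(f"{name:30s} R2={scores['R2']:.3f}  MAE={scores['MAE']:.3f}")
```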
Random Forest Regressor - Hyperparameter Tuning
Random_search
Grid_search
RANDOM FOREST REGRESSION with GRID SEARCH BEST PARAMETERS
RANDOM FOREST REGRESSION with RANDOM SEARCH BEST PARAMETERS
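Both tuning approaches can be sketched with scikit-learn's `GridSearchCV` and `RandomizedSearchCV`. The parameter grid, the scoring choice, and the synthetic data are assumptions; the slides do not list the values that were searched.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV, RandomizedSearchCV

# Synthetic stand-in for the preprocessed training data.
X, y = make_regression(n_samples=200, n_features=8, noise=10.0, random_state=0)

# Hypothetical search space, kept small so the example runs quickly.
param_grid = {
    "n_estimators": [20, 50],
    "max_depth": [4, 8, None],
    "min_samples_split": [2, 5],
}

# Grid search tries every combination (2 * 3 * 2 = 12 candidates).
grid_search = GridSearchCV(
    RandomForestRegressor(random_state=0),
    param_grid,
    cv=3,
    scoring="neg_mean_absolute_error",
)
grid_search.fit(X, y)

# Randomized search samples only n_iter candidates from the same space.
random_search = RandomizedSearchCV(
    RandomForestRegressor(random_state=0),
    param_distributions=param_grid,
    n_iter=5,
    cv=3,
    scoring="neg_mean_absolute_error",
    random_state=0,
)
random_search.fit(X, y)

print("Grid search best parameters:  ", grid_search.best_params_)
print("Random search best parameters:", random_search.best_params_)
```

Grid search is exhaustive and therefore slower, while randomized search trades some coverage for speed, which matters more as the search space grows.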
Questionnaires
Does the brand of the laptop significantly influence its price?
Conclusion: Yes, the brand of the laptop significantly influences its price.
Model prediction for laptops from lesser-known brands: The model has high accuracy. The accuracy of the lesser-known brands model is 94.0%.
Mean Absolute Error (MAE): 0.152989
Mean Squared Error (MSE): 0.0476263
Root Mean Squared Error (RMSE): 0.21823452
How does the model perform on laptops with high-end specifications compared to budget laptops?
Metrics     High-End Laptops Model     Budget Laptops Model
Accuracy    71.0 %                     81.0 %
MAE         0.16589435402600275        0.14780626075787437
MSE         0.047716541653714244       0.04162336390317959
RMSE        0.21844116291055182        0.20401804798394574
Conclusion: From the above metrics, the budget laptops model performs better than the high-end specifications model.
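The per-segment metrics above can be reproduced in spirit with a small helper. The slides do not say how "accuracy" was computed or how high-end and budget laptops were separated; the sketch assumes R² for the former and a hypothetical price threshold for the latter.

```python
import math


def regression_report(y_true, y_pred):
    """Return R2, MAE, MSE and RMSE for paired actual and predicted values."""
    n = len(y_true)
    errors = [actual - predicted for actual, predicted in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    mean_true = sum(y_true) / n
    ss_tot = sum((actual - mean_true) ** 2 for actual in y_true)
    r2 = 1 - (mse * n) / ss_tot
    return {"R2": r2, "MAE": mae, "MSE": mse, "RMSE": math.sqrt(mse)}


def report_by_segment(y_true, y_pred, threshold):
    """Evaluate separately above and below a hypothetical price threshold."""
    pairs = list(zip(y_true, y_pred))
    high = [(a, p) for a, p in pairs if a >= threshold]
    budget = [(a, p) for a, p in pairs if a < threshold]
    return {
        "high_end": regression_report(*zip(*high)),
        "budget": regression_report(*zip(*budget)),
    }


# Invented values, only to show the call pattern.
actual = [10.0, 10.4, 10.8, 11.5, 11.9, 12.3]
predicted = [10.1, 10.3, 10.9, 11.2, 12.0, 12.1]
print(report_by_segment(actual, predicted, threshold=11.0))
```

RMSE is the square root of MSE, which is consistent with the reported values (for example, the square root of 0.0476263 is about 0.2182).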
Implement a mechanism for the model to make predictions for new laptops entering the market.
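One possible mechanism, sketched under assumptions, is to wrap preprocessing and the regressor in a single scikit-learn `Pipeline` so that a new laptop's raw specification can be passed straight to `predict`. The training rows, the reduced feature set, and the `predict_price` helper are all hypothetical; `handle_unknown="ignore"` lets the encoder accept brands it has never seen.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestRegressor
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

# Invented training rows with a reduced set of features.
train = pd.DataFrame({
    "Company": ["Dell", "HP", "Apple", "Acer", "Lenovo", "Asus"],
    "TypeName": ["Notebook", "Notebook", "Ultrabook", "Netbook", "Gaming", "Gaming"],
    "Ram": [8, 4, 16, 4, 16, 32],
    "ppi": [141.2, 100.5, 226.9, 111.9, 141.2, 282.4],
    "Price": [45000, 28000, 110000, 22000, 85000, 160000],
})
feature_cols = ["Company", "TypeName", "Ram", "ppi"]

model = Pipeline([
    ("prep", ColumnTransformer(
        [("nominal", OneHotEncoder(handle_unknown="ignore"), ["Company", "TypeName"])],
        remainder="passthrough",
    )),
    ("forest", RandomForestRegressor(n_estimators=20, random_state=0)),
])
model.fit(train[feature_cols], train["Price"])


def predict_price(fitted_model, spec):
    """Predict the price for one laptop described by a dict of raw features."""
    row = pd.DataFrame([spec], columns=feature_cols)
    return float(fitted_model.predict(row)[0])


# A brand absent from training is encoded as all zeros instead of raising.
new_laptop = {"Company": "NewBrand", "TypeName": "Ultrabook", "Ram": 16, "ppi": 220.0}
estimate = predict_price(model, new_laptop)
print(round(estimate))
```

Because prices drift over time, such a pipeline would need periodic retraining on fresh data to stay useful.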
INSIGHTS
Price is right-skewed, and most laptop prices range between 20,000 and 70,000.
RAM, brand, laptop type, display type (IPS), SSD, and resolution (ppi) are the features with the greatest impact on laptop prices.
The brand of a laptop significantly impacts its price, as observed in the box plot.
Limitations and Challenges:
Price Volatility: Laptop prices can change frequently due to market dynamics, new releases, and technological advancements, making it difficult for the model to stay accurate over time.
Encoding categorical variables (e.g., brand, CPU type) can be challenging and may require domain-specific knowledge to handle properly.
Hyperparameter Tuning: Selecting and tuning the hyperparameters of the model is often challenging and time-consuming.
Because there are many brands, each with many categories in every specification, some features have many extreme values that cannot be treated, and these affect the price prediction.
Data Size: Insufficient data can limit the model's ability to learn patterns, leading to poor performance.
Data Bias: If the dataset is biased (e.g., over-representation of certain brands or specifications), the model may not generalize well to new data.
Some features, specifications, or laptop configurations cannot be handled by the model.