Predict Your Profits: Optimizing Ad Campaigns with Data-Driven Insights

jadavvineet73 135 views 15 slides Jul 09, 2024
Slide 1
Slide 1 of 15
Slide 1
1
Slide 2
2
Slide 3
3
Slide 4
4
Slide 5
5
Slide 6
6
Slide 7
7
Slide 8
8
Slide 9
9
Slide 10
10
Slide 11
11
Slide 12
12
Slide 13
13
Slide 14
14
Slide 15
15

About This Presentation

Predict Your Way to Marketing Success: A Data Science Approach to Optimizing Ad Campaign Performance for more details visit: https://bostoninstituteofanalytics.org/data-science-and-artificial-intelligence/


Slide Content

Prediction Product AD Campaign Performance Presented by :Gravan Dmello

Introduction An advertising campaign is a set of advertisements that work together to promote a product or service. An ad campaign is designed around a specific and unique theme to create brand awareness about the company’s product or service. An advertising campaign can be a series of different individual ads or the same ad across mediums used to create awareness and interest in a product or service. This is achieved through different forms of media, including radio, television, print advertising, direct mail, or the internet. The intended objective of this project is to develop a robust supervised machine learning model designed to accurately forecasts key performance indicators (KPIs) for future product ad campaign. By achieving this goal, we aim to provide them with valuable insights that can inform strategic decision-making, optimize resource allocation, and enhance overall marketing effectiveness.

Dataset Information Here are the key details about the dataset used in this project: Our has 731 entries and 11 columns.  The columns  include  There are three columns with float data types and eight with integer data types.  The dataset has 5 categorical variables namely limit_infor , campaign_type , campaign_level , product_level , resource_amount  whose values have been represented by single digit numbers. Here, the target variable is " orders " limit_infor   limits or restrictions associated with the marketing campaign or product.(0,1) campaign_type   type of marketing campaign, such as email, social media, print advertising,etc (0,1,2,3,4,5,6) campaign_level   level or scale of the marketing campaign, for example, national, regional, or local.(0,1) product_level   level or category of the product being marketed, such as high-end, mid-range, or budget.(1,2,3) resource_amount   resources (e.g., budget, personnel, or materials) allocated for the marketing campaign.(1,2,3,4,5,6,7,8,9) email_rate   email delivery rate or open rate.  price selling price of the product. discount_rate   discounts or promotional offers associated with the product. hour_resources the number of labor hours or human resources dedicated to the marketing campaign or product sales efforts. campaign_fee fees or costs associated with running the marketing campaign. orders number of orders or sales generated for the product during the marketing campaign.

Exploratory Data Analysis (EDA) EDA is used to  provides a provides a better understanding of data set variables and the relationships between them. The dataset had no duplicates and 2 missing values in "price" which was 0.27% of total data. Hence, the rows with missing values were dropped. While observing the relationship of numeric values with "orders", the campaign_fee had one outlier which was removed for a cleaner data. 

Exploratory Data Analysis (EDA) Correlation coefficient revealed that ' discount_rate ' (0.232),  email_rate ' (0.627), ' hour_resouces ' (0.663), and ' campaign_fee ’ (0.762) have positive correlations with 'orders'.  'price' (-0.102) has a weak negative correlation with 'orders'. To ensure consistent scales for numerical features,  MinMax Scaler  was employed  during preprocessing.

Visualizations The independent variables " campaign_fee ", " hour_resources " and " email_rate " have linear relationship with  target variable "orders" .

Visualizations There is a non-linear relationship between price and the number of orders ' The ' discount_rate ' have positive relationship with 'orders'.

In this step, we divided the dataset into two parts: X and y. X contains all the independent variables, which are the features used to make predictions. Meanwhile, y represents the dependent variable or target variable, which is the outcome we want to predict. The dataset was split into training and testing sets. An 80:20 ratio was used, with 80% of the data allocated to training and 20% to testing, and the test size set to 0.2. A random state of 42 was specified to ensure the reproducibility of results across different runs Splitting the data into X and y Train-Test Split

Model Selection The Prediction Product AD Campaign Performance is a regression problem. Hence following models were used: Linear Regression is best for simple, linear relationships and offers high interpretability. Support vector machine is versatile for both linear and non-linear relationships but can be computationally expensive. Random Forest is powerful for complex, non-linear relationships and provides robust performance but is less interpretable and more computationally intensive.

Predictions: Linear Regression Support Vector Machine RMSE on Train Score: 0.04030 RMSE on Train Score: 0.04123 RMSE on Test Score: 0.04055 RMSE on Test Score: 0.04047 Difference between RMSE on train and test set 0.00025 Difference between RMSE on train and test set 0.00076 Observation: Linear regression have shown slightly better results than SVM  

Predictions: Random Forest Regression Before Tuning  After Tuning  RMSE on Train Score: 0.0211 RMSE on Train Score: 0.0312 RMSE on Test Score: 0.0445 RMSE on Test Score: 0.0455 Difference between RMSE on train and test set 0.0233 Difference between RMSE on train and test set 0.0233 Observation: There is no significant improvement in RMSE values after hypertuning the values

Feature Importance (Random Forest Regression) ' campaign_fee ' has the highest importance , ' hour_resources ' have moderate importance and price have minimal importance compared to other two. The remaining features ( email_rate , discount_rate , campaign_type , resource_amount , product_level , campaign_level , limit_infor ) have negligible importance in the model.

The Best Performing ML Model

Conclusions The analysis and predictions provide valuable insights that can significantly enhance the ad campaign performance for the company. Campaign Fee and Hour Resources : Increasing campaign fees and allocating more resources positively influence the number of orders, suggesting that investments in these areas are likely to yield higher returns. Pricing Strategy : Higher prices tend to reduce the number of orders. Therefore, maintaining competitive and minimal prices can attract more customers and boost sales. Discount Rates : While higher discount rates can slightly increase the number of orders, their impact is minimal. This indicates that focusing primarily on pricing and resource allocation may be more effective than relying heavily on discounts. Model Performance : Linear regression outperforms SVM and random forest regression due to the linear relationship between features and the target variable. This finding underscores the importance of using a simpler, well-suited model to avoid overfitting and ensure accurate predictions. By leveraging these insights, the company can strategically allocate resources, optimize pricing, and fine-tune their ad campaigns to maximize effectiveness and improve the overall return on investment (ROI).