Analysis and Forecasting of Sparkling Wine Sales
Group 5
Garima Mathur - 2314018
Rahatul Ashafeen - 2314036
Rajeev Ranjan Kumar - 2314038
Sandeep Solanki - 2314046
Vijay Vasha - 2314062
Introduction
Forecasting is critical to management: marketing relies on it to predict demand and future sales, and it supports business planning, inventory optimization, cost reduction, and customer satisfaction.
Objective: Build a model to forecast monthly sales for the next 12 months. We analysed the historical monthly sales data of a company, created multiple forecast models, and recommended the best-performing model for predicting monthly sales over the next 12 months, along with appropriate lower and upper confidence limits.
Description of Dataset
The data are a monthly sales time series running from Jan-1980 to Jul-1995, which gives more than 15 years of history. We converted the time column to a date format and named it Year-Month. The key variables are the timeline (Year-Month) and sales volume. The yearly boxplots show that sales increased until 1987, decreased until 1992, and increased again in 1994 and 1995. We see an increasing trend until about 1988 and a decrease after that, and we can also observe seasonality.
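As a rough illustration of the date conversion described above, the sketch below builds a monthly index for the stated span. The `month_range` helper and the "Jan-1980" label format are assumptions made for illustration; the slides do not show how the raw dates were stored.

```python
from datetime import date

def month_range(start: date, end: date) -> list[date]:
    """Return the first day of every month from start to end, inclusive."""
    months = []
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        months.append(date(year, month, 1))
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return months

# Span taken from the slide: Jan-1980 to Jul-1995.
index = month_range(date(1980, 1, 1), date(1995, 7, 1))

# Assumed label format for the Year-Month column, e.g. "Jan-1980".
labels = [d.strftime("%b-%Y") for d in index]

print(len(index), labels[0], labels[-1])
```

If no months are missing, the index has 187 entries, which is a quick sanity check to run against the row count of the actual data.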
Time Series Analysis
[Figure: time series decomposition; panels: Trend, Seasonality, Noise, Original Data]
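A minimal sketch of how a series can be split into trend, seasonality, and noise, assuming a classical additive decomposition with a centered 12-month moving average. The slides do not say which decomposition method was used, so the function names and the synthetic series here are illustrative only.

```python
import math

def centered_ma(series, period=12):
    """2 x period centered moving average; None where the window does not fit."""
    half = period // 2
    trend = [None] * len(series)
    for t in range(half, len(series) - half):
        window = series[t - half : t + half + 1]
        # End points get half weight so the window covers exactly `period` months.
        total = 0.5 * window[0] + sum(window[1:-1]) + 0.5 * window[-1]
        trend[t] = total / period
    return trend

def additive_decompose(series, period=12):
    """Split series into trend, seasonal, and noise parts (additive form)."""
    trend = centered_ma(series, period)
    # Average the detrended values position by position within the cycle.
    buckets = [[] for _ in range(period)]
    for t, (y, tr) in enumerate(zip(series, trend)):
        if tr is not None:
            buckets[t % period].append(y - tr)
    raw = [sum(b) / len(b) for b in buckets]
    mean_raw = sum(raw) / period
    seasonal_idx = [r - mean_raw for r in raw]  # centered so indices sum to zero
    seasonal = [seasonal_idx[t % period] for t in range(len(series))]
    noise = [None if tr is None else y - tr - s
             for y, tr, s in zip(series, trend, seasonal)]
    return trend, seasonal, noise

# Synthetic illustration only (not the wine data): linear trend plus yearly cycle.
demo = [100 + 0.5 * t + 20 * math.sin(2 * math.pi * t / 12) for t in range(48)]
trend, seasonal, noise = additive_decompose(demo)
```

On real sales data the noise component will not vanish as it does for this synthetic series, and a multiplicative form may suit data whose seasonal swings grow with the level.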
Forecasting
Because the data show strong seasonality and a slight trend, we fitted the following models: Simple Exponential Smoothing (for comparison only), Holt's Exponential Smoothing (for comparison only), Winters' Method, and Seasonal ARIMA.
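To see why simple exponential smoothing serves only as a comparison baseline for seasonal data, the sketch below shows that its multi-step forecast is a flat line. The initialisation at the first observation and the `alpha` value are assumptions for illustration, not the settings used in the analysis.

```python
def ses_forecast(series, alpha, horizon):
    """Simple exponential smoothing; returns the flat h-step-ahead forecasts."""
    level = series[0]  # assumption: start the level at the first observation
    for y in series[1:]:
        level = alpha * y + (1 - alpha) * level
    # With no trend or seasonal term, every future step gets the same value.
    return [level] * horizon

# Synthetic seasonal pattern (illustration only, not the wine data).
demo = [10, 12, 30, 11, 13, 31, 12, 14, 32]
forecast = ses_forecast(demo, alpha=0.3, horizon=3)
print(forecast)
```

Holt's method adds a trend term to this recursion, and Winters' method adds a seasonal term on top of that, which is what lets it follow the repeating yearly peaks.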
Stationarity Check
The plots show that the original data are not stationary: the rolling mean and rolling standard deviation both vary over time, and there appears to be seasonality in the data.
Augmented Dickey-Fuller test: we ran the Augmented Dickey-Fuller (ADF) test on the time series, whose null hypothesis is that the series is non-stationary. The test statistic is well above the critical values (and the p-value exceeds 0.05), so at the 5% significance level we fail to reject the null hypothesis and treat the series as non-stationary.
After first-order differencing, the series is stationary at the given alpha.
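The rolling-statistics check and the effect of first-order differencing can be sketched as follows. This does not implement the ADF test itself (a library routine such as `adfuller` in statsmodels would be the usual choice); the 12-point window and the synthetic series are assumptions for illustration.

```python
from statistics import mean, stdev

def rolling(series, window, func):
    """Apply func to each trailing window of the given length."""
    return [func(series[i - window + 1 : i + 1])
            for i in range(window - 1, len(series))]

def difference(series, lag=1):
    """Return y[t] - y[t - lag] for every t where both values exist."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]

# Synthetic trending series (illustration only, not the wine data).
demo = [float(t) + (3.0 if t % 2 else -3.0) for t in range(60)]

roll_mean = rolling(demo, 12, mean)    # drifts upward: a sign of non-stationarity
roll_std = rolling(demo, 12, stdev)

diffed = difference(demo)              # first-order differencing
diff_mean = rolling(diffed, 12, mean)  # stays flat once the trend is removed

print(roll_mean[0], roll_mean[-1], diff_mean[0], diff_mean[-1])
```

A rolling mean that drifts while the differenced series keeps a constant rolling mean is the visual counterpart of the ADF result reported on this slide.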
Winters' Model
Winters' Model Insights
MAPE: 12.85%, meaning the average absolute forecast error is 12.85% of the actual values. The corresponding accuracy is approximately 100% − 12.85% = 87.15%.
R² and adjusted R²: the R² of 0.914 and the adjusted R² of 0.913 indicate that the model explains a high proportion of the variance in the dependent variable.
Level smoothing weight (0.049): the p-value is 0.0866, so it is not statistically significant at the 0.05 level, though it is relatively close. The level weight may have some impact, but the evidence is not strong.
Trend smoothing weight (0.00000106): the p-value is 0.9557, so it is not statistically significant. The trend component contributes little to the model.
Seasonal smoothing weight (0.45): the p-value is below 0.0001, so it is highly statistically significant. The seasonal component plays a major role in the model.
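To show where the three smoothing weights enter, here is a minimal sketch of the additive Holt-Winters recursion. The initialisation from the first two cycles and the exact form of the seasonal update are assumptions; statistical software typically optimises the starting values and may parameterise the updates differently, so this will not reproduce the reported fit.

```python
def holt_winters_additive(series, period, alpha, beta, gamma, horizon):
    """Additive Holt-Winters recursion with a simple two-cycle initialisation."""
    # Assumption: starting values come from the first two full cycles.
    first = series[:period]
    second = series[period:2 * period]
    level = sum(first) / period
    trend = (sum(second) - sum(first)) / period ** 2
    season = [y - level for y in first]

    for t in range(period, len(series)):
        y = series[t]
        prev_level = level
        s_old = season[t % period]
        # alpha: how fast the level reacts to the deseasonalised observation.
        level = alpha * (y - s_old) + (1 - alpha) * (prev_level + trend)
        # beta: how fast the trend reacts to the latest change in level.
        trend = beta * (level - prev_level) + (1 - beta) * trend
        # gamma: how fast the seasonal index reacts to the latest deviation.
        season[t % period] = gamma * (y - level) + (1 - gamma) * s_old

    n = len(series)
    return [level + h * trend + season[(n + h - 1) % period]
            for h in range(1, horizon + 1)]

# Synthetic repeating pattern (illustration only, not the wine data),
# run with the smoothing weights reported on this slide.
pattern = [10.0, 20.0, 30.0, 20.0]
demo = pattern * 4
forecast = holt_winters_additive(demo, period=4, alpha=0.049,
                                 beta=0.00000106, gamma=0.45, horizon=4)
print(forecast)
```

With a trend weight this close to zero, the trend term is effectively frozen at its starting value, which matches the slide's reading that the trend component contributes little.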
First Order and Seasonal Differencing
The high standard deviation shows that the data points are spread widely around the mean. The ADF test statistics are all negative; a statistic that is more negative than its critical value is evidence for stationarity.
Single Mean ADF: tests for stationarity around a non-zero mean with no trend.
Trend ADF: tests for stationarity around a linear trend (increasing or decreasing) with an intercept.
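The two differencing steps can be sketched in a few lines. The `difference` helper and the synthetic series are illustrative assumptions; the point is that a lag-12 difference removes a fixed yearly pattern and a lag-1 difference removes what remains of a linear trend.

```python
def difference(series, lag=1):
    """Return y[t] - y[t - lag] for every t where both values exist."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]

# Synthetic series (illustration only, not the wine data):
# a linear trend plus a fixed 12-month pattern with a year-end peak.
pattern = [5, 3, 4, 6, 8, 7, 9, 10, 12, 15, 20, 30]
demo = [2 * t + pattern[t % 12] for t in range(48)]

seasonal_diff = difference(demo, lag=12)  # removes the repeating yearly pattern
both = difference(seasonal_diff, lag=1)   # then removes the remaining trend

print(seasonal_diff[:3], both[:3], len(both))
```

Each differencing step shortens the series: applying both costs 13 observations, which matters when comparing fit statistics across models that use different differencing.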
Seasonal ARIMA (0,0,0) (1,1,1) 12 Model
Seasonal ARIMA (0,0,0) (1,1,1) 12 Model
Goodness of Fit:
R-Square: approximately 91.6% of the variance in the data is explained by the model, suggesting a very good fit.
R-Square Adjusted: 91.4%, slightly lower than R-Square because it accounts for the number of predictors, but still a very good fit.
MAPE (Mean Absolute Percentage Error): an average forecast error of about 12.4%, indicating reasonably accurate predictions.
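For reference, the fit statistics quoted on these slides can be computed as sketched below. The textbook adjusted R² formula used here is an assumption; the software behind the reported numbers may use different degrees of freedom. The sample values are made up for illustration.

```python
def fit_metrics(actual, predicted, n_params):
    """Return R-square, adjusted R-square, MAPE (in percent), and MAE."""
    n = len(actual)
    mean_actual = sum(actual) / n
    sse = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    sst = sum((a - mean_actual) ** 2 for a in actual)
    r2 = 1 - sse / sst
    # Assumption: textbook adjustment with n_params predictors plus an intercept.
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - n_params - 1)
    mape = 100 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / n
    mae = sum(abs(a - p) for a, p in zip(actual, predicted)) / n
    return {"r2": r2, "adj_r2": adj_r2, "mape": mape, "mae": mae}

# Made-up values for illustration only (not the wine data).
actual = [100.0, 200.0, 300.0, 400.0]
predicted = [110.0, 190.0, 310.0, 390.0]
metrics = fit_metrics(actual, predicted, n_params=1)
print(metrics)
```

Because R² is defined as 1 − SSE/SST, it turns negative whenever a model's squared errors exceed those of simply predicting the mean, which is how a forecasting model can end up with a negative R-Square.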
Seasonal ARIMA (0,0,0) (1,1,1) 12 Model
Parameter Estimates:
AR2,12 (seasonal autoregressive term at lag 12): not statistically significant (p-value > 0.05), indicating that the seasonal autoregressive term does not contribute significantly to the model.
MA2,12 (seasonal moving average term at lag 12): highly significant (p-value < 0.05), indicating that the seasonal moving average term is an important component of the model.
Intercept: not statistically significant (p-value > 0.05), suggesting it does not contribute much to the model. This is expected in a differenced series, where the mean is close to zero.
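Written out, a seasonal ARIMA (0,0,0)(1,1,1)12 model has the form below. The symbols are assumptions introduced for illustration: B is the backshift operator, Φ the seasonal AR coefficient, Θ the seasonal MA coefficient, c the intercept, and the minus sign on the MA term follows the Box-Jenkins convention, which not all software shares.

```latex
% Seasonal ARIMA (0,0,0)(1,1,1)_{12} in backshift notation
(1 - \Phi B^{12})\,(1 - B^{12})\, y_t = c + (1 - \Theta B^{12})\, \varepsilon_t

% Equivalent form in terms of the seasonally differenced series
w_t = y_t - y_{t-12}, \qquad
w_t = c + \Phi\, w_{t-12} + \varepsilon_t - \Theta\, \varepsilon_{t-12}
```

The model therefore has a single seasonal AR term and a single seasonal MA term, both acting at lag 12, plus the intercept.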
Seasonal ARIMA (0,0,0) (1,1,1) 12 Model
Conclusion: The Seasonal ARIMA (0,0,0)(1,1,1)12 model fits the data very well, explaining over 91% of the variance. Its forecasts are reasonably accurate, with acceptable MAPE and MAE values.
Seasonal ARIMA (0,1,0) (1,1,1) 12 Model
Seasonal ARIMA (0,1,0) (1,1,1) 12 Model
Goodness of Fit:
R-Square: approximately 85.46% of the variance in the dependent variable is explained by the model, which suggests a strong fit.
R-Square Adjusted: 85.2%, which adjusts for the number of predictors in the model and is very close to the R-Square, reinforcing the model's goodness of fit.
MAPE: the model's predictions are off by about 16.79% from the actual values on average. This is a moderate level of error and could be acceptable depending on the context.
Seasonal ARIMA (0,1,0) (1,1,1) 12 Model
Parameter Estimates:
AR(2,12) (seasonal autoregressive term at lag 12): not statistically significant (p > 0.05), indicating that this autoregressive component may not be contributing meaningfully to the model.
MA(2,12) (seasonal moving average term at lag 12): statistically significant (p < 0.0001), suggesting that this moving average component is an important part of the model.
Intercept: the intercept term provides the baseline level for the model. Since no standard error or t-value is provided, its significance cannot be directly assessed here.
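For comparison with the previous model, the (0,1,0)(1,1,1)12 specification adds a first-order difference on top of the seasonal one. As a sketch in backshift notation, where B, Φ, Θ, and c are illustrative symbols and the MA sign follows the Box-Jenkins convention (an assumption about the software's parameterisation):

```latex
% Seasonal ARIMA (0,1,0)(1,1,1)_{12} in backshift notation
(1 - \Phi B^{12})\,(1 - B)\,(1 - B^{12})\, y_t = c + (1 - \Theta B^{12})\, \varepsilon_t

% The combined differencing operator expands to
(1 - B)(1 - B^{12})\, y_t = y_t - y_{t-1} - y_{t-12} + y_{t-13}
```

The AR and MA terms are the same as before; only the differenced series they act on changes.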
Seasonal ARIMA (0,1,0) (1,1,1) 12 Model
Conclusion: Overall, the Seasonal ARIMA (0,1,0)(1,1,1)12 model appears to be a strong model with a significant MA component, though it could be refined by reconsidering the non-significant AR term.
State Space Smoothing Models (MMdM12)
State Space Smoothing Models
Model Accuracy:
RSquare (coefficient of determination): 0.971788, so approximately 97.18% of the variance in the dependent variable is explained by the model, which suggests a very good fit.
RSquare Adj (adjusted R-Square): 0.969132, which adjusts for the number of parameters in the model. It is slightly lower than the R-Square but still indicates a very good fit.
MAPE (Mean Absolute Percentage Error): 6.713497, the average percentage error between the predicted and actual values. A lower MAPE indicates better model performance.
Summary: The model is relatively complex, with 18 parameters, but the AIC and BIC values are reasonable, suggesting a balance between fit and complexity. Accuracy is high, as shown by the low sigma (0.099285), MAPE (6.713497), and MAE (151.4044). Overall, the state space smoothing model is well fitted to the data and performs well in terms of predictive accuracy.
Comparison of Forecasting Methods
Summary
Best performing models: Winters' Method (Additive) and, within the ARIMA class, Seasonal ARIMA (0,0,0)(1,1,1)12 are the preferred models, with high RSquare values, low AIC and SBC, and low MAPE and MAE.
State space smoothing models: MMdM12 and MAdM12 show high accuracy and an excellent fit (high RSquare) but are not preferred because of their higher AIC and SBC values.
Poorly performing models: Linear (Holt) Exponential Smoothing and Simple Exponential Smoothing (Zero to One) have negative RSquare values, high AIC and SBC, and very high MAPE and MAE, making them unsuitable for accurate forecasting.
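One reason a well-fitting but heavily parameterised model can lose on AIC and SBC is the complexity penalty. The sketch below isolates that penalty using the standard definitions. The parameter counts (3 smoothing weights for Winters' method, 18 for the state space model) come from the slides, while the shared log-likelihood value is purely hypothetical and the observation count assumes no missing months.

```python
import math

def aic(log_likelihood, k):
    """Akaike information criterion for k estimated parameters."""
    return 2 * k - 2 * log_likelihood

def bic(log_likelihood, k, n):
    """Schwarz Bayesian criterion (SBC/BIC) for k parameters, n observations."""
    return k * math.log(n) - 2 * log_likelihood

n_obs = 187  # Jan-1980 to Jul-1995 inclusive, assuming no missing months

# Hypothetical: give both models the same log-likelihood to isolate the penalty.
ll = -1000.0
penalty_gap_aic = aic(ll, 18) - aic(ll, 3)
penalty_gap_bic = bic(ll, 18, n_obs) - bic(ll, 3, n_obs)

print(penalty_gap_aic, round(penalty_gap_bic, 2))
```

Under these assumptions the 18-parameter model starts 30 points behind on AIC and about 78 points behind on SBC, so it needs a clearly higher likelihood to win. Criteria computed on differently differenced or transformed series are not directly comparable, which is worth keeping in mind when ranking models from different classes.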
Recommendation
Based on the time series analysis, the company can expect sales close to last year's, with a slight increase. It should stock wine according to the monthly pattern: demand is high from October to December, and December has the highest sales. It can run more customer-centric advertising during the festive season in December and January to increase sales. It should also look for ways to grow sales overall, because the analysis shows that sales have been almost constant since 1987, with no considerable growth.