Time Series Forecasting Using TBATS Model.pptx

GowthamKumar470818 701 views 20 slides Jan 20, 2023

TIME SERIES FORECASTING USING TBATS MODEL
Team Members:
D22017 – P Gowtham Kumar
D22041 – R Sai Sasi Sekhar
D22049 – B Vinay Kumar
Course: TF

Introduction
TBATS is a forecasting method for time series data. The name is an acronym for the key features of the model: Trigonometric seasonality, Box-Cox transformation, ARMA errors, Trend, and Seasonal components. Its main aim is to forecast time series with complex seasonal patterns using exponential smoothing. A single series can contain several types of seasonality at once (e.g., time of day, daily, weekly, monthly, yearly).

ADVANTAGES OF TBATS
Many time series exhibit complex and multiple seasonal patterns (e.g., hourly data that contains a daily pattern, a weekly pattern, and an annual pattern). The most popular models (e.g., ARIMA and exponential smoothing) can only account for one seasonality. The TBATS model can handle complex seasonalities (e.g., non-integer, non-nested, and large-period seasonality) with no seasonality constraints, making it possible to create detailed, long-term forecasts.

How to Forecast Time Series With Multiple Seasonalities
We often encounter seasonality in time series. Seasonality is the periodic variation in a series: a cycle that repeats over a fixed period. In the air passengers series, we can clearly see a seasonal cycle: every year, the number of air passengers peaks around July and then falls again. To forecast this series, we can simply use a SARIMA model, since there is only one seasonal period, with a length of one year (12 monthly observations).
Figure: Monthly total number of air passengers for an airline, from January 1949 to December 1960. There is a clear seasonal pattern in the series, with more people travelling during June, July, and August.

Things get complicated when we work with high-frequency data. For example, an hourly time series can exhibit daily, weekly, monthly, and yearly seasonality, meaning that we now have multiple seasonal periods. Consider the hourly traffic volume on Interstate 94.
Figure: Hourly traffic volume, westbound, on Interstate 94 in Minneapolis, Minnesota. We can see both a daily seasonality (more cars are on the road during the day than during the night) and a weekly seasonality (more cars are on the road Monday to Friday than during the weekend).

Looking at the traffic data, we can see that we have two seasonal periods. First, we have a daily seasonality, as more cars travel on the road during the day than during the night. Second, we have a weekly seasonality, as traffic volume is higher on weekdays than during the weekend. In this case, a SARIMA model cannot be used, because it allows only one seasonal period, whereas our data clearly has two: a daily seasonality and a weekly seasonality. Using BATS and TBATS models, we can fit and forecast time series that have more than one seasonal period.
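As a rough illustration of the two periods described above, the sketch below builds a synthetic hourly series (not the Interstate 94 data) with a daily cycle and a weekday/weekend effect, then checks the autocorrelation at lags of 12, 24, and 168 hours. The function names and the shape of the synthetic series are assumptions made for this example only.

```python
import math

def synthetic_traffic(n_hours):
    """Hypothetical hourly series: a daily cycle plus a weekday/weekend effect."""
    series = []
    for t in range(n_hours):
        hour = t % 24
        day = (t // 24) % 7
        daily = 100 * math.sin(2 * math.pi * (hour - 6) / 24)  # peaks around midday
        weekly = 50 if day < 5 else -50                        # weekdays busier than weekends
        series.append(300 + daily + weekly)
    return series

def autocorr(x, lag):
    """Sample autocorrelation of x at the given lag."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x)
    cov = sum((x[t] - mean) * (x[t - lag] - mean) for t in range(lag, n))
    return cov / var

y = synthetic_traffic(24 * 7 * 8)  # eight weeks of hourly values
acf_12, acf_24, acf_168 = autocorr(y, 12), autocorr(y, 24), autocorr(y, 168)
print(acf_12, acf_24, acf_168)
```

On a series like this, the autocorrelation is negative at half a day and strongly positive at one day and at one week, which is the signature of two overlapping seasonal periods.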

BATS Model
Exponential smoothing is a family of forecasting methods. The general idea is that future values are a weighted average of past values, with the weights decaying exponentially as we go back in time. The family includes simple (SES), double (DES), and triple (TES) exponential smoothing.
State-space modelling is a framework in which a time series is seen as a set of observed data influenced by a set of unobserved factors. The state-space model expresses the relationship between the two sets. It is a general framework: an ARMA model, for example, can also be expressed as a state-space model.
The Box-Cox transformation is a power transformation that helps stabilize the variance of the series over time. It does not by itself remove a changing mean; that is the role of the trend component.
ARMA errors means that we apply an ARMA model to the residuals of the time series in order to find any unexplained relationship. Usually, the residuals of a model should be totally random, unless some information was not captured by the model. Here, the ARMA model captures any information remaining in the residuals.
Trend is the component of a time series that explains the long-term change in the mean value of the series. With a positive trend, the series increases over time. With a negative trend, the series decreases over time.
The seasonal component explains the periodic variation in the series.
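Two of these building blocks are simple enough to sketch directly. The example below is a minimal illustration, not the BATS implementation: it shows the exponentially decaying weights behind simple exponential smoothing and the Box-Cox power transform. Initialising the level with the first observation is an assumption made to keep the sketch short.

```python
import math

def smoothing_weights(alpha, n):
    """Weights that simple exponential smoothing puts on the n most recent values."""
    return [alpha * (1 - alpha) ** k for k in range(n)]

def simple_exponential_smoothing(series, alpha):
    """Return the one-step-ahead forecast after processing the whole series."""
    level = series[0]  # assumption: start the level at the first observation
    for y in series[1:]:
        level = alpha * y + (1 - alpha) * level
    return level

def box_cox(y, lam):
    """Box-Cox power transform of a positive value y with parameter lam."""
    if lam == 0:
        return math.log(y)
    return (y ** lam - 1) / lam

weights = smoothing_weights(alpha=0.5, n=4)  # most recent value first
forecast = simple_exponential_smoothing([10, 12, 11, 13], alpha=0.5)
print(weights, forecast, box_cox(4, 0.5))
```

With `lam = 1` the transform only shifts the data, and with `lam = 0` it is the natural logarithm, so the parameter controls how strongly large values are compressed.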

To summarize, BATS is an extension of exponential smoothing methods that adds a Box-Cox transformation to handle non-linear data and an ARMA model to capture autocorrelation in the residuals. The advantages of BATS are that it can treat non-linear data, address autocorrelation in the residuals through the ARMA model, and take multiple seasonal periods into account. However, the seasonal periods must be integers; otherwise BATS cannot be applied. For example, if you have weekly data with a yearly seasonality, the period is 365.25/7, which is approximately 52.2, so BATS is ruled out. Furthermore, BATS can take a long time to fit if the seasonal period is very large, so it is not well suited to, say, hourly data with a monthly seasonality (the period would be about 730). The TBATS model was developed to address these situations.
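The period arithmetic above can be checked in a few lines. This is only a sketch: treating a year as 365.25 days and a month as one twelfth of a year are assumptions, and the helper name `is_integer_period` is made up for this example.

```python
DAYS_PER_YEAR = 365.25  # assumption: average year length including leap years

# Weekly observations with a yearly cycle: observations per cycle = days per year / 7.
weekly_data_yearly_period = DAYS_PER_YEAR / 7

# Hourly observations with a monthly cycle, taking a month as 1/12 of a year.
hourly_data_monthly_period = DAYS_PER_YEAR * 24 / 12

def is_integer_period(period):
    """True if the seasonal period is a whole number of observations."""
    return float(period).is_integer()

print(round(weekly_data_yearly_period, 1), hourly_data_monthly_period)
print(is_integer_period(weekly_data_yearly_period), is_integer_period(168))
```

The weekly-data case gives a non-integer period, which is exactly the situation in which BATS cannot be used, while the daily (24) and weekly (168) periods of hourly data are integers.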

TBATS
TBATS uses the same components as the BATS model, but it represents each seasonal period with a trigonometric representation based on Fourier series. This allows the model to fit large seasonal periods and non-integer seasonal periods. It is thus a better choice when dealing with high-frequency data, and it usually fits faster than BATS. We apply both models to forecast the next seven days of hourly traffic volume.
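To see why a trigonometric representation copes with non-integer periods, the sketch below rotates a single harmonic state by the angle 2πj/m at every time step. It deliberately leaves out the smoothing terms, so it is a deterministic illustration of the idea rather than the TBATS recursion itself; the function names are assumptions made for this example.

```python
import math

def rotate_harmonic(s, s_star, period, j=1):
    """Advance one harmonic by one time step (rotation by 2*pi*j/period)."""
    lam = 2 * math.pi * j / period
    new_s = s * math.cos(lam) + s_star * math.sin(lam)
    new_s_star = -s * math.sin(lam) + s_star * math.cos(lam)
    return new_s, new_s_star

def seasonal_path(period, steps, j=1):
    """Values of the harmonic over `steps` time steps, starting from (1, 0)."""
    s, s_star = 1.0, 0.0
    path = []
    for _ in range(steps):
        s, s_star = rotate_harmonic(s, s_star, period, j)
        path.append(s)
    return path

daily = seasonal_path(period=24, steps=48)                # integer period
yearly_in_weeks = seasonal_path(period=365.25 / 7, steps=104)  # non-integer period
print(daily[11], daily[23], yearly_in_weeks[:3])
```

Nothing in the rotation requires the period to be an integer, and each harmonic needs only two state values however long the period is, which is consistent with TBATS handling large and non-integer periods.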

The starting point is the double-seasonal Holt–Winters (DSHW) exponential smoothing model with additive trend and additive seasonality (Eqs. 7 to 11). This model, developed by Taylor (2003), is an extension of Holt–Winters exponential smoothing.
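The slide's equations were images and are not part of this transcript. As a hedged reference, the additive DSHW model is commonly written in the following innovations form, as used by De Livera et al. (2011); the numbering and notation on the original slide may differ. Here m_1 and m_2 are the two seasonal periods, d_t is the one-step prediction error, and α, β, γ_1, γ_2 are smoothing parameters.

```latex
\begin{align}
y_t         &= \ell_{t-1} + b_{t-1} + s^{(1)}_{t-m_1} + s^{(2)}_{t-m_2} + d_t \\
\ell_t      &= \ell_{t-1} + b_{t-1} + \alpha\, d_t \\
b_t         &= b_{t-1} + \beta\, d_t \\
s^{(1)}_t   &= s^{(1)}_{t-m_1} + \gamma_1\, d_t \\
s^{(2)}_t   &= s^{(2)}_{t-m_2} + \gamma_2\, d_t
\end{align}
```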

BATS
The BATS model (Box–Cox transformation, ARMA errors, trend, and multiple seasonal patterns) extends double-seasonal Holt–Winters (DSHW). It is expressed by Eqs. 12 to 17 (De Livera 2010).
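The equations themselves did not survive in this transcript. As a hedged reference, BATS is usually stated as follows in De Livera et al. (2011); the slide's numbering may not match line for line. Here ω is the Box–Cox parameter, φ the damping parameter, b the long-run trend, m_1, ..., m_T the seasonal periods, and d_t an ARMA(p, q) process with coefficients written φ_i and θ_i and white noise ε_t.

```latex
\begin{align}
y_t^{(\omega)} &=
  \begin{cases}
    \dfrac{y_t^{\omega} - 1}{\omega}, & \omega \neq 0 \\[6pt]
    \log y_t, & \omega = 0
  \end{cases} \\
y_t^{(\omega)} &= \ell_{t-1} + \phi\, b_{t-1} + \sum_{i=1}^{T} s^{(i)}_{t-m_i} + d_t \\
\ell_t         &= \ell_{t-1} + \phi\, b_{t-1} + \alpha\, d_t \\
b_t            &= (1-\phi)\, b + \phi\, b_{t-1} + \beta\, d_t \\
s^{(i)}_t      &= s^{(i)}_{t-m_i} + \gamma_i\, d_t \\
d_t            &= \sum_{i=1}^{p} \varphi_i\, d_{t-i} + \sum_{i=1}^{q} \theta_i\, \varepsilon_{t-i} + \varepsilon_t
\end{align}
```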

TBATS
The TBATS model adapts the BATS equations (Eqs. 12 to 17) by replacing the seasonal component with a trigonometric formulation (Eqs. 18 to 21) (De Livera et al. 2011).
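The slide's equations are not in this transcript. As a hedged reference, the trigonometric seasonal formulation in De Livera et al. (2011) is usually written as below; the slide's numbering may differ. Here k_i is the number of harmonics used for the i-th seasonal period m_i, and γ_1^{(i)}, γ_2^{(i)} are smoothing parameters.

```latex
\begin{align}
s^{(i)}_t         &= \sum_{j=1}^{k_i} s^{(i)}_{j,t} \\
s^{(i)}_{j,t}     &= s^{(i)}_{j,t-1} \cos\lambda^{(i)}_j + s^{*(i)}_{j,t-1} \sin\lambda^{(i)}_j + \gamma^{(i)}_1 d_t \\
s^{*(i)}_{j,t}    &= -\, s^{(i)}_{j,t-1} \sin\lambda^{(i)}_j + s^{*(i)}_{j,t-1} \cos\lambda^{(i)}_j + \gamma^{(i)}_2 d_t \\
\lambda^{(i)}_j   &= \frac{2\pi j}{m_i}
\end{align}
```

In that formulation the measurement equation also changes slightly: the seasonal term for period i enters as the previous value of s^{(i)} (time t-1) rather than the value m_i steps back.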

We recognize the plot from the beginning of this presentation and notice that the traffic volume is indeed lower during the weekend than on weekdays. We also see a daily seasonality, with traffic being heavier during the day than at night. Therefore, we have two periods: the daily period has a length of 24 hours, and the weekly period has a length of 168 hours. Let's keep that in mind as we move on to modeling.

Modeling
For modeling, we use the sktime package, a framework that brings together many statistical and machine learning methods for time series. It follows a syntax convention similar to scikit-learn, making it easy to use. The first step is to define our target and the forecast horizon. Here, the target is the traffic volume itself. For the forecast horizon, we wish to predict one week of data. Since we have hourly data, we must predict 168 timesteps (7 * 24) into the future.
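The horizon and the train/test split can be sketched without any forecasting library. The series below is a stand-in list rather than the traffic data, and the helper `train_test_split_last` is a hypothetical name; in sktime the same horizon would typically be passed as the relative steps 1 to 168.

```python
HOURS_PER_DAY = 24
DAYS_AHEAD = 7
horizon = DAYS_AHEAD * HOURS_PER_DAY  # 168 hourly steps

def train_test_split_last(series, horizon):
    """Hold out the last `horizon` observations as the test set."""
    if horizon >= len(series):
        raise ValueError("series must be longer than the forecast horizon")
    return series[:-horizon], series[-horizon:]

# Stand-in for the hourly traffic volume (assumption: any list of numbers works here).
series = list(range(1000))
train, test = train_test_split_last(series, horizon)

# Relative forecasting steps: 1 hour ahead up to 168 hours ahead.
forecast_steps = list(range(1, horizon + 1))
print(len(train), len(test), forecast_steps[0], forecast_steps[-1])
```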

Inference
The baseline model outperformed both BATS and TBATS. That's a bit anticlimactic, but let's understand why this happened. It is possible that our dataset is too small. It might also be that the sample we used for testing happens to favor the baseline model. One way to verify this would be to forecast several 168-hour horizons and see whether the baseline still outperforms the rest. It may also be that we were too strict with the models' parameters. Here, we forced both models to use the Box-Cox transformation and to remove the trend component. Had we left those parameters unspecified, the model would have tried both possibilities for each parameter and selected the one with the lowest AIC (Akaike's Information Criterion). While this makes the training process longer, it might also result in better performance from BATS and TBATS. Nevertheless, a key takeaway is that building a baseline model is very important for any forecasting project.
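The idea of checking several 168-hour horizons can be sketched as a rolling-origin evaluation. The slides do not say which baseline was used, so the seasonal naive forecast below (repeat the last full week) is an assumption, as are the function names and the synthetic series; any forecaster with the same call shape could be scored the same way.

```python
def seasonal_naive(train, horizon, period):
    """Hypothetical baseline: repeat the last full seasonal cycle."""
    last_cycle = train[-period:]
    return [last_cycle[h % period] for h in range(horizon)]

def mean_absolute_error(actual, predicted):
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def rolling_origin_scores(series, forecaster, horizon, n_windows):
    """Score a forecaster on the last n_windows consecutive test windows."""
    scores = []
    for w in range(n_windows, 0, -1):
        cutoff = len(series) - w * horizon
        train, test = series[:cutoff], series[cutoff:cutoff + horizon]
        scores.append(mean_absolute_error(test, forecaster(train, horizon)))
    return scores

# Synthetic, perfectly weekly-periodic stand-in for six weeks of hourly data.
series = [t % 168 for t in range(168 * 6)]
baseline = lambda train, horizon: seasonal_naive(train, horizon, period=168)
scores = rolling_origin_scores(series, baseline, horizon=168, n_windows=3)
print(scores)
```

Comparing the average score of each model across the windows gives a less sample-dependent answer than a single test week.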

Conclusion
BATS and TBATS models were used to forecast a time series with more than one seasonal period, a case in which a SARIMA model cannot be used. We applied both models to forecast the hourly traffic volume, but our baseline remained the best-performing model. Nevertheless, we saw that BATS and TBATS can indeed model time series with complex seasonalities.

Potential improvements
- Forecast several 168-hour horizons and check whether the baseline is indeed the best-performing model.
- Use the entire original dataset, which contains much more data than what we worked with.
- Do not specify the parameters use_box_cox, use_trend, and use_damped_trend, and allow the model to make the best selection based on the AIC.
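The last improvement can be sketched as two configurations for sktime's `TBATS` forecaster. The exact values used in the original experiment are not shown on the slides, so `fixed_config` is an assumption reconstructed from the description (Box-Cox forced on, trend removed), and `build_tbats` is a hypothetical helper that returns None when sktime or its tbats backend is not installed.

```python
# Assumed settings matching the description: Box-Cox forced on, trend components off.
fixed_config = {
    "use_box_cox": True,
    "use_trend": False,
    "use_damped_trend": False,
    "sp": [24, 168],  # daily and weekly seasonal periods, in hours
}

# Leaving the three switches unset lets the model try both options and pick by AIC.
auto_config = {"sp": [24, 168]}

def build_tbats(config):
    """Return an sktime TBATS forecaster, or None if the library is unavailable."""
    try:
        from sktime.forecasting.tbats import TBATS
        return TBATS(**config)
    except ImportError:
        return None
```

With sktime installed, `build_tbats(auto_config)` would give a forecaster that can be fitted on the training series and asked for the 168-step horizon; expect the automatic search to take noticeably longer to fit than the fixed configuration.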

Key takeaways
- Always build a baseline model when forecasting.
- BATS and TBATS can be used for modeling time series with complex seasonality.
- BATS works well when the periods are short and integer-valued.
- TBATS usually trains faster than BATS and works with seasonal periods that are not integers.

References
- https://towardsdatascience.com/how-to-forecast-time-series-with-multiple-seasonalities-23c77152347e
- https://www.sktime.org/en/stable/api_reference/auto_generated/sktime.forecasting.tbats.TBATS.html
- https://blog.tenthplanet.in/time-series-forecasting-tbats/
- https://medium.com/intive-developers/forecasting-time-series-with-multiple-seasonalities-using-tbats-in-python-398a00ac0e8a
- https://otexts.com/fpp2/complexseasonality.html

Thanks!