ARIMA - Statistical Analysis for Data Science

sridhark868747 21 views 7 slides May 17, 2024


Time Series

A time series is a sequence of values recorded over successive time intervals, e.g. the daily closing value of a stock exchange index, weekly sales, or the monthly profit of a company. Time series analysis typically assumes that the value at any given point in time is a result of its historical values. The ARIMA technique exploits autocorrelation (the correlation of an observation with its own lags) for forecasting. Mathematically:

V_t = f(V_{t-n}) + e

That is, the value (V) at time t is a function of the value n periods earlier, plus an error (e). The value at time t can depend on one lag or on several lags of different orders.

Example: Suppose Mr. X starts his job in 2010 with a starting salary of $5,000 per month. He is appraised every year, and his salary reaches $20,000 per month in 2014. His salary by year can be treated as a time series, and each year's salary is clearly a function of the previous year's salary (here the function is the appraisal rating).
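The autocorrelation that ARIMA relies on can be checked directly. The following is a minimal sketch in plain Python, assuming a small made-up series; the values and the helper name `autocorrelation` are illustrative and do not come from the slides. It computes the correlation of a series with a copy of itself shifted back by a given lag.

```python
def autocorrelation(values, lag):
    """Sample autocorrelation of `values` at the given lag (lag >= 1)."""
    n = len(values)
    if not 0 < lag < n:
        raise ValueError("lag must be between 1 and len(values) - 1")
    mean = sum(values) / n
    # Denominator: total squared deviation from the mean.
    denom = sum((v - mean) ** 2 for v in values)
    if denom == 0:
        raise ValueError("series is constant, autocorrelation is undefined")
    # Numerator: co-movement of each value with the value `lag` steps earlier.
    numer = sum(
        (values[t] - mean) * (values[t - lag] - mean) for t in range(lag, n)
    )
    return numer / denom


# Hypothetical series with a steady upward trend (illustrative values only).
series = [10, 12, 13, 15, 18, 19, 22, 24, 25, 28]
lag1 = autocorrelation(series, 1)
lag3 = autocorrelation(series, 3)
print(f"lag 1: {lag1:.3f}, lag 3: {lag3:.3f}")
```

For a trending series like this one, the correlation with the immediately preceding value is stronger than the correlation with values further back, which is the kind of dependence the text describes.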

ARIMA (Box-Jenkins Approach)

ARIMA stands for Auto-Regressive Integrated Moving Average. It is also known as the Box-Jenkins approach, and it is one of the most popular techniques for time series analysis and forecasting. As its full name indicates, ARIMA involves two model components, applied to a series that may first be differenced (the "Integrated" part of the name):

1. Auto-regressive component
2. Moving average component

1. Auto-regressive Component

It implies a relationship between the value of a series at a point in time and its own previous values. Such a relationship can exist with any order of lag.

Lag - A lag is the value at a previous point in time. It can have various orders, as shown in the table below. It hints toward a pointed relationship, i.e. a dependence on the value at one specific earlier point.
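Lags of different orders can be illustrated by shifting a series back by one or more periods. This is a minimal sketch in plain Python; the monthly values and the helper name `lagged` are hypothetical and are not the figures from the original slide.

```python
def lagged(values, order):
    """Shift `values` back by `order` periods; unavailable entries become None."""
    n = len(values)
    if order < 0:
        raise ValueError("order must be non-negative")
    if order >= n:
        return [None] * n
    return [None] * order + list(values[: n - order])


# Hypothetical monthly values (illustrative only).
months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun"]
values = [100, 104, 103, 108, 112, 115]

lag1 = lagged(values, 1)  # value one month earlier
lag2 = lagged(values, 2)  # value two months earlier

print("month", "value", "lag1", "lag2", sep="\t")
for row in zip(months, values, lag1, lag2):
    print(*row, sep="\t")
```

Each row pairs a month's value with the values one and two months earlier. An auto-regressive model of order p relates the current value to the first p of these lag columns.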

2. Moving Average Component

It implies that the current deviation from the mean depends on previous deviations. Such a relationship can exist with any number of lags, which determines the order of the moving average.

Moving Average - A moving average is the average of consecutive values over several time periods. It can have various orders, as shown in the table below. It hints toward a distributed relationship, since the moving average is itself derived from several lags. The moving average is also considered one of the most rudimentary forecasting methods: if you drag the average formula in Excel further (beyond Dec-15), it gives you the forecast for the next month.
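The spreadsheet-style moving average described above can be sketched in a few lines of plain Python. The numbers and the function names `moving_average` and `forecast_next` are hypothetical. Note that this shows the simple averaging forecast, not the estimation of the moving average term inside a fitted ARIMA model.

```python
def moving_average(values, order):
    """Trailing moving averages: each entry averages `order` consecutive values."""
    if order < 1 or order > len(values):
        raise ValueError("order must be between 1 and len(values)")
    return [
        sum(values[i - order + 1 : i + 1]) / order
        for i in range(order - 1, len(values))
    ]


def forecast_next(values, order):
    """Naive forecast for the next period: mean of the last `order` observations."""
    return moving_average(values, order)[-1]


# Hypothetical monthly values (illustrative only).
sales = [120, 130, 125, 140, 150, 145]

ma3 = moving_average(sales, 3)
next_month = forecast_next(sales, 3)
print("3-period moving averages:", [round(m, 2) for m in ma3])
print("forecast for next month:", round(next_month, 2))
```

Raising `order` smooths the series more heavily but makes the forecast slower to react to recent changes.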

ARIMA Modeling Steps

1. Plot the time series data.
2. Check volatility: run a Box-Cox transformation to stabilize the variance.
3. Check whether the data contain seasonality. If yes, there are two options: either take seasonal differences or fit a seasonal ARIMA model.
4. If the data are non-stationary, take first differences of the data until the data are stationary.
5. Identify the orders p, d and q by examining the ACF/PACF.
6. Try your chosen models, and use the AICc/BIC to search for a better model.
7. Check the residuals from your chosen model by plotting the ACF of the residuals and doing a portmanteau test of the residuals. If they do not look like white noise, try a modified model.
8. Check whether the residuals are normally distributed with mean zero and constant variance.
9. Once steps 7 and 8 are completed, calculate forecasts.

Note: The auto.arima() function automates steps 3 to 6.
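Two of the steps above, differencing and the portmanteau check, can be sketched in plain Python. This is a simplified illustration under stated assumptions: the series is made up, the helper names `difference`, `acf` and `ljung_box_q` are hypothetical, and the Ljung-Box statistic is applied to the differenced series as a stand-in for the residuals of a fitted model.

```python
def difference(values, d=1):
    """Apply first differencing `d` times (the 'd' in ARIMA(p, d, q))."""
    out = list(values)
    for _ in range(d):
        out = [out[i] - out[i - 1] for i in range(1, len(out))]
    return out


def acf(values, lag):
    """Sample autocorrelation of `values` at `lag`."""
    n = len(values)
    mean = sum(values) / n
    denom = sum((v - mean) ** 2 for v in values)
    if denom == 0:
        raise ValueError("series is constant, autocorrelation is undefined")
    numer = sum(
        (values[t] - mean) * (values[t - lag] - mean) for t in range(lag, n)
    )
    return numer / denom


def ljung_box_q(residuals, max_lag):
    """Ljung-Box portmanteau statistic: Q = n(n+2) * sum_k r_k^2 / (n - k)."""
    n = len(residuals)
    return n * (n + 2) * sum(
        acf(residuals, k) ** 2 / (n - k) for k in range(1, max_lag + 1)
    )


# Hypothetical trending series (illustrative values only).
series = [3, 5, 4, 8, 9, 8, 12, 13, 12, 16, 17, 16]

diffed = difference(series, d=1)  # removes the linear trend
q_stat = ljung_box_q(diffed, max_lag=3)
print("differenced:", diffed)
print(f"Ljung-Box Q (3 lags): {q_stat:.2f}")
```

In practice Q is compared against a chi-square critical value; a large Q suggests the residuals are not white noise and the model should be revised. A statistics library would normally supply both the test and its p-value.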

Many simple time series models are special cases of the ARIMA model:

Model                           ARIMA equivalent
Simple Exponential Smoothing    ARIMA(0,1,1)
Holt's Exponential Smoothing    ARIMA(0,2,2)
White noise                     ARIMA(0,0,0)
Random walk                     ARIMA(0,1,0) with no constant
Random walk with drift          ARIMA(0,1,0) with a constant
Autoregression                  ARIMA(p,0,0)
Moving average                  ARIMA(0,0,q)
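The random walk rows can be illustrated with a short simulation. This is a sketch in plain Python with made-up parameters (seed, start value, drift); the function names `random_walk` and `forecast_random_walk` are hypothetical. It shows that differencing a random walk once recovers white noise plus the constant, which is what ARIMA(0,1,0) expresses.

```python
import random


def random_walk(noise, start=0.0, drift=0.0):
    """Cumulative sum of noise plus a constant drift per step: ARIMA(0,1,0)."""
    path = [start]
    for e in noise:
        path.append(path[-1] + drift + e)
    return path


def forecast_random_walk(path, horizon, with_drift=False):
    """Point forecasts: the last value, plus the average past step if drifting."""
    step = (path[-1] - path[0]) / (len(path) - 1) if with_drift else 0.0
    return [path[-1] + step * h for h in range(1, horizon + 1)]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
noise = [rng.gauss(0, 1) for _ in range(50)]  # white noise: ARIMA(0,0,0)
walk = random_walk(noise, start=100.0, drift=0.5)

# First differences of the walk equal the drift plus the original noise.
steps = [walk[i] - walk[i - 1] for i in range(1, len(walk))]

flat = forecast_random_walk(walk, 3)  # no constant
drifting = forecast_random_walk(walk, 3, with_drift=True)  # with a constant
print("last value:", round(walk[-1], 2))
print("forecast without drift:", [round(f, 2) for f in flat])
print("forecast with drift:", [round(f, 2) for f in drifting])
```

Without a constant, every forecast equals the last observed value. With a constant, the forecasts continue along the average historical step.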