2.2.5.1.2 Algorithm for ARIMA

An ARIMA (autoregressive integrated moving average) model may include autoregressive (AR), moving average (MA), and differencing terms. In this app, the NAG function nag_tsa_multi_inp_model_estim (g13bec) is used to fit an ARIMA model [1], and the NAG function nag_tsa_multi_inp_model_forecast (g13bjc) is used to forecast future values from a fitted ARIMA model [2].

ARIMA Model

For a general ARIMA model,

\begin{equation}\tag{1}
\begin{split}
\nabla ^d \nabla_s^D y_t &= c + w_t \\
w_t &= \Phi_1 w_{t-s} + \Phi_2 w_{t-2s} + ... + \Phi_P w_{t-Ps} + e_t - \Theta_1 e_{t-s} - \Theta_2 e_{t-2s} - ... - \Theta_Q e_{t-Qs} \\
e_t &= \phi_1 e_{t-1} + \phi_2 e_{t-2} + ... + \phi_p e_{t-p} + a_t - \theta_1 a_{t-1} - \theta_2 a_{t-2} - ... - \theta_q a_{t-q}
\end{split}
\end{equation}

where y_t is the input time series (t = 1, ..., n); p, d, q are the orders of the autoregressive, differencing and moving average parts, and P, D, Q are the corresponding seasonal orders; s is the seasonal period; c is the mean of the differenced series; \Phi_i (i = 1, ..., P), \Theta_i (i = 1, ..., Q), \phi_i (i = 1, ..., p) and \theta_i (i = 1, ..., q) are the seasonal autoregressive, seasonal moving average, autoregressive and moving average coefficients, respectively; and a_t is the residual.
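As an illustration only (the app itself calls g13bec), equation (1) can be applied forward to recover w_t, e_t and the residuals a_t from a given series and a given set of coefficients. The Python/NumPy sketch below assumes pre-sample values of w_t, e_t and a_t are zero (the conditional convention); the NAG routine handles pre-sample values according to the selected estimation criterion.

```python
import numpy as np

def arima_residuals(y, c, phi, theta, Phi, Theta, d, D, s):
    """Apply equation (1) forward: difference y, then recover e_t and a_t.

    Pre-sample values of w, e and a are taken as zero (conditional
    convention); coefficient arrays may be empty for absent terms.
    """
    # non-seasonal differencing d times, then seasonal differencing D times
    w = np.asarray(y, dtype=float)
    for _ in range(d):
        w = np.diff(w)
    for _ in range(D):
        w = w[s:] - w[:-s]
    w = w - c                              # subtract the mean of the differenced series

    N = len(w)
    e = np.zeros(N)
    a = np.zeros(N)
    for t in range(N):
        # seasonal part: recover e_t from w_t
        e[t] = w[t]
        for i, Phi_i in enumerate(Phi, start=1):
            if t - i * s >= 0:
                e[t] -= Phi_i * w[t - i * s]
        for i, Theta_i in enumerate(Theta, start=1):
            if t - i * s >= 0:
                e[t] += Theta_i * e[t - i * s]
        # non-seasonal part: recover the residual a_t from e_t
        a[t] = e[t]
        for i, phi_i in enumerate(phi, start=1):
            if t - i >= 0:
                a[t] -= phi_i * e[t - i]
        for i, theta_i in enumerate(theta, start=1):
            if t - i >= 0:
                a[t] += theta_i * a[t - i]
    return w, e, a
```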

Estimation

The residual series a_t can be obtained from y_t via equation (1). The sum of squares of the residuals is:

\begin{equation}\tag{2}S = \sum_{-\infty}^n a_t^2 \end{equation}
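Continuing the sketch above (illustration only), the conditional version of this sum uses the N = n - d - s \times D residuals actually computed; the infinite lower limit in equation (2) refers to the additional back-forecast terms used by the exact-likelihood criterion.

```python
# Conditional sum of squares of the residuals (equation (2) truncated to the
# residuals available from the sketch above; y, c, phi, theta, Phi, Theta,
# d, D, s are assumed to be defined as there).
_, _, a = arima_residuals(y, c, phi, theta, Phi, Theta, d, D, s)
S = np.sum(a ** 2)
```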

Estimation Criterion

Three criteria are available:

  • Least Squares

 D = S

The parameters are estimated by iterating to minimize D.

  • Exact Likelihood

y_i, \; i = 0, -1, ..., are treated as unobserved random variables with a known distribution.

 D = M \times S

where the multiplier M is calculated from the ARIMA model parameters.

Minimizing D is equivalent to maximizing the exact likelihood of the data.

  • Marginal Likelihood

 D = M \times S

but with a different value of M. This criterion differs from the exact likelihood method only when the mean term c is included in the model.

In this app, the Marquardt method [4] is used to minimize the objective function D.
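The NAG routine implements its own Marquardt iteration; purely as an illustration of the same idea, the sketch below uses SciPy's Levenberg-Marquardt solver to minimize the conditional least-squares criterion D = S built from the residual function above. The parameter layout and zero starting values are assumptions of this sketch, not the app's behaviour.

```python
from scipy.optimize import least_squares

def fit_arima_ls(y, d, D, s, p, q, P, Q):
    """Levenberg-Marquardt fit minimizing the conditional sum of squares.

    Parameter vector layout (an assumption of this sketch):
    [c, phi_1..phi_p, theta_1..theta_q, Phi_1..Phi_P, Theta_1..Theta_Q].
    """
    def residual_vec(params):
        c     = params[0]
        phi   = params[1:1 + p]
        theta = params[1 + p:1 + p + q]
        Phi   = params[1 + p + q:1 + p + q + P]
        Theta = params[1 + p + q + P:]
        _, _, a = arima_residuals(y, c, phi, theta, Phi, Theta, d, D, s)
        return a                           # least_squares minimizes sum(a**2)

    x0 = np.zeros(1 + p + q + P + Q)       # crude zero starting values
    res = least_squares(residual_vec, x0, method="lm")
    return res.x, res
```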

Quantities

  • Residual a_t

Residuals are available at t \ge 1 + d + s \times D.

  • Fitted y

\hat{y}_t = y_t - a_t

  • Residual Degrees of Freedom

The length of the differenced series is N = n - d - s \times D, and the residual degrees of freedom are df = N - (\text{number of estimated parameters}).

  • Residual Variance

erv = \frac{S}{df}

  • Covariance Matrix of Parameters

C = erv \times H^{-1}

where H is the linearised least squares matrix in the final iteration.
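Continuing the fitting sketch (illustration only), these quantities can be approximated from the solver result res, taking H \approx J^T J with J the Jacobian of the residual vector at the solution in place of the matrix formed internally by the NAG routine.

```python
params, res = fit_arima_ls(y, d, D, s, p, q, P, Q)   # from the sketch above
a     = res.fun                      # residuals a_t at the optimum
y_hat = np.asarray(y, float)[d + s * D:] - a         # fitted values y_t - a_t, t >= 1 + d + s*D
df    = len(a) - len(params)         # residual degrees of freedom N - (number of parameters)
erv   = np.sum(a ** 2) / df          # residual variance erv = S / df
H     = res.jac.T @ res.jac          # stand-in for the linearised least squares matrix
C     = erv * np.linalg.inv(H)       # covariance matrix of the parameter estimates
```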

Forecast

To forecast the time series y_t at t = n + 1, ..., n + L, set a_t = 0 for t = n + 1, ..., n + L, and calculate the predicted values by applying equation (1) in reverse:

\begin{equation}\tag{3}
\begin{split}
e_t &= \phi_1 e_{t-1} + \phi_2 e_{t-2} + ... + \phi_p e_{t-p} + a_t - \theta_1 a_{t-1} - \theta_2 a_{t-2} - ... - \theta_q a_{t-q} \\
w_t &= \Phi_1 w_{t-s} + \Phi_2 w_{t-2s} + ... + \Phi_P w_{t-Ps} + e_t - \Theta_1 e_{t-s} - \Theta_2 e_{t-2s} - ... - \Theta_Q e_{t-Qs} \\
y_t &= (\nabla ^d \nabla_s^D)^{-1} (c + w_t)
\end{split}
\end{equation}
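Purely as an illustration of equation (3) (the app uses g13bjc for forecasting), the sketch below continues the earlier ones: future shocks a_t are set to zero, e_t and w_t are propagated by the ARMA recursions, and the differencing operator is inverted to recover the forecasts of y.

```python
def arima_forecast(y, c, phi, theta, Phi, Theta, d, D, s, L):
    """Forecast y_{n+1}, ..., y_{n+L} by equation (3) with future a_t = 0."""
    w, e, a = arima_residuals(y, c, phi, theta, Phi, Theta, d, D, s)
    w, e, a = list(w), list(e), list(a)

    # coefficients of (1 - B)^d (1 - B^s)^D; delta[0] == 1
    delta = np.array([1.0])
    for _ in range(d):
        delta = np.convolve(delta, [1.0, -1.0])
    for _ in range(D):
        delta = np.convolve(delta, [1.0] + [0.0] * (s - 1) + [-1.0])

    y_ext = list(np.asarray(y, dtype=float))
    for _ in range(L):
        t = len(w)                         # index of the next differenced value
        a.append(0.0)                      # future shocks are set to zero
        e_new = sum(phi[i - 1] * e[t - i] for i in range(1, len(phi) + 1) if t - i >= 0) \
              - sum(theta[i - 1] * a[t - i] for i in range(1, len(theta) + 1) if t - i >= 0)
        e.append(e_new)
        w_new = e_new \
              + sum(Phi[i - 1] * w[t - i * s] for i in range(1, len(Phi) + 1) if t - i * s >= 0) \
              - sum(Theta[i - 1] * e[t - i * s] for i in range(1, len(Theta) + 1) if t - i * s >= 0)
        w.append(w_new)
        # invert the differencing: y_t = c + w_t - sum_{j>=1} delta_j * y_{t-j}
        y_new = c + w_new - sum(delta[j] * y_ext[-j] for j in range(1, len(delta)))
        y_ext.append(y_new)
    return np.array(y_ext[len(y_ext) - L:])
```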

The forecast error variance of y_{n+L} can be calculated as:

S_L^2 = V_n \times (\psi_0^2 + \psi_1^2 + ... + \psi_{L-1}^2)

where V_n is the residual variance of the ARIMA model, and \psi_i are the psi-weights of the model as defined in [3].
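As a final illustration continuing the earlier sketches, the psi-weights can be obtained by expanding \theta(B)\Theta(B^s) / (\phi(B)\Phi(B^s)\nabla^d\nabla_s^D) as in [3]; the variable erv from the covariance sketch plays the role of V_n here.

```python
def psi_weights(phi, theta, Phi, Theta, d, D, s, L):
    """First L psi-weights of the ARIMA model, as defined in [3]."""
    def poly(coeffs, step):
        # polynomial 1 - c_1 B^step - c_2 B^(2 step) - ... (array of B powers)
        p = np.zeros(step * len(coeffs) + 1)
        p[0] = 1.0
        for i, ci in enumerate(coeffs, start=1):
            p[i * step] = -ci
        return p

    ar = np.convolve(poly(phi, 1), poly(Phi, s))       # AR side ...
    for _ in range(d):
        ar = np.convolve(ar, [1.0, -1.0])              # ... times (1 - B)^d
    for _ in range(D):
        ar = np.convolve(ar, [1.0] + [0.0] * (s - 1) + [-1.0])   # ... times (1 - B^s)^D
    ma = np.convolve(poly(theta, 1), poly(Theta, s))   # MA side

    psi = np.zeros(L)
    psi[0] = 1.0
    for j in range(1, L):
        acc = ma[j] if j < len(ma) else 0.0            # signed MA coefficient of B^j
        for k in range(1, min(j, len(ar) - 1) + 1):
            acc -= ar[k] * psi[j - k]                  # AR recursion term
        psi[j] = acc
    return psi

# forecast error variance of y_{n+L}: S_L^2 = V_n * (psi_0^2 + ... + psi_{L-1}^2)
psi  = psi_weights(phi, theta, Phi, Theta, d, D, s, L)
S_L2 = erv * np.sum(psi ** 2)
```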

References

  1. NAG Library function nag_tsa_multi_inp_model_estim (g13bec) documentation.
  2. NAG Library function nag_tsa_multi_inp_model_forecast (g13bjc) documentation.
  3. Box, G. E. P. and Jenkins, G. M. (1976). Time Series Analysis: Forecasting and Control (Revised Edition). Holden-Day.
  4. Marquardt, D. W. (1963). "An Algorithm for Least-Squares Estimation of Nonlinear Parameters". J. Soc. Indust. Appl. Math., 11(2), 431-441.