Highlights

• Model selection for ARIMA and ARFIMA models

• AIC, BIC, and HQIC information criteria

Want to find the best ARIMA or ARFIMA model for your data? Compare potential models using AIC, BIC, and HQIC. Use the new arimasoc and arfimasoc commands to select the best number of autoregressive and moving-average terms.

Researchers using autoregressive moving-average (ARMA) models must decide on the proper number of lags to include for the autoregressive and moving-average parameters in their models. Information criteria, which balance model fit against model parsimony, often guide the choice of the number of lags.

arimasoc and arfimasoc assist in model selection by fitting a collection of autoregressive integrated moving average (ARIMA) or autoregressive fractionally integrated moving average (ARFIMA) models and computing information criteria for each model. arimasoc and arfimasoc compute the Akaike information criterion (AIC), the Bayesian information criterion (BIC), and the Hannan–Quinn information criterion (HQIC). The selected model is the one with the lowest value of the information criterion.

#### Let's see it work

We would like to fit an ARMA model for the output gap. We use arimasoc to fit candidate models with a maximum autoregressive lag of 4 and a maximum moving-average lag of 3, which gives (4 + 1) × (3 + 1) = 20 candidates, from ARMA(0,0) through ARMA(4,3).

. webuse usmacro
(Federal Reserve Economic Data - St. Louis Fed)

. arimasoc ogap, maxar(4) maxma(3)
Fitting models (20): .................... done

Lag-order selection criteria

Sample: 1954q3 thru 2010q4                        Number of obs = 226

Model           LL         df        AIC        BIC       HQIC

ARMA(0,0)   -549.4036          2   1102.807   1109.648   1105.568
ARMA(0,1)   -435.0753          3   876.1507   886.4123   880.2919
ARMA(0,2)    -361.249          4   730.4981   744.1802   736.0196
ARMA(0,3)    -330.844          5   671.6879   688.7906   678.5898
ARMA(1,0)   -292.3313          3   590.6625   600.9241   594.8037
ARMA(1,1)   -281.5762          4   571.1524   584.8345   576.6739
ARMA(1,2)   -275.3697          5   560.7395   577.8422   567.6414
ARMA(1,3)    -274.029          6    560.058   580.5812   568.3403
ARMA(2,0)   -276.5956          4   561.1912   574.8733   566.7127
ARMA(2,1)   -273.9052          5   557.8104   574.9131   564.7123
ARMA(2,2)   -273.2799          6   558.5599   579.0831   566.8422
ARMA(2,3)   -273.2587          7   560.5174   584.4611   570.1801
ARMA(3,0)   -273.2421          5   556.4843    573.587   563.3862
ARMA(3,1)   -273.1883          6   558.3766   578.8998   566.6589
ARMA(3,2)   -273.0747          7   560.1494   584.0931   569.8121
ARMA(3,3)   -272.9944          8   561.9888   589.3531   573.0319
ARMA(4,0)   -273.2006          6   558.4012   578.9244   566.6835
ARMA(4,1)   -273.0027          7   560.0055   583.9492   569.6682
ARMA(4,2)    -273.071          8    562.142   589.5063   573.1851
ARMA(4,3)   -272.9868          9   563.9735   594.7584    576.397
Selected (max) LL:   ARMA(4,3)
Selected (min) AIC:  ARMA(3,0)
Selected (min) BIC:  ARMA(3,0)
Selected (min) HQIC: ARMA(3,0)


The output table provides information about each model, including the maximized log likelihood, the number of parameters estimated, and the AIC, BIC, and HQIC.
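
As a rough check on how the criteria columns are built, the ARMA(3,0) row can be reproduced by hand. The sketch below assumes the standard definitions AIC = -2 lnL + 2k, BIC = -2 lnL + k ln(N), and HQIC = -2 lnL + 2k ln(ln(N)), where lnL is the LL column, k is the df column, and N is the number of observations. These formulas are not stated on this page, so see [TS] arimasoc for the exact definitions Stata uses.

```stata
* Values copied from the ARMA(3,0) row: LL = -273.2421, df = 5, N = 226

* AIC = -2*lnL + 2*k
display -2*(-273.2421) + 2*5

* BIC = -2*lnL + k*ln(N)
display -2*(-273.2421) + 5*ln(226)

* HQIC = -2*lnL + 2*k*ln(ln(N))
display -2*(-273.2421) + 2*5*ln(ln(226))
```

The three results should agree with the AIC, BIC, and HQIC entries for ARMA(3,0) up to rounding of the displayed log likelihood.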

Below the output table, the model selected by each criterion is listed. The log likelihood is maximized by the model with the most parameters, ARMA(4,3). The AIC, BIC, and HQIC all select the more parsimonious ARMA(3,0) model for the output gap. We can now fit our selected model

. arima ogap, arima(3,0,0)
(output omitted)


and proceed to investigate model predictions, forecasts, etc.
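
A minimal sketch of those next steps follows, assuming the usmacro data end in 2010q4, as the sample line in the output above suggests. The variable names ogap_hat and ogap_f and the 8-quarter horizon are illustrative choices, not part of the original example.

```stata
webuse usmacro, clear
arima ogap, arima(3,0,0)

* Information criteria for the fitted model
estat ic

* Check the stability (stationarity) condition of the AR parameters
estat aroots

* One-step-ahead in-sample predictions
predict ogap_hat, xb

* Extend the dataset by 8 quarters beyond the last observation
tsappend, add(8)

* One-step-ahead predictions through 2010q4, dynamic forecasts from 2011q1
predict ogap_f, dynamic(tq(2011q1))

* Plot the observed series together with the predictions and forecasts
tsline ogap ogap_f
```

If the data end in a different quarter, the date passed to dynamic() would need to be adjusted to the first out-of-sample period.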

Fitting an ARFIMA model instead of an ARIMA model? Instead of typing

. arimasoc y, maxar(4) maxma(3)


you type

. arfimasoc y, maxar(4) maxma(3)
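
Once arfimasoc has reported the selected orders, the chosen model can be fit with arfima. The sketch below assumes, purely for illustration, that a model with two autoregressive lags and one moving-average lag was selected. y is the placeholder series from the commands above, and the data are assumed to be tsset already.

```stata
* Hypothetical selected orders: AR lags 1 and 2, MA lag 1
arfima y, ar(1/2) ma(1)
```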


#### Tell me more

Read more about ARIMA and ARFIMA model selection in the Stata Time-Series Reference Manual; see [TS] arimasoc and [TS] arfimasoc.

View all the new features in Stata 18 and, in particular, New in time series.