Thanks for the detailed explanation! This seems to work. However, if
I use the 'condition' option, I still get the same problem (matsize
being too small) when I try to make out-of-sample forecasts.
'diffuse' works OK, if slowly, as advised.
--- In email@example.com, vwiggins@s... (Vince Wiggins,
> Clarence Tam <Clarence.Tam@l...> asks whether he needs to have
> Stata SE to estimate an arima model with an MA term at the 52nd lag,
> > [...] Model diagnostics suggest that there's a residual seasonal
> > correlation (at week 52) both in the ACF and PACF. My next step is
> > going to be to include an additional AR or MA term to account for
> > this, but I'm not sure how to do it. I've tried:
> > . arima DS52.lnreps, ar(1) ma(1 52) noconstant
> > but Stata says that the matsize is too small, even though it's set
> > at the maximum of 800 (I'm using Intercooled Stata 8.0).
> > Does anyone have any suggestions on how to get round this problem
> > (preferably ones that don't involve upgrading to Stata SE...)?
> Clarence does not need to upgrade to SE.
> The message he received after his -arima- command should have been,
> matsize too small, must be max(AR, MA+1)^2
> use -diffuse- option or type -help matsize-
> In this case, with the maximum MA being 52, the message implies that
> a matrix size of 53^2=2809 is required, and that would indeed require
> SE. The first suggestion in the message, however, will let him use
> Intercooled Stata to estimate the model. If Clarence types,
> . arima DS52.lnreps, ar(1) ma(1 52) noconstant diffuse
> he should be able to estimate the model.
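The arithmetic behind that message can be checked directly in Stata.
A minimal sketch, assuming the largest AR lag is 1 and the largest MA
lag is 52, as in the command above:

```stata
* dimension of the state vector: max(largest AR lag, largest MA lag + 1)
display max(1, 52 + 1)

* matrix size the error message asks for: max(AR, MA+1)^2
display max(1, 52 + 1)^2
```

The second value, 2809, is well above the Intercooled maximum of 800
mentioned in the original question.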
> By default -arima- uses a Kalman filter to produce unconditional
> maximum likelihood estimates of the specified model. To obtain these
> estimates the Kalman filter must be initialized with the expected
> value of the initial state vector and the MSE of this vector. These
> initial values depend on the current parameter estimates and in
> computing the MSE we must work with a square matrix the size of the
> state vector -- max(AR, MA+1)^2. Thus, the need for such a large
> matrix. These are the most efficient estimates for the model because
> the initial state vector and its MSE are forced to conform to the
> current parameter estimates.
> We can, however, obtain slightly less efficient estimates by assuming
> that the initial state vector is zero and its variance is unknown and
> infinite. This is what the -diffuse- option specifies. This
> essentially down-weights the initial observations until the data
> itself can be used to develop a state vector and its MSE.
> With large datasets, the two estimates tend to be close.
> Even though this model has only 4 parameters, including sigma, the
> filter iterations may be somewhat slow because the filter must
> maintain a state vector that is the maximum of the largest AR or MA
> term and will thus be flopping around some pretty large matrices to
> compute the likelihood at each observation. For this reason, I would
> recommend that Clarence use the -condition- option to estimate the
> model,
> . arima DS52.lnreps, ar(1) ma(1 52) noconstant condition
> The -condition- option specifies conditional maximum likelihood
> estimates rather than unconditional. These estimates do not require
> maintaining a state vector. Specifically, all pre-sample values of
> the white noise and autocorrelated, u_t, disturbances are taken to
> be 0 and the MSE of the white noise is taken to be constant over the
> entire sample. Effectively this means that the initial observations
> in the sample get just as much weight as the end observations even
> though we know less about them. We know less because the process is
> autocorrelated and this implies that knowing the past observations
> tells us something about the current observation, and nothing is
> known about the pre-sample observations.
> What unconditional maximum likelihood effectively does is use the
> estimates to imply information about the pre-sample while optimally
> down-weighting this information so that the initial observations get
> a little less weight than the remaining observations.
> What the -diffuse- option effectively does is to say we know nothing
> about the pre-sample and accordingly down-weights the initial
> observations in the sample even more.
> What conditional maximum likelihood effectively does is assume that
> the pre-sample values are their long-run expected value of zero,
> that we know this just as well as we know later values, and
> accordingly weights the initial observations equally with the
> remaining observations.
> With large datasets, it generally does not matter which method we
> use because the contribution of the initial observations is dominated
> by the remaining data. Note, however, that "large" must be used
> carefully when the model has large autocorrelation terms.
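One way to see how close the two feasible estimators are on a given
dataset is to fit the model both ways and put the results side by
side. A minimal sketch, assuming the data are already -tsset- and that
-estimates store- and -estimates table- are available in the installed
version; the names m_diffuse and m_condition are only illustrative:

```stata
* diffuse prior for the initial state (avoids the large matsize)
arima DS52.lnreps, ar(1) ma(1 52) noconstant diffuse
estimates store m_diffuse

* conditional maximum likelihood (no state vector maintained)
arima DS52.lnreps, ar(1) ma(1 52) noconstant condition
estimates store m_condition

* coefficients and standard errors from both fits in one table
estimates table m_diffuse m_condition, se
```

If the coefficients differ noticeably, that suggests the sample is
not yet "large" relative to the 52-week lag in the sense described
above.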
> -- Vince
> * For searches and help try:
> * http://www.stata.com/support/faqs/res/findit.html
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/