|Title||Obtaining different results when executing xttobit on the same data after resorting|
|Author||Jean Marie Linhart, StataCorp, and Vince Wiggins, StataCorp|
I recently ran a do-file that fit several xttobit models, changing the sort order in between.
I was surprised to find that I got different results. The puzzle to me is that the iterations that fit the full model are identical to what I estimated the other day (through iteration 6), and then they diverge.
The data have not changed, although they were resorted, and the correct sample and variables are being used. The only difference is the sort. Is there some random component due to the use of quadrature?
xttobit and other commands that fit random-effects models, such as xtlogit, xtprobit, xtcloglog, xtintreg, xtoprobit, xtologit, and xtpoisson with the re and normal options, use adaptive Gauss–Hermite quadrature to approximate the high-dimension integrals that are part of the likelihood for these models.
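To make the idea of a quadrature approximation concrete, here is a minimal sketch in Python (not Stata, and not Stata's implementation). It assumes a plain, non-adaptive 3-point Gauss–Hermite rule and approximates the expectation of a function of a normal random effect. The function name `normal_expectation` and the choice of test function are illustrative assumptions only; the adaptive version used by the estimation commands additionally recenters and rescales the nodes for each panel.

```python
import math

# Nodes and weights of the 3-point Gauss-Hermite rule for the weight exp(-x**2).
NODES = (-math.sqrt(1.5), 0.0, math.sqrt(1.5))
WEIGHTS = (
    math.sqrt(math.pi) / 6,
    2 * math.sqrt(math.pi) / 3,
    math.sqrt(math.pi) / 6,
)


def normal_expectation(g, sigma):
    """Approximate E[g(u)] for u ~ N(0, sigma**2) (hypothetical helper).

    Uses the change of variables u = sqrt(2) * sigma * x, which turns the
    normal density into the exp(-x**2) weight expected by Gauss-Hermite.
    """
    total = 0.0
    for x, w in zip(NODES, WEIGHTS):
        total += w * g(math.sqrt(2.0) * sigma * x)
    return total / math.sqrt(math.pi)


# Compare the approximation with the known closed form E[exp(u)] = exp(sigma**2 / 2).
sigma = 0.5
approx = normal_expectation(math.exp, sigma)
exact = math.exp(sigma ** 2 / 2)
print(approx, exact)
```

With only three nodes the approximation is close but not exact for this smooth integrand. The quality of the approximation depends on the number of nodes and on how sharply peaked the integrand is, which is why the number of integration points matters in practice.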
As the user suggests, there is a seemingly random component to quadrature in that the within-panel sort affects the likelihood approximation. Floating-point addition and other arithmetic operations are not associative on a computer, so the order in which terms are accumulated can change the last digits of the result. Sort order can therefore affect any calculation, but it particularly affects the quadrature approximation to the likelihood.
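The effect of ordering on finite-precision arithmetic can be seen in a small sketch, written here in Python rather than Stata. The helper `naive_sum` and the values are illustrative assumptions, chosen to exaggerate an effect that is normally confined to the last few digits.

```python
def naive_sum(values):
    """Add the values left to right in double precision (hypothetical helper)."""
    total = 0.0
    for v in values:
        total += v
    return total


# The same three numbers in two different orders.
order_a = [1e16, 1.0, -1e16]   # the 1.0 is lost when added to 1e16
order_b = [1e16, -1e16, 1.0]   # the large terms cancel first, so the 1.0 survives

print(naive_sum(order_a))
print(naive_sum(order_b))

# Regrouping the same terms can also change the result.
print((0.1 + 0.2) + 0.3 == 0.1 + (0.2 + 0.3))
```

The two sums differ even though the inputs are identical, and the final comparison is false. A likelihood is accumulated over many observations in the same way, so a different sort can produce slightly different values, and an iterative optimizer can then follow a slightly different path.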
Quadrature is one of the most accepted approaches to fitting these models, but there are three cases where it often breaks down: (1) large panel sizes, (2) high within-panel correlation, and (3) variables that are constant or near constant within panel. I don’t know if any of these are true for these data, but any observation that contributes in an extreme way to the likelihood can cause problems. See [XT] quadchk for a good discussion of these issues.
Stata’s quadchk command can help tremendously in assessing whether your data are appropriate for estimation using the quadrature approximation. quadchk works with all the estimation commands that use quadrature, and I definitely recommend that the user try quadchk on the model. I also heartily recommend that people fitting a random-effects model by quadrature check whether the quadrature is stable for their model. If you’re using Stata, use quadchk to do this.
We believe that if quadchk shows that the quadrature approximation is stable for your model, then you can trust the results, even if there are (slight) differences when the data have a different sort order.
We have tried to point everyone using commands that use quadrature to quadchk by providing a Technical Note or example in the manual entry. In hindsight, these suggestions could have been stronger.
Commands using quadrature have been the source of some debate around StataCorp. None of us are wholly comfortable with estimators that are prone to instability, even if that instability arises only in extreme cases. That is why we feel so strongly about providing quadchk to assess the appropriateness of the estimator for a given dataset. The near-consensus here is that these estimators are valuable to those who need them even though they require care from all who use them. They are stable for most datasets. Admittedly, these are leading-edge models, and fitting them requires more understanding of numerical and approximation issues on the part of the user than do most other estimation commands.