This page contains only historical information and is not about the current
release of Stata.
Please see our capabilities page
for information on the current version of Stata.
Statistical features in Stata 8
* Many economic time series are cointegrated and require specialized
statistical methods to analyze them. Economic variables, such as
consumption, investment, and income, tend to grow over time, while the
differences between any two of those variables never deviate too far
from a constant equilibrium value. VECMs are used to model such
relationships.
Stata's VECM suite includes commands for testing for cointegration and
determining the number of cointegrating relationships, choosing the lag
order, and fitting the model. Additional commands facilitate
post-estimation diagnostic analyses, including testing for stability,
autocorrelated residuals, and normality.
* The new vec command fits cointegrated
vector error-correction models, also known as VECMs.
* The new vecrank command
produces statistics used to determine the number of cointegrating equations in a VECM.
* The new fcast command replaces the old command varfcast and
produces dynamic forecasts of the dependent variables after fitting a VAR,
SVAR, or VECM.
* The new irf command replaces the old command varirf and
does everything the old command did and more. irf estimates the
impulse–response functions, cumulative impulse–response functions,
orthogonalized impulse–response functions, structural
impulse–response functions and forecast-error variance decompositions
(FEVDs) after fitting a VAR, SVAR, or VECM. Results can be graphed and
presented in tables.
The old varirf command continues to work but is not documented. If
you have old .irf files, they will work with the old varirf
command and the new irf command.
* The varsoc command can be used to obtain lag-order selection statistics
for VECMs, as well as VARs.
* The new veclmar command computes Lagrange-multiplier test statistics
for residual autocorrelation after fitting a VECM.
* The new vecnorm command computes a series of test statistics against
the null hypothesis that the disturbances are normally distributed after
fitting a VECM. For each equation, and for all equations jointly, three
statistics are computed: a skewness statistic, a kurtosis statistic, and the
Jarque–Bera statistic.
* The new vecstable command checks the eigenvalue stability condition
after fitting a VECM.
* The new vecstable command and the command varstable
now have a graph option that produces publication-quality graphs to
facilitate interpreting and presenting the stability results.
* The new haver command makes it easy to load and analyze the economic
and financial databases available from Haver Analytics.
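The three normality statistics named above for vecnorm can be illustrated with a small univariate sketch. The Python code below is not Stata code. It computes the textbook sample skewness, kurtosis, and Jarque–Bera statistic, JB = n/6 · (S² + (K − 3)²/4), for a single residual series. The function name and the toy residuals are made up for illustration, and no claim is made that this matches how Stata computes the per-equation or joint statistics.

```python
def jarque_bera(residuals):
    """Return (skewness, kurtosis, JB) for one series of residuals."""
    n = len(residuals)
    mean = sum(residuals) / n
    # Central moments with divisor n (the usual moment estimators)
    m2 = sum((r - mean) ** 2 for r in residuals) / n
    m3 = sum((r - mean) ** 3 for r in residuals) / n
    m4 = sum((r - mean) ** 4 for r in residuals) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    # Under normality, skewness is 0 and kurtosis is 3, so JB is near 0
    jb = n / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)
    return skewness, kurtosis, jb


# Toy residuals, made up for illustration only
resid = [-2.0, -1.0, 0.0, 1.0, 2.0]
s, k, jb = jarque_bera(resid)
print(s, k, jb)
```

Under the null hypothesis of normal disturbances, the textbook JB statistic is compared with a chi-squared distribution with 2 degrees of freedom.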
- Stata now can fit vector autoregression (VAR) and structural
vector autoregression (SVAR) models.
A suite of new commands allows you to estimate, tabulate, and graph
impulse–response functions, cumulative impulse–response
functions, orthogonalized impulse–response functions, structural
impulse–response functions, and their confidence intervals, along
with forecast-error variance decompositions and structural
forecast-error variance decompositions. This suite also allows
graphical comparisons of IRFs and variance decompositions across models
and orderings.
A full suite of diagnostic and testing tools is also provided, including
Granger causality tests, Lagrange-multiplier (LM) tests for
residual autocorrelation, tests for normality of the disturbances,
lag-order selection statistics, eigenvalue stability checks, and Wald
tests that the coefficients on the endogenous variables at a given lag
are zero, both for each equation separately and for all equations jointly.
- The new tssmooth command smooths and predicts univariate
time series using weighted or unweighted moving averages,
single-exponential smoothing, double-exponential smoothing,
Holt–Winters nonseasonal smoothing, Holt–Winters seasonal
smoothing, or nonlinear smoothing.
- The new archlm command computes a Lagrange-multiplier test
for autoregressive conditional heteroskedasticity (ARCH) effects in the
residuals after regress.
- The new bgodfrey command computes the
Breusch–Godfrey Lagrange-multiplier (LM) test for serial
correlation in the disturbances after regress.
- The new durbina command computes the Durbin (1970)
alternative statistic to test for serial correlation in the disturbances
after regress when some of the regressors are not strictly
exogenous.
- The new dfgls command performs the modified
Dickey–Fuller t test for a unit root (proposed by Elliott,
Rothenberg, and Stock [1996]) using models with 1 to maxlags lags
of the first-differenced variable in an augmented Dickey–Fuller
regression.
- The existing arima command may now be used with the
by prefix command, and it now allows prediction in loops
over panels.
* new in Stata 8 as of July 2004
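Of the smoothers listed for tssmooth, single-exponential smoothing is the simplest to write down. The Python sketch below is not Stata code. It applies the textbook recursion s_t = α·x_t + (1 − α)·s_{t−1}. The function name, the explicit starting value s0, and the toy series are assumptions made for illustration. Stata's own choice of starting value and its handling of the smoothing parameter are not reproduced here.

```python
def exp_smooth(series, alpha, s0):
    """Single-exponential smoothing: s_t = alpha*x_t + (1 - alpha)*s_{t-1}."""
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must lie in (0, 1]")
    smoothed = []
    prev = s0  # starting value supplied by the caller (an assumption)
    for x in series:
        prev = alpha * x + (1.0 - alpha) * prev
        smoothed.append(prev)
    return smoothed


# Toy series, made up for illustration only
sales = [10.0, 12.0, 11.0, 13.0]
smoothed = exp_smooth(sales, alpha=0.5, s0=10.0)
forecast = smoothed[-1]  # one-step-ahead forecast is the last smoothed value
print(smoothed, forecast)
```

A larger alpha tracks recent observations more closely, while a smaller alpha gives a smoother series.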
- The new xthtaylor command fits panel-data random-effects
models using the Hausman–Taylor and the Amemiya–MaCurdy
instrumental-variables estimators.
- The new xtfrontier command fits stochastic production or
cost frontier models for panel data allowing two different
parameterizations for the inefficiency term: a time-invariant model and
the Battese–Coelli (1992) parameterization of time effects.
- The existing xtivreg command will now optionally report
first-stage results of Baltagi's EC2SLS random-effects estimator.
- After estimation, the existing xttobit and xtintreg commands
can now predict the probability that the dependent variable
is uncensored, the corresponding expected value
E(y | #_a < y < #_b), and the expected value of the dependent
variable truncated at the censoring point(s).
- Using stcox, you can now fit Cox semiparametric
proportional-hazards models that allow for gamma-distributed frailty. In
this model, frailty is assumed to be shared across groups of observations.
Previously, if you wanted to analyze multivariate survival data using the
Cox model, you would fit a standard model and account for the correlation
within groups by adjusting the standard errors for clustering. Now, you
can directly model the correlation by assuming a latent gamma-distributed
random effect or frailty; observations within a group are correlated because
they share the same frailty. Estimation is done via penalized likelihood.
You can estimate the frailty variance and obtain group-level frailty
estimates.
sts graph and stcurve
(after stcox) can now plot estimated hazard functions,
which are calculated as weighted kernel smooths of the estimated hazard
contributions.
- streg has a new option, shared(varname), for fitting parametric
shared-frailty models, which are analogous to random-effects models for
panel data. streg can also fit frailty models in which the
frailties are assumed to be randomly distributed at the observation level.
After estimation, both predictions conditional on a frailty equal to 1 and
unconditional predictions (predictions averaged over the frailty
distribution) are available.
- Stata's stepwise and fractional polynomial specification-search methods
now work with stcox and streg.
- Stata's programmable maximum likelihood estimation routine ml
has new options that automatically handle the production of survey
estimators, including stratification and estimation on a subpopulation.
- Survey estimation is now available for the Heckman selection model and the
Heckman selection model applied to probit.
- Survey estimation is now available for negative-binomial regression and
generalized negative-binomial regression.
- Constraints may now be applied to equations using survey estimators,
as with Stata's other estimators.
- Point estimates, standard errors, and confidence intervals are
now available for linear combinations of estimated parameters,
as with Stata's other estimators.
- Point estimates, standard errors, and confidence intervals are now
available for nonlinear combinations of estimated parameters.
- Estimators for nonlinear combinations and generalized predictions
are available.
- Ward's linkage hierarchical clustering and Ward's method (also known as
minimum-variance clustering) are now available.
- Weighted-average linkage hierarchical clustering, supplementing the
previously available average linkage clustering, is now available.
- Centroid linkage hierarchical clustering is now available.
- Median linkage hierarchical clustering, also known as Gower's method,
is now available.
- Stopping rules may now be specified. Two popular stopping rules are
provided: the Caliński and Harabasz pseudo-F index (Caliński
and Harabasz [1974]) and the Duda and Hart Je(2)/Je(1) index with
associated pseudo-T-squared (Duda and Hart [1973]). Additional
stopping rules can be added.
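- Two new dissimilarity measures have been added: squared Euclidean
distance and the Minkowski distance metric with argument a
raised to the a power, that is, the sum of the absolute coordinate
differences each raised to the power a, without the final a-th root.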
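The two new dissimilarity measures are closely related, which a short sketch makes concrete. The Python code below is not Stata code. It computes the Minkowski distance with argument a, the same quantity raised to the a power (the sum without the final root), and squared Euclidean distance as the special case a = 2. The function names and the sample points are hypothetical.

```python
def minkowski_power(x, y, a):
    """Sum of |x_i - y_i|**a: the Minkowski distance raised to the a power."""
    if len(x) != len(y):
        raise ValueError("points must have the same number of coordinates")
    return sum(abs(xi - yi) ** a for xi, yi in zip(x, y))


def minkowski(x, y, a):
    """Ordinary Minkowski distance: the a-th root of the sum above."""
    return minkowski_power(x, y, a) ** (1.0 / a)


def squared_euclidean(x, y):
    """Squared Euclidean distance is the a = 2 case without the root."""
    return minkowski_power(x, y, 2)


# Two sample observations, made up for illustration only
p = [0.0, 0.0]
q = [3.0, 4.0]
print(squared_euclidean(p, q), minkowski(p, q, 2), minkowski_power(p, q, 3))
```

For these two points the squared Euclidean distance is 25, the Euclidean distance is 5, and the measure with a = 3 raised to the third power is 27 + 64 = 91.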
The following new estimation procedures are available, in addition to the new
estimators listed in previous sections:
- MANOVA and MANCOVA, with balanced and unbalanced designs, including
designs with missing cells, and with factorial, nested, or mixed designs.
- The rank-ordered logit model, also known as the exploded logit model,
generalizes McFadden's choice model as fitted by
clogit. In the choice model, only the alternative that
maximizes utility is observed. rologit fits the
corresponding model in which the preference ranking of the alternatives is
observed, not just the alternative that is ranked first.
rologit supports incomplete rankings and ties
("indifference").
- Stochastic frontier models with technical or cost-inefficiency effects.
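The link between the choice model and the rank-ordered logit model described in the list above can be sketched numerically. In the textbook exploded logit, a ranking is treated as a sequence of choices: the top-ranked alternative is chosen from all alternatives, the second from those remaining, and so on. The Python code below is not Stata code and is not claimed to match rologit's implementation. The function name and the utilities are hypothetical, and ties are not handled.

```python
import math


def ranking_log_prob(utilities, ranking):
    """Log probability of an observed ranking under the exploded logit model.

    utilities: dict mapping each alternative to its systematic utility.
    ranking:   alternatives from most to least preferred. It may stop early,
               in which case unlisted alternatives are treated as ranked
               below the listed ones (an incomplete ranking).
    """
    remaining = set(utilities)
    log_prob = 0.0
    for alt in ranking:
        # Logit probability that `alt` is chosen among those still unranked
        log_denominator = math.log(sum(math.exp(utilities[a]) for a in remaining))
        log_prob += utilities[alt] - log_denominator
        remaining.remove(alt)
    return log_prob


# Three equally attractive alternatives, made up for illustration only
v = {"a": 0.0, "b": 0.0, "c": 0.0}
p_full = math.exp(ranking_log_prob(v, ["a", "b", "c"]))
p_first = math.exp(ranking_log_prob(v, ["a"]))
print(p_full, p_first)
```

With equal utilities, a complete ranking has probability 1/3 · 1/2 · 1 = 1/6, while observing only the first choice gives 1/3, which is the ordinary choice-model probability.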
Also, Stata 8 includes the following new and enhanced commands:
- New command mfp selects the fractional polynomial model
that best predicts the dependent variable from the independent variables.
- The new nlcom command computes point estimates, standard
errors, t and Z statistics, p-values, and confidence
intervals for nonlinear combinations of coefficients after any estimation
command. Results are displayed in the table format commonly used for
displaying estimation results. The standard errors are based on the
delta method.
- The new predictnl command produces
nonlinear predictions after any Stata estimation command and can
optionally calculate the variance, standard errors, Wald-test statistics,
significance levels, and pointwise confidence intervals for these
predictions. Unlike with testnl and nlcom, the quantities
generated by predictnl can vary over the observations in the data.
The standard errors and other inference-related quantities are based on
the delta method.
- The new bootstrap command replaces the old
bstrap and bs commands.
bootstrap has an improved syntax and allows for
stratified sampling.
- Existing command bsample now accepts the
strata() option and has a new weight() option that allows
you to save the sample frequency instead of changing the data in memory.
- The existing bstat command can now construct
bias-corrected and accelerated (BCa) confidence intervals. In
addition, bstat is now an e-class
command, meaning that all the post-estimation commands can be used on
bootstrap results.
- Existing command jknife now accepts the
cluster() option.
- New command permute estimates p-values for
permutation tests based on Monte Carlo simulations. These estimates can
be one sided or two sided.
- Existing command sample has new option
count that allows samples of the specified number of
observations (rather than a percentage) to be drawn.
- New command simulate replaces simul and
provides improved syntax for specifying simulations.
- Existing command statsby has a new syntax, new options,
and now allows time-series operators.
- The new estimates command provides a new, consistent way
to store and refer to estimation results. Post-estimation commands that
make comparisons across models, such as lrtest and
hausman, previously had their own idiosyncratic ways to
store and refer to estimation results. These commands now support a
unified way of retrieving estimation results utilizing the new
estimates suite.
- New command suest is a post-estimation
command that combines multiple estimation results (parameter vectors and
their variance–covariance matrices) into simultaneous results with a
single stacked parameter vector and a robust (sandwich)
variance–covariance matrix. The estimation results to be combined
may be based on different, overlapping, or even the same data. After
creating the simultaneous estimation results, you can use test or
testnl to obtain Hausman-type tests for cross-model hypotheses.
suest supports survey data.
- New command imtest performs the information matrix test
for a regression model. In addition, it provides the
Cameron–Trivedi decomposition of the IM test into tests for
heteroskedasticity, skewness, and kurtosis, as well as White's original
heteroskedasticity test.
- New command szroeter performs Szroeter's test for
heteroskedasticity in a regression model.
- Existing command hettest now provides option
rhs to test for heteroskedasticity in the independent
variables. It now also supports multiple comparison testing.
- Existing command tabulate has output changes, new
features, and expanded limits.
Three new statistics are available for twoway tabulations: the expected
number in each cell, the contribution to Pearson's chi2, and
the contribution to the likelihood-ratio chi2.
tabulate now respects set linesize, so you can
produce wide tables.
tabulate for oneway tabulations has new option sort, which
puts the table in descending order of frequency.
- Existing command tabstat can now
produce tables containing the variance and/or the standard error of
the mean.
- Existing command roctab has new option
specificity to graph sensitivity versus specificity,
instead of the default sensitivity versus (1-specificity).
- Existing command ologit now has option or
to display results as odds ratios (display exponentiated coefficients).
- Existing command adjust can now display predicted
probabilities when used after svylogit,
svyprobit, xtlogit, and
xtprobit.
- rvpplot has been extended to work after anova.
In addition, cprplot and acprplot have new options
lowess and mspline that allow putting a lowess curve or
median spline through the data.
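The delta method mentioned for nlcom and predictnl in the list above can be shown with a worked example. For a nonlinear function g of the coefficients, the approximate variance is the gradient of g times the coefficient covariance matrix times the gradient again. The Python sketch below is not Stata code. It applies this to the ratio of two coefficients. The function name and all numbers are made up for illustration.

```python
import math


def delta_method_ratio(b1, b2, var1, var2, cov12):
    """Point estimate and delta-method standard error of b1 / b2."""
    estimate = b1 / b2
    # Partial derivatives of g(b1, b2) = b1 / b2
    d1 = 1.0 / b2
    d2 = -b1 / b2 ** 2
    # Quadratic form gradient' * V * gradient, written out for two parameters
    variance = d1 * d1 * var1 + d2 * d2 * var2 + 2.0 * d1 * d2 * cov12
    return estimate, math.sqrt(variance)


# Hypothetical coefficient estimates and (co)variances
est, se = delta_method_ratio(b1=2.0, b2=4.0, var1=0.04, var2=0.16, cov12=0.0)
lower, upper = est - 1.96 * se, est + 1.96 * se  # normal-based 95% interval
print(est, se, lower, upper)
```

Here the estimate is 0.5 and the variance is 0.0625 · 0.04 + 0.015625 · 0.16 = 0.005, so the standard error is about 0.0707.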
* Existing command clogit has new options robust
and cluster. In addition, clogit has been converted from a
built-in command to one that now uses ml. As a result,
clogit now supports options that are available to ml-programmed
estimators, such as constraint() for linear constraints.
* new in Stata 8 as of July 2004