In addition to mixed models, survey statistics, multivariate statistics, and multinomial probit, many other new estimators and a host of statistical features have been added in Stata 9.


Also see the separate sections on linear mixed models, survey statistics, multivariate statistics, and multinomial probit.

- New estimation command slogit fits the stereotype logistic regression model for categorical dependent variables. This model can be viewed as either a generalization of the multinomial logistic regression model (mlogit) or a generalization of the ordered logistic regression model (ologit) that relaxes the proportional-odds assumption. See [R] slogit.

Predicted statistics after slogit include the linear predictor, the probability of any or all outcomes, and the standard error of the linear predictor. See [R] slogit postestimation.

- New estimation command ivprobit fits probit regression models of binary outcomes with endogenous regressors. Estimation can be performed by maximum likelihood estimation (MLE) or by Newey’s minimum chi-squared two-step estimation. Note that some postestimation facilities, such as computing marginal effects with mfx, are available only after ML estimation, because the two-step estimator imposes a transformation that invalidates many postestimation results. See [R] ivprobit.

- New estimation command ivtobit fits linear regression models with censored dependent variables by maximum likelihood estimation or by Newey’s minimum chi-squared two-step estimation (but see the note about the two-step estimator above). See [R] ivtobit.
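
As a rough illustration of the two estimators, the sketch below assumes the laborsup example dataset and its variable names (fem_work, fem_inc, fem_educ, kids, other_inc, male_educ), none of which are mentioned above; substitute your own data as needed.

```stata
* Assumption: the laborsup example dataset is available via webuse
webuse laborsup, clear

* Probit with an endogenous regressor, fitted by maximum likelihood
ivprobit fem_work fem_educ kids (other_inc = male_educ)

* Same model using Newey's two-step estimator
ivprobit fem_work fem_educ kids (other_inc = male_educ), twostep

* Tobit with an endogenous regressor, left-censored at the minimum
ivtobit fem_inc fem_educ kids (other_inc = male_educ), ll
```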

- New estimation command ztp fits a zero-truncated Poisson model of event counts with truncation at zero.

Predicted statistics after ztp include the linear predictor and its standard error, the predicted number of events, the incidence rate, the conditional mean, and the likelihood score. See [R] ztp and [R] ztp postestimation.

- New estimation command ztnb fits a zero-truncated negative binomial model of event counts with truncation at zero and over- or underdispersion.

Predicted statistics after ztnb include the linear predictor and its standard error, the predicted number of events, the incidence rate, the conditional mean, and the likelihood scores. See [R] ztnb and [R] ztnb postestimation.

- New estimation commands mean, ratio, proportion, and total estimate means, ratios, proportions, and totals over the entire sample or over groups within the sample. When estimating over groups, the entire covariance matrix (VCE) is estimated. These are full estimation commands and support a range of postestimation facilities, such as linear and nonlinear tests among the groups (test and testnl) and linear and nonlinear combinations of group-level statistics (lincom and nlcom). All four commands support several SE and VCE estimates: robust, cluster-robust, bootstrap, jackknife, and observed information matrix (the default).

mean, ratio, and proportion also support direct standardization across strata (groups) using the stdize() and stdweight() options.

See [R] mean, [R] ratio, [R] proportion, and [R] total.
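
A minimal sketch of these commands, assuming the auto dataset that ships with Stata (not referenced above). The coefficient names used in the test command follow the Stata 9-era labeling; later releases label group estimates differently, so list e(b) first if the test fails.

```stata
sysuse auto, clear

* Means of mpg by car origin; the full VCE across groups is estimated
mean mpg, over(foreign)
matrix list e(b)

* Test equality of the two group means (Stata 9-era coefficient names)
test [mpg]Domestic = [mpg]Foreign

* Proportions, a named ratio, and a total
proportion rep78
ratio (mpg_per_lb: mpg/weight)
total price
```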

- To avoid conflict with the new mean command, existing command means has been renamed ameans, with synonyms gmeans and hmeans.

- Existing command nl has a new syntax that makes estimating nonlinear least-squares regressions easier. For most models, estimation is now as easy as typing the nonlinear expression. Full programmability has been retained for complex models, and the old syntax continues to work.

nl also now supports robust (White/sandwich) and cluster-robust SE and VCE estimates, including two popular adjustments that can dramatically improve the small-sample performance of robust SE and VCE estimates.

A number of new reporting and estimation options have also been added. See [R] nl.
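
The new expression syntax can be sketched as follows. The auto dataset and the particular model are assumptions chosen only for illustration; parameters to be estimated are written in braces.

```stata
sysuse auto, clear

* Type the nonlinear expression directly; {b0} and {b1} are parameters
nl (mpg = {b0} + {b1}/weight)

* The same model with robust (sandwich) standard errors
nl (mpg = {b0} + {b1}/weight), robust
```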

- New option vce() selects how the standard errors (SEs) and covariance matrix of the estimated parameters are estimated by most estimation commands. Choices are vce(oim), vce(opg), vce(robust), vce(jackknife), and vce(bootstrap), although the choices can vary estimator by estimator. vce(robust) is a synonym for robust, and you can use either. The new choices are vce(jackknife) and vce(bootstrap).

vce(bootstrap) specifies that the standard errors, significance tests, and confidence intervals be normal-based bootstrap estimates, rather than the default analytic estimates based on the observed information matrix. You can also produce percentile-based or bias-corrected confidence intervals after estimation using estat bootstrap; see [R] bootstrap postestimation.

vce(jackknife) specifies that the standard errors, significance tests, and confidence intervals be jackknife estimates.

Both vce(bootstrap) and vce(jackknife) will automatically perform either observation or cluster sampling, whichever is appropriate for the estimator.

Notably, both vce(bootstrap) and vce(jackknife) compute bootstrapped or jackknifed estimates of the complete VCE matrix. This means that many of Stata’s postestimation commands are available. You can form linear and nonlinear combinations or functions of the parameters and obtain jackknife or normal-based bootstrap standard errors and confidence intervals for the combinations using [R] lincom and [R] nlcom. Similarly, you can perform linear and nonlinear tests using [R] test and [R] testnl.
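
As an illustration of the workflow described above, here is a short sketch that assumes the auto dataset and an arbitrary regression; the replication count and seed are likewise illustrative choices.

```stata
sysuse auto, clear

* Normal-based bootstrap SEs and CIs for a linear regression
regress mpg weight foreign, vce(bootstrap, reps(200) seed(12345))

* Percentile confidence intervals from the same replications
estat bootstrap, percentile

* Jackknife VCE, then postestimation based on that VCE
regress mpg weight foreign, vce(jackknife)
nlcom _b[foreign]/_b[weight]
test weight = foreign
```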

- New command estat centralizes the computing and reporting of additional statistics after estimation just as predict does with predictions. estat allows subcommands. estat summarize, for instance, reports summary statistics for the estimation sample and can be used after any estimator. estat also allows subcommands that are specific to the estimation command. To find out what is available after a command, see the corresponding postestimation entry. For example, after [R] regress, see [R] regress postestimation, or after [XT] xtmixed, see [XT] xtmixed postestimation.

Existing postestimation commands have been brought into the estat framework. The original commands continue to work but are undocumented.

| Estimation command | Old command | New estat command |
|---|---|---|
| regress | ovtest | estat ovtest |
| | hettest | estat hettest |
| | szroeter | estat szroeter |
| | vif | estat vif |
| | imtest | estat imtest |
| regress (time series) | dwstat | estat dwatson |
| | durbina | estat durbinalt |
| | bgodfrey | estat bgodfrey |
| | archlm | estat archlm |
| anova | ovtest | estat ovtest |
| | hettest | estat hettest |
| logit and logistic | lstat | estat classification (*) |
| | lfit | estat gof (*) |
| poisson | poisgof | estat gof |
| stcox | stphtest | estat phtest |
| xtgee | xtcorr | estat wcorrelation |

(*) The new command works after probit, as well as after logit and logistic; the old command worked after logit and logistic only.

Three estat subcommands are available after almost all estimators:

- estat ic reports Akaike’s and Schwarz’s Bayesian information criteria (AIC and BIC).

- estat summarize reports summary statistics on the variables in the estimation model for the estimation sample.

- estat vce reports the covariance (VCE) or correlation matrix estimates. (estat vce replaces the old vce command and has more features.)
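
A brief sketch of the estat framework after regress, assuming the auto dataset and an arbitrary model; the subcommands shown are the general ones plus a few regress-specific ones from the table above.

```stata
sysuse auto, clear
regress price mpg weight

* Subcommands available after almost all estimators
estat ic
estat summarize
estat vce, correlation

* Subcommands specific to regress (formerly hettest, ovtest, and vif)
estat hettest
estat ovtest
estat vif
```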

- Stata has many new prefix commands (commands that behave like by: and xi:). New prefix commands include statsby:, bootstrap:, jackknife:, permute:, simulate:, stepwise:, svy:, and rolling:. For instance, to obtain the standard error and confidence interval of the mean, you might type

    . jackknife: mean earnings

or to obtain survey-adjusted estimates you might type

    . svy: mean earnings

after svysetting your data.

See [R] bootstrap, [R] jackknife, [R] permute, [TS] rolling, [R] simulate, [R] stepwise, [D] statsby, and [SVY] svy.
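
A few more prefix commands in action, as a sketch that assumes the auto dataset; the statistic names (diff) and the replication count and seed are illustrative only.

```stata
sysuse auto, clear

* Jackknife SE and CI for a mean
jackknife: mean mpg

* Permutation test of the difference in group means from ttest
permute mpg diff=(r(mu_1)-r(mu_2)), reps(200) seed(12345): ///
    ttest mpg, by(foreign)

* Collect regression coefficients by group; clear replaces the data in memory
statsby _b, by(foreign) clear: regress mpg weight
list
```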

- New prefix commands bootstrap: and jackknife: replace old commands bs and jknife, and in addition to having better syntax, they also provide new features:

- They have enhanced handling and reporting of expressions.

- They post their results as estimation results with a complete VCE. Most postestimation facilities may now be used after them and will be based on the bootstrap or jackknife VCE. These include

| Command | Description |
|---|---|
| adjust | adjusted predictions |
| estimates | cataloging estimation results |
| lincom | linear combinations with SEs, tests, and CIs |
| nlcom | nonlinear combinations with SEs, tests, and CIs |
| mfx | computing marginal effects and elasticities |
| predict | predictions, residuals, probabilities, etc. |
| predictnl | generalized nonlinear predictions with SEs and CIs |
| test | Wald tests of simple and composite linear hypotheses |
| testnl | Wald tests of nonlinear hypotheses |

- They produce a model test when applied to the coefficients of estimation commands.

- They allow option seed(#) to set the random-number seed.

- They allow option reject(exp) to reject replicates that explicitly match exp.

bootstrap: uses the normal distribution instead of the Student’s t distribution to compute the normal approximation confidence intervals.

jackknife: now allows fweights to be specified.

See [R] bootstrap and [R] jackknife.
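
The expression handling and posted VCE can be sketched as follows, assuming the auto dataset; the statistic name (ratio), replication count, and seed are arbitrary choices for illustration.

```stata
sysuse auto, clear

* Bootstrap a user-specified expression: mean-to-median ratio of mpg
bootstrap ratio=(r(mean)/r(p50)), reps(200) seed(12345): ///
    summarize mpg, detail

* Report all available bootstrap confidence intervals
estat bootstrap, all

* Jackknife regression coefficients; results are posted with a full VCE
jackknife: regress mpg weight foreign
test weight foreign
```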

- New prefix command statsby: replaces old command statsby (not a prefix) and provides enhanced handling and reporting of expressions, allows weights, and allows string variables in the option by(). See [D] statsby.

- New prefix command stepwise: replaces old command sw and, in addition to working with all the previous estimators, also works with [R] intreg and [R] scobit.

- Existing prefix command xi: has new option noomit that prevents it from omitting a category when generating category indicators for group variables. See [R] xi.

- New command tetrachoric computes a tetrachoric correlation matrix for a set of binary variables. See [R] tetrachoric.
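
A small sketch of tetrachoric, assuming the auto dataset; the two derived indicators (highmpg, heavy) are hypothetical variables created only to have several binary variables to correlate.

```stata
sysuse auto, clear

* Hypothetical binary indicators built for illustration
generate byte highmpg = mpg > 20
generate byte heavy   = weight > 3000

* Tetrachoric correlation matrix of the three binary variables
tetrachoric foreign highmpg heavy
```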

- Existing command suest, which combines estimation results for subsequent testing, is easier to use and has new features:

- Scores are now computed for the models you combine; you no longer need to save scores when estimating.

- suest, used after svy: estimation, now accounts for your survey design.

- suest now works more smoothly with certain estimation commands that previously required special treatment, including regress, ologit, and oprobit.

- suest now works with all models estimated by clogit, rather than only those with a single positive outcome per group.

See [R] suest.
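
The simplified workflow might look like the sketch below, which assumes the auto dataset and two illustrative stored-result names (dom, frn). Note that no scores are saved at estimation time.

```stata
sysuse auto, clear

* Fit the same regression separately for domestic and foreign cars
regress mpg weight if foreign == 0
estimates store dom
regress mpg weight if foreign == 1
estimates store frn

* Combine the results and test equality of the weight coefficients
suest dom frn
test [dom_mean]weight = [frn_mean]weight
```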

- Existing command clogit has new features:

- Robust and cluster-robust SE and VCE estimates are now supported via options robust and cluster().

- Linear constraints on the parameters are now implemented via option constraints().

- New option vce() allows SE and VCE estimates to be computed using OIM (the default), OPG, bootstrap, and jackknife.

See [R] clogit.

- Option level() now allows noninteger confidence levels to be specified. See [R] level.

- Existing command predict now generates equation-level scores after most maximum likelihood estimation commands; see the documentation of predict in the postestimation entry for each estimation command.

- Existing command cumul has a new option equal to create equal cumulative values for ties. See [R] cumul.

- Existing command estimates table now allows you to specify more models, and the command wraps the table if necessary. Also allowed are new options

- equations(), which matches equations by number rather than by name.

- coded, which displays the table in a compact, symbolic format.

- modelwidth(), which sets the number of characters for displaying model names.

See [R] estimates.

- test after anova and manova has two new options for performing Wald tests:

- mtest() implements three methods to adjust for multiple tests: Bonferroni, Holm, and Šidák.

- test() makes specifying contrasts easier by accepting a matrix containing the contrast.

See [R] anova postestimation.

- Commands ci and cii have new options exact, wilson, agresti, jeffreys, and wald for computing different types of binomial confidence intervals. See [R] ci.
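
A sketch of the new interval types, written in the syntax of the release described here. Later Stata releases restructured ci and cii into subcommands (for example, ci proportions), so the commands below may need version control or adjustment there. The counts and the auto dataset are illustrative assumptions.

```stata
* Immediate form: 50 trials, 15 successes
cii 50 15, wilson
cii 50 15, jeffreys

* Variable form on a 0/1 variable
sysuse auto, clear
ci foreign, binomial agresti
```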

- Command hausman has new option df() for controlling the degrees of freedom. See [R] hausman.

- Command predict has new option score for returning equation-level scores. See [R] predict.

- Command mfx is now faster and has new option varlist() for computing effects of specific variables. See [R] mfx.

- Command mfp has the new option aic for selecting models using the Akaike information criterion (AIC). See [R] mfp.

- Commands tabulate and tabi with the exact option are now significantly faster.

- In existing command mlogit, option basecat has been renamed baseoutcome() for better consistency with the terminology of choice models. See [R] mlogit.

- Existing commands spearman and ktau now allow more than two variables to be specified and have more flexible output. See [R] spearman.

- Existing command bsample for sampling with replacement (bootstrap sampling) now supports weighted bootstrap resampling using the new weight() option. See [R] bsample.

- Existing command bstat for reporting bootstrap results has a number of new reporting options. In addition, bstat previously computed percentile and other confidence intervals. This is now handled by estat bootstrap, used after any bootstrap estimation, including bstat. See [R] bstat and [R] bootstrap postestimation.

- Most maximum likelihood estimators now test for convergence using the Hessian-scaled gradient, g*inv(H)*g'. This criterion ensures that the gradient is close to zero when scaled by the Hessian (the curvature of the likelihood or pseudolikelihood surface at the optimum) and provides greater assurance of convergence for models whose likelihoods tend to be difficult to optimize, such as those for arch, asmprobit, and scobit. You can set the tolerance level for this test with new option nrtolerance(), show the Hessian-scaled gradient in the iteration log with option shownrtol, and turn the test off with option nonrtolerance. See [R] maximize.

- Existing command set has new setting maxiter (default value 16000) that specifies the maximum number of iterations to be performed by all estimation commands. You change this setting by typing set maxiter #, and you may add option permanently to retain the setting in future Stata sessions.
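
The convergence controls from the last two items can be sketched together. The auto dataset, the logit model, and the specific tolerance and iteration values are assumptions for illustration.

```stata
sysuse auto, clear

* Lower the session-wide iteration limit
set maxiter 100

* Tighten the Hessian-scaled gradient tolerance and show g*inv(H)*g'
* in the iteration log
logit foreign mpg weight, nrtolerance(1e-7) shownrtolerance

* Restore the default iteration limit
set maxiter 16000
```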

- Existing command arima can now estimate multiplicative seasonal ARIMA (SARIMA) models; see new options sarima(), mar(), and mma() in [TS] arima.
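
A sketch of a multiplicative seasonal model, assuming the air2 example dataset (monthly airline passengers with variable air, already declared as time series); the log transform and model orders are illustrative.

```stata
* Assumption: air2 is available via webuse and is already tsset
webuse air2, clear
generate lnair = ln(air)

* ARIMA(0,1,1) x (0,1,1) with a seasonal period of 12
arima lnair, arima(0,1,1) sarima(0,1,1,12) noconstant
```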

- New command rolling performs rolling-window or recursive estimations, including regressions, and collects statistics from the estimation on each window; see [TS] rolling.
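
A rolling regression might be sketched as follows, assuming the lutkepohl2 example dataset and its variables (qtr, dln_inv, dln_inc, dln_consump); the window length is arbitrary.

```stata
* Assumption: lutkepohl2 is available via webuse
webuse lutkepohl2, clear
tsset qtr

* Coefficients and SEs from a 30-quarter moving window;
* clear replaces the data in memory with the collected results
rolling _b _se, window(30) clear: regress dln_inv dln_inc dln_consump
list in 1/5
```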

- The [TS] manual now has a glossary that defines commonly used terms in time-series analysis and explains how we use them in the manual; see the glossary of [TS].

- Many existing commands that previously did not allow time-series operators now do. These commands include areg, binreg, biprobit, boxcox, cloglog, cnsreg, glm, heckman, heckprob, hetprob, impute, intreg, logistic, logit, lowess, mvreg, nbreg, orthog, pcorr, poisson, probit, pwcorr, rreg, testparm, treatreg, truncreg, xtcloglog, xtgls, xtintreg, xtlogit, xtpoisson, xtprobit, xtgee, xtreg, xtsum, and xttobit.

- Many commands requiring time-series data now work on a single panel from a panel dataset when that panel is selected using an if expression or an in qualifier. Those commands include ac, corrgram, cumsp, dfgls, dfuller, pac, pergram, pperron, wntestb, wntestq, and xcorr. New commands estat archlm, estat bgodfrey, estat dwatson, and estat durbinalt, which replace commands archlm, bgodfrey, dwstat, and durbina, also work on a single panel from a panel dataset.

- The dialogs for analyzing IRF results are much improved. The dialogs now populate lists of models and variables from the current IRF results that may be chosen for producing tables and graphs. The improved dialogs include irf cgraph, irf ctable, irf graph, irf ograph, and irf table.

- Existing command dfuller has new option drift for testing the null hypothesis of a random walk with drift. The algorithm for calculating MacKinnon’s approximate p-values is also now more accurate in cases where the p-value is relatively large; see [TS] dfuller.

- Existing commands corrgram and pac have new option yw that computes partial autocorrelations using the Yule–Walker equations instead of the default regression-based method; see [TS] corrgram.
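
The new dfuller and corrgram options can be tried as in this sketch, which assumes the air2 example dataset (variable air, already tsset); the lag choices are arbitrary.

```stata
* Assumption: air2 is available via webuse and is already tsset
webuse air2, clear

* Augmented Dickey-Fuller test with a random walk with drift under the null
dfuller air, drift lags(4)

* Partial autocorrelations computed from the Yule-Walker equations
corrgram air, lags(12) yw
```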

- Time-series operators are now better displayed in estimation and other result tables.

- New command estat, used after regress, brings together what was previously done by commands dwstat, durbina, bgodfrey, and archlm. The new commands are estat dwatson, estat durbinalt, estat bgodfrey, and estat archlm. See [R] regress postestimation time series.
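
A sketch of the four time-series diagnostics after regress, assuming the klein example dataset and its variables (yr, consump, wagegovt).

```stata
* Assumption: klein is available via webuse
webuse klein, clear
tsset yr
regress consump wagegovt

estat dwatson      // Durbin-Watson d statistic (formerly dwstat)
estat durbinalt    // Durbin's alternative test (formerly durbina)
estat bgodfrey     // Breusch-Godfrey LM test (formerly bgodfrey)
estat archlm       // LM test for ARCH effects (formerly archlm)
```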

- The ability of arima and arch to estimate standard errors using either the observed information matrix (OIM) or the outer product of gradients (OPG) has been consolidated under the new vce() option.

(What follows was first released in Stata 8.2.)

- New command vec fits cointegrated vector error-correction models (VECMs) using Johansen’s method; see [TS] vec.

- New command vecrank produces statistics used to determine the number of cointegrating vectors in a VECM, including Johansen’s trace and maximum-eigenvalue tests for cointegration; see [TS] vecrank.

- New command fcast, which replaces old command varfcast, produces and graphs dynamic forecasts of the dependent variables after fitting a VAR, SVAR, or VECM; see [TS] fcast.

- New command irf, which replaces the old command varirf, does everything the old command did and more. irf estimates the impulse–response functions, cumulative impulse–response functions, orthogonalized impulse–response functions, structural impulse–response functions, and forecast error-variance decompositions after fitting a VAR, SVAR, or VECM. irf can also make graphs and tables of the results. See [TS] irf.

varirf continues to work but is no longer documented. irf accepts .vrf result files created by varirf.

- Existing command varsoc can now be used to obtain lag-order selection statistics for VECMs, as well as VARs; see [TS] varsoc.

- New command veclmar computes Lagrange-multiplier statistics for autocorrelation after fitting a VECM; see [TS] veclmar.

- New command vecnorm tests whether the disturbances in a VECM are normally distributed. For each equation and for all equations jointly, three statistics are computed: a skewness statistic, a kurtosis statistic, and the Jarque–Bera statistic. See [TS] vecnorm.

- New command vecstable checks the eigenvalue stability condition after fitting a VECM; see [TS] vecstable.

- New command vecstable and the existing command varstable have a new graph option for presenting the stability results. See [TS] vecstable and [TS] varstable.
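
The VECM commands above fit together roughly as in this sketch. It assumes the rdinc example dataset (regional income series ln_ne and ln_se, already tsset); the prefix f_, the IRF name vec1, and the file name vecirfs are illustrative choices, and irf create writes a results file to the working directory.

```stata
* Assumption: rdinc is available via webuse and is already tsset
webuse rdinc, clear

* Choose the cointegrating rank, then fit the VECM
vecrank ln_ne ln_se
vec ln_ne ln_se

* Diagnostics after vec
veclmar
vecnorm
vecstable, graph

* Dynamic forecasts, then impulse-response functions
fcast compute f_, step(8)
fcast graph f_ln_ne f_ln_se
irf create vec1, set(vecirfs, replace) step(24)
irf graph oirf, impulse(ln_ne) response(ln_se)
```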

- The output of the following commands has been standardized to improve formatting: var, svar, vargranger, varlmar, varnorm, varsoc, varstable, and varwle.

- New command haver makes it easy to load and analyze economic and financial databases available from Haver Analytics; see [TS] haver.

- The big news is the new command xtmixed: Stata now fits linear mixed models. See the section on linear mixed models.

- New features have been added to the maximum likelihood estimators that do not have closed-form solutions and require numeric evaluation of the likelihood. These estimators include xtlogit, xtprobit, xtpoisson, xtcloglog, xtintreg, and xttobit.

- The likelihood may now be approximated using adaptive Gauss–Hermite quadrature (the new default) or nonadaptive quadrature (the previous default). Adaptive quadrature substantially increases the accuracy of the approximation, particularly on difficult problems such as data with large panel sizes or data with a large variance for the random effects.

- Linear constraints may now be imposed using the new option constraints(). Constraints are specified the standard way; see [R] constraint.

- New option intpoints() replaces old option quad(), although quad() continues to work. The new name is more meaningful, especially when used with estimators that integrate likelihoods using methods other than quadrature.
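
A sketch combining the quadrature and constraint options, assuming the union example dataset and its variables (idcode, union, age, grade, not_smsa, south). The constraint itself is arbitrary, and the panel variable is given with i() in the style of this release; newer releases typically declare it with xtset instead.

```stata
* Assumption: union is available via webuse
webuse union, clear

* An arbitrary linear constraint, for illustration only
constraint 1 not_smsa = south

* Random-effects logit with 12 integration points and the constraint imposed
xtlogit union age grade not_smsa south, i(idcode) intpoints(12) constraints(1)
```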

- Existing command xtreg now allows options robust and cluster() when estimating fixed-effects (FE) and random-effects (RE) models; see [XT] xtreg.
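
For example, the following sketch assumes the nlswork example dataset and its variables (idcode, ln_wage, age, tenure); the model is illustrative.

```stata
* Assumption: nlswork is available via webuse
webuse nlswork, clear

* Fixed-effects model with SEs clustered on the panel variable
xtreg ln_wage age tenure, fe i(idcode) cluster(idcode)

* Random-effects model with robust SEs
xtreg ln_wage age tenure, re i(idcode) robust
```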

- Most [XT] commands that previously did not allow time-series operators now support them. These commands include xtgls, xtreg, xtsum, xtcloglog, xtintreg, xtlogit, xtpoisson, xtprobit, xttobit, and xtgee.

- New command xtrc is old command xtrchh, renamed, and with new features. New option beta reports the best linear unbiased predictors (BLUPs) for the group-specific coefficients, along with their standard errors and confidence intervals. For details, see [XT] xtrc.

predict after xtrc has the new option group() to compute the BLUPs of the dependent variable using the BLUPs of the coefficients.

- New command xtline plots panel data and allows either overlaid or separate graphs for each panel; see [XT] xtline.

- New section [XT] glossary defines commonly used terms and explains how we use them.

- The [ST] manual now has a glossary that defines commonly used terms in survival (or duration) analysis and often explains how these terms are used in the manual; see the glossary of [ST].

- New command estat can be used after stcox and streg. In addition to the standard estat statistics (information criteria, estimation sample summary, and formatted variance–covariance matrix), statistics specific to the proportional-hazards estimator are available after stcox. These include

- estat concordance computes Harrell’s C and Somers’ D statistics measuring concordance, that is, agreement of predictions with observed failure order.

- estat phtest replaces the existing stphtest for computing tests and graphs of the proportional hazards assumption. stphtest continues to work.

See [ST] stcox postestimation and [ST] streg postestimation.
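
A short sketch of estat after the survival estimators, assuming the drugtr example dataset (already stset, with covariates drug and age). estat phtest is left out because, depending on the release, it may require Schoenfeld residuals to be saved when the model is fit.

```stata
* Assumption: drugtr is available via webuse and is already stset
webuse drugtr, clear

* Cox model, then concordance and information criteria
stcox drug age
estat concordance
estat ic

* Parametric model, then the standard estat subcommands
streg drug age, distribution(weibull)
estat ic
estat vce
```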

- Existing command sts graph has new options cihazard and per(#). cihazard draws pointwise confidence bands around the smoothed hazard function, and per() specifies the units used to report the survival or failure rate. See [ST] sts.

- Existing command stcurve now plots over an evenly spaced grid, producing smooth curves, even in small samples; see [ST] stcurve.

- Existing command sts graph has new options atriskopts() and lostopts() that let you control how the labels for at-risk and lost observations look (their color, font size, etc.); see [ST] sts.

- Existing command stci has new options for controlling how the plotted survival line looks (color, thickness, etc.) and for adding titles, controlling legends, and all other characteristics of the graph; see [ST] stci.

Command ml, for implementing user-written maximum likelihood estimators, has many new features:

- New option technique() sets the optimization technique. BHHH, DFP, and BFGS optimization techniques are now available; the default technique remains modified Newton–Raphson.

- New option vce() sets the type of covariance matrix calculations that will be made.

vce(oim) specifies the observed information matrix (OIM), also called the Hessian-based estimator; this is (and always has been) the default.

vce(opg) specifies the outer product of the gradients (OPG). This is new.

vce(robust) specifies Taylor-series linearization, also known as the Huber or White estimator and, in Stata, as simply robust.

- Most estimators written with ml now support estimation with survey data and correlated data with no additional programming. This support includes correct treatment of multistage designs, weighting, stratification, poststratification, and finite-population corrections, as well as access to linearization, jackknife, and bootstrap variance estimators. For a discussion, see [P] program properties.

- ml has always allowed linear constraints to be applied using the option constraints() with no additional programming. It now handles irrelevant constraints more elegantly. Irrelevant constraints are those that have no impact on the model. Previously, irrelevant constraints caused an error message. Now they are flagged and ignored.

- When linear constraints are imposed, ml now applies a Wald test for the overall fit of the model, rather than attempting a likelihood-ratio (LR) test, which is often inappropriate.

- ml has new subcommand score for generating scores after fitting a model.

- ml has new option diparm_options() that automatically performs transformations of ancillary parameters.

- ml now saves the gradient vector in e(gradient).

- ml has new option search(norescale) that prevents rescaling when searching for starting values.

- ml honors the new setting for maximum iterations, set maxiter #, and will iterate a maximum of # iterations, even if convergence has not been achieved.

- ml now displays a prominent message in the footer of the estimation results when convergence is not achieved. This message continues to be shown on redisplay of estimation results.

- ml has new option nofootnote to suppress printing the new message warning if convergence is not achieved.

- ml tests for convergence using the Hessian-scaled gradient, g*inv(H)*g'. This is a true convergence criterion that ensures that the gradient is close to zero when scaled by the Hessian (the curvature of the likelihood or pseudolikelihood surface at the optimum). This new criterion is particularly important when maximizing difficult likelihoods to prevent stopping the maximization too soon.

- New option nrtolerance() lets you change the tolerance for the Hessian-scaled gradient convergence criterion; the default is nrtolerance(1e-5).

- New option shownrtolerance displays the criterion value of the Hessian-scaled gradient at each iteration.

- New undocumented command mlmatbysum helps you compute the Hessian of panel-data likelihoods and is of interest to those seeking the speed that comes with programming your own second-derivative calculations; see mlmatbysum.

- ml has two new undocumented subcommands, ml hold and ml unhold, to assist in solving nested optimization problems; see ml_hold.

See [R] ml for more information on these features. Anyone programming estimators using ml should read the book Maximum Likelihood Estimation with Stata. Many of the features mentioned above are discussed and applied to real problems in the book.
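
To make several of these options concrete, here is a minimal sketch of a user-written probit fitted with ml. The evaluator name (myprobit_lf), the score variable name (sc_xb), the auto dataset, and the model are all hypothetical choices for illustration, not part of the notes above.

```stata
* Hypothetical method-lf evaluator for a probit log likelihood
capture program drop myprobit_lf
program define myprobit_lf
    version 9
    args lnf xb
    quietly replace `lnf' = ln(normal(`xb'))  if $ML_y1 == 1
    quietly replace `lnf' = ln(normal(-`xb')) if $ML_y1 == 0
end

sysuse auto, clear

* Choose the optimization technique and the VCE type when defining the model
ml model lf myprobit_lf (foreign = mpg weight), technique(bfgs) vce(robust)

* Show the Hessian-scaled gradient criterion at each iteration
ml maximize, shownrtolerance

* Generate the equation-level score after fitting
ml score sc_xb
```

The same evaluator can be refit with technique(nr) or vce(opg) to compare the alternatives described in the list above.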