Stata 15 help for whatsnew10to11

What's new in release 11.0 (compared with release 10)

This file lists the changes corresponding to the creation of Stata release 11.0:

+---------------------------------------------------------------+
| help file        contents                     years           |
|---------------------------------------------------------------|
| whatsnew         Stata 15.0 and 15.1          2017 to present |
| whatsnew14to15   Stata 15.0 new release       2017            |
| whatsnew14       Stata 14.0, 14.1, and 14.2   2015 to 2017    |
| whatsnew13to14   Stata 14.0 new release       2015            |
| whatsnew13       Stata 13.0 and 13.1          2013 to 2015    |
| whatsnew12to13   Stata 13.0 new release       2013            |
| whatsnew12       Stata 12.0 and 12.1          2011 to 2013    |
| whatsnew11to12   Stata 12.0 new release       2011            |
| whatsnew11       Stata 11.0, 11.1, and 11.2   2009 to 2011    |
| this file        Stata 11.0 new release       2009            |
| whatsnew10       Stata 10.0 and 10.1          2007 to 2009    |
| whatsnew9to10    Stata 10.0 new release       2007            |
| whatsnew9        Stata 9.0, 9.1, and 9.2      2005 to 2007    |
| whatsnew8to9     Stata 9.0 new release        2005            |
| whatsnew8        Stata 8.0, 8.1, and 8.2      2003 to 2005    |
| whatsnew7to8     Stata 8.0 new release        2003            |
| whatsnew7        Stata 7.0                    2001 to 2002    |
| whatsnew6to7     Stata 7.0 new release        2000            |
| whatsnew6        Stata 6.0                    1999 to 2000    |
+---------------------------------------------------------------+

Most recent changes are listed first.

--- more recent updates -------------------------------------------------------

See whatsnew11.

--- Stata 11.0 release 13jul2009 ----------------------------------------------

Remarks

We will list all the changes, item by item, but first, here are the highlights:

1. Stata now allows factor variables! In estimation, you can now fit models by typing, for example,

. regress y i.sex i.group i.sex#i.group age                                        (1)
. regress y i.sex##i.group age                                             (same as 1)
. regress y i.sex i.group i.region i.sex#i.group i.sex#i.region i.group#i.region   (2)
            i.sex#i.group#i.region age
. regress y i.sex##i.group##i.region age                                   (same as 2)

and Stata will form for itself the indicator variables for sex, group, and region, and their interactions. You do not use the old xi command, and no new variables will be created in your data. You can form interactions of factor variables with continuous variables, and continuous variables with continuous variables by using the c. prefix:

. regress y i.sex##i.group##i.region age c.age#c.age                       (3)
. regress y i.sex##i.group##i.region age i.sex##i.group##i.region#c.age    (4)
            c.age#c.age i.sex##i.group##i.region#c.age#c.age
. regress y i.sex##i.group##i.region##c.age                                (same as 4)
            i.sex##i.group##i.region##c.age#c.age

This new factor-variable notation is understood by nearly every Stata estimation command, so you can type, for example,

. logistic outcome i.treatment##i.sex age bp c.age#c.bp

Factor variables work with summarize and list, too:

. list outcome i.treatment##i.sex

Factor variables have lots of additional features; see [U] 11.4.3 Factor variables.
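As an illustration only, the notation above can be tried on the auto dataset that ships with Stata. The choice of price, foreign, mpg, and weight is an assumption made for this sketch and is not part of the release notes.

```stata
* Illustrative sketch; auto.dta ships with Stata, the model choice is ours
sysuse auto, clear

* Main effects of foreign and mpg plus their interaction, no xi needed
regress price i.foreign##c.mpg

* The same model written out term by term
regress price i.foreign c.mpg i.foreign#c.mpg

* A quadratic in a continuous variable
regress price c.weight##c.weight

* Factor-variable notation in a data-listing command
list price i.foreign in 1/5
```

No new variables are added to the data by any of these commands.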

2. Stata 11's new postestimation command margins estimates margins and marginal effects. Included are estimated marginal means, least-squares means, average and conditional marginal and partial effects, average and conditional adjusted predictions, predictive margins, and more. There are few users who will not find margins useful. It will be well worth your time to read [R] margins.
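A minimal sketch of margins after a linear regression follows. The dataset (auto.dta, shipped with Stata) and the model are assumptions chosen for illustration.

```stata
* Illustrative sketch; the model is ours, not from the release notes
sysuse auto, clear
regress price i.foreign c.mpg c.weight

* Predictive margins for each level of foreign
margins foreign

* Average marginal effect of mpg
margins, dydx(mpg)

* Adjusted predictions for each level of foreign, covariates at their means
margins foreign, atmeans
```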

3. Stata's new mi suite of commands performs multiple imputation. There is so much to say that mi gets its own manual.

mi provides methods for the analysis of incomplete data, data for which some values are missing, and provides both the imputation and estimation steps. mi's estimation step combines the estimation and pooling steps. Multivariate normal imputation is provided, along with five univariate methods that can be used alone or as building blocks for multivariate imputation.

mi can import already imputed data, including data from NHANES and ice. mi solves the problem of keeping multiple datasets in sync. You can create or drop variables or observations just as if you were working with one dataset. You can merge, append, and reshape data, all of which is to say that you can perform data management either before or even after forming the imputations.

Included is an interactive control panel that provides access to almost all of mi's capabilities and guides you through the steps of analysis.

See [MI] intro.

4. The new Variables Manager is the one-stop place to go to manage your variables. Click on the Variables Manager button or type varmanage. You can change names, labels, display formats, and storage types. You can define and edit notes, and define and edit value labels. The Variables Manager is useful even for those who have thousands of variables in their data; just type part of the name in the filter at the top left. See [D] varmanage and [GS] 7 Using the Variables Manager (GSM, GSU, or GSW).

5. The Data Editor is all new. It is now a live view onto your data, which means that you can run a Stata command and see the changes reflected immediately. You can apply filters to view subsets of your data, take snapshots so that you can undo changes, and enter dates and times the natural way. See [D] edit and [GS] 6 Using the Data Editor (GSM, GSU, or GSW).

6. The Do-file Editor under Windows is all new, too. Syntax highlighting and code folding are provided. There is no limit to file size. See [D] doedit.

7. You can now put bold and italic text, Greek letters, symbols, superscripts, and subscripts on graphs! See [G-4] text.

8. If you are not reading this on your computer, you could be. Stata now has PDF manuals -- [GS], [U], [D], [G], [MI], [MV], [R], [ST], [SVY], [TS], [XT], [P], [M], and [I] -- and they are shipped with every copy of Stata. Select Help > PDF Documentation. Even better, the manuals are integrated into the help system. From a help file, you can jump directly to the relevant page just by clicking on the reference. There is nothing more to know.

There are other exciting new features in this release depending on who you are and what interests you. These include

o competing-risks regression models; see [ST] stcrreg

o GMM estimation; see [R] gmm

o state-space (Kalman filtering) modeling; see [TS] sspace

o multivariate GARCH; see [TS] dvech

o dynamic-factor models; see [TS] dfactor

o unit-root tests for panel data; see [XT] xtunitroot

o error structures for linear mixed models; see [XT] xtmixed

o standard errors for BLUPs in linear mixed models; see [XT] xtmixed

o object-oriented programming in Mata; see [M-2] class

o full model-based optimization in Mata; see [M-5] moptimize()

o numerical derivative function in Mata; see [M-5] deriv()

Each of these, and more, is covered in the sections that follow.

What's new in the GUI and command interface

1. As mentioned in the highlights, the new Variables Manager is the one-stop place to go to manage your variables. See [D] varmanage and [GS] 7 Using the Variables Manager (GSM, GSU, or GSW).

2. Also a highlight is the new Data Editor, a live view onto your data. See [D] edit and [GS] 6 Using the Data Editor (GSM, GSU, or GSW).

3. The Do-file Editor is all new under Windows and provides syntax highlighting and code folding. See [D] doedit.

4. You have doubtless already noticed that Stata's Results window now has a white background. Stata has several new color schemes, and the one you are seeing is called Standard. What was the default scheme in Stata 10 is called Classic, so if you want it back, select Edit > Preferences > General Preferences... and change the scheme for the Results window to it. You can try the other schemes or make your own and save it in Custom 1, Custom 2, or Custom 3.

5. In Stata for Windows, you can now choose from among five different default layouts for the overall size and position of Stata's windows or, just as previously, you can make your own. Select Edit > Preferences > Load Preference Set and pick a layout. In addition to Factory Settings, available are Compact Window Layout and three Presentation layouts optimized for different projector resolutions.

6. Output scrolling in the Results window is now significantly faster. Also, the upper limit of set scrollbufsize has been increased to 2,000,000. See [R] set.

7. In Stata for Windows, Graph windows no longer float.

8. In Stata for Windows, existing command window manage has new subcommand prefs for loading and saving named preference sets; type help window manage for details.

9. Stata for Unix(GUI) now supports copying graphs to the Clipboard in bitmap format.

10. Stata for Mac now supports copying graphs to the Clipboard in PDF format.

11. Stata for Mac's graphical user interface (GUI) has been completely rewritten in Apple's Cocoa programming interface.

12. Stata for Mac is now available as a universal binary that runs natively on 32-bit Intel- or PowerPC-based Macs and 64-bit Intel-based Macs to deliver optimal performance for all three architectures in a single package.

What's new in data management

1. Existing command merge has all new syntax. It is easier to use, easier to read, and makes it less likely that you will make a mistake. Merges are classified as 1:1, 1:m, m:1, and m:m. When you type merge 1:1, you are saying that you expect the observations to match one-to-one. merge 1:m specifies a 1-to-many merge; m:1, a many-to-1 merge; and m:m, a many-to-many merge. New options assert() and keep() allow you to specify what you expect the outcome to be and what you want to keep from it. For instance,

. merge 1:1 subjid using filename, assert(match)

means that you expect all the observations in both datasets to match each other, whereas

. merge 1:1 subjid using filename, assert(match using) keep(match)

specifies that you expect each observation to either match or be solely from the using data and, assuming that is true, you want to keep only the matches.

Sorting of both the master and the using datasets is now automatic.

The new merge does not support merging multiple files in one step. Merge the first two datasets, then merge that result with the next dataset, and so on.

merge now aborts with error if variables are string in one dataset and numeric in the other unless new option force is specified.

See [D] merge. The old merge syntax continues to work.
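The new syntax can be sketched on two tiny datasets built on the spot. The variable names age and bp and the tempfile name demo are hypothetical and exist only for this illustration.

```stata
* Build a small using dataset and save it to a temporary file
clear
input subjid age
1 34
2 51
3 47
end
tempfile demo
save `demo'

* Build the master dataset
clear
input subjid bp
1 120
2 135
3 128
end

* One-to-one merge; abort with an error unless every observation matches
merge 1:1 subjid using `demo', assert(match)
tabulate _merge
```

Neither dataset needs to be sorted beforehand, and adding keep(match) would retain only the matched observations.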

2. Existing command append has several new features: 1) it will work even if there are no data in memory; 2) multiple files can be appended in one step; and 3) new option generate(newvar) creates a variable indicating the source of the observations, numbered 0, 1, .... append now aborts with error if variables are string in one dataset and numeric in the other unless new option force is specified. See [D] append. Old behavior is preserved under version control.
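A minimal sketch of the new append features follows, using two temporary files with hypothetical names a and b. Because no data are in memory when append runs, no observation comes from a master dataset, so under the numbering described above the source variable should start at 1 rather than 0.

```stata
* First file
clear
input id x
1 10
2 20
end
tempfile a
save `a'

* Second file
clear
input id x
3 30
end
tempfile b
save `b'

* Append both files in one step, starting with no data in memory
clear
append using `a' `b', generate(source)
list
```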

3. Stata's default memory allocations have changed:

a. Stata/SE and Stata/MP now default to allocating 50 M of memory rather than 10 M. Stata/IC now defaults to 10 M rather than 1 M. Stata's required footprint has not grown; we reset these defaults because users were resetting to larger numbers anyway.

b. Stata/IC now defaults matsize to 400 rather than 200; the default for Stata/SE and Stata/MP remains 400. The default for Small Stata is now 100 rather than 40.

4. Existing command order now does what order, move, and aorder did; see [D] order. Old commands aorder and move continue to work but are no longer documented.

5. New commands zipfile and unzipfile compress and uncompress files and directories in zip archive format. See [D] zipfile.

6. New command changeeol converts text from one operating system's end-of-line format to another's. Stata does not care about end-of-line format, but some editors and other programs do. See [D] changeeol.

7. New command snapshot saves to disk and restores from disk copies of the data in memory. snapshot is used by the new Data Editor. An important feature of the Data Editor is that it can log all the changes you make interactively. snapshot will show up in those logs. snapshot really is a command of Stata, so you can replay logs to duplicate past efforts. For your own use, however, it is better if you continue using preserve and restore. See [D] snapshot.

8. You can now copy-and-paste commands from logs and execute them without editing out the period (the dot prompt) in front! Stata 11 ignores leading periods.

9. Existing command notes has new options search, replace, and renumber. See [D] notes.

10. Concerning value labels:

a. Existing command label define has new option replace so that you do not have to drop the value label before redefining it.

b. New command label copy copies value labels.

c. Existing command label values now allows a varlist, so you can label (or unlabel) a group of variables at the same time.

See [D] label.
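The three value-label changes can be sketched together. The label names yesno and yn2 and the indicator variables cheap and light are hypothetical, created only for this example.

```stata
sysuse auto, clear

* Define a label, then redefine it in place with the new replace option
label define yesno 0 "No" 1 "Yes"
label define yesno 0 "no" 1 "yes", replace

* Copy the value label under a new name
label copy yesno yn2

* Attach one label to several variables at once
generate byte cheap = price < 5000
generate byte light = weight < 3000
label values cheap light yesno

label list yesno yn2
```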

11. Existing command expand has new option generate(newvar) that makes it easier to distinguish original from duplicated observations. See [D] expand.
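A short sketch of generate() with expand, using a hypothetical two-observation dataset in which n holds the number of copies wanted:

```stata
clear
input id n
1 1
2 3
end

* dup is 0 for original observations and 1 for the added duplicates
expand n, generate(dup)
list, sepby(id)
```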

12. Concerning egen:

a. New function rowmedian(varlist) returns, observation by observation, the median of the values in varlist.

b. New function rowpctile(varlist), p(#) returns, observation by observation, the #th row percentile of the values within varlist.

c. Existing function mode(varname) with option missing now treats missing values as a category. When version is set to 10 or less, missing does not treat missing as a category.

d. Existing functions total(exp) and rowtotal(varlist) have new option missing. If all values of exp or varlist for an observation are missing, then that observation in newvar will be set to missing.

See [D] egen.
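The row functions and the new missing option can be compared on a small made-up dataset. The variable names are hypothetical.

```stata
clear
input a b c
1 2 9
4 . 6
. . .
end

* Row-wise median and 25th percentile
egen med = rowmedian(a b c)
egen p25 = rowpctile(a b c), p(25)

* Row totals without and with the missing option
egen tot  = rowtotal(a b c)
egen totm = rowtotal(a b c), missing

list
```

In the third observation, where every value is missing, tot is 0 while totm is missing.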

13. Existing command copy now allows copying a file to a directory without having to type the filename twice; see [D] copy.

14. Existing command clear now allows clear matrix to clear all Stata matrices (as distinguished from Mata matrices) from memory; see [D] clear.

15. Existing command outfile now exports date variables as strings rather than their underlying numeric value. Under version control, old behavior is restored. See [D] outfile.

16. Existing command reshape now preserves variable and value labels when converting from long to wide and restores variable and value labels when converting from wide to long. Thus the value and variable labels for the i variable, which exists in long form but not in wide form, are restored when converting back from wide to long. The value labels of the xij variables are similarly restored. Prior behavior is preserved when version is 10 or earlier. See [D] reshape.

17. Existing command collapse now allows new statistics semean, sebinomial, and sepoisson for obtaining the standard error of the mean. See [D] collapse.
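As a sketch, the mean and its standard error can be collapsed by group in one step. The auto dataset and the output names mpg_mean and mpg_se are choices made for this illustration.

```stata
sysuse auto, clear

* One observation per level of foreign, holding the mean of mpg and its standard error
collapse (mean) mpg_mean=mpg (semean) mpg_se=mpg, by(foreign)
list
```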

18. Existing command destring allows new option dpcomma, which converts string representations of numbers that use a comma as the decimal point to numeric form. See [D] destring.
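A minimal sketch of dpcomma on made-up data follows. The variable names pricestr and price are hypothetical.

```stata
clear
input str8 pricestr
"12,5"
"7,25"
"100"
end

* Treat the comma as the decimal point when converting to numeric
destring pricestr, generate(price) dpcomma
list
```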

19. Concerning existing command odbc:

a. odbc insert now uses parameterized inserts, which are faster.

b. The dialogs for odbc load and odbc insert can now store a data-source user ID and password for a Stata session.

c. odbc query has new options verbose and schema. verbose lists any data source alias, nickname, typed table, typed view, and view along with tables so that data from these table types can be loaded. schema lists schema names with the table names if the data source returns schema information.

d. odbc insert has a new dialog.

e. Existing option dsn() now allows the data source to be up to 499 characters.

f. odbc now reports driver errors directly. Previously, odbc would issue the error "ODBC error; type -set debug on- and rerun command to see extended error information" when an ODBC driver issued an error.

g. odbc, with set debug on, for security reasons no longer displays the data source name, user ID, and password used for connecting to your data source.

See [D] odbc.

20. New function strtoname() converts a general string to a string meeting Stata's naming conventions. Also, existing functions lower(), ltrim(), proper(), reverse(), rtrim(), and upper() now have synonyms strlower(), strltrim(), ..., and strupper(). Both sets of names work equally well. See [FN] String functions.

21. New function soundex() returns the soundex code for a name, consisting of a letter followed by three numbers. New function soundex_nara() returns the U.S. Census soundex for a name, also consisting of a letter followed by three numbers, but produced by a different algorithm. See [FN] String functions.
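The new string functions can be tried directly with display. The input strings below are arbitrary examples.

```stata
* Convert a general string to a legal Stata name
display strtoname("1Var name")

* New synonyms for the existing case functions
display strupper("stata")
display strlower("STATA")

* Names that sound alike share a soundex code
display soundex("Robert")
display soundex("Rupert")

* The two soundex algorithms can differ for the same name
display soundex("Ashcraft")
display soundex_nara("Ashcraft")
```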

22. New functions sinh(), cosh(), asinh(), and acosh() join existing functions tanh() and atanh() to provide the hyperbolic functions. See [FN] Trigonometric functions.

23. New functions binomialp(); hypergeometric() and hypergeometricp(); nbinomial(), nbinomialp(), and nbinomialtail(); and poisson(), poissonp(), and poissontail() provide distribution and probability mass for the binomial, hypergeometric, negative binomial, and Poisson distributions. See [FN] Statistical functions.
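A few of these functions can be checked against values computed by hand. The arguments are arbitrary, and the argument order shown (trials, successes, probability for the binomial; mean, count for the Poisson) follows our reading of [FN] Statistical functions.

```stata
* Probability of exactly 3 successes in 10 trials with p = 0.5, equal to 120/1024
display binomialp(10, 3, 0.5)

* Poisson probability mass at 0 for mean 2, equal to exp(-2)
display poissonp(2, 0)

* Poisson cumulative and upper-tail probabilities
display poisson(2, 1)
display poissontail(2, 2)
```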

24. New functions invnbinomial() and invnbinomialtail(), and invpoisson() and invpoissontail() provide inverses for the negative binomial and Poisson distributions. See [FN] Statistical functions.

25. Algorithms for the existing functions normal() and lnnormal() have been improved to operate in 60% and 75% of the time, respectively, while giving equivalent double-precision results.

26. New functions rbeta(), rbinomial(), rchi2(), rgamma(), rhypergeometric(), rnbinomial(), rnormal(), rpoisson(), and rt() produce random variates for the beta, binomial, chi-squared, gamma, hypergeometric, negative binomial, normal, Poisson, and Student's t distributions, respectively.

Old function uniform() has been renamed to runiform(), but uniform() continues to work.

Thus all random-variate functions start with r.

See [FN] Random-number functions.
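A short sketch drawing from several of the new generators follows. The seed, sample size, and distribution parameters are arbitrary choices for illustration.

```stata
clear
set seed 12345
set obs 1000

generate u = runiform()
generate z = rnormal()
generate x = rnormal(10, 2)
generate k = rpoisson(3)
generate b = rbinomial(10, 0.3)

summarize u z x k b
```

Here rnormal(10, 2) draws from a normal distribution with mean 10 and standard deviation 2.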

27. Existing command drawnorm now uses new function rnormal() to generate random variates. When version is set to 10 or earlier, drawnorm reverts to using invnormal(uniform()). See [FN] Random-number functions.

28. Existing command describe now respects the width of the Results window when formatting output; see [D] describe.

29. Existing command renpfix now returns the list of variables changed in r(varlist); see [D] rename.

30. Previously existing command impute still works but is now undocumented. It is replaced by the new multiple-imputation command mi. See the Multiple-Imputation Reference Manual.

What's new in statistics (general)

1. The highlight of this release is statistics related, namely, factor variables. We have already said a lot about them. You will not be able to avoid them. You will not want to avoid them. See [U] 11.4.3 Factor variables.

2. The new postestimation command margins is also a highlight of this release. margins estimates margins and marginal effects. Included are estimated marginal means, least-squares means, average and conditional marginal and partial effects, average and conditional adjusted predictions, predictive margins, and more. We urge you to read [R] margins.

margins replaces old commands mfx and adjust. mfx and adjust are no longer documented but continue to work under version control.

3. New command mi performs multiple imputation; see [MI] intro.

4. New command misstable makes tables that help you understand the pattern of missing values in your data; see [R] misstable.

5. New command gmm implements the generalized method of moments estimator. gmm allows linear and nonlinear models; allows one-step, two-step, and iterative estimators; works with cross-sectional, time-series, and panel data; and allows panel-style instruments. To fit a model, you need only write the expressions of the moments. See [R] gmm.
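As a sketch of writing the moment expression yourself, the linear model below is fit by gmm using the regressors as their own instruments, which should reproduce the regress coefficients. The dataset and variables are chosen only for illustration, and b0, b1, and b2 are parameter names of our own choosing.

```stata
sysuse auto, clear

* Residual expression in parentheses; parameters appear in braces
gmm (mpg - {b1}*gear_ratio - {b2}*turn - {b0}), instruments(gear_ratio turn)

* For comparison, the same coefficients by ordinary least squares
regress mpg gear_ratio turn
```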

6. Concerning factor variables:

a. Factor variables may be specified with almost all estimation commands (see item 6g below).

b. If an estimation command works with factor variables, so do its postestimation commands. If the postestimation command accepts or requires a varlist, factor variables may be specified.

c. Factor variables may be specified with existing commands list and summarize.

d. Commands that allow factor variables also allow new options affecting how output appears: vsquish, baselevels, allbaselevels, noemptycells, and noomitted. Many commands that work with factor variables, such as estat summarize, estat vce, and the like, also allow the above options. Estimation commands also allow new option coeflegend. See [R] estimation options.

coeflegend is useful when you wish to access the coefficients or standard errors individually using _b[] or _se[], such as when you are using lincom, nlcom, or test. coeflegend provides what you need to type.

vsquish reduces the amount of white space used vertically to display results.

Stata used to drop covariates because of collinearity before performing estimation. This is now handled differently. Stata dropped variables for three reasons: because they were 1) base levels of factors, 2) levels corresponding to interactions where there were no data, and 3) truly collinear. These are now identified separately.

New option baselevels says to report reason 1 in main effects.

New option allbaselevels says to report reason 1 in all terms.

New option noemptycells says not to report reason 2.

New option noomitted says not to report reason 3.
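A minimal sketch of the reporting options follows, with the dataset and model chosen only for illustration. coeflegend shows the names to use inside _b[], which are then passed to lincom.

```stata
sysuse auto, clear

* Show the legend of coefficient names instead of the usual statistics
regress price i.foreign mpg, coeflegend

* Use those names in a linear combination
lincom _b[1.foreign] + 5*_b[mpg]

* Redisplay the model with the base level of foreign shown in the table
regress price i.foreign mpg, baselevels
```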

e. New command fvset allows you to specify default base levels and design settings for variables that can be recorded in the dataset and so remembered from one session to the next; see [R] fvset.

f. New command set emptycells drop specifies that all estimation commands drop covariates associated with empty cells from estimation. The default is set emptycells keep. If you have sufficient memory, it is better to keep the covariates because then new postestimation command margins can better identify nonestimability.

g. Factor variables are allowed with the following estimation commands: anova, areg, binreg, biprobit, blogit, bootstrap, bprobit, clogit, cloglog, dfactor, dvech, eivreg, frontier, glm, glogit, gnbreg, gprobit, heckman, heckprob, hetprob, intreg, ivprobit, ivregress, ivtobit, jackknife, logistic, logit, manova, mlogit, mprobit, mvreg, nbreg, newey, ologit, oprobit, poisson, prais, probit, reg3, regress, rologit, rreg, scobit, slogit, sspace, stcox, streg, sureg, svy, tobit, treatreg, truncreg, xtcloglog, xtfrontier, xtgee, xtgls, xtintreg, xtivreg, xtlogit, xtmelogit, xtmepoisson, xtmixed, xtnbreg, xtpcse, xtpoisson, xtprobit, xtrc, xtreg, xtregar, xttobit, zinb, zip, ztnb, and ztp.

7. anova and manova now use Stata's new factor-variable syntax, which means new estimation and postestimation features and a few changes to what you type.

a. In other estimation commands, covariates are assumed to be continuous unless i. is specified in front of variable names. In anova and manova, covariates are assumed to be factors unless c. is specified.

b. To form an interaction, you now use varname#varname rather than varname*varname. A * now means variable-name expansion. A | continues to be used to indicate nesting.

c. varname1##varname2 can now be specified to indicate full factorial layout, i.e., varname1 varname2 varname1#varname2. You can use varname1##varname2##varname3 to form 3-way factorial layouts, and so on.

d. No longer allowed are negative and noninteger levels for categorical variables. Options category(), class(), and continuous() are no longer allowed; instead, factor-variable notations i. and c. are used where there might be ambiguity.

e. Reporting option regress is no longer allowed. To redisplay results, use the regress command after anova, or the mvreg command after manova.

f. Option detail is no longer allowed or necessary. Output produced by anova and manova is self-explanatory, and you can use regress or mvreg if you want factor-level information.

g. Option noanova is no longer allowed. To suppress output, type quietly in front of the command just as you would with any other estimation command.

h. New option dropemptycells makes anova and manova more space efficient by dropping from e(b) and e(V) any interactions for which there are no observations. The disadvantage is that new postestimation command margins then cannot identify nonestimability and issue the appropriate warnings; see [R] margins.

i. The following postestimation commands now work after anova just as they do after regress: dfbeta, estat imtest, estat szroeter, estat vif, hausman, lrtest, margins, predictnl, nlcom, suest, testnl, and testparm. Full estat hettest syntax is now allowed, too.

j. The following postestimation commands now work after manova just as they do after mvreg: margins, nlcom, predictnl, and testnl.

k. Existing command test used after anova now allows all the syntaxes allowed after regress while continuing to allow the special syntaxes for anova.

l. Existing command test used after manova now allows all the syntaxes allowed after mvreg while continuing to allow the special syntaxes for manova.

Old anova and manova syntaxes continue to work under version control. See [R] anova and [MV] manova.
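The new anova syntax can be sketched as follows. The auto dataset and the model are assumptions made for illustration. foreign is treated as a factor by default, and weight is marked continuous with c.

```stata
sysuse auto, clear

* Factor foreign, continuous covariate weight, and their interaction
anova mpg foreign c.weight foreign#c.weight

* Redisplay the fit as a regression table (replaces the old regress option)
regress

* Postestimation commands such as margins now work after anova
margins foreign
```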

8. Concerning the bootstrap and jackknife prefix commands:

a. They may now be used with anova and manova.

b. bootstrap's new option jackknifeopts() allows options to be passed to jackknife for computing acceleration values for BCa confidence intervals.

c. bootstrap no longer overwrites the macro e(version), which the command being prefixed saved.

9. Concerning fractional polynomial regression:

a. Existing commands fracpoly and mfp have a new syntax. They are now prefix commands, so you type fracpoly, ... : estimation_command and mfp, ... : estimation_command. Old syntax continues to be understood.

b. Option adjust() used by fracpoly, mfp, and fracgen is renamed center(). The old option continues to be understood.

c. fracpoly now works with intreg; see [R] intreg.

d. mfp now works with intreg; see [R] intreg.

See [R] fracpoly and [R] mfp.

10. Concerning the existing estimates command:

a. estimates save has new option append, which allows results to be appended to an existing file. See [R] estimates save.

b. estimates use and estimates describe using have new option number(#), which specifies the results to be used or described. See [R] estimates save and [R] estimates describe.

c. estimates table now supports factor variables and time-series-operated variables and so supports the new options vsquish, noomitted, baselevels, allbaselevels, and noemptycells; see [R] estimates table.

11. Concerning existing estimation command ivregress:

a. New postestimation command estat endogenous for use with ivregress 2sls and ivregress gmm performs tests of whether endogenous regressors can be treated as exogenous; see [R] ivregress postestimation.

b. New option perfect for use with ivregress 2sls and ivregress gmm allows perfect instruments; it skips checking whether endogenous regressors are collinear with excluded instruments (see [R] ivregress).

12. Concerning regress:

a. Existing postestimation command dfbeta now names the variables it creates differently. Variables are now named _dfbeta_# rather than DFname. The old naming convention is restored under version control.

b. New option notable suppresses display of the coefficient table.

See [R] regress.

13. Constraints are now allowed by existing estimation commands blogit, bprobit, logistic, logit, ologit, oprobit, and probit. New option collinear specifies not to omit collinear variables from the model.

14. New option nocnsreport for use on estimation commands suppresses display of constraints. See [R] estimation options.

15. Existing command pcorr can now calculate semipartial correlation coefficients; see [R] pcorr.

16. Existing command pwcorr has new option listwise to omit observations in which any of the variables contain missing and thus mimic correlate's treatment of missing values, while maintaining access to all of pwcorr's other features; see [R] correlate.

17. Existing estimation command glm now allows option ml in family(nbinomial ml) to allow estimation via maximum likelihood; see [R] glm.

18. Existing estimation commands asmprobit and asroprobit have several new features:

a. New option factor(#) specifies that a factor covariance structure with dimension # be used.

b. New option favor(speed | space) allows you to set the speed/memory tradeoff. favor(speed) is the default.

c. New option nopivot specifies that interval pivoting not be used in integration. By default, the programs pivot the wider of the integration intervals into the interior of the multivariate integration. Although this improves the accuracy of the quadrature estimate, discontinuities may result in the computation of numerical second-order derivatives.

d. New postestimation command estat facweights specifies that the covariance factor weights be displayed in matrix form.

e. Existing postestimation command estat correlation now uses a default output format of %9.4f instead of the previous %6.3f.

See [R] asmprobit, [R] asroprobit, [R] asmprobit postestimation, and [R] asroprobit postestimation.

19. biprobit with option constraints() specified now applies these constraints when fitting the comparison models. As such, we can now report a likelihood-ratio (LR) comparison test instead of a Wald test. To obtain a Wald comparison test, type test [athrho]_cons after fitting the model.

20. Existing quality-control commands cchart, pchart, rchart, xchart, and shewhart have new option nograph, which suppresses the display of the graph. These commands also now return in r() the relevant values displayed in the charts. Also, pchart has new option generate(), which saves the variables plotted in the chart. See [R] qc.

21. predict used after mlogit, mprobit, ologit, oprobit, and slogit now defaults to predicting the probability of observing the first outcome. Previously, the outcome() option was required.

22. Existing estimation command reg3 now reports large-sample statistics by default when constraints are specified, regardless of the estimator used.

23. Several estimation commands now accept existing convergence-criterion options nrtolerance(#) and nonrtolerance. Commands include blogit, factor, logit, mlogit, ologit, oprobit, probit, rologit, stcox, and tobit. The default is nrtolerance(1e-5).

24. Existing estimation commands exlogistic and expoisson allow option memory() to be more than 512 MB; see [R] exlogistic and [R] expoisson.

25. Existing command ssc, which obtains user-written software from the Statistical Software Components archive, has new syntax ssc hot to list the most-downloaded submissions; see [R] ssc.

What's new in statistics (longitudinal data/panel data)

1. New command xtunitroot performs the Levin-Lin-Chu, Harris-Tzavalis, Breitung's, Im-Pesaran-Shin, Fisher-type, and Hadri Lagrange multiplier tests for unit roots on panel data. See [XT] xtunitroot.

2. Concerning existing estimation command xtmixed:

a. xtmixed now allows modeling of the residual-error structure of linear mixed models. Five structures are available: independent, exchangeable, autoregressive (AR), moving average (MA), and unstructured. Use new option residuals(). Within residuals(), you may also specify suboption by(varname) to obtain heteroskedastic versions of the above structures. For example, specifying residuals(independent, by(sex)) will estimate separate residual variances for males and females.

b. xtmixed has new options matsqrt and matlog, which specify the matrix square-root and matrix logarithm variance-component parameterizations, respectively. Previously, xtmixed supported the matrix logarithm parameterization only. Now xtmixed supports both parameterizations and the default has changed to matsqrt. Previous default behavior is preserved under version control.

c. xtmixed now supports time-series operators.

See [XT] xtmixed.

3. predict after xtmixed now allows new option reses for obtaining standard errors of predicted random effects (best linear unbiased predictions). See [XT] xtmixed postestimation.

4. Concerning existing estimation command xtreg:

a. Specifying xtreg, re vce(robust) now means the same as xtreg, re vce(cluster panelvar). The new interpretation is robust to a broader class of deviations. The old interpretation is available under version control.

b. Similarly, specifying xtreg, fe vce(robust) now means the same as xtreg, fe vce(cluster panelvar) in light of the new results by Stock and Watson (2008).

c. xtreg now allows the in range qualifier.

See [XT] xtreg.
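
To see the equivalence described in items 4a and 4b, you could compare the two commands sketched below. The dataset nlswork and its variables are assumed for illustration; idcode plays the role of panelvar.

    . webuse nlswork
    . xtset idcode
    . xtreg ln_wage age tenure, fe vce(robust)
    . xtreg ln_wage age tenure, fe vce(cluster idcode)

Under version 11, both commands should report the same standard errors.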

5. All xt estimation commands now allow Stata's new factor-variable varlist notation, with the exception of commands xtabond, xtdpd, xtdpdsys, and xthtaylor. See [U] 11.4.3 Factor variables. Also, estimation commands allow the standard set of factor-variable-related reporting options; see [R] estimation options.

6. New postestimation command margins is available after all xt estimation commands; see [R] margins.

7. Concerning existing estimation commands xtmelogit and xtmepoisson:

a. They have new option matsqrt, which allows you to explicitly specify the default matrix square-root parameterization.

b. They now support time-series operators.

See [XT] xtmelogit and [XT] xtmepoisson.

8. As of Stata 10.1, existing estimation commands xtmixed, xtmelogit, and xtmepoisson require that random-effects specifications contain an explicit level variable (or _all) followed by a colon. Previously, if these were omitted, a level specification of _all: was assumed, leading to confusion when only the colon was omitted. To avoid this confusion, omitting the colon now produces an error, with previous behavior preserved under version control.

9. Existing command xttab now returns the matrix of results in r(results) and the number of panels in r(n). See [XT] xttab.

What's new in statistics (time series)

1. New estimation command sspace fits linear state-space models by maximum likelihood. In state-space models, the dependent variables are linear functions of unobserved states and observed exogenous variables. This includes VARMA, structural time-series, some linear dynamic, and some stochastic general-equilibrium models. sspace can estimate stationary and nonstationary models. See [TS] sspace.

2. New estimation command dvech estimates diagonal vech multivariate GARCH models. These models allow the conditional variance matrix of the dependent variables to follow a flexible dynamic structure in which each element of the current conditional variance matrix depends on its own past and on past shocks. See [TS] dvech.

3. New estimation command dfactor estimates dynamic-factor models. These models allow the dependent variables and the unobserved factor variables to have vector autoregressive (VAR) structures and to be linear functions of exogenous variables. See [TS] dfactor.

4. Estimation commands newey, prais, sspace, dvech, and dfactor allow Stata's new factor-variable varlist notation; see [U] 11.4.3 Factor variables. Also, these estimation commands allow the standard set of factor-variable-related reporting options; see [R] estimation options.

5. New postestimation command margins, which calculates marginal means, predictive margins, marginal effects, and average marginal effects, is available after arch, arima, newey, prais, sspace, dvech, and dfactor. See [R] margins.

6. New display option vsquish for estimation commands, which allows you to control the spacing in output containing time-series operators or factor variables, is available after all time-series estimation commands. See [R] estimation options.

7. New display option coeflegend for estimation commands, which displays the coefficients' legend showing how to specify them in an expression, is available after all time-series estimation commands. See [R] estimation options.

8. predict after regress now allows time-series operators in option dfbeta(); see [R] regress postestimation. Also allowing time-series operators are regress postestimation commands estat szroeter, estat hettest, avplot, and avplots. See [R] regress postestimation.

9. Existing estimation commands mlogit, ologit, and oprobit now allow time-series operators; see [R] mlogit, [R] ologit, and [R] oprobit.

10. Existing estimation commands arch and arima now accept maximization option showtolerance; see [R] maximize.

11. Existing estimation command arch now allows you to fit models assuming that the disturbances follow Student's t distribution or the generalized error distribution, as well as the Gaussian (normal) distribution. Specify which distribution to use with option distribution(). You can specify the shape or degree-of-freedom parameter, or you can let arch estimate it along with the other parameters of the model. See [TS] arch.
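
For example, a GARCH(1,1) model with Student's t disturbances might be fit as sketched below. The dataset wpi1 and the variable ln_wpi are assumed for illustration.

    . webuse wpi1
    . arch D.ln_wpi, arch(1) garch(1) distribution(t)
    . arch D.ln_wpi, arch(1) garch(1) distribution(t 5)

The first command lets arch estimate the degrees of freedom; the second fixes them at 5, an arbitrary value chosen for the sketch.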

12. Existing command tsappend is now faster. See [TS] tsappend.

What's new in statistics (survival analysis)

1. Stata's new stcrreg command fits competing-risks regression models. In a competing-risks model, subjects are at risk of failure because of two or more separate and possibly correlated causes. See [ST] stcrreg. Existing command stcurve will now graph cumulative incidence functions after stcrreg; see [ST] stcurve.
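
A competing-risks fit might look like the sketch below. The dataset hypoxia and the variables dftime, failtype, ifp, tumsize, and pelnode are assumed for illustration; the event of interest is declared with stset, and the competing event is declared in option compete().

    . webuse hypoxia
    . stset dftime, failure(failtype == 1)
    . stcrreg ifp tumsize pelnode, compete(failtype == 2)
    . stcurve, cif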

2. Stata's new multiple-imputation features may be used with stcox, streg, and stcrreg; see [MI] intro.

3. Factor variables may now be used with stcox, streg, and stcrreg. See [U] 11.4.3 Factor variables.

4. New postestimation command margins, which calculates marginal means, predictive margins, marginal effects, and average marginal effects, is available after stcox, streg, and stcrreg. See [R] margins.

5. New reporting options baselevels and allbaselevels control how base levels of factor variables are displayed in output tables. New reporting option noemptycells controls whether missing cells in interactions are displayed.

These new options are supported by estimation commands stcox, streg, and stcrreg, and by existing postestimation commands estat summarize and estat vce. See [R] estimation options.

6. New reporting option noomitted controls whether covariates that are dropped because of collinearity are reported in output tables. By default, Stata now includes a line in estimation and related output tables for collinear covariates and marks those covariates as "(omitted)". noomitted suppresses those lines.

noomitted is supported by estimation commands stcox, streg, and stcrreg, and by existing postestimation commands estat summarize and estat vce. See [R] estimation options.

7. New option vsquish eliminates blank lines in estimation and related tables. Many output tables now set off factor variables and time-series-operated variables with a blank line. vsquish removes these lines.

vsquish is supported by estimation commands stcox, streg, and stcrreg, and by existing postestimation command estat summarize. See [R] estimation options.

8. Estimation commands stcox, streg, and stcrreg support new option coeflegend to display the coefficients' legend rather than the coefficient table. The legend shows how you would type a coefficient in an expression, in a test command, or in a constraint definition. See [R] estimation options.

9. Estimation commands streg and stcrreg support new option nocnsreport to suppress reporting constraints; see [R] estimation options.
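
The reporting options in items 5 through 8 can be sketched as follows. The dataset drugtr (already stset) and the variables age and drug are assumed for illustration.

    . webuse drugtr
    . stcox age i.drug, baselevels vsquish
    . stcox, coeflegend
    . test _b[1.drug] = 0

The second command replays the results with the coefficients' legend, which shows the _b[] names that the test command then uses.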

10. Concerning predict:

a. predict after stcox offers three new diagnostic measures of influence: DFBETAs, likelihood displacement values, and LMAX statistics. See [ST] stcox postestimation.

b. predict after stcox can now calculate diagnostic statistics basesurv(), basechazard(), basehc(), mgale(), effects(), esr(), schoenfeld(), and scaledsch(). Previously, you had to request these statistics when you fit the model by specifying the option with the stcox command. Now you obtain them by using predict after estimation. The options continue to work with stcox directly but are no longer documented. See [ST] stcox postestimation.

c. predict after stcox and streg now produces subject-level residuals by default. Previously, record-level or partial results were produced, although there was an inconsistency. This affects multiple-record data only because there is no difference between subject-level and partial residuals in single-record data. This change affects predict's options mgale, csnell, deviance, and scores after stcox (and new options ldisplace, lmax, and dfbeta, of course); and it affects mgale and deviance after streg. predict, deviance was the inconsistency; it always produced subject-level results.

For instance, in previous Stata versions you typed

. predict cs, csnell

to obtain partial Cox-Snell residuals. One statistic per record was produced. To obtain subject-level residuals, for which there is one per subject and which predict stored on each subject's last record, you typed

. predict ccs, ccsnell

In Stata 11, when you type

. predict cs, csnell

you obtain the subject-level residual. To obtain the partial, you use the new partial option:

. predict cs, csnell partial

The same applies to all the other residuals. Concerning the inconsistency, partial deviances are now available.

Not affected is predict, scores after streg. Log-likelihood scores in parametric models are mathematically defined at the record level and are meaningful only if evaluated at that level.

Prior behavior is restored under version control. See [ST] stcox postestimation, [ST] streg postestimation, and [ST] stcrreg postestimation.
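
As a sketch of items 10a and 10b, the statistics below are requested with predict after estimation rather than as options of stcox. The dataset drugtr and the new variable names are assumptions made for illustration.

    . webuse drugtr
    . stcox age drug
    . predict double bs, basesurv
    . predict mg, mgale
    . predict df_age df_drug, dfbeta
    . predict ld, ldisplace
    . predict lm, lmax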

11. stcox now allows up to 100 time-varying covariates as specified in option tvc(). The previous limit was 10. See [ST] stcox.

12. Existing commands stcurve and estat phtest no longer require that you specify the appropriate options to stcox before using them. The commands automatically generate the statistics they require. See [ST] stcurve and [ST] stcox PH-assumption tests.

13. Existing epitab commands ir, cs, cc, and mhodds now treat missing categories of variables in by() consistently. By default, missing categories are now excluded from the computation. This may be overridden by specifying by()'s new option missing. See [ST] epitab.

14. Existing command sts list has new option saving(), which creates a dataset containing the results. See [ST] sts list.

What's new in statistics (multivariate)

1. New command mvtest performs multivariate tests on means, covariances, and correlations (both one-sample and multiple-sample), and it performs tests of univariate, bivariate, and multivariate normality. Included are Box's M test for covariances, and for tests of normality, the Doornik-Hansen omnibus test, Henze-Zirkler test, Mardia's multivariate kurtosis test, and Mardia's multivariate skewness test. See [MV] mvtest.

2. The new factor-variable syntax allowed throughout Stata affects manova even though manova always allowed factor variables. See [MV] manova.

a. manova has an all-new syntax. The old syntax continues to work under version control.

b. manova, just like anova, adopts the new factor-variable syntax, but with a twist. In other Stata commands, continuous is assumed and you use i.varname to indicate a categorical variable. In manova and anova, categorical is assumed and you use c.varname to indicate continuous. Thus the options category(), class(), and continuous() are no longer used.

c. To form an interaction, you use varname1#varname2. Previously, you used varname1*varname2. A * now means variable-name expansion, just as it does on other commands, so you could type manova y* = a b* a#b*. The | symbol continues to be used for nesting.

d. You can now use varname1##varname2 as a shorthand for full factorial, meaning varname1 varname2 varname1#varname2. You can use varname1##varname2##varname3 for 3-way factorial, and so on.
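
To illustrate the new syntax, suppose (hypothetically) that y1 and y2 are dependent variables, a and b are categorical variables, and x is continuous. Under those assumptions, you might type

    . manova y1 y2 = a b a#b
    . manova y1 y2 = a##b
    . manova y1 y2 = a##b c.x

The first two commands specify the same model. The third adds x as a continuous covariate, which under the old syntax would have required option continuous().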

3. Existing command mvreg may now be used after manova to show results in regression-style format, just as regress can be used after anova. See [MV] manova.

4. Existing command test after manova, in addition to allowing the special syntax previously provided, now allows all the standard test syntax, too. See [MV] manova postestimation.

5. Existing commands predictnl, nlcom, testnl, and testparm may now be used after manova; see [R] predictnl, [R] nlcom, [R] testnl, and [R] test.

6. New postestimation command margins may be used after manova. See [R] margins.

7. manova now requires that categorical variables take on nonnegative integer values. Previously, a categorical variable could take on values -1, 2.5, 3.14159, etc., although few did. Arbitrary values are still allowed under version control. See [MV] manova.

8. manova's new option dropemptycells removes unobserved levels from the model rather than setting their coefficients to zero. Statistically, the approaches are equivalent. Computationally, a larger matsize is required when empty cells are retained. In models with many interactions, you may need to specify this option. See [MV] manova and see [R] set emptycells.

9. Programmers: The row and column names on e(b), e(V), etc., after manova are now meaningful and follow standard factor-variable notation. See What's new in [P] intro.

10. Existing command biplot has several improvements:

a. biplot can now be used with larger datasets. Previously, the row dimension was limited by Stata's maximum matsize.

b. biplot has new option generate(), which saves the coordinates of observations in variables.

c. biplot has new options rowover() and row#opts(), which allow highlighting groups of observations on the graph and customizing the look of the graph.

d. New option rowlabel() makes customizing rows easier.

e. biplot now drops constant variables from the computation.

f. biplot now uses an improved version of the singular value decomposition, which may result in sign differences and slight differences in values.

g. rowopts(), colopts(), and negcolopts() now allow names to contain simple and compound quotes.

h. biplot did not honor option scheme(economist) for separate graphs (option separate). This has been fixed.

11. Existing command canon's default output has changed. It previously displayed something that looked like estimation output but was not because standard errors were conditional. The output now looks like you would expect. The conditional output can be obtained by specifying new option stderr or under version control (set version to 10 or earlier).

12. The manual now includes a glossary; see [MV] Glossary.

What's new in statistics (survey)

1. New command margins, a highlight of the release, may be used after estimation, whether survey or not, but will be of special interest to those doing survey estimation. One aspect of margins -- predictive margins -- was developed by survey statisticians for reporting survey results.

margins lets you explore the response surface of a fitted model in any metric of interest -- means, linear predictions, probabilities, marginal effects, risk differences, and so on. margins can evaluate responses for fixed values of the covariates or for observations in a sample or subsample. Average responses can be obtained, not just responses that are conditional on fixed values of the covariates. Survey-adjusted standard errors and confidence intervals are reported based on a linearized variance estimator of the response that accounts for the sampling distribution of the covariates. Thus inferences can be made about the population. See [R] margins.

2. Survey estimators may be used with Stata's new multiple-imputation features. Either svyset your data before you mi set your data or use mi svyset afterward. See [MI] intro.

3. Survey commands now report population and subpopulation sizes with a larger number of digits, reserving scientific notation only for sizes greater than 99 trillion.

4. Survey estimation commands may now be used with factor variables; see [U] 11.4.3 Factor variables.

5. New reporting options baselevels and allbaselevels control how base levels of factor variables are displayed in output tables. New reporting option noemptycells controls whether missing cells in interactions are displayed. These new options are supported by existing prefix command svy and existing postestimation commands estat effects and estat vce. See [R] estimation options.

6. New reporting option noomitted controls whether covariates that are dropped because of collinearity are reported in output tables. By default, Stata now includes a line in estimation and related output tables for collinear covariates and marks those covariates as "(omitted)". noomitted suppresses those lines.

noomitted is supported by prefix command svy and postestimation commands estat effects and estat vce. See [R] estimation options.

7. New option vsquish eliminates blank lines in estimation and related tables. Many output tables now set off factor variables and time-series-operated variables with a blank line. vsquish removes these lines.

vsquish is supported by prefix command svy and postestimation command estat effects.

8. Prefix command svy now supports new option coeflegend to display the coefficients' legend rather than the coefficient table. The legend shows how you would type a coefficient in an expression, in a test command, or in a constraint definition. See [R] estimation options.

9. Prefix command svy now supports new option nocnsreport to suppress reporting constraints; see [R] estimation options.

What's new in statistics (multiple imputation)

1. All of it. Multiple imputation is about the analysis of data for which some values are missing. See [MI] intro.

2. New command misstable makes tables that help you understand the pattern of missing values in your data; see [R] misstable and [MI] mi misstable.
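
For instance, using the auto dataset shipped with Stata (chosen here only for illustration), you could type

    . sysuse auto
    . misstable summarize
    . misstable patterns

The first misstable command reports which variables contain missing values, and the second tabulates the patterns in which they are missing.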

3. Estimation commands that may be used with mi estimate include the following:

    Command       Description
    -----------------------------------------------------------------
    Linear regression models
      regress     Linear regression
      cnsreg      Constrained linear regression
      mvreg       Multivariate regression

    Binary-response regression models
      logistic    Logistic regression, reporting odds ratios
      logit       Logistic regression, reporting coefficients
      probit      Probit regression
      cloglog     Complementary log-log regression
      binreg      GLM for the binomial family

    Count-response regression models
      poisson     Poisson regression
      nbreg       Negative binomial regression
      gnbreg      Generalized negative binomial regression

    Ordinal-response regression models
      ologit      Ordered logistic regression
      oprobit     Ordered probit regression

    Categorical-response regression models
      mlogit      Multinomial (polytomous) logistic regression
      mprobit     Multinomial probit regression
      clogit      Conditional (fixed-effects) logistic regression

    Quantile regression models
      qreg        Quantile regression
      iqreg       Interquantile range regression
      sqreg       Simultaneous-quantile regression
      bsqreg      Quantile regression with bootstrap standard errors

    Survival regression models
      stcox       Cox proportional hazards model
      streg       Parametric survival models
      stcrreg     Competing-risks regression

    Other regression models
      glm         Generalized linear models
      areg        Linear regression with a large dummy-variable set
      rreg        Robust regression
      truncreg    Truncated regression

    Descriptive statistics
      mean        Estimate means
      proportion  Estimate proportions
      ratio       Estimate ratios

    Survey regression models
      svy:        Estimation commands for survey data (excluding
                  commands that are not listed above)
    -----------------------------------------------------------------
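
Any command in the list above is used by prefixing it with mi estimate. A minimal sketch, assuming the already-imputed example dataset mheart1s20 and its variables are available through webuse:

    . webuse mheart1s20
    . mi describe
    . mi estimate: logit attack smokes age bmi hsgrad female

mi estimate fits the model on each imputed dataset and combines the results into a single set of estimates.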

What's new in graphics

1. A release highlight, text in graphs now supports multiple fonts. You can display symbols, Greek letters, subscripts, superscripts, as well as text in multiple font faces including bold and italic. See [G-4] text. Everything is automatic, but you can set up the fonts to be used; see [G-2] graph set, [G-3] ps_options, and [G-3] eps_options.
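
As a small sketch of the text markup, using the auto dataset purely for illustration, you might type

    . sysuse auto
    . scatter mpg weight, title("{bf:Mileage} versus {it:weight}") note("Slope coefficient: {&beta}{sub:1}")

The title mixes bold and italic text, and the note displays a Greek letter with a subscript.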

2. Stata's Graph Editor can now record a series of edits and apply them to other graphs; see Graph Recorder in [G-1] graph editor. You can also apply recorded edits from the command line. See [G-2] graph play and see option play(recordingname) in [G-3] std_options and [G-2] graph use.

3. The dialog box for graph twoway now allows plots to be reordered when multiple plots have been defined.

What's new in programming

1. The big news in programming concerns parsing varlists containing factor variables, dealing with factor variables, and processing matrices whose row or column names contain factor variables.

a. syntax will allow varlists to contain factor variables if new specifier fv is among the specifiers in the description of the varlist, for instance,

syntax varlist(fv) [if] [in] [, Detail]

Similarly, syntax will allow a varlist option to include factor variables if fv is included among its specifiers:

syntax varlist(fv) [if] [in] [, Detail EQ(varlist fv)]

See [P] syntax.

b. You can use resulting macro `varlist' as the varlist for any Stata command that allows factor varlists.

c. Factor varlists come in two flavors, general and specific. An example of a general factor varlist is mpg i.foreign. The corresponding specific factor varlist might be

mpg i(0 1)b0.foreign

A specific factor varlist is specific with respect to a given problem, which is to say, a given dataset and subsample. The specific varlist identifies the values taken on by factor variables and the base.

Users usually specify general factor varlists, although they can specify specific ones. In the process of your program, a factor varlist, if it is general, will become specific. This is usually automatic.

Existing commands _rmcoll and _rmdcoll now accept a general or specific factor varlist and return a specific varlist in r(varlist). See [P] _rmcoll.

Existing command ml accepts a general or specific factor varlist and returns a specific varlist, in this case in the row and column names of the vectors and matrices it produces; see [R] ml. The same applies to Mata's new moptimize() function, which is equivalent to ml; see [M-5] moptimize().

Similarly, all Stata estimation commands that allow factor varlists return the specific varlist in the row and column names of e(b) and e(V).

Factor varlist mpg i(0 1)b0.foreign is specific. The same varlist could be written mpg i0b.foreign i1.foreign, so that is specific, too. The first is specific and unexpanded. The second is specific and expanded. New command fvexpand takes a general or specific (expanded or unexpanded) factor varlist, if or in, and returns a fully expanded, specific varlist. See [P] fvexpand.

New command fvunab takes a general or specific factor varlist and returns it in the same form, but with variable names unabbreviated. See [P] unab.
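
The pieces described so far can be put together as in the following sketch. The program name showfv is hypothetical; it accepts a factor varlist and displays the expanded, specific varlist that fvexpand returns in r(varlist).

    program showfv
        version 11
        syntax varlist(fv) [if] [in]
        // expand the (possibly general) factor varlist for this sample
        fvexpand `varlist' `if' `in'
        display "`r(varlist)'"
    end

Assuming the auto dataset, the program and fvunab might then be used interactively:

    . sysuse auto
    . showfv mpg i.foreign
    . fvunab full : mpg i.for
    . display "`full'"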

d. Matrix row and column names are now generalized to include factor variables. The row or column names contain the elements from a fully expanded, specific factor varlist. Because a fully expanded, specific factor varlist is a factor varlist, the contents of the row or column names can be used with other Stata commands as a varlist. Unrelatedly, the equation portion of the row or column name now has a maximum length of 127 rather than the previous 32.

e. The treatment of variables that are omitted because of collinearity has changed. Previously, such variables were dropped from e(b) and e(V) except by regress, which included the variables but set the corresponding element of e(b) to zero and similarly set the corresponding row and column of e(V) to zero. Now all Stata estimators that allow factor variables work like regress.

Also, if you want to know why the variable was dropped, you can look at the corresponding element of the row or column name. The syntax of an expanded, specific varlist allows operators o and b. Operator o indicates omitted either because the user specified omitted or because of collinearity; b indicates omitted because of being a base category. For instance, o.mpg would indicate that mpg was omitted, whereas i0b.foreign would indicate that foreign=0 was omitted because it was the base category. Either way, the corresponding element of e(b) will be zero, as will the corresponding rows and columns of e(V).

This new treatment of omitted variables -- previously called dropped variables -- can cause old user-written programs to break. This is especially true of old postestimation commands not designed to work with regress. If you set version to 10 or earlier before estimation, however, then estimation results will be stored in the old way and the old postestimation commands will work. The solution is

. version 10
. estimation_command ...
. old_postestimation_command ...
. version 11

When running under version 10 or earlier, you may not use factor variables with the estimation command.

f. Because omitted variables are now part of estimation results, constraints play a larger role in the implementation of estimators. Omitted variables have coefficients constrained to be zero. ml now handles such constraints automatically and posts in e(k_autoCns) the number of such constraints, which can be due to the variable being used as the base, being empty, or being omitted. makecns similarly saves in r(k_autoCns) the number of such constraints, and in r(clist), the constraints used. The matrix of constraints is now posted with ereturn post and saved, as usual, in e(Cns). ereturn matrix no longer posts constraints. Old behavior is preserved under version control. See [R] ml, [P] makecns, and [P] ereturn.

g. There are additional commands to assist in using and manipulating factor varlists that are documented only online; type help undocumented in Stata.

2. Factor variables also allow interactions. Up to eight-way interactions are allowed.

a. Consider the interaction a#b. If each took on two levels, the unexpanded, specific varlist would be i(1 2)b1.a#i(1 2)b1.b. The expanded, specific varlist would be 1b.a#1b.b 1b.a#2.b 2.a#1b.b 2.a#2.b.

b. Consider the interaction c.x#c.x, where x is continuous. The unexpanded and expanded, specific varlists are the same as the general varlist: c.x#c.x.

c. Consider the interaction a#c.x. The unexpanded, specific varlist is i(1 2).a#c.x, and the expanded, specific varlist is 1.a#c.x 2.a#c.x.

d. All of these varlists are handled in the same way that factor variables are handled, as outlined in item 1 above.

3. New command fvrevar creates equivalent, temporary variables for any factor variables, interactions, or time-series-operated variables so that older commands can be easily converted to working with factor variables. We hasten to add that, in general, Stata does not follow the fvrevar approach. Think of fvrevar as a generalization of tsrevar. See [R] fvrevar.
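
For example, with the auto dataset (used here only for illustration), you could type

    . sysuse auto
    . fvrevar i.rep78 mpg
    . display "`r(varlist)'"

r(varlist) then contains the names of the temporary variables standing in for the levels of rep78, followed by mpg itself, and can be passed to a command that does not understand factor variables.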

4. Factor variables lead to a number of additions to what is saved in e() and sometimes r():

a. Estimation commands that post e(V) now post the corresponding rank of the matrix in scalar e(rank).

b. Estimation commands that allow constraints now post the constraints matrix in matrix e(Cns).

c. In many estimation commands allowing constraints, and in the programming command makecns, scalar e(k_autoCns) is now posted containing the sum of the number of base, empty, and omitted constraints.

d. Programming command makecns now saves the constraints used in macro r(clist).

e. Estimation commands that allow factor variables now post in macro e(asbalanced) the name of each factor variable participating in e(b) that was fvset design asbalanced and post in macro e(asobserved) the name of each factor variable participating in e(b) that was fvset design asobserved.

f. Estimation commands now post in macros how new command margins is to treat their prediction statistics when the statistics require special treatment. These macros are e(marginsok), e(marginsnotok), and e(marginsprop).

e(marginsok) specifies the name of predictors that are to be allowed and that appear to violate margins' usual rules, such as dependent variables being involved in the calculation.

e(marginsnotok) are statistics that margins fails to identify as violating assumptions but that do and should not be allowed.

e(marginsprop) provides special signals as to how statistics for the estimator must be handled. Currently allowed are combinations of addcons, noeb, and nochainrule. addcons means that the estimated equations have no constant even if the user did not specify noconstant at estimation time. noeb means that the estimator does not store the covariate names in the column names of e(b). nochainrule means that the chain rule may not be used to calculate derivatives.

g. Matrix e(V_modelbased), the model-based VCE, is now posted by most estimation commands that allow robust variance estimation by bootstrap and jackknife.

h. Existing command sktest now returns in matrix r(N) the matrix of observation counts and in matrix r(Utest) the matrix of test results.

5. Existing command estimates describe using now saves in scalar r(nestresults) the number of sets of estimation results saved in the .ster file.

6. Existing command correlate saves in matrix r(C) the correlation or covariance matrix.

7. Existing command ml has been rewritten. It is now implemented in terms of new Mata function and optimization engine moptimize(). The new ml handles automatic or implied constraints, posts some additional information to e(), and allows evaluators written in Mata as well as ado. See [R] maximize for an overview and see [R] ml and [M-5] moptimize().

8. Existing command estimates save now has option append, which allows storing more than one set of estimation results in the same file; see [R] estimates save.
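
Items 5 and 8 can be sketched together. The filename myresults and the two models below are arbitrary choices for illustration.

    . sysuse auto
    . regress mpg weight
    . estimates save myresults, replace
    . logit foreign mpg
    . estimates save myresults, append
    . estimates describe using myresults
    . display r(nestresults)

The final display shows the number of sets of estimation results stored in myresults.ster.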

9. Existing commands ereturn post and ereturn repost now work with more commands, including logit, mlogit, ologit, oprobit, probit, qreg, _qreg, regress, stcox, and tobit. Also, ereturn post and ereturn repost now allow weights to be specified and save them in e(wtype) and e(wexp). See [P] ereturn.

10. Existing command markout has new option sysmissok, which excludes observations with variables equal to system missing (.) but not to extended missing (.a, .b, ..., .z); see [P] mark. This has to do with new emphasis on imputation of missing values; see [MI] intro.

11. New commands varabbrev and unabbrev make it easy to temporarily reset whether Stata allows variable-name abbreviations; see [P] varabbrev.

12. New programming function smallestdouble() returns the smallest double-precision number greater than zero; see [FN] Programming functions.

13. creturn has new returned values:

a. c(noisily) returns 0 when output is being suppressed and 1 otherwise. Thus programmers can avoid executing code whose only purpose is to display output.

b. c(smallestdouble) returns the smallest double-precision value that is greater than 0.

c. c(tmpdir) returns the temporary directory being used by Stata.

d. c(eqlen) returns the maximum length that Stata allows for equation names.

14. Existing extended macro function :dir has new option respectcase, which causes :dir to respect uppercase and lowercase when performing filename matches. This option is relevant only for Windows.

15. Stata has new string functions strtoname(), soundex(), and soundex_nara(); see [FN] String functions.

16. Stata has 17 new numerical functions: sinh(), cosh(), asinh(), and acosh(); hypergeometric() and hypergeometricp(); nbinomial(), nbinomialp(), and nbinomialtail(); invnbinomial() and invnbinomialtail(); poisson(), poissonp(), and poissontail(); invpoisson() and invpoissontail(); and binomialp(); see [FN] Trigonometric functions and [FN] Statistical functions.
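
A few of the new functions can be tried with display; the argument values below are arbitrary.

    . display binomialp(10, 3, 0.5)
    . display poisson(3, 2)
    . display poissontail(3, 2)
    . display nbinomialp(5, 2, 0.5)
    . display asinh(sinh(1))

For the Poisson functions, the first argument is the mean and the second is the count.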

17. Stata has nine new random-variate functions for beta, binomial, chi-squared, gamma, hypergeometric, negative binomial, normal, Poisson, and Student's t: rbeta(), rbinomial(), rchi2(), rgamma(), rhypergeometric(), rnbinomial(), rnormal(), rpoisson(), and rt(), respectively. Also, old function uniform() is renamed runiform(). All random-variate functions start with r. See [FN] Random-number functions.
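
As a sketch, the following creates a small simulated dataset with several of the new functions. The seed, the number of observations, and the distribution parameters are arbitrary.

    . clear
    . set obs 1000
    . set seed 12345
    . generate u = runiform()
    . generate z = rnormal(0, 1)
    . generate k = rpoisson(3)
    . generate b = rbinomial(10, 0.5)
    . generate t = rt(5)
    . summarize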

18. Existing command clear has new syntax clear matrix, which clears (drops) all Stata matrices, as distinguished from clear mata, which drops all Mata matrices and functions. See [D] clear.

19. These days, commands intended for use by end-users are often being used as subroutines by other end-user commands. Some of these commands preserve the data simply so that, should something go wrong or the user press Break, the original data can be restored. Sometimes, when such commands are used as subroutines, the caller has already preserved the data. Therefore, all programmers are requested to include option nopreserve on commands that preserve the data for no other reason than error recovery, and thus speed execution when commands are used as subroutines. See [P] nopreserve option.
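One way to follow this request is sketched below. The command name countpos is hypothetical; it preserves the data only so that they can be restored, and it skips that step when the caller specifies nopreserve.

```stata
capture program drop countpos
program countpos, rclass
    syntax varname [, noPREserve]
    * local preserve is empty unless nopreserve was specified
    if "`preserve'" == "" {
        preserve
    }
    quietly keep if `varlist' > 0 & !missing(`varlist')
    return scalar N = _N
end

clear
set obs 10
generate x = _n - 5

* called directly: the data are restored when countpos ends
countpos x
display r(N) "  " _N

* called as a subroutine by code that has already preserved
preserve
countpos x, nopreserve
display _N
restore
```

In the first call, five positive values are counted and all 10 observations come back. In the second, countpos leaves only the five kept observations in memory, and the caller's own restore brings the data back.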

What's new in Mata

1. Mata now allows full object-oriented programming! A class is a set of variables, related functions, or both tied together under one name. One class can be derived from another via inheritance. Variables can be public, private, protected, or static. Functions can be public, private, protected, static, or virtual. Members, whether variables or functions, can be final. Classes, member functions, and access to member variables and calls to member functions are fully compiled -- not interpreted -- meaning there is no speed penalty for casting your program in terms of a class. See [M-2] class.
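A small sketch of a class with a private variable, a constructor, and a derived class follows. The names Account, Savings, and demo_account() are hypothetical and are not part of Mata.

```stata
mata:
class Account {
    private real scalar balance
    public  void        new()
    public  void        deposit()
    public  real scalar total()
}

// new() is the constructor; it runs when an instance is created
void Account::new()
{
    balance = 0
}

void Account::deposit(real scalar amount)
{
    balance = balance + amount
}

real scalar Account::total()
{
    return(balance)
}

// Savings inherits deposit() and total() from Account
class Savings extends Account {
    public real scalar rate
    public real scalar projected()
}

real scalar Savings::projected()
{
    return(total() * (1 + rate))
}

real scalar demo_account()
{
    class Savings scalar s

    s.deposit(100)
    s.rate = 0.5
    return(s.projected())
}

demo_account()
end
```

Because balance is private, code outside Account, including Savings, reaches it only through deposit() and total().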

2. The new moptimize() suite of functions comprises Stata's new optimization engine used by ml and thereby, either directly or indirectly, by nearly all official Stata estimation commands. moptimize() provides full support for Stata's new factor variables. See [M-5] moptimize(), [R] ml, and [R] maximize.

moptimize() is important. The full story is that Stata's ml is implemented in terms of Mata's moptimize(), which in turn is implemented in terms of Mata's optimize(). optimize() finds parameters p = (p_1, p_2, ..., p_n) that maximize or minimize f(p). moptimize() finds coefficients b = (b_1, b_2, ..., b_n), where p_1 = X_1b_1, p_2 = X_2b_2, ..., p_n = X_nb_n.
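To make the p_i = X_i b_i structure concrete, here is a sketch of linear regression fit by maximum likelihood, patterned on the kind of example given in [M-5] moptimize(). The evaluator name linregeval() is hypothetical, the auto dataset is assumed to be installed, and the second equation (a constant only) is the variance.

```stata
sysuse auto, clear

mata:
// lf-type evaluator: fill in the observation-level log likelihood
void linregeval(transmorphic M, real rowvector b, real colvector lnf)
{
    real colvector p1, p2, y1

    // p1 = X_1 b_1 (mean) and p2 = X_2 b_2 (variance)
    p1 = moptimize_util_xb(M, b, 1)
    p2 = moptimize_util_xb(M, b, 2)
    y1 = moptimize_util_depvar(M, 1)
    lnf = ln(normalden(y1 :- p1, 0, sqrt(p2)))
}

M = moptimize_init()
moptimize_init_evaluator(M, &linregeval())
moptimize_init_evaluatortype(M, "lf")
moptimize_init_depvar(M, 1, "mpg")
moptimize_init_eq_indepvars(M, 1, "weight foreign")
moptimize_init_eq_indepvars(M, 2, "")
moptimize(M)
moptimize_result_display(M)
end
```

The coefficients in the first equation should agree with those from regress mpg weight foreign, up to convergence tolerance.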

3. New function suite deriv() produces numerically calculated first and second derivatives of vector functions; see [M-5] deriv().
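A minimal sketch of deriv() for a scalar function follows. The evaluator name fx() is hypothetical. The function is exp(-x^2 + x - 3), whose exact derivative at x = 0 is exp(-3), so the two displayed values should agree closely.

```stata
mata:
// evaluator: receives x and returns the function value in y
void fx(x, y)
{
    y = exp(-x^2 + x - 3)
}

D = deriv_init()
deriv_init_evaluator(D, &fx())
deriv_init_params(D, 0)

// second argument 1 requests first derivatives
dydx = deriv(D, 1)
dydx
exp(-3)
end
```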

4. Improvements have been made to optimize():

a. optimize() with constraints is now faster for evaluator types d0 and v0 and for all gradient-based techniques. Also, it is faster for evaluator types d1 and v1 when used with constraints and with the nr (Newton-Raphson) technique.

b. Gauss-Newton optimization, also known as quadratic optimization, is now available as technique gn. Evaluator functions must be of type 'q'.

c. optimize() can now switch between techniques bhhh, nr, bfgs, and dfp (between Berndt-Hall-Hall-Hausman, Newton-Raphson, Broyden-Fletcher-Goldfarb-Shanno, and Davidon-Fletcher-Powell).

d. optimize(), when output of the convergence values is requested in the trace log, now displays the identity and value of the convergence criterion that is closest to being met.

e. optimize() has 15 new initialization functions:

    optimize_init_cluster()             optimize_init_trace_dots()
    optimize_init_colstripe()           optimize_init_trace_gradient()
    optimize_init_conv_ignorenrtol()    optimize_init_trace_Hessian()
    optimize_init_conv_warning()        optimize_init_trace_params()
    optimize_init_evaluations()         optimize_init_trace_step()
    optimize_init_gnweightmatrix()      optimize_init_trace_tol()
    optimize_init_iterid()              optimize_init_trace_value()
    optimize_init_negH()

Also, new function optimize_result_evaluations() reports the number of times the evaluator is called.
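The sketch below uses two of the new initialization functions in a small problem. The evaluator name objfn() is hypothetical, and the "on" and "off" argument values are as we understand them from [M-5] optimize(), so treat them as an assumption to check there. The function exp(-p^2 + p - 3) has its maximum at p = .5.

```stata
mata:
// type d0 evaluator (the default): return the function value in v
void objfn(todo, p, v, g, H)
{
    v = exp(-p^2 + p - 3)
}

S = optimize_init()
optimize_init_evaluator(S, &objfn())
optimize_init_params(S, 0)

// count evaluator calls and omit the function value from the trace log
optimize_init_evaluations(S, "on")
optimize_init_trace_value(S, "off")

phat = optimize(S)
phat
optimize_result_evaluations(S)
end
```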

5. Existing functions st_data() and st_view() now allow the variables to be specified as a string scalar with space-separated names, as well as a string row vector with elements being names. In addition, when a string scalar is used, you can now specify either or both time-series-operated variables (for example, l.gnp) and factor variables (for example, i.rep78).
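Both ways of naming variables can be compared on the auto dataset, which is assumed here to be installed. The matrix names are arbitrary.

```stata
sysuse auto, clear

mata:
// string row vector of names (the existing form)
X1 = st_data(., ("mpg", "weight"))

// string scalar with space-separated names (the new form)
X2 = st_data(., "mpg weight")
X1 == X2

// factor-variable notation is allowed in the string-scalar form
F = st_data(., "mpg i.rep78")
cols(F)
end
```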

6. Thirty-four LAPACK (Linear Algebra PACKage) functions are now available in as-is form and more are coming. LAPACK is the premier software for solving systems of simultaneous equations, eigenvalue problems, and singular value decompositions. Many of Mata's matrix functions are and have been implemented using LAPACK. We are now in the process of making all the double-precision LAPACK real and complex functions available in raw form for those who want to program their own advanced numerical techniques. See [M-5] lapack() and [R] copyright lapack.

7. New function suite eigensystemselect() computes the eigenvectors for selected eigenvalues; see [M-5] eigensystemselect().

8. New function suite geigensystem() computes generalized eigenvectors and eigenvalues; see [M-5] geigensystem().

9. New function suites hessenbergd() and ghessenbergd() compute the (generalized) Hessenberg decompositions; see [M-5] hessenbergd() and [M-5] ghessenbergd().

10. New function suites schurd() and gschurd() compute the (generalized) Schur decompositions; see [M-5] schurd() and [M-5] gschurd().

11. New function _negate() quickly negates a matrix in place; see [M-5] _negate().

12. New functions Dmatrix(), Kmatrix(), and Lmatrix() compute the duplication matrix, commutation matrix, and elimination matrix used in computing derivatives of functions of symmetric matrices; see [M-5] Dmatrix(), [M-5] Kmatrix(), and [M-5] Lmatrix().

13. New function sublowertriangle() extracts the lower triangle of a matrix, where lower triangle means below a specified diagonal; see [M-5] sublowertriangle().

14. New function hasmissing() returns whether a matrix contains any missing values; see [M-5] missing().
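Items 11 through 14 can be tried interactively. The matrices below are arbitrary examples.

```stata
mata:
// _negate() changes A itself rather than returning a copy
A = (1, 2 \ 3, 4)
_negate(A)
A

// for symmetric S, the duplication matrix maps vech(S) to vec(S)
S = (1, 2 \ 2, 3)
Dmatrix(2) * vech(S) == vec(S)

// with no second argument, elements above the main diagonal are zeroed
B = (1, 2, 3 \ 4, 5, 6 \ 7, 8, 9)
sublowertriangle(B)

// 1 if any element is missing, 0 otherwise
hasmissing((1, ., 3))
end
```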

15. New function strtoname() performs the same actions as Stata's strtoname() function: it converts a general string to a string meeting the Stata naming conventions. See [M-5] strtoname().

16. New function abbrev() performs the same actions as Stata's abbrev() function: it returns abbreviated variable names. See [M-5] abbrev().

17. New function _st_tsrevar() is a handle-the-error-yourself variation of existing function st_tsrevar(); see [M-5] st_tsrevar().

18. Existing functions ghk() and ghkfast(), which evaluate multivariate normal integrals, have improved syntax; see [M-5] ghk() and [M-5] ghkfast().

19. Existing functions vec() and vech() are now faster for both real and complex matrices; see [M-5] vec().

20. Mata has 13 new distribution-related functions: hypergeometric() and hypergeometricp(); nbinomial(), nbinomialp(), and nbinomialtail(); invnbinomial() and invnbinomialtail(); poisson(), poissonp(), and poissontail(); invpoisson() and invpoissontail(); and binomialp(); see [M-5] normal().

21. Mata has nine new random-variate functions for beta, binomial, chi-squared, gamma, hypergeometric, negative binomial, normal, Poisson, and Student's t: rbeta(), rbinomial(), rchi2(), rgamma(), rhypergeometric(), rnbinomial(), rnormal(), rpoisson(), and rt(), respectively.

Also, rdiscrete() is provided for drawing from a general discrete distribution.

Old functions uniform() and uniformseed() are replaced with runiform() and rseed(). All random-variate functions start with r. See [M-5] runiform().

22. Existing functions sinh(), cosh(), asinh(), and acosh() now have improved accuracy; see [M-5] sin().

23. New function soundex() returns the soundex code for a name; the code consists of a letter followed by three numbers. New function soundex_nara() returns the U.S. Census soundex for a name; that code also consists of a letter followed by three numbers but is produced by a different algorithm. See [M-5] soundex().

24. Existing function J(r, c, val) now allows val to be specified as a matrix and creates an r*rows(val) x c*cols(val) result. The third argument, val, was previously required to be 1 x 1. Behavior in the 1 x 1 case is unchanged. See [M-5] J().
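For example, with a 1 x 2 val the result of J(2, 2, val) is 2*1 x 2*2, that is, 2 x 4. The values are arbitrary.

```stata
mata:
// 1 x 1 case, unchanged: a 2 x 3 matrix of zeros
J(2, 3, 0)

// matrix case: (1, 2) repeated in a 2 x 2 pattern
T = J(2, 2, (1, 2))
T
rows(T), cols(T)
end
```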

25. Existing functions sort(), _sort(), and order() previously could sort the rows of a matrix based on at most 500 of its columns. This limit has been removed. See [M-5] sort().

26. New function asarray() provides associative arrays; see [M-5] asarray().
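A minimal sketch with string keys, the default for asarray_create(), follows. The keys and values are arbitrary.

```stata
mata:
A = asarray_create()

// store two elements under string keys
asarray(A, "alpha", 1)
asarray(A, "beta", 2)

// retrieve by key, count elements, and test for a key
asarray(A, "alpha")
asarray_elements(A)
asarray_contains(A, "gamma")
end
```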

27. New function hash1() provides Jenkins' one-at-a-time hash function; see [M-5] hash1().
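As a sketch, hash1() with a second argument n scales the hash to the range 1 to n, which is convenient for assigning keys to buckets. The key and the bucket count of 100 are arbitrary.

```stata
mata:
// unscaled hash of the string
hash1("stata")

// hash scaled to a bucket number from 1 to 100
h = hash1("stata", 100)
h

// the same input always gives the same bucket
h == hash1("stata", 100)
end
```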

28. Mata object-code libraries (.mlib's) may now contain up to 2,048 functions; the default limit is 1,024. Use mlib create's new size() option to change the default. The previous fixed maximum was 500. See [M-3] mata mlib.

29. Mata on 64-bit computers now supports matrices larger than 2 gigabytes when the computer has sufficient memory.

30. One hundred and nine existing functions now take advantage of multiple cores when using Stata/MP. They are

    acos()            factorial()         minutes()
    arg()             Fden()              mm()
    asin()            floatround()        mmC()
    atan2()           floor()             mod()
    atan()            Ftail()             mofd()
    betaden()         gammaden()          month()
    binomial()        gammap()            msofhours()
    binomialtail()    gammaptail()        msofminutes()
    binormal()        halfyear()          msofseconds()
    ceil()            hh()                nbetaden()
    chi2()            hhC()               nchi2()
    chi2tail()        hofd()              nFden()
    Cofc()            hours()             nFtail()
    cofC()            ibeta()             nibeta()
    Cofd()            ibetatail()         normal()
    cofd()            invbinomial()       normalden()
    comb()            invbinomialtail()   npnchi2()
    cos()             invchi2()           qofd()
    day()             invchi2tail()       quarter()
    dgammapda()       invF()              round()
    dgammapdada()     invFtail()          seconds()
    dgammapdadx()     invgammap()         sin()
    dgammapdx()       invgammaptail()     sqrt()
    dgammapdxdx()     invibeta()          ss()
    digamma()         invibetatail()      tan()
    dofC()            invnchi2()          tden()
    dofc()            invnFtail()         trigamma()
    dofh()            invnibeta()         trunc()
    dofm()            invnormal()         ttail()
    dofq()            invttail()          week()
    dofw()            ln()                wofd()
    dofy()            lnfactorial()       year()
    dow()             lngamma()           yh()
    doy()             lnnormal()          ym()
    exp()             lnnormalden()       yq()
    F()               mdy()               yw()

What's more

We have not listed all the changes, but we have listed the important ones.

Stata is continually being updated, and those updates are available for free over the Internet. All you have to do is type

. update query

and follow the instructions.

To learn what has been added since this manual was printed, select Help > What's New? or type

. help whatsnew

We hope that you enjoy Stata 11.

--- previous updates ----------------------------------------------------------

See whatsnew10.

-------------------------------------------------------------------------------

