What's new in release 11.0 (compared with release 10)
This file lists the changes corresponding to the creation of Stata
release 11.0:
+---------------------------------------------------------------+
| help file        contents                     years           |
|---------------------------------------------------------------|
| whatsnew         Stata 15.0 and 15.1          2017 to present |
| whatsnew14to15   Stata 15.0 new release       2017            |
| whatsnew14       Stata 14.0, 14.1, and 14.2   2015 to 2017    |
| whatsnew13to14   Stata 14.0 new release       2015            |
| whatsnew13       Stata 13.0 and 13.1          2013 to 2015    |
| whatsnew12to13   Stata 13.0 new release       2013            |
| whatsnew12       Stata 12.0 and 12.1          2011 to 2013    |
| whatsnew11to12   Stata 12.0 new release       2011            |
| whatsnew11       Stata 11.0, 11.1, and 11.2   2009 to 2011    |
| this file        Stata 11.0 new release       2009            |
| whatsnew10       Stata 10.0 and 10.1          2007 to 2009    |
| whatsnew9to10    Stata 10.0 new release       2007            |
| whatsnew9        Stata 9.0, 9.1, and 9.2      2005 to 2007    |
| whatsnew8to9     Stata 9.0 new release        2005            |
| whatsnew8        Stata 8.0, 8.1, and 8.2      2003 to 2005    |
| whatsnew7to8     Stata 8.0 new release        2003            |
| whatsnew7        Stata 7.0                    2001 to 2002    |
| whatsnew6to7     Stata 7.0 new release        2000            |
| whatsnew6        Stata 6.0                    1999 to 2000    |
+---------------------------------------------------------------+
Most recent changes are listed first.
--- more recent updates -------------------------------------------------------
See whatsnew11.
--- Stata 11.0 release 13jul2009 ----------------------------------------------
Remarks
We will list all the changes, item by item, but first, here are the
highlights:
1. Stata now allows factor variables! In estimation, you can now
fit models by typing, for example,
. regress y i.sex i.group i.sex#i.group age (1)
. regress y i.sex##i.group age (same as 1)
. regress y i.sex i.group i.region i.sex#i.group
i.sex#i.region i.group#i.region (2)
i.sex#i.group#i.region
age
. regress y i.sex##i.group##i.region age (same as 2)
and Stata will form for itself the indicator variables for sex,
group, and region, and their interactions. You do not use the
old xi command, and no new variables will be created in your
data. You can form interactions of factor variables with
continuous variables, and continuous variables with continuous
variables by using the c. prefix:
. regress y i.sex##i.group##i.region
age c.age#c.age (3)
. regress y i.sex##i.group##i.region
age i.sex##i.group##i.region#c.age (4)
c.age#c.age i.sex##i.group##i.region#c.age#c.age
. regress y i.sex##i.group##i.region##c.age (same as 4)
i.sex##i.group##i.region##c.age#c.age
This new factor-variable notation is understood by nearly every
Stata estimation command, so you can type, for example,
. logistic outcome i.treatment##i.sex age bp c.age#c.bp
Factor variables work with summarize and list, too:
. list outcome i.treatment##i.sex
Factor variables have lots of additional features; see [U] 11.4.3
Factor variables.
2. Stata 11's new postestimation command margins estimates margins
and marginal effects. Included are estimated marginal means,
least-squares means, average and conditional marginal and partial
effects, average and conditional adjusted predictions, predictive
margins, and more. There are few users who will not find margins
useful. It will be well worth your time to read [R] margins.
3. Stata's new mi suite of commands performs multiple imputation.
There is so much to say that mi gets its own manual.
mi provides methods for the analysis of incomplete data, data for
which some values are missing, and provides both the imputation
and estimation steps. mi's estimation step combines the
estimation and pooling steps. Multivariate normal imputation is
provided, along with five univariate methods that can be used
alone or as building blocks for multivariate imputation.
mi can import already imputed data, including data from NHANES
and ice. mi solves the problem of keeping multiple datasets in
sync. You can create or drop variables or observations just as
if you were working with one dataset. You can merge, append, and
reshape data, all of which is to say that you can perform data
management either before or even after forming the imputations.
Included is an interactive control panel that provides access to
almost all of mi's capabilities and guides you through the steps
of analysis.
See [MI] intro.
4. The new Variables Manager is the one-stop place to go to manage
your variables. Click on the Variables Manager button or type
varmanage. You can change names, labels, display formats, and
storage types. You can define and edit notes, and define and
edit value labels. The Variables Manager is useful even for
those who have thousands of variables in their data; just type
part of the name in the filter at the top left. See [D]
varmanage and [GS] 7 Using the Variables Manager (GSM, GSU, or
GSW).
5. The Data Editor is all new. It is now a live view onto your
data, which means that you can run a Stata command and see the
changes reflected immediately. You can apply filters to view
subsets of your data, take snapshots so that you can undo
changes, and enter dates and times the natural way. See [D] edit
and [GS] 6 Using the Data Editor (GSM, GSU, or GSW).
6. The Do-file Editor under Windows is all new, too. Syntax
highlighting and code folding are provided. There is no limit to
file size. See [D] doedit.
7. You can now put bold and italic text, Greek letters, symbols,
superscripts, and subscripts on graphs! See [G-4] text.
8. If you are not reading this on your computer, you could be.
Stata now has PDF manuals -- [GS], [U], [D], [G], [MI], [MV],
[R], [ST], [SVY], [TS], [XT], [P], [M], and [I] -- and they are
shipped with every copy of Stata. Select Help > PDF
Documentation. Even better, the manuals are integrated into the
help system. From a help file, you can jump directly to the
relevant page just by clicking on the reference. There is
nothing more to know.
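As a minimal sketch of highlights 1 and 2 working together, the
following fits a model with factor-variable notation and then asks
margins for adjusted predictions and an average marginal effect. The
variable names are the hypothetical ones from the logistic example in
highlight 1, not a shipped dataset.
. * outcome, treatment, sex, and age are hypothetical variables
. logistic outcome i.treatment##i.sex age
. * predictive margins for each treatment-by-sex cell
. margins treatment#sex
. * average marginal effect of the continuous covariate age
. margins, dydx(age)
None of these commands adds indicator variables to the data; see
[R] margins for the full syntax.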
There are other exciting new features in this release depending on who
you are and what interests you. These include
o competing-risks regression models; see [ST] stcrreg
o GMM estimation; see [R] gmm
o state-space (Kalman filtering) modeling; see [TS] sspace
o multivariate GARCH; see [TS] dvech
o dynamic-factor models; see [TS] dfactor
o unit-root tests for panel data; see [XT] xtunitroot
o error structures for linear mixed models; see [XT] xtmixed
o standard errors for BLUPs in linear mixed models; see [XT] xtmixed
o object-oriented programming in Mata; see [M-2] class
o full model-based optimization in Mata; see [M-5] moptimize()
o numerical derivative function in Mata; see [M-5] deriv()
Each of these, and more, is covered in the sections that follow.
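For instance, a competing-risks regression might be set up roughly as
follows. This is a sketch only: time, status, age, and treat are
hypothetical variables, with status assumed to be coded 1 for the
event of interest and 2 for the competing event.
. * declare survival-time data; status==1 is the failure of interest
. stset time, failure(status == 1)
. * treat status==2 as the competing event
. stcrreg age treat, compete(status == 2)
See [ST] stcrreg for the options actually available.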
What's new in the GUI and command interface
1. As mentioned in the highlights, the new Variables Manager is the
one-stop place to go to manage your variables. See [D] varmanage
and [GS] 7 Using the Variables Manager (GSM, GSU, or GSW).
2. Also a highlight is the new Data Editor, a live view onto your
data. See [D] edit and [GS] 6 Using the Data Editor (GSM, GSU,
or GSW).
3. The Do-file Editor is all new under Windows and provides syntax
highlighting and code folding. See [D] doedit.
4. You have doubtless already noticed that Stata's Results window
now has a white background. Stata has several new color schemes,
and the one you are seeing is called Standard. What was the
default scheme in Stata 10 is called Classic, so if you want it
back, select Edit > Preferences > General Preferences... and
change the scheme for the Results window to it. You can try the
other schemes or make your own and save it in Custom 1, Custom 2,
or Custom 3.
5. In Stata for Windows, you can now choose from among five
different default layouts for the overall size and position of
Stata's windows or, just as previously, you can make your own.
Select Edit > Preferences > Load Preference Set and pick a
layout. In addition to Factory Settings, available are Compact
Window Layout and three Presentation layouts optimized for
different projector resolutions.
6. Output scrolling in the Results window is now significantly
faster. Also, the upper limit of set scrollbufsize has been
increased to 2,000,000. See [R] set.
7. In Stata for Windows, Graph windows no longer float.
8. In Stata for Windows, existing command window manage has new
subcommand prefs for loading and saving named preference sets;
type help window manage for details.
9. Stata for Unix(GUI) now supports copying graphs to the Clipboard
in bitmap format.
10. Stata for Mac now supports copying graphs to the Clipboard in PDF
format.
11. Stata for Mac's graphical user interface (GUI) has been
completely rewritten in Apple's Cocoa programming interface.
12. Stata for Mac is now available as a universal binary that runs
natively on 32-bit Intel- or PowerPC-based Macs and 64-bit
Intel-based Macs to deliver optimal performance for all three
architectures in a single package.
What's new in data management
1. Existing command merge has all new syntax. It is easier to use,
easier to read, and makes it less likely that you will make a
mistake. Merges are classified as 1:1, 1:m, m:1, and m:m. When
you type merge 1:1, you are saying that you expect the
observations to match one-to-one. merge 1:m specifies a
1-to-many merge; m:1, a many-to-1 merge; and m:m, a many-to-many
merge. New options assert() and keep() allow you to specify what
you expect the outcome to be and what you want to keep from it.
For instance,
. merge 1:1 subjid using filename, assert(match)
means that you expect all the observations in both datasets to
match each other, whereas
. merge 1:1 subjid using filename, assert(match using) keep(match)
specifies that you expect each observation to either match or be
solely from the using data and, assuming that is true, you want
to keep only the matches.
Sorting of both the master and the using datasets is now
automatic.
The new merge does not support merging multiple files in one
step. Merge the first two datasets, then merge that result with
the next dataset, and so on.
merge now aborts with error if variables are string in one
dataset and numeric in the other unless new option force is
specified.
See [D] merge. The old merge syntax continues to work.
2. Existing command append has several new features: 1) it will
work even if there are no data in memory; 2) multiple files can
be appended in one step; and 3) new option generate(newvar)
creates a variable indicating the source of the observations,
numbered 0, 1, .... append now aborts with error if variables
are string in one dataset and numeric in the other unless new
option force is specified. See [D] append. Old behavior is
preserved under version control.
3. Stata's default memory allocations have changed:
a. Stata/SE and Stata/MP now default to allocating 50 M of
memory rather than 10 M. Stata/IC now defaults to 10 M
rather than 1 M. Stata's required footprint has not grown;
we reset these defaults because users were resetting to
larger numbers anyway.
b. Stata/IC now defaults matsize to 400 rather than 200; the
default for Stata/SE and Stata/MP remains 400. The default
for Small Stata is now 100 rather than 40.
4. Existing command order now does what order, move, and aorder did;
see [D] order. Old commands aorder and move continue to work but
are no longer documented.
5. New commands zipfile and unzipfile compress and uncompress files
and directories in zip archive format. See [D] zipfile.
6. New command changeeol converts text from one operating system's
end-of-line format to another's. Stata does not care about
end-of-line format, but some editors and other programs do. See
[D] changeeol.
7. New command snapshot saves to disk and restores from disk copies
of the data in memory. snapshot is used by the new Data Editor.
An important feature of the Data Editor is that it can log all
the changes you make interactively. snapshot will show up in
those logs. snapshot really is a command of Stata, so you can
replay logs to duplicate past efforts. For your own use,
however, it is better if you continue using preserve and restore.
See [D] snapshot.
8. You can now copy-and-paste commands from logs and execute them
without editing out the period (the dot prompt) in front! Stata
11 ignores leading periods.
9. Existing command notes has new options search, replace, and
renumber. See [D] notes.
10. Concerning value labels:
a. Existing command label define has new option replace so that
you do not have to drop the value label before redefining it.
b. New command label copy copies value labels.
c. Existing command label values now allows a varlist, so you
can label (or unlabel) a group of variables at the same time.
See [D] label.
11. Existing command expand has new option generate(newvar) that
makes it easier to distinguish original from duplicated
observations. See [D] expand.
12. Concerning egen:
a. New function rowmedian(varlist) returns, observation by
observation, the median of the values in varlist.
b. New function rowpctile(varlist), p(#) returns, observation by
observation, the #th row percentile of the values within
varlist.
c. Existing function mode(varname) with option missing treats
missing values as a category. When version is set to 10 or
less, missing does not treat missing as a category.
d. Existing functions total(exp) and rowtotal(varlist) have new
option missing. If all values of exp or varlist for an
observation are missing, then that observation in newvar will
be set to missing.
See [D] egen.
13. Existing command copy now allows copying a file to a directory
without having to type the filename twice; see [D] copy.
14. Existing command clear now allows clear matrix to clear all Stata
matrices (as distinguished from Mata matrices) from memory; see
[D] clear.
15. Existing command outfile now exports date variables as strings
rather than their underlying numeric value. Under version
control, old behavior is restored. See [D] outfile.
16. Existing command reshape now preserves variable and value labels
when converting from long to wide and restores variable and value
labels when converting from wide to long. Thus the value and
variable labels for the i variable, which exists in long form but
not in wide form, are restored when converting back from wide to
long. The value labels of the xij variables are similarly
restored. Prior behavior is preserved when version is 10 or
earlier. See [D] reshape.
17. Existing command collapse now allows new statistics semean,
sebinomial, and sepoisson for obtaining the standard error of the
mean. See [D] collapse.
18. Existing command destring allows new option dpcomma to convert
string representations of numbers that use commas as the decimal
point to numeric form. See [D] destring.
19. Concerning existing command odbc:
a. odbc insert now uses parameterized inserts, which are faster.
b. The dialogs for odbc load and odbc insert can now store a
data-source user ID and password for a Stata session.
c. odbc query has new options verbose and schema. verbose lists
any data source alias, nickname, typed table, typed view, and
view along with tables so that data from these table types
can be loaded. schema lists schema names with the table
names if the data source returns schema information.
d. odbc insert has a new dialog.
e. Existing option dsn() now allows the data source to be up to
499 characters.
f. odbc now reports driver errors directly. Previously, odbc
would issue the error "ODBC error; type -set debug on- and
rerun command to see extended error information" when an ODBC
driver issued an error.
g. odbc, with set debug on, for security reasons no longer
displays the data source name, user ID, and password used for
connecting to your data source.
See [D] odbc.
20. New function strtoname() converts a general string to a string
meeting Stata's naming conventions. Also, existing functions
lower(), ltrim(), proper(), reverse(), rtrim(), and upper() now
have synonyms strlower(), strltrim(), ..., and strupper(). Both
sets of names work equally well. See [FN] String functions.
21. New function soundex() returns the soundex code for a name,
consisting of a letter followed by three numbers. New function
soundex_nara() returns the U.S. Census soundex for a name, also
consisting of a letter followed by three numbers, but produced by
a different algorithm. See [FN] String functions.
22. New functions sinh(), cosh(), asinh(), and acosh() join existing
functions tanh() and atanh() to provide the hyperbolic functions.
See [FN] Trigonometric functions.
23. New functions binomialp(); hypergeometric() and
hypergeometricp(); nbinomial(), nbinomialp(), and
nbinomialtail(); and poisson(), poissonp(), and poissontail()
provide distribution and probability mass functions for the binomial,
hypergeometric, negative binomial, and Poisson distributions.
See [FN] Statistical functions.
24. New functions invnbinomial() and invnbinomialtail(), and
invpoisson() and invpoissontail() provide inverses for the
negative binomial and Poisson distributions. See [FN]
Statistical functions.
25. Algorithms for the existing functions normal() and lnnormal()
have been improved to operate in 60% and 75% of the time,
respectively, while giving equivalent double-precision results.
26. New functions rbeta(), rbinomial(), rchi2(), rgamma(),
rhypergeometric(), rnbinomial(), rnormal(), rpoisson(), and rt()
produce random variates for the beta, binomial, chi-squared,
gamma, hypergeometric, negative binomial, normal, Poisson, and
Student's t distributions, respectively.
Old function uniform() has been renamed to runiform(), but
uniform() continues to work.
Thus all random-variate functions start with r.
See [FN] Random-number functions.
27. Existing command drawnorm now uses new function rnormal() to
generate random variates. When version is set to 10 or earlier,
drawnorm reverts to using invnormal(uniform()). See [FN]
Random-number functions.
28. Existing command describe now respects the width of the Results
window when formatting output; see [D] describe.
29. Existing command renpfix now returns the list of variables
changed in r(varlist); see [D] rename.
30. Previously existing command impute still works but is now
undocumented. It is replaced by the new multiple-imputation
command mi. See the Multiple-Imputation Reference Manual.
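Several of the data-management changes above can be sketched in one
short session. Every dataset and variable name here (master, visits,
wave2, wave3, subjid, smoker, diabetic, bp1, bp2, bp3) is hypothetical
and chosen only for illustration.
. use master, clear
. * expect every master observation to match; keep only the matches
. merge 1:1 subjid using visits, assert(match using) keep(match)
. * append two files in one step and record each observation's source
. append using wave2 wave3, generate(source)
. * redefine a value label without dropping it first, then attach it
. * to two variables at once
. label define yesno 0 "no" 1 "yes", replace
. label values smoker diabetic yesno
. * row-wise median, and a row total that is missing when all inputs are
. egen bpmed = rowmedian(bp1 bp2 bp3)
. egen bptot = rowtotal(bp1 bp2 bp3), missing
. * the r-prefixed random-variate functions
. set seed 12345
. generate u = runiform()
. generate z = rnormal()
. generate k = rpoisson(3)
Because Stata 11 ignores leading periods, lines like these can be
pasted back into the Command window as they stand.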
What's new in statistics (general)
1. The highlight of this release is statistics related, namely,
factor variables. We have already said a lot about them. You
will not be able to avoid them. You will not want to avoid them.
See [U] 11.4.3 Factor variables.
2. The new postestimation command margins is also a highlight of
this release. margins estimates margins and marginal effects.
Included are estimated marginal means, least-squares means,
average and conditional marginal and partial effects, average and
conditional adjusted predictions, predictive margins, and more.
We urge you to read [R] margins.
margins replaces old commands mfx and adjust. mfx and adjust are
no longer documented but continue to work under version control.
3. New command mi performs multiple imputation; see [MI] intro.
4. New command misstable makes tables that help you understand the
pattern of missing values in your data; see [R] misstable.
5. New command gmm implements the generalized method of moments
estimator. gmm allows linear and nonlinear models; allows
one-step, two-step, and iterative estimators; works with
cross-sectional, time-series, and panel data; and allows
panel-style instruments. To fit a model, you need only write the
expressions of the moments. See [R] gmm.
6. Concerning factor variables:
a. Factor variables may be specified with almost all estimation
commands (see item 6g below).
b. If an estimation command works with factor variables, so do
its postestimation commands. If the postestimation command
accepts or requires a varlist, factor variables may be
specified.
c. Factor variables may be specified with existing commands list
and summarize.
d. Commands that allow factor variables also allow new options
affecting how output appears: vsquish, baselevels,
allbaselevels, noemptycells, and noomitted. Many commands
that work with factor variables, such as estat summarize,
estat vce, and the like, also allow the above options.
Estimation commands also allow new option coeflegend. See
[R] estimation options.
coeflegend is useful when you wish to access the coefficients
or standard errors individually using _b[] or _se[], such as
when you are using lincom, nlcom, or test. coeflegend
provides what you need to type.
vsquish reduces the amount of white space used vertically to
display results.
Stata used to drop covariates because of collinearity before
performing estimation. This is now handled differently.
Stata dropped variables for three reasons: because they were
1) base levels of factors, 2) levels corresponding to
interactions where there were no data, and 3) truly
collinear. These are now identified separately.
New option baselevels says to report reason 1 in main
effects.
New option allbaselevels says to report reason 1 in all
terms.
New option noemptycells says not to report reason 2.
New option noomitted says not to report reason 3.
e. New command fvset allows you to specify default base levels
and design settings for variables that can be recorded in the
dataset and so remembered from one session to the next; see
[R] fvset.
f. New command set emptycells drop specifies that all estimation
commands drop covariates associated with empty cells from
estimation. The default is set emptycells keep. If you have
sufficient memory, it is better to keep the covariates
because then new postestimation command margins can better
identify nonestimability.
g. Factor variables are allowed with the following estimation
commands: anova, areg, binreg, biprobit, blogit, bootstrap,
bprobit, clogit, cloglog, dfactor, dvech, eivreg, frontier,
glm, glogit, gnbreg, gprobit, heckman, heckprob, hetprob,
intreg, ivprobit, ivregress, ivtobit, jackknife, logistic,
logit, manova, mlogit, mprobit, mvreg, nbreg, newey, ologit,
oprobit, poisson, prais, probit, reg3, regress, rologit,
rreg, scobit, slogit, sspace, stcox, streg, sureg, svy,
tobit, treatreg, truncreg, xtcloglog, xtfrontier, xtgee,
xtgls, xtintreg, xtivreg, xtlogit, xtmelogit, xtmepoisson,
xtmixed, xtnbreg, xtpcse, xtpoisson, xtprobit, xtrc, xtreg,
xtregar, xttobit, zinb, zip, ztnb, and ztp.
7. anova and manova now use Stata's new factor-variable syntax,
which means new estimation and postestimation features and a few
changes to what you type.
a. In other estimation commands, covariates are assumed to be
continuous unless i. is specified in front of variable names.
In anova and manova, covariates are assumed to be factors
unless c. is specified.
b. To form an interaction, you now use varname#varname rather
than varname*varname. A * now means variable-name expansion.
A | continues to be used to indicate nesting.
c. varname1##varname2 can now be specified to indicate full
factorial layout, i.e., varname1 varname2 varname1#varname2.
You can use varname1##varname2##varname3 to form 3-way
factorial layouts, and so on.
d. No longer allowed are negative and noninteger levels for
categorical variables. Options category(), class(), and
continuous() are no longer allowed; instead, factor-variable
notations i. and c. are used where there might be ambiguity.
e. Reporting option regress is no longer allowed. To redisplay
results, use the regress command after anova, or the mvreg
command after manova.
f. Option detail is no longer allowed or necessary. Output
produced by anova and manova is self-explanatory, and you can
use regress or mvreg if you want factor-level information.
g. Option noanova is no longer allowed. To suppress output,
type quietly in front of the command just as you would with
any other estimation command.
h. New option dropemptycells makes anova and manova more space
efficient by dropping from e(b) and e(V) any interactions for
which there are no observations. The disadvantage is that
new postestimation command margins then cannot identify
nonestimability and issue the appropriate warnings; see [R]
margins.
i. The following postestimation commands now work after anova
just as they do after regress: dfbeta, estat imtest, estat
szroeter, estat vif, hausman, lrtest, margins, predictnl,
nlcom, suest, testnl, and testparm. Full estat hettest
syntax is now allowed, too.
j. The following postestimation commands now work after manova
just as they do after mvreg: margins, nlcom, predictnl, and
testnl.
k. Existing command test used after anova now allows all the
syntaxes allowed after regress while continuing to allow the
special syntaxes for anova.
l. Existing command test used after manova now allows all the
syntaxes allowed after mvreg while continuing to allow the
special syntaxes for manova.
Old anova and manova syntaxes continue to work under version
control. See [R] anova and [MV] manova.
8. Concerning the bootstrap and jackknife prefix commands:
a. They may now be used with anova and manova.
b. bootstrap's new option jackknifeopts() allows options to be
passed to jackknife for computing acceleration values for BCa
confidence intervals.
c. bootstrap no longer overwrites the macro e(version), which
the command being prefixed saved.
9. Concerning fractional polynomial regression:
a. Existing commands fracpoly and mfp have a new syntax. They
are now prefix commands, so you type fracpoly, ... :
estimation_command and mfp, ... : estimation_command. Old
syntax continues to be understood.
b. Option adjust() used by fracpoly, mfp, and fracgen is renamed
center(). The old option continues to be understood.
c. fracpoly now works with intreg; see [R] intreg.
d. mfp now works with intreg; see [R] intreg.
See [R] fracpoly and [R] mfp.
10. Concerning the existing estimates command:
a. estimates save has new option append, which allows results to
be appended to an existing file. See [R] estimates save.
b. estimates use and estimates describe using have new option
number(#), which specifies the results to be used or
described. See [R] estimates save and [R] estimates
describe.
c. estimates table now supports factor variables and
time-series-operated variables and so supports the new
options vsquish, noomitted, baselevels, allbaselevels, and
noemptycells; see [R] estimates table.
11. Concerning existing estimation command ivregress:
a. New postestimation command estat endogenous for use with
ivregress 2sls and ivregress gmm performs tests of whether
endogenous regressors can be treated as exogenous; see [R]
ivregress postestimation.
b. New option perfect for use with ivregress 2sls and ivregress
gmm allows perfect instruments; it skips checking whether
endogenous regressors are collinear with excluded instruments
(see [R] ivregress).
12. Concerning regress:
a. Existing postestimation command dfbeta now names the
variables it creates differently. Variables are now named
_dfbeta_# rather than DFname. The old naming convention is
restored under version control.
b. New option notable suppresses display of the coefficient
table.
See [R] regress.
13. Constraints are now allowed by existing estimation commands
blogit, bprobit, logistic, logit, ologit, oprobit, and probit.
New option collinear specifies not to omit collinear variables
from the model.
14. New option nocnsreport for use on estimation commands suppresses
display of constraints. See [R] estimation options.
15. Existing command pcorr can now calculate semipartial correlation
coefficients; see [R] pcorr.
16. Existing command pwcorr has new option listwise to omit
observations in which any of the variables contain missing and
thus mimic correlate's treatment of missing values, while
maintaining access to all of pwcorr's other features; see [R]
correlate.
17. Existing estimation command glm now allows option ml in
family(nbinomial ml) to allow estimation via maximum likelihood;
see [R] glm.
18. Existing estimation commands asmprobit and asroprobit have
several new features:
a. New option factor(#) specifies that a factor covariance
structure with dimension # be used.
b. New option favor(speed | space) allows you to set the
speed/memory tradeoff. favor(speed) is the default.
c. New option nopivot specifies that interval pivoting not be
used in integration. By default, the programs pivot the
wider of the integration intervals into the interior of the
multivariate integration. Although this improves the
accuracy of the quadrature estimate, discontinuities may
result in the computation of numerical second-order
derivatives.
d. New postestimation command estat facweights specifies that
the covariance factor weights be displayed in matrix form.
e. Existing postestimation command estat correlation now uses a
default output format of %9.4f instead of the previous %6.3f.
See [R] asmprobit, [R] asroprobit, [R] asmprobit postestimation,
and [R] asroprobit postestimation.
19. biprobit with option constraints() specified now applies these
constraints when fitting the comparison models. As such, we can
now report a likelihood-ratio (LR) comparison test instead of a
Wald test. To obtain a Wald comparison test,
type test [athrho]_cons after fitting the model.
20. Existing quality-control commands cchart, pchart, rchart, xchart,
and shewhart have new option nograph, which suppresses the
display of the graph. These commands also now return in r() the
relevant values displayed in the charts. Also, pchart has new
option generate(), which saves the variables plotted in the
chart. See [R] qc.
21. predict used after mlogit, mprobit, ologit, oprobit, and slogit
now defaults to predicting the probability of observing the first
outcome. Previously, the outcome() option was required.
22. Existing estimation command reg3 now reports large-sample
statistics by default when constraints are specified, regardless
of the estimator used.
23. Several estimation commands now accept existing
convergence-criterion options nrtolerance(#) and nonrtolerance.
Commands include blogit, factor, logit, mlogit, ologit, oprobit,
probit, rologit, stcox, and tobit. The default is
nrtolerance(1e-5).
24. Existing estimation commands exlogistic and expoisson allow
option memory() to be more than 512 MB; see [R] exlogistic and
[R] expoisson.
25. Existing command ssc, which obtains user-written software from
the Statistical Software Components archive, has new syntax ssc
hot to list the most-downloaded submissions; see [R] ssc.
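A few of the estimation changes above, sketched with hypothetical
variables (y, a, b, x, z, x1, x2) and a hypothetical results file
name (myresults). The factors a and b are assumed to take nonnegative
integer values.
. * anova: covariates are factors by default; c. marks x as continuous
. anova y a##b c.x
. * redisplay the fit as a regression coefficient table
. regress
. * coeflegend shows the _b[] names to use with test, lincom, and nlcom
. regress y i.a##i.b c.x, coeflegend
. * gmm: write the moment expression and name the instruments
. gmm (y - {b0} - {b1}*x), instruments(z)
. * append these results to an existing file, then reload the second
. * set in it (assumes the file already held one set of results)
. estimates save myresults, append
. estimates use myresults, number(2)
. * fracpoly as a prefix command
. fracpoly: regress y x1 x2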
What's new in statistics (longitudinal data/panel data)
1. New command xtunitroot performs the Levin-Lin-Chu,
Harris-Tzavalis, Breitung's, Im-Pesaran-Shin, Fisher-type, and
Hadri Lagrange multiplier tests for unit roots on panel data.
See [XT] xtunitroot.
2. Concerning existing estimation command xtmixed:
a. xtmixed now allows modeling of the residual-error structure
of the linear mixed models. Five structures are available:
independent, exchangeable, autoregressive (AR), moving
average (MA), and unstructured. Use new option residuals().
Within residuals(), you may also specify suboption
by(varname) to obtain heteroskedastic versions of the above
structures. For example, specifying residuals(independent,
by(sex)) will estimate a separate residual variance for males
and for females.
b. xtmixed has new options matlog and matsqrt, which specify the
matrix logarithm and matrix square root variance-component
parameterizations, respectively. Previously, xtmixed
supported the matrix logarithm parameterization only. Now
xtmixed supports both parameterizations and the default has
changed to matsqrt. Previous default behavior is preserved
under version control.
c. xtmixed now supports time-series operators.
See [XT] xtmixed.
3. predict after xtmixed now allows new option reses for obtaining
standard errors of predicted random effects (best linear unbiased
predictions). See [XT] xtmixed postestimation.
4. Concerning existing estimation command xtreg:
a. Specifying xtreg, re vce(robust) now means the same as xtreg,
re vce(cluster panelvar). The new interpretation is robust to
a broader class of deviations. The old interpretation is
available under version control.
b. Similarly, specifying xtreg, fe vce(robust) now means the
same as xtreg, fe vce(cluster panelvar) in light of the new
results by Stock and Watson (2008).
c. xtreg now allows the in range qualifier.
See [XT] xtreg.
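For illustration, assume a panel identified by a hypothetical
variable id, with response y and covariate x. Under the new
interpretation, the second and third commands below should report
the same standard errors, and the fourth requests the old
interpretation through version control:

    . xtset id
    . xtreg y x, fe vce(robust)
    . xtreg y x, fe vce(cluster id)
    . version 10: xtreg y x, fe vce(robust)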
5. All xt estimation commands now allow Stata's new factor-variable
varlist notation, with the exception of commands xtabond, xtdpd,
xtdpdsys, and xthtaylor. See [U] 11.4.3 Factor variables. Also,
estimation commands allow the standard set of
factor-variable-related reporting options; see [R] estimation
options.
6. New postestimation command margins is available after all xt
estimation commands; see [R] margins.
7. Concerning existing estimation commands xtmelogit and
xtmepoisson:
a. They have new option matsqrt, which allows you to explicitly
specify the default matrix square-root parameterization.
b. They now support time-series operators.
See [XT] xtmelogit and [XT] xtmepoisson.
8. As of Stata 10.1, existing estimation commands xtmixed,
xtmelogit, and xtmepoisson require that random-effects
specifications contain an explicit level variable (or _all)
followed by a colon. Previously, if these were omitted, a level
specification of _all: was assumed, leading to confusion when
only the colon was omitted. To avoid this confusion, omitting
the colon now produces an error, with previous behavior preserved
under version control.
9. Existing command xttab now returns the matrix of results in
r(results) and the number of panels in r(n). See [XT] xttab.
What's new in statistics (time series)
1. New estimation command sspace fits linear state-space models by
maximum likelihood. In state-space models, the dependent
variables are linear functions of unobserved states and observed
exogenous variables. This includes VARMA, structural
time-series, some linear dynamic, and some stochastic
general-equilibrium models. sspace can estimate stationary and
nonstationary models. See [TS] sspace.
2. New estimation command dvech estimates diagonal vech multivariate
GARCH models. These models allow the conditional variance matrix
of the dependent variables to follow a flexible dynamic structure
in which each element of the current conditional variance matrix
depends on its own past and on past shocks. See [TS] dvech.
3. New estimation command dfactor estimates dynamic-factor models.
These models allow the dependent variables and the unobserved
factor variables to have vector autoregressive (VAR) structures
and to be linear functions of exogenous variables. See [TS]
dfactor.
4. Estimation commands newey, prais, sspace, dvech, and dfactor
allow Stata's new factor-variable varlist notation; see [U]
11.4.3 Factor variables. Also, these estimation commands allow
the standard set of factor-variable-related reporting options;
see [R] estimation options.
5. New postestimation command margins, which calculates marginal
means, predictive margins, marginal effects, and average marginal
effects, is available after arch, arima, newey, prais, sspace,
dvech, and dfactor. See [R] margins.
6. New display option vsquish for estimation commands, which allows
you to control the spacing in output containing time-series
operators or factor variables, is available after all time-series
estimation commands. See [R] estimation options.
7. New display option coeflegend for estimation commands, which
displays the coefficients' legend showing how to specify them in
an expression, is available after all time-series estimation
commands. See [R] estimation options.
8. predict after regress now allows time-series operators in option
dfbeta(); see [R] regress postestimation. Also allowing
time-series operators are regress postestimation commands estat
szroeter, estat hettest, avplot, and avplots. See [R] regress
postestimation.
9. Existing estimation commands mlogit, ologit, and oprobit now
allow time-series operators; see [R] mlogit, [R] ologit, and [R]
oprobit.
10. Existing estimation commands arch and arima now accept
maximization option showtolerance; see [R] maximize.
11. Existing estimation command arch now allows you to fit models
assuming that the disturbances follow Student's t distribution or
the generalized error distribution, as well as the Gaussian
(normal) distribution. Specify which distribution to use with
option distribution(). You can specify the shape or
degree-of-freedom parameter, or you can let arch estimate it
along with the other parameters of the model. See [TS] arch.
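A sketch of the three choices, assuming a hypothetical series y and
time variable t: the first command lets arch estimate the
degrees of freedom of the t distribution, the second fixes them at
5, and the third uses the generalized error distribution with an
estimated shape parameter.

    . tsset t
    . arch y, arch(1) garch(1) distribution(t)
    . arch y, arch(1) garch(1) distribution(t 5)
    . arch y, arch(1) garch(1) distribution(ged)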
12. Existing command tsappend is now faster. See [TS] tsappend.
What's new in statistics (survival analysis)
1. Stata's new stcrreg command fits competing-risks regression
models. In a competing-risks model, subjects are at risk of
failure because of two or more separate and possibly correlated
causes. See [ST] stcrreg. Existing command stcurve will now
graph cumulative incidence functions after stcrreg; see [ST]
stcurve.
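As an illustrative sketch, assume hypothetical variables time
(analysis time), status (1 = failure from the cause of interest,
2 = failure from a competing cause, 0 = censored), and covariates
x1 and x2. The failure of interest is declared with stset, the
competing event with option compete(), and stcurve then graphs the
cumulative incidence function:

    . stset time, failure(status == 1)
    . stcrreg x1 x2, compete(status == 2)
    . stcurve, cif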
2. Stata's new multiple-imputation features may be used with stcox,
streg, and stcrreg; see [MI] intro.
3. Factor variables may now be used with stcox, streg, and stcrreg.
See [U] 11.4.3 Factor variables.
4. New postestimation command margins, which calculates marginal
means, predictive margins, marginal effects, and average marginal
effects, is available after stcox, streg, and stcrreg. See [R]
margins.
5. New reporting options baselevels and allbaselevels control how
base levels of factor variables are displayed in output tables.
New reporting option noemptycells controls whether missing cells
in interactions are displayed.
These new options are supported by estimation commands stcox,
streg, and stcrreg, and by existing postestimation commands estat
summarize and estat vce. See [R] estimation options.
6. New reporting option noomitted controls whether covariates that
are dropped because of collinearity are reported in output
tables. By default, Stata now includes a line in estimation and
related output tables for collinear covariates and marks those
covariates as "(omitted)". noomitted suppresses those lines.
noomitted is supported by estimation commands stcox, streg, and
stcrreg, and by existing postestimation commands estat summarize
and estat vce. See [R] estimation options.
7. New option vsquish eliminates blank lines in estimation and
related tables. Many output tables now set off factor variables
and time-series-operated variables with a blank line. vsquish
removes these lines.
vsquish is supported by estimation commands stcox, streg, and
stcrreg, and by existing postestimation command estat summarize.
See [R] estimation options.
8. Estimation commands stcox, streg, and stcrreg support new option
coeflegend to display the coefficients' legend rather than the
coefficient table. The legend shows how you would type a
coefficient in an expression, in a test command, or in a
constraint definition. See [R] estimation options.
9. Estimation commands streg and stcrreg support new option
nocnsreport to suppress reporting constraints; see [R] estimation
options.
10. Concerning predict:
a. predict after stcox offers three new diagnostic measures of
influence: DFBETAs, likelihood displacement values, and LMAX
statistics. See [ST] stcox postestimation.
b. predict after stcox can now calculate diagnostic statistics
basesurv(), basechazard(), basehc(), mgale(), effects(),
esr(), schoenfeld(), and scaledsch(). Previously, you had to
request these statistics when you fit the model by specifying
the option with the stcox command. Now you obtain them by
using predict after estimation. The options continue to work
with stcox directly but are no longer documented. See [ST]
stcox postestimation.
c. predict after stcox and streg now produces subject-level
residuals by default. Previously, record-level or partial
results were produced, although there was an inconsistency.
This affects multiple-record data only because there is no
difference between subject-level and partial residuals in
single-record data. This change affects predict's options
mgale, csnell, deviance, and scores after stcox (and new
options ldisplace, lmax, and dfbeta, of course); and it
affects mgale and deviance after streg. predict, deviance
was the inconsistency; it always produced subject-level
results.
For instance, in previous Stata versions you typed
. predict cs, csnell
to obtain partial Cox-Snell residuals. One statistic per
record was produced. To obtain subject-level residuals, for
which there is one per subject and which predict stored on
each subject's last record, you typed
. predict ccs, ccsnell
In Stata 11, when you type
. predict cs, csnell
you obtain the subject-level residual. To obtain the
partial, you use the new partial option:
. predict cs, csnell partial
The same applies to all the other residuals. Concerning the
inconsistency, partial deviances are now available.
Not affected is predict, scores after streg. Log-likelihood
scores in parametric models are mathematically defined at the
record level and are meaningful only if evaluated at that
level.
Prior behavior is restored under version control. See [ST]
stcox postestimation, [ST] streg postestimation, and [ST]
stcrreg postestimation.
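To illustrate items a and b, here is a sketch that assumes
hypothetical variables time, died, x1, and x2; the names of the
new variables are arbitrary. The model is fit without any
diagnostic options, and the statistics are requested afterward:

    . stset time, failure(died)
    . stcox x1 x2
    . predict double s0, basesurv
    . predict double sch*, schoenfeld
    . predict double dfb*, dfbeta
    . predict double ld, ldisplace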
11. stcox now allows up to 100 time-varying covariates as specified
in option tvc(). The previous limit was 10. See [ST] stcox.
12. Existing commands stcurve and estat phtest no longer require that
you specify the appropriate options to stcox before using them.
The commands automatically generate the statistics they require.
See [ST] stcurve and [ST] stcox PH-assumption tests.
13. Existing epitab commands ir, cs, cc, and mhodds now treat missing
categories of variables in by() consistently. By default,
missing categories are now excluded from the computation. This
may be overridden by specifying by()'s new option missing. See
[ST] epitab.
14. Existing command sts list has new option saving(), which creates
a dataset containing the results. See [ST] sts list.
What's new in statistics (multivariate)
1. New command mvtest performs multivariate tests on means,
covariances, and correlations (both one-sample and
multiple-sample), and it performs tests of univariate, bivariate,
and multivariate normality. Included are Box's M test for
covariances, and for tests of normality, the Doornik-Hansen
omnibus test, Henze-Zirkler test, Mardia's multivariate kurtosis
test, and Mardia's multivariate skewness test. See [MV] mvtest.
2. The new factor-variable syntax allowed throughout Stata affects
manova even though manova always allowed factor variables. See
[MV] manova.
a. manova has an all-new syntax. The old syntax continues to
work under version control.
b. manova, just like anova, adopts the new factor-variable
syntax, but with a twist. In other Stata commands,
continuous is assumed and you use i.varname to indicate a
categorical variable. In manova and anova, categorical is
assumed and you use c.varname to indicate continuous. Thus
the options category(), class(), and continuous() are no
longer used.
c. To form an interaction, you use varname1#varname2.
Previously, you used varname1*varname2. A * now means
variable-name expansion, just as it does on other commands,
so you could type manova y* = a b* a#b*. The | symbol
continues to be used for nesting.
d. You can now use varname1##varname2 as a shorthand for full
factorial, meaning varname1 varname2 varname1#varname2. You
can use varname1##varname2##varname3 for 3-way factorial, and
so on.
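For example, assuming hypothetical responses y1 and y2,
categorical variables a and b, and a continuous covariate x, the
first two commands below specify the same two-way factorial model,
the third adds the continuous covariate, and the fourth uses the
old interaction syntax under version control:

    . manova y1 y2 = a b a#b
    . manova y1 y2 = a##b
    . manova y1 y2 = a##b c.x
    . version 10: manova y1 y2 = a b a*b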
3. Existing command mvreg may now be used after manova to show
results in regression-style format, just as regress can be used
after anova. See [MV] manova.
4. Existing command test after manova, in addition to allowing the
special syntax previously provided, now allows all the standard
test syntax, too. See [MV] manova postestimation.
5. Existing commands predictnl, nlcom, testnl, and testparm may now
be used after manova; see [R] predictnl, [R] nlcom, [R] testnl,
and [R] test.
6. New postestimation command margins may be used after manova. See
[R] margins.
7. manova now requires that categorical variables take on
nonnegative integer values. Previously, a categorical variable
could take on values -1, 2.5, 3.14159, etc., although few did.
Arbitrary values are still allowed under version control. See
[MV] manova.
8. manova's new option dropemptycells removes unobserved levels from
the model rather than setting their coefficients to zero.
Statistically, the approaches are equivalent. Computationally, a
larger matsize is required when empty cells are retained. In
models with many interactions, you may need to specify this
option. See [MV] manova and see [R] set emptycells.
9. Programmers: The row and column names on e(b), e(V), etc., after
manova are now meaningful and follow standard factor-variable
notation. See What's new in [P] intro.
10. Existing command biplot has several improvements:
a. biplot can now be used with larger datasets. Previously, the
row dimension was limited by Stata's maximum matsize.
b. biplot has new option generate(), which saves the coordinates
of observations in variables.
c. biplot has new options rowover() and row#opts(), which allow
highlighting groups of observations on the graph and
customizing the look of the graph.
d. New option rowlabel() makes customizing rows easier.
e. biplot now drops constant variables from the computation.
f. biplot now uses an improved version of the singular value
decomposition, which may result in sign differences and
slight differences in values.
g. rowopts(), colopts(), and negcolopts() now allow names to
contain simple and compound quotes.
h. biplot did not honor option scheme(economist) for separate
graphs (option separate). This has been fixed.
11. Existing command canon's default output has changed. It
previously displayed something that looked like estimation output
but was not because standard errors were conditional. The output
now looks like you would expect. The conditional output can be
obtained by specifying new option stderr or under version control
(set version to 10 or earlier).
12. The manual now includes a glossary; see [MV] Glossary.
What's new in statistics (survey)
1. New command margins, a highlight of the release, may be used
after estimation, whether survey or not, but will be of special
interest to those doing survey estimation. One aspect of margins
-- predictive margins -- was developed by survey statisticians
for reporting survey results.
margins lets you explore the response surface of a fitted model
in any metric of interest -- means, linear predictions,
probabilities, marginal effects, risk differences, and so on.
margins can evaluate responses for fixed values of the covariates
or for observations in a sample or subsample. Average responses
can be obtained, not just responses that are conditional on fixed
values of the covariates. Survey-adjusted standard errors and
confidence intervals are reported based on a linearized variance
estimator of the response that accounts for the sampling
distribution of the covariates. Thus inferences can be made
about the population. See [R] margins.
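A minimal sketch, assuming hypothetical survey design variables
psu, stratum, and wt, a binary outcome y, a categorical variable
sex, and a continuous variable age. Option vce(unconditional) is
what requests the linearized variance estimator that treats the
covariates as sampled:

    . svyset psu [pweight = wt], strata(stratum)
    . svy: logistic y i.sex age
    . margins sex, vce(unconditional)
    . margins, dydx(age) vce(unconditional)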
2. Survey estimators may be used with Stata's new
multiple-imputation features. Either svyset your data before you
mi set your data or use mi svyset afterward. See [MI] intro.
3. Survey commands now report population and subpopulation sizes
with a larger number of digits, reserving scientific notation
only for sizes greater than 99 trillion.
4. Survey estimation commands may now be used with factor variables;
see [U] 11.4.3 Factor variables.
5. New reporting options baselevels and allbaselevels control how
base levels of factor variables are displayed in output tables.
New reporting option noemptycells controls whether missing cells
in interactions are displayed. These new options are supported
by existing prefix command svy and existing postestimation
commands estat effects and estat vce. See [R] estimation
options.
6. New reporting option noomitted controls whether covariates that
are dropped because of collinearity are reported in output
tables. By default, Stata now includes a line in estimation and
related output tables for collinear covariates and marks those
covariates as "(omitted)". noomitted suppresses those lines.
noomitted is supported by prefix command svy and postestimation
commands estat effects and estat vce. See [R] estimation
options.
7. New option vsquish eliminates blank lines in estimation and
related tables. Many output tables now set off factor variables
and time-series-operated variables with a blank line. vsquish
removes these lines.
vsquish is supported by prefix command svy and postestimation
command estat effects.
8. Prefix command svy now supports new option coeflegend to display
the coefficients' legend rather than the coefficient table. The
legend shows how you would type a coefficient in an expression,
in a test command, or in a constraint definition. See [R]
estimation options.
9. Prefix command svy now supports new option nocnsreport to
suppress reporting constraints; see [R] estimation options.
What's new in statistics (multiple imputation)
1. All of it. Multiple imputation is about the analysis of data for
which some values are missing. See [MI] intro.
2. New command misstable makes tables that help you understand the
pattern of missing values in your data; see [R] misstable and
[MI] mi misstable.
3. Estimation commands that may be used with mi estimate include the
following:
Command        Description
-----------------------------------------------------------------
Linear regression models
  regress      Linear regression
  cnsreg       Constrained linear regression
  mvreg        Multivariate regression

Binary-response regression models
  logistic     Logistic regression, reporting odds ratios
  logit        Logistic regression, reporting coefficients
  probit       Probit regression
  cloglog      Complementary log-log regression
  binreg       GLM for the binomial family

Count-response regression models
  poisson      Poisson regression
  nbreg        Negative binomial regression
  gnbreg       Generalized negative binomial regression

Ordinal-response regression models
  ologit       Ordered logistic regression
  oprobit      Ordered probit regression

Categorical-response regression models
  mlogit       Multinomial (polytomous) logistic regression
  mprobit      Multinomial probit regression
  clogit       Conditional (fixed-effects) logistic regression

Quantile regression models
  qreg         Quantile regression
  iqreg        Interquantile range regression
  sqreg        Simultaneous-quantile regression
  bsqreg       Quantile regression with bootstrap standard errors

Survival regression models
  stcox        Cox proportional hazards model
  streg        Parametric survival models
  stcrreg      Competing-risks regression

Other regression models
  glm          Generalized linear models
  areg         Linear regression with a large dummy-variable set
  rreg         Robust regression
  truncreg     Truncated regression

Descriptive statistics
  mean         Estimate means
  proportion   Estimate proportions
  ratio        Estimate ratios

Survey regression models
  svy:         Estimation commands for survey data (excluding
               commands that are not listed above)
-----------------------------------------------------------------
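As a sketch of a typical workflow, assume a hypothetical dataset
in which x has missing values and y and z are complete. The data
are mi set, x is registered as imputed, five imputations are
added, and the model is then fit on the imputed data with mi
estimate:

    . mi set mlong
    . mi register imputed x
    . mi impute regress x z, add(5)
    . mi estimate: regress y x z

The storage style (mlong) and the number of imputations are
arbitrary choices made for the example.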
What's new in graphics
1. A release highlight, text in graphs now supports multiple fonts.
You can display symbols, Greek letters, subscripts, superscripts,
as well as text in multiple font faces including bold and italic.
See [G-4] text. Everything is automatic, but you can set up the
fonts to be used; see [G-2] graph set, [G-3] ps_options, and
[G-3] eps_options.
2. Stata's Graph Editor can now record a series of edits and apply
them to other graphs; see Graph Recorder in [G-1] graph editor.
You can also apply recorded edits from the command line. See
[G-2] graph play and see option play(recordingname) in [G-3]
std_options and [G-2] graph use.
3. The dialog box for graph twoway now allows plots to be reordered
when multiple plots have been defined.
What's new in programming
1. The big news in programming concerns parsing varlists containing
factor variables, dealing with factor variables, and processing
matrices whose row or column names contain factor variables.
a. syntax will allow varlists to contain factor variables if new
specifier fv is among the specifiers in the description of
the varlist, for instance,
syntax varlist(fv) [if] [in] [, Detail]
Similarly, syntax will allow a varlist option to include
factor variables if fv is included among its specifiers:
syntax varlist(fv) [if] [in] [, Detail] EQ(varlist fv)
See [P] syntax.
b. You can use resulting macro `varlist' as the varlist for any
Stata command that allows factor varlists.
c. Factor varlists come in two flavors, general and specific. An
example of a general factor varlist is mpg i.foreign. The
corresponding specific factor varlist might be
mpg i(0 1)b0.foreign
A specific factor varlist is specific with respect to a given
problem, which is to say, a given dataset and subsample. The
specific varlist identifies the values taken on by factor
variables and the base.
Users usually specify general factor varlists, although they
can specify specific ones. In the process of your program, a
factor varlist, if it is general, will become specific. This
is usually automatic.
Existing commands _rmcoll and _rmdcoll now accept a general
or specific factor varlist and return a specific varlist in
r(varlist). See [P] _rmcoll.
Existing command ml accepts a general or specific factor
varlist and returns a specific varlist, in this case in the
row and column names of the vectors and matrices it produces;
see [R] ml. The same applies to Mata's new moptimize()
function, which is equivalent to ml; see [M-5] moptimize().
Similarly, all Stata estimation commands that allow factor
varlists return the specific varlist in the row and column
names of e(b) and e(V).
Factor varlist mpg i(0 1)b0.foreign is specific. The same
varlist could be written mpg i0b.foreign i1.foreign, so that
is specific, too. The first is specific and unexpanded. The
second is specific and expanded. New command fvexpand takes
a general or specific (expanded or unexpanded) factor
varlist, optionally with if or in, and returns a fully expanded,
specific varlist. See [P] fvexpand.
New command fvunab takes a general or specific factor varlist
and returns it in the same form, but with variable names
unabbreviated. See [P] unab.
d. Matrix row and column names are now generalized to include
factor variables. The row or column names contain the
elements from a fully expanded, specific factor varlist.
Because a fully expanded, specific factor varlist is a factor
varlist, the contents of the row or column names can be used
with other Stata commands as a varlist. Unrelatedly, the
equation portion of the row or column name now has a maximum
length of 127 rather than the previous 32.
e. The treatment of variables that are omitted because of
collinearity has changed. Previously, such variables were
dropped from e(b) and e(V) except by regress, which included
the variables but set the corresponding element of e(b) to
zero and similarly set the corresponding row and column of
e(V) to zero. Now all Stata estimators that allow factor
variables work like regress.
Also, if you want to know why the variable was dropped, you
can look at the corresponding element of the row or column
name. The syntax of an expanded, specific varlist allows
operators o and b. Operator o indicates omitted either
because the user specified omitted or because of
collinearity; b indicates omitted because of being a base
category. For instance, o.mpg would indicate that mpg was
omitted, whereas i0b.foreign would indicate that foreign=0
was omitted because it was the base category. Either way,
the corresponding element of e(b) will be zero, as will the
corresponding rows and columns of e(V).
This new treatment of omitted variables -- previously called
dropped variables -- can cause old user-written programs to
break. This is especially true of old postestimation
commands not designed to work with regress. If you set
version to 10 or earlier before estimation, however, then
estimation results will be stored in the old way and the old
postestimation commands will work. The solution is
. version 10
. estimation_command ...
. old_postestimation_command ...
. version 11
When running under version 10 or earlier, you may not use
factor variables with the estimation command.
f. Because omitted variables are now part of estimation results,
constraints play a larger role in the implementation of
estimators. Omitted variables have coefficients constrained
to be zero. ml now handles such constraints automatically and
posts in e(k_autoCns) the number of such constraints, which
can be due to the variable being used as the base, being
empty, or being omitted. makecns similarly saves in
r(k_autoCns) the number of such constraints, and in r(clist),
the constraints used. The matrix of constraints is now
posted with ereturn post and saved, as usual, in e(Cns).
ereturn matrix no longer posts constraints. Old behavior is
preserved under version control. See [R] ml, [P] makecns,
and [P] ereturn.
g. There are additional commands to assist in using and
manipulating factor varlists that are documented only online;
type help undocumented in Stata.
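The pieces described in item 1 can be put together in a short
program. The sketch below is illustrative only; the program name
showfv is hypothetical. It accepts a factor varlist through
syntax, passes it with any if or in qualifier to fvexpand, and
displays the fully expanded, specific varlist that fvexpand
returns in r(varlist):

    program define showfv
        version 11
        // fv allows factor variables in the varlist
        syntax varlist(fv) [if] [in]
        // expand to a specific varlist for this sample
        fvexpand `varlist' `if' `in'
        display "`r(varlist)'"
    end

It could be tried on the auto dataset shipped with Stata by typing

    . sysuse auto
    . showfv mpg i.foreign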
2. Factor variables also allow interactions. Up to eight-way
interactions are allowed.
a. Consider the interaction a#b. If each took on two levels, the
unexpanded, specific varlist would be i(1 2)b1.a#i(1 2)b1.b.
The expanded, specific varlist would be 1b.a#1b.b 1b.a#2.b
2.a#1b.b 2.a#2.b.
b. Consider the interaction c.x#c.x, where x is continuous. The
unexpanded and expanded, specific varlists are the same as
the general varlist: c.x#c.x.
c. Consider the interaction a#c.x. The unexpanded, specific
varlist is i(1 2).a#c.x, and the expanded, specific varlist
is 1.a#c.x 2.a#c.x.
d. All of these varlists are handled in the same way that factor
variables are handled, as outlined in item 1 above.
3. New command fvrevar creates equivalent, temporary variables for
any factor variables, interactions, or time-series-operated
variables so that older commands can be easily converted to
working with factor variables. We hasten to add that, in
general, Stata does not follow the fvrevar approach. Think of
fvrevar as a generalization of tsrevar. See [R] fvrevar.
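For instance, using the auto dataset shipped with Stata, the
following sketch creates temporary variables for the indicators of
foreign and for the square of mpg and then displays the resulting
list of variable names, which fvrevar leaves in r(varlist):

    . sysuse auto
    . fvrevar i.foreign c.mpg#c.mpg
    . display "`r(varlist)'"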
4. Factor variables lead to a number of additions to what is saved
in e() and sometimes r():
a. Estimation commands that post e(V) now post the corresponding
rank of the matrix in scalar e(rank).
b. Estimation commands that allow constraints now post the
constraints matrix in matrix e(Cns).
c. In many estimation commands allowing constraints, scalar
e(k_autoCns) is now posted containing the sum of the number of
base, empty, and omitted constraints. Programming command
makecns saves the same count in r(k_autoCns).
d. Programming command makecns now saves the constraints used in
macro r(clist).
e. Estimation commands that allow factor variables now post in
macro e(asbalanced) the name of each factor variable
participating in e(b) that was fvset design asbalanced and
post in macro e(asobserved) the name of each factor variable
participating in e(b) that was fvset design asobserved.
f. Estimation commands now post in macros how new command
margins is to treat their prediction statistics when the
statistics require special treatment. These macros are
e(marginsok), e(marginsnotok), and e(marginsprop).
e(marginsok) specifies the name of predictors that are to be
allowed and that appear to violate margins' usual rules, such
as dependent variables being involved in the calculation.
e(marginsnotok) lists statistics that margins fails to identify
as violating its assumptions but that do violate them and so
should not be allowed.
e(marginsprop) provides special signals as to how statistics
for the estimator must be handled. Currently allowed are
combinations of addcons, noeb, and nochainrule. addcons
means that the estimated equations have no constant even if
the user did not specify noconstant at estimation time. noeb
means that the estimator does not store the covariate names
in the column names of e(b). nochainrule means that the
chain rule may not be used to calculate derivatives.
g. Matrix e(V_modelbased), the model-based VCE, is now posted by
most estimation commands that allow robust variance
estimation by bootstrap and jackknife.
h. Existing command sktest now returns in matrix r(N) the matrix
of observation counts and in matrix r(Utest) the matrix of
test results.
5. Existing command estimates describe using now saves in scalar
r(nestresults) the number of sets of estimation results saved in
the .ster file.
6. Existing command correlate saves in matrix r(C) the correlation
or covariance matrix.
7. Existing command ml has been rewritten. It is now implemented in
terms of new Mata function and optimization engine moptimize().
The new ml handles automatic or implied constraints, posts some
additional information to e(), and allows evaluators written in
Mata as well as ado. See [R] maximize for an overview and see
[R] ml and [M-5] moptimize().
8. Existing command estimates save now has option append, which
allows storing more than one set of estimation results in the
same file; see [R] estimates save.
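A short sketch of items 5 and 8 together, using the auto dataset
shipped with Stata and a hypothetical filename myresults: two sets
of results are stored in one file, and the second set is then
described and reloaded by number.

    . sysuse auto
    . regress mpg weight
    . estimates save myresults, replace
    . regress mpg weight foreign
    . estimates save myresults, append
    . estimates describe using myresults, number(2)
    . estimates use myresults, number(2)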
9. Existing commands ereturn post and ereturn repost now work with
more commands, including logit, mlogit, ologit, oprobit, probit,
qreg, _qreg, regress, stcox, and tobit. Also, ereturn post and
ereturn repost now allow weights to be specified and save them in
e(wtype) and e(wexp). See [P] ereturn.
10. Existing command markout has new option sysmissok, which excludes
observations with variables equal to extended missing (.a, .b,
..., .z) but not to system missing (.); see [P] mark. This has to
do with the new emphasis on imputation of missing values; see [MI]
intro.
11. New commands varabbrev and novarabbrev make it easy to temporarily
reset whether Stata allows variable-name abbreviations; see [P]
varabbrev.
12. New programming function smallestdouble() returns the smallest
double-precision number greater than zero; see [FN] Programming
functions.
13. creturn has new returned values:
a. c(noisily) returns 0 when output is being suppressed and 1
otherwise. Thus programmers can avoid executing code whose
only purpose is to display output.
b. c(smallestdouble) returns the smallest double-precision value
that is greater than 0.
c. c(tmpdir) returns the temporary directory being used by
Stata.
d. c(eqlen) returns the maximum length that Stata allows for
equation names.
14. Existing extended macro function :dir has new option respectcase,
which causes :dir to respect uppercase and lowercase when
performing filename matches. This option is relevant only for
Windows.
15. Stata has new string functions strtoname(), soundex(), and
soundex_nara(); see [FN] String functions.
16. Stata has 17 new numerical functions: sinh(), cosh(), asinh(),
and acosh(); hypergeometric() and hypergeometricp(); nbinomial(),
nbinomialp(), and nbinomialtail(); invnbinomial() and
invnbinomialtail(); poisson(), poissonp(), and poissontail();
invpoisson() and invpoissontail(); and binomialp(); see [FN]
Trigonometric functions and [FN] Statistical functions.
17. Stata has nine new random-variate functions for beta, binomial,
chi-squared, gamma, hypergeometric, negative binomial, normal,
Poisson, and Student's t: rbeta(), rbinomial(), rchi2(),
rgamma(), rhypergeometric(), rnbinomial(), rnormal(), rpoisson(),
and rt(), respectively. Also, old function uniform() is renamed
runiform(). All random-variate functions start with r. See [FN]
Random-number functions.
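For example, the following sketch generates a few variates in an
empty dataset; the number of observations, the seed, the variable
names, and the distribution parameters are all arbitrary:

    . clear
    . set obs 1000
    . set seed 12345
    . generate u = runiform()
    . generate z = rnormal()
    . generate w = rnormal(10, 2)
    . generate k = rpoisson(3)
    . generate b = rbinomial(10, 0.3)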
18. Existing command clear has new syntax clear matrix, which clears
(drops) all Stata matrices, as distinguished from clear mata,
which drops all Mata matrices and functions. See [D] clear.
19. Commands intended for end users are now often called as
subroutines by other end-user commands. Some of these commands
preserve the data only so that, should something go wrong or the
user press Break, the original data can be restored. When such a
command is called as a subroutine, the caller has often already
preserved the data, so preserving again is wasted effort.
Programmers are therefore asked to add option nopreserve to any
command that preserves the data solely for error recovery;
callers can then skip the redundant preserve and speed execution.
See [P] nopreserve option.
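One way a command might honor this request is sketched below. The
command name keepstats and its body are hypothetical; only the handling
of the nopreserve option is the point:

```stata
* keepstats is a hypothetical command name used only for illustration
capture program drop keepstats
program define keepstats, rclass
    version 11
    syntax varname(numeric) [, noPREServe]
    * Local macro preserve is empty unless the caller typed nopreserve
    if "`preserve'" == "" {
        preserve
    }
    quietly keep if !missing(`varlist')
    quietly count
    return scalar N = r(N)
end

sysuse auto, clear
keepstats rep78               // data are restored automatically on exit
preserve
keepstats rep78, nopreserve   // caller already preserved; no second copy
restore
```

When nopreserve is specified, the command changes the data in memory
and relies on the caller's own preserve to undo the change.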
What's new in Mata
1. Mata now allows full object-oriented programming! A class is a
set of variables, related functions, or both tied together under
one name. One class can be derived from another via inheritance.
Variables can be public, private, protected, or static.
Functions can be public, private, protected, static, or virtual.
Members, whether variables or functions, can be final. Classes,
member functions, and access to member variables and calls to
member functions are fully compiled -- not interpreted -- meaning
there is no speed penalty for casting your program in terms of a
class. See [M-2] class.
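A minimal sketch of a class, inheritance, and a virtual function
follows. The names point, point3, and classdemo() are hypothetical and
chosen only for illustration:

```stata
clear mata    // start from an empty Mata workspace (drops existing Mata objects)
mata:
class point {
    public real scalar x, y
    public void setup()
    public virtual real scalar dist()
}
void point::setup(real scalar x0, real scalar y0)
{
    x = x0
    y = y0
}
real scalar point::dist()
{
    return(sqrt(x^2 + y^2))
}
// point3 inherits x, y, and setup() and overrides dist()
class point3 extends point {
    public real scalar z
    public virtual real scalar dist()
}
real scalar point3::dist()
{
    return(sqrt(x^2 + y^2 + z^2))
}
void classdemo()
{
    class point3 scalar p
    p.setup(3, 4)
    p.z = 12
    printf("distance = %g\n", p.dist())  // sqrt(9 + 16 + 144) = 13
}
classdemo()
end
```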
2. The new moptimize() suite of functions comprises Stata's new
optimization engine used by ml and thereby, either directly or
indirectly, by nearly all official Stata estimation commands.
moptimize() provides full support for Stata's new factor
variables. See [M-5] moptimize(), [R] ml, and [R] maximize.
moptimize() is important. The full story is that Stata's ml is
implemented in terms of Mata's moptimize(), which in turn is
implemented in terms of Mata's optimize(). optimize() finds
parameters p = (p_1, p_2, ..., p_n) that maximize or minimize
f(p). moptimize() finds coefficients b = (b_1, b_2, ..., b_n),
where p_1 = X_1b_1, p_2 = X_2b_2, ..., p_n = X_nb_n.
3. New function suite deriv() produces numerically calculated first
and second derivatives of vector functions; see [M-5] deriv().
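As a minimal sketch, the following computes numerical first and second
derivatives of f(x) = exp(-x^2 + x - 3) at x = 0. The evaluator name
d_eval() and the test function are assumptions made for illustration:

```stata
mata:
// d_eval is a hypothetical evaluator name used only for illustration
void d_eval(x, y)
{
    y = exp(-x^2 + x - 3)
}
D = deriv_init()
deriv_init_evaluator(D, &d_eval())
deriv_init_params(D, 0)
deriv(D, 1)     // first derivative at x = 0; analytically exp(-3)
deriv(D, 2)     // second derivative at x = 0; analytically -exp(-3)
end
```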
4. Improvements have been made to optimize():
a. optimize() with constraints is now faster for evaluator types
d0 and v0 and for all gradient-based techniques. Also, it is
faster for evaluator types d1 and v1 when used with
constraints and with the nr (Newton-Raphson) technique.
b. Gauss-Newton optimization, also known as quadratic
optimization, is now available as technique gn. Evaluator
functions must be of type 'q'.
c. optimize() can now switch between techniques bhhh, nr, bfgs,
and dfp (between Berndt-Hall-Hall-Hausman, Newton-Raphson,
Broyden-Fletcher-Goldfarb-Shanno, and
Davidon-Fletcher-Powell).
d. optimize(), when output of the convergence values is
requested in the trace log, now displays the identity and
value of the convergence criterion that is closest to being
met.
e. optimize() has 15 new initialization functions:
optimize_init_cluster()            optimize_init_trace_dots()
optimize_init_colstripe()          optimize_init_trace_gradient()
optimize_init_conv_ignorenrtol()   optimize_init_trace_Hessian()
optimize_init_conv_warning()       optimize_init_trace_params()
optimize_init_evaluations()        optimize_init_trace_step()
optimize_init_gnweightmatrix()     optimize_init_trace_tol()
optimize_init_iterid()             optimize_init_trace_value()
optimize_init_negH()
Also, new function optimize_result_evaluations() reports the
number of times the evaluator is called.
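The sketch below maximizes f(p) = exp(-p^2 + p - 3) and uses two of the
new functions to trace the parameters and count evaluator calls. The
evaluator name o_eval() and the objective function are assumptions made
for illustration:

```stata
mata:
// o_eval is a hypothetical evaluator name used only for illustration
void o_eval(todo, p, v, g, H)
{
    v = exp(-p^2 + p - 3)            // default type d0: value only
}
S = optimize_init()
optimize_init_evaluator(S, &o_eval())
optimize_init_params(S, 0)
optimize_init_evaluations(S, "on")   // count evaluator calls
optimize_init_trace_params(S, "on")  // show parameters in the trace log
p = optimize(S)
p                                    // maximizer; analytically 0.5
optimize_result_evaluations(S)       // number of evaluator calls
end
```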
5. Existing functions st_data() and st_view() now allow the
variables to be specified as a string scalar with space-separated
names, as well as a string row vector with elements being names.
In addition, when a string scalar is used, you can now specify
time-series-operated variables (for example, l.gnp), factor
variables (for example, i.rep78), or both.
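A brief sketch using the auto dataset shipped with Stata (the choice of
dataset and variables is an assumption for illustration):

```stata
sysuse auto, clear
mata:
// Variables given as one space-separated string scalar
X = st_data(., "price mpg")
rows(X), cols(X)
// The string-scalar form also accepts factor variables
Z = st_data(., "mpg i.rep78")
rows(Z), cols(Z)
end
```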
6. Thirty-four LAPACK (Linear Algebra PACKage) functions are now
available in as-is form and more are coming. LAPACK is the
premier software for solving systems of simultaneous equations,
eigenvalue problems, and singular value decompositions. Many of
Mata's matrix functions are and have been implemented using
LAPACK. We are now in the process of making all the
double-precision LAPACK real and complex functions available in
raw form for those who want to program their own advanced
numerical techniques. See [M-5] lapack() and [R] copyright
lapack.
7. New function suite eigensystemselect() computes the eigenvectors
for selected eigenvalues; see [M-5] eigensystemselect().
8. New function suite geigensystem() computes generalized
eigenvectors and eigenvalues; see [M-5] geigensystem().
9. New function suites hessenbergd() and ghessenbergd() compute the
(generalized) Hessenberg decompositions; see [M-5] hessenbergd()
and [M-5] ghessenbergd().
10. New function suites schurd() and gschurd() compute the
(generalized) Schur decompositions; see [M-5] schurd() and [M-5]
gschurd().
11. New function _negate() quickly negates a matrix in place; see
[M-5] _negate().
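For example, with an arbitrary 2 x 2 matrix:

```stata
mata:
A = (1, -2 \ 3, 4)
_negate(A)      // A is overwritten with -A
A
end
```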
12. New functions Dmatrix(), Kmatrix(), and Lmatrix() compute the
duplication matrix, commutation matrix, and elimination matrix
used in computing derivatives of functions of symmetric matrices;
see [M-5] Dmatrix(), [M-5] Kmatrix(), and [M-5] Lmatrix().
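The defining identities of the three matrices can be checked directly.
The matrices A and B below are arbitrary examples:

```stata
mata:
A = (1, 2, 3 \ 2, 4, 5 \ 3, 5, 6)        // symmetric 3 x 3
B = (1, 2, 3 \ 4, 5, 6)                  // general 2 x 3
assert(Dmatrix(3)*vech(A) == vec(A))     // duplication: vech(A) to vec(A)
assert(Lmatrix(3)*vec(A) == vech(A))     // elimination: vec(A) to vech(A)
assert(Kmatrix(2, 3)*vec(B) == vec(B'))  // commutation: vec(B) to vec(B')
end
```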
13. New function sublowertriangle() extracts the lower triangle of a
matrix, where lower triangle means below a specified diagonal;
see [M-5] sublowertriangle().
14. New function hasmissing() returns whether a matrix contains any
missing values; see [M-5] missing().
15. New function strtoname() performs the same actions as Stata's
strtoname() function: it converts a general string to a string
meeting the Stata naming conventions. See [M-5] strtoname().
16. New function abbrev() performs the same actions as Stata's
abbrev() function: it returns abbreviated variable names. See
[M-5] abbrev().
17. New function _st_tsrevar() is a handle-the-error-yourself
variation of existing function st_tsrevar(); see [M-5]
st_tsrevar().
18. Existing functions ghk() and ghkfast(), which evaluate
multivariate normal integrals, have improved syntax; see [M-5]
ghk() and [M-5] ghkfast().
19. Existing functions vec() and vech() are now faster for both real
and complex matrices; see [M-5] vec().
20. Mata has 13 new distribution-related functions: hypergeometric()
and hypergeometricp(); nbinomial(), nbinomialp(), and
nbinomialtail(); invnbinomial() and invnbinomialtail();
poisson(), poissonp(), and poissontail(); invpoisson() and
invpoissontail(); and binomialp(); see [M-5] normal().
21. Mata has nine new random-variate functions for beta, binomial,
chi-squared, gamma, hypergeometric, negative binomial, normal,
Poisson, and Student's t: rbeta(), rbinomial(), rchi2(),
rgamma(), rhypergeometric(), rnbinomial(), rnormal(), rpoisson(),
and rt(), respectively.
Also, rdiscrete() is provided for drawing from a general discrete
distribution.
Old functions uniform() and uniformseed() are replaced with
runiform() and rseed(). All random-variate functions start with
r. See [M-5] runiform().
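In Mata, the functions take the dimensions of the result as their
leading arguments. A short sketch, with an arbitrary seed and arbitrary
distribution parameters:

```stata
mata:
rseed(12345)                          // set the random-number seed
u = runiform(2, 3)                    // 2 x 3 matrix of uniform variates
z = rnormal(5, 1, 0, 1)               // 5 x 1 normal, mean 0, sd 1
d = rdiscrete(10, 1, (.2 \ .3 \ .5))  // 10 draws from {1, 2, 3}
u
z
d
end
```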
22. Existing functions sinh(), cosh(), asinh(), and acosh() now have
improved accuracy; see [M-5] sin().
23. New function soundex() returns the soundex code for a name; the
code consists of a letter followed by three digits. New function
soundex_nara() returns the U.S. Census soundex code for a name;
it also consists of a letter followed by three digits but is
produced by a different algorithm. See [M-5] soundex().
24. Existing function J(r, c, val) now allows val to be specified as
a matrix and creates an r*rows(val) x c*cols(val) result. The
third argument, val, was previously required to be 1 x 1.
Behavior in the 1 x 1 case is unchanged. See [M-5] J().
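For example, tiling an arbitrary 2 x 2 matrix in a 2 x 3 pattern gives
a 4 x 6 result:

```stata
mata:
B = (1, 2 \ 3, 4)
T = J(2, 3, B)          // 2 x 3 grid of copies of B
rows(T), cols(T)        // 4 and 6
T
end
```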
25. Existing functions sort(), _sort(), and order() previously could
sort the rows of a matrix on at most 500 of its columns. This
limit has been removed. See [M-5] sort().
26. New function asarray() provides associative arrays; see [M-5]
asarray().
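A minimal sketch of storing and retrieving values by string key; the
keys and values are arbitrary:

```stata
mata:
A = asarray_create()            // keys are string scalars by default
asarray(A, "alpha", 1)          // store a scalar under key "alpha"
asarray(A, "beta", (1, 2, 3))   // store a row vector under key "beta"
asarray(A, "beta")              // retrieve by key
asarray_contains(A, "gamma")    // 0 because the key is not present
asarray_elements(A)             // number of stored elements
end
```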
27. New function hash1() provides Jenkins' one-at-a-time hash
function; see [M-5] hash1().
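A brief sketch with an arbitrary input string; the second call assumes
the optional second argument scales the result to a bucket number from
1 to n:

```stata
mata:
hash1("stata")          // integer hash of the string
hash1("stata", 10)      // hash reduced to a bucket number from 1 to 10
end
```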
28. Mata object-code libraries (.mlib files) may now contain up to
2,048 functions; the default limit is 1,024. Use mlib create's
new size() option to change the default. The previous fixed
maximum was 500. See [M-3] mata mlib.
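As a sketch, the following creates a library in the current directory
with the largest allowed capacity. The library name lcube and the
function cube() are hypothetical:

```stata
mata:
// cube() is a hypothetical function used only for illustration
real scalar cube(real scalar x) return(x^3)
mata mlib create lcube, size(2048) replace  // room for 2,048 functions
mata mlib add lcube cube()
mata mlib index
end
```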
29. Mata on 64-bit computers now supports matrices larger than 2
gigabytes when the computer has sufficient memory.
30. One hundred and nine existing functions now take advantage of
multiple cores when using Stata/MP. They are
acos()            factorial()         minutes()
arg()             Fden()              mm()
asin()            floatround()        mmC()
atan2()           floor()             mod()
atan()            Ftail()             mofd()
betaden()         gammaden()          month()
binomial()        gammap()            msofhours()
binomialtail()    gammaptail()        msofminutes()
binormal()        halfyear()          msofseconds()
ceil()            hh()                nbetaden()
chi2()            hhC()               nchi2()
chi2tail()        hofd()              nFden()
Cofc()            hours()             nFtail()
cofC()            ibeta()             nibeta()
Cofd()            ibetatail()         normal()
cofd()            invbinomial()       normalden()
comb()            invbinomialtail()   npnchi2()
cos()             invchi2()           qofd()
day()             invchi2tail()       quarter()
dgammapda()       invF()              round()
dgammapdada()     invFtail()          seconds()
dgammapdadx()     invgammap()         sin()
dgammapdx()       invgammaptail()     sqrt()
dgammapdxdx()     invibeta()          ss()
digamma()         invibetatail()      tan()
dofC()            invnchi2()          tden()
dofc()            invnFtail()         trigamma()
dofh()            invnibeta()         trunc()
dofm()            invnormal()         ttail()
dofq()            invttail()          week()
dofw()            ln()                wofd()
dofy()            lnfactorial()       year()
dow()             lngamma()           yh()
doy()             lnnormal()          ym()
exp()             lnnormalden()       yq()
F()               mdy()               yw()
What's more
We have not listed all the changes, but we have listed the important
ones.
Stata is continually being updated, and those updates are available for
free over the Internet. All you have to do is type
. update query
and follow the instructions.
To learn what has been added since this manual was printed, select Help >
What's New? or type
. help whatsnew
We hope that you enjoy Stata 11.
--- previous updates ----------------------------------------------------------
See whatsnew10.
-------------------------------------------------------------------------------