__What's new in release 9.0 (compared with release 8)__

This file lists the changes corresponding to the creation of Stata
release 9.0:

+---------------------------------------------------------------+
| help file       contents                      years           |
|---------------------------------------------------------------|
| whatsnew        Stata 15.0 and 15.1           2017 to present |
| whatsnew14to15  Stata 15.0 new release        2017            |
| whatsnew14      Stata 14.0, 14.1, and 14.2    2015 to 2017    |
| whatsnew13to14  Stata 14.0 new release        2015            |
| whatsnew13      Stata 13.0 and 13.1           2013 to 2015    |
| whatsnew12to13  Stata 13.0 new release        2013            |
| whatsnew12      Stata 12.0 and 12.1           2011 to 2013    |
| whatsnew11to12  Stata 12.0 new release        2011            |
| whatsnew11      Stata 11.0, 11.1, and 11.2    2009 to 2011    |
| whatsnew10to11  Stata 11.0 new release        2009            |
| whatsnew10      Stata 10.0 and 10.1           2007 to 2009    |
| whatsnew9to10   Stata 10.0 new release        2007            |
| whatsnew9       Stata 9.0, 9.1, and 9.2       2005 to 2007    |
| **this file**   Stata 9.0 new release         2005            |
| whatsnew8       Stata 8.0, 8.1, and 8.2       2003 to 2005    |
| whatsnew7to8    Stata 8.0 new release         2003            |
| whatsnew7       Stata 7.0                     2001 to 2002    |
| whatsnew6to7    Stata 7.0 new release         2000            |
| whatsnew6       Stata 6.0                     1999 to 2000    |
+---------------------------------------------------------------+

Most recent changes are listed first.

--- **more recent updates** -------------------------------------------------------

See whatsnew9.

--- **Stata 9.0 release 22apr2005** -----------------------------------------------

__Remarks__

Some of the important new additions include

1. New matrix programming language Mata.

2. New survey features, including balanced repeated replications
(BRR) and jackknife variance estimates, complete support for
multistage designs, and poststratification.

3. Estimation of linear mixed models, including standard errors and
confidence intervals for all variance components.

4. Estimation of multinomial probit models, including support for
several correlation structures and for user-defined structures.

5. New multivariate analysis, including multidimensional scaling,
correspondence analysis, and Procrustean analysis, along with the
ability to analyze proximity matrices as well as raw data.

6. Improved GUI, including multiple Do-file Editors, multiple
Viewers, and multiple Graph windows; multiple windowing
preferences; dockable windows; and much more.

There are other major features, and it will take us another 30 pages to
mention everything.

What's new is presented under the headings

**New matrix language**

**Survey statistics**

**Longitudinal/panel data**

**Time-series statistics**

**Multivariate statistics**

**Survival analysis**

**General-purpose statistics**

**New ML features**

**Functions and expressions**

**Data management**

**Graphics**

**User interface**

**Programming**

**Documentation**

__What's new: New matrix language__

Stata has an all-new matrix language called Mata, which is the subject of
its own manual, **[M] The Mata Reference Manual**. Mata can be used by those
who want to think in matrix terms and perform matrix calculations
interactively, and it can be used by programmers who want to add features
to Stata.

Mata has been used to implement many of the new features found in this
release. Mata is compiled, optimized, and fast.

Stata's previously existing **matrix** command continues to be documented.
There is an admittedly uneasy relationship between the two, but **matrix**
continues to have its uses. For serious computation, however, you will
definitely want to use the new language.

See **[M-0] intro** -- or **help** **mata** -- which provides an introduction and
organized reading list. The first thing you will read is **[M-1] first**.

__What's new: Survey statistics__

Stata 9 substantially extends Stata's survey-analysis and
correlated-data-analysis facilities by adding the remaining two methods
of computing standard errors -- Balanced Repeated Replications (BRR) and
survey jackknife.
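
To make the delete-one jackknife concrete, here is a minimal Python sketch. It illustrates the textbook estimator only and is not Stata's implementation. It assumes a stratified design in which each primary sampling unit (PSU) has already been reduced to its weighted total, takes deviations from the full-sample estimate, and ignores finite-population corrections. The function name and the data layout are hypothetical.

```python
def jackknife_total_variance(strata):
    """Delete-one-PSU jackknife variance of an estimated population total.

    strata: list of lists; each inner list holds the weighted totals of the
    PSUs in one stratum (a hypothetical layout chosen for this sketch).
    Returns (estimated_total, jackknife_variance).
    """
    full = sum(sum(psus) for psus in strata)
    variance = 0.0
    for psus in strata:
        n_h = len(psus)
        if n_h < 2:
            raise ValueError("each stratum needs at least two PSUs")
        stratum_total = sum(psus)
        rest = full - stratum_total  # contribution of all other strata
        for z in psus:
            # Drop one PSU and scale up the remaining PSUs in its stratum.
            replicate = rest + n_h / (n_h - 1) * (stratum_total - z)
            variance += (n_h - 1) / n_h * (replicate - full) ** 2
    return full, variance


total, var = jackknife_total_variance([[10.0, 14.0], [3.0, 5.0, 7.0]])
print(total, var)
```

For a linear statistic such as a total, this jackknife coincides with the usual with-replacement linearization variance; the two generally differ for nonlinear statistics such as ratios.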

Stata 9 also adds complete support for multistage sampling and
poststratification.

A new, unified syntax is used for declaring the design of survey data and
for fitting models. For an overview of all survey facilities, see **[SVY]**
**survey**.

All the old syntax continues to work under version control (the old
survey estimation commands do not even require version control), but if
you use the old syntax, the new features will not be available.

1. Existing command **svyset** for declaring the survey design has new
syntax that supports a host of new features in Stata's
survey-analysis facilities:

a. BRR and jackknife variance estimators have been added to the
previously available linearization variance estimator.
Moreover, use of BRR or jackknife (or linearization) can now be
specified when you **svyset** or at estimation time.

b. Multistage designs can now be declared, and they may have
primary, secondary, and lower-stage sampling units. The
linearization variance estimator takes complete advantage of the
information in multistage designs.

c. Stratification is now allowed in all stages, making variance
estimates more efficient wherever stratification can be
exploited.

d. Poststratification is now available. When demographic or other
groupings are known, poststratification adjusts the weights to
match them, which, like stratification, makes variance estimates
more efficient and also accounts for biases.

e. Finite-population corrections are now allowed in all stages.

f. Sampling weights are handled under all three variance
estimators.

For details, see **[SVY] svyset**. The previous **svyset** syntax continues
to work under version control.

2. New prefix command **svy:** is how you tell estimators you have survey
data. You no longer type **svyregress**; you type **svy: regress**. This
is not just a matter of style; **svy** really is a prefix command, and
in fact, you can even use it as a prefix on estimation commands you
write. In addition, **svy:** provides a standard, unified syntax for
accessing Stata's survey features. **svy:** is easy to use because it
automatically applies everything you have previously **svyset**,
including the design.

The following estimators can be used with the **svy:** prefix:

**Descriptive statistics**

**svy: mean**             Population and subpopulation means
**svy: proportion**       Population and subpopulation proportions
**svy: ratio**            Population and subpopulation ratios
**svy: total**            Population and subpopulation totals

**svy: tabulate oneway**  One-way tables for survey data
**svy: tabulate twoway**  Two-way tables for survey data

**Regression models**

**svy: regress**          Linear regression
**svy: ivreg**            Instrumental variables regression
**svy: intreg**           Interval and censored regression

**svy: logistic**         Logistic regression, reporting odds ratios
**svy: logit**            Logistic regression, reporting coefficients
**svy: probit**           Probit regression

**svy: mlogit**           Multinomial logistic regression
**svy: ologit**           Ordered logistic regression
**svy: oprobit**          Ordered probit models

**svy: poisson**          Poisson regression
**svy: nbreg**            Negative binomial regression
**svy: gnbreg**           Generalized negative binomial regression

**svy: heckman**          Heckman selection model
**svy: heckprob**         Probit model with selection

Previously existing survey-estimation commands, such as **svyregress**,
**svymean**, and **svypoisson**, continue to work as they did before, but
only if your survey design is declared using **version 8: svyset** or if
you are working with an old Stata 8 dataset. For a mapping from old
estimation commands to the new syntax, see svy8. (The new prefix
**svy:** works with datasets that were **svyset** under an earlier release
of Stata.)

In addition to the three variance estimators and support for
multistage sampling, the new **svy:** prefix provides other
enhancements, including

a. Option **subpop()** allows more flexible selection of
subpopulations, meaning that more general **if** conditions are now
allowed.

b. Strata with only one sampling unit (sometimes called singleton
PSUs) are now handled better -- the coefficients are now
reported, but with missing standard errors. **svydes** can now be
used to find and describe these strata; see **[SVY] svydes**.

c. With BRR variance estimation, a Hadamard matrix can be used in
place of BRR weights, and Fay's adjustment may be specified; see
**[SVY]** *brr_options*.

3. New command **svy:** **proportion** replaces **svyprop**. (By the way, new
command **proportion** can be used without the **svy:** prefix; see **[R]**
**proportion**.) Unlike **svyprop**, **svy:** **proportion** is an estimation
command and computes a full covariance matrix for all the estimated
proportions, allowing postestimation features, such as tests of
linear and nonlinear combinations of proportions (**test** and **testnl**)
or creation of linear and nonlinear combinations with confidence
intervals (**lincom** and **nlcom**).

4. New commands **ratio**, **total**, and **mean**, used with the **svy:** prefix, use
casewise deletion and estimate full covariance matrices for the
estimates.

5. New command **svy: tabulate oneway** addresses a missing feature.
Previously, anyone wanting a one-way tabulation had to create a
constant and perform two-way survey tabulation with that constant.

6. New command **estat** computes and reports additional statistics and
information after estimation with the **svy:** prefix:

a. **estat** **svyset** reports complete information on the survey design.

b. **estat** **effects** computes and reports the design effects -- DEFF
and DEFT -- and the misspecification effects -- MEFF and MEFT --
in any combination for each estimated parameter.

c. **estat** **effects** can also compute DEFF and DEFT for subpopulations
using simple random-sample estimates from either the overall
population or from the subpopulation. **estat** **effects** replaces
and extends the **deff**, **deft**, **meff**, and **meft** options previously
available on survey estimators.

d. **estat** **lceffects** computes and reports the survey design effects
and misspecification effects for any linear combination of
estimated parameters.

e. **estat** **size** reports the sample and population sizes for each
subpopulation after **svy:** **mean**, **svy:** **proportion**, **svy:** **ratio**, and
**svy:** **total**.

For details on **estat** after survey estimation, see **[SVY] estat**.

7. Existing command **svydes** has several new features and options:

a. New option **stage()** lets you select the sampling stage for which
sample statistics are to be reported.

b. New option **generate()** identifies strata with a single sampling
unit.

c. New option **finalstage** replaces **bypsu** and reports observation
sample statistics by sampling unit in the final stage.

8. New options **stdize()** and **stdweight()** on commands **svy: mean**, **svy:**
**ratio**, **svy: proportion**, **svy: tabulate oneway**, and **svy: tabulate**
**twoway** allow direct standardization of means, ratios, proportions,
and tabulations using any of the three survey variance estimators.

9. Programmers of estimation commands can get full support for
estimation with survey and correlated data almost automatically.
This support includes correct treatment of multistage designs,
weighting, stratification, poststratification, and finite-population
corrections, as well as access to all three variance estimators.
See **[P] program properties**.

10. The [SVY] manual now has a glossary that defines commonly used terms
in survey analysis and explains how these terms are used in the
manual; see **[SVY] Glossary**.
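
The weight adjustment behind poststratification (item 1d above) can be sketched in a few lines of Python. This is only an illustration of the general idea, not Stata's code: within each poststratum, the sampling weights are rescaled so that they sum to the known population count. The function name, the labels, and the counts are hypothetical.

```python
def poststratify(weights, groups, population_sizes):
    """Rescale sampling weights so that they sum to known group totals.

    weights: sampling weight per observation
    groups: poststratum label per observation
    population_sizes: dict mapping each label to its known population count
    """
    weight_sums = {}
    for w, g in zip(weights, groups):
        weight_sums[g] = weight_sums.get(g, 0.0) + w
    # Each weight is multiplied by (known count) / (current weight sum).
    return [w * population_sizes[g] / weight_sums[g]
            for w, g in zip(weights, groups)]


adjusted = poststratify([10.0, 10.0, 20.0], ["a", "a", "b"], {"a": 30, "b": 40})
print(adjusted)
```

After the adjustment, the weights in group "a" sum to 30 and those in group "b" sum to 40, matching the assumed population counts.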

__What's new: Longitudinal/panel data__

1. The big news is new command **xtmixed** -- Stata now fits linear mixed
models, also known as hierarchical models or multilevel models.

Mixed models include what social scientists call random-effects
models (one-way, two-way, multi-way, and hierarchical models) as
well as random-coefficient models.

Estimates are obtained using maximum likelihood (ML) or restricted
maximum likelihood (REML), with the expectation-maximization (EM)
algorithm used in the optimization.
Covariances among random effects are estimated and may be
independent (no covariance), exchangeable (common covariance), or
unstructured (unique covariance for each pair of effects).

**xtmixed** estimates standard errors and confidence intervals for the
fixed parameters, and it estimates the standard deviations
(variances) and correlations (covariances) of the random effects and
the full VCE matrix among them.

For details, see **[XT] xtmixed**.

After estimation with **xtmixed**,

a. **estat** **recovariance** reports the estimated variance-covariance
matrix of the random effects for each level.

b. **estat** **group** summarizes the composition of the nested groups,
providing minimum, average, and maximum group size for each
level in the model.

**predict** after **xtmixed** can compute best linear unbiased predictions
(BLUPs) for each random effect. It can also compute the linear
predictor, the standard error of the linear predictor, the fitted
values (linear predictor plus contributions of random effects), the
residuals, and the standardized residuals.

2. New features have been added to the maximum-likelihood estimators
that do not have closed-form solutions and require numeric
evaluation of the likelihood. These estimators include **xtlogit**,
**xtprobit**, **xtpoisson**, **xtcloglog**, **xtintreg**, and **xttobit**.

a. The likelihood may now be approximated using adaptive
Gauss-Hermite quadrature (the new default) or nonadaptive
quadrature (the previous default). Adaptive quadrature
substantially increases the accuracy of the approximation,
particularly on difficult problems such as data with large panel
sizes or data with a large variance for the random effects.

b. Linear constraints may now be imposed using the new option
**constraints()**. Constraints are specified the standard way; see
**[R] constraint**.

c. New option **intpoints()** replaces old option **quad()**, although
**quad()** continues to work. The new name is more meaningful,
especially when used with estimators that integrate likelihoods
using methods other than quadrature.

3. Existing command **xtreg** now allows options **robust** and **cluster()** when
estimating fixed-effects (FE) and random-effects (RE) models; see
**[XT] xtreg**.

4. Most **[XT]** commands that previously did not allow time-series
operators now support them. These commands include **xtgls**, **xtreg**,
**xtsum**, **xtcloglog**, **xtintreg**, **xtlogit**, **xtpoisson**, **xtprobit**, **xttobit**,
and **xtgee**.

5. New command **xtrc** is old command **xtrchh**, renamed, and with new
features. New option **beta** reports the best linear unbiased
predictors (BLUPs) for the group-specific coefficients, along with
their standard errors and confidence intervals. For details, see
**[XT] xtrc**.

6. **predict** after **xtrc** has the new option **group()** to compute the BLUPs
of the dependent variable using the BLUPs of the coefficients.

7. New command **xtline** plots panel data and allows either overlaid or
separate graphs for each panel; see **[XT] xtline**.

8. New section **[XT]** **Glossary** defines commonly used terms and
explains how they are used in the manual.
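
As an aside on the BLUPs mentioned in item 1, the simplest case can be worked by hand. The Python sketch below assumes a random-intercept model y_ij = mu + u_j + e_ij and treats mu and the two variance components as known; in practice they are estimated, and this is not a description of how **xtmixed** computes its predictions. The function and argument names are hypothetical.

```python
def random_intercept_blup(group_values, mu, var_u, var_e):
    """Predicted random intercept for one group in y_ij = mu + u_j + e_ij.

    mu, var_u (variance of u_j), and var_e (variance of e_ij) are treated
    as known here; in practice they would be estimated from the data.
    """
    n_j = len(group_values)
    group_mean = sum(group_values) / n_j
    shrinkage = var_u / (var_u + var_e / n_j)  # between 0 and 1
    return shrinkage * (group_mean - mu)


u_hat = random_intercept_blup([12.0, 14.0, 16.0], mu=10.0, var_u=4.0, var_e=6.0)
print(u_hat)
```

The prediction is the group's mean deviation from mu, shrunk toward zero; the shrinkage is stronger for small groups and for a small random-effect variance.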

__What's new: Time-series statistics__

1. Existing command **arima** can now estimate multiplicative seasonal
ARIMA (SARIMA) models; see new options **sarima()**, **mar()**, and **mma()** in
**[TS] arima**.

2. New command **rolling** performs rolling-window or recursive
estimations, including regressions, and collects statistics from the
estimation on each window; see **[TS] rolling**.

3. The **[TS]** manual now has a glossary that defines commonly used terms
in time-series analysis and explains how we use them in the manual;
see **[TS]** **Glossary**.

4. Many existing commands that previously did not allow time-series
operators now do. These commands include **areg**, **binreg**, **biprobit**,
**boxcox**, **cloglog**, **cnsreg**, **glm**, **heckman**, **heckprob**, **hetprob**, **impute**,
**intreg**, **logistic**, **logit**, **lowess**, **mvreg**, **nbreg**, **orthog**, **pcorr**,
**poisson**, **probit**, **pwcorr**, **rreg**, **testparm**, **treatreg**, **truncreg**,
**xtcloglog**, **xtgls**, **xtintreg**, **xtlogit**, **xtpoisson**, **xtprobit**, **xtgee**,
**xtreg**, **xtsum**, and **xttobit**.

5. Many commands requiring time-series data will now work on a single
panel from a panel dataset when that panel is selected using an **if**
expression or an **in** qualifier. Those commands include **ac**, **corrgram**,
**cumsp**, **dfgls**, **dfuller**, **pac**, **pergram**, **pperron**, **wntestb**, **wntestq**, and
**xcorr**. New commands **estat** **archlm**, **estat** **bgodfrey**, **estat** **dwatson**,
and **estat** **durbinalt**, which replace commands **archlm**, **bgodfrey**,
**dwstat**, and **durbina**, also work on a single panel from a panel
dataset.

6. The dialogs for analyzing IRF results are much improved. The
dialogs now populate lists of models and variables from the current
IRF results that may be chosen for producing tables and graphs. The
improved dialogs include **db irf cgraph**, **db irf ctable**, **db irf graph**,
**db irf ograph**, and **db irf table**.

7. Existing command **dfuller** has new option **drift** for testing the null
hypothesis of a random walk with drift. The algorithm for
calculating MacKinnon's approximate p-values is also now more
accurate in cases where the p-value is relatively large; see **[TS]**
**dfuller**.

8. Existing commands **corrgram** and **pac** have new option **yw** that computes
partial autocorrelations using the Yule-Walker equations instead of
the default regression-based method; see **[TS] corrgram**.

9. Time-series operators are now better displayed in estimation and
other result tables.

10. New command **estat** -- used after **regress** -- brings together what was
previously done by commands **dwstat**, **durbina**, **bgodfrey**, and **archlm**.
The new commands are **estat dwatson**, **estat durbinalt**, **estat bgodfrey**,
and **estat archlm**. See **[R] regress postestimation time series**.

11. The ability of **arima** and **arch** to estimate standard errors using
either the observed information matrix (OIM) or the outer product of
gradients (OPG) has been consolidated under the new **vce()** option.

(What follows was first released in Stata 8.2.)

12. New command **vec** fits cointegrated vector error-correction models
(VECMs) using Johansen's method; see **[TS] vec**.

13. New command **vecrank** produces statistics used to determine the number
of cointegrating vectors in a VECM, including Johansen's trace and
maximum-eigenvalue tests for cointegration; see **[TS] vecrank**.

14. New command **fcast** -- which replaces old command **varfcast** -- produces
and graphs dynamic forecasts of the dependent variables after
fitting a VAR, SVAR, or VECM; see **[TS] fcast**.

15. New command **irf** -- which replaces the old command **varirf** -- does
everything the old command did and more. **irf** estimates the
impulse-response functions, cumulative impulse-response functions,
orthogonalized impulse-response functions, structural
impulse-response functions, and forecast error-variance
decompositions after fitting a VAR, SVAR, or VECM. **irf** can also
make graphs and tables of the results. See **[TS] irf**.

**varirf** continues to work but is no longer documented. **irf** accepts
**.vrf** result files created by **varirf**.

16. Existing command **varsoc** can now be used to obtain lag-order
selection statistics for VECMs, as well as VARs; see **[TS] varsoc**.

17. New command **veclmar** computes Lagrange-multiplier statistics for
autocorrelation after fitting a VECM; see **[TS] veclmar**.

18. New command **vecnorm** tests whether the disturbances in a VECM are
normally distributed. For each equation and for all equations
jointly, three statistics are computed: a skewness statistic, a
kurtosis statistic, and the Jarque-Bera statistic. See **[TS]**
**vecnorm**.

19. New command **vecstable** checks the eigenvalue stability condition
after fitting a VECM; see **[TS] vecstable**.

20. New command **vecstable** and the existing command **varstable** have a new
graph option for presenting the stability results. See **[TS]**
**vecstable** and **[TS] varstable**.

21. The output of the following commands has been standardized to
improve formatting: **var**, **svar**, **vargranger**, **varlmar**, **varnorm**,
**varsoc**, **varstable**, and **varwle**.

22. New command **haver** makes it easy to load and analyze economic and
financial databases available from Haver Analytics; see **[TS] haver**.
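
For readers curious about the Yule-Walker approach mentioned in item 8, the Python sketch below computes partial autocorrelations from sample autocorrelations using the Durbin-Levinson recursion, which solves the Yule-Walker equations one lag at a time. It is a textbook illustration under our own choice of autocorrelation estimator, not a reproduction of what **corrgram** or **pac** compute; the function names are hypothetical.

```python
def autocorrelations(x, max_lag):
    """Sample autocorrelations r_0..r_max_lag (r_0 is always 1)."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    c0 = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
            for k in range(max_lag + 1)]


def pacf_yule_walker(x, max_lag):
    """Partial autocorrelations at lags 1..max_lag via Durbin-Levinson."""
    r = autocorrelations(x, max_lag)
    pacf = []
    prev = []  # coefficients of the AR(k-1) fit: phi_{k-1,1}, ..., phi_{k-1,k-1}
    for k in range(1, max_lag + 1):
        num = r[k] - sum(prev[j] * r[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(prev[j] * r[j + 1] for j in range(k - 1))
        phi_kk = num / den
        # Update the lower-order coefficients, then append the new one.
        prev = [prev[j] - phi_kk * prev[k - 2 - j] for j in range(k - 1)] + [phi_kk]
        pacf.append(phi_kk)
    return pacf


series = [2.0, 4.0, 3.0, 5.0, 4.0, 6.0, 5.0, 7.0]
print(pacf_yule_walker(series, 3))
```

The lag-1 value equals the lag-1 autocorrelation, and each later value is the last coefficient of an autoregression of that order implied by the autocorrelations.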

__What's new: Multivariate statistics__

Stata has four all-new methods for analyzing multivariate data and many
more extensions to existing methods. In addition, most methods now
support direct analysis of matrices as well as raw data.

Be sure you check the postestimation documentation for the multivariate
estimators you use; many important new features are documented there. In
particular, all the multivariate commands make extensive use of new
command **estat** for providing additional statistics and results after
estimation.

1. New commands **mds**, **mdslong**, and **mdsmat** perform classic metric
multidimensional scaling: **mds** performs the scaling with respect to
the distances (dissimilarities) between observations, **mdslong**
performs the scaling on a long dataset where each observation
represents the distance between two points or objects, and **mdsmat**
performs the scaling on a matrix of distances. See **[MV] mds**, **[MV]**
**mdslong**, and **[MV] mdsmat**.

**mds** supports all 33 similarity/dissimilarity measures available in
Stata; see **[MV]** *measure_option*.

The following new **estat** commands work after **mds**, **mdslong**, or **mdsmat**
and provide additional statistics and results:

a. **estat** **config** reports the coordinates of the approximating
configuration.

b. **estat** **correlations** reports the Pearson and Spearman correlations
between the dissimilarities and the approximating distances for
each object.

c. **estat** **pairwise** reports a set of statistics for each pairwise
comparison; it reports the dissimilarities, the approximating
distances, and the raw residuals.

d. **estat** **quantiles** reports the quantiles of the residuals for each
observation (after **mds**) or object (after **mdslong** or **mdsmat**).

e. **estat** **stress** reports the Kruskal stress (loss) measure between
the transformed dissimilarities and fitted distances per object.

See **[MV] mds postestimation** for more information.

In addition, there are two new commands for graphing results from a
multidimensional scaling:

a. **mdsconfig** plots the approximating Euclidean configuration of the
first two dimensions; see **[MV] mds postestimation plots**.

b. **mdsshepard** produces a Shepard diagram of the dissimilarities
against the approximating Euclidean distances; see **[MV] mds**
**postestimation plots**.

**predict** after any multidimensional-scaling command will produce

a. variables containing the approximating configuration (**predict**
*newvarlist***,** **config**); or

b. variables containing the dissimilarity, distance, and raw
residuals (**predict** *newvarlist***,** **pairwise**).

See **[MV] mds postestimation** for more information.

2. New commands **ca** and **camat** perform two-way correspondence analysis
using any of several available forms of normalization. **ca** performs
the analysis on the cross-tabulation of two categorical variables;
**camat** performs the analysis on a matrix of counts; see **[MV] ca** for
more information on both.

The following new **estat** commands work after **ca** and **camat** and provide
additional statistics and results:

a. **estat** **coordinates** reports the coordinates in both the row space
and the column space.

b. **estat** **distances** reports the chi-squared distances between the
row profiles and between the column profiles, including the
distances to the marginal distributions (commonly called
centers). Either observed or fitted profiles may be used.

c. **estat** **inertia** reports the inertia contributions of the
individual cells.

d. **estat** **profiles** reports the row profiles and column profiles --
the conditional distributions, given the other dimension.

e. **estat** **summarize** reports summary information of the row and
column variables over the estimation sample.

f. **estat** **table** reports the fitted correspondence table, the
observed "correspondence" table, or the expected table under the
assumption of independence.

See **[MV] ca postestimation** for more information.

In addition, there are two new commands for graphing results from a
correspondence analysis:

a. **cabiplot** produces a biplot of each row category and each column
category; see **[MV] ca postestimation plots**.

b. **caprojection** produces a graph that shows the ordering of row
categories and column categories on each principal dimension of
the analysis. Each principal dimension is represented by a
vertical line; markers are plotted on the lines where the row
categories and column categories project onto the dimensions;
see **[MV] ca postestimation plots**.

**predict** after **ca** and **camat** computes fitted values and row or column
scores for any dimension; see **[MV] ca postestimation**.

3. The new command **procrustes** performs Procrustean analysis for
comparing and measuring the similarity between two sets of
variables: source and target. Two datasets can also be compared if
the datasets are first merged by record.

The following new **estat** commands work after **procrustes** and provide
additional statistics and results:

a. **estat** **compare** reports fit statistics of the three
transformations available in Procrustean analysis: orthogonal,
oblique, and unrestricted.

b. **estat** **mvreg** reports the multivariate regression that is related
to the current Procrustean analysis.

c. **estat** **summarize** reports summary information of the two sets of
variables over the estimation sample.

See **[MV] procrustes postestimation** for more information.

New command **procoverlay** after **procrustes** creates an overlay graph
comparing the target variables to the fitted values derived from the
source variables; see **[MV] procrustes postestimation**.

**predict** after **procrustes** produces fitted values for all variables,
residuals for all variables, or residual sums of squares for a
specified target variable; see **[MV] procrustes postestimation**.

4. New command **biplot** performs a biplot analysis of a dataset and
produces a two-dimensional biplot of the results. A biplot
simultaneously displays the observations (rows) and the relative
positions of the variables (columns). Observations are projected to
two dimensions such that the distance between the observations is
approximately preserved. The variables are plotted as arrows, with
the cosine of the angle between arrows approximating the correlation
between the variables. See **[MV] biplot**.

5. New command **tetrachoric** computes a tetrachoric correlation matrix
for a set of binary variables. **tetrachoric** is documented in **[R]** but
will often be used in multivariate analyses; see **[R] tetrachoric**.

**tetrachoric** results can be used in subsequent factor analyses or
principal component analyses using the new **factormat** and **pcamat**
commands. See **[MV] factor** and **[MV] pca**.

6. Existing command **canon** now allows analysis and presentation of more
than one linear combination and has new options for reporting the
raw or standardized coefficients and for reporting significance
tests of the canonical correlations; see **[MV] canon**.

The following new **estat** commands work after **canon** and provide
additional statistics and results:

a. **estat** **correlations** reports the correlations among all variables.

b. **estat** **loadings** reports the matrices of canonical loadings.

See **[MV] canon postestimation** for more information.

7. Existing command **cluster dendrogram** has many new features, including
horizontal dendrograms and the ability to label branch counts. The
look of the graph can now be changed (titles, axes, colors, etc.);
see **[MV] cluster dendrogram**.

8. The existing hierarchical cluster commands have new option **measure()**
that specifies the proximity measure to use in computing
dissimilarities between observations. Any of 33 measures may be
specified; see **[MV]** *measure_option*. Previously most of the measures
were available under other option names; those options continue to
work but are undocumented. See **[MV] cluster**.

9. Existing command **cluster stop** has new option **varlist()** that
specifies alternative variables to use when computing the stopping
rules; see **[MV] cluster stop**.
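
The row profiles and chi-squared distances reported after correspondence analysis (item 2 above) follow standard definitions that are easy to compute by hand. The Python sketch below shows those definitions for a small table of counts; it is an illustration with made-up data and hypothetical function names, not the code behind **estat** **profiles** or **estat** **distances**.

```python
import math


def row_profiles(table):
    """Each row divided by its row total (the conditional distribution)."""
    return [[cell / sum(row) for cell in row] for row in table]


def column_masses(table):
    """Column totals divided by the grand total (the average row profile)."""
    grand_total = sum(sum(row) for row in table)
    return [sum(row[j] for row in table) / grand_total
            for j in range(len(table[0]))]


def chi2_distance(profile_a, profile_b, masses):
    """Chi-squared distance between two row profiles."""
    return math.sqrt(sum((a - b) ** 2 / m
                         for a, b, m in zip(profile_a, profile_b, masses)))


counts = [[10, 20], [30, 40]]
profiles = row_profiles(counts)
masses = column_masses(counts)
print(chi2_distance(profiles[0], profiles[1], masses))  # between the two rows
print(chi2_distance(profiles[0], masses, masses))       # row 1 to the center
```

Weighting the squared distances to the center by the row masses and summing gives the total inertia, which equals the Pearson chi-squared statistic of the table divided by the grand total.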

__What's new: Analysis of proximity matrices__

All of Stata's multivariate analysis facilities that rely on pairwise
comparisons of distance, similarity, dissimilarity, covariance,
correlation, or other proximity measures can now work directly with
proximity matrices that you compute or obtain from other sources.

Previously, all of these facilities worked only with raw datasets. The
new commands implement analyses on matrices. They share the common
ability to accept either full matrices or vectors representing the lower
or upper triangle of a symmetric proximity matrix.
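
To illustrate what storing a symmetric proximity matrix as a triangle means, the Python sketch below expands a lower triangle, listed row by row with the diagonal included, into the full matrix. The storage order is an assumption made for this example and may not match the order a given command expects; the function name is hypothetical.

```python
import math


def full_from_lower_triangle(values):
    """Build a full symmetric matrix from a row-wise lower triangle.

    values lists the lower triangle row by row, diagonal included:
    d11, d21, d22, d31, d32, d33, ...  (an ordering assumed for this sketch).
    """
    # Solve n(n+1)/2 == len(values) for n.
    n = (math.isqrt(8 * len(values) + 1) - 1) // 2
    if n * (n + 1) // 2 != len(values):
        raise ValueError("length is not a triangular number")
    matrix = [[0.0] * n for _ in range(n)]
    k = 0
    for i in range(n):
        for j in range(i + 1):
            matrix[i][j] = matrix[j][i] = values[k]
            k += 1
    return matrix


print(full_from_lower_triangle([0, 2, 0, 3, 4, 0]))
```

An upper triangle stored column by column has the same element order, so the same function would expand it as well.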

10. New command **clustermat** extends all of Stata's hierarchical
clustering facilities to the analysis of matrices of a dissimilarity
measure (sometimes called a distance or proximity measure). This
includes all seven linkage methods and the ability to create
dendrograms of the results; see **[MV] clustermat**.

11. New command **factormat** performs factor analysis on a matrix of
correlations, extending all the new and previously available
capabilities of the existing command **[MV] factor** to precomputed
matrices of correlations; see **[MV] factor**.

12. New command **pcamat** performs principal component analysis on an
existing correlation or covariance matrix; see **[MV] pca**.

13. New **matrix** subcommand **dissimilarity** computes similarity,
dissimilarity, or distance matrices using any of 19 proximity
measures for continuous data and 14 measures for binary data; see
**[MV]** *measure_option* and see **[MV] matrix dissimilarity**.
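
As a small illustration of the kinds of proximity measures involved, the Python sketch below computes one measure for continuous data (Euclidean distance) and two for binary data (simple matching and Jaccard similarity) using their textbook definitions, and builds a full matrix over all pairs of observations. It does not attempt to reproduce the full set of measures or Stata's handling of special cases; all function names are hypothetical.

```python
import math


def euclidean(x, y):
    """Euclidean distance between two numeric vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def matching(x, y):
    """Simple matching similarity: share of positions where 0/1 vectors agree."""
    return sum(a == b for a, b in zip(x, y)) / len(x)


def jaccard(x, y):
    """Jaccard similarity: joint 1s divided by positions where either is 1."""
    both = sum(a == 1 and b == 1 for a, b in zip(x, y))
    either = sum(a == 1 or b == 1 for a, b in zip(x, y))
    if either == 0:
        raise ValueError("Jaccard is undefined when both vectors are all zeros")
    return both / either


def proximity_matrix(rows, measure):
    """Full symmetric matrix of a measure over all pairs of observations."""
    return [[measure(a, b) for b in rows] for a in rows]


print(proximity_matrix([[0, 0], [3, 4]], euclidean))
print(matching([1, 0, 1, 0], [1, 1, 0, 0]), jaccard([1, 0, 1, 0], [1, 1, 0, 0]))
```

Matching counts joint absences as agreement, whereas Jaccard ignores them, which is why the two can differ sharply on sparse binary data.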

__What's new: Factor and principal component analysis additions__

In addition to allowing direct analysis of correlation and covariance
matrices using **factormat** and **pcamat**, Stata's factor analysis and
principal components analysis (PCA) methods have been expanded,
particularly through the addition of postestimation commands for
reporting and graphing results.

14. Command **factor** has new reporting option **altdivisor**, which specifies
that the trace of the correlation matrix be used as the divisor for
proportions, rather than the default (the sum of all eigenvalues).

15. New **estat** commands for use after **factor** and **factormat** provide
additional statistics and results:

a. **estat** **common** reports the correlation matrix of the common
factors; it is of most interest after oblique rotations.

b. **estat** **factors** reports model-selection criteria (AIC and BIC)
over all the factors retained in an analysis.

c. **estat** **rotatecompare** reports the unrotated factor loadings next
to the most-recent rotated loadings.

d. **estat** **structure** reports the factor structure -- the correlations
between the variables and the common factors.

See **[MV] factor postestimation** for more information.

16. Existing command **pca** allows several new options:

a. Option **vce(normal)** computes the VCE of the eigenvalues and
eigenvectors, assuming multivariate normality.

This gives you access to many of Stata's postestimation
facilities for analyzing estimation results, including tests of
eigenvalue and eigenvector significance, tests of linear and
nonlinear combinations (**[R] test** and **[R] testnl**), linear and
nonlinear combinations with confidence intervals (**[R] lincom** and
**[R] nlcom**), and nonlinear predictions with confidence intervals
(**[R] predictnl**).

**vce(normal)** also produces the ingredients for adding confidence
intervals to screeplots; see **[MV] screeplot**.

b. Options **level()**, **blanks()**, **novce**, and **norotated** allow more
flexible control of the displayed results.

c. Option **components(***#***)** specifies the number of components to
retain and is a synonym for old option **factor()**.

d. Options **tol()** and **ignore** provide advanced control for
computationally difficult problems.

See **[MV] pca** for more information.

17. New **estat** commands for use after **pca** and **pcamat** provide additional
statistics and results:

a. **estat** **loadings** reports the component loading matrix in any of
several available normalizations of the columns (eigenvectors).

b. **estat** **rotatecompare** reports the unrotated (principal) components
next to the most recent rotated components.

See **[MV] pca postestimation** for more information.

18. New **estat** commands for use after any factor analysis or any
principal components analysis (that is, after **factor** or **factormat** or
after **pca** or **pcamat**) provide additional statistics and results:

a. **estat** **anti** reports the anti-image correlation and anti-image
covariance matrices.

b. **estat** **kmo** reports the Kaiser-Meyer-Olkin measure of sampling
adequacy.

c. **estat** **residuals** reports the difference between the observed
correlation or covariance matrix and the fitted (reproduced)
matrix using the retained factors.

d. **estat** **smc** reports the squared multiple correlations (SMC)
between each variable and all other variables. SMC is a
theoretical lower bound for communality, so 1 - SMC is an upper
bound for the unexplained variance (uniqueness).

See **[MV] factor postestimation** and **[MV] pca postestimation** for more
information.

19. Three new graphs are available after any factor analysis (**factor** and
**factormat**) or after any principal components analysis (**pca** and
**pcamat**):

a. **scoreplot** graphs scatterplots comparing each pair of factors or
components; see **[MV] scoreplot**.

b. **loadingplot** graphs scatterplots comparing loadings for each pair
of factors or components; see **[MV] scoreplot**.

c. **screeplot** plots the eigenvalues of a covariance or correlation
matrix; see **[MV] screeplot**. (**screeplot** replaces **greigen** and has
more features; **greigen** continues to work but is undocumented.)

20. New command **rotate** performs orthogonal and oblique rotations after
**factor**, **factormat**, **pca**, and **pcamat**. Available rotations include
varimax, quartimax, equamax, parsimax, minimum entropy, Comrey's
tandem 1 and 2, promax power, biquartimax, biquartimin, covarimin,
oblimin, factor parsimony, Crawford-Ferguson family, Bentler's
invariant pattern, oblimax, quartimin, and target and partial-target
matrices; see **[MV] rotate**.

New command **rotatemat** performs these same linear transformations
(rotations) on any Stata matrix.
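
As a rough illustration of how the estimation, postestimation, graph, and
rotation commands described in this section fit together, the sketch below
runs a principal components analysis on the **auto** example dataset. The
dataset and the variable list are assumptions chosen only for illustration.

```stata
* Illustrative sketch only: auto.dta and the variable list are assumptions
sysuse auto, clear

* Principal component analysis, retaining two components
pca price mpg weight length, components(2)

* Kaiser-Meyer-Olkin measure of sampling adequacy
estat kmo

* Squared multiple correlations of each variable with all the others
estat smc

* Plot the eigenvalues (screeplot replaces greigen)
screeplot

* Orthogonal varimax rotation, then unrotated next to rotated components
rotate, varimax
estat rotatecompare
```

The same sequence should apply after **factor**, with **estat** **common** and
**estat** **structure** available in addition.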

__What's new: Survival analysis__

1. The **[ST]** manual now has a glossary that defines commonly used terms
in survival (or duration) analysis and often explains how these
terms are used in the manual; see **[ST]** **Glossary**.

2. New command **estat** can be used after **stcox** and **streg**. In addition to
the standard **estat** statistics -- information criteria, estimation
sample summary, and formatted variance-covariance matrix (VCE) --
statistics specific to the proportional hazards estimator are
available after **stcox**. These include

a. **estat concordance** computes Harrell's C and Somers' D statistics
measuring concordance -- agreement of predictions with observed
failure order.

b. **estat phtest** replaces the existing **stphtest** for computing tests
and graphs of the proportional hazards assumption. **stphtest**
continues to work.

See **[ST] stcox postestimation** and **[ST] streg postestimation**.
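
A minimal sketch of **estat** after **stcox** follows. It assumes the **cancer**
example dataset and its variables **studytime**, **died**, **age**, and **drug**, none
of which are named in this entry.

```stata
* Illustrative sketch only: cancer.dta and its variables are assumptions
sysuse cancer, clear
stset studytime, failure(died)

* Cox proportional hazards model
stcox age drug

* Harrell's C and Somers' D
estat concordance

* Standard estat statistics are available as well
estat ic
estat vce
```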

3. Existing command **sts graph** has new options **cihazard** and **per(***#***)**.
**cihazard** draws pointwise confidence bands around the smoothed hazard
function, and **per()** specifies the units used to report the survival
or failure rate. See **[ST] sts**.

4. Existing command **stcurve** now plots over an evenly spaced grid,
producing smooth curves, even in small samples; see **[ST] stcurve**.

5. Existing command **sts graph** has new options **atriskopts()** and
**lostopts()** that let you control how the labels for at-risk and lost
observations look (their color, font size, etc.); see **[ST] sts**.

6. Existing command **stci** has new options for controlling how the
plotted survival line looks (color, thickness, etc.) and for adding
titles, controlling legends, and all other characteristics of the
graph; see **[ST] stci**.

__What's new: General-purpose statistics__

1. New estimation command **asmprobit** fits multinomial probit (MNP)
models to categorical data and is frequently used in choice-based
modeling. **asmprobit** allows several correlation structures for the
alternatives, including completely unstructured, where all possible
correlations are estimated. It also allows for either
heteroskedastic or homoskedastic variances among the alternatives
and allows arbitrary patterns within the alternative variances or
correlations. **asmprobit**'s syntax makes specifying both
case-specific and alternative-in-case-specific regressors easy.

In addition to common postestimation commands, such as **mfx** for
computing marginal effects, new command **estat** provides additional
statistics and results:

a. **estat** **alternatives** reports summary statistics about each of the
alternatives and provides a mapping between the index numbers
labeling the alternatives and their associated values and labels
in the dataset.

b. **estat** **covariance** computes and reports the estimated covariance
matrix for the alternatives.

c. **estat** **correlation** reports the correlations among the
alternatives in matrix form.

Predicted statistics after **asmprobit** include the linear predictor,
the probability an alternative is selected, and the standard error
of the linear predictor.

See **[R] asmprobit** and **[R] asmprobit postestimation**.

2. New estimation command **mprobit** also fits multinomial probit models
to categorical data but in the simplified situation of having only
case-specific covariates (as with the multinomial logistic
regression, **mlogit**). Maximizing the likelihood is much faster in
such cases because the numeric approximation to the likelihood is
simpler. See **[R] mprobit**.

3. New estimation command **slogit** fits the stereotype logistic
regression model for categorical dependent variables. This model
can be viewed as either a generalization of the multinomial logistic
regression model (**mlogit**) or a generalization of the ordered
logistic regression model (**ologit**) that relaxes the
proportional-odds assumption. See **[R] slogit**.

Predicted statistics after **slogit** include the linear predictor, the
probability of any or all outcomes, and the standard error of the
linear predictor. See **[R] slogit postestimation**.

4. New estimation command **ivprobit** fits probit regression models of
binary outcomes with endogenous regressors. Estimation can be
performed by maximum likelihood estimation (MLE) or by Newey's
minimum chi-squared two-step estimation, but some postestimation
facilities, such as computing marginal effects with **mfx**, are
available only after ML estimation -- the two-step estimator imposes
a transformation that invalidates many postestimation results. See
**[R] ivprobit**.

5. New estimation command **ivtobit** fits linear regression models with
censored dependent variables by maximum likelihood estimation or by
Newey's minimum chi-squared two-step estimation (but see the note
about the two-step estimator in 4 above). See **[R] ivtobit**.

6. New estimation command **ztp** fits a zero-truncated Poisson model of
event counts with truncation at zero.

Predicted statistics after **ztp** include the linear predictor and its
standard error, the predicted number of events, the incidence rate,
the conditional mean, and the likelihood score. See **[R] ztp** and **[R]**
**ztp postestimation**.

7. New estimation command **ztnb** fits a zero-truncated negative binomial
model of event counts with truncation at zero and overdispersion or
underdispersion.

Predicted statistics after **ztnb** include the linear predictor and its
standard error, the predicted number of events, the incidence rate,
the conditional mean, and the likelihood scores. See **[R] ztnb** and **[R]**
**ztnb postestimation**.

8. New estimation commands **mean**, **ratio**, **proportion**, and **total** estimate
means, ratios, proportions, and totals over the entire sample or
over groups within the sample. When estimating over groups, the
entire covariance matrix (VCE) is estimated. These are full
estimation commands that support a range of postestimation
facilities, such as linear and nonlinear tests among the groups
(**test** and **testnl**) and linear and nonlinear combinations of
group-level statistics (**lincom** and **nlcom**). All four commands
support several SE and VCE estimates: robust, cluster-robust,
bootstrap, jackknife, and observed information matrix (the default).

**mean**, **ratio**, and **proportion** also support direct standardization
across strata (groups) using the **stdize()** and **stdweight()** options.

See **[R] mean**, **[R] ratio**, **[R] proportion**, and **[R] total**.
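
For orientation, a short sketch of three of the new commands is shown
here. The **auto** dataset and the chosen variables are assumptions used only
for illustration.

```stata
* Illustrative sketch only: auto.dta and the variables are assumptions
sysuse auto, clear

* Mean of mpg over the entire sample, then over groups
mean mpg
mean mpg, over(foreign)

* Proportion of observations in each repair-record category
proportion rep78

* Total weight within each group
total weight, over(foreign)
```

Because these are estimation commands, each call leaves a full set of
estimation results behind for **test**, **lincom**, and similar commands.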

9. To avoid conflict with the new **mean** command, existing command **means**
has been renamed **ameans**, with synonyms **gmeans** and **hmeans**.

10. Existing command **nl** has a new syntax that makes estimating nonlinear
least-squares regressions easier. For most models, estimation is
now as easy as typing the nonlinear expression. Full
programmability has been retained for complex models, and the old
syntax continues to work.

**nl** also now supports robust (Huber/White/sandwich) and
cluster-robust SE and VCE estimates, including two popular
adjustments that can dramatically improve the small-sample
performance of robust SE and VCE estimates.

A number of new reporting and estimation options have also been
added. See **[R] nl**.

11. New option **vce()** selects how the standard errors (SEs) and the
covariance matrix (VCE) of the estimated parameters are computed by
most estimation commands. Choices are **vce(oim)**, **vce(opg)**, **vce(robust)**,
**vce(jackknife)**, and **vce(bootstrap)**, although the choices can vary
estimator by estimator. **vce(robust)** is a synonym for **robust**, and
you can use either. What is new are **vce(jackknife)** and
**vce(bootstrap)**.

**vce(bootstrap)** specifies that the standard errors, significance
tests, and confidence intervals be normal-based bootstrap estimates,
rather than the default analytic estimates based on the observed
information matrix. You can also produce percentile-based or
bias-corrected confidence intervals after estimation using **estat**
**bootstrap**; see **[R] bootstrap postestimation**.

**vce(jackknife)** specifies that the standard errors, significance
tests, and confidence intervals be jackknife estimates.

Both **vce(bootstrap)** and **vce(jackknife)** will automatically perform
either observation or cluster sampling, whichever is appropriate for
the estimator.

Notably, both **vce(bootstrap)** and **vce(jackknife)** compute bootstrapped
or jackknifed estimates of the complete VCE matrix. This means that
many of Stata's postestimation commands are available. You can form
linear and nonlinear combinations or functions of the parameters and
obtain jackknife or normal-based bootstrap standard errors and
confidence intervals for the combinations using **[R] lincom** and **[R]**
**nlcom**. Similarly, you can perform linear and nonlinear tests using
**[R] test** and **[R] testnl**.
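
The workflow described in this item might look like the following sketch.
The **auto** dataset, the model, and the seed value are assumptions chosen
only for illustration.

```stata
* Illustrative sketch only: dataset, model, and seed are assumptions
sysuse auto, clear
set seed 12345

* Bootstrap standard errors with normal-based confidence intervals
logit foreign mpg weight, vce(bootstrap)

* Percentile and bias-corrected confidence intervals after estimation
estat bootstrap, percentile bc

* Wald test based on the bootstrap VCE
test mpg weight

* Jackknife standard errors, then a nonlinear combination of parameters
logit foreign mpg weight, vce(jackknife)
nlcom _b[mpg] / _b[weight]
```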

12. New command **estat** centralizes the computing and reporting of
additional statistics after estimation, just as **predict** does with
predictions. **estat** allows subcommands. **estat** **summarize**, for
instance, reports summary statistics for the estimation sample and
can be used after any estimator. **estat** also allows subcommands that
are specific to the estimation command. To find out what is
available after a command, see the corresponding postestimation
entry. For example, after **[R] regress**, see **[R] regress**
**postestimation**; or after **[XT] xtmixed**, see **[XT] xtmixed**
**postestimation**.

Existing postestimation commands have been brought into the **estat**
framework:

Estimation        Old             New **estat**
command           command         command
--------------------------------------------------------
**regress**           **ovtest**          **estat** **ovtest**
                  **hettest**         **estat** **hettest**
                  **szroeter**        **estat** **szroeter**
                  **vif**             **estat** **vif**
                  **imtest**          **estat** **imtest**
**regress**           **dwstat**          **estat** **dwatson**
(time series)     **durbina**         **estat** **durbinalt**
                  **bgodfrey**        **estat** **bgodfrey**
                  **archlm**          **estat** **archlm**
**anova**             **ovtest**          **estat** **ovtest**
                  **hettest**         **estat** **hettest**
**logit** and         **lstat**           **estat** **classification**(*)
**logistic**          **lfit**            **estat** **gof**(*)
**poisson**           **poisgof**         **estat** **gof**
**stcox**             **stphtest**        **estat** **phtest**
**xtgee**             **xtcorr**          **estat** **wcorrelation**
--------------------------------------------------------
(*) The new command works after **probit**, as well
as **logit** and **logistic**; the old command worked
after **logit** and **logistic** only.

The original commands continue to work but are undocumented.

Three **estat** subcommands are available after almost all estimators:

a. **estat** **ic** reports Akaike's and Schwarz's Bayesian information
criteria (AIC and BIC).

b. **estat** **summarize** reports summary statistics on the variables in
the estimation model for the estimation sample.

c. **estat** **vce** reports the covariance (VCE) or correlation matrix
estimates. (**estat** **vce** replaces the old **vce** command and has more
features.)
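
As a brief sketch of the **estat** framework after **regress**, consider the
following. The **auto** dataset and the model are assumptions used only for
illustration.

```stata
* Illustrative sketch only: auto.dta and the model are assumptions
sysuse auto, clear
regress mpg weight foreign

* Command-specific subcommands (formerly hettest, ovtest, and vif)
estat hettest
estat ovtest
estat vif

* Subcommands available after almost all estimators
estat ic
estat summarize
estat vce, correlation
```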

13. Stata has many new prefix commands (commands that behave like **by:**
and **xi:**). New prefix commands include **statsby:**, **bootstrap:**,
**jackknife:**, **permute:**, **simulate:**, **stepwise:**, **svy:**, and **rolling:**. For
instance, to obtain the standard error and confidence interval of
the mean, you might type

**. jackknife: mean earnings**

or to obtain survey-adjusted estimates, you might type

**. svy: mean earnings**

after **svyset**ting your data.

See **[R] bootstrap**, **[R] jackknife**, **[R] permute**, **[TS] rolling**, **[R]**
**simulate**, **[R] stepwise**, **[D] statsby**, and **[SVY] svy**.

14. New prefix commands **bootstrap:** and **jackknife:** replace old commands
**bs** and **jknife**, and in addition to having better syntax, they also
provide new features:

a. They provide enhanced handling and reporting of expressions.

b. They post their results as estimation results with a complete
VCE. Most postestimation facilities may now be used after them
and will be based on the bootstrap or jackknife VCE. These
include

**adjust**        adjusted predictions
**estimates**     cataloging estimation results
**lincom**        linear combinations with SEs, tests, and CIs
**nlcom**         nonlinear combinations with SEs, tests, and CIs
**mfx**           computing marginal effects and elasticities
**predict**       predictions, residuals, probabilities, etc.
**predictnl**     generalized nonlinear predictions with SEs and CIs
**test**          Wald tests of simple and composite linear hypotheses
**testnl**        Wald tests of nonlinear hypotheses

c. They produce a model test when applied to the coefficients of
estimation commands.

d. They allow option **seed(***#***)** to set the random-number seed.

e. They allow option **reject(***exp***)** to reject replicates for which
*exp* is true.

f. **bootstrap:** uses the normal distribution instead of the Student's
t distribution to compute the normal-approximation confidence
intervals.

g. **jackknife:** now allows **fweight**s to be specified.

See **[R] bootstrap** and **[R] jackknife**.
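
The prefix syntax might be used as in the sketch below. The **auto** dataset,
the expression label **avg**, the seed, and the number of replications are
assumptions chosen only for illustration.

```stata
* Illustrative sketch only: dataset, label "avg", seed, and reps are assumptions
sysuse auto, clear

* Bootstrap the sample mean of mpg with a fixed random-number seed
bootstrap avg=r(mean), reps(200) seed(12345): summarize mpg

* Normal, percentile, and bias-corrected confidence intervals
estat bootstrap, all

* Jackknife the coefficients of an estimation command
jackknife: regress mpg weight foreign

* Postestimation commands now use the jackknife VCE
test weight foreign
```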

15. New prefix command **statsby:** replaces old command **statsby** (not a
prefix) and provides enhanced handling and reporting of expressions,
allows **weights**, and allows string variables in the option **by()**. See
**[D] statsby**.

16. New prefix command **stepwise:** replaces old command **sw** and, in
addition to working with all the previous estimators, also works
with **[R] intreg** and **[R] scobit**.

17. Existing prefix command **xi:** has new option **noomit** that prevents it
from omitting a category when generating category indicators for
group variables. See **[R] xi**.

18. New command **tetrachoric** computes a tetrachoric correlation matrix
for a set of binary variables. See **[R] tetrachoric**.

19. Existing command **suest**, which combines estimation results for
subsequent testing, is easier to use and has new features:

a. Scores are now computed for the models you combine; you no
longer need to save scores when estimating.

b. **suest**, used after **svy:** estimation, now accounts for your survey
design.

c. **suest** now works more smoothly with certain estimation commands
that previously required special treatment, including **regress**,
**ologit**, and **oprobit**.

d. **suest** now works with all models estimated by **clogit**, rather than
only those with a single positive outcome per group.

See **[R] suest**.

20. Existing command **clogit** has new features:

a. Robust and cluster-robust SE and VCE estimates are now supported
through options **robust** and **cluster()**.

b. Linear constraints on the parameters are now implemented via
option **constraints()**.

c. New option **vce()** allows SE and VCE estimates to be computed
using OIM (the default), OPG, bootstrap, and jackknife.

See **[R] clogit**.

21. Option **level()** now allows noninteger confidence levels to be
specified. See **[R] estimation options**.

22. Existing command **predict** now generates equation-level scores after
most maximum-likelihood estimation commands; see the documentation
of **predict** in the postestimation entry for each estimation command.

23. Existing command **cumul** has a new option **equal** to create equal
cumulative values for ties. See **[R] cumul**.

24. Existing command **estimates table** now allows you to specify more
models, and the command wraps the table if necessary. Also allowed
are new options

a. **equations()**, which matches equations by number rather than by
name.

b. **coded**, which displays the table in a compact, symbolic format.

c. **modelwidth()**, which sets the number of characters for displaying
model names.

See **[R] estimates**.

25. **test** after **anova** and **manova** has two new options for performing Wald
tests:

a. **mtest()**, which implements three methods to adjust for multiple
tests: Bonferroni, Holm, and Sidak.

b. **test()**, which makes specifying contrasts easier by accepting a
matrix containing the contrast.

See **[R] anova postestimation**.

26. Commands **ci** and **cii** have new options **exact**, **wilson**, **agresti**,
**jeffreys**, and **wald** for computing different types of binomial
confidence intervals. See **[R] ci**.

27. Command **hausman** has new option **df()** for controlling the degrees of
freedom. See **[R] hausman**.

28. **predict** after **ivreg** has the new **score** option for returning
equation-level scores. See **[R] ivreg postestimation**.

29. Command **mfx** is now faster and has new option **varlist()** for computing
effects of specific variables. See **[R] mfx**.

30. Commands **tabulate** and **tabi** with the **exact** option are now
significantly faster.

31. In existing command **mlogit**, option **basecat** has been renamed
**baseoutcome()** for better consistency with the terminology of choice
models. See **[R] mlogit**.

32. Existing commands **spearman** and **ktau** now allow more than two
variables to be specified and have more flexible output. See **[R]**
**spearman**.

33. Existing command **bsample** for sampling with replacement (bootstrap
sampling) now supports weighted bootstrap resampling using the new
**weight()** option. See **[R] bsample**.

34. Existing command **bstat** for reporting bootstrap results has a number
of new reporting options. In addition, **bstat** previously computed
percentile and other confidence intervals. This is now handled by
**estat bootstrap**, which can be used after any bootstrap estimation,
including **bstat**. See **[R] bstat** and **[R] bootstrap postestimation**.

35. Most maximum likelihood estimators now test for convergence using
the Hessian-scaled gradient, g*inv(H)*g'. This criterion ensures
that the gradient is close to zero when scaled by the Hessian (the
curvature of the likelihood or pseudolikelihood surface at the
optimum) and provides greater assurance of convergence for models
whose likelihoods tend to be difficult to optimize, such as those
for **arch**, **asmprobit**, and **scobit**. You can set the tolerance level
for this test with new option **nrtolerance()**, show the Hessian-scaled
gradient in the iteration log with option **shownrtol**, and turn the
test off with option **nonrtolerance**. See **[R] maximize**.

36. Existing command **set** has new setting **maxiter** -- default value 16000
-- that specifies the maximum number of iterations to be performed
by all estimation commands. You change this setting by typing
**set** **maxiter** *#*, and you may add option **permanently** to retain the
setting in future Stata sessions.
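
The convergence options and the **maxiter** setting from the last two items
could be exercised as sketched here. The **auto** dataset, the model, and the
particular tolerance and iteration values are assumptions chosen only for
illustration.

```stata
* Illustrative sketch only: dataset, model, and numeric values are assumptions
sysuse auto, clear

* Tighten the Hessian-scaled gradient tolerance and show it in the log
logit foreign mpg weight, nrtolerance(1e-7) shownrtolerance

* Turn the Hessian-scaled gradient test off
logit foreign mpg weight, nonrtolerance

* Limit all estimation commands to at most 100 iterations this session
set maxiter 100
```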

__What's new: New ML features__

Command **ml**, for implementing user-written maximum-likelihood estimators,
has many new features:

1. New option **technique()** sets the optimization technique. BHHH, DFP,
and BFGS optimization techniques are now available; the default
technique remains modified Newton-Raphson.

2. New option **vce()** sets the type of covariance-matrix calculations
that will be made.

**vce(oim)** specifies the observed information matrix (OIM), also
called the Hessian-based estimator; this is (and always has been)
the default.

**vce(opg)** specifies the outer product of the gradients (OPG).
This is new.

**vce(robust)** specifies Taylor-series linearization, also known as
the Huber or White estimator and, in Stata, as simply robust.

3. Most estimators written with **ml** now support estimation with survey
data and correlated data with no additional programming. This
support includes correct treatment of multistage designs, weighting,
stratification, poststratification, and finite-population
corrections, as well as access to linearization, jackknife, and
bootstrap variance estimators. For a discussion, see **[P] program**
**properties**.

4. **ml** has always allowed linear constraints to be applied using the
option **constraints()** with no additional programming. It now handles
irrelevant constraints more elegantly. Irrelevant constraints are
those that have no impact on the model. Previously, irrelevant
constraints caused an error message. Now they are flagged and
ignored.

5. When linear constraints are imposed, **ml** now applies a Wald test for
the overall fit of the model, rather than attempting a
likelihood-ratio (LR) test, which is often inappropriate.

6. **ml** has new subcommand **score** for generating scores after fitting a
model.

7. **ml** has new option **diparm_options()** that automatically performs
transformations of ancillary parameters.

8. **ml** now saves the gradient vector in **e(gradient)**.

9. **ml** has new option **search(norescale)** that prevents rescaling when
searching for starting values.

10. **ml** honors the new setting for maximum iterations, **set maxiter** *#*, and
will perform at most *#* iterations, even if convergence has not
been achieved.

11. **ml** now displays a prominent message in the footer of the estimation
results when convergence is not achieved. This message continues to
be shown on redisplay of estimation results.

12. **ml** has new option **nofootnote** to suppress the new warning message
that is displayed when convergence is not achieved.

13. **ml** tests for convergence using the Hessian-scaled gradient --
g*inv(H)*g'. This is a true convergence criterion that ensures that
the gradient is close to zero when scaled by the Hessian (the
curvature of the likelihood or pseudolikelihood surface at the
optimum). This new criterion is particularly important when
maximizing difficult likelihoods to prevent stopping the
maximization too soon.

14. New option **nrtolerance()** lets you change the tolerance for the
Hessian-scaled gradient convergence criterion; the default is
**nrtolerance(1e-5)**.

15. New option **shownrtolerance** displays the criterion value of the
Hessian-scaled gradient at each iteration.

16. New undocumented command **mlmatbysum** helps you compute the Hessian of
panel-data likelihoods and is of interest to those seeking the speed
that comes with programming your own second-derivative calculations;
see **mlmatbysum**.

17. **ml** has two new undocumented subcommands -- **ml** **hold** and **ml** **unhold** --
to assist in solving nested optimization problems; see **ml_hold**.

See **[R] ml** for more information on these features. Anyone programming
estimators using **ml** should read the book *Maximum Likelihood Estimation*
*with Stata, 2nd Edition* (Gould, Pitblado, and Sribney 2003). Many of the
features mentioned above are discussed and applied to real problems in
the book.
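
To make the new **technique()** and **vce()** options concrete, here is a minimal
sketch of a method-lf probit evaluator. The program name **myprobit_lf**, the
**auto** dataset, and the model are hypothetical choices made for
illustration; this is not an example taken from **[R] ml**.

```stata
* Illustrative sketch only: myprobit_lf is a hypothetical evaluator name
capture program drop myprobit_lf
program define myprobit_lf
    version 9
    args lnf xb
    * Observation-level probit log likelihood
    quietly replace `lnf' = ln(normal(`xb'))  if $ML_y1 == 1
    quietly replace `lnf' = ln(normal(-`xb')) if $ML_y1 == 0
end

sysuse auto, clear

* BFGS instead of the default modified Newton-Raphson
ml model lf myprobit_lf (foreign = mpg weight), technique(bfgs)
ml maximize

* Default technique with the outer-product-of-gradients VCE
ml model lf myprobit_lf (foreign = mpg weight), vce(opg)
ml maximize
```

The evaluator uses **normal()**, the new name for **norm()** introduced in this
release.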

__What's new: Functions and expressions__

1. The limit for the number of dyadic operators has been increased from
200 to 500; see limits.

2. The default matrix size (**matsize**) for Intercooled Stata is now 200,
rather than 40. The default for Stata/SE remains 400, and for Small
Stata it is 40.

3. The following new functions have been added in the context of
expressions, such as **generate** *newvar* **=** *exp* or **if** *exp*:

name              purpose
------------------------------------------------------------------
**binormal()**        bivariate normal cumulative distribution
**atan2()**           two-argument arc tangent

**regexm()**          regular expression matching
**regexr()**          regular expression replacement
**regexs()**          regular expression subexpressions

**indexnot()**        position of first character of *s1* not found in *s2*
------------------------------------------------------------------

See **[FN] Functions by category** or type **help** followed by the function
name, such as **help binormal()**.

In addition, a host of new functions are available through Mata; see
**[M-4] intro**.
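
A short sketch of the regular-expression functions and **atan2()** follows.
The **auto** dataset, the new variable names, and the patterns are
assumptions chosen only for illustration.

```stata
* Illustrative sketch only: dataset, variable names, and patterns are assumptions
sysuse auto, clear

* Flag observations whose make contains a digit
generate byte has_digit = regexm(make, "[0-9]")

* Extract the first run of digits; regexs(0) returns the entire match
generate digits = regexs(0) if regexm(make, "[0-9]+")

* Replace the first run of digits with a placeholder
generate masked = regexr(make, "[0-9]+", "#")

* Two-argument arc tangent
display atan2(1, 1)
```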

4. The following existing functions have been renamed:

old name          new name
--------------------------------------
**index()**           **strpos()**
**binorm()**          **binormal()**
**match()**           **strmatch()**
**norm()**            **normal()**
**invnorm()**         **invnormal()**
**normd()**           **normalden()**
**lnfact()**          **lnfactorial()**
**issym()**           **issymmetric()**
**syminv()**          **invsym()**
--------------------------------------

Old names continue to work. The functions were renamed because the new
names are clearer and because Mata uses the new names, so you can use
the same names in both environments.

5. The following existing functions now have two names, and you can use
either:

Name 1            Name 2
--------------------------------------
**lower()**           **strlower()**
**upper()**           **strupper()**
**proper()**          **strproper()**
**ltrim()**           **strltrim()**
**rtrim()**           **strrtrim()**
**trim()**            **strtrim()**
**reverse()**         **strreverse()**
**string()**          **strofreal()**
**int()**             **trunc()**
**length()**          **strlen()**
--------------------------------------

In this case, throughout the Stata documentation, we use name 1, but
you can use name 1 or name 2 in your Stata expressions. Name 2
matches the name of the Mata function that does the same thing, so
you may want to standardize on name 2.

6. The following **egen** functions have been renamed:

old name new name
------------------------
**any()** **anyvalue()**
**eqany()** **anymatch()**
**neqany()** **anycount()**
**rfirst()** **rowfirst()**
**rlast()** **rowlast()**
**rmean()** **rowmean()**
**rmin()** **rowmin()**
**rmiss()** **rowmiss()**
**robs()** **rownonmiss()**
**rsd()** **rowsd()**
**rsum()** **rowtotal()**
**sum()** **total()**
------------------------

The new names are more consistent. Old names continue to work but
are not documented.
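
The renamed functions might be used as in this sketch. The **auto** dataset
and the new variable names are assumptions chosen only for illustration.

```stata
* Illustrative sketch only: dataset and new variable names are assumptions
sysuse auto, clear

* egen functions under their new names (formerly rmean(), rmiss(), sum())
egen avg_dim   = rowmean(length turn)
egen n_missing = rowmiss(price rep78)
egen tot_wt    = total(weight)

* Expression functions under their new names (formerly norm(), invnorm(), index())
display normal(1.96)
display invnormal(0.975)
display strpos("Stata", "at")
```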

__What's new: Data management__

1. There is a new manual **[D] Data management**, and the data-management
commands have been moved from **[R]** to **[D]**. See **[D] intro** for an
expanded what's new for data-management capabilities.

2. Existing command **set** **type** now has a **permanently** option. You can now
permanently set the default datatype to either **float** (the factory
default) or **double**.

3. New commands **xmlsave** and **xmluse** save and restore datasets in
Extensible Markup Language (XML) format. Data may be saved or used in
either Stata **dta** XML format or Microsoft Excel's SpreadsheetML
format. See **[D] xmlsave**.

4. New commands **fdasave**, **fdause**, and **fdadescribe** save, use, and
describe files in the format required by the U.S. Food and Drug
Administration (FDA) for new drug and device applications (NDAs).
These commands are designed to assist people making submissions to
the FDA, but the commands are general enough for use in transferring
data between SAS and Stata. The FDA format is identical to the SAS
XPORT Transport format. See **[D] fdasave**.

5. Value labels may now be up to 32,000 characters long.

6. Existing command **label** has a new subcommand **language** that lets you
create and use datasets containing different variable, value, and
data labels, which might be in different languages. See **[D] label**
**language**.
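
As a sketch of how **label language** might be used, the example below keeps
two sets of variable labels in one dataset. The **auto** dataset, the language
names **en** and **de**, and the label text are assumptions chosen only for
illustration.

```stata
* Illustrative sketch only: dataset, language names, and labels are assumptions
sysuse auto, clear

* Name the existing set of labels "en"
label language en, rename

* Create a second, empty set of labels named "de" and make it current
label language de, new
label variable price "Preis"
label variable weight "Gewicht"

* Switch back to the first set
label language en

* List the languages defined in the dataset
label language
```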

7. Datasets from the examples in the Stata manuals can now be browsed,
described, and used. Type help dta contents, or select **File** **>**
**Example** **datasets...** from the Stata menu.

8. **statsby** is now a prefix command; see **[U] 11.1.10 Prefix commands**.
For information on its new syntax, see **[D] statsby**. Enhancements to
**statsby** include

a. Rather than requiring a list of expressions for the statistics
to collect, **statsby** now collects a default set.

b. Expressions to be computed and saved can now be grouped together
as equations; see exp_list.

c. String variables are now allowed.

d. Weights are now allowed.

e. New option **force** forces **statsby** to work with survey estimators.
By default, this is prevented because the method **statsby** uses to
select subsamples will generally not produce appropriate
standard-error estimates with survey data (the **subpop** option
must be used with survey data).

f. Dots showing the progress of computations are now shown by
default.

g. New option **nolegend** suppresses the table reporting on what
**statsby** is running.

9. New command **filefilter** copies an input file to an output file while
converting a specified ASCII or binary pattern to another pattern;
see **[D] filefilter**.

10. New command **expandcl** replicates clusters of unique observations,
much like **expand**, but for clustered data; see **[D] expandcl**.

11. New command **tostring** converts numeric variables to string; see **[D]**
**tostring**.

12. Existing command **codebook** now allows **if** and **in** qualifiers; see **[D]**
**codebook**.

13. New command **rmdir** removes an existing directory (folder); see **[D]**
**rmdir**.

14. New command **clonevar** makes an identical copy of an existing
variable; see **[D] clonevar**.
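
Two of the new commands above, **tostring** and **clonevar**, can be sketched
briefly. The **auto** dataset and the new variable names are assumptions
chosen only for illustration.

```stata
* Illustrative sketch only: dataset and new variable names are assumptions
sysuse auto, clear

* Identical copy of mpg, including its storage type, label, and format
clonevar mpg_copy = mpg

* String version of a numeric variable, stored in a new variable
tostring rep78, generate(rep78_str)

describe mpg mpg_copy rep78 rep78_str
```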

15. Existing commands **icd9** and **icd9p** have been updated to use the V21
codes; see **[D] icd9** and **[D] icd9p**.

16. Existing command **encode** has new option **noextend** that prevents adding
new value label mappings; see **[D] encode**.

17. Existing command **odbc** for accessing Open DataBase Connectivity
(ODBC) data sources has the following enhancements:

a. ODBC is now supported under Mac OS X and Linux systems that use
the iODBC Driver Manager. For more information on configuring
ODBC for Mac and Linux, see the FAQ at
http://www.stata.com/support/faqs/data/odbcmu.html.

b. **odbc** has new subcommands **odbc insert** and **odbc exec** for writing
data to an ODBC data source. Positioned updates can be
performed using the **odbc exec** command.

c. **odbc** has a new subcommand **sqlfile** for batch processing SQL
instructions.

d. **odbc load** has a new option **sqlshow** for debugging SQL
communication with ODBC drivers.

e. **odbc load** has new options **allstring** and **datestring**, which import
either all data or just dates as strings.

See **[D] odbc**.

18. Existing command **merge** has the following new features:

a. It now accepts multiple **using** files.

b. New option **nosummary** suppresses creating variables that
summarize how the records were merged.

c. New option **sort** sorts the master and using datasets if
they are not already sorted.

d. Existing options **unique**, **uniqmaster**, and **uniqusing** now require
you to specify matching variables.

e. Warning messages are now given when matching variables do not
uniquely identify observations.

See **[D] merge**.

19. Existing commands **merge** and **append** now incorporate all notes from
the using dataset that do not already appear in the master dataset,
unless new option **nonotes** is specified; see **[D] merge** and **[D]**
**append**.

20. Existing command **contract** has new options **cfreq()**, **percent()**,
**cpercent()**, **float**, and **format()** to create frequency and percentage
variables; see **[D] contract**.

21. Existing commands **corr2data** and **drawnorm** now support triangular
specification of the correlation or covariance matrix; see **[D]**
**corr2data** and **[D] drawnorm**.

22. Existing command **separate** has new option **shortlabel** to specify that
shorter variable labels be created; see **[D] separate**.

23. Existing command **outfile** has new option **missing** that preserves both
standard and extended missing values when the **comma** option is also
specified; see **[D] outfile**.

24. Existing command **clear** now performs **mata:** **mata** **clear** in addition to
everything else; see **[D] clear**.
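
A few of the data-management additions above can be tried together. The following is a minimal sketch, assuming the **auto** example dataset shipped with Stata is available through **sysuse**; the new variable names (**mpg_copy**, **rep78_str**, **n_obs**, **cum_n**, **pct**, **cum_pct**) are chosen here only for illustration.

```stata
* Minimal sketch using the auto example dataset (assumed to be installed)
sysuse auto, clear

* statsby as a prefix command: with no exp_list, a default set
* (the coefficients) is collected for each by-group
preserve
statsby, by(foreign) clear nolegend: regress mpg weight
list
restore

* clonevar: identical copy of an existing variable
clonevar mpg_copy = mpg
assert mpg_copy == mpg

* tostring: numeric variable to string, keeping the original
tostring rep78, generate(rep78_str)

* contract with the new frequency and percentage options;
* this replaces the data in memory with one row per combination
contract foreign rep78, freq(n_obs) cfreq(cum_n) percent(pct) cpercent(cum_pct)
list
```

Because **statsby** with **clear** and **contract** both replace the data in memory, the sketch wraps the first in **preserve** and **restore** and leaves **contract** for last.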

__What's new: Graphics__

1. Stata now allows multiple Graph windows. The existing **name()** option
now creates a named graph and displays it in its own window. See
**What's new: User interface** below.

2. New command **sunflower** draws sunflower density-distribution plots;
see **[R] sunflower**.

3. **graph twoway** has two new *plottypes* for plotting time-series data,
**tsline** and **tsrline**; see **[TS] tsline** and **[G] graph twoway tsline**.

4. Graphs have better axis labels when graphing dates.

5. **graph twoway** has seven new options that are useful when plotting
time-formatted variables: **tscale()**, **tlabel()**, **tmlabel()**, **ttick()**,
**tmtick()**, **tline()**, and **ttext()**; see **[G]** *axis_options*, **[G]**
*added_line_options*, and **[G]** *added_text_options*.

6. **graph twoway** has seven new *plottypes* for plotting paired-coordinate
data -- data with 4 variables, where two variables form a starting
x-y point and the other two variables form an ending x-y point. The
new *plottypes* are

*plottype* Description
--------------------------------------------------------------------
**pcarrow** plots a directional arrow for each observation's paired
coordinates
**pcbarrow** plots a two-headed arrow for each observation's paired
coordinates
**pcspike** plots a line or spike for each observation
**pccapsym** plots a line with symbols at each end for each
observation
**pcscatter** plots both pairs of x-y variables as a scatter, using a
common style
**pci** immediate form of paired-coordinate plots; plots the
specified coordinate pairs
**pcarrowi** immediate form of **pcarrow**
--------------------------------------------------------------------

See **[G] graph twoway pcarrow**, **[G] graph twoway pcbarrow**, **[G] graph**
**twoway pcspike**, **[G] graph twoway pccapsym**, **[G] graph twoway**
**pcscatter**, **[G] graph twoway pci**, and **[G] graph twoway pcarrowi**.

7. **graph twoway**, **graph bar**, **graph box**, and **graph dot** have new option
**aspectratio()** that controls the aspect ratio of a plot region; see
**[G]** *aspect_option*.

8. **graph display** has new option **scale()** that allows all text, symbols,
and line widths to be rescaled when a graph is redisplayed; see **[G]**
**graph display**.

9. **graph export** supports new export formats TIFF, PNG (portable network
graphics), and TIFF previews for EPS files. See **[G] graph export**.

10. New option **preview()** with **graph** **export** embeds a preview of the graph
so that it can be viewed in publishing applications; see **[G] graph**
**export** and **[G]** *eps_options*.

11. **graph** now supports CMYK output to PostScript and Encapsulated
PostScript (EPS) files. CMYK stands for Cyan-Magenta-Yellow-blacK
and is popular in the printing industry. See **[G] graph export** and
**[G]** *ps_options*.

**palette color** has the new option **cmyk**, specifying that color values
be reported in CMYK; see **[G] palette**.

12. **graph box** can now label outside values using option **marker()**; see
**[G] graph box** and **[G]** *marker_label_options*.

13. **graph bar** has new options **over(, reverse)** and **yvaroptions(reverse)**
to specify that the categorical scale be reversed, that it run from
maximum to minimum; see **[G] graph bar**.

14. **graph twoway** has new option **pcycle()** that specifies the maximum
number of plots that may appear on a graph before the pstyles
recycle to the first style; see **[G]** *advanced_options*.

15. **graph combine** has new option **altshrink** that provides alternate
sizing of the text, markers, line thickness, and line patterns on
the individual combined graphs; see **[G] graph combine**.

16. **graph** has improved control over whether the largest and smallest
possible grid lines are drawn. This control is provided by
improving the actions of the existing suboptions [**no**]**gmin** and
[**no**]**gmax**; see **[G]** *axis_label_options*.

17. **graph bar**, **graph dot**, **graph box**, and **graph pie** have new option
**allcategories** specifying that the legend include all **over()** groups,
not just groups in the sample specified by **if** and **in**. See, for
example, **[G] graph bar**.

18. **graph**, and all other commands that draw graphs, have new options for
changing the color of objects and changing the appearance of lines:

a. Options **lstyle()**, **lcolor()**, **lwidth()**, and **lpattern()** are now
accepted anywhere the **cl**<*attribute*> and **bl**<*attribute*> options were
allowed. Specifically, the new options replace the following
original options:

new options original options
---------------------------------------
**lstyle()** **clstyle()**, **blstyle()**
**lcolor()** **clcolor()**, **blcolor()**
**lwidth()** **clwidth()**, **blwidth()**
**lpattern()** **clpattern()**, **blpattern()**
---------------------------------------

The new options can be applied to all lines -- lines connecting
points, lines outlining bars, lines around text boxes, etc. The
original option names continue to work but are undocumented.

b. New option **fcolor()** changes area fill colors and can be used
anywhere **bfcolor()** or **afcolor()** were allowed. **bfcolor()** and
**afcolor()** continue to work but are undocumented.

c. New option **color(***arg***)** sets all of a plot's colors; it is the
equivalent of specifying **mcolor(***arg***)**, **lcolor(***arg***)**, and
**fcolor(***arg***)**.

19. The syntax of the ROC curve commands is now consistent across all
the ROC commands -- **roctab**, **roccomp**, **rocgold**, and **rocplot** -- with
some new options added and some old options changing names. The
original options continue to work but are undocumented. See **[R]**
**roctab** and **[R] rocfit postestimation**.

20. Existing commands **fracplot** and **lowess** have new option **lineopts()**
that replaces the confusingly named **rlopts()**.

21. Option **plot()**, available on many graph commands, has been renamed
**addplot()**. **addplot()** allows **twoway** plots, such as scatters, lines,
or function plots to be added to most statistical graph commands.

22. Command **kdensity** has new option **epan2** providing an alternate
Epanechnikov kernel; see **[R] kdensity**. Accordingly, **sts graph** and
**stcurve** now allow **kernel(epan2)** for specifying this new kernel.

23. The base margin for **histogram** graphs is now zero.
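
Several of the graphics changes above can be combined in one short do-file. This is a minimal sketch, assuming the **auto** example dataset is available and that the Stata in use can export PNG files; the graph names **fit_graph** and **arrows**, the output file name, and the coordinates passed to **pcarrowi** are arbitrary illustrations.

```stata
* Minimal sketch using the auto example dataset (assumed to be installed)
sysuse auto, clear

* lcolor(), lwidth(), and lpattern() in place of the cl<attribute> options,
* aspectratio() for the plot region, and name() for a separate Graph window
twoway (scatter mpg weight, mcolor(navy))                               ///
       (lfit mpg weight, lcolor(maroon) lwidth(thick) lpattern(dash)),  ///
       aspectratio(1) name(fit_graph, replace)

* Redisplay the named graph with text, symbols, and line widths rescaled
graph display fit_graph, scale(1.2)

* Immediate paired-coordinate arrows; each arrow is given as y1 x1 y2 x2
twoway pcarrowi 1 1 3 4  2 4 1 2, name(arrows, replace)

* Export the first graph in the new PNG format
graph export fit_graph.png, name(fit_graph) replace
```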

__What's new: User interface__

Stata 9 has a number of new features in the graphical user interface
(GUI) that are shared across all platforms, such as multiple Viewer and
Graph windows. There are also some significant improvements that affect
only Windows, such as dockable windows. Most GUI features are documented
in the **Getting** **Started** manual.

1. New versions of Stata are available:

a. Stata for Intel Itanium-based PCs running 64-bit Windows.

b. Stata for x86-64 standard systems, including those based on AMD
Opteron chips, Athlon-64 chips, and Intel Xeon EM64T chips
running 64-bit Windows.

c. Stata for Intel Itanium-based PCs running 64-bit Linux.

d. Stata for x86-64 standard systems running 64-bit Linux.

2. Stata for Windows and Stata for Mac now have automatic update
checking (nothing is ever downloaded without your confirmation).
The first time you start Stata and every 7th day afterward, you will
be prompted whether to check for updates.

To control how often you are prompted, or to turn the feature off,
select **Prefs** **>** **General** **Preferences**, and select **Internet**; or you can
type **set** **update_interval** *#* or **set** **update_query** **off** at the Stata
prompt; see **[R] update**.

3. Stata now allows multiple Viewer windows so that you can, for
example, simultaneously view the help for several commands and the
results from several logs or search queries.

There are several ways to open another Viewer window.

a. While viewing something in a Viewer, hold down the shift key,
and click on any link. A new Viewer will appear displaying the
contents of the link.

b. Right-click on the link, and choose **Open** **Link** **in** **New** **Viewer**.
That does the same thing.

c. Click with the middle mouse button on the link. That also does
the same thing.

d. Right-click anywhere in an open Viewer, and choose
**Open** **New** **Viewer**. This will open a new Viewer displaying **help**
**contents**.

See **5. Using the Viewer** in the *Getting Started* manual.

4. The Viewer also has the following new features:

a. It supports links within documents, including help files. You
will see this feature used extensively in Stata's online help.

b. It has the ability to search for text within the window. Click
on the find icon that looks like a pair of binoculars at the top
right of the Viewer.

c. It now remembers its position in the document when you click
**Refresh**.

In addition, both the Viewer and Results windows no longer underline
links when they are displayed on a white background. You can change
this by selecting **Prefs** > **General** **Preferences**.

5. Stata now allows multiple Graph windows. The existing **name()** option
of **[G] graph** now creates a named graph and displays it in its own
window of the same name.

Graph-management commands do what you would expect with the named
windows; **graph drop** drops the graph and closes its window; **graph**
**rename** renames both the graph and its window; and so on. Note that
closing a Graph window does not delete the underlying graph and the
graph can be redisplayed with **graph display**.

6. The **Window** menu now supports multiple Viewer and Graph windows:

a. You can switch to specific Viewers or Graphs from this menu.

b. Menu item **Window > Viewer > Close All Viewers** closes the
Viewers.

c. Menu item **Window > Graph > Close All Graphs** closes the graphs.

7. There are a number of enhancements to the toolbar:

a. The **Open** button now has a menu that shows recently opened
datasets and allows you to reopen those datasets with a click.
This even includes datasets loaded over the web from
**File** **>** **Example** **Datasets...** or with **webuse**.

b. The **Print** button has a new menu that lets you select the window
to print.

c. The **Viewer** button lets you switch to any Viewer or close all
Viewers.

d. The **Graph** button lets you switch to any Graph or close all
Graphs.

e. The **Do-file** **Editor** button lets you switch to any Do-file Editor
(Windows and Mac).

8. A number of new features and improvements are available under the
**File** menu:

a. Recently opened datasets can now be reopened by selecting
**File** **>** **Open** **Recent**, and recently opened do-files or ado-files
can likewise be reopened from within the **Do-file** **Editor** by
selecting **File** **>** **Open** **Recent**.

b. **File** **>** **Print** lets you select the window to print.

c. All the datasets shipped with Stata and all the datasets used in
the examples in the manuals can be browsed and loaded by
selecting **File** **>** **Example** **Datasets...**

9. Stata now allows multiple **Do-file** editors under Windows and Mac.
See **14.** **Using** **the** **Do-file** **Editor** in the *Getting Started* manual.

10. Contextual menus for common tasks, such as setting preferences,
copying to the clipboard, and printing, are now available in all
windows; right-click in the window to access them.

11. You can now define multiple windowing preferences and switch easily
among those preferences. For example, you might use small fonts and
large Review and Variables windows for your normal work, but use
large fonts with hidden Review and Variables windows for
presentation. Access this new feature by selecting
**Prefs** **>** **Manage** **Preferences**.

12. The **Data Editor** has several enhancements:

a. The contents of string variables and variables with value labels
are now shown in different colors so that they can be easily
distinguished.

b. Variables with value labels can now be displayed as either the
value of the variable or the label.

c. For variables with value labels, you now may change the value of
the variable by right-clicking on the cell and selecting
**Select Value from Value Label**. You may then select the value
and label from a list.

d. You may now associate an existing value label with a variable by
right-clicking on the variable's column and selecting a value
label from **Assign Value Label to Variable**.

e. You may now define or modify value labels from within the Data
Editor by right-clicking and selecting
**Define/Modify Value Labels...**.

f. You can now access and modify the preferences for the Data
Editor by right-clicking in the editor and selecting
**Preferences...**.

13. Dialogs have new features:

a. Keyboard shortcuts for **Copy**, **Paste**, and **Cut** now work.

b. Anywhere that you need to select a variable or variables for a
*varlist*, you may now select those variables from a drop-down
list (Windows and Mac).

c. The new copy button will copy the command built by the dialog to
the clipboard. The button appears just right of the refresh
button at the bottom left of each dialog. It works just like
**Submit**, but rather than executing the command, it pastes the
command.

d. Pressing the **Return** key now works the same as clicking **OK**;
pressing **Shift**+**Return** works the same as clicking **Submit**.
Pressing the **Escape** key works the same as clicking **Cancel**.

e. Pressing the space bar when the keyboard focus is on a radio
button works the same as clicking on the radio button.

f. Keyboard arrow keys now work with dialog spinner controls.

g. Estimation-command dialogs are laid out better, with the model
specification always appearing on the **Model** tab. You can also
now select standard error (SE) types with a single click in the
**SE/Robust** tab (which includes bootstrap and jackknife SEs as
options for most estimators).

h. The **twoway** **graph** dialog boxes are laid out better, with easier
selection of the plottype (scatter, line, range bar, etc.) and
the addition of the new paired-coordinate and time-series plottypes.

In addition, the printed manual and online documentation do a better
job of describing the options and controls available on a dialog.
The option entries in the manual and online are grouped into
categories that match the tabs on the dialog box.

14. Stata for Windows has vastly improved flexibility for managing your
work environment:

a. Most windows -- the dockable ones -- can now be docked with the
main Stata window or with each other. By dragging a dockable
window over another dockable window, you may create either a
single-paned window, containing both the original windows with a
separator in between, or a single window with tabs for each of
the original windows. The Viewer, Command, Review, and
Variables windows are all dockable.

In addition, any of these windows can either be attached
(docked) to the main Stata window or detached and made free
floating. Each also has a **pin** icon in the title bar that makes
the window always shown, or makes it roll up into its title bar
when undocked, or makes it shown only as a tab when docked. For
an overview of these features, see **4.** **The** **Stata** **user** **interface**
in the *Getting Started* manual.

b. Most windows can be moved outside the main Stata window. These
include the Graph, Viewer, Browse, and Edit windows, and include
all dialogs.

c. The toolbar can be detached and repositioned.

d. Double-clicking the Results window, when it is docked, merges it
with the main Stata window as the primary document. This saves
some screen real estate, and we suggest that you try it.
Double-click again to undo it.

e. A number of new window preferences available from the **Windowing**
tab under **Prefs >** **General** **Preferences...** let you control how
windows behave and how they dock. You can lock paned windows so
that they cannot be resized, turn on or off docking, turn on or
off the docking guides, make all windows floating, make the
contents of Viewers persistent so that they maintain their
contents between Stata sessions, and even turn off all the
advanced windowing features to lock your current settings.

f. As with Stata on all other platforms, you can now save multiple
windowing preferences and choose the one most appropriate for
what you are doing, for example, working at home, giving a
presentation, etc.

If you are fond of the way Stata for Windows worked prior to Stata
9, or you like to maximize your Stata window, we suggest that you
select from the menu
**Prefs > Manage Preferences > Load Preferences > Maximized**. Even so,
we recommend that you try using the new layout without maximizing
the Stata window.

15. You may now copy the **Review** window to the Clipboard. Right-click in
the window to access the contextual menu.

16. **help** now displays in the Viewer window; new command **chelp** displays
in the Results window. **help** also has two new options:

a. **nonew** displays help in the topmost viewer rather than in a new
one.

b. **name(***viewername***)** displays the help in the specified viewer. If
that name does not exist, a new viewer will be created with that
name.

17. You may now define and access **notes** for a variable by right-clicking
on the variable name in the **Variables** window. Right-clicking on an
empty space allows you to define and access notes for the dataset.

18. The **Do-file** **Editor** has a new SMCL preview button on its toolbar that
displays the current file in the Viewer as rendered SMCL.

19. (Windows and Mac) You can now copy selected text as an HTML table
using **Edit** **> Copy** **Table** **as** **HTML**.

20. (Unix) The minimize keyboard shortcut <Ctrl>-m has been added to all
windows.

21. (Unix) You can now use the Window menu's keyboard shortcuts from any
window.

22. (Mac) You can now increase or decrease the font size in a window by
pressing Apple + and Apple -.

23. (Mac) The ability to undo or redo multiple actions has been added to
the Do-file editor.

24. (Mac) You can now have Stata automatically bring all windows to the
front when it is active by selecting **Prefs > General Preferences...**.

25. (Mac) You can now have Stata automatically snap windows to the edge
of the main Stata window or to the edge of the screen when you move
or resize them by selecting **Prefs > General Preferences...**.

26. (Mac) You can move all of Stata's currently open windows
simultaneously by holding down the Control key while dragging one of
the windows. This will also bring all Stata's open windows to the
foreground.

27. (Mac) The toolbar may be a floating window or may be anchored to the
menubar. The advantage of making the toolbar float is that it takes
up less room on the screen and can be moved. Access this feature
using **Window** **>** **Toolbar**.
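
The few interface features above that have command equivalents can be set from the Stata prompt. This is a minimal sketch, assuming Stata for Windows or Mac (the update-checking settings apply only there); the 14-day interval and the Viewer name **reghelp** are arbitrary illustrations.

```stata
* Ask about checking for updates every 14 days instead of every 7
set update_interval 14

* Or stop the update prompt altogether
set update_query off

* Open help in a Viewer named reghelp (created if it does not exist)
help regress, name(reghelp)

* Show help in the topmost Viewer instead of opening a new one
help summarize, nonew
```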

__What's new: Programming__

1. Mata, Stata's new matrix-programming language, can be used to code
ado-file subroutines; see **[M-1] ado**.

2. New command **viewsource** displays official and user-written source
code. **viewsource** searches for the specified file along the adopath
and displays the file in the Viewer. This works not only for ado
programs, but also for Mata functions that are programmed themselves
in Mata. See **[P] viewsource**.

3. Programmers of estimation commands or commands that work with
estimation results can tie postestimation analysis facilities into
estat, making their postestimation facilities behave just like those
shipped with Stata; see estat programming.

4. New command **matlist** provides extensive format control for displaying
a matrix; see **[P] matlist**.

5. Macro-extended functions that work on matrices will now work on the
matrices stored in **r()** and **e()**, including **e(b)** and **e(V)**. These
extended functions include **rownames**, **colnames**, **roweq**, **coleq**,
**rowfullnames**, and **colfullnames**. See **[P] matrix**.

6. c() (c-class returned values) has the following new items:

item description
-----------------------------------------------------
**c(Wdays)** "Sun Mon ... Sat"
**c(Weekdays)** "Sunday Monday Tuesday ... Saturday"
**c(alpha)** "a b c d e f g h i ... x y z"
**c(ALPHA)** "A B C D E F G H I ... X Y Z"
**c(Mons)** "Jan Feb ... Dec"
**c(Months)** "January February March ... December"
**c(tracehilite)** pattern to be highlighted in trace log
**c(maxiter)** maximum iterations for maximum
likelihood estimators
**c(varabbrev)** whether variable abbreviation is on
-----------------------------------------------------

7. A program can now be assigned properties when the program is
declared, and those properties can be checked using macro-extended
functions. Specifically,

a. **program** has the new option **properties()**, which attaches
properties to programs; see **[P] program**.

b. A new **properties** macro-extended function allows programmers to
obtain the list of properties attached to a program; see **[P]**
**macro**.

To learn more, see **[P] program properties**.

8. Estimation results can now be assigned properties using new option
**properties()** of **ereturn post** and **ereturn repost**. These property
settings can be checked with the new function **has_eprop()**. See **[P]**
**ereturn** and **[FN] Programming functions**.

9. **ereturn post** now allows posting results without a beta vector, **e(b)**,
or a covariance matrix, **e(V)**.

10. **version** has new option **born()** to prevent the program from running if
the date of the Stata executable is earlier than the specified date.
**version** also issues more descriptive error messages. See **[P]**
**version**.

11. On Microsoft Windows and Unix platforms, the new command **window**
**manage maintitle** allows you to reset the main title of the Stata
window; see **manage maintitle** under **[P] window programming**.

12. New command **levelsof** displays a sorted list of the distinct values
of a variable. This is especially useful for looping over the
values of a variable with, say, **foreach**. See **[P] levelsof**.

13. Plugins (also known as DLLs or shared objects) written in C can now
be incorporated into Stata to create new Stata commands; see **[P]**
**plugin**.

14. The maximum number of description lines in a stata.toc file has been
increased from 10 to 50; see **[R] net**.

15. New undocumented command **_coef_table** is a programmer's tool for
displaying coefficient tables; see **_coef_table**.

16. **trace** has new setting **set tracehilite** to highlight a specified
pattern in the trace output; see **[P] trace**.

17. The functionality of **macval()** has been extended to macro
dereferencing of values in a class. For example, **`macval(.a.b.c)'**
causes the class reference **.a.b.c** to be macro expanded only once,
rather than being recursively re-expanded when the result itself
contains a macro reference.

18. Variable abbreviation can now be turned on and off using the new **set**
**varabbrev**; see **[R] set** or type **help** **set varabbrev**.

19. Command **syntax** has new specifier **syntax anything(everything)** that
specifies that **anything** include **if**, **in**, and **using**; see **[P] syntax**.

20. Command **syntax** has new option descriptor **cilevel** that restricts valid
arguments to a standard confidence level and issues appropriate
error messages for invalid entries; see **[P] syntax**.
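
Some of the programming additions listed so far (**levelsof**, the new **c()** items, and program properties) can be exercised in a short do-file. This is a minimal sketch, assuming the **auto** example dataset is available; the program name **myprog** and the property names **demo_a** and **demo_b** are hypothetical and carry no meaning to Stata.

```stata
* Minimal sketch using the auto example dataset (assumed to be installed)
sysuse auto, clear

* levelsof: store the sorted distinct values in a local, then loop over them
levelsof rep78, local(levels)
foreach lev of local levels {
    quietly count if rep78 == `lev'
    display "rep78 = `lev': " r(N) " cars"
}

* New c() items: loop over the abbreviated weekday names
foreach day in `c(Wdays)' {
    display "`day'"
}
display "`c(Months)'"

* Program properties: attach them at declaration, read them back
* with the properties macro-extended function
capture program drop myprog
program define myprog, properties(demo_a demo_b)
    display "hello from myprog"
end
local props : properties myprog
display "`props'"
```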

21. A number of new directives and extensions to existing directives
have been added to SMCL. They are summarized below within broad
categories; see smcl for complete documentation.

__Jumping to marked locations in help or other files__
directive description
--------------------------------------------------------------------
**{marker pos1}** marks the current position in a file as
**pos1**
**{help "regress##pos1"}** opens the help file regress.hlp at the
marked position **pos1**
**{view "my.smcl##pos1"}** opens the file my.smcl at the marked
position **pos1**
--------------------------------------------------------------------

__Opening help or other files in new or multiple Viewers__
directive description
--------------------------------------------------------------------
**{help "regress##|mywin"}** opens the help file regress.hlp in a
new Viewer window named **mywin**
**{help "regress##pos1|mywin"}** opens the help file regress.hlp at the
marked position **pos1** in a new Viewer
window named **mywin**
**{help "regress##|_new"}** opens the help file regress.hlp in a
new Viewer window

**{view "my.smcl##|mywin"}** opens the file my.smcl in a new Viewer
window named **mywin**
**{view "my.smcl##pos1|mywin"}** opens the file my.smcl at the marked
position **pos1** in a new Viewer window
named **mywin**
**{view "my.smcl##|_new"}** opens the file my.smcl in a new Viewer
window
--------------------------------------------------------------------

__Special formatting of links to help files__
directive description
--------------------------------------------------------------------
**{helpb** *help_topic***}** creates link to *help_topic*.hlp, just
like **{help ***help_topic***}**, but displays
the link in bold.
**{helpb** *help_topic***:***text***}** creates link to *help_topic*.hlp, just
like **{help ***help_topic***:***text***}**, but
displays *text* in bold.

**{manhelp** *help_topic* *R*:*text*} displays **[R] text** and links to
*help_topic*.hlp
**{manhelpi** *help_topic* *R*:*text*} displays **[R]** *text* and links to
*help_topic*.hlp
--------------------------------------------------------------------

__Two-column tables with indented wrapping of last column__
directive description
--------------------------------------------------------------------
**{p2colset** *#* *#* *#* *#***}** declares column spacing for ensuing table
lines that use the **{p2col:**...**}** directive
**{p2colreset}** restores default column spacing
**{p2col:***text 1***}** displays *text 1* in column 1 and enters
paragraph mode in column 2 for any text
that follows until a paragraph end is
signaled; an extended syntax allows
columns to be specified
**{p2coldent:***text 1***}** just like **{p2col}**, except *text 1* is output
with a standard indentation for
syntax-diagram-option tables
**{p2line}** draws a line the width of the table;
extended syntax allows margins around the
line
--------------------------------------------------------------------

__New documentation conventions for syntax-diagram-option tables__
directive description
--------------------------------------------------------------------
**{synoptset** [*#*] [**tabbed**]**}** declares default column spacing for
syntax-diagram-option tables
**{synopt**[**:***option_text*]**}** displays *option_text* in column 1 and
enters paragraph mode in column 2 for
any text that follows until the
paragraph terminates
**{syntab:***text***}** outputs text positioned as a subheading
or "tab" in a syntax diagram option
table
**{synopthdr**[**:***column1_text*]**}** displays a standard header for a
syntax-diagram-option table
**{synoptline}** draws a horizontal line extending to the
boundaries of the previous **{synoptset}**
--------------------------------------------------------------------

__New documentation conventions for variables and varlists__
directive description
--------------------------------------------------------------------
**{newvar}** displays *newvar* while providing a link to help
newvar; new convention for documenting that a
command accepts a new variable
**{varname}** displays *varname* while providing a link to help
varname; new convention for documenting that a
command accepts a variable
**{var}** displays *varname* while providing a link to help
varname; abbreviated form of **{varname}**
**{varlist}** displays *varlist* while providing a link to help
varlist; new convention for documenting that a
command accepts a varlist
**{vars}** displays *varlist* while providing a link to help
varlist; abbreviated form of **{varlist}**
**{depvar}** displays *depvar* while providing a link to help
depvar; new convention for documenting that a
command accepts a dependent variable
**{depvarlist}** displays *depvarlist* while providing a link to help
depvarlist; new convention for documenting that a
command accepts a list of dependent variables
**{depvars}** displays *depvars* while providing a link to help
depvarlist; abbreviated form of **{depvarlist}**
**{indepvars}** displays *indepvars* while providing a link to help
varlist; new convention for documenting that a
command accepts a list of independent variables
--------------------------------------------------------------------
Note that the only change in convention is the addition of links to
help files describing the syntax of variables and varlists.

Each of the above directives also accepts an optional argument that
is displayed immediately after the standard display but does not
otherwise change the link; for example, **{varlist:_1}** displays
*varlist_1* but continues to link to help varlist.

__Other new documentation conventions__
directive description
--------------------------------------------------------------------
**{ifin}** displays [*if*] [*in*] while providing links to
if and in; new convention for documenting
support for **if** and **in** in a syntax diagram
**{weight}** displays [*weight*] while providing a link to
help weight; new convention for
documenting support for weights in a
syntax diagram
**{dtype}** displays [*type*] while providing a link to
help datatypes; new convention for
documenting that a command accepts an
optional datatype in its syntax
**{dlgtab:***text***}** displays *text* while giving it the
appearance of a labeled dialog tab
(extended forms support additional
formatting)
--------------------------------------------------------------------
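
As a brief illustration of the directives in the two tables above, a syntax diagram in a help file might be written as in the sketch below. The command name **mycmd** is hypothetical and is used only for illustration.

```smcl
{p 8 16 2}
{cmd:mycmd} {depvar} {indepvars} {ifin} {weight} [{cmd:,} {it:options}]
{p_end}

{p 8 16 2}
{cmd:mycmd} {varlist:_1} {ifin}
{p_end}
```

Following the descriptions above, the first diagram should render as mycmd *depvar* *indepvars* [*if*] [*in*] [*weight*] [, *options*], with each element linking to its help file, and the second should show *varlist_1* while still linking to help varlist.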

__Directives that simplify documenting options__
directive description
--------------------------------------------------------------------
**{opt** *optname***}** document options; equivalent to
**{cmd:***optname***}**

**{opt** *opt***:***name***}** document options that can be abbreviated;
**{opt my:opt}** displays __my__**opt**

**{opt my:opt(***arg***)}** document options that take arguments; in
this example, the option is named **myopt**
and can be abbreviated **my** - __my__**opt(***arg***)**;
directive will correctly display
arguments that are lists, such as *a,b,...*
or *a|b|c|...*

**{opth my:opt(***arg***)}** like **{opt my:opt(***arg***)}** documents options
that take arguments, but also provides a
link to help for *arg*; for example,
**{opth my:opt(varlist)}** displays
__my__**opt(***varlist***)**; extended syntax allows
the linked help to differ from the
displayed argument
--------------------------------------------------------------------
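
A short sketch of how these option directives might appear in the options section of a help file follows. The options shown belong to a hypothetical command and are assumptions made for illustration.

```smcl
{p 4 8 2}
{opt nocons:tant} suppresses the constant term.
{p_end}

{p 4 8 2}
{opt l:evel(#)} sets the confidence level.
{p_end}

{p 4 8 2}
{opth by(varlist)} repeats the analysis for each group defined by {it:varlist}.
{p_end}
```

In the first two entries the minimum abbreviation (**nocons** and **l**) is underlined; in the third, the argument *varlist* is additionally a link to help varlist.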

__Directives abbreviating standard paragraph forms__
directive description
--------------------------------------------------------------------
**{pstd}** equivalent to **{p 4 4 2}**
**{psee}** equivalent to **{p 4 13 2}**
**{phang}** equivalent to **{p 4 8 2}**
**{pmore}** equivalent to **{p 8 8 2}**
**{pin}** equivalent to **{p 8 8 2}**
**{phang2}** equivalent to **{p 8 12 2}**
**{pmore2}** equivalent to **{p 12 12 2}**
**{pin2}** equivalent to **{p 12 12 2}**
**{phang3}** equivalent to **{p 12 16 2}**
**{pmore3}** equivalent to **{p 16 16 2}**
**{pin3}** equivalent to **{p 16 16 2}**
--------------------------------------------------------------------
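
The fragment below sketches how the paragraph abbreviations might be combined in a help file. The command **mycmd** and its option are hypothetical.

```smcl
{pstd}
{cmd:mycmd} fits a hypothetical model.

{phang}
{opt robust} requests robust standard errors.

{pmore}
This continuation paragraph lines up with the text of the option above.
```

Per the table, **{pstd}** stands in for **{p 4 4 2}**, **{phang}** for **{p 4 8 2}**, and **{pmore}** for **{p 8 8 2}**, so the explicit indentation numbers no longer need to be repeated.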

__Other new directives and extensions to existing directives__
directive description
--------------------------------------------------------------------
**{mata** *args*[**:***text*]**}** like the **{stata}** directive, but for **mata**;
displays *text*, and when *text* is clicked,
executes the **mata** command *args*

**{rcenter:***text***}** places text one space to the right when
there are unequal spaces left and right

**{hline -***#***}** draws a horizontal line stopping *#*
characters from the end of the line
--------------------------------------------------------------------

22. Existing command **window manage** has the following changes and
additions:

**window manage close graph** [{*graphname* | **_all**}]
closes the Graph window named *graphname*, if it exists.
Specifying **_all** closes all Graph windows.

**window manage forward graph** [*graphname*]
now brings the Graph window named *graphname* to the top of other
windows and otherwise works as before.

**window manage close viewer** [{*viewername* | **_all**}]
closes the Viewer window named *viewername*. Specifying **_all**
closes all Viewer windows.

**window manage forward viewer** [*viewername*]
now brings the Viewer window named *viewername* to the top of
other windows and continues to work as before when no *viewername*
is specified.

**window manage minimize**
minimizes the main Stata window.

**window manage restore**
restores the main Stata window, if it is minimized.
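
A minimal sketch of the graph-related and main-window subcommands, assuming a windowed Stata session; the graph names g1 and g2 are arbitrary choices for this example.

```stata
* Draw two named graphs (the names g1 and g2 are arbitrary)
sysuse auto, clear
scatter mpg weight, name(g1, replace)
histogram price, name(g2, replace)

* Bring g1 to the front, then close only g2
window manage forward graph g1
window manage close graph g2

* Close every remaining Graph window
window manage close graph _all

* Minimize and then restore the main Stata window
window manage minimize
window manage restore
```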

23. Existing command **window menu** now has the new subcommand
**append_recentfiles** to add .dta or .gph files to the **Open Recent**
menu.

24. Existing command **confirm variable** has new option **exact** that disallows
variable abbreviations.
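
The sketch below contrasts the two behaviors using the auto dataset shipped with Stata; it assumes variable abbreviation is left at its default (on).

```stata
sysuse auto, clear

* Without exact, the unambiguous abbreviation pri is accepted for price
confirm variable pri

* With exact, the abbreviation is rejected; capture keeps the return code
capture confirm variable pri, exact
display _rc

* The full variable name passes
confirm variable price, exact
```

A nonzero return code from the second call indicates that the abbreviation was not accepted.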

25. New command **svymarkout** resets the value of a supplied 0/1 variable
to 0 when any of the survey-characteristic variables set by **svyset**
contain missing values; see **[SVY] svymarkout**.
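
One way this might be used inside a program is sketched below. The program name **mysvycount** is hypothetical, and the sketch assumes the data have already been **svyset**.

```stata
program define mysvycount
    version 9
    syntax varname [if] [in]

    * Build the usual 0/1 sample marker from varname, if, and in
    marksample touse

    * Also drop observations with missing survey-design variables
    svymarkout `touse'

    quietly count if `touse'
    display as text "observations used: " as result r(N)
end
```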

26. Help files can now include other files. The syntax is **INCLUDE help**
*helptopic*, which includes the file *helptopic***.ihlp**.
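
As a sketch, a help file could pull in shared text as shown below. The topic name **myopts** is hypothetical; it assumes a file named myopts.ihlp exists where Stata can find it.

```smcl
{title:Options}

INCLUDE help myopts
```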

27. String scalars are now supported, meaning that a scalar can contain
either a numeric or string value. The maximum length of a string
scalar is the same as the maximum length of a string -- 244
characters. See **[P] scalar**.

28. In addition to coding "**local x : all scalars**" to obtain a list of
all defined scalars, you can now code "**local x : all numeric scalars**"
and "**local x : all string scalars**" to obtain the list
restricted to numeric or string scalars. See **[P] macro**.
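
The two items above can be tried together with a short sketch such as the following; the scalar and macro names are arbitrary.

```stata
* One string scalar and one numeric scalar
scalar greeting = "hello"
scalar n = 42

display greeting
display n

* Collect the names of defined scalars, overall and by type
local all  : all scalars
local nums : all numeric scalars
local strs : all string scalars

display "all scalars:     `all'"
display "numeric scalars: `nums'"
display "string scalars:  `strs'"
```

The lists will also contain any scalars that were already defined in the session.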

29. In macro expansion, double backslash (\\) used to become single
backslash (\). Now (but under version control) it becomes single
backslash only if the second backslash precedes macro-expansion
punctuation (**`** or **$**).
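
A small sketch of the rule as described above; the path and macro name are made up for the example, and the behavior depends on the version setting.

```stata
version 9
local name "auto"

* The second backslash precedes the macro's opening quote, so the pair
* collapses to a single backslash and the macro is then expanded
display "c:\data\\`name'.dta"

* No macro punctuation follows the pair, so both backslashes are kept
display "c:\\data"
```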

__What's new: Documentation__

1. There are new manuals: **[D] Data management**, **[MV] Multivariate**
**Statistics**, and **[M] Mata**.

2. Documentation (printed and online) groups related options into
categories. In addition, the categories match the tabs on dialog
boxes.

3. For all estimation commands, there is now an entry called
**postestimation** following the estimation command. For instance,
following **[R] regress** is **[R] regress postestimation**. The
postestimation entry documents command-specific postestimation
facilities to further analyze the results and also directs you to
other relevant postestimation features.

In the online help system, go to help for the estimation command,
and click on **postestimation** in the upper-right corner.

4. There are now glossaries in the **[M]**, **[SVY]**, **[TS]**, and **[XT]** manuals.
The glossaries define commonly used terms and explain how these
terms are used in the documentation.

5. Stata's **help** command and online help facility have new features:

a. Spaces and colons are now allowed in help topics, for example,
**help graph intro**, **help regress postestimation**, or
**help svy: logistic** (with or without the colon).

b. Typing **help sqrt()** now gives you help for Stata's **sqrt()**
function. Typing **help mata sqrt()** gives you help for Mata's
**sqrt()** function.

c. Many command abbreviations are now recognized; for example,
**help reg post** is understood to mean **help regress postestimation**, and
**help tw con** is understood to mean **help graph twoway connected**.
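
The forms described in items a through c can be typed at the command line as follows; each line opens the corresponding help topic.

```stata
help regress postestimation
help reg post
help svy: logistic
help sqrt()
help mata sqrt()
```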

--- **previous updates** ----------------------------------------------------------

See whatsnew8.

-------------------------------------------------------------------------------