**[U] 1.3 What's new**

**Contents**
1.3 What's new
1.3.1 Highlights
1.3.2 What's new in statistics (general)
1.3.3 What's new in statistics (multilevel)
1.3.4 What's new in statistics (Bayesian)
1.3.5 What's new in statistics (power and sample size)
1.3.6 What's new in statistics (survival analysis)
1.3.7 What's new in statistics (survey data)
1.3.8 What's new in statistics (SEM)
1.3.9 What's new in statistics (panel data)
1.3.10 What's new in statistics (time series)
1.3.11 What's new in statistics (multivariate)
1.3.12 What's new in functions
1.3.13 What's new in graphics
1.3.14 What's new in data management
1.3.15 What's new in programming
1.3.16 What's new in Mata
1.3.17 What's new in the Stata interface
1.3.18 What's more

This section is intended for users of the previous version of Stata. If
you are new to Stata, you may as well skip to *What's more*, below.

As always, Stata 15 is 100% compatible with the previous releases, but we
remind programmers that it is important to put **version 15**, **version** **14.1**,
or **version 12**, etc., at the top of old do- and ado-files so that they
continue to work as you expect. You were supposed to do that when you
wrote them, but if you did not, go back and do it now.

We will list all the changes, item by item, but first, here are the
highlights.

__1.3.1 Highlights__

The highlights of the release are the following:

1. Latent class analysis (LCA)

2. A **bayes** prefix command that can be used in front of many maximum
likelihood estimation command

3. Linearized dynamic stochastic general equilibrium (DSGE) models

4. Extended regression models (ERMs) that fit continuous, binary,
ordered responses with 1) endogeneity, 2) Heckman-style selection,
and 3) treatment effects

5. Dynamic documents combining Markdown with Stata code to produce HTML
files

6. Nonlinear mixed-effects models

7. Spatial autoregressive (SAR) models

8. Interval-censored parametric survival-time models

9. Finite mixture models (FMMs)

10. Mixed logit models

11. Nonparametric regression using kernel methods

12. Power analysis for cluster randomized trials and regression models

13. Produce PDF and Word documents

14. Graph color transparency or opacity

15. ICD-10-CM and ICD-10-PCS support

16. Federal Reserve Economic Data support

And that is not all. The following could have been highlights, too.

o multilevel tobit and interval regression
o heteroskedastic regression
o panel data cointegration tests
o threshold regression
o zero-inflated ordered probit
o Poisson with Heckman-style sample selection
o tests for multiple breaks in time series
o stream random numbers

There is more to boot. The above and other changes are covered here.
Detailed sections follow the highlights.

**Highlight 1. Latent class analysis (LCA)**

Stata's **gsem** command now supports LCA, which, depending on the jargon
you use, includes latent profile analysis (LPA) and finite mixture
models (FMMs).

All of these models use categorical latent variables. Categorical
means group. Latent means unobserved. Categorical latent variables
can be used to represent consumers with different buying preferences,
patients in different risk groups, or schools serving students with
different interests. Unobserved are the buying preferences, risk
groups, and interests. These unobserved categories are the latent
classes, and LCA is used to identify them while accounting for the
uncertainty of the recovered groups and used to account for their
effects.

LPA is a variation on LCA and is used when the outcome variables are
continuous.

FMM is a synonym for LCA to some people, a subset to others, and a
superset to even others. In any case, **gsem** now has the features.

We consider FMM to be a subset of LCA. If you simply want to fit
finite mixtures of Poisson or linear regression models and the like,
you can use our new **gsem** features, but we have another new feature
for you: the **fmm:** prefix command, which is *Highlight 9* below.

LCA, LPA, and FMM are now part of Stata's **gsem** command, and that
means you can fit regression models and multioutcome path models that
allow parameters to vary across latent classes.

For instance, you might have four binary variables that are
indicators of latent groups of consumers. If you believed that there
are three such groups, you could type

**. gsem (y1 y2 y3 y4 <- _cons), lclass(Consum 3) logit**

**y1**, **y2**, **y3**, and **y4** are observed outcome variables. **Consum** is the
latent categorical variable that we specified as taking on three
values. The result is to fit a model in which **y1**, **y2**, **y3**, and **y4** are
determined by unobserved class.

The command fits four logistic regressions, one for each of the **y**
variables. That would fit four intercepts. Because of the new
**lclass(Consum 3)** option, however, each of the 4 models would be fit
with distinct intercepts for each value of **Consum**, meaning 12
intercepts would be fit for the logistic regressions, and that is not
all. A multinomial logistic regression would be used to predict
**Consum**.

After fitting the model, you can

o use the new **estat lcprob** command to estimate the proportion of
consumers belonging to each class;

o use the new **estat lcmean** command to estimate the marginal means
of **y1**, **y2**, **y3**, and **y4** in each class (the means are probabilities
in this case);

o use the new **estat lcgof** command to evaluate the goodness of fit;

o use the existing **predict** command to obtain predicted
probabilities of class membership and predicted values of
observed outcome variables.

See **[SEM] intro 2**, **[SEM] gsem lclass options**, **[SEM] estat lcprob**,
**[SEM] estat lcmean**, **[SEM] estat lcgof**, and **[SEM] predict after gsem**.

**Highlight 2. bayes prefix**

The new **bayes:** prefix command lets you fit Bayesian regression models
more easily and fit more models. You always could fit a Bayesian
linear regression. Now you can fit it by typing

**. bayes: regress y x1 x2**

That is convenient. What you could not previously do was fit a
Bayesian survival model. Now you can.

**. bayes: streg x1 x2, distribution(weibull)**

You can even fit Bayesian multilevel survival models.

**. bayes: mestreg x1 x2 || id:, distribution(weibull)**

In this model, random intercepts were added for each value of
variable **id**.

You can use the **bayes:** prefix with the following estimation commands:

Command Purpose
-----------------------------------------------------------------------
**bayes: betareg** Beta regression
**bayes: binreg** Binomial regression
**bayes: biprobit** Bivariate probit regression
**bayes: clogit** Conditional logistic regression
**bayes: cloglog** Conditional log-log regression
**bayes: fracreg** Fractional response regression
**bayes: glm** Generalized linear model
**bayes: gnbreg** Negative binomial regression
**bayes: heckman** Heckman selection model
**bayes: heckoprobit** Ordered probit with sample selection
**bayes: heckprobit** Probit with sample selection
**bayes: hetprobit** Heteroskedastic probit
**bayes: hetregress** Heteroskedastic linear regression
**bayes: intreg** Interval regression
**bayes: logistic** Logistic regression (odds ratios)
**bayes: logit** Logistic regression (coefficients)

Multilevel mixed-effects ...
**bayes: mecloglog** complementary log-log regression
**bayes: meglm** generalized linear model
**bayes: meintreg** interval regression
**bayes: melogit** logistic regression
**bayes: menbreg** negative binomial regression
**bayes: meologit** ordered logistic regression
**bayes: meoprobit** ordered probit regression
**bayes: mepoisson** Poisson regression
**bayes: meprobit** probit regression
**bayes: mestreg** parametric survival regression
**bayes: metobit** tobit regression
**bayes: mixed** linear regression

**bayes: mlogit** Multinomial (polytomous) logistic
regression
**bayes: mprobit** Multinomial probit regression
**bayes: mvreg** Multivariate linear regression
**bayes: nbreg** Negative binomial regression
**bayes: ologit** Ordered logistic regression
**bayes: oprobit** Ordered probit regression
**bayes: poisson** Poisson regression
**bayes: probit** Probit regression
**bayes: regress** Linear regression
**bayes: streg** Parametric survival regression
**bayes: tnbreg** Truncated negative binomial regression
**bayes: tobit** Tobit regression
**bayes: tpoisson** Truncated Poisson regression
**bayes: truncreg** Truncated linear regression
**bayes: zinb** Zero-inflated negative binomial regression
**bayes: zioprobit** Zero-inflated ordered probit regression
**bayes: zip** Zero-inflated Poisson regression
-----------------------------------------------------------------------

All of Stata's Bayesian features are supported by the new **bayes:**
prefix command. You can select from many prior distributions for
model parameters or use default priors. You can use the default
adaptive Metropolis-Hastings sampling, or Gibbs sampling, or a
combination of the two sampling methods, when available. And you can
use any other feature included in **bayesmh**. For example, you can
change the default prior distributions for the regression
coefficients:

**. bayes, prior({y: x1 x2}, normal(0,4)): regress y x1 x2**

After estimation, you can use Stata's standard Bayesian
postestimation tools such as **bayesgraph** to check convergence,
**bayesstats** **summary** to estimate functions of model parameters,
**bayesstats** **ic** and **bayestest** **model** to compute Bayes factors and
compare Bayesian models, and **bayestest** **interval** to perform interval
hypotheses testing.

See **[BAYES] bayes** and **[BAYES] bayesian estimation**.

**Highlight 3. Linearized dynamic stochastic general equilibrium (DSGE)**
**models**

Stata now fits linearized DSGE models, which are time-series models
used in economics. These models are an alternative to traditional
forecasting models. Both attempt to explain aggregate economic
phenomena, but DSGE models do this on the basis of models derived
from microeconomic theory.

Being based on microeconomic theory means lots of equations. The key
feature of these equations is that expectations of future variables
affect variables today. This is one feature that distinguishes DSGEs
from a vector autoregression or a state-space model. The other
feature is that, being derived from theory, the parameters can
usually be interpreted in terms of that theory.

You specify the equations with the **dsge** command. Here is a
two-equation model:

**. dsge (p = {beta}*E(f.p) + {kappa}*y) (f.y = {rho}*y, state)**

**p** is a control variable, and **y** is a state variable in state-space
jargon. **f.** is the forward operator. These equation say the
following:

1. The control variable **p** depends on **p** in the future plus kappa
times **y** today.

2. The expected future value of **y** is rho times **y** today. The
**state** option specifies that **y** is a state variable.

There are three kinds of variables in DSGE models. Control variables
and equations such as **p** have no shocks and are determined by the
system of equations. State variables such as **y** have implied shocks
and are predetermined at the beginning of the time period. Shocks
are the stochastic errors that drive the system.

In any case, the above **dsge** command would define a model and fit it.

If we have a theory about the relationship between beta and kappa,
such as they are equal, we could now test it using **test**.

Postestimation commands **estat** **policy** and **estat** **transition** report the
policy and transition matrices. If you type

**. estat policy**

the control variables as a linear function of the state variables
will be displayed. If you had five control variables and three state
variables, each of the controls would be reported as a linear
function of the three states. In the simple example above, the
linear function predicting **p** will be shown as a function of **y** today.

**. estat transition**

reports the transition matrix. Whereas the policy matrix reports **p**
as a function of **y**, the transition matrix reports how **y** evolves
through time exclusive of **p**.

You can produce forecasts using Stata's existing **forecast** command.
You can graph impulse-response functions using Stata's existing **irf**
command.

See **[DSGE] intro**.

**Highlight 4. Extended regression models (ERMs)**

ERMs is our name for regression models that can account for the
following:

1. Endogenous covariates

2. Nonrandom treatment assignment

3. Heckman-style endogenous sample selection

The features may be used in any combination. And it has yet another
feature:

4. Forbidden regressions

You can fit models with interactions of endogenous covariates with
other covariates, exogenous or endogenous, continuous or dummy, and
this includes models containing interactions of an endogenous
variable with itself -- or said another way, with polynomials of
endogenous variables!

In the past, you might have used **heckman** to fit a linear model with
endogenous sample selection or **ivregress** to fit a linear model with
an endogenous covariate, and if you had both problems in one dataset,
you were out of luck. You can now use the new **eregress** command to
fit a model to account for both:

**. eregress y x, select(selvar = x z1 y2) endogenous(y2 = x z2)**

If you instead have endogenous treatment assignment and an endogenous
covariate, type

**. eregress y x, entreat(trtvar = x z1 y2) endogenous(y2 = x z2)**

There are four ERM commands.

New command Fits
-----------------------------------------------------------------------
**eregress** linear regression
**eintreg** interval regression, including tobit
**eprobit** probit binary outcome
**eoprobit** ordered probit ordered categorical outcome
-----------------------------------------------------------------------

Notes:

1. If you use the treatment-effect features, use **estat** **teffects**
after model fitting to obtain treatment effects and
potential-outcome means.

2. All the standard postestimation commands are available.
**predict** provides predicted values. **margins** computes marginal
effects and marginal and conditional mean.

Regressors can be exogenous or endogenous.

Endogenous regressors can be continuous, binary, or ordinal.

Treatment can be endogenous or exogenous. The treatment variable can
be binary or ordinal, which is to say, treatment can be multivalued.

Endogenous selection can be modeled using probit or tobit.

You can now fit models that were previously unavailable, even if you
need only one of the new features, such as

o interval regression with endogenous covariates
o probit regression with a binary endogenous covariate
o probit regression with endogenous ordinal treatment
o ordered probit regression with endogenous treatment
o linear regression with tobit endogenous sample selection

See **[ERM] intro 8** for an overview and see **[ERM] eregress**, **[ERM]**
**eprobit**, **[ERM] eoprobit**, and **[ERM] eintreg**.

**Highlight 5. Dynamic documents using Markdown**

Markdown is a standard markup language that provides text formatting
from plain text input. It was designed to be easily converted into
HTML, the language of the web. Stata now supports it.

You can create HTML files from your Stata output, including graphs.
You will start with a plain text file containing Markdown-formatted
text and dynamic tags specifying instruction to Stata, such as run
this regression or produce that graph. You then use the new **dyndoc**
command to convert the file to HTML,

Want to produce TeX documents? With the new **dyntext** command, you can
produce any text-based document!

See **[P] dyndoc**, **[P] dyntext**, **[P] markdown**, and **[P] dynamic tags**.

**Highlight 6. Nonlinear mixed-effects models**

Stata now fits nonlinear mixed-effects models, also known as
nonlinear multilevel models and nonlinear hierarchical models. These
models can be thought of two ways. You can think of them as
nonlinear models containing random effects. Or you can think of them
as linear mixed-effects models in which some or all fixed and random
effects enter nonlinearly. However you think of them, the overall
error distribution is assumed to be Gaussian.

These models are popular because some problems are not, says their
science, linear in the parameters. These models are popular in
population pharmacokinetics, bioassays, and studies of biological and
agricultural growth processes. For example, nonlinear mixed-effects
models have been used to model drug absorption in the body, intensity
of earthquakes, and growth of plants.

The new estimation command is **menl**. It implements the
popular-in-practice Lindstrom-Bates algorithm, which is based on the
linearization of the nonlinear mean function with respect to fixed
and random effects. Both maximum likelihood and restricted
maximum-likelihood estimation methods are supported.

**menl** is easy to use. Single equations can be entered directly, such
as

**. menl weight = ({b1}+{U[plant]})/(1+exp(-(age-{b2})/{b3}))**

which would fit

b_1 + U_plant
weight = ----------------------------- + epsilon
1 + exp{-(**age**-b_2)/b_3}

To be estimated are b_1, b_2, and b_3. U_plant is a random intercept
for each plant.

**menl** also allows multistage or hierarchical specifications in which
parameters of interest can be defined at each level of hierarchy as
functions of other model parameters and random effects, such as

**. menl weight = {phi1:}/(1+exp(-(age-{phi2:})/{phi3:})),**
**define(phi1:{b1}+{U1[plant]}) define(phi2:{b2}+{U2[plant]})**
**define(phi3:{b3}+{U3[plant]})**

This is the same model except that b_2 and b_3 are allowed to vary
across plants.

Several variance-covariance structures are available to model the
dependence of random effects at the same level of hierarchy. If we
wanted, we could have put dependence between **U1**, **U2**, and **U3** in the
above example.

There is a within-group error in the model, epsilon. Flexible
variance-covariance structures are available to model its
heteroskedasticity and its within-group dependence. For example,
heteroskedasticity can be modeled as a power function of a covariate
or even of predicted mean values, and dependence can be modeled using
an autoregressive model of any order.

In addition to standard features, postestimation features also
include prediction of random effects and their standard errors,
prediction of parameters of interest defined in the model as
functions of other model parameters and random effects, estimation of
the overall within-cluster correlation matrix, and more.

See **[ME] menl** and **[ME] menl postestimation**.

**Highlight 7. Spatial autoregressive (SAR) models**

Stata now fits SAR models, also known as simultaneous autoregressive
models. The new **spregress**, **spivregress**, and **spxtregress** commands
allow spatial lags of the dependent variable, spatial lags of the
independent variables, and spatial autoregressive errors. Spatial
lags are the spatial analog of time-series lags. Time-series lags
are values of variables from recent times. Spatial lags are values
from nearby areas.

The models are appropriate for area (also known as areal) data.
Observations are called spatial units and might be countries, states,
districts, counties, cities, postal codes, or city blocks. Or they
might not be geographically based at all. They could be nodes of a
social network. Spatial models estimate direct effects -- the
effects of areas on themselves -- and estimate indirect or spillover
effects -- effects from nearby areas.

Stata provides a suite of commands for working with spatial data and
a new **[SP]** manual to accompany them. When spatial units are
geographically based, you can download standard-format shapefiles
from the web that defines the map. With a single command, you can
make spillover effects proportional to the inverse distance between
areas or restrict them to be just from neighboring areas. And you
can create your own custom definitions of proximity.

Provided for fitting models are the following:

Command Description Equivalent to
-----------------------------------------------------------------------
**spregress, gs2sls** GS2SLS **regress**
**spregress, ml** maximum likelihood **regress**

**spivregress** endogenous regressors **ivregress**

**spxtregress, fe** panel-data fixed effects **xtreg, fe**
**spxtregress, re** panel-data random effects **xtreg, re**
-----------------------------------------------------------------------

See **[SP] intro**.

**Highlight 8. Interval-censored parametric survival-time models**

Stata's new **stintreg** command joins **streg** for fitting parametric
survival models. **stintreg** fits models to interval-censored data. In
interval-censored data, the time of failure is not exactly known.
What is known, subject by subject, is a time when the subject had not
yet failed and a later time when the subject already had failed.

**stintreg** can fit exponential, Weibull, Gompertz, lognormal,
loglogistic, and generalized gamma survival-time models. Both
proportional-hazards and accelerated failure-time metrics are
supported. Features include

o stratified estimation

o flexible modeling of ancillary parameters

o robust, cluster-robust, bootstrap, and jackknife standard
errors

Survey-data estimation is supported via the **svy** prefix.

In addition to the usual features, postestimation features also
include plots of survivor, hazard, and cumulative hazard functions,
prediction of mean and median times, Cox-Snell and martingale-like
residuals, and more.

See **[ST] stintreg** for details.

**Highlight 9. Finite mixture models (FMMs)**

The new **fmm:** prefix command can be used with 17 Stata estimation
commands to FMMs. The commands are the following:

Command Fits
-----------------------------------------------------------------------
**fmm: betareg** Beta regression
**fmm: cloglog** Complementary log-log regression
**fmm: glm** Generalized linear models
**fmm: intreg** Interval-censored regression
**fmm: ivregress** Instrumental-variable regression
**fmm: logit** Logistic regression
**fmm: mlogit** Multinomial logistic regression
**fmm: nbreg** Negative binomial regression
**fmm: ologit** Ordered logistic regression
**fmm: oprobit** Ordered probit regression
**fmm: poisson** Poisson regression
**fmm: probit** Probit regression
**fmm: regress** Linear regression
**fmm: streg** Parametric survival-time regression
**fmm: tobit** Tobit regression
**fmm: tpoisson** Truncated Poisson regression
**fmm: truncreg** Truncated linear regression
-----------------------------------------------------------------------

**fmm** fits models when the data come from unobserved subpopulations.
That is a broad statement and **fmm:** can support it.

The most typical use of **fmm:** is to fit one model and allow the
parameters (coefficients, location, variance, scale, etc.) to vary
across subpopulations. We will call these unobserved subpopulations
classes. Say we are interested in

**. regress y x1 x2**

but we believe there are three classes across which the parameters of
the model might vary. Even though we have no variable recording the
class membership, we can fit

**. fmm 3: regress y x1 x2**

Reported will be separate models for each class and a model for
predicting membership in them.

**fmm:** can be used with multiple estimation commands simultaneously
when the classes might follow different models, such as

**. fmm: (regress y x1 x2) (poisson y x1 x2 x3)**

In this two-class example, reported will be a linear regression model
for the first class, a Poisson regression for the second, and a model
that predicts class membership.

Postestimation commands are available to 1) estimate each class's
proportion in the overall population (**[FMM] estat lcprob**); 2) report
marginal means of the outcome variables within class (**[FMM] estat**
**lcmean**); and 3) predict probabilities of class membership and
predicted outcomes (**[FMM] fmm postestimation**).

See **[FMM] fmm intro**.

**Highlight 10. Mixed logit models**

Stata fits discrete choice models. Stata 15 will fit them with
random coefficients. Discrete choice is another way of saying
multinomial or conditional logistic regression. The word "mixed" is
used by statisticians whenever some coefficients are random and
others are fixed. Ergo, Stata 15 fits mixed logit models.

Random coefficients arise for many reasons, but there is a special
reason researchers analyzing discrete choices might be interested in
them. Random coefficients are a way around the IIA assumption. If
you have a choice among walking, public transportation, or a car and
you choose walking, the other two alternatives are irrelevant. Take
one of them away, and you would still choose walking. Human beings
sometimes violate this assumption, at least judged by their behavior.

Mathematically speaking, IIA makes alternatives independent after
conditioning on covariates. If IIA is violated, then the
alternatives would be correlated. Random coefficients allow that.

A requirement for fitting random coefficients is that the variable
varies across the alternatives. Thus the mixed logit model is often
said to incorporate alternative-specific variables.

The new Stata 15 command that fits this is named **asmixlogit**.

The new command also allows the random coefficients to be drawn from
different distributions. One might be normal and another log normal.
Also supported are multivariate normal, truncated normal, uniform,
and triangular distributions.

See **[R] asmixlogit**.

**Highlight 11. Nonparametric regression, kernel methods**

Stata now fits nonparametric regressions. In these models, you do
not specify a functional form. You specify

y = g(x_1, x_2, ..., x_k) + epsilon

and g(.) is fit. The method does not assume that g(.) is linear; it
could just as well be

y = beta_1 x_1 + beta_2 x_2^2 + beta_3 x_1 x_2 + ...

and it does not even assume it is linear in the parameters. It could
just as well be

y = beta_1 x_1^{beta_2} + beta_3 cos(x_2+x_3) + ...

or anything else. The result is not returned to you in algebraic
form, but predicted values and derivatives can be calculated.

The new **npregress** command fits the models using local-linear or
local-constant kernel regression. Be aware that fitting accurate
nonparametric regressions needs lots of observations. Stata does not
limit k, but practical issues do.

You might type

**. npregress kernel y x1 x2 x3, vce(bootstrap)**

Reported will be the averages of the partial derivatives of **y** with
respect to **x1**, **x2**, and **x3** and their standard errors, which are
obtained by bootstrapping. The averages are calculated over the
data. After fitting the model, you could obtain predicted values
using **predict**.

Average derivatives are something like coefficients, or at least they
would be if the model were linear, which it is not. Realize that
average derivatives in nonlinear models are not derivatives at the
average. You might want to know the derivative of **y** w.r.t. **x1**, **x2**,
and **x3** at the average values of **x1**, **x2**, and **x3**. You can use **margins**
to obtain that:

**. margins, dydx(x1 x2 x3) atmeans**

Or perhaps you want the predicted values evaluated at specific points
of interest,

**. margins, at(x1=2 x2=3 x3=1) at(x1=2 x2=3 x3=2)**

If you wanted **x3** to be 1, 2, ..., 10, you could type

**. margins, at(x1=2 x2=3 x3=1(1)10)**

Then, you could type

**. marginsplot**

to graph this slice of the function.

By the way, **margins** not only makes calculations, it can also produce
bootstrap standard errors for them.

See **[R] npregress**.

**Highlight 12. Power analysis for linear regression, cluster randomized**
**designs, and your own methods**

Stata's **power** command performs power and sample-size analysis (PSS).
Its features now include PSS for linear regression and for cluster
randomized designs (CRDs). In addition, you can now add your own
power and sample-size methods to the **power** command.

The new PSS methods for linear regression include the following:

o **power oneslope** performs PSS for a slope test in a simple linear
regression. It computes sample size or power or the target slope
given other study parameters. See **[PSS] power oneslope**.

o **power rsquared** performs PSS for an R^2 test in a multiple linear
regression. An R^2 test is an F test for the coefficient of
determination (R^2). The test can be used to test the
significance of all the coefficients, or it can be used to test a
subset of them.

In both cases, **power rsquared** computes sample size or power or
the target R^2 given other study parameters. See **[PSS] power**
**rsquared**.

o **power pcorr** performs PSS for a partial-correlation test in a
multiple linear regression. A partial-correlation test is an F
test of the squared partial multiple correlation coefficient.
The command computes sample size or power or the target squared
partial correlation coefficient given other study parameters.
See **[PSS] power pcorr**.

The new PSS methods for CRDs include the following:

The five existing **power** methods listed below, extended to support
CRDs or clustered data when you specify new option **cluster**.

Command Title
-----------------------------------------------------------------
**power onemean, cluster** One-sample mean test in a CRD
**power oneproportion, cluster** One-sample proportion test in a
CRD
**power twomeans, cluster** Two-sample means test in a CRD
**power twoproportions, cluster** Two-sample proportions test in a
CRD
**power logrank, cluster** Log-rank test in a CRD
-----------------------------------------------------------------

In a CRD, groups of subjects (clusters) are randomized instead of
individual subjects, so the sample size is determined by the
number of clusters and the cluster size. The sample-size
determination consists of the determination of the number of
clusters given cluster size or the determination of cluster size
given the number of clusters. The commands compute one of the
number of clusters, cluster size, power, or minimum detectable
effect size given other parameters and provide options to adjust
for unequal cluster sizes.

For two-sample methods, you can also adjust for unequal numbers
of clusters in the two groups.

As with all other power methods, the new methods allow you to specify
multiple values of parameters and automatically produce tabular and
graphical results.

The final new feature is that you can add your own PSS methods. How
you do that is now documented in **[PSS]** *power usermethod*. It is easy
to do. You write a program that computes sample size or power or
effect size. The **power** command will do the rest for you. It will
deal with the support of multiple values in options and with the
automatic generation of graphs and tables of results.

**Highlight 13. Produce PDF and Word documents**

It is now just as easy to produce PDF and Word documents in Stata as
it is to produce Excel worksheets. Everybody loved **putexcel** in Stata
14. If you are among them, you will love **putpdf** and **putdocx**.

They work just like **putexcel**. That means you can write do-files to
create entire PDF or Word reports containing the latest results,
tables, and graphs. You can automate reproducible reports.

The new **putpdf** command writes paragraphs, images, and tables to a PDF
file. Images include Stata graphs and other images such as your
organization's logo. You can format the objects, too -- bold face,
italics, size, custom tables, etc.

The new **putdocx** command writes paragraphs, images, and tables to a
Word file or, to be precise about it, to Office Open XML (**.docx**)
files. Just as with **putpdf**, images include Stata graphs, and you can
format the objects.

See **[P] putpdf** and **[P] putdocx**.

**Highlight 14. Graph color transparency or opacity**

Up until now, graph one thing on top of any other, and the object on
top covered up the object below. In the jargon, Stata's colors are
fully opaque or, if you prefer, not at all transparent. Stata 15
still works that way by default. Stata 15, however, allows you to
control the opacity (transparency) of its colors.

Opacity is specified at a percent. By default, Stata's colors are
100% opaque.

You can specify opacity whenever you specify a color, such as in the
**mcolor()** option, which controls the colors of markers. Rather than
specifying **green**, you can now specify **green%50**. Rather than
specifying **"0 255 0"** (equivalent to green), you can specify **"0 255**
**0%50"**. And you can simply specify **%50** to make the default color 50%
opaque.

You usually do not want to specify **%0**. Yes, it is fully transparent,
but it is also invisible.

Here is a graph where we use **%70**:

**. twoway rarea high open date in 1/15, color(red%70) ||** **rarea low**
**close date in 1/15, color(green%70)**

**Highlight 15. ICD-10-CM and ICD-10-PCS support**

Stata 15 now supports ICD-10-CM and ICD-10-PCS, the U.S. ICD-10
codes provided by the National Center for Health Statistics (NCHS)
and the Centers for Medicare and Medicaid Services (CMS). These are
the codes mandated for all medical billing in the United States.

Stata has long supported ICD codes for reporting medical diagnosis,
procedures, and mortality. ICD stands for international statistical
classification of diseases and related health problems. Stata began
support of the code in 1998 starting with ICD-9-CM version 16 and
supported all revisions after that.

Stata supports ICD-10 codes revisions since 2003.

See **[D] icd**, **[D] icd10cm**, and **[D] icd10pcs**.

**Highlight 16. Federal Reserve Economic Data support**

The St. Louis Federal Reserve makes available over 470,000 U.S. and
international economic and financial time series to registered users.
Registering is free and easy to do. The service is called FRED.
FRED includes data from 84 sources, including the Federal Reserve,
the Penn World Table, Eurostat, and the World Bank.

In Stata 15, you can use Stata's GUI to access and download FRED
data. You search or browse by category or release or source. You
click to select series of interest. Select 1 or select 100. When
you click on **Import**, Stata will download them and combine them into a
single, custom dataset in memory.

These same features are also available from Stata's command line
interface. The command is **import fred**. The command is convenient
when you want to automate updating the 27 different series that you
are tracking for a monthly report.

Stata can access FRED and it can access ALFRED. ALFRED is FRED's
historical archive data.

See **[D] import fred**.

__1.3.2 What's new in statistics (general)__

Finite mixture models is *Highlight 9* of the release, mixed-logit models
is *Highlight 10*, and nonparametric regression is *Highlight 11*.

Also new are the following:

1. **bayes: prefix works with general-purpose estimation commands**

The new **bayes:** prefix (*Highlight 2* of the releases) works with many
of the general-purpose estimation commands:

Command Purpose
-----------------------------------------------------------------
**betareg** Beta regression
**binreg** Binomial regression
**biprobit** Bivariate probit regression
**clogit** Conditional logistic regression
**cloglog** Complementary log-log regression
**fracreg** Fractional-outcome regression
**glm** Generalized linear model
**gnbreg** Generalized negative binomial regression
**heckman** Heckman selection model
**heckoprobit** Heckman ordered probit with sample
selection
**heckprobit** Heckman probit with sample selection
**hetprobit** Heteroskedastic probit regression
**hetregress** Heteroskedastic linear regression
**intreg** Interval linear regression
**logistic** Logistic regression (default odds ratio)
**logit** Logistic regression (default coefficients)
**mlogit** Multinomial logistic regression
**mprobit** Multinomial probit regression
**mvreg** Multivariate linear regression
**nbreg** Negative binomial regression
**ologit** Ordered logistic regression
**oprobit** Ordered probit regression
**poisson** Ordered Poisson regression
**probit** Probit regression
**regress** Linear regression
**tnbreg** Truncated negative binomial regression
**tobit** Tobit regression
**tpoisson** Truncated Poisson regression
**truncreg** Truncated linear regression
**zinb** Zero-inflated negative binomial regression
**zioprobit** Zero-inflated ordered probit regression
**zip** Zero-inflated Poisson regression
-----------------------------------------------------------------

The list above is of general-purpose estimation commands. The
**bayes:** prefix works with multilevel estimation commands, too.

See **[BAYES] bayes** and **[BAYES] bayesian estimation**.

2. **New command fits heteroskedastic regression**

New estimation command **hetregress** fits heteroskedastic regression by
modeling the variance as an exponential function of the covariates
you specify. Two estimation methods, maximum likelihood and
Harvey's two-step generalized least squares, are provided. Robust,
cluster-robust, bootstrap, and jackknife standard errors are
supported. Survey-data estimation is supported via the **svy** prefix.

See **[R] hetregress**.

3. **New command fits Poisson regression with Heckman-style selection**

New estimation command **heckpoisson** fits a Poisson regression model
with endogenous sample selection. All the standard postestimation
features are provided.

See **[R] heckpoisson**.

4. **New command fits zero-inflated ordered probit regression**

New estimation command **zioprobit** fits zero-inflated ordered probit
models. This model is used when data exhibit a higher fraction of
the zeros than is expected from a standard ordered probit model.

We say 0, imagining that the dependent variable contains 0, 1, 2,
..., but we mean the lowest value of the outcome variable because it
could just as well contain 2, 5, 9, ....

The zero inflation is accounted for by assuming that the zeros come
from both a probit model and an ordered probit model. Each model
can have different covariates.

See **[R] zioprobit**.

5. **Tobit now accepts censoring limits and constraints**

Some people think of tobit as being censored at zero. Stata's **tobit**
estimation command allows you to specify the lower value of the
censoring point, and it allows you to specify an upper censoring
point, too. All that is unchanged. You can now specify censoring
points -- upper, lower, or both -- that vary observation by
observation. The censoring points can be stored in variables.

**tobit** now allows constraints.

**tobit** now has the other standard features that it always should have
had, but that is just for completeness. You can, for instance,
specify initial values.

See **[R] tobit**.

6. **tpoisson, ul()**

Existing estimation command **tpoisson** fits truncated Poisson models.
It previously fit left-truncated models only. It now fits left-,
right-, and both-truncated models. New option **ul()** specifies the
upper truncation limit.

See **[R] tpoisson**.

7. **Factor variables now work more like you would expect they would**

Consider fitting a model with the terms

**.** *est_command* ... **i.a i(2 3).a#i.b** ...

What should happen? What happens now should be more in line with
your expectations. **i.a** adds main-effect coefficients for each level
of **a**, and the interaction **i(2 3).a#i.b** is restricted to **a**'s levels 2
and 3.

What used to happen was rather more surprising. The entire RHS of
the model was restricted to levels 2 and 3 of **a**.

8. **One- and two-sample mean tests with clustered data**

Existing command **ztest** has new option **cluster()** and other new
options to account for clustering.

See **[R] ztest**.

9. **One- and two-sample proportion tests with clustered data**

Existing command **prtest** has new option **cluster()** and other new
options to account for clustering.

See **[R] prtest**.

10. **Note explaining interpretation of intercept when exponentiated**
**coefficients reported**

Many estimation commands report exponentiated coefficients, either
by default or because you specified an option requesting the odds
ratio, incidence rate ratio, hazard ratio, and so on. In those
cases, Stata also reports the exponentiated intercept. This
confuses some people, especially students. Stata now adds a note at
the end of the output explaining the interpretation of the
exponentiated intercept.

Notes also make clear which parameters are exponentiated.

11. **ivtobit has improved convergence**

Existing estimation command **ivtobit** fits instrumental-variable tobit
models. It now converges more reliably when there are two or more
endogenous variables.

12. **New dots() option with replication methods**

Existing prefix commands **bootstrap**, **jackknife**, **permute**, and **simulate**
have new option **dots(***#***)**, which displays dots every *#* replications.
This provides entertainment and confirmation that the command is
still working during long runs.

See **[R] bootstrap**, **[R] jackknife**, **[R] permute**, and **[R] simulate**.

13. **Option noskip renamed lrmodel**

Existing estimation commands **biprobit**, **heckman**, **heckprobit**,
**hetprobit**, and **truncreg** had option **noskip**, which presented the model
test as a likelihood-ratio rather than the default Wald test. This
option has been renamed **lrmodel**. The old option name continues to
work. (There was a justification for the old name. Calculating the
likelihood-ratio test requires fitting the constant-only model.
**noskip** specified that fitting of that model not be skipped!)

14. **hetprobit, waldhet (option rename)**

Existing estimation command **hetprobit** fits heteroskedastic probit
models. Existing option **nolrtest** has been renamed **waldhet**. It
tests whether the variance is heteroskedastic. Old option **nolrtest**
continues to work under version control.

See **[R] hetprobit**.

15. **Stata names free parameters in fitted models differently**

Free parameters are scalar parameters, variances, covariances, and
the like that are part of the model being fit. A free parameter
might be ln(sigma).

We have made a deep change in the internals of Stata. What does
this mean for you? Not much. If a model fits free parameter
**/lnsigma**, you can no longer refer to its value as **_b[lnsigma:_cons]**.
You must refer to it as **_b[/lnsigma]**. You probably always referred
to it that way. It involved less typing.

The renaming might matter in programs that you write; see the next
item.

Old syntax is preserved under version control.

See **[R] ml** and see the next item.

16. **Program your own models? ml uses the new free-parameter syntax**

**ml** now allows and prefers that free parameters be specified as
**/***name*. You can no longer refer to them as if they were
constant-only equations, which is to say *name***:_cons**, except under
version control.

As explained in the previous item, **_b[/lnsigma]** is no longer a
shorthand for **_b[lnsigma:_cons]**. **_b[/lnsigma]** is its own thing. It
is the free parameter named **/lnsigma** and not the constant term from
equation **lnsigma**.

If you were using **ml** to fit your own maximum likelihood model, you
might create **/lnsigma** thinking you were creating an equation named
**lnsigma**. You are not. Now you are creating a single free
parameter. If you want to create a constant-only equation, you must
use **(lnsigma:)**, which always meant to create an "equation" called
**lnsigma**.

There are other implications for programmers writing advanced code.
Matrix row and column names have changed and are now easier to use.
This is mentioned in *What's new in programming*.

See **[R] ml**.

17. **More stored results**

Regular commands store their results in **r()**, and estimation commands
store them in **e()**. There are now more of them. If something is
reported, a result is stored even if what is reported could be
calculated from the other stored results.

__1.3.3 What's new in statistics (multilevel)__

Multilevel mixed-effects models are also known as hierarchical models,
nested data models, mixed models, random coefficient models, random
effects models, random parameter models, and split-plot designs.

Nonlinear mixed models are *Highlight 6* of the release.

Also new are the following:

18. **Multilevel mixed-effects tobit regression**

New estimation command **metobit** fits tobit models. In tobit models,
outcomes below a limit or above a limit are censored. Limits can be
fixed (say, 0 and 1,000) or vary observation by observation.

See **[ME] metobit**.

19. **Multilevel mixed-effects interval regression**

New estimation command **meintreg** fits interval regression models. In
these models, the exact value of the dependent variable is not
observed in some or all observations. Instead, y_i is known to be
within [a_i,b_i]. Ranges can be open ended, so the model handles
censoring as well as intervals.

See **[ME] meintreg**.

20. **bayes: prefix works with multilevel models**

Stata's new **bayes:** prefix command (see *Highlight 2*) may be used with
the following:

Estimation command Fits multilevel mixed-effects ...
-----------------------------------------------------------------
**bayes: mixed** linear regression
**bayes: metobit** tobit regression
**bayes: meintreg** interval regression
**bayes: melogit** logistic regression
**bayes: meprobit** probit regression
**bayes: mecloglog** complementary log-log regression
**bayes: mepoisson** Poisson regression
**bayes: menbreg** negative binomial regression
**bayes: meglm** generalized linear model
**bayes: mestreg** parametric survival models
-----------------------------------------------------------------
See **[BAYES] bayes**.

21. **Standard deviations and correlations instead of variances and**
**covariances**

New postestimation command **estat sd** displays random effects and
within-group error parameter estimates as standard deviations and
correlations instead of the variances and covariances reported in
the estimation output.

See **[ME] estat sd**.

__1.3.4 What's new in statistics (Bayesian)__

The new **bayes:** prefix command is *Highlight 2* of the release.

Also new are the following:

22. **bayesmh, eform()**

Existing estimation command **bayesmh** now allows the **eform** and
**eform(***string***)** options for reporting exponentiated coefficients such
as odds ratios, incidence rate ratios, and the like.

See **[BAYES] bayesmh**.

23. **bayesmh, show()**

Existing estimation command **bayesmh** now allows new option
**show(***paramlist***)** to specify which model parameters should be
presented in the output. Option **show()** joins existing option
**noshow()**. Specify one, the other, or neither.

See **[BAYES] bayesmh**.

24. **bayesmh, showreffects**

Existing estimation command **bayesmh** now allows new option
**showreffects** to specify that all random-effects estimates be
presented in the output. They are not displayed by default.

See **[BAYES] bayesmh**.

25. **Postestimation supports new bayes: prefix command**

If you use the new **bayes:** prefix command with multilevel models such
as **mixed** or **meglm**, **bayesgraph**, **bayesstats ess**, and **bayesstats**
**summary** have new options.

New option **showreffects** displays the results for all random-effects
parameters.

New option **showreffects()** displays specified random-effects
parameters.

By default, results are displayed for all model parameters except
the random-effects parameters.

See **[BAYES] bayesian postestimation**.

26. **bayesmh** with nonlinear models has the following updates to the
linear-combination specifications within substitutable expressions:

1. When you specify **{xb: x1 x2}**, the constant term is included
automatically. That is, **{xb: x1 x2}** is equivalent to

**{xb:x1}*x1 + {xb:x2}*x2 + {xb:_cons}**

You can suppress the constant term by specifying the new
**noconstant** option such as **{xb: x1 x2, noconstant}**.

2. The new **xb** option is required when you include only one
variable in the linear-combination specification such as **{xb:**
**z, xb}**. The specification **{xb:z}** without the **xb** option
corresponds to either a free parameter named **z** with the
grouping label **xb** or a regression coefficient on variable **z**
that was included in the previously defined linear
combination **xb**.

3. Regression coefficients of linear combinations are now
defined as **{***xbname***:***varname***}** instead of **{***xbname***_***varname***}**. For
example, if you specify a linear combination **{xb: x1 x2}**, you
refer to regression coefficients of variables **x1** and **x2** as
**{xb:x1}** and **{xb:x2}** instead of **{xb_x1}** and **{xb_x2}**.

The old behavior of linear-combination specifications is available
under version control.

27. **Programmer alerts**

If you consume matrix results from **e()** after **bayesmh**, be aware of
two small labeling issues:

1. Terms that were marked as omitted were not stored in the
matrix's row and column names. They are now. Old behavior
is available under version control.

2. **bayesmh** now includes equation labels in the row and column
names of **e(mean)** and **e(median)**.

__1.3.5 What's new in statistics (power and sample size)__

New is the following:

28. **Power analysis for linear regression, cluster randomized designs,**
**and your own methods**

Stata's **power** command now includes power and sample-size analysis
for linear regression and for cluster randomized designs. In
addition, you can now add your own power and sample-size methods to
the **power** command.

This is *Highlight 12* of the release.

__1.3.6 What's new in statistics (survival analysis)__

Stata's new ability to fit interval-censored parametric survival models
is *Highlight 8* of the release.

Also new are the following:

29. **bayes: streg**

The new **bayes:** prefix command (*Highlight 2*) can be used with
existing estimation command **streg** to fit Bayesian parametric
survival models.

See **[BAYES] bayes: streg**.

30. **fmm: streg**

The new **fmm:** prefix command (*Highlight 9*) can be used with **streg** to
fit finite mixtures of parametric survival models. See **[FMM] fmm:**
**streg**.

31. **streg, strata(i.varname)**

Existing estimation command **streg**'s option **strata()** now allows a
factor variable as an argument. You can specify **strata(i.agegroup)**
for instance. You can specify **strata(i(2 4 6).agegroup** if you want
the stratum to be 2, 4, 6, and treat the other levels as baseline.

Previously, you specified **strata(agegroup)**, and the option created
new dummy variables in the dataset to include them in the model. If
you specify **strata(agegroup**), it is now interpreted as if you
specified **strata(i.agegroup)**. Old behavior is preserved under
version control.

32. **Better names for streg's free parameters**

Free parameters in **streg** models now have more descriptive names.
The scale parameter ln sigma is now named **/lnsigma**, for instance,
and not **/ln_sig**. What was named **/ln_gam** is now named **/lngamma**.
What was named **/ln_the** is now named **/lntheta**. You use the new names
with **_b[]**. You continue to use the old names under version control.

__1.3.7 What's new in statistics (survey data)__

New features are the following:

33. These new estimation commands may be used with the **svy:** prefix:

Command Purpose
-----------------------------------------------------------------
**svy: asmixlogit** Alternative-specific mixed logit
regression
**svy: heckpoisson** Poisson regression with sample selection
**svy: hetregress** Heteroskedastic linear regression
**svy: stintreg** Parametric interval-censored survival
regression
**svy: zioprobit** Zero-inflated ordered probit

Multilevel mixed-effects ...
**svy: metobit** tobit regression
**svy: meintreg** interval regression

**svy: eregress** Extended linear regression
**svy: eintreg** Extended interval regression
**svy: eprobit** Extended probit regression
**svy: eoprobit** Extended ordered probit regression

**svy: gsem** Generalized SEM, including latent class
analysis
-----------------------------------------------------------------

See **[SVY] svy**.

34. The following existing estimation commands support combined use of
**svy:** and **fmm:** to fit survey-adjusted finite mixture models:

Command Purpose
-----------------------------------------------------------------
**svy: fmm: regress** Linear regression
**svy: fmm: tobit** Tobit regression
**svy: fmm: intreg** Interval regression
**svy: fmm: truncreg** Truncated regression
**svy: fmm: ivregress** Instrumental-variable regression
**svy: fmm: logit** Logistic regression
**svy: fmm: probit** Probit regression
**svy: fmm: cloglog** Conditional log-log regression
**svy: fmm: ologit** Ordered logistic regression
**svy: fmm: oprobit** Ordered probit regression
**svy: fmm: mlogit** Multinomial logistic regression
**svy: fmm: poisson** Poisson regression
**svy: fmm: nbreg** Negative binomial regression
**svy: fmm: tpoisson** Truncated Poisson regression
**svy: fmm: betareg** Beta regression
**svy: fmm: glm** Generalized linear model
**svy: fmm: streg** Parametric survival regression
-----------------------------------------------------------------

See **[SVY] svy** and **[FMM] fmm**.

35. **New dots() option**

Existing prefix commands **svy bootstrap:**, **svy jackknife:**, **svy brr:**,
and **svy sdr:** allow new option **dots(***#***)**. It displays a dot every *#*
replications.

See **[SVY] svy bootstrap**, **[SVY] svy jackknife**, **[SVY] svy brr**, and
**[SVY] svy sdr**.

__1.3.8 What's new in statistics (SEM)__

Stata 15's new latent class analysis capabilities are *Highlight 1* of the
release. Existing estimation command **gsem** performs latent class
analysis.

Also new are the following:

36. **Likelihood ratio, AIC, and BIC after classic latent class analysis**

New postestimation command **estat lcgof** reports the G^2
likelihood-ratio test after fitting classic latent class models
(logit outcome models). The test compares the fitted model to the
saturated model. The new command also reports AIC and BIC.

See **[SEM] estat lcgof**.

37. **Marginal means across latent classes**

New postestimation command **estat lcmean** computes marginal means in
latent class models for all response variables across the latent
classes.

See **[SEM] estat lcmean**.

38. **Predicted marginal probability of class membership**

New postestimation command **estat lcprob** reports the marginal
probabilities of class membership in latent class models.

See **[SEM] estat lcprob**.

39. **New predictions after fitting latent class models**

Existing postestimation command **predict** has new options after
fitting latent class models. They are

o **predict, classpr** to predict latent class probabilities;

o **predict, classposteriorpr** to predict posterior latent class
probabilities;

o **predict, mu marginal** to predict the overall expected value of
each outcome by summing the latent class means weighted by the
latent class probabilities;

o **predict, mu pmarginal** to predict the overall expected value of
each outcome by summing the latent class means weighted by the
posterior latent class probabilities; and

o **predict, mu class(***#***)** to predict the expected value of each
outcome for class *#*.

See **[SEM] predict after gsem**.

40. **gsem now fits multiple-group models**

Existing estimation command **gsem**, whether used to fit the new LCA
models or the existing generalized SEM models, now allows the
**group()** option just as command **sem** does. You can type

**. gsem** ...**, group(agegrp)**

41. **sem and gsem report multiple-group models in separate tables**

Both **sem** and **gsem** now report models in a more readable format.
Rather than a single table encompassing all multiple group
parameters, separate tables are produced.

New option **byparm** produces the old output.

See **[SEM] sem reporting options** and **[SEM] gsem reporting options**.

42. **gsem now fits truncated Poisson models**

**gsem**, whether used to fit the new LCA models or the existing
generalized SEM models, now fits truncated Poisson models if you
specify option **family(poisson, ltruncated(**...**))**.

See **[SEM] gsem family-and-link options**.

43. **Variances and covariances as standard deviations and correlations**

New postestimation command **estat sd** after **gsem** reports the estimated
variance components as standard deviations and correlations.

See **[SEM] estat sd**.

__1.3.9 What's new in statistics (panel data)__

New features are the following:

44. **Cointegration test for nonstationary process in panel data**

The new **xtcointtest** command tests for cointegration in nonstationary
panel data. It provides three methods, the ones due to Kao,
Pedroni, and Westerlund. All assume the same null hypothesis but
differ on their specification of the alternative hypotheses.

See **[XT] xtcointtest**.

45. **Option noskip renamed lrmodel**

Existing estimation commands **xtcloglog**, **xtintreg**, **xtlogit**, **xtnbreg**,
**xtologit**, **xtoprobit**, **xtpoisson**, **xtprobit**, **xtstreg**, and **xttobit** had
option **noskip**, which presented the model test as a likelihood-ratio
rather than the default Wald test. This option has been renamed
**lrmodel**. The old option name continues to work. (There was a
justification for the old name. Calculating the likelihood-ratio
test requires fitting the constant-only model. **noskip** specified
that fitting of that model not be skipped!)

__1.3.10 What's new in statistics (time series)__

Stata 15's new support for retrieving Federal Reserve Economic Data
(FRED) is *Highlight 16* of the release.

Also new are the following:

46. **Threshold regression**

New estimation command **threshold** fits threshold regressions. These
are linear regressions in which the coefficients change by estimated
cutpoints. This could be on the basis of time. Then you have one
set of coefficients before the first threshold, another set after
the first and before the second, and so on.

Or it could be on the basis of an exogenous variable. In that case,
you would have a set of coefficients when **x** < the first threshold,
another set after the first and before the second, and so on.

The lagged value of the dependent variable is an example of an
exogenous variable. In that case, you would have a set of
coefficients when **l.y** < the first threshold, another set after the
first and before the second, and so on. This last case is known as
the self-exciting threshold model.

You can specify or estimate the number of threshold points.

See **[TS] threshold**.

47. **Test for structural breaks after time-series regression**

The new **estat** **sbcusum** command is for use after **regress** on **tsset**
data.

It tests for stability of coefficients based on the cumulative sum
(cusum) of either the recursive residuals or the ordinary
least-squares residuals. This can be used to test for structural
breaks due to changes in regression coefficients over time.

**estat sbcusum** also plots the cusum versus time along with confidence
bands. The graph provides additional information that can help
identify time periods in which the coefficients are unstable.

See **[TS] estat sbcusum**.

48. **rolling, dots()**

Existing estimation command **rolling** fits rolling window and
recursive linear regressions. This can be time consuming. It has
new option **dots(***#***)**, which displays a dot every *#* replications. This
is not only entertaining; it provides information about the percent
of the calculation completed.

See **[TS] rolling**.

__1.3.11 What's new in statistics (multivariate)__

Latent class analysis is *Highlight 1* of the release. It is an
alternative to cluster analysis.

Also new is the following:

49. **New bayes: mvreg command**

Stata 15's new **bayes:** prefix command (*Highlight 2* of the release)
can be used with existing estimation command **mvreg** to fit Bayesian
multivariate regression models.

See **[BAYES] bayes**, **[BAYES] bayesian estimation**, and **[MV] mvreg**.

__1.3.12 What's new in functions__

Functions are used in expressions. For instance, **log()** is a function:

**. generate lincome = log(income)**

The functions listed below are also available in both Stata and Mata.

50. **Cauchy distribution**

A new family of Cauchy distribution functions are provided:

**cauchyden(***a***,***b***,***x***)** computes the density of the Cauchy distribution
with location parameter *a* and scale parameter *b*.

**cauchy(***a***,***b***,***x***)** computes the cumulative distribution function of
the Cauchy distribution with location parameter *a* and scale
parameter *b*.

**cauchytail(***a***,***b***,***x***)** computes the reverse cumulative Cauchy
distribution with location parameter *a* and scale parameter *b*.

**invcauchy(***a***,***b***,***p***)** computes the inverse cumulative Cauchy
distribution. If **cauchy(***a***,***b***,***x***)** = *p*, then **invcauchy(***a***,***b***,***p***)** =
*x*.

**invcauchytail(***a***,***b***,***p***)** computes the inverse reverse cumulative
Cauchy distribution. If **cauchytail(***a***,***b***,***x***)** = *p*, then
**invcauchytail(***a***,***b***,***p***)** = *x*.

**lncauchyden(***a***,***b***,***x***)** computes the natural logarithm of the density
of the Cauchy distribution with location parameter *a* and
scale parameter *b*.

**rcauchy(***a***,***b***)** is a Cauchy random-number generator. It computes
Cauchy random variates with location parameter *a* and scale
parameter *b*.

See **[FN] Statistical functions** and **[FN] Random-number functions**.

51. **Laplace distribution**

A new family of Laplace distribution functions are provided:

**laplaceden(***m***,***b***,***x***)** computes the density of the Laplace
distribution with mean *m* and scale parameter *b*.

**laplace(***m***,***b***,***x***)** computes the cumulative distribution function of
the Laplace distribution with mean *m* and scale parameter *b*.

**laplacetail(***m***,***b***,***x***)** computes the reverse cumulative Laplace
distribution with mean *m* and scale parameter *b*.

**invlaplace(***m***,***b***,***p***)** computes the inverse cumulative Laplace
distribution. If **laplace(***m***,***b***,***x***)** = *p*, then **invlaplace(***m***,***b***,***p***)**
= *x*.

**invlaplacetailb(***m***,***b***,***p***)** computes the inverse reverse cumulative
Laplace distribution. If **laplacetail(***m***,***b***,***x***)** = *p*, then
**invlaplacetail(***m***,***b***,***p***)** = *x*.

**lnlaplaceden(***m***,***b***,***x***)** computes the natural logarithm of the density
of the Laplace distribution with mean *m* and scale parameter
*b*.

**rlaplace(***m***,***b***)** is a Laplace random-number generator. It computes
Laplace random variates with mean *m* and scale parameter *b*.

See **[FN] Statistical functions** and **[FN] Random-number functions**.

52. **Stream random numbers**

All of Stata's and Mata's existing random-number functions can now
produce stream random numbers. Streams are necessary for running
simulations and bootstraps simultaneously. Stata's functions
previously produced single streams. In a single stream, setting the
seed determined the random numbers that would be produced. If two
routines running simultaneously set the same seed, they would obtain
the same random numbers. Multiple streams let them produce
different random numbers. Moreover, stream random-number generators
(RNGs) are designed so that you can simultaneously draw random
numbers and know that you are drawing from different sequences.

By default, Stata's and Mata's random-number functions are based on
an underlying RNG. They are **kiss32** and **mt64**. **mt64** is the default,
and **kiss32** was provided for backward compatibility. Now there is a
third RNG: **mt64s**, which is the stream version of the Mersenne
Twister.

To use stream random numbers, you must first set the RNG to **mt64s**:

**. set rng mt64s**

After that, you set the seed the usual way,

**. set seed** *#*

A new command allows you to set the stream,

**. set rngstream** *#*

where 1 __<__ *#* __<__ 32,767. Each stream can produce 2^{128} pseudorandom
numbers before the sequence repeats.

Thus you can launch multiple Statas and run the same do-file to
produce simulations. Each do-file can (and should) use the same
seed. Before starting the do-file, set the **rngstream**, or have the
do-file accept a stream argument to set the stream. Or launch the
separate Statas in batch mode and specify new start-up option
**rngstream***#*.

See **[R] set rngstream**.

__1.3.13 What's new in graphics__

Stata's new features allowing you to specify the transparency or opacity
of colors is *Highlight 14* of the release.

Also new are the following:

53. **Scalable vector graphics**

Stata now supports scalable vector graphics, also known as SVGs.
Vector graphic format is better than raster format because it is,
well, scalable. If you magnify the graph, it does not become grainy
or pixelated.

Scalable vector graphics are written in **.svg** files. This format is
especially popular for use on web pages.

Use **graph** **export,** **as(svg)**.

See **[G-2] graph export**.

54. **New marker symbols**

Markers are used to show where the data lie. Dots, hollow, or solid
circles are popular. Stata has lots of marker symbols. Now it has
more with short names **X**, **x**, **A**, **a**, **V**, **v**, and **|**. Here are all of
Stata 15's marker symbols:

Synonym
*symbolstyle* (if any) Description
-------------------------------------------------------
**circle** **O** solid
**diamond** **D** solid
**triangle** **T** solid
**square** **S** solid
**plus** **+**
**X** **X**
**arrowf** **A** filled arrow head
**arrow** **a**
**pipe** **|**
**V** **V**

**smcircle** **o** solid
**smdiamond** **d** solid
**smsquare** **s** solid
**smtriangle** **t** solid
**smplus**
**smx** **x**
**smv** **v**

**circle_hollow** **Oh** hollow
**diamond_hollow** **Dh** hollow
**triangle_hollow** **Th** hollow
**square_hollow** **Sh** hollow

**smcircle_hollow** **oh** hollow
**smdiamond_hollow** **dh** hollow
**smtriangle_hollow** **th** hollow
**smsquare_hollow** **sh** hollow

**point** **p** a small dot
**none** **i** a symbol that is invisible
-------------------------------------------------------

You cannot rotate the arrows yet, but that is forthcoming.

See **[G-4]** *symbolstyle*.

55. **New graph command for use after fitting nonparametric regression**
**models**

New postestimation command **npgraph** is for use after fitting a
nonparametric regression model using the new **npregress** command.
**npregress** is *Highlight 11* of the release. **npgraph** plots the
nonparametric function fit by **npregress** along with a scatterplot of
the data.

See **[R] npregress postestimation**.

56. **.gph file format updated**

Stata's **.gph** file format was updated because of the new transparent
colors and marker symbols. Previous Statas will not be able to read
the new format, but Stata 15 can read old formats without
difficulty.

__1.3.14 What's new in data management__

Stata's new ICD-10 features is *Highlight 15* of the release, and Stata's
new support of Federal Reserve Data Economic Data is *Highlight 16*.

Also new are the following:

57. **use ... in faster**

Stata's **use** command is now significantly faster when you specify **in**
*range*.

58. **Import and export of dBase files**

New command **import dbase** imports dBase version III and version IV
**.dbf** files. dBase was one of the first micro database management
systems for microcomputers and is still used today.

New command **export dbase** exports to dBase IV format.

See **[D] import dbase**.

59. **Stata/MP allows up to 120,000 variables**

Stata/MP's increase to 120,000 variables is up from 32,767
variables. Stata/SE continues to support up to 32,767 variables,
and Stata/IC continues to support up to 2,047 variables.

60. **statsby, dots()**

Existing command **statsby** has a new option **dots(***#***)** that displays dots
every *#* replications. This provides entertainment and confirmation
that the command is still working during long runs.

See **[D] statsby**.

__1.3.15 What's new in programming__

Dynamic documents using new command **markdown** is *Highlight 5* of the
release. Producing PDF and Word documents using new commands **putpdf** and
**putdocx** is *Highlight 13*.

Also new are the following:

61. **New Java plugin features**

Stata's Java plugins now have features to store and access

o Stata's returned results

o Stata's dataset characteristics

o Stata's **strL** variables as a buffered array

o Stata's string scalars

o Stata's variable types

o Stata's matrices (they could already handle Mata's matrices)

In addition:

1. It is now easier to access Stata's and Mata's matrix
elements.

2. Java plugins can now call Stata commands.

3. Java plugins now use a custom class loader.

a. Stata no longer needs to be restarted after installation
of a new Java plugin, and you can now detach plugins
without restarting Stata.

b. The loader allows for isolation of dependencies between
plugins.

See **[P] java** and **[P] javacall**.

62. **postfile bug fix**

**postfile** previously allowed variable names sometimes to be reserved
words. You could create a variable named **int**, for instance. This
bug is fixed under version control.

63. **New features to support new syntax for free parameters**

Stata has new syntax for free parameters in fitted models. We
mentioned this from the user's perspective in *What's new in*
*statistics (general)*. To rehash, **/***name* is no longer a synonym for
*name***:_cons**; **/***name* is its own thing for free parameters.

Stata has new functions for dealing with matrix row and column names
that include **/***name*:

**coleqnumb(***M***,***s***)** returns the equation number of matrix *M* associated
with column equation *s*.

**roweqnumb(***M***,***s***)** returns the equation number of matrix *M* associated
with row equation *s*.

**colnfreeparms(***M***)** returns the number of free parameters in columns
of matrix *M*.

**rownfreeparms(***M***)** returns the number of free parameters in the
rows of matrix *M*.

Stata also has new macro functions:

**local** *lname* **:** **colnumb** *matrixname* *string*

**local** *lname* **:** **rownumb** *matrixname* *string*

**local** *lname* **:** **coleqnumb** *matrixname* *string*

**local** *lname* **:** **roweqnumb** *matrixname* *string*

**local** *lname* **:** **colnfreeparms** *matrixname*

**local** *lname* **:** **rownfreeparms** *matrixname*

**local** *lname* **:** **colnlfs** *matrixname*

**local** *lname* **:** **rownlfs** *matrixname*

**local** *lname* **:** **colsof** *matrixname*

**local** *lname* **:** **rowsof** *matrixname*

**local** *lname* **:** **colvarlist** *matrixname*

**local** *lname* **:** **rowvarlist** *matrixname*

**local** *lname* **:** **collfnames** *matrixname*

**local** *lname* **:** **rowlfnames** *matrixname*

See **[FN] Matrix functions** and **[P] macro**.

__1.3.16 What's new in Mata__

New are the following:

64. **Cauchy and Laplace distribution functions**

The Cauchy and Laplace distribution functions added to Stata have
been added to Mata, too.

See *What's new in functions*.

65. **Functions for calculating values and derivatives of the multivariate**
**normal distribution**

These functions allow fixed or varying limits, means, and
variances/covariances/correlations. Define

*L* is the vector lower limits (default -infinity)

*U* is the vector upper limits (default infinity)

*m* is the mean vector (default 0)

*R* is the correlation matrix

*V* is the variance matrix

The new functions return the value of the multivariate normal
distribution between *L* and *U*:

**mvnormal(***U***,***R***)**

**mvnormal(***L***,***U***,***R***)**

**mvnormalcv(***L***,***U***,***M***,***V***)**

**mvnormalderiv(***U***,***R***,***dU***,***dR***)** returns derivatives in *dU*, *dR*

**mvnormalderiv(***L***,***U***,***R***,***dL***,***dU***,***dR***)** returns derivatives in *dL*, *dU*, *dR*

**mvnormalcvderiv(***L***,***U***,***M***,***V***,***dL***,***dU***,***dM***,***dV***)** returns derivatives in *dL*,
*dU*, *dM*, *dV*

There are also versions of the above functions that allow
specification of the number of quadrature points.

See **[M-5] mvnormal()**.

66. **Open Office XML files**

New functions were added to the suite for generating Open Office XML
(**.docx**) files:

**_docx_append()**

**_docx_cell_set_span()**

See **[M-5] _docx*()**.

67. **PDF files**

New functions were added to the suite for generating PDF files:

**PdfDocument.setLandscape()**

**PdfParagraph.addLineBreak()**

**PdfParagraph.setVAlignment()**

**PdfTable.setCellBorderWidth()**

**PdfTable.setCellBorderColor()**

**PdfTable.setCellMargin()**

**PdfTable.setRowSplit()**

**PdfTable.addRow()**

**PdfTable.delRow()**

**PdfTable.addColumn()**

**PdfTable.delColumn()**

The following existing functions now have optional arguments and
added capabilities:

**PdfTable.setBorderWidth()**

**PdfTable.setBorderColor()**

**PdfTable.fillStataMatrix()**

**PdfTable.fillMataMatrix()**

See **[M-5] Pdf*()**.

68. **Percent encoding for URLs**

New function **urlencode(***s* [**,** *useplus*]**)** returns *s* with any reserved
characters changed to percent-encoded ASCII. Special characters are
replaced by **%** followed by two hexadecimal digits. For instance,
each space is replaced with **%20**. If **useplus** is specified and
nonzero, spaces are changed to **+**.

New function **urldecode(***s***)** returns *s* with percent encoding undone.

See **[M-5] urlencode()**.

69. **New option moptimize_init_eq_freeparm()**

New function **moptimize_init_eq_freeparm(***M***,** *i***,** {**"on"**|**"off"**}**)**
specifies whether the equation for the *i*th parameter is to be
treated as a free parameter. This setting is ignored if there are
independent variables or an offset attached to the parameter. Free
parameters have a shortcut notation that distinguishes them from
constant linear equations. The free parameter notation for an
equation labeled name is **/***name*. The corresponding notation for a
constant linear equation is *name***_cons**.

See **[M-5] moptimize()**.

__1.3.17 What's new in the interface__

New are the following:

70. **Do-file Editor is improved**

The Do-file Editor is improved. In Stata for Windows,

o the current (active) line is now highlighted;

o colors for line numbers, margins, bookmarks, etc. can be set;

o bookmarks can be added or deleted by clicking in the
bookmarks margin; and

o high-contrast mode is better supported.

Under all operating systems, including Windows,

o column-mode selection and editing are now provided;

o the new indentation guide aids in writing clean code by
displaying vertical lines at every tab stop;

o character encoding of legacy do-files can now be specified so
that any extended ASCII characters are converted to the right
Unicode character;

o comments (**/* */** and **//**) can be added or removed from a
selection;

o code folding for **program**, **mata**, and **input** is now provided;

o wrapped lines can be marked visually;

o program code can be automatically reindented to be properly
aligned, and spaces are converted to tabs.

Concerning column-mode editing: use *Alt*+mouse dragging or
*Alt*+*Shift*+arrow keys on Windows. Substitute *Option* for *Alt* on Mac
and *Ctrl* for *Alt* on Linux.

71. **set more off now the default**

Stata displays --**more**-- when output is about to scroll off the
screen. You press the space bar or click on the **More** button, and
another page of output appears. This is called **set** **more** **on**.

**set** **more** **off** is now the default.

If you prefer the old behavior, type **set** **more** **on,** **permanently**.

See **[R] more**.

72. **If you do set more on ...**

The **More** button has a useful new feature. When output is paused,
click on the triangle to the right of the **More** button. You will
have two choices:

**Show more results**

**Run to completion**

Click on **Run to completion**, and **more** will be temporarily turned off
until the currently running command (or do-file!) completes.

See **[R] more**.

73. **Option to suppress header and footer in logs**

**log** has new option **nomsg**, which suppresses placing the header and
footer in the log. The header reports the filename and date and
time that the log was opened, and the footer reports the filename
and date and time that the log was closed.

See **[R] log**.

74. **Swedish language support**

Swedish now joins Spanish and Japanese as languages for which
Stata's menus, dialogs, and the like can be displayed. Manuals and
help files remain in English.

If your computer locale is set to Sweden, Stata will automatically
use its Swedish setting. To change languages manually, select **Edit**
**> Preferences** **> User-interface language...** using Windows or Unix, or
on the Mac, select **Stata 15 > Preferences > User-interface**
**language...**. You can also change the language from the command
line; see **[P] set locale_ui**.

StataCorp gratefully acknowledges the efforts of Metrika Consulting
AB, Stata's official distributor in Sweden, Finland, Norway, and
Denmark, for the translation to Swedish.

__1.3.18 What's more__

We have not listed all the changes, but we have listed the important
ones.

Stata is continually being updated. Those between-release updates are
available for free over the Internet.

Type **update query** and follow the instructions.

We hope that you enjoy Stata 15.

-------- **previous updates** -----------------------------------------------------

See whatsnew14.

+---------------------------------------------------------------+
| help file contents years |
|---------------------------------------------------------------|
| whatsnew Stata 15.0 and 15.1 2017 to present |
| **this file** Stata 15.0 new release 2017 |
| whatsnew14 Stata 14.0, 14.1, and 14.2 2015 to 2017 |
| whatsnew13to14 Stata 14.0 new release 2015 |
| whatsnew13 Stata 13.0 and 13.1 2013 to 2015 |
| whatsnew12to13 Stata 13.0 new release 2013 |
| whatsnew12 Stata 12.0 and 12.1 2011 to 2013 |
| whatsnew11to12 Stata 12.0 new release 2011 |
| whatsnew11 Stata 11.0, 11.1, and 11.2 2009 to 2011 |
| whatsnew10to11 Stata 11.0 new release 2009 |
| whatsnew10 Stata 10.0 and 10.1 2007 to 2009 |
| whatsnew9to10 Stata 10.0 new release 2007 |
| whatsnew9 Stata 9.0, 9.1, and 9.2 2005 to 2007 |
| whatsnew8to9 Stata 9.0 new release 2005 |
| whatsnew8 Stata 8.0, 8.1, and 8.2 2003 to 2005 |
| whatsnew7to8 Stata 8.0 new release 2003 |
| whatsnew7 Stata 7.0 2001 to 2002 |
| whatsnew6to7 Stata 7.0 new release 2000 |
| whatsnew6 Stata 6.0 1999 to 2000 |
+---------------------------------------------------------------+
-------------------------------------------------------------------------------