New features in Stata 19

Financial statistics StataNow
Local average treatment effects (LATE) StataNow
Import Parquet data StataNow
Psychometric meta-analysis StataNow
Causal mediation analysis with two mediators StataNow
Proportional odds test StataNow
Moderating effects for heterogeneous DID StataNow
Power analysis for logistic regression StataNow
Convert Word to HTML, EPUB, and more StataNow
Convert PDF to plain text StataNow
VCE additions for linear models StataNow
Discrete derivatives StataNow

The DerivDiscreteDiff() class can be used to compute the coefficients for a real, discrete numerical derivative using finite difference approximation.
The DerivDiscretePartial() class can be used to compute discrete numerical partial derivatives.
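
To make the idea of finite-difference coefficients concrete, here is a minimal sketch in Python (not Mata, and not the DerivDiscreteDiff() interface). It solves the Taylor-series matching conditions exactly for an arbitrary stencil. The function names `finite_difference_weights` and `apply_stencil` are hypothetical and chosen only for this illustration.

```python
from fractions import Fraction
from math import factorial


def finite_difference_weights(offsets, order):
    """Exact weights c_j so that sum_j c_j * f(x + offsets[j]*h) / h**order
    approximates the order-th derivative of f at x."""
    n = len(offsets)
    if order >= n:
        raise ValueError("need more stencil points than the derivative order")
    # Taylor matching conditions: sum_j c_j * s_j**k = k! if k == order else 0.
    rows = [
        [Fraction(s) ** k for s in offsets]
        + [Fraction(factorial(order) if k == order else 0)]
        for k in range(n)
    ]
    # Gauss-Jordan elimination in exact rational arithmetic.
    for col in range(n):
        pivot = next(r for r in range(col, n) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        pivot_value = rows[col][col]
        rows[col] = [v / pivot_value for v in rows[col]]
        for r in range(n):
            if r != col and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[col])]
    return [rows[j][n] for j in range(n)]


def apply_stencil(f, x, h, offsets, order):
    """Evaluate the finite-difference approximation on the given stencil."""
    weights = finite_difference_weights(offsets, order)
    return sum(float(w) * f(x + s * h) for w, s in zip(weights, offsets)) / h ** order


# Central three-point stencil: weights for the first and second derivative.
print(finite_difference_weights([-1, 0, 1], 1))  # weights -1/2, 0, 1/2
print(finite_difference_weights([-1, 0, 1], 2))  # weights 1, -2, 1
```

The same routine handles one-sided or uneven stencils, because it only assumes that the offsets are distinct.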

HDFE interactions StataNow

The areg, ivregress 2sls, and xtreg, fe commands now allow you to use factor-variable notation when specifying the variables to be absorbed in the absorb() option. This is particularly useful when you wish to include interactions between high-dimensional fixed effects and continuous variables.

Do-file Editor enhancements StataNow

Bracket pair colorization
The Do-file Editor now supports bracket pair colorization for do- and ado-files. Bracket pair colorization is a feature where matching brackets — (), {}, and [] — are highlighted so that users can follow nested code structure at a glance. Each nesting level of brackets receives a distinct, reusable color, making it easy to trace from an opening (, {, or [ to its matching close bracket even in deeply nested, multiline expressions. Unmatched brackets are also displayed in a unique color, allowing them to stand out in the code. You can enable or disable bracket pair colorization in the Do-file Editor preferences as well as customize the different nesting level colors and the unmatched bracket color.
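
The nesting logic behind this kind of colorization can be sketched with a stack. The Python below is an illustration of the general technique only, not the Do-file Editor's implementation. The three-slot palette is an assumption, the function name `bracket_levels` is hypothetical, and the sketch ignores brackets inside strings and comments.

```python
PAIRS = {")": "(", "}": "{", "]": "["}
OPENERS = set(PAIRS.values())


def bracket_levels(text, palette_size=3):
    """Return (index, char, label) for every bracket in text.

    label is a color slot 0..palette_size-1 (slots are reused as nesting
    deepens) or the string "unmatched".
    """
    stack = []   # (index, char) of brackets that are still open
    labels = {}
    for index, ch in enumerate(text):
        if ch in OPENERS:
            stack.append((index, ch))
        elif ch in PAIRS:
            if stack and stack[-1][1] == PAIRS[ch]:
                open_index, _ = stack.pop()
                level = len(stack) % palette_size  # depth of this pair
                labels[open_index] = level
                labels[index] = level
            else:
                labels[index] = "unmatched"
    # Anything still open at the end never found a partner.
    for open_index, _ in stack:
        labels[open_index] = "unmatched"
    return [(i, text[i], labels[i]) for i in sorted(labels)]


for item in bracket_levels("gen y = cond(x[_n-1] > 0, (a + b), c"):
    print(item)
```

In the example line, the outer parenthesis after cond is never closed, so it is reported as unmatched while the inner pairs still receive color slots.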
Change history ribbon
The Do-file Editor can now display the types of changes made to a document as colored markers in the change history ribbon, located in the margin. Two marker colors identify lines that have been modified and lines that have been reverted to their original state. To specify whether you want the change history ribbon to be visible, right-click in the Do-file Editor, select Preferences..., and check/uncheck Change history in the General or Display tab.
Code folding line guide
The Do-file Editor can now display tree-style line guides in the code folding margin. To specify whether you want to display line guides in the code folding margin, select Preferences..., and check Show line guide in the General or Display tab.
Default action of Do button
In the Do-file Editor, you can now set the default action of the Do button in the toolbar. To set the default action of the Do button, right-click in the Do-file Editor, select Preferences..., and select an action from the dropdown box Default action of Do button: in the Advanced tab.
Improvements for executing selected lines
In the Do-file Editor, you can now select Tools > Execute (do) lines on Windows and Unix or View > Do-file Editor > Execute (do) lines on Mac while multiple lines of text are selected to execute all selected lines. After the lines are executed, the cursor will automatically advance to the next line or code block that can be executed, skipping comments and blank lines.

Improved variable name truncation in the Data Editor StataNow

In the Data Editor grid, you can now select whether variable names will be truncated at the end (the default), in the middle, or 1, 2, 3, or 4 characters before the end. You can change the truncation behavior in the Data Editor's preference dialog.
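
The three truncation styles can be illustrated with a short Python sketch. This is not the Data Editor's code. The `~` marker, the column width, and the function name `truncate_name` are assumptions made only for this example.

```python
import math


def truncate_name(name, width, keep_tail=0, middle=False, marker="~"):
    """Shorten name to at most width characters.

    keep_tail=0 truncates at the end, keep_tail=1..4 keeps that many
    trailing characters, and middle=True splits the room evenly.
    """
    if len(name) <= width:
        return name
    room = width - len(marker)
    if middle:
        head = math.ceil(room / 2)
        tail = room - head
    else:
        tail = keep_tail
        head = room - tail
    if head < 1 or tail < 0:
        raise ValueError("width is too small for the requested truncation")
    return name[:head] + marker + name[len(name) - tail:]


print(truncate_name("household_income", 10))               # household~
print(truncate_name("household_income", 10, keep_tail=2))  # househo~me
print(truncate_name("household_income", 10, middle=True))  # house~come
```

Keeping a few trailing characters is helpful when variable names differ only in a numeric or wave suffix.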

Download business calendars StataNow

With the new bcal webcopy command, you can download a business calendar for a common stock exchange—S&P 500, NASDAQ, NYSE, and more—and then use this calendar to define dates and easily analyze data that are collected only on the trading days of the selected exchange.

Faster feasible generalized least-squares estimation StataNow

The xtgls command for fitting linear panel-data models via feasible generalized least squares is now faster when fitting models with a homoskedastic or heteroskedastic error structure and no cross-sectional correlation.

Mata quantile() function StataNow

With the new quantile() function, you can now compute quantiles in Mata. You can choose from 3 discontinuous methods and 11 continuous methods for calculating quantiles.
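
The distinction between discontinuous and continuous quantile definitions can be sketched in Python. This is not the Mata quantile() function. The two definitions used here (the inverse empirical CDF and linear interpolation between order statistics) are common textbook choices picked for illustration, and the source does not say which of Stata's methods they correspond to.

```python
import math


def quantile_discontinuous(values, p):
    """Inverse empirical CDF: the smallest order statistic x_(k) with k/n >= p."""
    data = sorted(values)
    n = len(data)
    k = max(1, math.ceil(n * p))
    return data[k - 1]


def quantile_continuous(values, p):
    """Linear interpolation between order statistics at position (n - 1) * p."""
    data = sorted(values)
    n = len(data)
    position = (n - 1) * p
    lower = math.floor(position)
    upper = min(lower + 1, n - 1)
    fraction = position - lower
    return data[lower] + fraction * (data[upper] - data[lower])


sample = [1, 2, 3, 4]
print(quantile_discontinuous(sample, 0.5))  # 2, always an observed value
print(quantile_continuous(sample, 0.5))     # 2.5, interpolated
```

A discontinuous method always returns one of the observed values, whereas a continuous method varies smoothly with p.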

N-dimensional matrices in Mata StataNow

You can now define an N-dimensional matrix in Mata with the new NDMatrix() class.

Heterogeneous DID enhancements StataNow

The hdidregress twfe and xtdidregress twfe commands have the new over() option, which allows you to specify whether estimation is performed using cohort-time-specific covariate means or cohort-specific covariate means.
The hdidregress twfe and xtdidregress twfe commands have the new noxinteract option, which omits the interaction of covariates with the difference-in-differences variables in the model so that covariates enter only as levels.

Bayesian variable selection for linear model
Bayesian bootstrap and replicate weights
Bayesian quantile regression
Bayesian asymmetric Laplace model
New priors for Bayesian analysis

The bayesmh command and bayes prefix now support the half-Cauchy prior. This heavy-tailed prior is useful for modeling nonnegative parameters that tend to have large values, such as variances and standard deviations.
The bayesmh command and bayes prefix now support the Rayleigh prior. This prior distribution is related to \(\chi^2\) and exponential distributions and thus can be used to model nonnegative parameters with skewed distributions. It is often used in physics and engineering to model parameters—for instance, wind speed—that correspond to a norm of a bivariate standard normal random vector.

Gibbs sampling for normal linear models with Laplace priors

The bayesmh command with a univariate normal likelihood and Laplace prior for regression coefficients or with a mean parameter of a normal prior and a Laplace hyperprior now supports Gibbs sampling via the gibbs suboption of the block() option.

The bayesmh command with user-defined evaluators has the following new features:

bayesmh now provides efficient estimation of random-effects parameters in evaluators by using the reffects suboption of the block() option.
In addition, the evaluator() and llevaluator() options support the reparameters() suboption, which allows the values of random-effects parameters to be passed to the evaluator as temporary variables. These temporary variables are listed as arguments to the evaluators after the names of temporary scalars and matrices for model parameters specified in the parameters() suboption.
bayesmh now supports predictions in evaluators. You may implement predictions in your posterior or likelihood evaluators and still be able to use the bayespredict postestimation command after bayesmh.
Specifically, the evaluator() and llevaluator() options support the predict suboption, which indicates that the evaluator includes code for generating random samples for the outcome.
Evaluators in bayesmh are now expected to compute log-likelihood values over the observation sample and return them in a vector form instead of returning the scalar overall log-likelihood value.

Conditional average treatment effects (CATE)
Local average treatment effects (LATE) StataNow
Causal mediation analysis with two mediators StataNow
HC3 bias-corrected standard errors for DID estimation StataNow
Moderating effects for heterogeneous DID StataNow
Heterogeneous DID enhancements StataNow

The hdidregress twfe and xtdidregress twfe commands have the new over() option, which allows you to specify whether estimation is performed using cohort-time-specific covariate means or cohort-specific covariate means.
The hdidregress twfe and xtdidregress twfe commands have the new noxinteract option, which omits the interaction of covariates with the difference-in-differences variables in the model so that covariates enter only as levels.

The teffects aipw command for estimating treatment effects via augmented inverse-probability weighting can now provide estimates of the ATETs and can adjust results for sampling weights.
The didregress and xtdidregress commands for difference-in-differences estimation have an improved algorithm to compute confidence intervals via the wild bootstrap.

Additional updates

The estat aggregation command for aggregating average treatment effects on the treated after hdidregress and xthdidregress now allows you to specify the weights() option to determine the type of weights used when aggregating. weights(timecohort) uses weights that vary across cohorts and time periods. weights(cohort) uses weights that vary across cohorts but are constant for all time periods within each cohort.
The hdidregress and xthdidregress commands now allow the usercohort option for specifying a known cohort variable rather than allowing these commands to determine cohorts based on the time and group variables.
The new gencohort command creates a cohort variable to be used with hdidregress and xthdidregress.

Multiple datasets: Modify a set of frames
Import Parquet data StataNow
Download business calendars StataNow

With the new bcal webcopy command, you can download a business calendar for a common stock exchange—S&P 500, NASDAQ, NYSE, and more—and then use this calendar to define dates and easily analyze data that are collected only on the trading days of the selected exchange.

The reshape command for converting data from wide to long form and vice versa now defaults to favoring speed instead of favoring memory.
The list command has three new options:

sepbyexp(exp) draws a separator line whenever the value of expression exp changes. exp does not necessarily have to refer to the variables in the dataset.
footer displays variable names as a footer.
relative displays relative observation numbers when a subset of observations is listed.

Additional updates

You can now easily copy value labels across frames. With the new fromframe() and toframe() options of label copy, you can copy a value label from one frame into another frame. The new frame putlabel command allows you to copy multiple value labels from the current frame into one or more other frames.
The new label rename command allows you to rename value labels.

Financial statistics StataNow

You can now generate returns with finreturns, construct portfolios with finportfolio, summarize financial data with finsummarize, and compute value at risk with finvalrisk. New financial regression models are also available—fit a CAPM with finregress capm, or fit a Fama–MacBeth regression with finregress fmb.

Latent class model-comparison statistics

High-dimensional fixed effects (HDFE)
Bayesian bootstrap and replicate weights
Control-function linear and probit models
Inference robust to weak instruments
Proportional odds test StataNow
More powerful tables for ANOVA and tabulations
HDFE interactions StataNow
- The areg, ivregress 2sls, and xtreg, fe commands now allow you to use factor-variable notation and specify continuous variables and interactions in the absorb() option. This is particularly useful for including interactions of continuous variables and high-dimensional fixed effects.
The nl command for nonlinear least-squares estimation is now more powerful, with a substitutable expression parser that allows you to define more complicated expressions for linear and nonlinear functions of parameters. Also, it now uses Mata's moptimize() Gauss–Newton algorithm and Mata's deriv() function for computing numerical derivatives.
The generalized method of moments command gmm is now much faster with the xtinstruments() option for specifying panel-style instruments and numerical derivatives. The speed is much improved when the number of panels is large.
The areg command has the new dfabsorb option to adjust degrees of freedom for pairwise collinearity among absorbed variables.
The ivregress 2sls command now computes regression coefficients and their standard errors by using the Mata orthogonal triangularization routine hqrdp, which is more numerically stable than the previous routine.

Additional updates

The dtable command for creating tables of descriptive statistics now allows you to report Kendall's rank correlation test.

The lincom command now accepts the coef option after logistic regression to report linear combinations of estimated coefficients instead of transforming these linear combinations to odds ratios.
The bootstrap, collect, jackknife, nestreg, permute, simulate, statsby, and stepwise prefix commands are now updated to accommodate community-contributed prefix commands.
The mean, proportion, ratio, and total commands now allow you to specify the dofsubpop option to restrict the degrees-of-freedom calculation to use only observations within each subpopulation defined by the over() option.

Bar graph CIs, heat maps, and more
Colors by variable for more graphs

The colorvar() option is now available with additional two-way plots: line, connected, tsline, rline, rconnected, and tsrline. This means that you can vary the color of lines, markers, and more in these plots based on the values of a specified variable.

Additional updates

The graph bar, graph box, and graph dot commands now allow the assecondcategory option that is useful when you have specified multiple y variables and you have specified the over() option to create bars, boxes, or dots for each category of the over() variable. With assecondcategory, the bars, boxes, or dots are plotted by grouping first based on the over() categories and grouping second on the y variables.
The twoway bar command for creating bar plots and twoway rbar command for creating range plots with bars now allow you to specify the baroffset() option to offset the center of the bar.

Autocompletion, templates, and more
Stata in French
Do-file Editor enhancements StataNow

Bracket pair colorization
The Do-file Editor now supports bracket pair colorization for do- and ado-files. Bracket pair colorization is a feature where matching brackets — (), {}, and [] — are highlighted so that users can follow nested code structure at a glance. Each nesting level of brackets receives a distinct, reusable color, making it easy to trace from an opening (, {, or [ to its matching close bracket even in deeply nested, multiline expressions. Unmatched brackets are also displayed in a unique color, allowing them to stand out in the code. You can enable or disable bracket pair colorization in the Do-file Editor preferences as well as customize the different nesting level colors and the unmatched bracket color.
Change history ribbon
The Do-file Editor can now display the types of changes made to a document as colored markers in the change history ribbon, located in the margin. Two marker colors identify lines that have been modified and lines that have been reverted to their original state. To specify whether you want the change history ribbon to be visible, right-click in the Do-file Editor, select Preferences..., and check/uncheck Change history in the General or Display tab.
Code-folding line guide
The Do-file Editor can now display tree-style line guides in the code-folding margin. To specify whether you want to display line guides in the code-folding margin, select Preferences..., and check Show line guide in the General or Display tab.
Default action of Do button
In the Do-file Editor, you can now set the default action of the Do button in the toolbar. To set the default action of the Do button, right-click in the Do-file Editor, select Preferences..., and select an action from the dropdown box Default action of Do button: in the Advanced tab.
Improvements for executing selected lines
In the Do-file Editor, you can now select Tools > Execute (do) lines on Windows and Unix or View > Do-file Editor > Execute (do) lines on Mac while multiple lines of text are selected to execute all selected lines. After the lines are executed, the cursor will automatically advance to the next line or code block that can be executed, skipping comments and blank lines.

Stata now shows the name of the current (working) frame in the title bar when multiple frames are present. This allows you to easily identify the frame in which any commands issued without a frame prefix will be executed.
Improved variable name truncation in the Data Editor StataNow

In the Data Editor grid, you can now select whether variable names will be truncated at the end (the default), in the middle, or 1, 2, 3, or 4 characters before the end. You can change the truncation behavior in the Data Editor's preference dialog.

Additional updates

You can now specify that the Do-file Editor save backup files in an operating-system-specific directory that is local to the host computer instead of saving the files in the same directory that the document is saved in. This setting can be found in the Do-file Editor's advanced settings. Saving to a directory on the host computer is ideal when editing documents from a network drive where saving files may be slow, or when collaborating with other users where an existing backup file can cause conflicts when editing a shared document.
On Windows and Unix, when a file on disk is open in the Do-file Editor, the Do-file Editor will now prompt you if the contents of the file have been changed outside the Do-file Editor. You can then click on OK to reload the file, click on Cancel to ignore the changes and leave the open file as is, or click on Auto to reload the file and automatically load changed files in the future. You can modify how the Do-file Editor monitors changes to files on disk via the Do-file Editor's Advanced settings. Note that Stata for Mac already supported watching for changes to files on disk.

Conditional average treatment effects (CATE)
High-dimensional fixed effects (HDFE)
Bayesian variable selection for linear model
Correlated random-effects (CRE) model
Control-function linear model
VCE additions for linear models StataNow
Inference robust to weak instruments
Mundlak specification test
Panel-data VAR model
Bayesian quantile regression
HDFE interactions StataNow

The areg, ivregress 2sls, and xtreg, fe commands now allow you to use factor-variable notation when specifying the variables to be absorbed in the absorb() option. This is particularly useful when you wish to include interactions between high-dimensional fixed effects and continuous variables.

The areg command has the new dfabsorb option to adjust degrees of freedom for pairwise collinearity among absorbed variables.
The ivregress 2sls command now computes regression coefficients and their standard errors using the Mata orthogonal triangularization routine hqrdp, which is more numerically stable than the previous routine.

Machine learning via H2O: Ensemble decision trees

Additional updates

The new h2omlgraph permimp command plots permutation variable importance. The graph is useful for identifying influential predictors after using h2oml for gradient boosting or random forest.
You can now obtain regression diagnostic plots after using h2oml gbregress to perform gradient boosting regression or h2oml rfregress to perform random forest regression.

h2omlgraph rvfplot graphs the residuals against the fitted values.
h2omlgraph rvpplot graphs the residuals against a predictor.

Stata for Mac on Apple silicon now uses LAPACK/OpenBLAS for numerical computations.
The new quantile() function computes quantiles. You can choose from 3 discontinuous methods and 11 continuous methods for calculating quantiles. StataNow
Two Mata classes for discrete derivatives are now available. StataNow
- The DerivDiscreteDiff() class can be used to compute the coefficients for a real, discrete numerical derivative using finite difference approximation.
- The DerivDiscretePartial() class can be used to compute discrete numerical partial derivatives.
The new _invmat() function finds the inverse of a square matrix (or the pseudoinverse if the matrix is not invertible).
The new _solvemat() function solves AX=B, and it allows you to specify the matrix type, such as lower triangular or symmetric and positive definite, and chooses the suitable solver correspondingly.
The new NDMatrix() class defines an N-dimensional matrix. StataNow
The deriv() function for numerical derivatives has the following improvements:

You can now use the Richardson extrapolation method to compute the numerical derivatives by specifying "richardson" in the deriv_init_technique() function.
You can now use deriv_init_tablesize() to specify the table size for computing numerical derivatives by using the Richardson extrapolation method.
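
To show what Richardson extrapolation does and how a table size enters, here is a minimal sketch in Python rather than Mata. It does not call deriv() and is not Stata's implementation. The starting step `h`, the step-halving scheme, and the function name `richardson_derivative` are assumptions for illustration.

```python
import math


def richardson_derivative(f, x, h=0.1, table_size=4):
    """First derivative of f at x from a Richardson extrapolation table.

    Column 0 holds central differences with steps h, h/2, h/4, ...;
    each further column cancels the next even power of the step.
    """
    table = [[0.0] * table_size for _ in range(table_size)]
    for i in range(table_size):
        step = h / 2 ** i
        table[i][0] = (f(x + step) - f(x - step)) / (2 * step)
        for j in range(1, i + 1):
            factor = 4 ** j
            table[i][j] = (
                factor * table[i][j - 1] - table[i - 1][j - 1]
            ) / (factor - 1)
    # The bottom-right entry is the most refined estimate.
    return table[table_size - 1][table_size - 1]


estimate = richardson_derivative(math.sin, 1.0)
print(abs(estimate - math.cos(1.0)))  # small error despite the coarse step
```

A larger table cancels more error terms but needs more function evaluations, which is the trade-off a table-size setting controls.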

The new lssolve() function solves AX=B for the minimum-norm solution X by using the least-squares method.
The new lsesolve() function solves AX=c with equality constraints for the minimum-norm solution X using the least-squares method.
The new lsglmsolve() function finds the solution of a general Gauss–Markov linear model with equality constraints.
The new st_datalabel() function returns the label of the dataset currently loaded in Stata.
The new st_datalabel(string scalar name) function sets or resets the label of the dataset currently loaded in Stata.
The new st_vldir() function returns a string vector of names of all value labels.

High-dimensional fixed effects (HDFE)
Correlated random-effects (CRE) model
Panel-data VAR model
Bayesian panel-data quantile regression
Mundlak specification test
Driscoll–Kraay SEs for linear fixed-effects models StataNow
The xtreg, fe command has the new dfabsorb option to adjust degrees of freedom for pairwise collinearity among absorbed variables.
Faster feasible generalized least-squares estimation StataNow
- The xtgls command for fitting linear panel-data models via feasible generalized least squares is now faster when fitting models with a homoskedastic or heteroskedastic error structure and no cross-sectional correlation.

The pkequiv command for performing bioequivalence tests has the following improvements:

Log-scale analysis of bioequivalence is now performed by default, following guidance from the US FDA and the European EMA.
Schuirmann's two one-sided tests are now performed by default.
The new reps() option controls the number of bootstrap replications.
Additional results are now stored. These include r(limit_table), a matrix of equivalence limits, confidence intervals, and estimates.

Power analysis for logistic regression StataNow
power and ciwidth have new, more descriptive labels for table columns that are ratios of two other columns.

PyStata enhancements
The PyStata features for integrating Python into Stata and integrating Stata into Python have the following improvements:

When running Stata code within an IPython kernel-based environment, such as Jupyter Notebook and console as well as Jupyter Lab and console, and within other environments that support the IPython kernel, such as Spyder IDE and PyCharm IDE, Stata’s variable names in the current working dataset, macro names, and results r(), e(), and s() can now be autocompleted as you type by pressing the Tab key.
A new %help line magic is now available; it allows you to view the help information of the specified Stata command or topic in the web browser.
You can now control whether to echo the Stata commands along with their output when executing them in the Python environment; and you can control whether to display Stata’s output simultaneously when the execution begins or to display the output after Stata finishes execution.
In the sfi module, the new class BreakError is available; it allows for interrupting Python execution by using the Break key in Stata.

The Stata–Python API specification has the following new features in the Frame class:

The new getCWF() function returns the name of the current working frame in Stata.
The new getName() function returns the name of the frame.
The new getFrames() function returns the names of all frames in Stata.

The Stata–Java API specification has the following new features in the Frame class:

The new getCWF() function returns the name of the current working frame in Stata.
The new getName() function returns the name of the frame.
The new getFrames() function returns the names of all frames in Stata.

The matrix accum and matrix vecaccum commands now support samples with one observation.

Additional updates

The Stata–Python API specification has the following new features for interacting with NumPy arrays and pandas DataFrames:

In the Data class, the fromNPArray() function loads a NumPy array into Stata's memory, and the toNPArray() function exports values from the current Stata dataset into a NumPy array.
In the Data class, the fromPDataFrame() function loads a pandas DataFrame into Stata's memory, and the toPDataFrame() function exports values from the current Stata dataset into a pandas DataFrame.
In the Frame class, the fromNPArray() function loads a NumPy array into a Stata frame, and the toNPArray() function exports values from a Stata frame into a NumPy array.
In the Frame class, the fromPDataFrame() function loads a pandas DataFrame into a Stata frame, and the toPDataFrame() function exports values from a Stata frame into a pandas DataFrame.
In the Mata class, the fromNPArray() function stores a NumPy array as a Mata matrix, and the toNPArray() function exports a Mata matrix into a NumPy array.
In the Matrix class, the fromNPArray() function stores a NumPy array as a Stata matrix, and the toNPArray() function exports a Stata matrix into a NumPy array.

Command syntax now supports numbers in minimum option abbreviations, identified by capital letters at the beginning of option names. This means you can now define an option name such as case1option with minimum abbreviation case1op by defining the option as CASE1OPtion with the syntax command.
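
The abbreviation rule can be sketched in Python. This is an illustration of the rule as described, not Stata's parser. The helper names `minimum_abbreviation` and `option_matches` are hypothetical, and the handling of specifications without leading capitals (the full name is required) is an assumption.

```python
def minimum_abbreviation(spec):
    """Split an option spec such as 'CASE1OPtion' into
    (full name, minimum abbreviation), both in lowercase."""
    count = 0
    for ch in spec:
        # Leading capitals, and digits among them, form the required prefix.
        if ch.isupper() or ch.isdigit():
            count += 1
        else:
            break
    full = spec.lower()
    minimum = full[:count] if count else full
    return full, minimum


def option_matches(spec, typed):
    """True if typed is an acceptable abbreviation of the option in spec."""
    full, minimum = minimum_abbreviation(spec)
    return len(typed) >= len(minimum) and full.startswith(typed)


print(minimum_abbreviation("CASE1OPtion"))        # ('case1option', 'case1op')
print(option_matches("CASE1OPtion", "case1opt"))  # True
print(option_matches("CASE1OPtion", "case1o"))    # False, too short
```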

Additional updates

You can now specify whether notes under a table should wrap at the table's width when exporting to SMCL and plain text by using the new collect style smcl and collect style txt commands. Specialized commands for creating tables (table, dtable, etable, and lcstats) have a new option for specifying whether notes under the table should wrap at the table width.
putdocx begin has the new compmode() option, which allows you to set the compatibility mode to be used by Word when opening the generated document. You may choose to create a document that is compatible with the current version of Word or with Word 2013, Word 2010, Word 2007, or Word 2003.

Bayesian bootstrap and replicate weights
You can use the new rwgen command and new options for the bootstrap prefix to implement specialized bootstrap schemes.

rwgen generates standard replication and Bayesian bootstrap weights. The command provides two methods: rwgen bsample generates frequency weights by resampling observations, and rwgen bayes generates Bayesian bootstrap weights using the Dirichlet distribution. These weights can be used directly with the bootstrap prefix.
bootstrap has new fweights() and iweights() options for performing bootstrap replications using custom weights. fweights() allows users to specify frequency weight variables for resampling, and iweights() lets users provide importance weight variables. These options extend bootstrap's flexibility by allowing user-supplied weights instead of internal resampling, making it easier to implement specialized bootstrap schemes and enhance reproducibility.
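
The idea behind Bayesian bootstrap weights can be sketched in Python. This is not the rwgen command and makes no claim about how Stata scales or stores its weights. The sketch assumes a flat Dirichlet(1, ..., 1) distribution, drawn by normalizing independent Gamma(1, 1) variates, and the function names are hypothetical.

```python
import random


def bayesian_bootstrap_weights(n, rng):
    """One draw of n weights from a flat Dirichlet(1, ..., 1) distribution."""
    draws = [rng.gammavariate(1.0, 1.0) for _ in range(n)]
    total = sum(draws)
    return [d / total for d in draws]


def bayesian_bootstrap_means(values, replications=1000, seed=12345):
    """Weighted means of values, one per replication of random weights."""
    rng = random.Random(seed)  # a fixed seed keeps the replications reproducible
    means = []
    for _ in range(replications):
        weights = bayesian_bootstrap_weights(len(values), rng)
        means.append(sum(w * v for w, v in zip(weights, values)))
    return means


means = bayesian_bootstrap_means([2.0, 4.0, 6.0, 8.0])
ordered = sorted(means)
# Rough 95% interval from the 2.5th and 97.5th percentiles of the draws.
print(ordered[25], ordered[974])
```

Unlike resampling with frequency weights, every observation receives a positive, continuous weight in every replication.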

The spmatrix create command for creating standard-format spatial weighting matrices is now substantially faster.

Latent class model-comparison statistics

More powerful tables with svy: tabulate

Additional updates

The svyset command and the svy prefix now allow you to specify the dofsubpop option to restrict the design degrees-of-freedom calculation to use only within-subpopulation primary sampling units.

Interval-censored multiple-event Cox model
Enhancements to survival graphs

sts graph has two new options, altrisktable and altrisktable(), that provide an alternative at-risk table beneath the plot. This table reports the number at risk, the cumulative number lost, and the cumulative number of failures for each time point on the x axis or at user-specified time points. The new table also allows customization via suboptions. Several new suboptions, including show(), row(), and grouptitle(), are available for further customization.
estat gofplot has new options for customizing goodness-of-fit plots. The new plot#opts() and byplot#opts() options control the rendition of plots. After stmgintcox, the new events() and sepevents options produce event-specific cumulative hazard functions; the new event#opts() option controls the rendition of event-specific plots; and the new graph#opts() and by#opts() options control the look of the #th graph.
stcurve has new options for customizing graphs of survivor and related functions. The new plot#opts() and atplot#opts() options control the rendition of plots. After stmgintcox, the new events() and sepevents options produce event-specific functions; the new event#opts() option controls the rendition of event-specific plots, the new graph#opts() option controls the look of the #th graph, and the new byopts() option controls how subgraphs are combined and labeled.

Financial statistics StataNow
SVAR models via instrumental variables
Instrumental-variables local-projection IRFs
The var command for fitting vector autoregressive (VAR) models now allows the vce(robust) option to estimate robust standard errors.
Cumulative structural impulse–response functions (IRFs) are now computed by irf create after ivsvar and ivlpirf. Cumulative structural IRFs sum individual IRFs and are useful for evaluating the total effect of a shock over time. You can graph these statistics with irf graph csirf and tabulate them with irf table csirf.
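
The relationship between individual and cumulative IRFs is a running sum over horizons, as this small Python sketch shows. The response values are made up for illustration and do not come from any fitted model. The function name `cumulative_irf` is hypothetical.

```python
from itertools import accumulate


def cumulative_irf(irf):
    """Running sum of impulse responses over horizons 0, 1, 2, ..."""
    return list(accumulate(irf))


# Hypothetical responses of one variable to one shock at horizons 0 to 3.
responses = [1.0, 0.5, 0.25, 0.125]
print(cumulative_irf(responses))  # [1.0, 1.5, 1.75, 1.875]
```

The last element is the total effect of the shock accumulated through the final horizon.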
