We introduced many highlights of the Stata 18 release elsewhere, but Stata 18
includes much more. Here we list all the new features in Stata 18, organized
by topic so that you can easily find your favorites, and we will continue to add more. Explore features
made available at the initial release of Stata 18, features added in
free updates, and features introduced as part of StataNow™.
The teffects aipw command for estimating treatment effects
via augmented inverse-probability weighting can now provide
estimates of the average treatment effects on the treated
and can adjust results for sampling weights.
The bayesmh command and bayes prefix now support the half-Cauchy prior. This heavy-tailed prior is useful for modeling nonnegative parameters such as variances and standard deviations that tend to have large values.
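To illustrate what "heavy-tailed" means for this prior, the following Python sketch compares the upper-tail probability of a half-Cauchy distribution with that of a half-normal distribution of the same scale. It illustrates the distribution only, not bayesmh syntax; the function names are hypothetical.

```python
import math

def half_cauchy_tail(x, scale=1.0):
    """P(X > x) for a half-Cauchy distribution with location 0 and the given scale."""
    return 1.0 - (2.0 / math.pi) * math.atan(x / scale)

def half_normal_tail(x, scale=1.0):
    """P(X > x) for a half-normal distribution with the given scale."""
    return math.erfc(x / (scale * math.sqrt(2.0)))

# Compare how much probability each distribution leaves beyond x scale units.
for x in (1, 5, 10):
    print(x, half_cauchy_tail(x), half_normal_tail(x))
```

The half-Cauchy tail shrinks slowly as x grows, so large values of a variance or standard deviation remain plausible under this prior.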
The bayesmh command and bayes prefix now support the Rayleigh prior. This prior distribution is related to \(\chi^2\) and exponential distributions and thus can be used to model nonnegative parameters with skewed distributions. It is often used in physics and engineering to model parameters—for instance, wind speed—that correspond to a norm of a bivariate standard normal random vector.
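The relationship between the Rayleigh distribution and the norm of a bivariate standard normal vector can be checked by simulation. The Python sketch below is an illustration under simple assumptions (independent components, scale 1, an arbitrary seed); it does not use bayesmh, and the function name is hypothetical.

```python
import math
import random

def rayleigh_draws(n, seed=12345):
    """Draw n values as the Euclidean norm of two independent standard normals."""
    rng = random.Random(seed)
    return [math.hypot(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(n)]

draws = rayleigh_draws(200_000)
sample_mean = sum(draws) / len(draws)
mean_of_squares = sum(r * r for r in draws) / len(draws)

# Theory for a Rayleigh distribution with scale 1:
#   the mean is sqrt(pi / 2), and the squared norm is chi-squared with
#   2 degrees of freedom, which is also exponential with mean 2.
print(sample_mean, math.sqrt(math.pi / 2))
print(mean_of_squares, 2.0)
```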
The Do-file Editor has the following new features:
Code folding enhancements. Code folding allows you to selectively
hide parts of a document so that you can focus on sections of
interest. Stata’s Do-file Editor allows you to selectively fold
blocks of code in a do-file such as programs, Mata code, Python
code, functions, and if statements by collapsing them to a single
line. You can now quickly fold all foldable blocks of code in
your do-file by using the Fold all menu item. You can then
selectively unfold your code one fold point at a time to show the
more important parts of your do-file, or you can use the Do-file
Editor’s Unfold all menu item to unfold every fold point. You can
also select lines of code and transform them into a foldable block
of code by using the Fold selection menu item. This can tidy up
your code and increase the code’s readability. And finally, there
is a new setting for the Do-file Editor that will automatically
fold every foldable block of code of a do-file when it is opened.
Autocomplete variable names. Stata’s Do-file Editor now includes
the ability to autocomplete variable names from data in memory.
If you pause briefly as you type, the Do-file Editor will suggest
a list of commands, variable names of data in memory, and words
that are already in the do-file. Once the suggestions appear,
more typing will narrow down the possibilities. You can navigate
the suggestions by using the up- and down-arrow keys or keep
typing to narrow them to a single word. Once you have the word
you like, you can press Return to place the word in your do-file.
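The narrowing behavior described above can be pictured as prefix filtering over a pooled word list. The Python sketch below is only a conceptual illustration, not the Do-file Editor's actual implementation; the function and the sample word lists are hypothetical.

```python
def suggest(prefix, commands, variables, dofile_words):
    """Return sorted, de-duplicated candidates that start with the typed prefix."""
    if not prefix:
        return []
    pool = set(commands) | set(variables) | set(dofile_words)
    return sorted(word for word in pool if word.startswith(prefix))

commands = ["regress", "reshape", "replace"]
variables = ["price", "rep78", "mpg"]   # stand-in for variables of the data in memory
dofile_words = ["results", "mpg"]       # stand-in for words already in the do-file

print(suggest("re", commands, variables, dofile_words))   # several candidates
print(suggest("reg", commands, variables, dofile_words))  # more typing narrows the list
```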
The colorvar() option is now available with additional
two-way plots: line, connected, tsline, rline, rconnected, and tsrline.
This means you can vary the color of lines, markers, and more in these
plots based on the values of a specified variable.
The PyStata features for integrating Python into Stata and integrating Stata into Python have the following improvements:
When running Stata code within an IPython kernel-based environment,
such as Jupyter Notebook and console as well as Jupyter Lab and
console, and within other environments that support the IPython
kernel, such as Spyder IDE and PyCharm IDE, Stata’s variable names
in the current working dataset, macro names, and results r(), e(),
and s() can now be autocompleted as you type by pressing the Tab key.
A new %help line magic is now available; it allows you to view the help information of the specified Stata command or topic in the web browser.
You can now control whether to echo the Stata commands along with
their output when executing them in the Python environment; and you
can control whether to display Stata’s output simultaneously when
the execution begins or to display the output after Stata finishes
execution.
In the sfi module, new class BreakError is available; it allows
for interrupting Python execution by using the Break key in Stata.
After fitting a difference-in-differences model with didregress or xtdidregress to data comprising multiple cohorts that are treated at different times, you can use the new estat bdecomp command to decompose the average treatment effect on the treated (ATET) into components. The results are useful in determining whether the treatment effects are heterogeneous and, if so, how much the heterogeneity impacts the overall ATET reported by didregress or xtdidregress.
When you estimate treatment effects with telasso and use a Poisson model for your outcome, you can now specify an exposure variable that reflects the amount of time over which events were observed.
You can now use the eform option to request that exponentiated coefficients be reported when fitting a model with eregress or xteregress. Exponentiated coefficients are useful, for example, when interpreting results of a linear model fit to a log-transformed outcome.
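As a reminder of why exponentiated coefficients help with a log-transformed outcome, the Python sketch below converts a coefficient on the log scale into a multiplicative effect and a percent change. The coefficient value is hypothetical and the function name is an assumption made for illustration; this is arithmetic only, not Stata output.

```python
import math

def percent_change(coef):
    """Percent change in the outcome per one-unit increase in a covariate,
    given a coefficient from a linear model fit to log(outcome)."""
    return (math.exp(coef) - 1.0) * 100.0

b = 0.05  # hypothetical coefficient on the log scale
print(math.exp(b))        # multiplicative effect on the outcome
print(percent_change(b))  # the same effect expressed as a percentage
```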
reshape is now faster—often 10 times, 100 times, or even more times faster. With the new favor(speed) option of reshape, you can reshape your data from long form to wide form or from wide form to long form in a manner that is faster but may require more memory than when you use the default favor(memory) option.
Alternatively, you may type set reshape_favor speed to specify that the default method for reshaping favors speed.
The new import haverdirect command allows you to import economic and financial data from Haver Analytics databases with the use of Haver's Cloud platform.
The import haver command has the following improvements:
import haver can now import 7-daily series from Haver Analytics databases.
import haver has a new frame() option that allows series information to be saved to a specified data frame.
The jdbc connect and jdbc add commands now support usernames and passwords greater than 5,000 characters.
Many Stata estimation commands support the vce(robust) option for estimating robust standard errors and the vce(cluster clustvar) option for estimating cluster–robust standard errors. These options are now supported by two additional commands:
The sureg command, which fits seemingly unrelated regression models.
The reg3 command, which fits systems of simultaneous equations via three-stage least squares.
Exact p-values are now available for Spearman's rank correlation coefficients. The spearman command now supports the exact() option to compute the exact p-value using a Monte Carlo sampling of the permutation distribution or using a complete enumeration of the permutation distribution.
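The idea of a complete enumeration of the permutation distribution can be sketched in a few lines of Python. This is a minimal illustration that assumes no ties and a two-sided test; it is not the algorithm used by the spearman command, and all function names are hypothetical.

```python
from fractions import Fraction
from itertools import permutations

def ranks(values):
    """Ranks 1..n for values without ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        r[i] = rank
    return r

def spearman_rho(rx, ry):
    """Spearman's rho from two rank vectors without ties, as an exact fraction."""
    n = len(rx)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - Fraction(6 * d2, n * (n * n - 1))

def exact_p_value(x, y):
    """Two-sided exact p-value by enumerating all n! permutations of the y ranks."""
    rx, ry = ranks(x), ranks(y)
    observed = abs(spearman_rho(rx, ry))
    perms = list(permutations(ry))
    extreme = sum(1 for p in perms if abs(spearman_rho(rx, p)) >= observed)
    return Fraction(extreme, len(perms))

x = [1.2, 2.5, 3.1, 4.8, 5.0]
y = [2.0, 1.5, 3.9, 4.1, 6.3]
print(spearman_rho(ranks(x), ranks(y)), exact_p_value(x, y))
```

Because the number of permutations grows as n!, enumeration is practical only for small samples, which is why a Monte Carlo sample of permutations is the usual alternative for larger ones.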
Stata's factor-variable notation allows users to specify categorical variables and interactions in variable lists in many commands. This notation is now supported by additional commands:
The exlogistic command, which fits exact logistic regression models.
The expoisson command, which fits exact Poisson regression models.
After fitting simultaneous-quantile regression models with sqreg, you can use the new estat coefplot command to plot the coefficients and their confidence intervals across quantiles.
The nlcom command, which computes nonlinear combinations of parameters, now supports the eform[()] option to report exponentiated nonlinear parameters.
The table command now computes two additional statistics: the geometric mean and the geometric standard deviation, which are specified using statistic(geomean) and statistic(geosd), respectively. In addition, strL variables may now be used to define the rows, columns, and separate tables.
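For readers unfamiliar with these statistics, the Python sketch below computes them from their usual definitions: the exponential of the mean of the logs and the exponential of the standard deviation of the logs. The use of the sample (n - 1) standard deviation is an assumption here, and the function names are hypothetical; consult the table documentation for the exact definitions Stata uses.

```python
import math
import statistics

def geometric_mean(values):
    """exp of the arithmetic mean of log(values); values must be positive."""
    logs = [math.log(v) for v in values]
    return math.exp(statistics.fmean(logs))

def geometric_sd(values):
    """exp of the standard deviation of log(values).
    Uses the sample SD (n - 1 denominator), which is an assumption."""
    logs = [math.log(v) for v in values]
    return math.exp(statistics.stdev(logs))

data = [1.0, 10.0, 100.0]
print(geometric_mean(data), geometric_sd(data))
```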
When computing numerical derivatives with ml model, you can now specify a minimum step size, which is helpful for avoiding unstable results that can arise during iterative maximization when the computed step size is too small.
When you create by-graphs, you can now specify the altleg bystyle to move the legend to the six o'clock position and use two columns for the legend. For instance, specifying the by(group, style(altleg)) option will create a graph for each value of group and will place the two-column legend at the bottom of these graphs.
You can now specify a minimum length for the axis labels with the labelminlen(#) option. This can be particularly helpful if you are creating multiple graphs that you intend to combine by using graph combine. You can, for instance, specify ylabel(labelminlen(5)) with each graph to ensure that at least five characters are used for the y-axis labels; labels are padded with spaces on the left if needed. Adding this option to each graph specification will allow you to create graphs with labels of the same width so that axes will align nicely when combined.
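The padding rule described above can be pictured with a short Python sketch. It only mimics the stated behavior (left-padding with spaces up to a minimum length) and is not how Stata renders labels; the function name is hypothetical.

```python
def pad_label(label, min_len):
    """Left-pad a label with spaces so that it has at least min_len characters."""
    return label.rjust(min_len)

labels_a = ["0", "50", "100"]       # y-axis labels of a first graph
labels_b = ["0", "5000", "10000"]   # y-axis labels of a second graph

# With a common minimum length, both sets of labels occupy the same width.
for labels in (labels_a, labels_b):
    print([pad_label(s, 5) for s in labels])
```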
When exporting a graph to a PDF, you can now customize the font, width, height, and other aspects of the PDF.
We introduced an all-new graph style with a new color palette, horizontal y-axis labels, legend on the right side of the graph, and more. These new features are available in the st family of schemes.
For those who prefer a monochrome graph, we now offer two additional schemes: stmono1 and stmono2. These schemes follow the look of the labels, legends, and other features of the st schemes but with monochromatic markers and fill colors.
For those publishing articles in the Stata Journal, the new stsj scheme reflects the style of the st family with horizontal y-axis labels and a white background. This is now the official scheme for the Stata Journal.
graph twoway now allows you to specify aspect ratios based on common units in the x and y dimensions. Adding the aspectratio(1, units) option creates a graph where one unit in the x dimension and one unit in the y dimension take up the same distance on the plot. This is useful for plotting things such as latitude against longitude, which have common scaling of their units.
You can also specify arbitrary relative scales—typing aspectratio(100, units) specifies that each unit in the y dimension is 100 times the length of a unit in the x dimension.
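The arithmetic behind unit-based aspect ratios is simple enough to sketch. The Python function below computes the height-to-width ratio implied for the plot region when one y unit is drawn k times as long as one x unit; it is an interpretation of the description above rather than Stata's internal computation, and the function name and example ranges are hypothetical.

```python
def plot_region_aspect(x_range, y_range, k=1.0):
    """Height-to-width ratio of the plot region when one y unit is drawn
    k times as long as one x unit."""
    return k * y_range / x_range

# Longitude spanning 60 degrees and latitude spanning 25 degrees, equal units:
print(plot_region_aspect(60.0, 25.0))

# x spans 100 units, y spans 1 unit, and each y unit is 100 times as long:
print(plot_region_aspect(100.0, 1.0, k=100.0))
```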
You can now use OpenType fonts (.otf) in graphs when you use graph export to export your graph to SVG, PDF, PNG, JPEG, EMF, TIF, or GIF.
The Do-file Editor now allows automatic backups
and syntax highlighting of user-defined keywords.
The Data Editor now has pinnable rows and columns,
resizable cell editors, tooltips for truncated text,
the ability to show variable labels in the column
headers, and proportional-width fonts.
filter tables of a DSN or filter columns of a table, and
select which columns of a table to load into Stata.
When you run multiple instances of Stata in Windows, the Stata instance number will now appear on the following top-level windows: Do-File Editor, Data Editor, Variables Manager, SEM Builder, Graph windows, and Viewer windows.
In Windows, the new set taskbargroups setting affects how Stata windows are grouped on the taskbar. If taskbar grouping is enabled, different instances of Stata will be grouped separately on the taskbar. This setting is enabled by default.
Stata for Mac now prompts you for your preferred window layout when it is first launched. You can choose the Sidebar layout, which may be preferable for small laptop displays, or the Widescreen layout, which may be preferable for desktop monitors or large laptop displays.
The Do-file Editor now extends syntax highlighting of comments to lines joined by ///.
The Data Editor now supports left- or right-aligning text based on
the justification from a variable's format. The setting can be
enabled from the Data Editor's preferences.
The doedit command, which opens the Do-file Editor, can now open URLs.
You can now use the eform option to request that exponentiated coefficients be reported when fitting a model with cnsreg, eivreg, eregress, hetregress, intreg, rreg, tobit, truncreg, or xteregress. Exponentiated coefficients are useful, for example, when interpreting results of a linear model fit to a log-transformed outcome.
When computing numerical derivatives with Mata functions deriv(), moptimize(), optimize(), and solvenl(), you can now specify a minimum step size, which is helpful for avoiding unstable results that can arise during iterative maximization when the computed step size is too small.
You can now use the complex step method when you compute numerical derivatives using Mata's deriv() function.
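For context, the complex step method approximates a derivative as the imaginary part of f(x + ih) divided by h, which avoids the subtractive cancellation that limits finite differences. The Python sketch below illustrates the technique in general; it is not Mata code, it does not reflect the deriv() interface, and the test function is an arbitrary choice.

```python
import cmath
import math

def complex_step_derivative(f, x, h=1e-20):
    """Approximate f'(x) as Im(f(x + i*h)) / h for a real-analytic f."""
    return f(complex(x, h)).imag / h

def f(z):
    # Arbitrary smooth test function, written with cmath so that it
    # accepts complex arguments.
    return cmath.exp(z) * cmath.sin(z)

x0 = 1.0
approx = complex_step_derivative(f, x0)
exact = math.exp(x0) * (math.sin(x0) + math.cos(x0))
print(approx, exact)
```

Because no difference of nearly equal numbers is taken, the step h can be made extremely small without loss of precision.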
After meta regress, you can now use predict with the reses() option, which is specified with the reffects option, to compute comparative standard errors for the random effects. Diagnostic standard errors can be obtained by adding the diagnostic suboption.
After meta mvregress, you can now use predict with the reses() option to compute comparative standard errors for the random effects by default. Diagnostic standard errors can be obtained by adding the diagnostic suboption.
Stata's factor-variable notation allows users to specify categorical variables and interactions in variable lists in many commands. This notation is now supported by additional commands:
The discrim knn command, which performs kth-nearest-neighbor discriminant analysis.
The discrim logistic command, which performs logistic discriminant analysis.
You can now use the eform option to request that exponentiated coefficients be reported when fitting a model with xtreg or xteregress. Exponentiated coefficients are useful, for example, when interpreting results of a linear model fit to a log-transformed outcome.
The ValueLabel class can now work with Stata's extended missing values using the following methods:
ValueLabel.getLabel(name, value) allows value to be .a, .b, ..., .z in addition to an integer value so that it can return the labels associated with Stata's missing values.
ValueLabel.getValueLabels(name) returns Stata's missing label as a key if the value label contains a missing value associated with a label. Previously, the key was returned as an integer missing value.
ValueLabel.setLabelValue(name, value, label) allows value to be .a, .b, ..., .z in addition to an integer value so that it can set labels for missing values.
ValueLabel.getValues(name) returns Stata's missing label in the result if the value label contains a missing value associated with a label. Previously, the value was returned as an integer missing value.
ValueLabel.removeLabelValue(name, value) allows value to be .a, .b, ..., .z in addition to an integer value so that it can remove labels for missing values.
Missing.getValue(val=None) allows users to input None or ., .a, ..., .z to access Stata's missing values. Previously, val could be None or a, b, ..., z.
Missing.getMissing(value) gets the missing symbol associated with value that represents the corresponding missing value in Stata.
The Data class has a new function, isAlias(var), that returns whether a variable in the current dataset is an alias for a variable in another frame.
The Frame class has a new function, isAlias(var), that returns whether a variable in the current dataset is an alias for a variable in another frame.
The ValueLabel class can now work with Stata's extended missing values using the following methods:
ValueLabel.getLabel(java.lang.String, double) gets the label for a specified value-label value.
ValueLabel.getValueLabels(String name, Map<LabelValue,String> map) gets the value and label pairings for a specified value-label name.
ValueLabel.removeLabelValue(String name, Missing.Extended missingValue) removes a value-label value from the specified value-label name.
ValueLabel.setLabelValue(String name, Missing.Extended missingValue, String label) sets a value and label for a value-label name.
The LabelValue class was added to encapsulate a Stata value-label value.
The Data class has a new function, isAlias(int var), that returns whether a variable in the current dataset is an alias for a variable in another frame.
The Frame class has a new function, isAlias(int var), that returns whether a variable in the current dataset is an alias for a variable in another frame.
The matlist command has two new options for customizing the display of matrices. The rightindent option indents data by one space relative to the end of row lines. The rowtitleleft option displays row titles flush left.
Programmers working with alias variables in frames can take advantage of four new macro functions:
isalias varname returns 1 for an alias variable and 0 otherwise.
aliasframe varname returns the name of a frame that varname is linked to.
aliaslinkname varname returns the name of the linking variable that was used to create varname.
aliasvarname varname returns the name of the variable that varname is linked to.
When working with h2o, subcommands of _h2oframe no longer require a
leading underscore. For example, _h2oframe put can be used instead
of _h2oframe _put.
h2o init will now generate credentials
and create new H2O instances using those credentials by default.
The new h2o credentials query and h2o credentials clear commands
are available for retrieving and managing credentials that Stata
automatically creates using h2o init.
The _h2oframe command for interacting with H2O has the following new subcommands:
_h2oframe factor converts columns in the H2O frame to categorical.
_h2oframe levelsof lists all levels of a categorical column.
_h2oframe baselevel sets the base level of a categorical column.
_h2oframe recodelevel assigns new levels of a categorical column.
putdocx and putpdf can now create documents including up to 10,000 tables. You can specify the maximum number of tables allowed by using the new set docx_maxtable and set pdf_maxtable commands.
dtable now allows you to test for differences in continuous variables across groups using the version of the Kruskal–Wallis rank test that adjusts for ties.
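The tie adjustment can be illustrated with the textbook form of the statistic: the usual H is divided by 1 minus the sum of (t³ - t) over tied groups, divided by (N³ - N). The Python sketch below follows that formula; it is an illustration under that assumption, not dtable's implementation, and the function names are hypothetical.

```python
from collections import Counter

def midranks(values):
    """Average ranks (1-based), giving tied values the mean of their ranks."""
    sorted_vals = sorted(values)
    counts = Counter(sorted_vals)
    first = {}
    for pos, v in enumerate(sorted_vals, start=1):
        first.setdefault(v, pos)  # first position at which each value appears
    return [first[v] + (counts[v] - 1) / 2 for v in values]

def kruskal_wallis_adjusted(groups):
    """Kruskal-Wallis H statistic with the standard correction for ties.
    Assumes at least two distinct values in the pooled data."""
    pooled = [v for g in groups for v in g]
    n = len(pooled)
    r = midranks(pooled)
    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(r[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12.0 / (n * (n + 1)) * total - 3 * (n + 1)
    ties = sum(t ** 3 - t for t in Counter(pooled).values())
    return h / (1 - ties / (n ** 3 - n))

print(kruskal_wallis_adjusted([[1, 2, 2], [3, 4, 4]]))
```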
When creating tables with collect, you can now more easily hide the title of a factor variable when you are using the table style that stacks row header elements in one column.
Tables produced by collect now automate appropriate labeling of t and z statistics in more situations. For instance, when you create a table of linear regression results from the regress command, the test statistics are labeled as “t”. However, when you request bootstrap standard errors, the statistics are now automatically labeled as “z”.
table now allows you to specify the statistics you would like to report using abbreviations. For instance, you can specify freq rather than frequency when requesting that frequencies be reported.
You can now use OpenType fonts (.otf) when creating PDF documents with putpdf.
You now have more control over formats in tables created by dtable and table. With the new basestyle suboption within the nformat() option, you can change the format for results that do not already have one without overriding the format for all results.
The stcurve command plots the survivor, failure, hazard, or cumulative hazard function after fitting many models for survival-time data. In Stata 18, stcurve has the following new features:
After fitting a shared-frailty Cox model with stcox, you can now specify the expression _frailty = (numlist) in the at() option to adjust estimates of survivor and related functions for frailties set to the values in numlist.
After fitting a Cox model for a multiple-record-per-subject interval-censored dataset using stintcox, you can specify the new atmeans option to evaluate the survivor or other function at time-specific means of the covariate.
After fitting a Cox model for a multiple-record-per-subject interval-censored dataset using stintcox, you can specify the new atframe(frname) option to evaluate the survivor or other function at the values of the variables specified in the frname frame.
After lasso cox or elasticnet cox, you can calculate predictions based on penalized coefficients by default, or you can calculate predictions based on postselection coefficients by specifying the postselection option.
After fitting a shared-frailty Cox model with stcox, predict now allows the atfrailty and atfrailty(varname|#) options when you predict the baseline survivor function, baseline cumulative-hazard function, or baseline hazard contributions. If you specify atfrailty, frailties are set to their estimated values when computing predictions. If you specify atfrailty(varname|#), frailties are instead set to the values in varname or #.
The stintcox command, which fits Cox proportional hazards models for interval-censored data, now supports the vce(robust) option for estimating robust standard errors and the vce(cluster clustvar) option for estimating cluster–robust standard errors.
The var command for fitting vector autoregressive (VAR) models now allows the vce(robust) option to estimate robust standard errors.
Here we have told you about many of the new features in Stata 18. Yet there is still more. See What's new for a complete chronological list of updates.
Download the updates
Make sure you have access to the new features released in updates. In Stata
18, type
. update all
in the Command window. Then type
. help whatsnew
to see a list of all the new features added since Stata 18 was released.
If you have StataNow, the above steps will also give you access to all StataNow features.
Don't have Stata 18? Upgrade today to access these new features.