Stata 15 help for whatsnew6to7

What's new in release 7 (compared with release 6)

This help file lists the changes corresponding to the creation of Stata release 7:

    +---------------------------------------------------------------+
    | help file        contents                     years           |
    |---------------------------------------------------------------|
    | whatsnew         Stata 15.0 and 15.1          2017 to present |
    | whatsnew14to15   Stata 15.0 new release       2017            |
    | whatsnew14       Stata 14.0, 14.1, and 14.2   2015 to 2017    |
    | whatsnew13to14   Stata 14.0 new release       2015            |
    | whatsnew13       Stata 13.0 and 13.1          2013 to 2015    |
    | whatsnew12to13   Stata 13.0 new release       2013            |
    | whatsnew12       Stata 12.0 and 12.1          2011 to 2013    |
    | whatsnew11to12   Stata 12.0 new release       2011            |
    | whatsnew11       Stata 11.0, 11.1, and 11.2   2009 to 2011    |
    | whatsnew10to11   Stata 11.0 new release       2009            |
    | whatsnew10       Stata 10.0 and 10.1          2007 to 2009    |
    | whatsnew9to10    Stata 10.0 new release       2007            |
    | whatsnew9        Stata 9.0, 9.1, and 9.2      2005 to 2007    |
    | whatsnew8to9     Stata 9.0 new release        2005            |
    | whatsnew8        Stata 8.0, 8.1, and 8.2      2003 to 2005    |
    | whatsnew7to8     Stata 8.0 new release        2003            |
    | whatsnew7        Stata 7.0                    2001 to 2002    |
    | this file        Stata 7.0 new release        2000            |
    | whatsnew6        Stata 6.0                    1999 to 2000    |
    +---------------------------------------------------------------+

Most recent changes are listed first.

--- more recent updates -------------------------------------------------------

See whatsnew7.

--- Stata 7 release 15dec2000 -------------------------------------------------

The features added to Stata 7 are listed under the following headings.

    Changes you cannot help but notice
        Long (32-character) names
        New varlist abbreviation rules
        Windowed Stata now across all platforms
        Improved output, more clickability
        Improvements to by
        Sort stability
        European decimal format
        Faster

    Statistics
        Estimation commands (exclusive of st and xt)
        Cross-sectional time-series analysis (xt)
        Survival analysis (st)
        Commands for epidemiologists
        Marginal effects
        Cluster analysis
        Pharmacokinetics
        Other statistical commands
        Distribution functions

    Nonstatistical improvements
        Graphics
        New commands
        New string functions
        Other new functions


Changes you cannot help but notice

Long (32-character) names

Stata now allows names to be up to 32 characters long. That includes variable names, label names, macro names, and any other name you can think of. This includes program names, and we have renamed a few existing Stata programs:

    Prior name    New name
    ------------------------
    llogist       llogistic
    xthaus        xthausman
    spikeplt      spikeplot
    stcurv        stcurve
    svyintrg      svyintreg
    svyprobt      svyprobit
    svymlog       svymlogit
    svyolog       svyologit
    svyoprob      svyoprobit

The old names continue to work.

In any case, you no longer have to name your variable f_inc1999; you can name it farm_inc_1999 or farm_income_1999 or even farm_income_in_fiscal_year_1999. Where possible, we have adjusted Stata output to allow 12 spaces for displaying names. When names are longer than that, Stata abbreviates them and shows, for instance, farm_in~1999. ~ is the new Stata abbreviation character, which Stata not only uses in output but which you can also use in input (which is to say, in varlists; see help varlist). If you type farm_in~1999, f~1999, or f~in~1999, Stata will understand that you mean farm_income_in_fiscal_year_1999. Thus, if in output Stata presents dose~d1~42, that name is unique and you can type it and Stata will understand it.

describe now has two new options, fullname and numbers. fullname shows the full, 32-character names, instead of shorter ~-abbreviations, and numbers shows the variable number.
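The following is a minimal sketch of long names and ~-abbreviations in use. The one-observation dataset is made up purely for illustration; only the variable name comes from the discussion above.

```stata
* Hypothetical one-observation dataset, created only for illustration
clear
set obs 1
generate farm_income_in_fiscal_year_1999 = 1

* Default output abbreviates the long name with ~
describe

* Show the full name instead of the ~-abbreviation
describe, fullname

* Show the variable number alongside the name
describe, numbers

* A ~-abbreviation may be typed wherever a varlist is allowed
summarize f~1999
```

The two describe options are shown on separate commands here; whether they may be combined is not stated above.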

New varlist abbreviation rules

Varlists now understand * when used as other than a suffix. You can still type pop*, but you can also type pop*99 or pop*30_40*1999 or even *1999. * means "zero or more characters go here". Also understood is the new ~ abbreviation character mentioned above. * and ~ really mean the same thing and work the same way, except ~ adds the claim "and only one variable matches this pattern", whereas * means "give me all the variables that match this pattern".

The other new abbreviation character is ?, which means "one character goes here", so result?10 might match resultb10 and resultc10, but would not match resultb110.
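As a sketch of how the three abbreviation characters differ, the commands below build a few variables whose names are invented for the purpose and then refer to them with *, ?, and ~.

```stata
* Hypothetical variables, created only so there are names to match
clear
set obs 1
generate pop1990 = 1
generate pop1999 = 2
generate popurban1999 = 3
generate resultb10 = 4
generate resultc10 = 5
generate resultb110 = 6

* "*" means zero or more characters: matches pop1999 and popurban1999
describe pop*99

* "?" means exactly one character: matches resultb10 and resultc10,
* but not resultb110
describe result?10

* "~" works like "*" but requires a single match: popurban1999
describe popu~1999
```

Typing pop~1999 instead would be ambiguous in this dataset, because both pop1999 and popurban1999 fit the pattern.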

Windowed Stata now across all platforms

Users of Stata for Unix now have the same windowed interface that users of Stata for Windows and Stata for Mac have. Typing xstata brings up the new windowed version; typing stata brings up the old line-by-line console version. The console version is still useful in batch situations, but Stata(console), as it is now called, can no longer render graphs.

Improved output, more clickability

Stata's output looks better thanks to the new output language called SMCL, which stands for Stata Markup and Control Language. Moreover, all Stata output, whether it be help files in the help window (now called the Viewer), help files in the Results window, or statistical output, is SMCL, meaning all features are available in all contexts. One implication is that if something is clickable, it is clickable regardless of the window in which it is displayed, so you can start by typing help anova and click on links just as you could had you pulled down Help and gone about displaying the help in the help window (Viewer).

Clickability is not limited to help files. You can write programs that display in their output clickable links. The corresponding action can even be the execution of another Stata command or program!

The help window is now called the Viewer because it serves more purposes than solely displaying help files. The Viewer, for instance, is where you look at logs you have previously created or are creating. That's because, by default, Stata logs are now SMCL files and the default file extension for log files is .smcl to remind you of that. When you type `log using myfile', myfile.smcl is created. The file is ASCII, so you can look at it (and even edit it) in your editor or word processor, but it is not a pretty sight.

Formatted, however, it is pretty. The Viewer can print the SMCL logs Stata now creates, and the new translate command can translate the SMCL file to PostScript format, or even standard ASCII text format, so you can get back to just where you were in Stata 6; see help translate. Moreover, you can directly create old-style ASCII text logs if that is your preference; just type `log using myfile.log' or `log using myfile, text'; see help log.

The Viewer can be accessed by pulling down File, or you can use the new view command, which provides some additional features; see help view.

Programmers will want to see help smcl for a complete description of SMCL. You can use SMCL in your ado-files.

There is one other log change: you can now create command logs (ASCII text logs containing only what you type, which used to be called noproc logs) using the new cmdlog command. Even better, you can create command logs and full session logs simultaneously; see help log.
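The logging changes described above might be exercised as follows. This is only a sketch; the file names are placeholders, and the replace options are added so that the commands can be rerun.

```stata
* Default: a SMCL session log, written to myfile.smcl
log using myfile, replace
display "this line goes into the SMCL log"
log close

* Translate the SMCL log to a plain ASCII text log
translate myfile.smcl myfile.log, replace

* Create an old-style text log directly (myfile2.log), and a command
* log (mycmds.txt) at the same time
log using myfile2, text replace
cmdlog using mycmds, replace
display 2 + 2
cmdlog close
log close
```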

Stata(console) for Unix users: All the above applies to you, too, except that you cannot click. Stata(console) does not have a view command, but type can display .smcl files, and translate can translate them. See help conren for instructions on how to make SMCL output look as good as possible on your line-by-line console.

Improvements to by

by varlist: now has a sort option. You can type, for instance, `by foreign, sort: summarize mpg' or, equivalently, `bysort foreign: summarize mpg', rather than first sorting the data and then typing the by command; see help by.

by has a new parenthesis notation: `by id (time): ...' means to perform ... by id, but first verify that the data are sorted by id and time. `by id (time), sort: ...' says to sort the data by id and time and then perform ... by id.

There is also a new rc0 option, which says to keep on going even if one of the by-groups results in an error.
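A small sketch of the sort option and the parenthesis notation, using a made-up four-observation panel entered out of order:

```stata
* Hypothetical panel: two subjects, two visits each, entered unsorted
clear
input id time bp
1 2 120
1 1 .
2 1 130
2 2 .
end

* Sort by id and time, then, within id, fill a missing bp with the
* previous visit's value
by id (time), sort: replace bp = bp[_n-1] if bp==.

* The data are now sorted by id and time, so the parenthesis form
* without the sort option passes its check
by id (time): generate visit = _n

list
```

Only the second visit of subject 2 has a previous value to borrow; the first visit of subject 1 stays missing because there is nothing before it within the group.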

More importantly, by varlist: is now allowed with virtually every Stata command, including commands implemented as ado-files, such as egen. We have been claiming for some time that whether a command is built-in or implemented as an ado-file is irrelevant: it has the same features. Now the claim is true. Programmers: see help byprog for instructions on how to make your programs and ado-files allow the by prefix; it is easy.

The commands generate, replace, drop, keep, and assert no longer present the detailed, group-by-group report when prefixed with by, meaning you no longer need to prefix them with quietly:

    . by id: replace bp = bp[_n-1] if bp==.
    (120 changes made)

Sort stability

Commands that report results of calculations (commands not intended to change the data) no longer change the sort order of the data. If you type `sort id time', you can be assured that your dataset will stay sorted by id and time. This is true even if the command is implemented as an ado-file.

Programmers: see [P] sortpreserve for instructions on making your old programs and ado-files sort stable. It is easy, and the performance penalty is barely measurable.
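As a rough sketch of what a sort-stable program looks like, the hypothetical program below sorts the data internally but declares sortpreserve so that the caller's sort order is restored on exit. The program name and the toy data are invented for the illustration.

```stata
* showlargest is a hypothetical program name used only in this sketch
capture program drop showlargest
program define showlargest, sortpreserve
        version 7
        syntax varname
        * Sorting is safe here: sortpreserve restores the caller's order
        sort `varlist'
        display "largest value (missing values sort last): " `varlist'[_N]
end

* Toy data in id order, with x running in the opposite direction
clear
set obs 5
generate id = _n
generate x = 6 - _n
sort id

showlargest x

* The data are still in id order
list
```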

European decimal format

Stata now understands output formats such as %9,2f as well as %9.2f. In %9,2f, the number 500.5 is displayed as 500,50. In %9,2fc format, the number 1,000.5 is displayed as 1.000,50.

Even better, you can now set dp comma to modify all of Stata's output to use the European format, including all statistical output. See help format.
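The formats above can be tried directly with display. This is a minimal sketch; the expected results noted in the comments are the ones stated in the text.

```stata
* Comma as the decimal separator, chosen by the format itself
display %9,2f 500.5
* (the text above gives 500,50)

display %9,2fc 1000.5
* (the text above gives 1.000,50)

* Switch all output to the European convention, then back again
set dp comma
display %9.2f 500.5
set dp period
```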


Faster

Stata 7 has more features, but continuing our long tradition, it is also faster; ado-files execute between 8.8 and 11.8 percent faster. Some programs, we have observed, execute 13 percent faster.



Statistics

Estimation commands (exclusive of st and xt)

First, all maximum-likelihood estimation commands of Stata now allow linear constraints; each has a new constraint() option. See the particular estimator.

boxcox has been rewritten. It now produces maximum likelihood estimates of the coefficients and the Box--Cox transform parameter(s). Box--Cox models may be estimated in various forms, with the transform on the left, on the right, or on both sides. See help boxcox.

glm has also been rewritten. It continues to estimate the generalized linear model, but now offers an expanded choice of link functions and also allows user-specified link and variance functions. glm will now report maximum-likelihood based estimates of standard errors, IRLS based estimates, and many others. See help glm.

nlogit estimates nested logit models. In a nested logit model, multiple outcomes are grouped into a nested tree structure, and nested logit has the advantage over multinomial and conditional logistic models of allowing you to parameterize away the assumption of independence of the irrelevant alternatives (IIA). See help nlogit.

treatreg estimates the treatment effects model using either a two-step estimator or a full maximum-likelihood estimator. The treatment effects model considers the effect of an endogenously chosen binary treatment on another endogenous continuous variable, conditional on two sets of independent variables. See help treatreg.

truncreg estimates truncated regression models. Truncated regression refers to regressions estimated on samples drawn based on the dependent variable, and therefore for which (sometimes) neither the dependent nor independent variables are observed (as opposed to tobit, which estimates regression models when the independent variables are observed in all cases). See help truncreg.

Cross-sectional time-series analysis (xt)

xtabond produces the Arellano--Bond one-step, one-step robust, and two-step estimators for dynamic panel-data models, models in which there are lagged dependent variables. xtabond can be used with exogenously unbalanced panels and, uniquely, handles embedded gaps in the time series as well as opening and closing gaps. xtabond allows for predetermined covariates. xtabond allows you to use either the full instrument matrix or a pared down version. xtabond reports both the Sargan and autocorrelation tests derived by Arellano and Bond. See help xtabond.

xtregar estimates cross-sectional time-series models in which epsilon_it is assumed to follow an AR(1) process. xtregar reports the within estimator and a GLS random-effects estimator. xtregar can handle unequally spaced observations and exogenously unbalanced panels. xtregar uniquely reports the modified Bhargava et al. Durbin--Watson statistic and the Baltagi--Wu locally best invariant test statistic for autocorrelation. See help xtregar.

xtivreg estimates cross-sectional time-series regressions with (generalized) instrumental variables, or, said differently, estimates two-stage least squares time-series cross-sectional models. xtivreg can estimate such models using the between-2SLS estimator, the within-2SLS estimator, the first-differenced 2SLS estimator, the Balestra--Varadharajan--Krishnakumar G2SLS estimator, or the Baltagi EC2SLS estimator. All the estimators allow use of balanced or (exogenously) unbalanced panels. See help xtivreg.

xtpcse produces panel-corrected standard errors (PCSE) for linear cross-sectional time-series models where the parameters are estimated by OLS or Prais--Winsten regression. When computing the standard errors and the variance--covariance estimates, the disturbances are, by default, assumed to be heteroskedastic and contemporaneously correlated across panels. See help xtpcse.

Survival analysis (st)

stcox will now estimate proportional hazard models with continuously time-varying covariates, and you do not need to modify your data to obtain the estimates. See the tvc() and texp() options in help stcox.

streg can now estimate parametric survival models with individual-level frailty (unobserved heterogeneity). Two forms of the frailty distribution are allowed: gamma and inverse gaussian. Frailty is allowed with all the parametric distributions currently available. See help streg. (New commands weibullhet, ereghet, etc., allow users to estimate these models outside of the st system; see help weibull.)

streg has also been modified to allow estimation of stratified models, meaning that the distributional parameters (the ancillary parameters and intercept) are allowed to differ across strata. See the strata() option in help streg.

streg has also been modified to allow you to specify any linear-in-the-parameters equation for any of the distributional parameters, which allows you to create various forms of stratification, as well as allowing distributional parameters to be linear functions of other covariates. See the ancillary() option in help streg.

stptime calculates person-time (person-years) and incidence rates and implements computation of the standardized mortality/morbidity ratios (SMR). See help stptime.

sts test has been modified to include additional tests for comparing survivor distributions, including the Tarone--Ware test, the Fleming--Harrington test, and the Peto--Peto--Prentice test. Also new is a test for trend. See help sts.

stci calculates and reports the level and confidence intervals of the survivor function, as well as computing and reporting the mean survival time and confidence interval. See help stci.

stsplit is now much faster and now allows for splitting on failure times, as well as providing some additional convenience options. See help stsplit, but remember that stcox can now estimate with continuous time-varying covariates without you having to stsplit the data beforehand.

stcurve has a new outfile option. See help streg.

Commands for epidemiologists

Five new commands are provided for the analysis of Receiver Operating Characteristic (ROC) curves.

roctab is used to perform nonparametric ROC analyses. By default, roctab calculates the area under the curve. Optionally, roctab can plot the ROC curve, display the data in tabular form, and produce Lorenz-like plots. See help roctab.

rocfit estimates maximum-likelihood ROC models assuming a binormal distribution of the latent variable. rocplot may be used after rocfit to plot the fitted ROC curve and simultaneous confidence bands. See help rocfit.

roccomp tests the equality of two or more ROC areas obtained from applying two or more test modalities to the same sample or to independent samples. See help roccomp.

rocgold independently tests the equality of the ROC area of each of several test modalities against a "gold" standard ROC curve. For each comparison, rocgold reports the raw and the Bonferroni adjusted significance probability. Optionally, Sidak's adjustment for multiple comparisons can be obtained. See help rocgold.

binreg estimates generalized linear models for the binomial family and various links. It may be used with either individual-level or grouped data. Each of the link functions offers a distinct, epidemiological interpretation of the estimated parameters. See help binreg.

cc and cci now, by default, compute exact confidence intervals for the odds ratio. See help cc.

icd9 and icd9p assist when you are working with ICD-9-CM diagnostic and procedure codes. These commands allow the cleaning up, verification, labeling, and selection of ICD-9 values. See help icd9.

Marginal effects

mfx reports marginal effects after estimation of any model. Marginal effects refers to df()/dx_i evaluated at x, where f() is any function of the data and the model's estimated parameters, x are the model's covariates, and x_i is one of the covariates. For instance, the model might be probit and f() the cumulative normal distribution, in which case df()/dx_i = the change in the probability of a positive outcome with respect to a change in one of the covariates. x might be specified as the mean, so that the change would be evaluated at the mean.

dprobit would already do that for the probit model, and there have been other commands published in the STB that would do this for other particular models, such as dtobit for performing tobit estimation.

mfx works after estimation of any model in Stata and is capable of producing marginal effects for anything predict can produce. For instance, after tobit, you could get the marginal effect of the probability of an outcome being uncensored, or the expected value of the uncensored outcome, or the expected value of the censored outcome.

mfx can compute results as derivatives or elasticities. See help mfx.

Cluster analysis

cluster performs partitioning and hierarchical cluster analysis using a variety of methods. Two partitioning cluster methods are provided -- kmeans and kmedians -- and three hierarchical-cluster methods are provided -- single linkage, average linkage, and complete linkage. Included are 14 binary similarity measures and 7 different continuous measures (counting things such as the Minkowski distance # as one).

The result is to add various characteristics to the dataset, including variables reflecting cluster membership. cluster can then display results in various ways.

More than one result can be saved simultaneously, so that the results of different analyses may be compared. cluster allows adding notes to analyses and, of course, the dropping of analyses. cluster also provides post-clustering commands that can, for instance, display the dendrogram (clustering tree) from a hierarchical analysis or produce new grouping variables based on the analysis.

cluster has been designed to be extended. Users may program extensions for new cluster methods, new cluster management routines, and new post-analysis summary methods.

See help cluster and, if you are interested in programming extensions, see help clprog.


Pharmacokinetics

There are four new estimation commands and two new utilities intended for the analysis of pharmacokinetic data; see help pk.

pkexamine calculates pharmacokinetic measures from time-and-concentration subject-level data. pkexamine computes and displays the maximum measured concentration, the time at the maximum measured concentration, the time of the last measurement, the elimination rate, the half-life, and the area under the concentration-time curve (AUC). See help pkexamine.

pksumm obtains the first four moments from the empirical distribution of each pharmacokinetic measurement and tests the null hypothesis that the measurement is normally distributed. See help pksumm.

pkcross analyzes data from a crossover design experiment. When analyzing pharmaceutical trial data, if the treatment, carryover, and sequence variables are known, the omnibus test for separability of the treatment and carryover effects is calculated. See help pkcross.

pkequiv performs bioequivalence testing for two treatments. By default, pkequiv calculates a standard confidence interval symmetric about the difference between the two treatment means. Optionally, pkequiv calculates confidence intervals symmetric about zero and intervals based on Fieller's theorem. Additionally, pkequiv can perform interval hypothesis tests for bioequivalence. See help pkequiv.

pkshape and pkcollapse help in reshaping the data into the form that the above commands need; see help pkshape and pkcollapse.

Other statistical commands

jknife performs jackknife estimation, which is (1) an alternative, first-order unbiased estimator for a statistic; (2) a data-dependent way to calculate the standard error of the statistic and to obtain significance levels and confidence intervals; and (3) a way of producing measures reflecting the observation's influence on the overall statistic. See help jknife.

lfit, lroc, lsens, and lstat now work after probit just as they do after logit or logistic.

drawnorm draws random samples from a multivariate normal distribution with specified means and covariance matrix. See help drawnorm.

corr2data creates fictional datasets with the specified means and covariance matrix (correlation structure). Thus, you can take published results and duplicate and modify them if the estimator is solely a function of the first two moments of the data, such as regress, ivreg, anova, or factor. See help corr2data.

median performs a nonparametric test that K samples were drawn from populations with the same median. See help median.

tabstat displays tables of summary statistics, possibly broken down (conditioned) on another variable. See help tabstat.

The command avplot now works after estimation using the robust or cluster() options. See help avplot.

ml can now perform estimation with linear constraints. All that is required is that you specify the constraint() option on the ml maximize command. See help ml.

Distribution functions

Stata's density and distribution functions have been renamed. First, all the old names continue to work, even when not documented in the manual, at least under version control. The new standard, however, is, if X is the name of a distribution, then

    Xden()        is its density
    X()           is its cumulative distribution
    invX()        is its inverse cumulative
    Xtail()       is its reverse cumulative
    invXtail()    is its inverse reverse cumulative

Not all functions necessarily exist and, if they do not, that is not solely due to laziness on our part. In particular, concerning the choice between X() and Xtail(), the functions exist that we have accurately implemented. In theory, you only need one because Xtail() = 1 - X(), but in practice, the one-minus subtraction wipes out lots of accuracy. If one really wants an accurate right-tail or left-tail probability, one needs a separately written Xtail() or X() routine, written from the ground up.

Anyway, forget everything you ever knew about Stata's distribution functions. Here is the new set:

    normden()        same as old normd()
    norm()           same as old normprob()
    invnorm()        same as old invnorm()

    chi2()           related to old chiprob(); see below
    invchi2()        related to old invchi(); see below
    chi2tail()       related to old chiprob()
    invchi2tail()    related to old invchi()

    F()              related to old fprob()
    invF()           related to old invfprob()
    Ftail()          same as old fprob()
    invFtail()       equal to old invfprob()

    ttail()          related to old tprob(); see below
    invttail()       related to old invt(); see below

    nchi2()          equal to old nchi()
    invnchi2()       equal to old invnchi()
    npnchi2()        equal to old npnchi()

We want to emphasize that if a function exists, it is calculated accurately. To wit, F() accurately calculates left tails, and Ftail() accurately calculates right tails; Ftail() is far more accurate than 1 - F().
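The loss of accuracy from the one-minus subtraction is easy to see with a far-out tail. The sketch below uses arbitrary arguments chosen only to make the effect visible.

```stata
* The cumulative is so close to 1 that the subtraction leaves nothing
* useful: the result is 0 or rounding noise
display 1 - chi2(1, 100)

* The tail function computes the same probability directly and returns
* a tiny but positive number
display chi2tail(1, 100)

* The same pattern applies to the F and t tails
display Ftail(2, 50, 40)
display 2*ttail(20, 2.5)
```

The last line is a two-sided p-value for a t statistic of 2.5 with 20 degrees of freedom, formed by doubling the right tail.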

There is no normtail() function. The accurate way to calculate left-tail probabilities (z<0) is norm(z). The accurate way to calculate right-tail probabilities (z>0) is norm(-z).

All the old functions still exist, but in two cases, they work only under version control: The old invt(), under the new naming logic, ought to be the inverse of the cumulative, but is not, so invt() goes into forced retirement for a release or two. It works if version is set to 6 or before; otherwise, you get the error "unknown function invt()". Similarly, the old invchi() goes into forced retirement because it is too close to the new name invchi2().


Nonstatistical improvements


Graphics

Stata's graph command now allows line styles. Whereas before you might have specified c(lls) on the graph command to indicate the first variable was to be connected by lines, the second variable was to be connected by lines, and the third variable was to be connected by a cubic spline, you can now specify things like c(l l[-] s[-.]) to indicate the same thing and to also specify the style of the lines used to show the result. The first is to be shown by a solid line, the second by a dashed line, and the third by a line in a dash-dot-dash-dot pattern.

You can still specify the old style, or mix old and new style. In the square brackets you can type a pattern which is made up of the following pieces:

    l    (el)            solid line (default)
    _    (underscore)    a long dash
    -    (hyphen)        a medium dash
    .    (period)        a short dash (almost a dot)
    #    (pound sign)    a space

The pattern you specify repeats.

The keys at the top of graphics have been improved -- they now show the line style as well as the point, and you can now exercise control over the keys with the new key1(), key2(), key3(), and key4() options. The key#() options allow you to specify the text, the symbol, the line style, and the color, in any combination. key1(c(l[.-]) s(x) p(2) "Explanatory text") creates a key displaying a dot-dash-dot-dash line pattern, symbol small x (symbol(x) is new), in the color of pen 2, with the text "Explanatory text".

You can now specify xsize(#) and ysize(#) options on graph (and with the programming command gph open). These specify the size of the graph, in inches, and take effect when you print the graph. The default is xsize(6) and ysize(4).

Printing is now a little different. Because Stata 7 now includes a windowed interface for all operating systems, Unix included, you can pull down File and choose Print Graph. You can also use the new print command; see help print. The translate command can translate from .gph format to other file formats.

Compared to previous versions, this means the Unix stand-alone executables gphdot and gphpen are now gone; you do not need them. print is better. This also means the old gphprint command of Stata, available under Windows and Mac only, is also supplanted for printing by print and for file translation by translate.

The .gph file format has changed, meaning Stata 6 cannot display or print Stata 7 .gph files (but Stata 7 can display and print Stata 6 files). The old Stage editor cannot edit Stata 7 graphs.

The line-by-line console version of Stata for Unix can no longer display graphs, although the graph command works in the sense that you can graph into a file and print the results. To see graphs on the screen, you must use the windowed version of Stata.

The programmer's command gph continues unmodified, but programmers are alerted that Stata 7 has a new programmable bottom-layer graphics engine. You may wish to code your graphics programs using this new feature and, if so, point your browser at

Documentation for the new developmental system resides there.

Note: Your copy of Stata may have new graphic features not listed here. New features might be added when you type update to obtain and install the latest updates from To find out about any new graphics features see help whatsnew. Help whatsnew gives a complete list of all new features, graphics and otherwise, provided by your current update. Help graphics will document new end-user graphics features that are added through the life of Version 7.

New commands

foreach is a new programming command, but it can be used directly and is a useful alternative to for and while. With foreach, you can type things such as

    . foreach file in this.dta that.dta theother.dta {
      2.         use `file', clear
      3.         replace bp=. if bp==999
      4.         save `file', replace
      5. }

See help foreach.

Likewise, the new forvalues programming command is a useful alternative to for and while that steps through numeric values. Instead of coding

    . local i = 1
    . while `i' <= `n' {
      2.         ... `i' ...
      3.         local i = `i' + 1
      4. }

you code

    . forvalues i = 1(1)`n' {
      2.         ... `i' ...
      3. }

See help forvalues.

continue (and continue, break) allow you to continue out of, or break out of, while, forvalues, and foreach loops; see help continue.
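A short sketch of both forms inside a forvalues loop; the loop bounds and the macro name shown are arbitrary.

```stata
local shown
forvalues i = 1(1)10 {
        if `i' == 3 {
                * skip the rest of this iteration only
                continue
        }
        if `i' > 5 {
                * leave the loop altogether
                continue, break
        }
        display `i'
        local shown `shown' `i'
}
display "values shown: `shown'"
```

The loop displays 1, 2, 4, and 5: the value 3 is skipped by continue, and the loop ends at 6 because of continue, break.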

net search searches the web for user-written additions to Stata, including, but not limited to, user-written additions published in the STB. The user-written materials found are available for immediate download and automatic installation by clicking on the link. net search is the latest incarnation of webseek, a command not included in Stata 6 but which was made available during the release, and which continues to work but is now undocumented. See help net.

destring makes converting variables from string to numeric easier. See help destring.

The following new egen functions have been added: any(), concat(), cut(), eqany(), ends(), kurt(), mad(), mdev(), mode(), neqany(), pc(), seq(), skew(), and tag(). In addition, group() and rank() have new options. See help egen.

statsby creates a dataset of the results of a command executed by varlist:. The results can be any of the saved results of the specified command and, if it is an estimation command, the coefficients and the standard errors. Typing `statsby "regress mpg weight" _b _se e(r2), by(foreign)', for instance, would create a two-observation dataset in which the first recorded the coefficients, standard error, and R^2 for foreign = 0, and the second recorded them for foreign = 1. See help statsby.

xi has been modified to exploit Stata's longer variable names to create more readable names for the interaction terms. See help xi.

hexdump will give you a hexadecimal dump of a file. Even more useful is its analyze option, which will analyze the dump for you and report just the summary. This can be useful for diagnosing problems with raw datasets. See help hexdump.

type has a new asis option. The default behavior of type has been changed when the filename ends in .smcl to interpret the SMCL codes. This way, if you previously created a session log by typing `log using mylog', you can type `type mylog.smcl' to display it as you probably want to see it. If you wanted to see the raw SMCL codes, you would type `type mylog.smcl, asis'. See help type.

net stata.toc and *.pkg files now allow the v directive. You are supposed to code `v 2' at the top of the files and, if you do that, you may use SMCL directives in the files; see help net and smcl.

format now allows you to type the %fmt first or last, so you can equally well type `format mpg weight %9.2f' or `format %9.2f mpg weight'. See help format.

version may now be used as a prefix command; you can type `version 6: ...' to mean that ... is to be run under version 6. See help version.

There are now three shell-like commands, depending on your operating system: shell, xshell, and winexec. Stata for Windows users: nothing has changed. Stata for Mac users: nothing has changed. Stata(console) for Unix users: nothing has changed. Stata(GUI) for Unix, however, is more complicated, and it all has to do with whether you want a new xterm window created for the application. See help shell.

Numlists may now be specified as a[b]c as well as a(b)c. See help numlist.
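One way to see that the two notations are equivalent is to expand each with the numlist programming command and look at what it returns in r(numlist). This is a sketch, assuming numlist is used as an ordinary command on a quoted list.

```stata
* Bracket form: from 1 to 9 in steps of 2
numlist "1[2]9"
display "`r(numlist)'"

* Parenthesis form: the same list
numlist "1(2)9"
display "`r(numlist)'"
```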

list now has a doublespace option. See help list.

confirm names verifies that what follows, follows Stata's naming syntax -- which is to say, starts with a letter or underscore and thereafter contains letters, underscores, or digits -- and is not too long.

estimates hold has two new options and one new behavior that will be of interest to programmers. The new behavior is that if estimates are held under a temporary name, they are now automatically discarded when the program terminates. The new restore option schedules the held estimates for automatic restoration on program termination. The new not option to estimates unhold cancels the previously scheduled restoration. The new copy option to estimates hold copies the current estimates rather than moving them. See help estimates.

_rmcoll and _rmdcoll assist in removing collinear variables from varlists; see help _rmcoll and _rmdcoll.

New string functions

There are four new string functions: match(), subinstr(), subinword(), and reverse().

match(s_1,s_2) returns 1 if string s_1 "matches" s_2. In the match, * in s_2 is understood to mean zero or more characters go here, and ? is understood to mean one character goes here. match("this","*hi*") is true. In s_2, \\, \?, and \* can be used if you really want a \, ?, or * character.

subinstr(s_1,s_2,s_3,n) and subinword(s_1,s_2,s_3,n) substitute the first n occurrences of s_2 in s_1 with s_3. subinword() restricts "occurrences" to be occurrences of words. In either, n may be coded as missing value, meaning to substitute all occurrences. For instance, subinword("measure me","me","you",.) returns "measure you", and subinstr("measure me","me","you",.) returns "youasure you".

reverse(s) returns s turned around. reverse("string") returns "gnirts".

A fifth new string function is really intended for programmers: abbrev(s,n) returns the n-character ~-abbreviation of the variable name s. abbrev(s,12) is the function used throughout Stata to make 32-character names fit into 12 spaces.

See help functions.
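The examples given above can be checked with display. The sketch below uses the function names as documented in this release; the results noted in the comments are those stated in the text.

```stata
* 1: "this" matches the pattern *hi*
display match("this", "*hi*")

* "measure you": only the whole word "me" is replaced
display subinword("measure me", "me", "you", .)

* "youasure you": every occurrence of "me" is replaced
display subinstr("measure me", "me", "you", .)

* "gnirts"
display reverse("string")

* 12-character ~-abbreviation of a long variable name
display abbrev("farm_income_in_fiscal_year_1999", 12)
```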

Other new functions

The new functions inrange() and inlist() make choosing the right observations easier.

inrange() handles missing values elegantly when selecting subsamples such as a <= x <= b. inrange(x,a,b) answers the question, "Is x known to be in the range a to b?" Obviously, inrange(.,1000,2000) is false. a or b may be missing. inrange(x,a,.) answers whether it is known that x >= a, and inrange(x,.,b) answers whether it is known that x <= b. inrange(.,.,.) returns 0 which, if you think about it, is inconsistent but is probably what you want.

inlist(x,a,b,...) selects observations if x = a or x = b or ....
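A few evaluations illustrate the rules just described; the numbers are arbitrary.

```stata
* 1: 1500 is known to lie between 1000 and 2000
display inrange(1500, 1000, 2000)

* 0: a missing x is not known to be in any range
display inrange(., 1000, 2000)

* 1: a missing upper bound means only x >= a is checked
display inrange(1500, 1000, .)

* 1: a missing lower bound means only x <= b is checked
display inrange(1500, ., 2000)

* 1: 3 is one of the listed values
display inlist(3, 1, 2, 3)
```

In practice these functions would usually appear in an if qualifier, for instance `if inrange(income, 1000, 2000)', where income is a hypothetical variable name.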

See help functions for more information on the above functions. Other functions have been added. _by(), _bylastcall(), and _byindex() deal with making programs and ado-files allow the by varlist: prefix; see help byprog.

The new macro extended function: {r|e|s}({scalars|macros|matrices|functions}) returns the names of all the saved results of the indicated type. For instance, local x : e(scalars) returns the names of all the scalars currently stored in e(). See help macro.
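As a sketch, the extended function can be used after any estimation command to discover what was saved. The data below are simulated purely for illustration, and the macro names are arbitrary.

```stata
* Simulated data, for illustration only
clear
set obs 10
generate x = _n
generate y = 2*x + uniform()
quietly regress y x

* Names of the scalars that regress left behind in e()
local names : e(scalars)
display "`names'"

* The same for matrices
local mats : e(matrices)
display "`mats'"
```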

--- previous updates ----------------------------------------------------------

See whatsnew6.


© Copyright 1996–2018 StataCorp LLC