__What's new in release 7 (compared with release 6)__

This help file lists the changes corresponding to the creation of Stata
release 7:

+---------------------------------------------------------------+
| help file        contents                     years           |
|---------------------------------------------------------------|
| whatsnew         Stata 15.0 and 15.1          2017 to present |
| whatsnew14to15   Stata 15.0 new release       2017            |
| whatsnew14       Stata 14.0, 14.1, and 14.2   2015 to 2017    |
| whatsnew13to14   Stata 14.0 new release       2015            |
| whatsnew13       Stata 13.0 and 13.1          2013 to 2015    |
| whatsnew12to13   Stata 13.0 new release       2013            |
| whatsnew12       Stata 12.0 and 12.1          2011 to 2013    |
| whatsnew11to12   Stata 12.0 new release       2011            |
| whatsnew11       Stata 11.0, 11.1, and 11.2   2009 to 2011    |
| whatsnew10to11   Stata 11.0 new release       2009            |
| whatsnew10       Stata 10.0 and 10.1          2007 to 2009    |
| whatsnew9to10    Stata 10.0 new release       2007            |
| whatsnew9        Stata 9.0, 9.1, and 9.2      2005 to 2007    |
| whatsnew8to9     Stata 9.0 new release        2005            |
| whatsnew8        Stata 8.0, 8.1, and 8.2      2003 to 2005    |
| whatsnew7to8     Stata 8.0 new release        2003            |
| whatsnew7        Stata 7.0                    2001 to 2002    |
| **this file**    Stata 7.0 new release        2000            |
| whatsnew6        Stata 6.0                    1999 to 2000    |
+---------------------------------------------------------------+

Most recent changes are listed first.

--- **more recent updates** -------------------------------------------------------

See whatsnew7.

--- **Stata 7 release 15dec2000** -------------------------------------------------

The features added to Stata 7 are listed under the following headings.

Changes you cannot help but notice
    Long (32-character) names
    New varlist abbreviation rules
    Windowed Stata now across all platforms
    Improved output, more clickability
    Improvements to by
    Sort stability
    European decimal format
    Faster

Statistics
    Estimation commands (exclusive of st and xt)
    Cross-sectional time-series analysis (xt)
    Survival analysis (st)
    Commands for epidemiologists
    Marginal effects
    Cluster analysis
    Pharmacokinetics
    Other statistical commands
    Distribution functions

Nonstatistical improvements
    Graphics
    New commands
    New string functions
    Other new functions

-------------------------------------------------------------------------------

__Changes you cannot help but notice__

__Long (32-character) names__

Stata now allows names to be up to 32 characters long. That includes
variable names, label names, macro names, and any other name you can think
of. This includes program names, and we have renamed a few existing Stata
programs:

Prior name      New name
--------------------------
**llogist**     llogistic
**xthaus**      xthausman
**spikeplt**    spikeplot
**stcurv**      stcurve
**svyintrg**    svyintreg
**svyprobt**    svyprobit
**svymlog**     svymlogit
**svyolog**     svyologit
**svyoprob**    svyoprobit

The old names continue to work.

In any case, you no longer have to name your variable **f_inc1999**; you can
name it **farm_inc_1999** or **farm_income_1999** or even
**farm_income_in_fiscal_year_1999**. Where possible, we have adjusted Stata
output to allow 12 spaces for displaying names. When names are longer than
that, you will discover that Stata abbreviates and shows, for instance,
**farm_in~1999**. **~** is the new Stata abbreviation character, which Stata not
only uses in output but which you can use in input (which is to say, in
varlists; see help varlist). If you type **farm_in~1999**, **f~1999**, or
**f~in~1999**, Stata will understand that you mean
**farm_income_in_fiscal_year_1999**. Thus, if in output Stata presents
**dose~d1~42**, that name is unique and you can type it and Stata will
understand it.

describe now has two new options, **fullname** and **numbers**. **fullname** shows the
full, 32-character names, instead of shorter **~**-abbreviations, and **numbers**
shows the variable number.
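
As a small illustration of long names and the **~** abbreviation, the following sketch builds a one-observation dataset. Only the variable name comes from the text above; the value 100 is made up for the example.

```stata
* Sketch: a 31-character variable name and two ways to abbreviate it.
* The value 100 is an arbitrary illustration value.
clear
set obs 1
generate farm_income_in_fiscal_year_1999 = 100

* Default output may show the name in ~-abbreviated form.
describe

* Show the full name and the variable number instead.
describe, fullname numbers

* In input, each ~ pattern must match exactly one variable.
summarize farm_in~1999
summarize f~1999
```

If a second variable such as a hypothetical **farm_exports_1999** also existed, **f~1999** would no longer be unique and Stata would be expected to reject it as ambiguous.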

__New varlist abbreviation rules__

Varlists now understand ***** when used as other than a suffix. You can still
type **pop***, but you can also type **pop*99** or **pop*30_40*1999** or even ***1999**. *****
means "zero or more characters go here". Also understood is the new **~**
abbreviation character mentioned above. ***** and **~** really mean the same thing
and work the same way, except **~** adds the claim "and only one variable
matches this pattern", whereas ***** means "give me all the variables that
match this pattern".

The other new abbreviation character is **?**, which means "one character goes
here", so **result?10** might match **resultb10** and **resultc10**, but would not
match **resultb110**.
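
The three abbreviation characters can be tried side by side. This is a minimal sketch; the variable names below are hypothetical and were chosen only so that each pattern has something to match.

```stata
* Sketch: hypothetical variables chosen to exercise the three patterns.
clear
set obs 1
generate pop1990    = 1
generate pop1999    = 2
generate inc1999    = 3
generate resultb10  = 4
generate resultc10  = 5
generate resultb110 = 6

* The asterisk matches zero or more characters, anywhere in the name.
describe pop*
describe *1999

* The question mark matches exactly one character, so this picks up
* resultb10 and resultc10 but not resultb110.
describe result?10

* The tilde matches like the asterisk but requires a unique match.
describe inc~99
```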

__Windowed Stata now across all platforms__

Stata for Unix users now have the same windowed interface that Stata for
Windows and Stata for Mac users have: type **xstata** rather than **stata** to
start Stata. Typing **stata** brings up the old line-by-line console version
of Stata. Typing **xstata** brings up the new windowed version. The old
console version is still useful in batch situations, but Stata(console), as
it is now called, can no longer render graphs.

__Improved output, more clickability__

Stata's output looks better thanks to the new output language called SMCL,
which stands for Stata Markup and Control Language. Moreover, all Stata
output, whether it be help files in the help window (now called the
Viewer), help files in the Results window, or statistical output, is SMCL,
meaning all features are available in all contexts. One implication is
that if something is clickable, it is clickable regardless of the window in
which it is displayed, so you can start by typing **help** **anova** and click on
links just as you could had you pulled down **Help** and gone about displaying
the help in the help window (Viewer).

Clickability is not limited to help files. You can write programs that
display in their output clickable links. The corresponding action can even
be the execution of another Stata command or program!

The help window is now called the Viewer because it serves more purposes
than solely displaying help files. The Viewer, for instance, is where you
look at logs you have previously created or are creating. That's because,
by default, Stata logs are now SMCL files and the default file extension
for log files is **.smcl** to remind you of that. When you type `**log using**
**myfile**', **myfile.smcl** is created. The file is ASCII, so you can look at it
(and even edit it) in your editor or word processor, but it is not a pretty
sight.

Formatted, however, it is pretty. The Viewer can print the SMCL logs Stata
now creates, and the new **translate** command can translate the SMCL file to
PostScript format, or even standard ASCII text format, so you can get back
to just where you were in Stata 6; see help translate. Moreover, you can
directly create old-style ASCII text logs if that is your preference; just
type `**log using myfile.log**' or `**log using myfile, text**'; see help log.
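
A short session tying these pieces together might look like the following sketch. The name **myfile** is the example name used above; the **summarize** call is there only to put something in the log, and the **replace** options are added so the sketch can be rerun.

```stata
* Sketch: create a SMCL log, close it, then view and convert it.
log using myfile, replace
summarize
log close

* Display the SMCL log in formatted form.
type myfile.smcl

* Convert the SMCL log to a plain ASCII text log.
translate myfile.smcl myfile.log, replace
```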

The Viewer can be accessed by pulling down **File**, or you can use the new
**view** command, which provides some additional features; see help view.

Programmers will want to see help smcl for a complete description of SMCL.
You can use SMCL in your ado-files.

There is one other log change: you can now create command logs (ASCII text
logs containing only what you type, which used to be called **noproc** logs)
using the new **cmdlog** command. Even better, you can create command logs and
full session logs simultaneously; see help log.

Stata(console) for Unix users: All the above applies to you, too, except
that you cannot click. Stata(console) does not have a **view** command, but
**type** can display **.smcl** files, and **translate** can translate them. See help
conren for instructions on how to make SMCL output look as good as possible
on your line-by-line console.

__Improvements to by__

**by** *varlist***:** now has a **sort** option. You can type, for instance, `**by**
**foreign, sort: summarize mpg**' or, equivalently, `**bysort foreign: summarize**
**mpg**', rather than first sorting the data and then typing the **by** command;
see help by.

**by** has a new parenthesis notation: `**by** *id* **(***time***):** *...*' means to perform
*...* by *id*, but first verify that the data are sorted by *id* and *time*. `**by**
*id* **(***time***), sort:** *...*' says to sort the data by *id* and *time* and then perform
*...* by *id*.
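
The two forms of the parenthesis notation can be sketched as follows. The data are hypothetical blood-pressure readings, entered deliberately out of order.

```stata
* Sketch: hypothetical readings, entered unsorted on purpose.
clear
input id time bp
1 2 .
1 1 120
2 2 .
2 1 110
end

* Sort by id and time, then carry the previous reading forward within id.
by id (time), sort: replace bp = bp[_n-1] if bp == .

* The data are now sorted by id and time, so the form without the sort
* option only verifies the order before running the command.
by id (time): generate visit = _n
```

Because the by-groups are defined by *id* alone, **bp[_n-1]** never reaches back into the previous subject's observations.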

There is also a new **rc0** option, which says to keep on going even if one of
the by-groups results in an error.

More importantly, **by** *varlist***:** is now allowed with virtually every Stata
command, including commands implemented as ado-files, including **egen**. We
have been claiming for some time that whether a command is built-in or
implemented as an ado-file is irrelevant; it has the same features. Now
the claim is true. Programmers: see help byprog for instructions on how
to make your programs and ado-files allow the **by** prefix; it is easy.

The commands **generate**, **replace**, **drop**, **keep**, and **assert** no longer present
the detailed, group-by-group report when prefixed with **by**, meaning you no
longer need to prefix them with **quietly**:

**. by id: replace bp = bp[_n-1] if bp==.**
(120 changes made)

__Sort stability__

Commands that report results of calculations (commands not intended to
change the data) no longer change the sort order of the data. If you type
`**sort** *id* *time*', you can be assured that your dataset will stay sorted by *id*
and *time*. This is true even if the command is implemented as an ado-file.

Programmers: see **[P] sortpreserve** for instructions on making your old
programs and ado-files sort stable. It is easy, and the performance
penalty is barely measurable.
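
A minimal sketch of a sort-stable program follows. **showmin** is a hypothetical program name, not part of Stata, and the data are made up; the point is only that the program may sort freely once it is declared with **sortpreserve**.

```stata
* Sketch: showmin is a hypothetical program, not part of Stata.
capture program drop showmin
program define showmin, sortpreserve
    version 7
    syntax varname
    * Sorting here is safe: the caller's order is restored on exit.
    sort `varlist'
    display "minimum of `varlist' = " `varlist'[1]
end

* Hypothetical data, sorted by id and time before the call.
clear
input id time bp
1 1 120
1 2 115
2 1 110
end
sort id time
showmin bp

* The dataset should still be reported as sorted by id and time.
describe, short
```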

__European decimal format__

Stata now understands output formats such as **%9,2f** as well as **%9.2f**. In
**%9,2f**, the number 500.5 is displayed as 500,50. In **%9,2fc** format, the
number 1,000.5 is displayed as 1.000,50.

Even better, you can now **set dp comma** to modify all of Stata's output to
use the European format, including all statistical output. See help
format.
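
The formats above can be tried on a small dataset. This is a sketch using the two values from the text; the variable names **a** and **b** are arbitrary.

```stata
* Sketch: the values mirror the examples in the text above.
clear
set obs 1
generate a = 500.5
generate b = 1000.5

* Attach European display formats to the variables and list them.
format a %9,2f
format b %9,2fc
list a b

* Switch all output to the European convention, then switch back.
set dp comma
summarize a b
set dp period
```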

__Faster__

Stata 7 has more features, but continuing our long tradition, it is also
faster; ado-files execute between 8.8 and 11.8 percent faster. Some
programs, we have observed, execute 13 percent faster.

-------------------------------------------------------------------------------

__Statistics__

__Estimation commands (exclusive of st and xt)__

First, all maximum-likelihood estimation commands of Stata now allow linear
constraints; each has a new **constraint()** option. See the particular
estimator.

**boxcox** has been rewritten. It now produces maximum likelihood estimates of
the coefficients and the Box--Cox transform parameter(s). Box--Cox models
may be estimated in various forms, with the transform on the left, on the
right, or on both sides. See help boxcox.

**glm** has also been rewritten. It continues to estimate the generalized
linear model, but now offers an expanded choice of link functions and also
allows user-specified link and variance functions. **glm** will now report
maximum-likelihood based estimates of standard errors, IRLS based
estimates, and many others. See help glm.

**nlogit** estimates nested logit models. In a nested logit model, multiple
outcomes are grouped into a nested tree structure, and nested logit has the
advantage over multinomial and conditional logistic models of allowing you
to parameterize away the assumption of independence of irrelevant
alternatives (IIA). See help nlogit.

**treatreg** estimates the treatment effects model using either a two-step
estimator or a full maximum-likelihood estimator. The treatment effects
model considers the effect of an endogenously chosen binary treatment on
another endogenous continuous variable, conditional on two sets of
independent variables. See help treatreg.

**truncreg** estimates truncated regression models. Truncated regression
refers to regressions estimated on samples drawn based on the dependent
variable, and therefore for which (sometimes) neither the dependent nor
independent variables are observed (as opposed to **tobit**, which estimates
regression models when the independent variables are observed in all
cases). See help truncreg.

__Cross-sectional time-series analysis (xt)__

**xtabond** produces the Arellano--Bond one-step, one-step robust, and two-step
estimators for dynamic panel-data models, models in which there are lagged
dependent variables. **xtabond** can be used with exogenously unbalanced
panels and, uniquely, handles embedded gaps in the time series as well as
opening and closing gaps. **xtabond** allows for predetermined covariates.
**xtabond** allows you to use either the full instrument matrix or a pared down
version. **xtabond** reports both the Sargan and autocorrelation tests derived
by Arellano and Bond. See help xtabond.

**xtregar** estimates cross-sectional time-series models in which epsilon_it is
assumed to follow an AR(1) process. **xtregar** reports the within estimator
and a GLS random-effects estimator. **xtregar** can handle unequally spaced
observations and exogenously unbalanced panels. **xtregar** uniquely reports
the modified Bhargava et al. Durbin--Watson statistic and the Baltagi--Wu
locally best invariant test statistic for autocorrelation. See help
xtregar.

**xtivreg** estimates cross-sectional time-series regressions with
(generalized) instrumental variables, or, said differently, estimates
two-stage least squares time-series cross-sectional models. **xtivreg** can
estimate such models using the between-2SLS estimator, the within-2SLS
estimator, the first-differenced 2SLS estimator, the
Balestra--Varadharajan--Krishnakumar G2SLS estimator, or the Baltagi EC2SLS
estimator. All the estimators allow use of balanced or (exogenously)
unbalanced panels. See help xtivreg.

**xtpcse** produces panel-corrected standard errors (PCSE) for linear
cross-sectional time-series models where the parameters are estimated by
OLS or Prais--Winsten regression. When computing the standard errors and
the variance--covariance estimates, the disturbances are, by default,
assumed to be heteroskedastic and contemporaneously correlated across
panels. See help xtpcse.

__Survival analysis (st)__

**stcox** will now estimate proportional hazard models with continuously
time-varying covariates, and you do not need to modify your data to obtain
the estimates. See the **tvc()** and **texp()** options in help stcox.

**streg** can now estimate parametric survival models with individual-level
frailty (unobserved heterogeneity). Two forms of the frailty distribution
are allowed: gamma and inverse Gaussian. Frailty is allowed with all the
parametric distributions currently available. See help streg. (New
commands **weibullhet**, **ereghet**, etc., allow users to estimate these models
outside of the st system; see help weibull.)

**streg** has also been modified to allow estimation of stratified models,
meaning that the distributional parameters (the ancillary parameters and
intercept) are allowed to differ across strata. See the **strata()** option in
help streg.

**streg** has also been modified to allow you to specify any
linear-in-the-parameters equation for any of the distributional parameters,
which allows you to create various forms of stratification, as well as
allowing distributional parameters to be linear functions of other
covariates. See the **ancillary()** option in help streg.

**stptime** calculates person-time (person-years) and incidence rates and
implements computation of the standardized mortality/morbidity ratios
(SMR). See help stptime.

**sts test** has been modified to include additional tests for comparing
survivor distributions, including the Tarone--Ware test, the
Fleming--Harrington test, and the Peto--Peto--Prentice test. Also new is a
test for trend. See help sts.

**stci** calculates and reports the level and confidence intervals of the
survivor function, as well as computing and reporting the mean survival
time and confidence interval. See help stci.

**stsplit** is now much faster and now allows for splitting on failure times,
as well as providing some additional convenience options. See help
stsplit, but remember that **stcox** can now estimate with continuous
time-varying covariates without you having to **stsplit** the data beforehand.

**stcurve** has a new **outfile** option. See help streg.

__Commands for epidemiologists__

Five new commands are provided for the analysis of Receiver Operating
Characteristic (ROC) curves.

**roctab** is used to perform nonparametric ROC analyses. By default, **roctab**
calculates the area under the curve. Optionally, **roctab** can plot the ROC
curve, display the data in tabular form, and produce Lorenz-like plots.
See help roctab.

**rocfit** estimates maximum-likelihood ROC models assuming a binormal
distribution of the latent variable. **rocplot** may be used after **rocfit** to
plot the fitted ROC curve and simultaneous confidence bands. See help
rocfit.

**roccomp** tests the equality of two or more ROC areas obtained from applying
two or more test modalities to the same sample or to independent samples.
See help roccomp.

**rocgold** independently tests the equality of the ROC area of each of several
test modalities against a "gold" standard ROC curve. For each comparison,
**rocgold** reports the raw and the Bonferroni adjusted significance
probability. Optionally, Sidak's adjustment for multiple comparisons can
be obtained. See help rocgold.

**binreg** estimates generalized linear models for the binomial family and
various links. It may be used with either individual-level or grouped
data. Each of the link functions offers a distinct, epidemiological
interpretation of the estimated parameters. See help binreg.

**cc** and **cci** now, by default, compute exact confidence intervals for the odds
ratio. See help cc.

**icd9** and **icd9p** assist when you are working with ICD-9-CM diagnostic and
procedure codes. These commands allow the cleaning up, verification,
labeling, and selection of ICD-9 values. See help icd9.

__Marginal effects__

**mfx** reports marginal effects after estimation of any model. Marginal
effects refers to df()/dx_i evaluated at x, where f() is any function of
the data and the model's estimated parameters, x are the model's
covariates, and x_i is one of the covariates. For instance, the model
might be probit and f() the cumulative normal distribution, in which case
df()/dx_i = the change in the probability of a positive outcome with
respect to a change in one of the covariates. x might be specified as the
mean, so that the change would be evaluated at the mean.

**dprobit** would already do that for the probit model, and there have been
other commands published in the STB that would do this for other particular
models, such as **dtobit** for performing tobit estimation.

**mfx** works after estimation of any model in Stata and is capable of
producing marginal effects for anything **predict** can produce. For instance,
after **tobit**, you could get the marginal effect of the probability of an
outcome being uncensored, or the expected value of the uncensored outcome,
or the expected value of the censored outcome.

**mfx** can compute results as derivatives or elasticities. See help mfx.

__Cluster analysis__

**cluster** performs partitioning and hierarchical cluster analysis using a
variety of methods. Two partitioning cluster methods are provided --
kmeans and kmedians -- and three hierarchical-cluster methods are provided
-- single linkage, average linkage, and complete linkage. Included are 14
binary similarity measures and 7 different continuous measures (counting
things such as the Minkowski distance *#* as one).

The result is to add various characteristics to the dataset, including
variables reflecting cluster membership. **cluster** can then display
results in various ways.

More than one result can be saved simultaneously, so that the results of
different analyses may be compared. **cluster** allows adding notes to
analyses and, of course, the dropping of analyses. **cluster** also provides
post-clustering commands that can, for instance, display the dendrogram
(clustering tree) from a hierarchical analysis or produce new grouping
variables based on the analysis.

**cluster** has been designed to be extended. Users may program extensions for
new cluster methods, new cluster management routines, and new post-analysis
summary methods.

See help cluster and, if you are interested in programming extensions, see
help clprog.

__Pharmacokinetics__

There are four new estimation commands and two new utilities intended for
the analysis of pharmacokinetic data; see help pk.

**pkexamine** calculates pharmacokinetic measures from time-and-concentration
subject-level data. **pkexamine** computes and displays the maximum measured
concentration, the time at the maximum measured concentration, the time of
the last measurement, the elimination rate, the half-life, and the area
under the concentration-time curve (AUC). See help pkexamine.

**pksumm** obtains the first four moments from the empirical distribution of
each pharmacokinetic measurement and tests the null hypothesis that the
measurement is normally distributed. See help pksumm.

**pkcross** analyzes data from a crossover design experiment. When analyzing
pharmaceutical trial data, if the treatment, carryover, and sequence
variables are known, the omnibus test for separability of the treatment and
carryover effects is calculated. See help pkcross.

**pkequiv** performs bioequivalence testing for two treatments. By default,
**pkequiv** calculates a standard confidence interval symmetric about the
difference between the two treatment means. Optionally, **pkequiv** calculates
confidence intervals symmetric about zero and intervals based on Fieller's
theorem. Additionally, **pkequiv** can perform interval hypothesis tests for
bioequivalence. See help pkequiv.

**pkshape** and **pkcollapse** help in reshaping the data into the form that the
above commands need; see help pkshape and pkcollapse.

__Other statistical commands__

**jknife** performs jackknife estimation, which is (1) an alternative,
first-order unbiased estimator for a statistic; (2) a data-dependent way to
calculate the standard error of the statistic and to obtain significance
levels and confidence intervals; and (3) a way of producing measures
reflecting the observation's influence on the overall statistic. See help
jknife.

**lfit**, **lroc**, **lsens**, and **lstat** now work after **probit** just as they do after
**logit** or **logistic**.

**drawnorm** draws random samples from a multivariate normal distribution with
specified means and covariance matrix. See help drawnorm.
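
A minimal sketch of **drawnorm** in use follows. The means, covariance matrix, sample size, and seed are arbitrary illustration values, and **m** and **C** are hypothetical matrix names.

```stata
* Sketch: all numbers below are arbitrary illustration values.
clear
set seed 12345
matrix m = (0, 10)
matrix C = (1, 0.5 \ 0.5, 2)

* Draw 1,000 observations on x and y from the bivariate normal.
drawnorm x y, n(1000) means(m) cov(C)

* The sample moments should be close to, but not exactly, m and C.
summarize x y
correlate x y, covariance
```

**corr2data**, by contrast, is described as producing data whose sample moments match the specification, which is why it suits reproducing published results.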

**corr2data** creates fictional datasets with the specified means and
covariance matrix (correlation structure). Thus, you can take published
results and duplicate and modify them if the estimator is solely a function
of the first two moments of the data, such as **regress**, **ivreg**, **anova**, or
**factor**. See help corr2data.

**median** performs a nonparametric test that K samples were drawn from
populations with the same median. See help median.

**tabstat** displays tables of summary statistics, possibly broken down
(conditioned) on another variable. See help tabstat.

The command **avplot** now works after estimation using the **robust** or **cluster()**
options. See help avplot.

**ml** can now perform estimation with linear constraints. All that is
required is that you specify the **constraint()** option on the **ml** **maximize**
command. See help ml.

__Distribution functions__

Stata's density and distribution functions have been renamed. First, all
the old names continue to work, even when not documented in the manual, at
least under version control. The new standard, however, is, if *X* is the
name of a distribution, then

*X***den()**         is its density
*X***()**            is its cumulative distribution
**inv***X***()**         is its inverse cumulative
*X***tail()**        is its reverse cumulative
**inv***X***tail()**     is its inverse reverse cumulative

Not all functions necessarily exist and, if they do not, that is not solely
due to laziness on our part. In particular, concerning the choice between
*X***()** and *X***tail()**, the functions exist that we have accurately implemented.
In theory, you only need one because *X***tail()** = 1 - *X***()**, but in practice,
the one-minus subtraction wipes out lots of accuracy. If one really wants
an accurate right-tail or left-tail probability, one needs a separately
written *X***tail()** or *X***()** routine, written from the ground up.

Anyway, forget everything you ever knew about Stata's distribution
functions. Here is the new set:

**normden()**        same as old **normd()**
**norm()**           same as old **normprob()**
**invnorm()**        same as old **invnorm()**

**chi2()**           related to old **chiprob()**; see below
**invchi2()**        related to old **invchi()**; see below
**chi2tail()**       related to old **chiprob()**
**invchi2tail()**    related to old **invchi()**

**F()**              related to old **fprob()**
**invF()**           related to old **invfprob()**
**Ftail()**          same as old **fprob()**
**invFtail()**       equal to old **invfprob()**

**ttail()**          related to old **tprob()**; see below
**invttail()**       related to old **invt()**; see below

**nchi2()**          equal to old **nchi()**
**invnchi2()**       equal to old **invnchi()**
**npnchi2()**        equal to old **npnchi()**

We want to emphasize that if a function exists, it is calculated
accurately. To wit, **F()** accurately calculates left tails, and **Ftail()**
accurately calculates right tails; **Ftail()** is far more accurate than
1 - **F()**.
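
The loss of accuracy is easy to see far out in a tail. The sketch below compares the dedicated tail functions with the one-minus subtraction; the degrees of freedom and test values are arbitrary, chosen only to be extreme.

```stata
* Sketch: a chi-squared value far in the right tail (1 df, x = 100).
* The dedicated tail function returns a tiny positive probability.
display chi2tail(1, 100)

* The subtraction cannot represent a probability that small, because
* chi2(1, 100) is within rounding error of 1 in double precision.
display 1 - chi2(1, 100)

* The same comparison for the F distribution (2 and 50 df, F = 200).
display Ftail(2, 50, 200)
display 1 - F(2, 50, 200)
```

The mirror-image problem arises for very small left-tail probabilities, where the cumulative function is the accurate one and one minus the tail function is not.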

There is no **normtail()** function. The accurate way to calculate left-tail
probabilities (z<0) is **norm(z)**. The accurate way to calculate right-tail
probabilities (z>0) is **norm(-z)**.

All the old functions still exist, but in two cases, they work only under
version control: The old **invt()**, under the new naming logic, ought to be
the inverse of the cumulative, but is not, so **invt()** goes into forced
retirement for a release or two. It works if **version** is set to 6 or
before; otherwise, you get the error "unknown function invt()". Similarly,
the old **invchi()** goes into forced retirement because it is too close to the
new name **invchi2()**.

-------------------------------------------------------------------------------

__Nonstatistical improvements__

__Graphics__

Stata's **graph** command now allows line styles. Whereas before you might
have specified **c(lls)** on the **graph** command to indicate the first variable
was to be connected by lines, the second variable was to be connected by
lines, and the third variable was to be connected by a cubic spline, you
can now specify things like **c(l l[-] s[-.])** to indicate the same thing and
to also specify the style of the lines used to show the result. The first
is to be shown by a solid line, the second by a dashed line, and the third
by a line in a dash-dot-dash-dot pattern.

You can still specify the old style, or mix old and new style. In the
square brackets you can type a pattern which is made up of the following
pieces:

**l**   (el)           solid line (default)
**_**   (underscore)   a long dash
**-**   (hyphen)       a medium dash
**.**   (period)       a short dash (almost a dot)
**#**   (pound sign)   a space

The pattern you specify repeats.

The keys at the top of graphics have been improved -- they now show the
line style as well as the point, and you can now exercise control over the
keys with the new **key1()**, **key2()**, **key3()**, and **key4()** options. The **key***#***()**
options allow you to specify the text, the symbol, the line style, and the
color, in any combination. **key1(c(l[.-]) s(x) p(2) "Explanatory text")**
creates a key displaying a dot-dash-dot-dash line pattern, symbol small x
(**symbol(x)** is new), in the color of pen 2, with the text "Explanatory
text".

You can now specify **xsize(***#***)** and **ysize(***#***)** options on **graph** (and with the
programming command **gph open**). These specify the size of the graph, in
inches, and take effect when you print the graph. The default is **xsize(6)**
and **ysize(4)**.

Printing is now a little different. Because Stata 7 now includes a
windowed interface for all operating systems, Unix included, you can pull
down **File** and choose **Print Graph**. You can also use the new **print** command;
see help print. The **translate** command can translate from .gph format to
other file formats.

Compared to previous versions, this means the Unix stand-alone executables
gphdot and gphpen are now gone; you do not need them. **print** is better.
This also means the old **gphprint** command of Stata, available under Windows
and Mac only, is also supplanted for printing by **print** and for file
translation by **translate**.

The .gph file format has changed, meaning Stata 6 cannot display or print
Stata 7 .gph files (but Stata 7 can display and print Stata 6 files). The
old Stage editor cannot edit Stata 7 graphs.

The line-by-line console version of Stata for Unix can no longer display
graphs, although the **graph** command works in the sense that you can graph
into a file and print the results. To see graphs on the screen, you must
use the windowed version of Stata.

The programmer's command **gph** continues unmodified, but programmers are
alerted that Stata 7 has a new programmable bottom-layer graphics engine.
You may wish to code your graphics programs using this new feature and, if
so, point your browser at

http://developer.stata.com/graphics

Documentation for the new developmental system resides there.

Note: Your copy of Stata may have new graphic features not listed here.
New features might be added when you type **update** to obtain and install the
latest updates from www.stata.com. To find out about any new graphics
features, see help whatsnew. Help whatsnew gives a complete list of all new
features, graphics and otherwise, provided by your current update. Help
graphics will document new end-user graphics features that are added
through the life of Version 7.

__New commands__

**foreach** is a new programming command, but it can be used directly and is a
useful alternative to **for** and **while**. With **foreach**, you can type things
such as

**. foreach file in this.dta that.dta theother.dta {**
**2. use `file', clear**
**3. replace bp=. if bp==999**
**4. save `file', replace**
**5. }**

See help foreach.

Likewise, the new **forvalues** programming command is a useful alternative to
**for** and **while** that steps through numeric values. Instead of coding

**. local i = 1**
**. while `i' <= `n' {**
**2.** *...* **`i'** *...*
**3. local i = `i' + 1**
**4. }**

you code

**. forvalues i = 1(1)`n' {**
**2.** *...* **`i'** *...*
**3. }**

See help forvalues.

**continue** (and **continue, break**) allow you to continue out of, or break out
of, **while**, **forvalues**, and **foreach** loops; see help continue.
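
As a minimal sketch of **continue, break**, the loop below stops at the first value whose square exceeds 10. The local macro **shown** is a hypothetical counter added only to record how many lines were displayed.

```stata
* Sketch: leave the loop at the first i with i squared above 10.
local shown = 0
forvalues i = 1(1)10 {
    if `i'*`i' > 10 {
        continue, break
    }
    display "`i' squared is " `i'*`i'
    local shown = `shown' + 1
}
display "lines displayed: `shown'"
```

Typing **continue** without the **break** option would instead skip the rest of the current iteration and go on to the next value.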

**net** **search** searches the web for user-written additions to Stata, including,
but not limited to, user-written additions published in the STB. The
user-written materials found are available for immediate download and
automatic installation by clicking on the link. **net** **search** is the latest
incarnation of **webseek**, a command not included in Stata 6 but which was
made available during the release, and which continues to work but is now
undocumented. See help net.

**destring** makes converting variables from string to numeric easier. See
help destring.
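
A small sketch of **destring** follows. The data are hypothetical, and the **force** option is used here on the assumption that nonnumeric entries such as "n/a" should simply become missing.

```stata
* Sketch: a string variable holding mostly numeric text.
clear
input str5 price
"100"
"250"
"n/a"
end

* Create a numeric copy; force turns nonnumeric entries into missing.
destring price, generate(price_n) force
list
```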

The following new **egen** functions have been added: **any()**, **concat()**, **cut()**,
**eqany()**, **ends()**, **kurt()**, **mad()**, **mdev()**, **mode()**, **neqany()**, **pc()**, **seq()**,
**skew()**, and **tag()**. In addition, **group()** and **rank()** have new options. See
help egen.

**statsby** creates a dataset of the results of a command executed **by** *varlist***:**.
The results can be any of the saved results of the specified command and,
if it is an estimation command, the coefficients and the standard errors.
Typing `**statsby "regress mpg weight" _b _se e(r2), by(foreign)**', for
instance, would create a two-observation dataset in which the first
observation records the coefficients, standard errors, and R^2 for
foreign = 0, and the second records them for foreign = 1. See help statsby.

**xi** has been modified to exploit Stata's longer variable names to create
more readable names for the interaction terms. See help xi.

**hexdump** will give you a hexadecimal dump of a file. Even more useful is
its **analyze** option, which will analyze the dump for you and report just the
summary. This can be useful for diagnosing problems with raw datasets.
See help hexdump.

**type** has a new **asis** option. The default behavior of **type** has been changed
when the filename ends in **.smcl** to interpret the SMCL codes. This way, if
you previously created a session log by typing `**log using mylog**', you can
type `**type mylog.smcl**' to display it as you probably want to see it. If
you wanted to see the raw SMCL codes, you would type `**type mylog.smcl,**
**asis**'. See help type.

**net** **stata.toc** and ***.pkg** files now allow the **v** directive. You are supposed
to code `**v 2**' at the top of the files and, if you do that, you may use SMCL
directives in the files; see help net and smcl.

**format** now allows you to type the **%***fmt* first or last, so you can equally
well type `**format mpg weight %9.2f**' or `**format %9.2f mpg weight**'. See help
format.

**version** may now be used as a prefix command; you can type `**version 6: ***...*'
to mean that *...* is to be run under version 6. See help version.

There are now three **shell**-like commands, depending on your operating
system: **shell**, **xshell**, and **winexec**. Stata for Windows users: nothing has
changed. Stata for Mac users: nothing has changed. Stata(console) for
Unix users: nothing has changed. Stata(GUI) for Unix, however, is more
complicated, and it all has to do with whether you want a new **xterm** window
created for the application. See help shell.

Numlists may now be specified as *a***[***b***]***c* as well as *a***(***b***)***c*. See help numlist.
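
For instance, the two loops below should step through the same values; the
macro name k is arbitrary:

```stata
* Both loops display 1, 3, 5, 7, 9.
foreach k of numlist 1(2)9 {
    display `k'
}
foreach k of numlist 1[2]9 {
    display `k'
}
```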

**list** now has a **doublespace** option. See help list.

**confirm** **names** verifies that what follows obeys Stata's naming syntax --
which is to say, starts with a letter or underscore and thereafter contains
letters, underscores, or digits -- and is not too long.
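
A short sketch of how this might be used; the names being checked are
arbitrary examples:

```stata
* These follow the naming rules, so confirm is silent.
confirm names mpg _tmp x1

* A name beginning with a digit is rejected; capture traps the error
* so that the return code can be inspected.
capture confirm names 1abc
display _rc
```

A nonzero return code after the second call indicates that the name was
refused.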

**estimates** **hold** has two new options and one new behavior that will be of
interest to programmers. The new behavior is that if estimates are held
under a temporary name, they are now automatically discarded when the
program terminates. The new **restore** option schedules the held estimates
for automatic restoration on program termination. The new **not** option to
**estimates** **unhold** cancels the previously scheduled restoration. The new
**copy** option to **estimates** **hold** copies the current estimates rather than
moving them. See help estimates.

**_rmcoll** and **_rmdcoll** assist in removing collinear variables from varlists;
see help _rmcoll and _rmdcoll.

__New string functions__

There are four new string functions: **match()**, **subinstr()**, **subinword()**, and
**reverse()**.

**match(***s_1***,***s_2***)** returns 1 if string *s_1* "matches" *s_2*. In the match, ***** in
*s_2* is understood to mean zero or more characters go here, and **?** is
understood to mean one character goes here. **match("this","*hi*")** is true.
In *s_2*, **\\**, **\?**, and **\*** can be used if you really want a **\**, **?**, or *****
character.
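
The wildcard rules can be checked interactively. The patterns below are
illustrative:

```stata
* zero or more characters on either side of "hi": displays 1
display match("this", "*hi*")

* ? stands for exactly one character: displays 1
display match("this", "th?s")

* "t?s" describes a three-character string, so this displays 0
display match("this", "t?s")
```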

**subinstr(***s_1***,***s_2***,***s_3***,***n***)** and **subinword(***s_1***,***s_2***,***s_3***,***n***)** substitute the first *n*
occurrences of *s_2* in *s_1* with *s_3*. **subinword()** restricts "occurrences" to
be occurrences of words. In either, *n* may be coded as missing value,
meaning to substitute all occurrences. For instance, **subinword("measure**
**me","me","you",.)** returns "measure you", and **subinstr("measure**
**me","me","you",.)** returns "youasure you".
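
To see the role of *n*, compare a count of 1 with missing, using the same
strings as above:

```stata
* only the first occurrence is replaced: displays youasure me
display subinstr("measure me", "me", "you", 1)

* all occurrences are replaced: displays youasure you
display subinstr("measure me", "me", "you", .)

* only whole-word occurrences count: displays measure you
display subinword("measure me", "me", "you", .)
```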

**reverse(***s***)** returns *s* turned around. **reverse("string")** returns "gnirts".

A fifth new string function is really intended for programmers:
**abbrev(***s***,***n***)** returns the *n*-character **~**-abbreviation of the variable name *s*.
**abbrev(***s***,12)** is the function used throughout Stata to make 32-character
names fit into 12 spaces.
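
A quick sketch; the long variable name is hypothetical, and the exact
abbreviation produced is not shown here:

```stata
* a long name is shortened to fit in 12 characters, with ~ marking the cut
display abbrev("displacement_cubic_inches", 12)

* a name that already fits is returned unchanged
display abbrev("mpg", 12)
```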

See help functions.

__Other new functions__

The new functions **inrange()** and **inlist()** make choosing the right
observations easier.

**inrange()** handles missing values elegantly when selecting subsamples such
as *a* <= *x* <= *b*. **inrange(***x***,***a***,***b***)** answers the question, "Is *x* known to be in
the range *a* to *b*?" Obviously, **inrange(.,1000,2000)** is false. *a* or *b* may be
missing. **inrange(***x***,***a***,.)** answers whether it is known that *x* >= *a*, and
**inrange(***x***,.,***b***)** answers whether it is known that *x* <= *b*. **inrange(.,.,.)**
returns 0 which, if you think about it, is inconsistent but is probably
what you want.
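
The cases described above can be checked with constants. The last line is
included only for contrast and assumes the usual rule that missing compares
as larger than any number:

```stata
* x known to lie between 1000 and 2000: displays 1
display inrange(1500, 1000, 2000)

* a missing x is never known to be in range: displays 0
display inrange(., 1000, 2000)

* missing upper bound, so only x >= a is checked: displays 1
display inrange(1500, 1000, .)

* missing lower bound, so only x <= b is checked: displays 1
display inrange(500, ., 1000)

* by contrast, a bare comparison counts missing as large: displays 1
display (. >= 1000)
```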

**inlist(***x***,***a***,***b***,***...***)** selects observations if *x* = *a* or *x* = *b* or *...*.
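
A small sketch on made-up data, where x is a hypothetical variable:

```stata
clear
input x
1
2
3
4
end

* equivalent to: count if x == 1 | x == 3
count if inlist(x, 1, 3)
```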

See help functions for more information on the above functions. Other
functions have been added. **_by()**, **_bylastcall()**, and **_byindex()** deal with
making programs and ado-files allow the **by** *varlist***:** prefix; see help
byprog.

The new macro extended function
{**r**|**e**|**s**}**(**{**scalars**|**macros**|**matrices**|**functions**}**)** returns the names of all the
saved results of the indicated type. For instance, **local x : e(scalars)**
returns the names of all the scalars currently stored in **e()**. See help
macro.
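
As a sketch, the names can be retrieved right after an estimation command.
The data, the variables y and x, and the macro name snames are invented for
the example:

```stata
clear
input y x
1 1
2 2
2 3
4 4
5 6
end

quietly regress y x

* collect the names of all scalars that regress left in e()
local snames : e(scalars)
display "`snames'"
```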

--- **previous updates** ----------------------------------------------------------

See whatsnew6.

-------------------------------------------------------------------------------