
st: -defv- and -egen- [was: RE: RE: RE: RE: RE: old question, new solutions?]

From   "Nick Cox" <>
To   <>
Subject   st: -defv- and -egen- [was: RE: RE: RE: RE: RE: old question, new solutions?]
Date   Tue, 17 Nov 2009 11:17:00 -0000

Say rather that I am aware of -defv-. -defv- was one of several original
contributions to Stata made by John R. Gleason, who abandoned Stata for
R about 2000 or 2001. Bill Gould played around with -defv- (with John's
blessing) a few years ago on the list. I've not searched for the thread
or recalled the outcome. 

Last year there was some discussion on the list of similar approaches.
See -labgen- from SSC, which includes this partial summary in its help: 

"A first version of -labgen- was posted by Paul Lin to Statalist on 16
July 1996. NJC posted a revision on 18 July 1996. Alan H. Feiveson drew
my attention to it once more in Statalist postings on 25 September 2008.
-labgen- as published on SSC is revised and -labreplace- is new as of 13
October 2008.

Note that the :lblname syntax of -generate- is not supported.

Exceptionally, if the definition exceeds 80 characters in length, then
the definition is inserted in notes for the variable and the variable
label is a pointer to that effect.

For other commands in similar spirit, -search genl- and see Weesie
(1997), or -search defv- and see Gleason (1997, 1999)."

Gleason, J.R. 1997.  Defining variables and recording their definitions.
Stata Technical Bulletin 40: 9-10.  (STB Reprints 7: 48-49)

Gleason, J.R. 1999. Update to defv.  Stata Technical Bulletin 51: 2.
(STB Reprints 9: 14-15)

Weesie, J. 1997.  Automatic recording of definitions.  Stata Technical
Bulletin 35: 6-7.  (STB Reprints 6: 18-20)

I am not aware of any similar command that treats -egen- in the same
way. It could be done, but the programming would be painful, or at least
tedious. Whenever the idea occurs to me, I wait quietly until it goes away.

As the history shows, various people have played with the idea of
defining (by any major command, -generate-, -replace-, etc.) variables
in such a way that the variable label includes details of the
definition. That's a good principle, and the Stata world might be
slightly different if something like that had been made an automatic, or
at least default, side-effect of such a command. But it didn't happen.
Such side-effects are often considered sensible by users, and about as
often considered bad style by language designers. 
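
As a rough illustration of the principle, the definition can be recorded
by hand in the variable label and in a note. This is only a sketch: it
uses the auto data shipped with Stata, and the variable name gpm is an
illustrative choice, not something from the commands cited above.

```stata
* Sketch only: auto data and the name gpm are assumptions for illustration
sysuse auto, clear
generate gpm = 1 / mpg

* Record the definition where it will be seen later
label variable gpm "1 / mpg"
notes gpm: generate gpm = 1 / mpg

* Inspect what was recorded
describe gpm
notes list gpm
```

Wrappers in the spirit of -defv- or -labgen- automate this kind of
bookkeeping; the manual version just shows what is being stored.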

Even when three or four people played with the idea over a decade ago,
it was already too late to change many users' habits. Writing -labgen-
because it was something others wanted has had zero effect on my own.

Another good principle is that what most people want in their tables,
graphs and listings is intelligible prose free of Stataspeak such as
"group(nation firm contract) if year == 1999". That you do by
customising a variable label directly. 
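
A minimal sketch of that, assuming the auto data shipped with Stata in
place of the nation, firm and contract variables mentioned above (the
label text is likewise only an example):

```stata
* Sketch only: foreign and rep78 stand in for the grouping variables
sysuse auto, clear
egen group = group(foreign rep78)

* Replace the Stataspeak with prose meant for tables and graphs
label variable group "Origin by 1978 repair record"
describe group
```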


Hoffman, George

Thank you, Nick.
My personal habit of avoiding egen will have to change.....
Are you familiar with 'defv', which is a wrapper for gen/replace that
'self'-documents on the fly by adding notes about the variables
so created/modified? It does not work with egen, as far as I know.
Do you have another nifty command to document egen commands on the fly, or a
modification of defv that works with egen?

See defv:
      STB-51 dm50_1.  Update to defv.

      STB insert by John R. Gleason, Syracuse University
      After installation, see help defv

INSTALLATION FILES                                  (click here to

Nick Cox

These problems you mention are barely problems. 

You can compute group sds with -egen- in advance. That's one extra

You don't need to -collapse- at all. The price of not doing -collapse-
may be a bloated graph file, but you can use -egen, tag()- to select one
data point per group. 
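
The two steps above might be sketched as follows. This assumes the auto
data shipped with Stata, with mpg standing in for Y and rep78 for X; the
new variable names and the final -twoway- call are illustrative choices
rather than anything prescribed in this thread.

```stata
* Sketch only: mpg plays Y, rep78 plays X
sysuse auto, clear

* Group means and sds computed in advance, without -collapse-
egen mean_mpg = mean(mpg), by(rep78)
egen sd_mpg = sd(mpg), by(rep78)
generate upper = mean_mpg + sd_mpg
generate lower = mean_mpg - sd_mpg

* Tag one observation per group so each group is drawn once
egen tag = tag(rep78)

* Means +/- sd as capped spikes, with no line connecting the means
twoway rcap upper lower rep78 if tag || scatter mean_mpg rep78 if tag
```

The same tagged variables could be passed to other plotting commands;
the point is only that the summaries live alongside the original data.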

If you don't want to see the data points on -stripplot- you just specify
-ms(none)-. No special option is needed. There are examples in the help

Hoffman, George

Nick and Jose -- almost but not quite. 
Serrbar requires computing the sd/ci and collapse prior to invoking
Stripplot plots all the individual values in addition to means with the
bar option. Is there any way to suppress plotting the individual values
in stripplot?

'graph dot (mean) Y (p25) Y (p75) Y, over(X)' will compute stats
internally. It couldn't be too far to include sd's / ci's in stats.....?

Nick Cox

But personally I'd prefer -stripplot- from SSC. Check out its -bar-

Nick Cox 

The official command -serrbar- does precisely this. 

Hoffman, George

Has anyone found/written/adapted a program which will plot means +/- sd
for observations of Y over categories X, without drawing a connecting
line between means (as in grby)? The desired solution would be like
'graph box Y, over(X)' except that the box would be a single symbol with
upper and lower ranges for sd. Nick Cox's civplot is close, but
doesn't allow sd's.
I've gotten around this by using Joanne Garrett's 'predxat' but that
plots se's not sd's.

I understand some of the reasons why not to use this sort of display,
but it does occasionally become useful to display data.
