Re: st: Thoughts on automatic operations on labels and publication quality output


From: "Nick Cox" <n.j.cox@durham.ac.uk>
To: <statalist@hsphsun2.harvard.edu>
Subject: Re: st: Thoughts on automatic operations on labels and publication quality output
Date: Thu, 28 Aug 2003 19:36:37 +0100

Renzo Comolli wrote:

> I follow up here on "st: Towards publication quality output"
> http://www.stata.com/statalist/archive/2003-08/msg00413.html
> And I take the occasion to put down a few other thoughts/wishes

> Marcello Pagano wrote, "With Stata 8, there are often complaints about
> publication quality tables and regression outputs"

Strictly, this was Fred Wolfe. Marcello forwarded the post on behalf
of Fred.

> I agree. My personal experience is that I switched from beginner in Gauss
> to beginner in Stata because Stata promised much more in terms of output
> formatting (I know there are millions of other reasons, but that was it
> for me). A quantum leap in the output formatting would also be the natural
> complement to the strategy begun with the GUI and the graphics. Moreover,
> my own questions on this very list are mainly on formatting.

> About automatic operations on labels
> I am now used to it, but I find it "philosophically" (for lack of a
> better word) strange and counterintuitive that you can write
> . recode religion (1=2) (2=1)
> But that the labels remain the same. What purpose does this intended
> behavior serve?
> It would seem to me more intuitive to have the labels adjusted
> automatically, maybe with an option
> [, dontadjustlabels]

> If there are strong objections to this automatism as a default behavior,
> it should at least be available as an option.
> I imagine there are small hurdles to overcome (e.g. one label is attached
> to more variables...), but these are such small hurdles to overcome (e.g.
> automatically generate a second label only for the variable you recode)

I think that one reason for this behaviour, as you say, lies in the
key principle that a given set of value labels can be attached to
several variables.

Imagine that you have a bundle of variables for all of which the
appropriate labels are

. label def yesno 1 yes 0 no

Then someone says, "Actually for -foobar- the coding is the
wrong way round!"

Suppose then you leap for -recode- and it not only changes
the values of -foobar-, but also changes the labels. That is
definitely not what you want.

Of course, this situation could be met with your syntax, but
it hasn't been done that way, yet.
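
A minimal sketch of the shared-label case just described. The variable
names q1 and foobar and the four-observation dataset are made up for
illustration; only -recode- and -label- from official Stata are used.

```stata
* Hypothetical data: q1 and foobar share the value label yesno
clear
set obs 4
generate byte q1 = mod(_n, 2)
generate byte foobar = mod(_n, 2)
label define yesno 1 "yes" 0 "no"
label values q1 yesno
label values foobar yesno

* Swap the values of foobar only; the definition of yesno is not touched
recode foobar (0=1) (1=0)

* yesno still maps 1 to yes and 0 to no, for q1 and foobar alike
label list yesno
tabulate q1 foobar
```

Had -recode- rewritten yesno as well, the display of q1 would have been
silently reversed along with that of foobar.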

Perhaps the best way to summarize what seems to be at work
is to invoke a principle often mentioned in
Unix discussions: the best programs do just one thing well.
You want to recode? OK, use -recode-. You also want
to change labels? OK, use -label- as well. Of course,
a general recode-and-relabel command is also possible
and defensible; it just wasn't the road travelled. As usual,
there are trade-offs between simplicity and flexibility.

One qualification: in Stata 8 -recode- was revised a lot, and
it can now do more label changing, so long as you are
producing a new variable.
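
As a hedged illustration of that revised syntax, value labels can be
written inside the rules when -generate()- is specified. The new variable
name foreign2 is an arbitrary choice for this sketch.

```stata
sysuse auto, clear

* Labels inside the rules require generate(); the original variable
* and its value label are left as they were
recode foreign (0 = 1 "Domestic") (1 = 0 "Foreign"), generate(foreign2)

* Cross-check old against new coding
tabulate foreign foreign2
```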

> Same idea for things like
> . replace state=state+2

Is this the same idea?

> I know of the excellent labutils and I definitely thank Nick Cox very
> much for writing it (it saved me a lot of time).
> If I can add a little wish, it would be great if Nick were to find the
> time to write a wrapper around -labvalch-, let's call it "recodelab", that
> mimics -recode- in syntax, so that one could write

> . recode religion (1=2) (2=1)
> . recodelab religion (1=2) (2=1)

> This is just for my own private, selfish convenience, because I tend to
> forget the -from() to()- and the -swap()- syntax

That's a very nice compliment, so what I am going to say at first
is going to sound very ungracious.

No, I won't do this.

As it happens, I dislike -recode- and I almost never use it.
This is partly a chicken-and-egg situation in that I hardly ever
need something like it, but also a personal reaction to its syntax,
which I find fiddly, very difficult to remember and fairly easy
to get wrong. (I am not sure that -labvalch- is much better!)
That's not important, even to me -- and there will
be Stata users who use -recode- a lot and find it indispensable -- but
it is sufficient reason for me not to be tempted at all by this
suggestion. (There are three reasons to write a program: (1) fun;
(2) you want to use it; (3) money.)

However, what I have done recently, and I will ask Kit Baum to
put it on SSC, is to write a -vreverse- program, which reverses
a categorical variable and changes the labels on the fly. (It
doesn't overwrite the original variable; it generates a new
one.) Here's the idea:

. sysuse auto
(1978 Automobile Data)

. tab foreign

   Car type |      Freq.     Percent        Cum.
------------+-----------------------------------
   Domestic |         52       70.27       70.27
    Foreign |         22       29.73      100.00
------------+-----------------------------------
      Total |         74      100.00

. tab foreign, nola

   Car type |      Freq.     Percent        Cum.
------------+-----------------------------------
          0 |         52       70.27       70.27
          1 |         22       29.73      100.00
------------+-----------------------------------
      Total |         74      100.00

. vreverse foreign, gen(Foreign)

. tab Foreign

   Car type |      Freq.     Percent        Cum.
------------+-----------------------------------
    Foreign |         22       29.73       29.73
   Domestic |         52       70.27      100.00
------------+-----------------------------------
      Total |         74      100.00

. tab Foreign, nola

   Car type |      Freq.     Percent        Cum.
------------+-----------------------------------
          0 |         22       29.73       29.73
          1 |         52       70.27      100.00
------------+-----------------------------------
      Total |         74      100.00

This is the single kind of recoding I want most
frequently: that an ordered (could be binary)
categorical variable is in some sense the wrong
way round, so I want to flip low and high, and
carry the labels too. Perhaps it is what Renzo
had in mind with his religion example (two values?).
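
For comparison, a sketch of how the same result could be reached by hand
with official commands only. This is not the code of -vreverse-; it
assumes a 0/1 variable, and the value label name reversed is made up for
the example.

```stata
sysuse auto, clear

* Flip a 0/1 variable into a new variable
generate byte Foreign = 1 - foreign

* Define a new value label carrying the flipped labels; the label
* already attached to foreign is left alone
label define reversed 0 "Foreign" 1 "Domestic"
label values Foreign reversed
label variable Foreign "Car type"

tabulate Foreign
tabulate Foreign, nolabel
```

For a variable with more than two categories the arithmetic and the label
definitions would need generalizing, which is the bookkeeping a dedicated
command saves.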

Nick
n.j.cox@durham.ac.uk

