Statalist



Re: Re: st: RE: Labeling values of variables


From   n j cox <n.j.cox@durham.ac.uk>
To   statalist@hsphsun2.harvard.edu
Subject   Re: Re: st: RE: Labeling values of variables
Date   Fri, 15 Feb 2008 19:34:14 +0000

The variable names come from the stub specified in the -generate()- option which is passed to -tabulate, gen()- which adds suffixes 1, 2 and so forth (provided that the names implied are all new and legal).

The variable labels come from the value labels produced by -egen, group()- with the -label- option. In the program I delete the stuff that -tabulate- would otherwise add as prefix.
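
To make those two steps concrete, here is a minimal sketch of doing by hand what the program does. The toy data, the variable name grp and the stub frog are made up for illustration; the hard-coded 4 is simply the number of groups in this toy data.

```stata
* toy data: hypothetical values, for illustration only
clear
input relation sex
0 0
1 0
1 1
2 1
end
label define relation 0 "single" 1 "married" 2 "divorced"
label define sex 0 "female" 1 "male"
label values relation relation
label values sex sex

* step 1: one group per observed combination, with value labels attached
egen grp = group(relation sex), label

* step 2: -tabulate, gen()- names the dummies frog1, frog2, ...
tabulate grp, generate(frog)

* step 3: remove the "grp==" prefix that -tabulate- puts in each variable label
forvalues i = 1/4 {
    local lbl : variable label frog`i'
    local lbl : subinstr local lbl "grp==" ""
    label variable frog`i' `"`lbl'"'
}
describe frog*
```

The program does the same thing, except that it uses a temporary variable in place of grp and works out the number of groups itself.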

The program I posted was, as described, a quick hack, and does not purport to be an industrial strength dummy generation program with enough options to satisfy a variety of tastes. Such a program would probably be 10-20 times longer and take much longer to write.

What could be done is to extend the -generate()- option so that it takes either a list of new variable names or a stub. That should not be too difficult. The user wanting names like yours would then have to spell them all out. One possible problem, however, is that things get messy unless you specify variable names in exactly the order that the program would produce them. That really could be a nightmare, as the consequences of something having the wrong name could be serious for understanding.
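
A rough sketch of how that extension might look, not something I have tested in anger: the helper program resolvenames and its n() option are hypothetical and are not part of the program I posted. The helper treats a single word as a stub and anything longer as a full list of names.

```stata
* hypothetical helper: turn generate() into a list of n new variable names
program resolvenames, rclass
    version 8.2
    syntax , Generate(str) n(integer)
    local nnames : word count `generate'
    if `nnames' == 1 {
        * one word: treat it as a stub and add suffixes 1, 2, ...
        forvalues i = 1/`n' {
            local names `names' `generate'`i'
        }
    }
    else if `nnames' == `n' {
        * n words: treat them as the names, in group order
        local names `generate'
    }
    else {
        di as err "generate() must be a stub or exactly `n' names"
        exit 198
    }
    * stop with an error if any name is illegal or already in use
    confirm new variable `names'
    return local names `names'
end
```

After something like -resolvenames, gen(a b c d) n(4)- the names would be in r(names). Note that a single word is always taken as a stub, even when there is only one group, and that nothing here guards against names given in the wrong order.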

What could be tried is that the program might automatically look at the value labels and work out informative variable names. That would barf on value labels that contained spaces or characters not allowed in variable names or if the implied variable name was too long to be legal. No doubt there are partial work-arounds or the program could just fall back on stub plus integers if the variable name would not work.
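
A minimal sketch of that idea with the fall-back, using the auto data shipped with Stata rather than your variables. The names grp and dummy are arbitrary, and replacing spaces by underscores is just one possible rule, assumed here for illustration.

```stata
sysuse auto, clear
egen grp = group(foreign rep78), label
su grp, meanonly
local n = r(max)

forvalues i = 1/`n' {
    * value label text for group i, e.g. "Domestic 3"
    local lbl : label (grp) `i'
    * candidate name: spaces replaced by underscores
    local try = subinstr(`"`lbl'"', " ", "_", .)
    * fall back on stub plus integer if the candidate is not a legal new name
    capture confirm new variable `try'
    if _rc local try dummy`i'
    gen byte `try' = grp == `i' if !missing(grp)
    label variable `try' `"`lbl'"'
}
```

Anything more ambitious, such as stripping punctuation or abbreviating long labels, would need more code than this.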

I don't have ambitions under any of these headings. If you or anybody else wanted to take over that code and do it properly, that would be fine by me.

Sometimes of course just doing it directly is just as easy:

gen byte demograph_married_male = cond(missing(married, male), ., (married == 1 & male == 1))

Otherwise it's a deep programming truth that if you want something customised to your precise desires you may have to write it yourself.

Nick

Peter Dijkstra
==============

Thanks Nick, it works fine!
I tried to understand what is going on in this program you wrote, but I cannot seem to find out where the variable names and labels are created - I guess I do not know enough of writing programs in Stata. What if I want to create the variable names "demograph_married_male" or label "married, male"?

Nick Cox
========

> One answer is to form a composite by
>
> egen group = group(relation sex), label
>
> and then form your own dummies using -tabulate, gen()-.
> But that still leaves rather ugly looking variable labels.
> You could in turn fix those with -labvarch- from -labutil- from SSC.
>
> That is all getting rather complicated. Here is a quick hack to
> do it in one. Note that
>
> nicelylabelleddummies varname, gen(frog)
>
> will produce the dummies for varname
>
> while
>
> nicelylabelleddummies var1 var2, gen(toad)
>
> will produce the dummies for var1*var2, but it won't produce
> dummies for var1 and var2 separately.
>
> *! NJC 1.0.0 14 Feb 2008
> program nicelylabelleddummies
>     version 8.2
>     syntax varlist [if] [in], Generate(str)
>     marksample touse, strok
>     qui count if `touse'
>     if r(N) == 0 error 2000
>
>     * one group per observed combination, with value labels attached
>     tempvar group
>     qui egen `group' = group(`varlist') if `touse', label
>     su `group', meanonly
>     local nvars = r(max)
>
>     * check that every name stub1, stub2, ... is new and legal
>     forval i = 1/`nvars' {
>         capture confirm new variable `generate'`i'
>         if _rc {
>             di "`generate'`i' not acceptable as new varname"
>             exit _rc
>         }
>     }
>
>     qui tab `group' if `touse', gen(`generate')
>
>     * strip the prefix that -tabulate- adds to each variable label
>     forval i = 1/`nvars' {
>         local label : var label `generate'`i'
>         local label : subinstr local label "`group'==" ""
>         label var `generate'`i' `"`label'"'
>     }
> end

Peter Dijkstra
==============

> I like to have sensible value labels of variables, and use
> label define relation 0 "single" 1 "married" 2 "divorced"
> label define sex 0 "female" 1 "male"
> in StataSE 8.2. However, when using
> xi i.relation * i.sex
> the labels automatically become "relation==1 & sex==1", "relation==2 &
> sex==1". How do I obtain labels which say "married male" and "divorced
> male"?
