Notice: On March 31, it was announced that Statalist is moving from an email list to a forum. The old list will shut down on April 23, and its replacement, statalist.org is already up and running.


Re: st: Useful labelling of dummy variables following logit


From   Nick Cox <njcoxstata@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: Useful labelling of dummy variables following logit
Date   Wed, 24 Aug 2011 12:57:26 +0100

I don't understand your comment on

"I'm not sure this would follow through to the table"

Once your variables are named, you may use them directly.

Using the last word of the labels with -dummieslab foreign- would give
names "car" and "car", which you wouldn't want, quite apart from it
being illegal. This is not a case of -dummieslab- not working very
well, but a case of a request being rejected as a bad idea. Not doing
something that makes no sense is a feature, not a limitation.
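
As a rough sketch of a request on the same example that should go through: the first words of the two labels, "Domestic" and "Foreign", are distinct, so word(1) implies no duplicate names. This assumes -dummieslab- has been installed from SSC; the names mentioned in the comments are what the help text implies, not checked output.

```stata
* Assumes the user-written package is present: ssc install dummieslab
sysuse auto, clear
label define newfor 0 "Domestic car" 1 "Foreign (European or Japanese) car"
label values foreign newfor

* First words are "Domestic" and "Foreign", so no duplicates are implied
dummieslab foreign, word(1)

* The help documents r(names) as the list of names of created dummies
display "`r(names)'"
```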

-dummieslab- goes back to 2003; there are official Stata functions
that might make it more versatile if rewritten now. It's certainly not
the last word on the subject.

Despite being a co-author of -dummieslab-, I would tend to do what
Maarten suggests: invent variable names on the fly that are as
meaningful as possible and edit a document retrospectively, replacing
"_" with " " or "-", etc.

As I guess is clear, Stata can't use variable labels here as there is
just not enough space for labels that could be up to 80 characters
long; the alternative of using abbreviations could also be highly
problematic.
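
A minimal sketch of the hand-made route on the auto data: labels will not appear in the coefficient table, but variable names will. The name good_repair and the cutoff rep78 >= 4 are invented purely for illustration and are not from this thread.

```stata
sysuse auto, clear

* Name the dummy so that the output explains itself;
* "if !missing()" keeps cars with a missing repair record missing
generate byte good_repair = (rep78 >= 4) if !missing(rep78)

* The name good_repair now appears directly in the output table
logit foreign good_repair mpg, or
```

Underscores in names like this one can be replaced by spaces or hyphens when editing the final document.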

It may be that there is a solution with the well-used user-written
programs -esttab-, -outreg-, -outreg2- (all SSC), but I can't advise
as I've never used any of them. The Stata world divides into those who
use such programs virtually all the time and those who never use them,
with not much middle ground, or so I suspect.

Nick

On Wed, Aug 24, 2011 at 12:22 PM, Tim Evans <Tim.Evans@wmciu.nhs.uk> wrote:
> Hi Maarten and Nick,
>
> Thanks for your help. I guess what I'm after is the labelling of the dummy variables when printed in the output table. I couldn't get dummieslab to work very well on the example using the auto data - I got this error message:
>
> sysuse auto
> (1978 Automobile Data)
> label define newfor 0 "Domestic car" 1 "Foreign (European or Japanese) car"
> label values foreign newfor
> dummieslab foreign
> dummieslab foreign, word(1)
> dummieslab foreign, word(-1)
>
> implied variable names contain duplicates
> r(498);
>
> Even after this works, I'm not sure this would follow through to the table.
>
> As a quick workaround, -label list- would work; however, if I'm running multiple regressions, this may not be ideal.
>
> Best wishes
>
> Tim
>
>
>
> -----Original Message-----
> From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Nick Cox
> Sent: 24 August 2011 11:43
> To: statalist@hsphsun2.harvard.edu
> Subject: Re: st: Useful labelling of dummy variables following logit
>
> In addition to Maarten's excellent advice, see also -dummieslab- from SSC:
>
> Generating dummy variables from categorical variable using value label names
>
>        dummieslab varname [if exp] [in range]
>                 [, word(integer) from(string) to(string) template(string)
>                 truncate(integer) novarlabel ]
>
>
> Description
>
>    dummieslab generates a set of dummy variables from a categorical variable.
>    One dummy variable is created for each level of the original variable.
>    Names for the dummy variables are derived from the value labels of the
>    categorical variable. (Raw (unlabelled) values are used if the categorical
>    variable has no value labels attached.)
>
>    Two different behaviours can be chosen for the variable names: (i) use
>    full value labels; (ii) use the sth word of the label. In both cases, all
>    invalid characters are stripped from the new variable names.
>
>    Any user-defined prefix and/or suffix can be added using the template
>    option.
>
>    In all cases, no new variable will be generated unless all implied new
>    names are valid.
>
>    dummieslab applied to variables with no label appends the level to the
>    original variable name (very much like what tabulate does).
>
>
> Options
>
>    word(s) requests that the sth word of the label be used as the new
>        variable name. Note the use of word(-1) to specify the last word of
>        the label.
>
>    from(string) and to(string) are used together to make replacements to the
>        strings used to create the new variables. from(string) contains a
>        list of words to be replaced by the list of words supplied in
>        to(string), i.e. the first item in from is substituted by the first
>        item in to, the second item in from is substituted by the second item
>        in to, etc.  By default, all invalid characters are dropped from the
>        value labels to create new variable names. This behaviour can be
>        overridden by the use of from(string) and to(string). For example,
>        use from(" ") and to("_") to replace all blanks by underscores.
>
>    template(word) specifies a template for the new variable name. @ is used
>        as a placeholder for inserting the extracted label. This option is
>        used to insert a prefix (anything before @ in word) and/or a suffix
>        (anything after @ in word).
>
>    truncate(n) truncates new variable names after n characters.
>
>    novarlabel prevents automated variable labelling of the generated dummies.
>
>
> Saved results
>
>    local
>      r(names)   List of names of created dummies
>       r(from)   Name of the original categorical variable
>
>
> Examples
>
>    . sysuse auto
>    . label define newfor 0 "Domestic car" 1 "Foreign (European or Japanese) car"
>    . label values foreign newfor
>    . dummieslab foreign
>    . dummieslab foreign, word(1)
>    . dummieslab foreign, word(-1)
>    . dummieslab foreign, from(" ") to("_")
>    . dummieslab foreign, from(car or Foreign) to("" "_" "")
>    . dummieslab foreign, from(car Foreign or) to("" "" "_")
>    . dummieslab foreign, word(1) template("My_@_car")
>
>
> Acknowledgments
>
>    Patrick Joly made helpful suggestions on the first version of dummieslab,
>    which led to the addition of the from and to options. Daniel Klein
>    suggested option novarlabel.
>
>
> Authors
>
>    Philippe Van Kerm, CEPS/INSTEAD, Differdange, G.-D. Luxembourg
>    philippe.vankerm@ceps.lu
>
>    Nicholas J. Cox, Durham University, U.K.
>    n.j.cox@durham.ac.uk
>
>
> Also see
>
>    On-line:  tabulate
>    On-line (if installed):  dummies
>
>
>
> On Wed, Aug 24, 2011 at 11:30 AM, Maarten Buis <maartenlbuis@gmail.com> wrote:
>> On Wed, Aug 24, 2011 at 12:12 PM, Tim Evans wrote:
>>> I'm running a logistic regression analysis (logit) in Stata 11.2 and capturing the output in a log file to which I intend to refer even when not using Stata. However, the dummy variables in the table output are not user friendly, in that I need to be looking at Stata to decode them, and I wanted to know whether there was a way to get Stata to label up the dummy variables. I'm using the following command:
>>>
>>> xi: logit Early1  i.eth2 age i.invsurg i.region i.dep if dep!=9 & sex==2, or
>>
>> If you are not going to use post-estimation commands like -margins-
>> then I would just create the dummies myself, that way I have complete
>> control over how they are named. This is what I used to do in Stata <
>> 11, I hardly ever used -xi-.
>>
>> If you want to use post-estimation commands like -margins-, I would
>> leave out -xi- but otherwise leave the command unchanged thus using
>> the factor variable notation, see -help fvvarlist-. The output will be
>> a bit clearer, but it will still not contain labels. You could use
>> -label list- below your regression to add a "legend" below your
>> output.
>>
>
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
>
