Re: st: 'Re-ordering' the labels of a variable


From   Phil Schumm <pschumm@uchicago.edu>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: 'Re-ordering' the labels of a variable
Date   Thu, 25 Mar 2010 21:11:57 -0500

On Mar 25, 2010, at 1:16 PM, Nick Cox wrote:
> 3. Although -sencode- is undeniably useful, and its existence underlines a view elsewhere expressed that -encode- should be revisited by StataCorp to make it more comprehensive, I see nothing here that could not also be achieved directly with -encode-.


Nick is, of course, correct; by using the -label()- option, you have complete control over how -encode- assigns integers when encoding a string variable. This appears to have been what the OP was asking about.
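As a minimal sketch of that approach (the variable answer, the new variable answer_num, and the label name yesno are made up for illustration), you can define the value label first and then hand it to -encode-:

```stata
* Hypothetical data: a string variable to be encoded
clear
input str3 answer
"yes"
"no"
"yes"
end

* Define the desired mapping before encoding (label name is illustrative)
label define yesno 0 "no" 1 "yes"

* -encode- reuses the existing mapping; -noextend- makes it stop with an
* error if a string is not in the label, rather than adding a new code
encode answer, generate(answer_num) label(yesno) noextend

* Inspect the mapping and the underlying integers
label list yesno
tabulate answer_num, nolabel
```

Without -noextend-, -encode- would extend the label for any string it does not find there, so the option is a cheap guard against typos in the data.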

There is, however, a related problem which occurs frequently: You have a categorical variable which is already encoded, but you want to change the mapping. For example, you might have a numeric variable called myvar with corresponding value label

myvar:
           1 yes
           2 no

and you want to change this so that the response "no" is represented by 0 instead of 2 (e.g., so that you can use Boolean operators). Or, you might have a variable encoded as follows:

mylab:
           1 a lot
           2 some
           3 a little
           4 not at all

and you want to reverse this so that the order of the integers reflects the natural ordering of the responses (e.g., so that when you fit an ordinal regression model, your coefficients have a straightforward interpretation). Although you can often avoid these problems when creating your own dataset, those who work with secondary datasets in which variables come pre-encoded often wish to change the default encoding to facilitate analysis and interpretation. Many (if not most) datasets available in social science archives contain encoded variables.
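To make the second case concrete, here is a rough sketch of doing the reversal by hand with -recode- and -label define, modify-. The toy data and the variable name myvar are assumptions for illustration; only the label mylab and its four responses come from the example above.

```stata
* Hypothetical data: a 4-point item stored as 1..4 with value label mylab
clear
input byte myvar
1
2
3
4
end
label define mylab 1 "a lot" 2 "some" 3 "a little" 4 "not at all"
label values myvar mylab

* Step 1: reverse the integer codes (each rule applies to the original value)
recode myvar (1 = 4) (2 = 3) (3 = 2) (4 = 1)

* Step 2: rewrite the value label so the text follows the new codes
label define mylab 1 "not at all" 2 "a little" 3 "some" 4 "a lot", modify

* Check the result against the original mapping
tabulate myvar
label list mylab
```

Note that the two steps have to be kept in agreement manually; Stata will not complain if the -recode- rules and the revised label disagree.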

Now, I often see people attack this problem with a combination of -recode- and -label define, modify-. This is a pain, leads to code that is difficult to read, and, most importantly, is error-prone. An alternative solution is provided by a command I wrote called -re2lab- (for "recode to label"), which can be obtained via

    net install re2lab, from(http://rcg-software.uchicago.edu/stata)

This comes with a help file explaining its features and usage. The goal was to simplify the process of re-encoding variables, and to reduce the likelihood of making a mistake when doing so.


-- Phil
