Statalist



RE: st: Re: -encode- help..


From   "Nick Cox" <n.j.cox@durham.ac.uk>
To   <statalist@hsphsun2.harvard.edu>
Subject   RE: st: Re: -encode- help..
Date   Wed, 19 Nov 2008 20:25:07 -0000

This just isn't correct. Stata will willingly add to a set of value
labels. You do _not_ need to start with the union of all possible
values. You do _not_ need to change your data structure, even
temporarily. It really is not as difficult as implied.

. list

     +---------------------------+
     | var1   var2   var3   var4 |
     |---------------------------|
  1. |    a      b      c      d |
  2. |    a      b      c      d |
  3. |    a      b      c      d |
  4. |    a      b      c      d |
     +---------------------------+

. encode var1, gen(evar1)

. encode var2, gen(evar2) label(evar1)

. encode var3, gen(evar3) label(evar1)

. encode var4, gen(evar4) label(evar1)

. l

     +-----------------------------------------------------------+
     | var1   var2   var3   var4   evar1   evar2   evar3   evar4 |
     |-----------------------------------------------------------|
  1. |    a      b      c      d       a       b       c       d |
  2. |    a      b      c      d       a       b       c       d |
  3. |    a      b      c      d       a       b       c       d |
  4. |    a      b      c      d       a       b       c       d |
     +-----------------------------------------------------------+

. l, nola

     +-----------------------------------------------------------+
     | var1   var2   var3   var4   evar1   evar2   evar3   evar4 |
     |-----------------------------------------------------------|
  1. |    a      b      c      d       1       2       3       4 |
  2. |    a      b      c      d       1       2       3       4 |
  3. |    a      b      c      d       1       2       3       4 |
  4. |    a      b      c      d       1       2       3       4 |
     +-----------------------------------------------------------+

In this example, each variable contributes a value not yet in the
label, and Stata simply adds it to the existing set of value labels.
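
With many variables, the same calls can go in a loop. What follows is
only a minimal sketch built on the example data above; the label name
-shared- and the e prefix for the new variables are arbitrary choices
made for illustration, not anything from the thread.

```stata
* Recreate the small example dataset shown above
clear
input str1 var1 str1 var2 str1 var3 str1 var4
a b c d
a b c d
a b c d
a b c d
end

* The first pass creates the value label "shared" (a made-up name);
* each later pass reuses it and adds any value not yet present
foreach v of varlist var1-var4 {
    encode `v', gen(e`v') label(shared)
}

* One label now holds every value met so far
label list shared
```

The order of the variables in the loop decides which value gets which
integer, so the numbering follows order of first appearance rather
than alphabetical order across all variables.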
 
It is true that the labels are not guaranteed to be in alphabetical
order when you do it in sequence. 

(Start with -var4-, for example, and "d" will be encoded 1.) 

Perhaps Sergiy has that in mind as a feature of a good encoding, but it
is not logically essential. 

Nick 
n.j.cox@durham.ac.uk 

Sergiy Radyakin wrote:

This will not work when one of the variables does not contain
all the codes present in the other variables.

(The table shows unique values only, sorted alphabetically, as Stata
does it.)

Var1   Var2   Var3
A      B      A
B      C      B
C

After encoding, Var1 will be coded as 1=A, 2=B, 3=C; Var2: 1=B, 2=C;
Var3: 1=A, 2=B, which is in all probability not what moleps islon
wanted.

In general one must find the union of all possible values, then encode.
In practice this is probably easiest to solve by reshaping the data to
the long format, encoding the single string variable, and then
reshaping back. If there are only a few values to label, and those are
known a priori, I would hardwire them into the program and label
manually.
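
The hardwired route might look like the sketch below. It assumes the
three variables from the table above and that A, B and C are the only
values that occur; the label name -codes- and the e prefix are made up
for illustration. Defining the label by hand also fixes the numbering,
so the codes can start at 0, and -noextend- makes -encode- stop with an
error instead of quietly adding a value that was not anticipated.

```stata
* Rebuild the three variables from the table above
clear
input str1 Var1 str1 Var2 str1 Var3
"A" "B" "A"
"B" "C" "B"
"C" ""  ""
end

* Hardwire the mapping; "codes" is a hypothetical label name
label define codes 0 "A" 1 "B" 2 "C"

* Every variable is encoded against the same predefined label;
* empty strings become numeric missing
foreach v of varlist Var1-Var3 {
    encode `v', gen(e`v') label(codes) noextend
}

list, nolabel
```

Because the label exists before the first -encode-, every variable gets
the same integer for the same string, whichever values it happens to
contain.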

Best regards,
   Sergiy Radyakin

On Wed, Nov 19, 2008 at 2:52 PM, Martin Weiss <martin.weiss1@gmx.de>
wrote:
> Well, if they share the same values, then the -encode- will lead to a
> redundancy because technically you would need only one -label- so that
>
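> **************
> * sex is a string variable in this example dataset
> webuse hbp2, clear
>
> * make five copies to stand in for several string variables
> forv i = 1/5 {
>     clonevar sex`i' = sex
> }
>
> * the first -encode- creates the value label gender
> encode sex, g(gender)
>
> ds sex?
>
> * the later calls reuse that label through the label() option
> foreach var in `r(varlist)' {
>     encode `var', g(gender`var') l(gender)
> }
>
> desc
> *****************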
>
> you can reuse it with the -label- option to -encode-...
>
>
> HTH
> Martin
> _______________________
> ----- Original Message ----- From: "moleps islon" <moleps2@gmail.com>
> To: <statalist@hsphsun2.harvard.edu>
> Sent: Wednesday, November 19, 2008 8:30 PM
> Subject: st: -encode- help..
>
>
>> I've got 30 different text variables that all have the same possible
>> values. Is there an easy way to encode all 30 variables using the same
>> label or do I have to do it manually? Also, is it possible, somehow, to
>> tell Stata to start encoding with the value 0 instead of 1?
