Statalist


RE: st: RE: help with encode


From   "Nick Cox" <n.j.cox@durham.ac.uk>
To   <statalist@hsphsun2.harvard.edu>
Subject   RE: st: RE: help with encode
Date   Fri, 24 Jul 2009 12:34:27 +0100

The explanation is simple. "C" and "S" are not included in your -label-
definition. So, -encode- sees no labels of "C" and "S". So, it has to go
beyond your -label- definition. 

You want -encode- to understand that "C" and "Control" mean the same as
far as you are concerned, and similarly "S" and "Salmon", but you
provide no information for Stata to know that. Otherwise put, Stata is
totally literal here in handling strings. "C" is not equal to "Control".
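
A minimal sketch of how to see this for yourself, on a toy dataset (the
three observations are made up for illustration and are not the poster's
data). It relies on the -noextend- option of -encode-, which makes
-encode- stop with an error rather than add to the label when it meets a
string that is not already defined in the label:

```stata
* Toy data standing in for the poster's variable (values are assumed)
clear
input str1 fish
"C"
"S"
"C"
end

label define fish 0 "Control" 1 "Salmon"

* With noextend, -encode- refuses to go beyond the existing definition;
* -capture noisily- shows the error message without stopping a do-file
capture noisily encode fish, generate(fish1) label(fish) noextend

* Without noextend, the label is extended with new codes for "C" and "S"
encode fish, generate(fish1) label(fish)
label list fish
```

After the second -encode-, -label list- should show "C" and "S" added
after the existing entries, which is consistent with the 2s and 3s in the
tabulation quoted below.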

What you want can be achieved with extra lines, e.g.  

replace fish = "Control" if fish == "C"
replace fish = "Salmon" if fish == "S" 
label def fish 0 "Control" 1 "Salmon"
encode fish, gen(fish1) label(fish)
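
If you would rather leave the string variable untouched, one alternative
is to build the 0/1 variable directly and attach the label afterwards.
This is only a sketch, again on made-up toy data; the name fish01 and the
stray value "X" are hypothetical and used purely for illustration:

```stata
* Toy data (assumed values); "X" stands in for any unexpected string
clear
input str1 fish
"C"
"S"
"C"
"X"
end

* 1 for Salmon, 0 for Control, missing for anything else
generate byte fish01 = (fish == "S") if inlist(fish, "C", "S")
label define fish01 0 "Control" 1 "Salmon"
label values fish01 fish01

* Check the mapping against the original strings
tabulate fish fish01, missing
```

The -if inlist()- qualifier means that unexpected strings end up as
missing rather than being silently coded 0.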

Nick 
n.j.cox@durham.ac.uk 


-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu
[mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Miranda Kim
Sent: 24 July 2009 12:28
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: RE: help with encode

Thank you for your input. As you say, there are many different ways to 
make this work and so I have bypassed using encode.
I am grateful for the many suggestions, and I apologize if my initial 
question was not laid out clearly enough.
Many thanks,
Miranda
ps: here is what I got trying to use encode:
. tab1 fish

-> tabulation of fish 

       fish |      Freq.     Percent        Cum.
------------+-----------------------------------
          C |         56       50.45       50.45
          S |         55       49.55      100.00
------------+-----------------------------------
      Total |        111      100.00

. desc fish

              storage  display     value
variable name   type   format      label      variable label
--------------------------------------------------------------------------------------
fish            str1   %1s                   

. label def fish 0 "Control" 1 "Salmon"

. encode fish, gen(fish1) label(fish)

. tab1 fish1, nolabel

-> tabulation of fish1 

      fish1 |      Freq.     Percent        Cum.
------------+-----------------------------------
          2 |         56       50.45       50.45
          3 |         55       49.55      100.00
------------+-----------------------------------
      Total |        111      100.00

Michael Hanson wrote:
> If that really is the situation you're in -- all encoded values are 
> 2's and 3's -- then why not simply linearly transform the encoded 
> variables to the values that you want?  That is, type:
>
> replace gender = gender - 2
>
> (Subtract 1 if you had 1's and 2's as mentioned in your original
> post.)
>
> However, I suspect something else is going on.  My expectation is that
> you have more than just two string values ("m" and "f") in your
> series.  Can you provide to the list the output of -table female-?  I 
> suspect that if you try Nick's -tab ...- command shown below, you 
> would also find unexpected values.
>
> However, these are all conjectures.  You would likely resolve your 
> situation much more quickly (and waste fewer people's time on the list
> in the process) if you followed the guidance clearly laid out in the
> Statalist FAQ:
>
> 3.3 Stata references in your question
> Say exactly what you typed and exactly what Stata typed (or did) in 
> response. N.B. exactly!
>
> Why not at least show us what actually happened?  In this case, copy 
> and paste from the results window the *exact* -encode- command and 
> resulting Stata output.  Then copy and paste whatever command led you
> to conclude that you were getting 2's and 3's, and the *exact* and 
> *complete* output from that command.  (In consideration to Statalist 
> readers, please don't provide us with a lengthy -list- output.  Use 
> -table- instead, as suggested above.)
>
> Hope this helps,
> Mike
>
>
> On Jul 24, 2009, at 4:09 AM, Miranda Kim wrote:
>
>> I tried this but couldn't make it work, as it then automatically 
>> encodes the variables with 2's and 3's...
>>
>> Nick Cox wrote:
>>> In addition to other answers the direct answer to the second 
>>> question is
>>> "Yes":
>>> label def female 0 "m" 1 "f"
>>> encode gender, gen(female) label(female)
>>>
>>> It would do no harm to check on any missings:
>>> tab gender if !inlist(female, 0, 1)
>>> Nick
>>> n.j.cox@durham.ac.uk
>>>
>>> Miranda Kim
>>>
>>> How can I efficiently convert string variables (such as gender with 
>>> values 'f' 'm') into binary 0/1 variables?
>>> Can I fiddle with encode so that it codes 0/1 instead of 1/2?



