
From: "Nick Cox" <n.j.cox@durham.ac.uk>
To: <statalist@hsphsun2.harvard.edu>
Subject: RE: st: RE: help with encode
Date: Fri, 24 Jul 2009 12:34:27 +0100

The explanation is simple. "C" and "S" are not included in your -label- definition. So, -encode- sees no labels of "C" and "S". So, it has to go beyond your -label- definition. You want -encode- to understand that "C" and "Control" mean the same as far as you are concerned, and similarly "S" and "Salmon", but you provide no information for Stata to know that. Otherwise put, Stata is totally literal here in handling strings. "C" is not equal to "Control".

What you want can be achieved with extra lines, e.g.

    replace fish = "Control" if fish == "C"
    replace fish = "Salmon" if fish == "S"
    label def fish 0 "Control" 1 "Salmon"
    encode fish, gen(fish1) label(fish)

Nick
n.j.cox@durham.ac.uk

-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Miranda Kim
Sent: 24 July 2009 12:28
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: RE: help with encode

Thank you for your input. As you say, there are many different ways to make this work and so I have bypassed using encode. I am grateful for the many suggestions, and I apologize if my initial question was not laid out clearly enough.

Many thanks,
Miranda

ps: here is what I got trying to use encode:

    . tab1 fish

    -> tabulation of fish

           fish |      Freq.     Percent        Cum.
    ------------+-----------------------------------
              C |         56       50.45       50.45
              S |         55       49.55      100.00
    ------------+-----------------------------------
          Total |        111      100.00

    . desc fish

                  storage  display     value
    variable name   type   format      label      variable label
    --------------------------------------------------------------------------------------
    fish            str1   %1s

    . label def fish 0 "Control" 1 "Salmon"

    . encode fish, gen(fish1) label(fish)

    . tab1 fish1, nolabel

    -> tabulation of fish1

          fish1 |      Freq.     Percent        Cum.
    ------------+-----------------------------------
              2 |         56       50.45       50.45
              3 |         55       49.55      100.00
    ------------+-----------------------------------
          Total |        111      100.00

Michael Hanson wrote:
> If that really is the situation you're in -- all encoded values are
> 2's and 3's -- then why not simply linearly transform the encoded
> variables to the values that you want? That is, type:
>
> replace gender = gender - 2
>
> (Subtract 1 if you had 1's and 2's as mentioned in your original post.)
>
> However, I suspect something else is going on. My expectation is that
> you have more than just two string values ("m" and "f") in your
> series. Can you provide to the list the output of -table female-? I
> suspect that if you try Nick's -tab ...- command shown below, you
> would also find unexpected values.
>
> However, these are all conjectures. You would likely resolve your
> situation much more quickly (and waste fewer people's time on the list
> in the process) if you followed the guidance clearly laid out in the
> Statalist FAQ:
>
> 3.3 Stata references in your question
> Say exactly what you typed and exactly what Stata typed (or did) in
> response. N.B. exactly!
>
> Why not at least show us what actually happened? In this case, copy
> and paste from the results window the *exact* -encode- command and
> resulting Stata output. Then copy and paste whatever command led you
> to conclude that you were getting 2's and 3's, and the *exact* and
> *complete* output from that command. (In consideration to Statalist
> readers, please don't provide us with a lengthy -list- output. Use
> -table- instead, as suggested above.)
>
> Hope this helps,
> Mike
>
>
> On Jul 24, 2009, at 4:09 AM, Miranda Kim wrote:
>
>> I tried this but couldn't make it work, as it then automatically
>> encodes the variables with 2's and 3's...
>>
>> Nick Cox wrote:
>>> In addition to other answers the direct answer to the second
>>> question is
>>> "Yes":
>>> label def female 0 "m" 1 "f"
>>> encode gender, gen(female) label(female)
>>>
>>> It would do no harm to check on any missings:
>>> tab gender if !inlist(female, 0, 1)
>>> Nick
>>> n.j.cox@durham.ac.uk
>>>
>>> Miranda Kim
>>>
>>> How can I efficiently convert string variables (such as gender with
>>> values 'f' 'm') into binary 0/1 variables?
>>> Can I fiddle with encode so that it codes 0/1 instead of 1/2?

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
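For anyone who wants to reproduce the behaviour discussed in this thread, here is a minimal sketch in Stata. The four-observation dataset and the names fish_demo, fish_bad and fish_long are made up for illustration; only the label definition and the -encode- calls follow the messages in the thread. It first shows -encode- going beyond a label that lacks "C" and "S", then applies the fix of making the strings match the label text exactly.

```stata
* Toy data (an assumption for illustration, not the poster's dataset)
clear
input str1 fish
"C"
"S"
"C"
"S"
end

* Problem: the label has no entries for "C" or "S", so -encode-
* adds them to the label as new values beyond 0 and 1
label define fish_demo 0 "Control" 1 "Salmon"
encode fish, generate(fish_bad) label(fish_demo)
label list fish_demo
tabulate fish_bad, nolabel

* Fix: work on a copy whose strings match the label text exactly
generate str7 fish_long = fish
replace fish_long = "Control" if fish == "C"
replace fish_long = "Salmon"  if fish == "S"

label define fish 0 "Control" 1 "Salmon"
encode fish_long, generate(fish1) label(fish)
tabulate fish1, nolabel

* Check for anything that did not map to 0 or 1
tabulate fish if !inlist(fish1, 0, 1)
```

The copy (fish_long) is only there so that the original strings survive for checking; replacing fish in place, as in the reply, should work equally well. Another possible route, if you would rather not touch the strings, is to define the label with "C" and "S" as its text, run -encode-, and afterwards change the text with -label define ..., modify-.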

