



st: Unanticipated behavior of -encode-


From   "Lacy,Michael" <[email protected]>
To   "[email protected]" <[email protected]>
Subject   st: Unanticipated behavior of -encode-
Date   Tue, 20 Aug 2013 03:18:34 +0000

Under certain circumstances, -encode- numbers the numeric version of a string variable starting where it left off at the last -encode-,
rather than starting at 1. I encountered this while encoding a varlist of string variables in a large file, which gave me oddities such
as a string variable with the values "male" and "female" being encoded with large consecutive numbers rather than with 1 and 2.
This is hardly tragic, but it is inconvenient, and it is not behavior I could anticipate from the documentation of -encode-.

Here's an example of code showing a mild version of this:

clear
version 13
set seed 23456
set obs 4
gen str x = cond(runiform() > 0.5, "this", "that")
gen str y = cond(runiform() > 0.5, "blue", "green ")
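// Encode each string variable into a numeric variable named temp,
// then drop the string original and give temp the original name
foreach v of varlist x y {
   encode `v', gen(temp)
   drop `v'
   rename temp `v'
}
// Show the underlying numeric codes rather than the value labels
tab1 x y, nolab
//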
-> tabulation of x  

          x |      Freq.     Percent        Cum.
------------+-----------------------------------
          1 |          2       50.00       50.00
          2 |          2       50.00      100.00
------------+-----------------------------------
      Total |          4      100.00

-> tabulation of y  

          y |      Freq.     Percent        Cum.
------------+-----------------------------------
          3 |          3       75.00       75.00
          4 |          1       25.00      100.00
------------+-----------------------------------
      Total |          4      100.00


I would expect both x and y to be encoded with 1 and 2. This oddity can be avoided by not using "temp" repeatedly, but I'm curious whether
others can explain why this occurs.
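
A minimal sketch of one way to avoid reusing "temp", offered under an assumption rather than as a confirmed explanation: -encode- names its value label after the new variable unless label() is given, and it adds to a label of that name if one is already in memory. Dropping or renaming the variable temp would then leave the value label temp behind, so the second -encode- would continue numbering from it. The label() option is part of -encode-; passing each variable's own name to it is simply my choice for illustration.

```stata
clear
version 13
set seed 23456
set obs 4
gen str x = cond(runiform() > 0.5, "this", "that")
gen str y = cond(runiform() > 0.5, "blue", "green ")

foreach v of varlist x y {
   // Give each variable its own value label (named x, then y),
   // so no label is shared between successive -encode- calls
   encode `v', gen(temp) label(`v')
   drop `v'
   rename temp `v'
}

// Under the assumption above, each label should start at 1
label list
tab1 x y, nolab
```

If the assumption holds, running -label list temp- after the original loop should show all four values stored in the single label temp, which would account for y being coded 3 and 4.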

Regards,


Mike Lacy
Dept. of Sociology
Colorado State University
Fort Collins CO 80523-1784

