RE: st: Understanding Factor variables - is order significant ?


From   "Feiveson, Alan H. (JSC-SK311)" <alan.h.feiveson@nasa.gov>
To   "statalist@hsphsun2.harvard.edu" <statalist@hsphsun2.harvard.edu>
Subject   RE: st: Understanding Factor variables - is order significant ?
Date   Thu, 27 May 2010 09:38:31 -0500

The trouble with -encode- is that it puts the levels in alphabetical order. So in the L, M, H case you would get H = 1, L = 2, M = 3. To get around this you would have to define a value label by hand and have -encode- use it through its label() option.
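
A minimal sketch of the L, M, H case, assuming a toy dataset; the variable names dose, dose_alpha, dose_ord and the label name doselbl are made up for illustration:

```stata
clear
input str1 dose
"L"
"M"
"H"
end

* Default behaviour: codes follow alphabetical order (H = 1, L = 2, M = 3)
encode dose, generate(dose_alpha)

* Define the value label by hand, in the intended order
label define doselbl 1 "L" 2 "M" 3 "H"

* Have -encode- use that label; noextend makes it stop with an error
* instead of adding new codes for strings that are not in the label
encode dose, generate(dose_ord) label(doselbl) noextend

* Show the underlying numeric codes side by side
list dose dose_alpha dose_ord, nolabel
```

With noextend, a misspelled or unexpected string shows up as an error rather than quietly getting the next free code.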

Al F.

-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Richard Williams
Sent: Thursday, May 27, 2010 9:31 AM
To: statalist@hsphsun2.harvard.edu
Subject: RE: st: Understanding Factor variables - is order significant ?

At 09:42 AM 5/27/2010, Ploutz-Snyder, Robert (JSC-SK)[USRA] wrote:
>Nick brings up sorting as an explanation of why not to pursue string 
>variables as factor variables in Stata.
>
>If the factor variable represents an ordinal categorization, the 
>analyst need merely modify his/her labels, just as now we do so by 
>choosing which number represents the "first" category, etc. 
>Following with Nick's example, if I wanted "low" to be first, I 
>could code the values as A, B, C, and have the order that I desire.

Or, code them as "1 low" "2 medium" etc.

>Far more common, I think, are factor variables that are nominal 
>instead of ordinal.  Male vs. Female, Trmt vs. Control, Drug vs. 
>Drug+Therapy vs. Therapy vs. Control, Race and/or Ethnicity 
>categories...  Those sorts of factor variables are commonly used and 
>should be allowed as factor vars in Stata (as they are in other 
>highly respected Stats languages).
>
>I receive/import data coded as string variables all the time, and 
>to have the ability to use string vars as factors would be a much 
>welcomed improvement.

I am not opposed to this, but isn't it just a matter of using a one 
line -encode- command to create a numeric variable from a string variable?
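
For a nominal variable, where order does not matter, the one-liner might look like the sketch below. It uses the auto dataset shipped with Stata; origin_str and origin_num are made-up names, and the string variable is created with -decode- only so that there is something to encode:

```stata
sysuse auto, clear

* Build a string variable to stand in for imported string data
decode foreign, generate(origin_str)

* The one-line conversion: string -> labelled numeric variable
encode origin_str, generate(origin_num)

* The numeric result can then be used with factor-variable notation
regress mpg i.origin_num
```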


-------------------------------------------
Richard Williams, Notre Dame Dept of Sociology
OFFICE: (574)631-6668, (574)631-6463
HOME:   (574)289-5227
EMAIL:  Richard.A.Williams.5@ND.Edu
WWW:    http://www.nd.edu/~rwilliam

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
