RE: st: Understanding Factor variables - is order significant ?

From   Richard Williams
Subject   RE: st: Understanding Factor variables - is order significant ?
Date   Thu, 27 May 2010 10:31:28 -0400

At 09:42 AM 5/27/2010, Ploutz-Snyder, Robert (JSC-SK)[USRA] wrote:
Nick brings up sorting as a reason not to pursue string variables as factor variables in Stata.

If the factor variable represents an ordinal categorization, the analyst need merely modify the labels, just as we now do by choosing which number represents the "first" category, and so on. Following Nick's example, if I wanted "low" to be first, I could code the values as A, B, C and get the order that I desire.

Or, code them as "1 low" "2 medium" etc.

Far more common, I think, are factor variables that are nominal instead of ordinal. Male vs. Female, Trmt vs. Control, Drug vs. Drug+Therapy vs. Therapy vs. Control, Race and/or Ethnicity categories... Those sorts of factor variables are commonly used and should be allowed as factor vars in Stata (as they are in other highly respected Stats languages).

I receive/import data coded as string variables all the time, and the ability to use string vars as factors would be a very welcome improvement.

I am not opposed to this, but isn't it just a matter of using a one line -encode- command to create a numeric variable from a string variable?
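
As a rough sketch of that one-line approach (the variable names group, y, and level below are hypothetical and do not come from this thread), something like the following should work. By default -encode- numbers the distinct strings 1, 2, 3, ... in alphabetical order and attaches the original strings as value labels; defining the value label beforehand is one way to get a chosen order for an ordinal variable.

```stata
* Hypothetical variables: "group" is a string variable, "y" is an outcome.
* -encode- creates a labeled numeric variable from the string variable.
encode group, generate(group_n)

* The numeric result can then be used with factor-variable notation.
regress y i.group_n

* For an ordinal variable, define the value label first so that -encode-
* uses these codes instead of alphabetical order.
* "level" is a hypothetical string variable holding low/medium/high.
label define level_lbl 1 "low" 2 "medium" 3 "high"
encode level, generate(level_n) label(level_lbl)
```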

Richard Williams, Notre Dame Dept of Sociology
OFFICE: (574)631-6668, (574)631-6463
HOME:   (574)289-5227
EMAIL:  Richard.A.Williams.5@ND.Edu
