From: David Hoaglin <dchoaglin@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Code to generate dummy variable from several categorical variables?
Date: Tue, 17 Jan 2012 07:22:05 -0500

It would help to have further clarification.

As Nick pointed out, an indicator variable (aka dummy variable) has two (non-missing) values: 0 and 1. Please explain what you mean by "a dummy variable with 4 indicators" and then give an explicit definition of the desired "dummy variable" in terms of A, B, and C.

If you actually want a categorical variable with 4 categories (which would necessarily be mutually exclusive), please define those categories in terms of A, B, and C.

Your explanation of the "dummy variable" abnlX lists three indicator variables. If you intend abnlX to be a categorical variable, those three indicators are not mutually exclusive.

It would help if you described the role that the new variable will play in an analysis. Some regression models, for example, could include the binary variables A, B, and C as they stand; they would not need to be mutually exclusive.

BTW, three binary variables yield 8 possible combinations. The one not in your list is A=1, B=0, C=1. Why is it necessary to re-categorize this subject and subjects #2, #3, and #5?

David Hoaglin

On Mon, Jan 16, 2012 at 7:46 PM, DEBORAH L. HUANG <huangdx@u.washington.edu> wrote:
> Thank you for input and to clarify what I'm trying to do:
>
> I'm trying to generate a dummy variable with 4 indicators; the values of the
> indicators are to be determined by 3 other binary variables which are not
> mutually exclusive. If generating a categorical variable could be done more
> easily that would be fine. I've already tried generating a composite
> categorical variable but have recoding problems as A, B and C are not
> mutually exclusive.
>
> For example, possible values for binary variables A, B and C as follows:
>
>       A   B   C
> 1.    1   0   0
> 2.    1   1   0
> 3.    1   1   1
> 4.    0   1   0
> 5.    0   1   1
> 6.    0   0   1
> 7.    0   0   0
>
> So I'd like to generate dummy variable abnlX, where
> - abnlX1 includes all subjects where A=1
> - abnlX2 includes all subjects where B=1
> - abnlX3 includes all subjects where C=1
>
> My difficulty is in figuring out how to code in order to re-categorize
> subjects #2, #3 and #5 into all the appropriate categories (e.g., subject #2
> should count toward abnlX1 and abnlX2). Additionally, there are some missing
> values for any of the variables A, B or C (subject may be missing value for
> A but have values for B and C, etc.) but I would still like to be able to
> include the available values.

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
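
For readers of the archive, here is a minimal Stata sketch of the two readings discussed in this thread. It is an illustration under assumptions, not code posted by either author. The data are made up: the seven patterns listed in the question, the unlisted pattern A=1, B=0, C=1, and one hypothetical subject with A missing. The name combo is likewise hypothetical. Under the first reading, abnlX1 through abnlX3 are separate indicators that simply mirror A, B, and C, so a subject such as #2 counts toward more than one of them and nothing needs re-categorizing. Under the second reading, a single variable codes the 8 mutually exclusive patterns.

```stata
* Hypothetical example data: 8 possible patterns plus one subject with A missing
clear
input A B C
1 0 0
1 1 0
1 1 1
0 1 0
0 1 1
0 0 1
0 0 0
1 0 1
. 1 0
end

* Reading 1: three separate, non-exclusive indicators.
* The -if- qualifier keeps a missing source value missing instead of coding it 0.
generate byte abnlX1 = (A == 1) if !missing(A)
generate byte abnlX2 = (B == 1) if !missing(B)
generate byte abnlX3 = (C == 1) if !missing(C)

* Reading 2: one categorical variable with mutually exclusive values 0 to 7.
* It is missing whenever any of A, B, or C is missing.
generate byte combo = 4*A + 2*B + C

list, sepby(combo)
```

With the first reading, each indicator uses every subject who has a value for its own source variable, so the subject with A missing still contributes to abnlX2 and abnlX3. The second reading drops that subject, because the pattern cannot be determined.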