Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


From: Nick Cox <n.j.cox@durham.ac.uk>

To: "'statalist@hsphsun2.harvard.edu'" <statalist@hsphsun2.harvard.edu>

Subject: RE: st: Code to generate dummy variable from several categorical variables?

Date: Tue, 17 Jan 2012 21:37:59 +0000

Note that David's suggestion of a composite categorical variable as one way to tackle this echoes

http://www.stata.com/statalist/archive/2012-01/msg00549.html

in which

egen group = group(A B C), label missing

was flagged as possible code. Deciding between that and

egen group2 = group(A B C), label

would require a decision on what to do with missings.

Nick
n.j.cox@durham.ac.uk

-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of David Hoaglin
Sent: 17 January 2012 21:23
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Code to generate dummy variable from several categorical variables?

Deborah,

The additional description is helpful. Thank you.

I would describe your planned ANOVAs as a preliminary analysis, comparing the continuous demographic variables among groups defined by the three outcome variables A, B, and C (jointly).

In a one-way ANOVA, the groups must be mutually exclusive. From your initial message, some subjects have both A=1 and B=1 (and other combinations in which more than one of the outcome variables is not 0). As a result, the groups defined by your three indicator variables are not mutually exclusive.

Since you want to consider the three outcome variables together, I think you have two main choices. Either you can enumerate the combinations of A, B, and C that occur in your data (all 8 or only some of the 8?), define a categorical variable that has a distinct value for each of those mutually exclusive groups, and use that variable to define the groups in a one-way ANOVA; or you can consider a three-way ANOVA with A, B, and C as the factors and decide which terms to include in the model (only main effects, main effects and two-factor interactions, or main effects and two-factor and three-factor interactions).
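As a minimal sketch of the two choices described above, the Stata commands might look like the following. It assumes A, B, and C are 0/1 variables with no missing values; the data are simulated, and age is a hypothetical stand-in for any continuous demographic variable.

```stata
* Sketch only: simulated data; age is a hypothetical continuous variable.
clear
set obs 200
set seed 2012
generate byte A = runiform() < 0.4
generate byte B = runiform() < 0.3
generate byte C = runiform() < 0.5
generate age = 40 + 5*A + 3*B + rnormal(0, 10)

* Choice 1: one mutually exclusive group per observed combination of A, B, C,
* then a one-way ANOVA across those groups
egen group = group(A B C), label
tabulate group
oneway age group, tabulate

* Choice 2: three-way ANOVA with A, B, C as factors;
* this version keeps main effects and two-factor interactions only
anova age A B C A#B A#C B#C
```

Which terms to keep in the second model is the modelling decision mentioned above; the terms shown are only one possibility.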
Once you have settled on the mutually exclusive groups (and before any ANOVA), it would be a good idea to check whether each of the demographic variables is suitable for an ANOVA or should be transformed. Making boxplots of the demographic variable by group would be one way to start.

I hope this discussion helps.

David Hoaglin

On Tue, Jan 17, 2012 at 2:39 PM, DEBORAH L. HUANG <huangdx@u.washington.edu> wrote:
> Basically what I'm hoping to do is "collapse" the outcome variables A, B and
> C (all binary) into the new outcome indicator variable abnlX for ANOVA
> (e.g., comparison of mean age across indicators, among other continuous
> demographic variables).
>
> The new outcome variable abnlX would have 3 indicators (my mistake in the
> earlier message). As an indicator variable abnlX would be defined as
> follows:
>
> abnlX indicator #1 =0 if A is 0 or missing, B is 0/1/missing, C is 0/1/missing;
>                    =1 if A is 1, B is 0/1/missing, C is 0/1/missing
>
> abnlX indicator #2 =0 if B is 0 or missing, A is 0/1/missing, C is 0/1/missing;
>                    =1 if B is 1, A is 0/1/missing, C is 0/1/missing
>
> abnlX indicator #3 =0 if C is 0 or missing, A is 0/1/missing, B is 0/1/missing;
>                    =1 if C is 1, A is 0/1/missing, B is 0/1/missing
>
> Alternately for a categorical outcome variable abnlX it would be defined as
> follows:
>
> abnlX=0 if A=0 or missing & B=0 or missing & C=0 or missing
>
> abnlX=1 if A=1 & B=0/1/missing & C=0/1/missing
>
> abnlX=2 if B=1 & A=0/1/missing & C=0/1/missing
>
> abnlX=3 if C=1 & A=0/1/missing & B=0/1/missing
>
> Thank you again to everyone for your input, and hopefully this further
> clarifies my question.
>

*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/
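The indicator definitions in the quoted message, and the choice between the two egen calls in Nick's note, can be sketched on a few made-up observations. Everything here is illustrative: the data values are invented, and ind1, ind2, ind3, and age are hypothetical names, not variables from the original dataset.

```stata
* Sketch only: invented data; A, B, C are 0/1 with some missing values (.)
clear
input A B C age
1 0 . 34
0 1 1 51
. . 0 47
0 0 0 29
1 1 1 62
. 1 . 45
end

* Indicators as described: 1 if the variable is 1, 0 if it is 0 or missing
generate byte ind1 = (A == 1)
generate byte ind2 = (B == 1)
generate byte ind3 = (C == 1)

* With -missing-, observations with a missing A, B, or C still get a group
egen group = group(A B C), label missing
* Without -missing-, those observations get a missing value for group2
egen group2 = group(A B C), label

list A B C ind1 ind2 ind3 group group2

* Boxplots of a demographic variable by group, as a first suitability check
graph box age, over(group)
```

Because the three indicators can all be 1 for the same subject, they do not define mutually exclusive groups, whereas group and group2 do; the two differ only in how observations with missing values are treated.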

**References**:

- **Re: st: Code to generate dummy variable from several categorical variables?** *From:* "DEBORAH L. HUANG" <huangdx@u.washington.edu>
- **Re: st: Code to generate dummy variable from several categorical variables?** *From:* David Hoaglin <dchoaglin@gmail.com>
