Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


From: David Hoaglin <dchoaglin@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Code to generate dummy variable from several categorical variables?
Date: Tue, 17 Jan 2012 16:22:51 -0500

Deborah,

The additional description is helpful. Thank you.

I would describe your planned ANOVAs as a preliminary analysis, comparing the continuous demographic variables among groups defined by the three outcome variables A, B, and C (jointly).

In a one-way ANOVA, the groups must be mutually exclusive. From your initial message, some subjects have both A=1 and B=1 (and other combinations in which more than one of the outcome variables are not 0). As a result, the groups defined by your three indicator variables are not mutually exclusive.

Since you want to consider the three outcome variables together, I think you have two main choices. Either you can enumerate the combinations of A, B, and C that occur in your data (all 8 or only some of the 8?), define a categorical variable that has a distinct value for each of those mutually exclusive groups, and use that variable to define the groups in a one-way ANOVA; or you can consider a three-way ANOVA with A, B, and C as the factors and decide which terms to include in the model (only main effects, main effects and two-factor interactions, or main effects and two-factor and three-factor interactions).

Once you have settled on the mutually exclusive groups (and before any ANOVA), it would be a good idea to check whether each of the demographic variables is suitable for an ANOVA or should be transformed. Making boxplots of the demographic variable by group would be one way to start.

I hope this discussion helps.

David Hoaglin

On Tue, Jan 17, 2012 at 2:39 PM, DEBORAH L. HUANG <huangdx@u.washington.edu> wrote:
> Basically what I'm hoping to do is "collapse" the outcome variables A, B and
> C (all binary) into the new outcome indicator variable abnlX for ANOVA
> (e.g., comparison mean age across indicators, among other continuous
> demographic variables).
>
> The new outcome variable abnlX would have 3 indicators (my mistake in the
> earlier message). As an indicator variable abnlX would be defined as
> follows:
>
> abnlX indicator #1 =0 if A is 0 or missing, B is 0/1/missing, C is
> 0/1/missing; =1 if A is 1, B is 0/1/missing, C is 0/1/missing
> abnlX indicator #2 =0 if B is 0 or missing, A is 0/1/missing, C is
> 0/1/missing; =1 if B is 1, A is 0/1/missing, C is 0/1/missing
> abnlX indicator #3 =0 if C is 0 or missing, A is 0/1/missing, B is
> 0/1/missing; =1 if C is 1, A is 0/1/missing, B is 0/1/missing
>
> Alternately for a categorical outcome variable abnlX it would be defined as
> follows:
> abnlX=0 if A=0 or missing & B=0 or missing & C=0 or missing
> abnlX=1 if A=1 & B=0/1/missing & C=0/1/missing
> abnlX=2 if B=1 & A=0/1/missing & C=0/1/missing
> abnlX=3 if C=1 & A=0/1/missing & B=0/1/missing
>
> Thank you again to everyone for your input, and hopefully this further
> clarifies my question.
>
> Deborah Huang

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
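The first choice described in the reply above (enumerate the combinations of A, B, and C that actually occur, and give each its own group code) can be illustrated with a minimal sketch. It is written in Python rather than Stata purely for illustration, and it is not code from the thread: the function name `combination_groups`, the `missing_as_zero` option, and the sample data are all hypothetical. Treating a missing value like 0 is an assumption taken from the quoted definition of abnlX; whether that is appropriate depends on the data.

```python
from collections import Counter


def combination_groups(records, missing_as_zero=True):
    """Assign one mutually exclusive group code per observed (A, B, C) pattern.

    records: list of (A, B, C) tuples with values 0, 1, or None (missing).
    Returns (codes, labels): one code per record (None if the record is left
    ungrouped), and a dict mapping each code to its (A, B, C) pattern.
    """
    patterns = []
    for rec in records:
        if missing_as_zero:
            # Assumption: missing is treated like 0, as in the quoted definition.
            patterns.append(tuple(0 if v is None else v for v in rec))
        elif any(v is None for v in rec):
            patterns.append(None)  # leave incomplete records ungrouped
        else:
            patterns.append(tuple(rec))

    # Number only the patterns that actually occur, in sorted order.
    observed = sorted({p for p in patterns if p is not None})
    code_of = {p: i for i, p in enumerate(observed)}
    codes = [None if p is None else code_of[p] for p in patterns]
    labels = {i: p for p, i in code_of.items()}
    return codes, labels


# Hypothetical data: (A, B, C) per subject; None marks a missing value.
subjects = [(1, 1, 0), (0, 0, 0), (1, 0, None), (0, 1, 1), (1, 1, 0)]
codes, labels = combination_groups(subjects)
counts = Counter(codes)
for code, pattern in labels.items():
    print(code, pattern, counts[code])
```

Because every subject receives exactly one code, the resulting variable defines mutually exclusive groups, which is what a one-way ANOVA requires. The printed counts also show how many of the 8 possible combinations occur and how small some groups are.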

**Follow-Ups**:
- **RE: st: Code to generate dummy variable from several categorical variables?** *From:* Nick Cox <n.j.cox@durham.ac.uk>

**References**:
- **Re: st: Code to generate dummy variable from several categorical variables?** *From:* "DEBORAH L. HUANG" <huangdx@u.washington.edu>
