
From |
Richard Williams <Richard.A.Williams.5@nd.edu> |

To |
statalist@hsphsun2.harvard.edu |

Subject |
Re: st: Statistical/Stata question |

Date |
Tue, 17 Feb 2004 22:03:59 -0500 |

It isn't silly at all. Agreed. To offer some substantive examples: When you have a categorical variable, you may not necessarily want to create and use all possible dummies. The Ns for some categories may be very small, or, for the purposes of your analysis, the differences between groups may be small or nonexistent.

In effect, you are forcing the first two categories (1 and 2) to have exactly the same odds (probability) for the event. In other words, you are constraining the odds ratio for category2 vs. category1 to be exactly equal to 1. As a consequence, the comparison of category3 to category1 has the same odds ratio as the comparison of category3 to category2 (and similarly for category4). For the purposes of the model, there are only 3 categories: a combined 1/2 category, category3 and category4.
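As a rough illustration of the pooling described above, here is a minimal Python sketch. The counts are made up for illustration and are not from the original question, and the helper names (odds, pool) are hypothetical. It relies on the fact that a logit model containing only the category dummies reproduces the observed odds in each group, so the odds ratios can be computed directly from a table of counts.

```python
# Hypothetical (events, non-events) counts per category; NOT data from the post.
counts = {1: (20, 80), 2: (30, 70), 3: (45, 55), 4: (60, 40)}


def odds(events, nonevents):
    """Observed odds of the event within one group."""
    return events / nonevents


def pool(table, a, b):
    """Merge categories a and b into one combined group keyed by (a, b)."""
    merged = {k: v for k, v in table.items() if k not in (a, b)}
    merged[(a, b)] = (table[a][0] + table[b][0], table[a][1] + table[b][1])
    return merged


# Omitting the dummy for category2 makes the combined 1/2 group the reference.
pooled = pool(counts, 1, 2)
ref_odds = odds(*pooled[(1, 2)])

# Categories 3 and 4 are each compared with the single combined group, so
# "3 vs. 1" and "3 vs. 2" are one and the same odds ratio in this model.
or_3_vs_pooled = odds(*pooled[3]) / ref_odds
or_4_vs_pooled = odds(*pooled[4]) / ref_odds

print(f"OR, category3 vs. combined 1/2: {or_3_vs_pooled:.3f}")
print(f"OR, category4 vs. combined 1/2: {or_4_vs_pooled:.3f}")
```

In the unpooled table the odds for categories 1 and 2 differ; after pooling, the model can only see their combined odds, which is the constraint described above.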

Suppose, for example, that Race has been coded white, black and other in your data. You might wonder whether (a) you need separate dummies for blacks and others, because both groups differ from whites and also from each other, or (b) you just need the dummy for white, in which case you'd be saying that the important contrast is white versus nonwhite and the differences among nonwhites are not important.

Combining categories may cost you some information, and it may obscure important differences by in effect assuming that, say, all minorities are the same. But combining categories can also make the analysis much more manageable and make the most critical points clearer. You can, of course, do formal tests to see whether it is legitimate to combine categories, e.g. if the IVs were black and other and a test reveals that their effects do not significantly differ from each other, you could just use a dummy variable for white/nonwhite instead. Or, going back to the original example, if the IVs were category2, category3, and category4, and category2 was not significant, that would suggest that you can simplify things by just pooling category1 and category2 together (especially if it makes logical sense that the groups would differ little, if at all, from each other).
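One way such a formal test could look is sketched below in Python. Note that this uses a likelihood-ratio test, which is a swapped-in alternative to checking the significance of the category2 coefficient as described above; the two usually lead to the same conclusion but are not identical. The counts are again hypothetical, and the sketch assumes every group has at least one event and one non-event so the logs are defined.

```python
import math

# Hypothetical (events, non-events) counts per category; NOT data from the post.
counts = {1: (20, 80), 2: (30, 70), 3: (45, 55), 4: (60, 40)}


def loglik(groups):
    """Binomial log-likelihood with one fitted proportion per group.

    This equals the maximized log-likelihood of a logit model whose only
    predictors are dummies for these groups.
    """
    total = 0.0
    for events, nonevents in groups:
        p = events / (events + nonevents)
        total += events * math.log(p) + nonevents * math.log(1 - p)
    return total


# Unrestricted model: categories 1 to 4 each get their own odds.
ll_full = loglik(counts.values())

# Restricted model: categories 1 and 2 are pooled into one group.
pooled_12 = (counts[1][0] + counts[2][0], counts[1][1] + counts[2][1])
ll_pooled = loglik([pooled_12, counts[3], counts[4]])

# Pooling imposes one constraint, so the LR statistic has 1 degree of freedom.
lr = 2 * (ll_full - ll_pooled)

# Upper-tail probability of a chi-square(1) variable: P(X > x) = erfc(sqrt(x/2)).
p_value = math.erfc(math.sqrt(lr / 2))

print(f"LR chi2(1) = {lr:.3f}, p = {p_value:.3f}")
```

A large p-value would suggest that pooling category1 and category2 costs little in fit; whether pooling also makes substantive sense remains a separate judgment.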

-------------------------------------------

Richard Williams, Notre Dame Dept of Sociology

OFFICE: (574)631-6668, (574)631-6463

FAX: (574)288-4373

HOME: (574)289-5227

EMAIL: Richard.A.Williams.5@ND.Edu

WWW (personal): http://www.nd.edu/~rwilliam

WWW (department): http://www.nd.edu/~soc

