
From: "Hoetker, Glenn" <ghoetker@uiuc.edu>
To: <statalist@hsphsun2.harvard.edu>
Subject: RE: st: Dummy Variables vs. Subgroup Models in Logistic Regression
Date: Fri, 22 Oct 2004 10:30:31 -0500

At 01:45 PM 10/22/2004 +0000, brian.h.nathanson@att.net wrote:
>Dear Stata Users,
>
> I'm creating a logistic regression model with many dichotomous
> variables along with one term that has 8 categories coded 1,2,..8. I can
> create 7 dummy variables and have a very large model. Would it be
> legitimate if my sample sizes are large enough to create 8 separate
> models with each model representing one subgroup? Can anyone comment on
> the pros and cons of using dummy variables versus creating separate
> "subgroup" models based on the remaining independent variables? Thanks!

Comparing logit/probit coefficients across groups is actually considerably more difficult than doing so in OLS. This reflects the fact that the betas in a logit model are not identified unless a restriction is imposed, namely fixing the variance of the error term at pi^2/3. As a result, the estimated coefficients are the underlying "true" effect scaled by the amount of unobserved heterogeneity (a.k.a. residual variation). If the unobserved heterogeneity varies across groups, as it often will, then the estimated betas will vary too, even if the "true" effect is the same.

Allison (1999) discusses this and proposes a test for detecting differences in unobserved heterogeneity and differences in underlying coefficients. Other discussions of the scale issue include Maddala (1983:23), Long (1997:47), and Train (2004).

Hoetker (2004) uses Monte Carlo simulations to show that (a) the problem Allison identified isn't just theoretical--it leads to misleading inferences in common situations and (b) Allison's tests are a significant improvement over current practice, but are not a panacea. It also offers some alternative analytical approaches, including code in Stata (of course) to implement them. One finding in particular is that the use of interaction terms to detect inter-group differences in logit equations is likely to yield misleading results if unobserved heterogeneity differs across groups.
In some circumstances, it's actually more likely to find significant results in the OPPOSITE direction than in the right direction.

For cross-group comparisons in general, Liao (2002) is a helpful reference.

Sorry to actually muddy the waters rather than providing a simple solution.

Best wishes.

Glenn Hoetker
Assistant Professor of Strategy
College of Business
University of Illinois at Urbana-Champaign
217-265-4081
ghoetker@uiuc.edu

Allison, P.D. 1999. Comparing logit and probit coefficients across groups. SMR/Sociological Methods & Research 28(2): 186-208.

Hoetker, G. 2004. Confounded coefficients: Extending recent advances in the accurate comparison of logit and probit coefficients across groups. Working paper (http://www.business.uiuc.edu/ghoetker/wp.htm).

Liao, T.F. 2002. Statistical group comparison. Wiley Series in Probability and Statistics. New York: Wiley-Interscience.

Long, J.S. 1997. Regression models for categorical and limited dependent variables. Advanced Quantitative Techniques in the Social Sciences. Thousand Oaks, CA: Sage Publications.

Maddala, G.S. 1983. Limited-dependent and qualitative variables in econometrics. New York: Cambridge University Press.

Train, K.E. 2004. Discrete choice methods with simulation. Cambridge: Cambridge University Press.

-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Richard Williams
Sent: Friday, October 22, 2004 9:42 AM
To: statalist@hsphsun2.harvard.edu; statalist@hsphsun2.harvard.edu
Subject: Re: st: Dummy Variables vs. Subgroup Models in Logistic Regression

If you estimate separate models, you are allowing ALL parameters to differ across groups, e.g. the effect of education could be different in each group. If you just add dummies, you are allowing the intercept to differ in each group, but the effects of the other variables stay the same.
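
A minimal sketch of why differing coefficients from separate models need careful reading, given the scale caveat in the reply above. Two hypothetical groups share the same true slope but differ in residual variation, and separate logit fits return different slopes (roughly beta/sigma in each group). It is written in Python with the standard library only, not Stata, and it is not the code from Hoetker (2004). The names simulate_group and fit_logit, the sample size, and the sigma values are all assumptions chosen for illustration.

```python
import math
import random


def simulate_group(n, beta, sigma, rng):
    """Latent-variable logit: y* = beta*x + sigma*e, y = 1 if y* > 0.

    e is a standard logistic draw (variance pi^2/3); sigma scales the
    unobserved heterogeneity. All values here are illustrative assumptions.
    """
    xs, ys = [], []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        u = rng.random()
        while u == 0.0:  # guard against log(0)
            u = rng.random()
        e = math.log(u / (1.0 - u))  # inverse-CDF draw from the logistic
        xs.append(x)
        ys.append(1 if beta * x + sigma * e > 0 else 0)
    return xs, ys


def fit_logit(xs, ys, max_iter=25, tol=1e-10):
    """Newton-Raphson fit of logit(P(y=1)) = a + b*x; returns (a, b)."""
    a = b = 0.0
    for _ in range(max_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(a + b * x)))
            w = p * (1.0 - p)
            g0 += y - p            # score for the intercept
            g1 += (y - p) * x      # score for the slope
            h00 += w               # information matrix entries
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        da = (h11 * g0 - h01 * g1) / det
        db = (h00 * g1 - h01 * g0) / det
        a += da
        b += db
        if abs(da) + abs(db) < tol:
            break
    return a, b


rng = random.Random(2004)  # fixed seed so the sketch is reproducible
true_beta = 1.0            # identical "true" effect in both groups

xa, ya = simulate_group(20000, true_beta, 1.0, rng)  # less residual variation
xb, yb = simulate_group(20000, true_beta, 2.0, rng)  # more residual variation

_, b1 = fit_logit(xa, ya)
_, b2 = fit_logit(xb, yb)
print(f"group A estimated slope: {b1:.2f}")
print(f"group B estimated slope: {b2:.2f}")
```

The group with more residual variation shows the smaller estimated slope even though true_beta is the same in both, so a raw comparison of the two coefficients would suggest a difference in effects that is not there.
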
If you estimate separate models for each group, your models will certainly be much less parsimonious, i.e. you'll have a lot more parameters floating around. But the real question is, what is most appropriate given your theory and the empirical reality? If the effects of everything really are different across every group, then you should estimate separate models. But if the effects do not differ across groups, then you are producing unnecessarily complicated models, and you are also reducing your statistical power: by not pooling groups when you should be pooling them, you'll be more likely to conclude that effects do not differ from zero when they really do.

These sorts of issues are discussed in

http://www.nd.edu/~rwilliam/stats2/l51.pdf
http://www.nd.edu/~rwilliam/stats2/l92.pdf

-------------------------------------------
Richard Williams, Notre Dame Dept of Sociology
OFFICE: (574)631-6668, (574)631-6463
FAX: (574)288-4373
HOME: (574)289-5227
EMAIL: Richard.A.Williams.5@ND.Edu
WWW (personal): http://www.nd.edu/~rwilliam
WWW (department): http://www.nd.edu/~soc

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
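
As a rough count of the parsimony cost described in the quoted message, the sketch below compares the number of estimated parameters under the two specifications for the 8-category variable in the original question. The thread only says there are "many" dichotomous covariates, so k = 10 is a hypothetical value, and the function names are illustrative.

```python
def n_params_pooled(k, groups):
    """One intercept, k slopes, and (groups - 1) dummy coefficients."""
    return 1 + k + (groups - 1)


def n_params_separate(k, groups):
    """Each group gets its own intercept and its own k slopes."""
    return groups * (1 + k)


k = 10       # hypothetical number of other covariates (an assumption)
groups = 8   # categories coded 1..8 in the original question

pooled = n_params_pooled(k, groups)
separate = n_params_separate(k, groups)
print(f"pooled with dummies: {pooled} parameters")
print(f"separate models:     {separate} parameters")
```

With these assumed values the pooled model estimates 18 parameters and the eight separate models estimate 88 in total, each from only that group's share of the sample.
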

Follow-Ups:
    RE: st: Dummy Variables vs. Subgroup Models in Logistic Regression
        From: SamL <saml@demog.berkeley.edu>
