
From: Allan Garland <agar5858@shaw.ca>
To: statalist@hsphsun2.harvard.edu
Subject: st: Modeling an independent variable with a very high data density at x=0
Date: Fri, 05 Jun 2009 19:40:58 -0700

I'm doing a logistic regression using a non-negative, continuous independent variable X, for which about 60% of cases have X=0. It seems to me that just including X in the model is problematic, since it is likely that many cases with Y=0 and many others with Y=1 will have X=0.

I can think of 2 possible approaches to modeling X, but would like some feedback on them, and any other thoughts on how to handle this situation.

a) Divide X into m categories and represent it with m-1 dummy variables in the model.

b) Include X in the model, and also include a binary variable Z such that Z=1 when X=0 and Z=0 otherwise. Then the effect of X=0 is given by the coefficient of Z, and the effect of X>0 is purely given by the coefficient of X itself (since then Z=0).

Allan

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
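To make the two codings in a) and b) above concrete, here is a minimal sketch of how the covariate columns might be built for each option. It is not from the original post: the function names, the sample X values, the cutpoints and the coefficients are all hypothetical, and Python is used only for illustration (no model is fitted).

```python
from bisect import bisect_right


def dummy_columns(x, cutpoints):
    """Option (a): m - 1 dummies for m = len(cutpoints) + 2 categories.

    Category 0 is X == 0 (the reference level, all dummies 0);
    positive X is binned by the hypothetical cutpoints.
    """
    category = 0 if x == 0 else 1 + bisect_right(cutpoints, x)
    n_dummies = len(cutpoints) + 1
    return [1 if category == k else 0 for k in range(1, n_dummies + 1)]


def zero_indicator_columns(x):
    """Option (b): keep X itself and add Z = 1 when X == 0, else Z = 0."""
    z = 1 if x == 0 else 0
    return [x, z]


def linear_predictor_b(x, b0, b_x, b_z):
    """Log-odds under option (b): b0 + b_x * X + b_z * Z."""
    x_col, z_col = zero_indicator_columns(x)
    return b0 + b_x * x_col + b_z * z_col


# Hypothetical data and cutpoints, chosen only for illustration.
xs = [0.0, 0.0, 0.0, 1.5, 4.0, 12.0]
cutpoints = [2.0, 5.0]  # categories: X = 0, (0, 2), [2, 5), [5, inf)

rows_a = [dummy_columns(x, cutpoints) for x in xs]
rows_b = [zero_indicator_columns(x) for x in xs]
```

Under option b) with these assumptions, the log-odds at X=0 is b0 + b_z, while for X>0 it is b0 + b_x * X, so the coefficient of Z measures how far the X=0 group sits from the value the line for X>0 would give when extended back to zero.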

**Follow-Ups**:

- **st: Re: Modeling an independent variable with a very high data density at x=0**, *From:* "Joseph Coveney" <jcoveney@bigplanet.com>
