
From: "Stas Kolenikov" <skolenik@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: unexplainable probit results
Date: Fri, 11 Jul 2008 12:23:21 -0500

You might also consider exact logistic regression (type -help exlogistic-) for this problem, although I am pretty sure it will run into serious problems of its own, besides being computationally intensive with this many variables.

Getting perfect prediction means that your 0's and 1's are perfectly separated by some of your variables (or, more likely, by a linear combination of them). In that case the -logit/probit- maximum likelihood estimates do not exist as finite numbers: the likelihood keeps improving as the slope estimates go to infinity.

Picture this: suppose the response variable is 0 for negative values of an explanatory variable x and 1 for positive values. What -logit/probit- tries to do is fit a curve shaped like a distribution function through those points. The closer that curve gets to 0 at the low end and to 1 at the high end, the higher the likelihood. But the curve can be brought arbitrarily close to the data by sending the slope to infinity and choosing the intercept so that the jump falls anywhere between the group of zeroes and the group of ones on the x-axis.

That's exactly what your output is showing: the coefficients represent a linear combination that separates the ones from the zeroes, but the standard errors are undefined because the likelihood is flat. If the log likelihood is equal to zero, as it is in your case, then Prob[data] = 1: you have made a perfect prediction, which is something you really want to avoid.

On 7/10/08, Michael I. Lichter <Lichter@uclalumni.net> wrote:
> Karen,
>
> The immediate problem is stated at the bottom: "Note: 6 failures and 34 successes completely determined." Some set of combinations of your 30 variables always result in failures (a_w_wt == 0) and another set always results in success (a_w_wt == 1). Logit/probit models will not produce meaningful results under these circumstances.
>
> The real problem is that you are doing kitchen-sink modeling, throwing in a huge number of variables relative to the number of cases you have -- 43 cases (assuming that you're right about not having any missing data) and 30 variables just isn't going to work. The usual rule of thumb in OLS regression is that you should have 10 observations for every independent variable, and while that may not exactly apply in probit, you just don't have enough degrees of freedom. You either need to reduce your independent variables down to the range of 4-6 either by winnowing them down to the most theoretically important or by doing some drastic data reduction by constructing additive indexes or using factor analysis (which is probably not going to work with 43 cases) or something else.
>
> Michael
>
>
> Karen Barroga wrote:
> > Hi,
> >
> > I run my variable using probit and got the results below. How come there are no/missing results (dots) for some variables and yet when I run these variables with no results, I get results?
> >
> > ---------
> >
> > . probit a_w_wt age schyr ricefyr axsta axsc frel8d erel8d owner_dum sumwsdsyield_t sumincome fsizesum Rizal_dum member_dum cx1wwt cx2wwt co1wwt co2wwt cbwwt obwwt ru1wwt ru2wwt pr1wwt pr2wwt tr1e_wwt tr2c_wwt tr3a_wwt tr4y_wwt
> >
> > Iteration 0:   log likelihood =  -20.65889
> > Iteration 1:   log likelihood = -7.8035532
> > Iteration 2:   log likelihood = -3.6455341
> > Iteration 3:   log likelihood = -.95822287
> > Iteration 4:   log likelihood = -.25576662
> > Iteration 5:   log likelihood = -.07527036
> > Iteration 6:   log likelihood = -.02337521
> > Iteration 7:   log likelihood = -.00750029
> > Iteration 8:   log likelihood = -.00245924
> > Iteration 9:   log likelihood = -.00081897
> > Iteration 10:  log likelihood =   -.000276
> > Iteration 11:  log likelihood = -.00009389
> > Iteration 12:  log likelihood = -.00003219
> > Iteration 13:  log likelihood =  -.0000111
> > Iteration 14:  log likelihood = -3.850e-06
> > Iteration 15:  log likelihood = -1.341e-06
> > Iteration 16:  log likelihood = -4.689e-07
> > Iteration 17:  log likelihood = -1.638e-07
> > Iteration 18:  log likelihood = -5.265e-08
> > Iteration 19:  log likelihood = -4.800e-08
> > Iteration 20:  log likelihood = -7.592e-09
> > Iteration 21:  log likelihood = -6.152e-09
> > Iteration 22:  log likelihood = -5.077e-09
> > Iteration 23:  log likelihood = -5.074e-09  (backed up)
> > Iteration 24:  log likelihood = -5.073e-09  (backed up)
> >
> > Probit regression                                 Number of obs   =         43
> >                                                   LR chi2(27)     =      41.32
> >                                                   Prob > chi2     =     0.0384
> > Log likelihood = -5.073e-09                       Pseudo R2       =     1.0000
> >
> > ------------------------------------------------------------------------------
> >       a_w_wt |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
> > -------------+----------------------------------------------------------------
> >          age |  -.3867675          .        .       .            .           .
> >        schyr |   1.086169          .        .       .            .           .
> >      ricefyr |   .3800124          .        .       .            .           .
> >        axsta |  -5.547729          .        .       .            .           .
> >         axsc |   12.72164          .        .       .            .           .
> >       frel8d |  -9.253356          .        .       .            .           .
> >       erel8d |     12.346          .        .       .            .           .
> >    owner_dum |  -3.629787          .        .       .            .           .
> > sumwsdsyie~t |    1.31215          .        .       .            .           .
> >    sumincome |  -.0534748   76.72575    -0.00   0.999    -150.4332    150.3262
> >     fsizesum |   1.128914          .        .       .            .           .
> >    Rizal_dum |    2.55781          .        .       .            .           .
> >   member_dum |   3.753627          .        .       .            .           .
> >       cx1wwt |  -1.092271          .        .       .            .           .
> >       cx2wwt |   1.702984          .        .       .            .           .
> >       co1wwt |  -1.846226          .        .       .            .           .
> >       co2wwt |  -.0006998          .        .       .            .           .
> >        cbwwt |   2.786398          .        .       .            .           .
> >        obwwt |   -2.59865          .        .       .            .           .
> >       ru1wwt |  -3.898595          .        .       .            .           .
> >       ru2wwt |  -3.928283          .        .       .            .           .
> >       pr1wwt |  -9.655389          .        .       .            .           .
> >       pr2wwt |    .829331          .        .       .            .           .
> >     tr1e_wwt |    2.00484          .        .       .            .           .
> >     tr2c_wwt |    .010138   90.12691     0.00   1.000    -176.6354    176.6556
> >     tr3a_wwt |   .0006733   10.95916     0.00   1.000    -21.47889    21.48023
> >     tr4y_wwt |   1.028878          .        .       .            .           .
> >        _cons |  -9.091063          .        .       .            .           .
> > ------------------------------------------------------------------------------
> > Note: 6 failures and 34 successes completely determined.
> >

--
Stas Kolenikov, also found at http://stas.kolenikov.name
Small print: Please do not reply to my Gmail address as I don't check it regularly.
*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
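
The separation argument in the reply above can be checked numerically. What follows is a minimal Python sketch, not Stata code and not part of the original thread: the toy data, the slope values, and the helper name probit_loglik are all assumptions made for illustration. It evaluates the probit log likelihood for a single regressor x whose sign perfectly determines y, with the intercept fixed at 0 so that the jump sits between the zeroes and the ones.

```python
import math

def probit_loglik(slope, intercept, xs, ys):
    """Log likelihood of a probit model P(y=1 | x) = Phi(intercept + slope * x)."""
    total = 0.0
    for x, y in zip(xs, ys):
        z = intercept + slope * x
        sign = 1.0 if y == 1 else -1.0
        # P(observed y) = Phi(sign * z), written with erfc for accuracy in the tails.
        prob = 0.5 * math.erfc(-sign * z / math.sqrt(2.0))
        total += math.log(prob)
    return total

# Hypothetical toy data: y is 0 for every negative x and 1 for every positive x.
xs = [-3.0, -2.0, -1.0, -0.5, 0.5, 1.0, 2.0, 3.0]
ys = [0, 0, 0, 0, 1, 1, 1, 1]

# Increasingly steep slopes, all with the jump located at x = 0.
slopes = [1.0, 5.0, 25.0, 125.0]
lls = [probit_loglik(b, 0.0, xs, ys) for b in slopes]

for b, ll in zip(slopes, lls):
    print(f"slope = {b:6.1f}   log likelihood = {ll:.6g}")
```

Under these assumptions the log likelihood rises toward 0 as the slope grows and never turns back down, so no finite slope maximizes it. That is the same pattern as the iteration log quoted above, where the log likelihood creeps up to -5.073e-09.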

**References**: **Re: st: unexplainable probit results** *From:* "Michael I. Lichter" <Lichter@UCLAlumni.net>


© Copyright 1996–2014 StataCorp LP