Statalist



Re: st: unexplainable probit results


From   "Stas Kolenikov" <[email protected]>
To   [email protected]
Subject   Re: st: unexplainable probit results
Date   Fri, 11 Jul 2008 12:23:21 -0500

You might also consider exact logistic regression (type -help
exlogistic-) for this problem, although I am pretty sure it will run
into serious problems of its own with this many variables, besides
being computationally intensive.

Getting perfect prediction means that your 0's and 1's are perfectly
separated by some of your variables (or, more likely, by a linear
combination of them). In that case, the -logit- and -probit- slope
estimates diverge to infinity. Picture this: suppose the response is 0
for negative values of an explanatory variable x and 1 for positive
values. What -logit- or -probit- tries to do is fit a curve shaped
like a distribution function through those points. The closer the
curve gets to zero at the low end and to one at the high end, the
higher the likelihood. But the curve can be brought arbitrarily close
by sending the slope to infinity and placing the crossover point
anywhere between the group of zeroes and the group of ones on the
x-axis. That's exactly what your output is showing: the coefficients
represent a linear combination that separates the ones from the
zeroes, but the standard errors are undefined because the likelihood
is flat. A log likelihood equal to zero, as in your case, means that
Prob[data]=1: you have made a perfect prediction, which is something
you really want to avoid.
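
To make the picture concrete, here is a minimal sketch in plain Python
(used only for illustration; it does not reproduce what -probit- does
internally). The six data points, the zero intercept, and the function
name probit_loglik are all made up for this example. It evaluates the
probit log likelihood on perfectly separated data for ever larger
slopes:

```python
import math

def probit_loglik(slope, xs, ys, intercept=0.0):
    """Probit log likelihood for a single explanatory variable.

    Each observation contributes log Phi(z) if y == 1 and
    log Phi(-z) if y == 0, where z = intercept + slope * x.
    """
    total = 0.0
    for x, y in zip(xs, ys):
        z = intercept + slope * x
        signed = z if y == 1 else -z
        # Standard normal CDF written with erfc: Phi(t) = erfc(-t / sqrt(2)) / 2
        total += math.log(0.5 * math.erfc(-signed / math.sqrt(2.0)))
    return total

# Hypothetical, perfectly separated data: y is 0 when x < 0 and 1 when x > 0.
xs = [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0]
ys = [0, 0, 0, 1, 1, 1]

slopes = [1, 2, 4, 8]
logliks = [probit_loglik(b, xs, ys) for b in slopes]

for b, ll in zip(slopes, logliks):
    print(f"slope = {b}   log likelihood = {ll:.3e}")
```

Every doubling of the slope moves the log likelihood closer to zero
without ever reaching it, so no finite slope is a maximum. That is the
same pattern as the iteration log in the quoted output, where the log
likelihood keeps creeping toward zero until the optimizer backs up.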

On 7/10/08, Michael I. Lichter <[email protected]> wrote:
> Karen,
>
>  The immediate problem is stated at the bottom: "Note: 6 failures and 34
>  successes completely determined." Some combinations of your 30
>  variables always result in failure (a_w_wt == 0), and others always
>  result in success (a_w_wt == 1). Logit/probit models will not produce
>  meaningful results under these circumstances.
>
>  The real problem is that you are doing kitchen-sink modeling, throwing
>  in a huge number of variables relative to the number of cases you have
>  -- 43 cases (assuming that you're right about not having any missing
>  data) and 30 variables just isn't going to work. The usual rule of
>  thumb in OLS regression is that you should have at least 10
>  observations for every independent variable, and while that may not
>  exactly apply in probit, you just don't have enough degrees of freedom.
>  You need to reduce your independent variables to somewhere in the
>  range of 4-6, either by winnowing them down to the most theoretically
>  important or by doing some drastic data reduction: constructing
>  additive indexes, using factor analysis (which is probably not going
>  to work with 43 cases), or something else.
>
>  Michael
>
>
>  Karen Barroga wrote:
>  > Hi,
>  >
>  > I ran my variables using probit and got the results below. Why are the results missing (shown as dots) for some variables, when I do get results if I run those same variables on their own?
>  >
>  > ---------
>  >
>  > . probit a_w_wt age schyr ricefyr axsta axsc frel8d erel8d owner_dum sumwsdsyield_t sumincome fsizesum Rizal_dum member_dum cx1wwt cx2wwt co1wwt co2wwt cbwwt obwwt ru1wwt ru2wwt pr1wwt pr2wwt tr1e_wwt tr2c_wwt tr3a_wwt tr4y_wwt
>  >
>  > Iteration 0:   log likelihood =  -20.65889
>  > Iteration 1:   log likelihood = -7.8035532
>  > Iteration 2:   log likelihood = -3.6455341
>  > Iteration 3:   log likelihood = -.95822287
>  > Iteration 4:   log likelihood = -.25576662
>  > Iteration 5:   log likelihood = -.07527036
>  > Iteration 6:   log likelihood = -.02337521
>  > Iteration 7:   log likelihood = -.00750029
>  > Iteration 8:   log likelihood = -.00245924
>  > Iteration 9:   log likelihood = -.00081897
>  > Iteration 10:  log likelihood =   -.000276
>  > Iteration 11:  log likelihood = -.00009389
>  > Iteration 12:  log likelihood = -.00003219
>  > Iteration 13:  log likelihood =  -.0000111
>  > Iteration 14:  log likelihood = -3.850e-06
>  > Iteration 15:  log likelihood = -1.341e-06
>  > Iteration 16:  log likelihood = -4.689e-07
>  > Iteration 17:  log likelihood = -1.638e-07
>  > Iteration 18:  log likelihood = -5.265e-08
>  > Iteration 19:  log likelihood = -4.800e-08
>  > Iteration 20:  log likelihood = -7.592e-09
>  > Iteration 21:  log likelihood = -6.152e-09
>  > Iteration 22:  log likelihood = -5.077e-09
>  > Iteration 23:  log likelihood = -5.074e-09  (backed up)
>  > Iteration 24:  log likelihood = -5.073e-09  (backed up)
>  >
>  > Probit regression                                 Number of obs   =         43
>  >                                                   LR chi2(27)     =      41.32
>  >                                                   Prob > chi2     =     0.0384
>  > Log likelihood = -5.073e-09                       Pseudo R2       =     1.0000
>  >
>  > ------------------------------------------------------------------------------
>  >       a_w_wt |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
>  > -------------+----------------------------------------------------------------
>  >          age |  -.3867675          .        .       .            .           .
>  >        schyr |   1.086169          .        .       .            .           .
>  >      ricefyr |   .3800124          .        .       .            .           .
>  >        axsta |  -5.547729          .        .       .            .           .
>  >         axsc |   12.72164          .        .       .            .           .
>  >       frel8d |  -9.253356          .        .       .            .           .
>  >       erel8d |     12.346          .        .       .            .           .
>  >    owner_dum |  -3.629787          .        .       .            .           .
>  > sumwsdsyie~t |    1.31215          .        .       .            .           .
>  >    sumincome |  -.0534748   76.72575    -0.00   0.999    -150.4332    150.3262
>  >     fsizesum |   1.128914          .        .       .            .           .
>  >    Rizal_dum |    2.55781          .        .       .            .           .
>  >   member_dum |   3.753627          .        .       .            .           .
>  >       cx1wwt |  -1.092271          .        .       .            .           .
>  >       cx2wwt |   1.702984          .        .       .            .           .
>  >       co1wwt |  -1.846226          .        .       .            .           .
>  >       co2wwt |  -.0006998          .        .       .            .           .
>  >        cbwwt |   2.786398          .        .       .            .           .
>  >        obwwt |   -2.59865          .        .       .            .           .
>  >       ru1wwt |  -3.898595          .        .       .            .           .
>  >       ru2wwt |  -3.928283          .        .       .            .           .
>  >       pr1wwt |  -9.655389          .        .       .            .           .
>  >       pr2wwt |    .829331          .        .       .            .           .
>  >     tr1e_wwt |    2.00484          .        .       .            .           .
>  >     tr2c_wwt |    .010138   90.12691     0.00   1.000    -176.6354    176.6556
>  >     tr3a_wwt |   .0006733   10.95916     0.00   1.000    -21.47889    21.48023
>  >     tr4y_wwt |   1.028878          .        .       .            .           .
>  >        _cons |  -9.091063          .        .       .            .           .
>  > ------------------------------------------------------------------------------
>  > Note: 6 failures and 34 successes completely determined.
>  >


-- 
Stas Kolenikov, also found at http://stas.kolenikov.name

Small print: Please do not reply to my Gmail address as I don't check
it regularly.