# Re: st: linear probability model vs. probit/logit

From: Ronnie Babigumira
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: linear probability model vs. probit/logit
Date: Tue, 03 Oct 2006 22:51:29 +0200

The issues in comparing OLS and logit for binary outcomes are spelled out in many econometrics books (off the top of my head, "Regression Models for Categorical and Limited Dependent Variables" by Long and "Basic Econometrics" by Gujarati both have good notes on this). It might help to consult these or other texts to clarify things.

In addition, there is a nice example at http://www.ats.ucla.edu/STAT/stata/code/compolslog.htm that does what you are trying to do, so it too might be helpful.
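If the aim is to compare the two models on the same observations, one option is to mark the sample that -logit- actually used and refit the linear probability model on it. This is only a minimal sketch using the auto data and the made-up variable bug from my earlier message quoted below; insample is a name chosen purely for illustration.

```stata
* Sketch only: -bug- and -insample- are made-up names for illustration
sysuse auto, clear
gen bug = foreign
replace bug = 1 in 1/15

* -logit- drops -bug- and the observations it predicts perfectly
logit foreign price rep78 bug

* Mark the observations that -logit- actually used
gen byte insample = e(sample)

* Refit the linear probability model on that same sample
* (-bug- is constant there, so it is left out of the model)
regress foreign price rep78 if insample
```

Both commands should then report the same number of observations. Whether that restricted sample is the comparison you actually want is a separate question.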

Nishant Dass wrote:

Hi Richard,
First of all, my apologies for flipping the no. of obs.;
indeed, it's more in the OLS and less in -logit-.

Hi Ronnie,
Thanks for your very helpful example. Yes, I see it now -
the header of my output shows that many indicator variables
of mine "predict failure perfectly" and yes, those
observations get dropped! (I have some 60-odd indicators. And
the reason I was hesitant to paste my output here is
that it is too large - I have some 85 regressors including
the indicator variables.)

So yes, I know why the observations are being dropped but
... is it then wrong to compare the OLS and -logit-?

Sorry about all this confusion!

N

--- Ronnie Babigumira <rb.glists@gmail.com> wrote:

Nishant
I agree with Richard, if logit dropped some observations,
then reg should as well. Here is an example using the
auto data. I regress foreign on price and rep78. We know that
rep78 has 5 missing cases so we expect that these
observations will be dropped.

. reg foreign price rep78

Source | SS df MS Number of obs = 69
-------------+------------------------------ F(2, 66) = 17.86
Model | 5.13051358 2 2.56525679 Prob > F = 0.0000
Residual | 9.47818207 66 .143608819 R-squared = 0.3512
-------------+------------------------------ Adj R-squared = 0.3315
Total | 14.6086957 68 .21483376 Root MSE = .37896
<snip>

. logit foreign price rep78

Iteration 0: log likelihood = -42.400729
Iteration 1: log likelihood = -29.263454
Iteration 2: log likelihood = -27.809797
Iteration 3: log likelihood = -27.715582
Iteration 4: log likelihood = -27.714924

Logistic regression Number of obs = 69
LR chi2(2) = 29.37
Prob > chi2 = 0.0000
Log likelihood = -27.714924 Pseudo R2 = 0.3464
<snip>

----------------------------------------------

That said, I think I have an idea of what is happening. I
generated a nonsensical variable called bug

gen bug = foreign

then I replace the first 15 cases with 1 (otherwise OLS
would basically produce nonsense)

replace bug = 1 in 1/15 // This introduces some variation between bug and foreign

so I now run

. reg foreign price rep78 bug

Source | SS df MS Number of obs = 69
-------------+------------------------------ F(3, 65) = 29.77
Model | 8.45477597 3 2.81825866 Prob > F = 0.0000
Residual | 6.15391968 65 .094675687 R-squared = 0.5787
-------------+------------------------------ Adj R-squared = 0.5593
Total | 14.6086957 68 .21483376 Root MSE = .30769

<snip>

But see what happens when I run a logit

. logit foreign price rep78 bug

note: bug != 1 predicts failure perfectly
bug dropped and 35 obs not used

Iteration 0: log likelihood = -22.616945
Iteration 1: log likelihood = -13.458773
Iteration 2: log likelihood = -12.034404
Iteration 3: log likelihood = -11.766442
Iteration 4: log likelihood = -11.749228
Iteration 5: log likelihood = -11.749093

Logistic regression Number of obs = 34
LR chi2(2) = 21.74
Prob > chi2 = 0.0000
Log likelihood = -11.749093 Pseudo R2 = 0.4805

I think therein lies the problem: something in your list
of x's is perfectly predicting your y.
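One way to see which indicator is responsible is to cross-tabulate it against the outcome. A minimal sketch, assuming the auto data with the made-up bug variable generated above:

```stata
* Sketch only, using the made-up -bug- variable generated above
* Cross-tabulate the suspect indicator against the outcome,
* restricted to observations with no missing regressors
tabulate bug foreign if !missing(price, rep78)
```

An empty cell in the table (here, no foreign cars where bug is 0) is the pattern that -logit- reports as predicting failure perfectly.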

Anyhow, it is show and tell time for you. You have told,
so maybe you should show exactly what you typed and the headers of the output.

hth

Ronnie

Nishant Dass wrote:
> Hi Maarten,
> Thanks for the link. I read it but I wonder - does perfect
> prediction result in exclusion of those observations?
>
> Hi Richard,
> I checked again and my runs aren't different. I simply
> replaced the -reg- with -logit- and re-run the command, and
> get a different no. of obs. I am not sure how useful would
> pasting my command be for you because there's really
> nothing different between the two commands that I am
> running (except the estimation method.)
>
> Nishant
>
> --- Richard Williams <Richard.A.Williams.5@ND.edu> wrote:
>
>> That shouldn't be happening. I suspect there is something different
>> between your runs, e.g. are you using a different dependent
>> variable? Perhaps you could show the commands and output.
>>
>> At 02:11 PM 10/3/2006, Nishant Dass wrote:
>> >Dear list members,
>> >
>> >I am estimating a -probit-/-logit- model and my question is
>> >about its comparison with the linear probability model
>> >(simple OLS).
>> >
>> >When I run the -probit- or -logit-, the number of
>> >observations is the same but much less when compared with
>> >the OLS estimation of the very same model! (E.g., the no.
>> >of obs. in my probit and logit estimate is 12,000 while
>> >it's only 10,000 in the OLS regression.)
>> >
>> >Could anyone please tell me why do -probit- and -logit-
>> >drop these observations?
>> >
>> >Thank you very much,
>> >
>> >Nishant
>> -------------------------------------------
>> Richard Williams, Notre Dame Dept of Sociology
>> OFFICE: (574)631-6668, (574)631-6463
>> FAX: (574)288-4373
>> HOME: (574)289-5227
>> EMAIL: Richard.A.Williams.5@ND.Edu
>> WWW (personal): http://www.nd.edu/~rwilliam
>> WWW (department): http://www.nd.edu/~soc


```
*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
```