# Re: st: linear probability model vs. probit/logit

From: Kenneth Flamm
To: statalist@hsphsun2.harvard.edu, statalist@hsphsun2.harvard.edu
Subject: Re: st: linear probability model vs. probit/logit
Date: Thu, 05 Oct 2006 11:23:05 -0500

"Now, some experiments revealed that linprob and logit use the same observations, because for the
observations that are dropped by logit the correlation between the perfect predictor and hat is one,
and variables causing multicollinearity should be dropped by linear regression as well."

Could you explain? Since the linear probability model is not constrained to produce predicted values that lie in the [0,1] range, how can there be a correlation of 1 between the "perfect predictor" and the predicted probability? Your linprob routine must be dropping all observations with predicted probabilities outside the [0,1] range, which is not generally done with LPMs. Is this written up anywhere?
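
To make the point about unconstrained predictions concrete, here is a minimal Python sketch (not Stata, and not taken from linprob). The data are made up and `ols_fit` is a hypothetical helper; it fits an ordinary least-squares line to a binary outcome and lists the fitted values that fall outside [0,1].

```python
def ols_fit(x, y):
    """Return (intercept, slope) of a simple least-squares line."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Made-up illustration data: binary outcome y, one regressor x.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [0, 0, 0, 1, 0, 1, 1, 1]

intercept, slope = ols_fit(x, y)
fitted = [intercept + slope * xi for xi in x]

# Fitted "probabilities" that are below 0 or above 1.
out_of_range = [(xi, f) for xi, f in zip(x, fitted) if f < 0 or f > 1]

for xi, f in out_of_range:
    print(f"x = {xi}: fitted value {f:.3f} lies outside [0,1]")
```

With these numbers the smallest and largest values of x get fitted values below 0 and above 1, which is the situation the question is about.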

Also, robust standard errors are another means of dealing with the heteroskedasticity problem.
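
As a rough illustration of that alternative, the Python sketch below (standard library only, made-up data, hypothetical function name `ols_with_se`) computes the slope of a one-regressor linear probability model together with a classical standard error and a heteroskedasticity-robust (sandwich) one. The n/(n-k) small-sample scaling used here is an assumption about convention; other variants of the robust estimator exist.

```python
import math


def ols_with_se(x, y):
    """Slope of y = a + b*x with classical and robust standard errors."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    dx = [xi - mean_x for xi in x]
    sxx = sum(d * d for d in dx)
    slope = sum(d * (yi - mean_y) for d, yi in zip(dx, y)) / sxx
    intercept = mean_y - slope * mean_x
    resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]

    # Classical: one common error variance, estimated with n - 2 df.
    se_classical = math.sqrt(sum(e * e for e in resid) / (n - 2) / sxx)

    # Robust (sandwich): each observation contributes its own squared
    # residual; n / (n - 2) is the assumed small-sample scaling.
    sandwich = sum((d * e) ** 2 for d, e in zip(dx, resid)) / sxx ** 2
    se_robust = math.sqrt(sandwich * n / (n - 2))
    return slope, se_classical, se_robust


# Made-up illustration data: binary outcome y, one regressor x.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [0, 0, 0, 1, 0, 1, 1, 1]

slope, se_classical, se_robust = ols_with_se(x, y)
print(f"slope = {slope:.4f}")
print(f"classical SE = {se_classical:.4f}, robust SE = {se_robust:.4f}")
```

The coefficient is the same under both approaches; only the standard error changes, so no observations need to be dropped or reweighted.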

At 09:15 AM 10/5/2006, Tamas Bartus (tbartus) wrote:

Hi,

The previous discussion seemed to assume that the linear probability model is
a simple regression model.

However, the linear probability model should be a two-step weighted regression, that is:
first estimate the regression, save the predicted value (hat), calculate hat*(1-hat),
then reestimate the model with analytic weight N/(hat*(1-hat)).
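
The two steps can be sketched in a few lines of Python (standard library only). This is an illustration under stated assumptions, not the LINPROB code: the data are made up, `wls_fit` is a hypothetical helper, and observations whose first-step prediction falls outside (0,1) are simply dropped, because the weight 1/(hat*(1-hat)) is negative or undefined there. The weight comes from the variance of a binary outcome, which is p*(1-p); a constant factor such as N does not change the coefficients.

```python
def wls_fit(x, y, w):
    """Weighted least squares for y = a + b*x; returns (a, b)."""
    sw = sum(w)
    mean_x = sum(wi * xi for wi, xi in zip(w, x)) / sw
    mean_y = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mean_x) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mean_x) * (yi - mean_y)
              for wi, xi, yi in zip(w, x, y))
    b = sxy / sxx
    return mean_y - b * mean_x, b


# Made-up illustration data: binary outcome y, one regressor x.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [0, 0, 0, 1, 0, 1, 1, 1]

# Step 1: unweighted regression, save the predicted values (hat).
a1, b1 = wls_fit(x, y, [1.0] * len(x))
hat = [a1 + b1 * xi for xi in x]

# Step 2: reweight by 1 / (hat * (1 - hat)) and re-estimate.
# Assumption: observations with hat outside (0,1) are dropped.
keep = [i for i, h in enumerate(hat) if 0 < h < 1]
x2 = [x[i] for i in keep]
y2 = [y[i] for i in keep]
w2 = [1.0 / (hat[i] * (1.0 - hat[i])) for i in keep]
a2, b2 = wls_fit(x2, y2, w2)

print(f"step 1: intercept {a1:.4f}, slope {b1:.4f}, n = {len(x)}")
print(f"step 2: intercept {a2:.4f}, slope {b2:.4f}, n = {len(x2)}")
```

How out-of-range predictions are handled is the design choice that matters here: dropping them changes the estimation sample, which is one reason the second-step sample can differ from the first.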

This procedure is implemented in my old and simple LINPROB (downloadable from my website).

Now, some experiments revealed that linprob and logit use the same observations, because for the
observations that are dropped by logit the correlation between the perfect predictor and hat is one,
and variables causing multicollinearity should be dropped by linear regression as well.

Hope this helps,

Tamas

------------------------------------------------

Tamas Bartus, PhD
Associate Professor, Institute of Sociology and Social Policy
Corvinus University, Budapest
1093 Budapest, Fovam ter 8.
Phone: +36-1-482-5290 Fax: +36-1-482-5226
Homepage: www.uni-corvinus.hu/bartus

----- Original message -----
Date: Wednesday, October 4, 2006 7:31 am
Subject: Re: st: linear probability model vs. probit/logit

> Ronnie Babigumira wrote (excerpted):
>
> Does it make sense that Stata drops a variable that predicts perfectly and
> then goes ahead to drop the observations even when it does not use the
> problem variable in the regression? Any insights into what is going on?
> --------------------------------------------------------------------------------
>
> Take a look at _Release 9 Reference K-Q_ Page 98. This is in the entry
> for -logit-. At the top of the page, you'll see output of a logistic
> regression using the auto dataset, and with a variable dropped and 10
> observations omitted.
>
> The paragraphs beneath the printout, including the technical note at the
> bottom of the page, give the reasoning behind omitting observations after
> a perfectly predicting variable (but not after a collinear predictor) has
> been dropped from the list of candidate predictors.
>
> Joseph Coveney
>
>
> *
> * For searches and help try:
> * http://www.stata.com/support/faqs/res/findit.html
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/
>
