
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at


RE: st: endogenous regression with OLS first stage and logit main regression

From   Cameron McIntosh <>
Subject   RE: st: endogenous regression with OLS first stage and logit main regression
Date   Fri, 21 Oct 2011 16:29:24 -0400

The error variance in logistic regression is fixed at π²/3 for identification purposes, so there is no estimated mean square error to rescale by, and I believe that may be the reason for your perceived roadblock. Some references on this scaling issue:
Fielding, A. (2004). Scaling for residual variance components of ordered category responses in generalised linear mixed multilevel models. Quality and Quantity, 38, 425-433.
Bauer, D.J. (2009). A note on comparing the estimates of models for cluster-correlated or longitudinal data with binary or ordinal outcomes. Psychometrika, 74(1), 97-105.
Allison, P.D. (1999). Comparing logit and probit coefficients across groups. Sociological Methods & Research, 28(2), 186-208.
Mood, C. (2010). Logistic regression: Why we cannot do what we think we can do, and what we can do about it. European Sociological Review, 26(1), 67-82.
Williams, R. (2009). Using heterogeneous choice models to compare logit and probit coefficients across groups. Sociological Methods & Research, 37(4), 531-559.
Long, J.S. (2009). Group comparisons in logit and probit using predicted probabilities. Working paper, draft 2009-06-25.
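To make the scaling point concrete, here is a short sketch of the usual latent-variable formulation. The coefficient symbols \beta_0, \beta_1, \beta_2 and the scale factor c are my own notation, not something from the original question, which writes the model without coefficients.

```latex
% Latent-variable form of the main regression (notation assumed for illustration)
y_1^{*} = \beta_0 + \beta_1 y_2 + \beta_2 x_1 + e,
\qquad
y_1 = \mathbf{1}\{\, y_1^{*} > 0 \,\}

% The error variance is fixed by the choice of link, not estimated:
\operatorname{Var}(e) = \frac{\pi^{2}}{3} \ \text{(logit)},
\qquad
\operatorname{Var}(e) = 1 \ \text{(probit)}

% Rescaling the latent equation by any c > 0 leaves the observed y_1 unchanged,
% so only the ratio of coefficients to the error scale is identified:
c\,y_1^{*} > 0 \iff y_1^{*} > 0
```

Because the residual scale is set by convention, there is no free residual variance for the software to report as a mean square error.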
Offhand, I'm not 100% sure whether you can simply substitute that quantity into the code below and have everything work, but it may. You may also want to consider a mediational model of the form Z ---> X ---> Y in -cmp- or -gllamm-, for example, where you can combine linear and logit link functions in a simultaneous-equation framework:
Roodman, D. (2011). Fitting fully observed recursive mixed-process models with cmp. The Stata Journal, 11(2), 159-206.
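A minimal sketch of what such a recursive system might look like in -cmp-, assuming hypothetical variable names (y1 for the binary outcome, y2 for the endogenous regressor, x1 for the exogenous control, z for the single instrument). As far as I know, -cmp- fits the binary equation as a probit rather than a logit, so this is the probit version of the idea:

```stata
* Hypothetical variable names: y1 (binary outcome), y2 (endogenous regressor),
* x1 (exogenous control), z (the one instrument).
* -cmp- is user-written; it is assumed here to be installed already
* (e.g. via: ssc install cmp, along with its dependency ghk2).

cmp setup                                  // defines the $cmp_* indicator macros

* Equation 1: linear first stage, using only z as the instrument.
* Equation 2: probit main equation with y2 entering as a regressor.
cmp (y2 = z) (y1 = y2 x1), indicators($cmp_cont $cmp_probit)
```

Because both equations are estimated jointly, the reported standard errors need no separate after-the-fact correction of the kind discussed in the code below, and x1 is kept out of the first stage simply by not listing it there.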
My two cents,
> Date: Fri, 21 Oct 2011 10:05:50 -0700
> From:
> To:
> Subject: st: endogenous regression with OLS first stage and logit main regression
> Hello,
> I am trying to run an instrumental variables regression with an OLS first stage and a logit (or probit, if necessary) main regression. We have a "triangular" system, which means that our endogenous regressor isn't a function of the dependent variable. Therefore, our system looks like this:
> main regression (logit or probit):
> y1 = y2 + x1 + e
> first stage (OLS):
> y2 = instrument + error
> The problem with using -ivprobit- is that the first stage will use all the exogenous regressors from the main regression as instruments, which we don't want or need, as using only our one instrument won't bias the results. There is a very useful discussion about how to use our one instrument in the first stage instead of all the exogenous regressors at the bottom of the FAQ below:
> It says to simply run the first-stage regression and substitute y2hat for y2 in the main regression. No problems there. But we run into trouble when trying to correct the variance-covariance matrix with the "correct mean square error". The code is written for a system where the first and main stages are both OLS. Is there a mean square error for logit? Stata doesn't produce it.
> Here is the code which recalculates the var-covar matrix after running an OLS main stage regression with y2hat substituted in.  
>  . scalar realmse = r(mean)*r(N)/e(df_r) 
>  . matrix bmatrix = e(b)
>  . matrix Vmatrix = e(V)
>  . matrix Vmatrix = e(V) * realmse / e(rmse)^2
>  . ereturn post bmatrix Vmatrix, noclear
>  . ereturn display
> Any help with the issue with correcting the var-covar matrix would be most appreciated.  
> Thanks!!
> Karen Ruckman
> Associate Professor
> Beedie School of Business, SFU
