# st: problems with Murphy-Topel

| Field | Value |
| --- | --- |
| From | "Rachel Bouvier" <[email protected]> |
| To | <[email protected]> |
| Subject | st: problems with Murphy-Topel |
| Date | Mon, 29 Oct 2007 12:57:00 -0400 |

```
Hi Stata-listers:

I’m working on a problem dealing with the Murphy-Topel procedure as
outlined in Hardin (2002) and Hole (2006).  Briefly, I have a very large
stacked model in the first stage, consisting of 31 countries and up to
17 years (some countries have a shorter time series).  It is stacked so
that the intercepts and the coefficients can vary for each country,
quite a while ago, so it may seem familiar to some.)

I use OLS to predict the trend of income over time for each country,
and call this variable “new_predict”, and its square,
“new_pred2.”  I also generate another variable, called
“new_flux,” which is made up of the residuals from the first
stage (or how far income falls from its predicted trend).  I then use
those three variables from the first stage in my second stage model
(also using OLS).  I need to adjust the standard errors from the second
stage model because those three variables were generated in the first
stage.

My former dissertation advisor and I modified the code used in the Hole
essay to account for the fact that I use OLS in both stages.  We’re
running into two problems, though:  if I try to display the results
using the matrices b and M, I get an error saying that the matrix is not
positive definite.  Further, the standard error for at least one of my
estimated coefficients is smaller than it was before doing the
adjustment, which doesn’t seem plausible.  The problem likely lies
somewhere in the correlation between new_predict, new_pred2, and
new_flux.  I think the problem might be in how we defined “zz,” but
I’m not sure.  I’m wondering if the fact that we used the residuals
from the first stage as a variable in the second stage needs to be
accounted for in a different way.

If anyone out there has any suggestions, I’d be very grateful.  I am
out of my comfort zone with this!  Thanks.  I’ve copied the code
below.  -Rachel

PS.  I’m using Stata 7, but if a later version of Stata is necessary,
I’ll just bite the bullet and pay for it (my department has no money
for such trifles!)

*first stage:
xi: regress lnpppc i.code*year i.code|yearsq

predict double new_predict

predict double res1, res
gen double res1sq=res1^2
quietly sum res1sq

scalar mse=r(mean)

matrix V1=(e(df_r)/e(N) )* e(V)

gen double s1=res1*(1/mse)
gen new_flux = res1
gen new_pred2=new_predict^2

*second stage, need to use new prefix:

xi, prefix(rev): regress lnpoll new_predict new_pred2 new_flux pop_dens year i.code
matrix b = e(b)
matrix V2 = (e(df_r)/e(N) )* e(V)

predict double res2, res
gen double res2sq=res2^2
quietly sum res2sq
scalar mse2=r(mean)
gen double s2= res2*(1/mse2)

/* zz: this is how the generated variables affect the stage-2 outcome */
generate zz = _b[new_predict] + 2*new_predict*_b[new_pred2] - _b[new_flux]

*Calculate C using scores:
gen byte cons=1

matrix accum C = _Icode_* year _IcodXyear_* yearsq _IcodXyeara* cons new_predict new_pred2 new_flux pop_dens year revcode_* cons [iw=s2*s2*zz], nocons

*Calculate R using scores:

matrix accum R = _Icode_* year _IcodXyear_* yearsq _IcodXyeara* cons new_predict new_pred2 new_flux pop_dens year revcode_* cons [iw=s2*s1], nocons

matrix C = C[94..129,1..93]

matrix R = R[94..129,1..93]

matrix M = V2 + (V2 * (C*V1*C' - R*V1*C' - C*V1*R') * V2)

capture program drop doit
matrix b=e(b)
program define doit, eclass
est post b M
est local vcetype "Mtopel"
est display
end
doit

Rachel Bouvier, Ph.D.
Assistant Professor of Economics
University of Southern Maine
PO Box 9300
11 Chamberlain Avenue
Portland, ME 04104
(207) 228-8377

```
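
For reference, the two quantities the posted code builds by hand can be written out as below. This is only a sketch in assumed notation, none of which appears in the original post: $\hat{p}$ stands for `new_predict`, $y$ for `lnpppc` (so that `new_flux` is $y - \hat{p}$), $\eta_2$ for the second-stage linear predictor, and $\beta_1$, $\beta_2$, $\beta_3$ for the second-stage coefficients on `new_predict`, `new_pred2`, and `new_flux`. The matrix dimensions are read off the extraction `C[94..129,1..93]`, which implies 93 first-stage and 36 second-stage parameters.

```latex
% Second-stage linear predictor, showing only the generated regressors
\eta_2 = \beta_1 \hat{p} + \beta_2 \hat{p}^{2} + \beta_3 \,(y - \hat{p}) + \dots

% Its derivative with respect to the first-stage prediction;
% this is the expression the code stores in zz
\frac{\partial \eta_2}{\partial \hat{p}} = \beta_1 + 2 \beta_2 \hat{p} - \beta_3

% Adjusted covariance matrix as assembled in the code
% V_1: 93 x 93,  V_2: 36 x 36,  C and R: 36 x 93
M = V_2 + V_2 \left( C V_1 C' - R V_1 C' - C V_1 R' \right) V_2
```

As far as the algebra goes, only $C V_1 C'$ is positive semidefinite by construction. The two cross terms involving $R$ are subtracted, so nothing in the formula itself forces the sample value of $M$ to be at least as large as $V_2$ on every diagonal element.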