Stata The Stata listserver

Re: st: two stage model variance estimators

From   "Rachel Bouvier" <>
To   <>
Subject   Re: st: two stage model variance estimators
Date   Thu, 15 Sep 2005 14:14:55 -0400

>>> 09/14/05 7:55 PM >>>
it would actually help if you could send your commands to the list so
that we see what's going on.

Of course.  Thank you for your time.

My first model regresses the log of GDP on the year (e.g., 1996) and
the square of the year (e.g., 3984016).  Originally, I ran this model
separately for each of 30 countries.  I then obtained the predicted value
of the log of GDP for each of those countries (using -predict-) and used
it as a regressor in a second model.
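
For concreteness, the first stage described above can be sketched as a loop. This is only an illustration: it assumes the variables are named lngdp, year, yearsq and index as in the commands further down, and lngdp_hat and fit_tmp are hypothetical names for the fitted values.

```stata
* Sketch only: one regression per country, fitted values collected in one variable.
* lngdp_hat and fit_tmp are hypothetical names, not from the original commands.
generate double lngdp_hat = .
forvalues i = 1/30 {
    quietly regress lngdp year yearsq if index == `i'
    * linear prediction for the country just estimated
    quietly predict double fit_tmp if index == `i', xb
    quietly replace lngdp_hat = fit_tmp if index == `i'
    drop fit_tmp
}
```

lngdp_hat then holds each country's own fitted values and can be used as the regressor in the second model.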

The suggestion was to interact year and year squared with each of the
countries in the dataset so that I could put them all in one regression,
using the country dummies and the -noconst- option.  Seems sensible, but
when I tried it, Stata dropped all the country dummies.  Here was my
code (I know there are more parsimonious ways to do this, but...):

* All the countries are given an index from 1 to 30.
gen mol = 1 if index == 1
replace mol = 0 if index ~= 1
gen arm = 1 if index == 2
replace arm = 0 if index ~= 2

etc.  This generated dummies with 1 if the observation belonged to that
country (Moldova is #1, for example).
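
One of the more parsimonious ways alluded to above, as a sketch: -tabulate- with the generate() option creates one indicator per value of index in a single command. The stub name cdum is hypothetical, and the resulting variables are numbered (cdum1 for index 1, and so on) rather than named by country.

```stata
* Sketch only: creates cdum1 ... cdum30, one 0/1 indicator per country index.
* cdum is a hypothetical stub name.
quietly tabulate index, generate(cdum)
* cdum1 corresponds to index == 1 (Moldova), cdum2 to index == 2 (Armenia)
```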

Then, I interacted the dummies with both year and year squared:

gen yrmol=year*mol 
gen yr2mol=yearsq*mol 
gen yrarm=year*arm
gen yr2arm=yearsq*arm
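
The same interactions can be written as a loop, which is shorter when there are 30 countries. This is a sketch that stands in for the line-by-line statements above (running both would fail because the variables would already exist). Only mol and arm are listed because those are the only dummy names shown here; the remaining country names would be added to the list.

```stata
* Sketch only: builds the year and year-squared interactions for each dummy.
* Extend the list after "in" with the other country dummies.
foreach c in mol arm {
    generate yr`c'  = year   * `c'
    generate yr2`c' = yearsq * `c'
}
```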


Finally, I ran a regression that looks like:

regress lngdp yrmol yr2mol yrarm yr2arm ... mol arm ... , nocons

where mol = Moldova, arm = Armenia, and so on.
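
For readers on Stata 11 or later, the same fully interacted specification can be sketched with factor-variable notation, which avoids creating the dummies and interactions by hand. This is an assumption about the software version: factor variables did not exist when this message was written. The variable names are the ones used above.

```stata
* Sketch only (requires Stata 11 or later for factor-variable notation).
* ibn.index            : one intercept per country, no base level omitted
* ibn.index#c.year     : one year slope per country
* ibn.index#c.yearsq   : one year-squared slope per country
regress lngdp ibn.index ibn.index#c.year ibn.index#c.yearsq, noconstant
```

The ibn. prefix asks for all 30 levels to be kept, which is what the noconstant option is meant to allow.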

When I run this, I get the following (truncated):

     lnpppc1 |      Coef.   Std. Err.      t    P>|t|     [95% Conf.
         mol |  (dropped)
         arm |  (dropped)
       yrmol |   .2019759   1.003532     0.20   0.841    -1.772746
      yr2mol |  -.0001006   .0005037    -0.20   0.842    -.0010917
       yrarm |    .149713   .4002733     0.37   0.709    -.6379338
      yr2arm |  -.0000744   .0002011    -0.37   0.712    -.0004702

However, when I run the following:

regress lngdp year yearsq if index ==1

for example, I do get results.  That is what I did in order to get the
predicted variables for the second stage.
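
One check that may be worth trying, offered as a sketch and not as a confirmed diagnosis: with year near 1996 and its square near 4 million, the two regressors are almost perfectly correlated over a short sample and are also close to a multiple of each country's dummy, which can lead to terms being dropped. Centering year before squaring reduces that problem. The names yearc and yearc2, and the choice of 1996 as the centering value, are assumptions for illustration.

```stata
* Sketch only: yearc and yearc2 are hypothetical names; 1996 is an arbitrary center.
* How strongly are the raw regressors correlated within one country?
correlate year yearsq if index == 1

* Center year, then square the centered value
generate double yearc  = year - 1996
generate double yearc2 = yearc^2

* Single-country regression with the centered terms
regress lngdp yearc yearc2 if index == 1
```

The fitted values from the centered single-country regression should match those from the uncentered one, because centering only reparameterizes the quadratic. Interactions for the pooled regression would then be built from yearc and yearc2 in place of year and yearsq.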

Anything jump out at you?  Again, thank you for your time and patience.

On 9/14/05, Rachel Bouvier <> wrote:
> Hi again.  I tried interacting my xs with country specific dummies
> running them in a single equation as suggested.  Stata is dropping
> country dummies, even though I specify the nocons option.  (I
> now that this was why I had originally run it in 30 different
> - it works fine that way,  but not if I put them all into one
>  Am I doing something wrong?  It could be because xsq is the square
> x, but I don't understand why stata would let me do it for an
> country but not together.  Sorry for being obtuse.  -Rachel
> >>> 09/13/05 4:50 PM >>>
> a possible solution could be to run in a single model  the equation
>  (1) y = b1 x + b2 xsq
> interacting your x's with country specific dummies.
> In other words, you could run a fully interactive model which is
> equivalent to running 30 different regressions but in a single
> equation. (make sure you include the country specific dummies too
> would account for the constant in your separate regressions and
> specify the nocons option).
> hope this helps.
> robert
> On 9/13/05, Rachel Bouvier <> wrote:
> > Dear statalisters *
> >
> > I am confronting a problem much like that described by James
> in volume 2, issue 3 of the Stata Journal, "The robust variance
> estimator for two-stage models," where he gave an illustration of
> code to construct the Murphy-Topel variance estimator.
> >
> > I am using a variable (call it yhat), predicted in a first (series
> of) equations, as a regressor in my second equation.
> >
> > In other words, my first (series of) regressions looked like this:
> > (1) y = b1 x + b2 xsq
> >
> > Then, I predicted yhat from that regression, and used that in a
> second regression:
> > (2) z = b1 yhat + b2 x2 + b2 x3*
> >
> > I say "series of" regressions because I have a panel of 30
>  Rather than run one panel data regression and predict each
> yhat from that, I ran each country as a separate regression, not
> to assume that they could be pooled.  In other words, I ran equation
> 30 different times, for each country in the dataset.  (It seemed to
> sense at the time, to both me and my committee!)
> >
> > Therein lies my problem.  I would like to adjust the standard
> for the fact that I predicted yhat, but as I ran a different
> for each country, the solution is not as easy as constructing the
> Murphy-Topel estimator.  Does anyone have any suggestions? Any help
> would be much appreciated, before I dive into something that is
> undoubtedly over my head.  Thanks.
