RE: st: RE: First stage of panel IV


From   "Schaffer, Mark E" <M.E.Schaffer@hw.ac.uk>
To   <statalist@hsphsun2.harvard.edu>
Subject   RE: st: RE: First stage of panel IV
Date   Sat, 16 Jun 2012 12:24:28 +0100

Hi Filippos.  Sorry for the delay in replying.  Some responses below:

> -----Original Message-----
> From: owner-statalist@hsphsun2.harvard.edu 
> [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of 
> Filippos Petroulakis
> Sent: 13 June 2012 00:15
> To: statalist@hsphsun2.harvard.edu
> Subject: RE: st: RE: First stage of panel IV
> 
> Hi Mark,
> 
> 1) I have version 01.0.13 for xtivreg2, 03.0.08 for ivreg2, 
> and 01.3.01 for ranktest

xtivreg2 is up-to-date but ivreg2 and ranktest are not.

I tried to replicate your problem (different first-stage results
reported by xtivreg2 and official xtivreg when there are singletons) but
couldn't - using the up-to-date versions, the first-stage and final
outputs of xtivreg2 and xtivreg match.

If updating doesn't solve your problem, can you contact me off-list and
we can try to work out what's going on?
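
For reference, a minimal sketch of how the installed versions can be checked and refreshed from within Stata.  It assumes all three packages were installed from SSC; if they came from somewhere else, the -ssc install- lines would not apply as written.

```stata
* Report the file location and version line of each user-written command
which xtivreg2
which ivreg2
which ranktest

* Assumption: the packages come from SSC; replace the installed copies
ssc install ivreg2, replace
ssc install ranktest, replace
ssc install xtivreg2, replace
```

Rerunning the three -which- commands afterwards shows whether the version lines have changed.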

> 2) I get the exact same results. In fact I ran both and 
> created a variable equal to e(sample) for each and they are identical
> 
> 3) You write
> 
> >It sounds like that's because the instrumenting of either w or q is 
> >weak.  But that's what the Angrist-Pischke F-stats are for.  If the AP 
> >F-stat for the regressor of interest is respectable, then you're OK.
> 
> But I'm not instrumenting for them. The problem arises when I 
> merely put them in the regression so they are included in the 
> first stage of x on z and the other covariates (so that they 
> become included instruments).

Apologies - in your previous posting you said that in

y_it=b_0+b_1 x_it + b_k demographic_covariates + w_it + q_it + e_it ,

"w_it and q_it are correlated with x_it and with e_it"

so I thought you were instrumenting for them.  But now I understand.

So ... if I understand correctly, what is happening is that b_1, the
coeff on the endogenous regressor x_it, becomes weakly identified when
you include w and q as exogenous regressors.

I think what is happening is that the component of x that your excluded
instrument z is correlated with is also correlated with w and q.

If you think about it in a mechanical way, the weak ID diagnostic is the
first-stage F stat for the significance of z in the regression

x_it = b_k demographic_covariates + z_it + v_it

The F for a test of the significance of z in the above regression is
big, but when you add w and q,

x_it = b_k demographic_covariates + z_it + w_it + q_it + v_it

the F for the test of z becomes small.  So, loosely speaking, a lot of
the ability of z to explain x disappears when it has to compete with w
and q.  Presumably the SEs on w and q are on the small side.
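
To make the mechanics concrete, here is a small simulated sketch.  It is not the poster's data or model: it is a plain cross-section rather than first differences, the variable names simply mirror the notation above, and every coefficient is made up.  The only point is that z is constructed to share most of its variation with w and q.

```stata
* Illustrative simulation only: all numbers below are invented
clear
set seed 20120616
set obs 2000

* w and q play the role of the included exogenous regressors
generate w = rnormal()
generate q = rnormal()

* Assumption: z shares most of its variation with w and q
generate z = 0.7*w + 0.7*q + 0.2*rnormal()

* x depends on z; w and q matter only through their overlap with z
generate x = z + rnormal()

* First stage without w and q: F test on the excluded instrument z
regress x z, vce(robust)
test z

* First stage with w and q: same test, but z now competes with w and q
regress x z w q, vce(robust)
test z
```

With this design the F for z should be much smaller in the second regression, because only the part of z that w and q do not explain is left to identify its coefficient.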

I don't know what your application is, but it's possible that this could
be a case of what Angrist and Pischke ("Mostly Harmless Econometrics",
Princeton U.P. 2009, pp. 64-68) call the "bad control" problem.  If so,
the solution is to omit w and q altogether.

HTH,
Mark

> Thanks again,
> 
> Filippos
> 
> >>> "Schaffer, Mark E" <M.E.Schaffer@hw.ac.uk> 06/12/12 4:36 AM >>>
> Filippos,
> 
> > -----Original Message-----
> > From: owner-statalist@hsphsun2.harvard.edu
> > [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Filippos 
> > Petroulakis
> > Sent: 12 June 2012 04:11
> > To: statalist@hsphsun2.harvard.edu
> > Subject: Re: st: RE: First stage of panel IV
> > 
> > Hi Mark and thanks for your response. I am using Stata 11 and the 
> > up-to-date version of xtivreg2.
> 
> Can you check/report to us your versions of xtivreg2, ivreg2 
> and ranktest?
> 
> > I'll start another list
> > as it's getting too crowded
> > 
> > 1) I was probably mistaken about this. Just to make sure, is running 
> > OLS on a panel with all variables differenced identical to running 
> > first differences?
> 
> Should be the case.
> 
> > 2) About xtivreg versus xtivreg2, I am certain: to make sure, I ran 
> > xtivreg2, then copied the code and just removed the 2. For 
> > xtivreg2 the first stage output is
> > 
> > 
> > .   
> > .   xtivreg2 y ( x =z)  ///
> > l.bzo  ln_tunder15 ln_t15to24 ln_t25to44 ln_t45to64 ln_t65plus 
> > ln_fraction_male ln_pop_hisp_all
> > ln_pop_nh_black ln_pop_nh_white, fd   small first robust
> > 
> > FIRST DIFFERENCES ESTIMATION
> > ----------------------------
> > Number of groups =      1026                    Obs per group: min =         1
> >                                                                avg =       1.9
> >                                                                max =         2
> > 
> > First-stage regressions
> > -----------------------
> > 
> > First-stage regression of D.ln_stim_forfd:
> > 
> > OLS estimation
> > --------------
> > 
> > Estimates efficient for homoskedasticity only
> > Statistics robust to heteroskedasticity
> > 
> >                                                       Number of obs =     1928
> >                                                       F( 11,  1916) =    53.82
> >                                                       Prob > F      =   0.0000
> > Total (centered) SS     =  1631.878852                Centered R2   =   0.2035
> > Total (uncentered) SS   =  59641.31821                Uncentered R2 =   0.9782
> > Residual SS             =  1299.720342                Root MSE      =    .8236
> > 
> > ------------------------------------------------------------------------------
> >           D. |               Robust
> >            x |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
> > -------------+----------------------------------------------------------------
> >       ln_bzo |
> >          LD. |  -.1256429   .0767888    -1.64   0.102    -.2762413    .0249555
> >              |
> >  ln_tunder15 |
> >          D1. |   5.698369   3.323788     1.71   0.087    -.8202531    12.21699
> >              |
> >   ln_t15to24 |
> >          D1. |   6.831107   2.217364     3.08   0.002     2.482406    11.17981
> >              |
> >   ln_t25to44 |
> >          D1. |   32.90151   3.672888     8.96   0.000     25.69823    40.10479
> >              |
> >   ln_t45to64 |
> >          D1. |   8.227723   3.417776     2.41   0.016     1.524771    14.93068
> >              |
> >   ln_t65plus |
> >          D1. |    4.88607   2.705773     1.81   0.071    -.4205002    10.19264
> >              |
> > ln_fractio~e |
> >          D1. |   -3.86913    9.62245    -0.40   0.688    -22.74071    15.00245
> >              |
> > ln_pop_his~l |
> >          D1. |  -6.352615   1.729689    -3.67   0.000    -9.744887   -2.960343
> >              |
> > ln_pop_nh_~k |
> >          D1. |   7.426661   3.028961     2.45   0.014     1.486254    13.36707
> >              |
> > ln_pop_nh_~e |
> >          D1. |  -5.481275   3.728017    -1.47   0.142    -12.79267    1.830123
> >              |
> >            z |
> >          D1. |   3.404714   .2067276    16.47   0.000     2.999279    3.810149
> >              |
> >        _cons |   5.682521   .0508073   111.84   0.000     5.582877    5.782164
> > ------------------------------------------------------------------------------
> > 
> > 
> > With xtivreg it is
> > 
> > 
> > .   xtivreg y ( x =z)  ///
> > l.ln_crime  ln_tunder15 ln_t15to24 ln_t25to44 ln_t45to64 ln_t65plus 
> > ln_fraction_male ln_pop_hisp_all
> > ln_pop_nh_black ln_pop_nh_white, fd   small first
> > 
> > First-stage first-differenced regression
> > 
> >       Source |       SS       df       MS              Number of obs =     901
> > -------------+------------------------------           F( 11,   889) =   17.42
> >        Model |  58.4561519    11  5.31419563           Prob > F      =  0.0000
> >     Residual |  271.136523   889  .304990465           R-squared     =  0.1774
> > -------------+------------------------------           Adj R-squared =  0.1672
> >        Total |  329.592675   900  .366214084           Root MSE      =  .55226
> > 
> > ------------------------------------------------------------------------------
> >           D. |
> >            x |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
> > -------------+----------------------------------------------------------------
> >       ln_bzo |
> >          LD. |  -.0553261   .0661387    -0.84   0.403    -.1851322    .0744801
> >              |
> >  ln_tunder15 |
> >          D1. |     14.637   2.988974     4.90   0.000     8.770729    20.50326
> >              |
> >   ln_t15to24 |
> >          D1. |   12.06724   2.274707     5.30   0.000     7.602818    16.53166
> >              |
> >   ln_t25to44 |
> >          D1. |   28.67612   3.411086     8.41   0.000      21.9814    35.37084
> >              |
> >   ln_t45to64 |
> >          D1. |   9.750316   3.753647     2.60   0.010     2.383274    17.11736
> >              |
> >   ln_t65plus |
> >          D1. |   9.117776   2.449422     3.72   0.000     4.310454     13.9251
> >              |
> > ln_fractio~e |
> >          D1. |   20.81802   9.297503     2.24   0.025     2.570409    39.06564
> >              |
> > ln_pop_his~l |
> >          D1. |  -3.244084   1.379805    -2.35   0.019     -5.95214   -.5360282
> >              |
> > ln_pop_nh_~k |
> >          D1. |  -2.669608   2.386585    -1.12   0.264    -7.353605     2.01439
> >              |
> > ln_pop_nh_~e |
> >          D1. |  -18.12162   4.160963    -4.36   0.000    -26.28808   -9.955169
> >              |
> >            z |
> >          D1. |  -1.102999   .2832308    -3.89   0.000    -1.658877   -.5471196
> >              |
> >        _cons |   6.306457   .0505186   124.83   0.000     6.207308    6.405607
> > ------------------------------------------------------------------------------
> > 
> > 
> > 
> > The sample size is less than half in xtivreg. What is particularly odd 
> > is the fact that the 2nd stage coefficients are very close and the 
> > reported observations and groups are now 1926 and 1025 for xtivreg, so 
> > just one group less than xtivreg2. Is it perhaps some reporting issue? 
> > I really don't understand this. Just so I'm clear, the large 
> > difference, especially in the coefficient of the instrument, is in the 
> > first stage, while the second stages are very similar.
> 
> This is curious.  I am just about to travel but I will look into it.
> 
> One thing that comes to mind is that xtivreg2 with FDs may be 
> reporting N=the entire sample including the singletons (group 
> size=1) that drop out.
> 
> Perhaps try running -xtivreg,fd- and then -xtivreg2,fd if 
> e(sample)- so that they use the same sample.  Do you get the 
> same results?
> 
> > 3)
> > 
> > >This is confusing.  Do you mean that w_it and q_it are correlated with
> > >x_it?  That's not a problem.  The key requirement is that in
> > >
> > >y_it=b_0+b_1 x_it + b_k demographic_covariates + w_it + q_it + e_it ,
> > >
> > >w_it and q_it should be uncorrelated with e_it. 
> > 
> > Yes, w_it and q_it are correlated with x_it and with e_it.
> 
> To repeat, correlation with x_it is not a problem, but 
> correlation with e_it is.
> 
> > >If they are correlated,
> 
> with e_it (sorry)
> 
> > >then you have two options: (1) Add w and q to
> > >your list of endogenous variables.  But as you say, you will need 
> > >instruments for them.  And if you aren't interested in a causal 
> > >interpretation, then maybe you shouldn't bother.  (2) Instead of using
> > >w and q as regressors and instrumenting them, insert the instruments
> > >as (exogenous) regressors.
> > 
> > I am not interested in instrumenting, just conditioning, but my 
> > problem is that once I add them the previously very high F-stat in the 
> > first stage goes down to the point of indicating weak instruments.
> 
> It sounds like that's because the instrumenting of either w 
> or q is weak.  But that's what the Angrist-Pischke F-stats 
> are for.  If the AP F-stat for the regressor of interest is 
> respectable, then you're OK.
> 
> > That is basically my concern. 
> > Concerning your second piece of advice, do you mean that I should just 
> > drop w and q from the model and replace them with instruments?
> 
> Yes, exactly.  You'll be estimating a semi-reduced form.
> 
> --Mark
>  
> > Thanks again for your help, it is very much appreciated.
> > 
> > Best,
> > 
> > Filippos
> > 
> > >>> "Schaffer, Mark E" <M.E.Schaffer@hw.ac.uk> 06/11/12 7:02 AM >>>
> > Filippos,
> > 
> > You need to tell us more - what versions of software you are using, 
> > what the actual output is (or the relevant pieces of the output), etc.
> > 
> > More comments below.
> > 
> > > -----Original Message-----
> > > From: owner-statalist@hsphsun2.harvard.edu
> > > [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Filippos 
> > > Petroulakis
> > > Sent: Sunday, June 10, 2012 11:59 PM
> > > To: statalist@hsphsun2.harvard.edu
> > > Subject: st: First stage of panel IV
> > > 
> > > Hi all,
> > > 
> > > I am running a panel first differences (fixed effects) model. 
> > > My regression is of the sort
> > > 
> > > y_it=b_0+b_1 x_it + b_k demographic_covariates + e_it
> > > 
> > > x_it is endogenous so I have an instrument z_it.
> > > 
> > > I essentially have 3 issues and I list them in descending order of 
> > > importance.
> > > 
> > > 1) xtivreg2 is running the first stage of x_it on z_it and the 
> > > exogenous demographic covariates as OLS instead of fixed effects or 
> > > first differences.
> > 
> > I doubt it very much (having programmed -xtivreg2-, I think I'm well 
> > placed to say this!).  -xtivreg2- follows the standard procedure of 
> > transforming the full set of variables used in the same way, i.e., the 
> > within or between transformation is applied to all variables.
> > 
> > > I honestly do not know whether
> > > this is due to theory but it seems to be very odd, especially given 
> > > that fixed effects is definitely the correct specification for the 
> > > model as a whole, and so I would think it has to be the case for the
> > > first stage as well. I can do the 2 stages manually and correct the 
> > > errors using the process outlined here
> > > (http://www.stata.com/support/faqs/stat/ivreg.html) and I presume the
> > > fact that I have a panel doesn't change much, but my issue is 
> > > basically whether this is the correct thing to do.
> > > 
> > > 2) xtivreg and xtivreg2 give me pretty different results, which is due
> > > to the fact that xtivreg drops about half of the observations in the
> > > first stage. I checked and the variable that is causing the dropping
> > > (for whatever reason) is the dependent variable. I am thus positive 
> > > that xtivreg is the wrong one but am still worried. Does anyone know
> > > why this happens?
> > 
> > Again, I doubt the problem is the one you suspect.  My guess is that 
> > you are using different estimators, e.g., fixed effects with 
> > -xtivreg2- and random effects with -xtivreg-.  But you need to show us 
> > the output.
> > 
> > > 3) Finally, at some point I will need to include a further two 
> > > variables, call them w_it and q_it, which are surely endogenous. I 
> > > don't care about instrumenting for them as I am not interested in a 
> > > causal interpretation, but the problem is that they are also 
> > > endogenous to x_it. So my first stage will be a regression of x_it on 
> > > variables that are endogenous to itself and to y_it. Is that an issue
> > > I should be concerned about?
> > 
> > This is confusing.  Do you mean that w_it and q_it are correlated with 
> > x_it?  That's not a problem.  The key requirement is that in
> > 
> > y_it=b_0+b_1 x_it + b_k demographic_covariates + w_it + q_it + e_it ,
> > 
> > w_it and q_it should be uncorrelated with e_it.  If they are 
> > correlated, then you have two options: (1) Add w and q to your list of 
> > endogenous variables.  But as you say, you will need instruments for 
> > them.  And if you aren't interested in a causal interpretation, then 
> > maybe you shouldn't bother.  (2) Instead of using w and q as 
> > regressors and instrumenting them, insert the instruments as 
> > (exogenous) regressors.
> > 
> > HTH,
> > Mark
> > 
> > > Thank you very much in advance - answers to any or all of those issues
> > > will be immensely appreciated.
> > > 
> > > Best,
> > > 
> > > Filippos Petroulakis
> > > 
> > > *
> > > *   For searches and help try:
> > > *   http://www.stata.com/help.cgi?search
> > > *   http://www.stata.com/support/statalist/faq
> > > *   http://www.ats.ucla.edu/stat/stata/
> > > 
> > 
> > 
> > --
> > Heriot-Watt University is the Sunday Times Scottish University of the 
> > Year 2011-2012
> > 
> > Heriot-Watt University is a Scottish charity registered under charity 
> > number SC000278.



