Austin Nichols <austinnichols@gmail.com>

statalist@hsphsun2.harvard.edu

Re: st: First stage F stats - xtivreg

Tue, 21 Jun 2011 10:56:54 -0400

Agnese Romiti <romitiagnese@gmail.com>:
I don't see how it matters that individuals move across clusters,
unless you want to cluster by individual as well, and -xtivreg2-
allows two dimensions of clustering. When you cluster by region-year,
you assume that a draw from the dgp of person i in year t is
independent from a draw from the dgp of person i in year t+1, which is
clearly problematic. You should try clustering by individual, by
region, and then try two dimensions of clustering. Let us know how
the first stage diagnostic statistics and SEs on main variables of
interest, in each of those 3 cases, compare to your
region-year-clustered version.

On Tue, Jun 21, 2011 at 10:47 AM, Agnese Romiti <romitiagnese@gmail.com> wrote:
> Austin,
>
> The reason whereby I have chosen the region-year as cluster unit was
> due to the fact that individuals - around 8 percent of them - move
> across regions over time, so the region was not unique for them.
>
> Many thanks again for your help and the ref.
> Agnese
>
> 2011/6/21 Austin Nichols <austinnichols@gmail.com>:
>> Agnese Romiti <romitiagnese@gmail.com>
>> In that case the cluster-robust SE will be biased downward slightly,
>> resulting in overrejection and your first-stage F stat overstated, but
>> I expect it will still outperform the SE and F clustering by
>> region-year. You would have to do simulations matching your exact
>> setup to be sure; see e.g.
>> http://www.stata.com/meeting/13uk/nichols_crse.pdf
>>
>> On Tue, Jun 21, 2011 at 3:27 AM, Agnese Romiti <romitiagnese@gmail.com> wrote:
>>> Hi,
>>> Thanks again
>>> In my data I have 19 regions, and around 18 percent of the data in the
>>> largest region.
>>>
>>> Agnese
>>>
>>>
>>> 2011/6/21 Austin Nichols <austinnichols@gmail.com>:
>>>> Agnese Romiti <romitiagnese@gmail.com>:
>>>> No, you should cluster by region to correctly account for possible
>>>> serial correlation,
>>>> assuming you have sufficiently many regions in your data; how many are there?
>>>> What percent of the data is in the largest region?
>>>>
>>>> On Mon, Jun 20, 2011 at 5:19 PM, Agnese Romiti <romitiagnese@gmail.com> wrote:
>>>>> Many thanks Austin,
>>>>>
>>>>> I'm actually clustering the standard errors at region-year level
>>>>> rather than at region because I have one regressor with variability at
>>>>> region-year level. Is that correct?
>>>>> Do you think that the high first stage F stats might be a signal of a
>>>>> bad instrument? Like a failure of the exogeneity requirement?
>>>>>
>>>>> Agnese
>>>>>
>>>>>
>>>>> 2011/6/20 Austin Nichols <austinnichols@gmail.com>:
>>>>>> Agnese Romiti <romitiagnese@gmail.com>:
>>>>>> Are you clustering by region to account for the likely correlation of
>>>>>> errors within region?
>>>>>> Also see
>>>>>> http://www.stata.com/meeting/boston10/boston10_nichols.pdf
>>>>>> for an alternative model that allows your dep var to be nonnegative.
>>>>>>
>>>>>> On Mon, Jun 20, 2011 at 3:49 AM, Agnese Romiti <romitiagnese@gmail.com> wrote:
>>>>>>> Dear Statalist users,
>>>>>>>
>>>>>>> I'm running a fixed effect model with IV (xtivreg2) , my dependent
>>>>>>> variable is a measure of labor supply at the individual level (working
>>>>>>> hours). Whereas I have an endogenous variable with variation only at
>>>>>>> regional-year level.
>>>>>>> My question is about the First stage statistics, the Weak
>>>>>>> identification test results in an F statistics extremely high which
>>>>>>> makes me worry about something wrong, i.e. F=3289.
>>>>>>> Do you have any clue about potential reasons driving this odd result?
>>>>>>>
>>>>>>> Many thanks in advance for your help.
>>>>>>>
>>>>>>> Agnese

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
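
The point made at the top of this message, that clustering by region-year treats draws for the same region in different years as independent, can be illustrated with a small simulation. What follows is only a rough sketch in Python (standard library only), not the -xtivreg2- model from the thread: it fits a pooled OLS slope rather than a fixed-effects IV, it applies no small-sample correction, and the data-generating process is an assumption chosen for illustration. The function names `simulate` and `cluster_se` and every parameter value except the 19 regions are hypothetical.

```python
import random
from collections import defaultdict


def simulate(n_regions=19, n_years=10, n_per_cell=20, beta=1.0, seed=0):
    """Return rows (region, year, x, y) from an assumed, illustrative dgp."""
    rng = random.Random(seed)
    rows = []
    for r in range(n_regions):
        x_region = rng.gauss(0, 1)  # persistent regional part of the regressor
        e_region = rng.gauss(0, 1)  # persistent regional part of the error
        for t in range(n_years):
            # The regressor varies only at the region-year level.
            x = x_region + rng.gauss(0, 0.5)
            for _ in range(n_per_cell):
                e = e_region + rng.gauss(0, 1)  # serially correlated within region
                rows.append((r, t, x, beta * x + e))
    return rows


def cluster_se(rows, key):
    """OLS slope of y on x (with intercept) and its cluster-robust SE.

    `key` maps a row to its cluster label. No small-sample correction.
    """
    n = len(rows)
    xbar = sum(row[2] for row in rows) / n
    ybar = sum(row[3] for row in rows) / n
    sxx = sum((row[2] - xbar) ** 2 for row in rows)
    sxy = sum((row[2] - xbar) * (row[3] - ybar) for row in rows)
    b = sxy / sxx
    # Sum the scores x_i * e_i within each cluster, then square and add up.
    scores = defaultdict(float)
    for row in rows:
        xd = row[2] - xbar
        resid = (row[3] - ybar) - b * xd
        scores[key(row)] += xd * resid
    var = sum(s * s for s in scores.values()) / sxx ** 2
    return b, var ** 0.5


if __name__ == "__main__":
    data = simulate()
    b, se_region_year = cluster_se(data, lambda row: (row[0], row[1]))
    _, se_region = cluster_se(data, lambda row: row[0])
    print("slope:", b)
    print("SE clustered by region-year:", se_region_year)
    print("SE clustered by region:", se_region)
```

Under this assumed dgp the region-clustered SE comes out larger than the region-year-clustered one, because the regional error component persists across years. A simulation matching the actual panel (individuals moving across regions, the real first stage) would be needed before drawing conclusions about the reported F statistic.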

