Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


RE: st: First stage F stats - xtivreg


From   "Schaffer, Mark E" <[email protected]>
To   <[email protected]>
Subject   RE: st: First stage F stats - xtivreg
Date   Tue, 21 Jun 2011 19:49:58 +0100

Agnese, Austin,

Am I missing something here?  Using abdata.dta,

webuse abdata

xtivreg2 n w k, fe cluster(id year)

seems to work fine.  The panel identifier is id; each panel of course spans several years, and year is one of the cluster variables.
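To see how much the choice of clustering matters on the same data, the three variants discussed in this thread (by unit, by time, and two-way) can be run side by side. This is only a sketch on abdata, not a recommendation for any particular dataset; it assumes -xtivreg2-, -ivreg2- and -ranktest- are installed and current, and the stored-estimate names (cl_id, cl_year, cl_two) are made up for illustration.

```stata
* Sketch only: compare SEs under three clustering choices on abdata.
* Assumes xtivreg2, ivreg2 and ranktest are installed and up to date.
webuse abdata, clear

* One-way clustering by panel unit
xtivreg2 n w k, fe cluster(id)
estimates store cl_id

* One-way clustering by year (only a few year clusters in this dataset)
xtivreg2 n w k, fe cluster(year)
estimates store cl_year

* Two-way clustering by unit and year
xtivreg2 n w k, fe cluster(id year)
estimates store cl_two

* Coefficients are identical across columns; only the SEs differ
estimates table cl_id cl_year cl_two, b(%9.4f) se(%9.4f)
```

With an instrumented regressor in the model, the first-stage and weak-identification statistics reported by each run could be compared in the same way.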

Do you have the latest -xtivreg2-?  It should be

. which xtivreg2, all

c:\ado\personal\xtivreg2.ado
*! xtivreg2 1.0.12 17June2010
*! author mes
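
If -which- reports an older version than this, note that -update all- refreshes only official Stata; user-written commands such as -xtivreg2- have to be updated separately. A minimal sketch, assuming the packages were originally installed from SSC (if they came from somewhere else, reinstall from that source instead):

```stata
* Assumption: xtivreg2 and the commands it relies on were installed from SSC
ssc install ranktest, replace
ssc install ivreg2, replace
ssc install xtivreg2, replace

* Confirm which copy Stata now finds first on the adopath
which xtivreg2, all
```

If several copies are listed, the first one is the copy Stata actually runs, so an outdated file in a personal ado directory can shadow a newer install.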

--Mark

> -----Original Message-----
> From: [email protected] 
> [mailto:[email protected]] On Behalf Of 
> Austin Nichols
> Sent: 21 June 2011 18:42
> To: [email protected]
> Subject: Re: st: First stage F stats - xtivreg
> 
> Agnese Romiti <[email protected]>:
> You are right about -xtivreg2- refusing to participate, so 
> you could simply include dummies for every fixed effect in 
> -ivreg2-, e.g.
> 
> webuse nhanes2, clear
> xtivreg2 hlthstat (iron=lead), fe i(houssiz) cluster(houssiz)
> xtivreg2 hlthstat (iron=lead), fe i(houssiz) cluster(location sampl)
> qui ta houssiz, gen(d_)
> ivreg2 hlthstat (iron=lead) d_*, fwl(d_*) cluster(location sampl)
> 
> Or you could cluster by initial region instead, e.g.
> 
> bys i (t): g initregion=region[1]
> 
> which involves different assumptions, but will also give you 
> evidence of how the data seem to be clustered.
> 
> On Tue, Jun 21, 2011 at 1:06 PM, Agnese Romiti 
> <[email protected]> wrote:
> > Dear Austin,
> >
> > When I used region-year, or region alone, as the cluster unit, I had to
> > run ivreg2 on data that I had previously transformed into deviations
> > from the mean (within transformation), because xtivreg2 requires that
> > no panel overlaps more than one cluster. So panels should be uniquely
> > assigned to clusters.
> > I tried instead to run xtivreg2 with two clusters as you suggested,
> > but I received the error message
> > "cluster():  too many variables specified",
> > apparently because I don't have the latest version of the commands.
> > I have just done an update all, and my Stata seems to be updated to
> > 30 March 2011 (exe and ado) and to 1 Sept 2010 (the utilities).
> > Is there a reason why I still get the error?
> >
> > Thanks
> > Agnese
> >
> >
> >
> >
> > 2011/6/21 Austin Nichols <[email protected]>:
> >> Agnese Romiti <[email protected]>:
> >> I don't see how it matters that individuals move across clusters,
> >> unless you want to cluster by individual as well, and -xtivreg2-
> >> allows two dimensions of clustering. When you cluster by region-year,
> >> you assume that a draw from the dgp of person i in year t is
> >> independent from a draw from the dgp of person i in year t+1, which
> >> is clearly problematic.  You should try clustering by individual, by
> >> region, and then try two dimensions of clustering.  Let us know how
> >> the first stage diagnostic statistics and SEs on main variables of
> >> interest, in each of those 3 cases, compare to your
> >> region-year-clustered version.
> >>
> >> On Tue, Jun 21, 2011 at 10:47 AM, Agnese Romiti 
> <[email protected]> wrote:
> >>> Austin,
> >>>
> >>> The reason I chose region-year as the cluster unit is that
> >>> individuals (around 8 percent of them) move across regions over
> >>> time, so the region was not unique for them.
> >>>
> >>> Many thanks again for your help and the ref.
> >>> Agnese
> >>>
> >>> 2011/6/21 Austin Nichols <[email protected]>:
> >>>> Agnese Romiti <[email protected]>:
> >>>> In that case the cluster-robust SE will be biased downward slightly,
> >>>> resulting in overrejection and your first-stage F stat overstated,
> >>>> but I expect it will still outperform the SE and F clustering by
> >>>> region-year.  You would have to do simulations matching your exact
> >>>> setup to be sure; see e.g.
> >>>> http://www.stata.com/meeting/13uk/nichols_crse.pdf
> >>>>
> >>>> On Tue, Jun 21, 2011 at 3:27 AM, Agnese Romiti 
> <[email protected]> wrote:
> >>>>> Hi,
> >>>>> Thanks again
> >>>>> In my data I have 19 regions, and around 18 percent of the data
> >>>>> is in the largest region.
> >>>>>
> >>>>> Agnese
> >>>>>
> >>>>>
> >>>>> 2011/6/21 Austin Nichols <[email protected]>:
> >>>>>> Agnese Romiti <[email protected]>:
> >>>>>> No, you should cluster by region to correctly account for
> >>>>>> possible serial correlation, assuming you have sufficiently many
> >>>>>> regions in your data; how many are there?
> >>>>>> What percent of the data is in the largest region?
> >>>>>>
> >>>>>> On Mon, Jun 20, 2011 at 5:19 PM, Agnese Romiti 
> <[email protected]> wrote:
> >>>>>>> Many thanks Austin,
> >>>>>>>
> >>>>>>> I'm actually clustering the standard errors at the region-year
> >>>>>>> level rather than at the region level because I have one
> >>>>>>> regressor that varies at the region-year level. Is that correct?
> >>>>>>> Do you think that the high first-stage F statistic might signal
> >>>>>>> a bad instrument, such as a failure of the exogeneity
> >>>>>>> requirement?
> >>>>>>>
> >>>>>>> Agnese
> >>>>>>>
> >>>>>>>
> >>>>>>> 2011/6/20 Austin Nichols <[email protected]>:
> >>>>>>>> Agnese Romiti <[email protected]>:
> >>>>>>>> Are you clustering by region to account for the likely 
> >>>>>>>> correlation of errors within region?
> >>>>>>>> Also see
> >>>>>>>> http://www.stata.com/meeting/boston10/boston10_nichols.pdf
> >>>>>>>> for an alternative model that allows your dep var to be
> >>>>>>>> nonnegative.
> >>>>>>>>
> >>>>>>>> On Mon, Jun 20, 2011 at 3:49 AM, Agnese Romiti 
> <[email protected]> wrote:
> >>>>>>>>> Dear Statalist users,
> >>>>>>>>>
> >>>>>>>>> I'm running a fixed-effects model with IV (xtivreg2). My
> >>>>>>>>> dependent variable is a measure of labor supply at the
> >>>>>>>>> individual level (working hours), whereas my endogenous
> >>>>>>>>> variable varies only at the region-year level.
> >>>>>>>>> My question is about the first-stage statistics: the weak
> >>>>>>>>> identification test gives an extremely high F statistic
> >>>>>>>>> (F=3289), which makes me worry that something is wrong.
> >>>>>>>>> Do you have any clue about what might be driving this odd
> >>>>>>>>> result?
> >>>>>>>>>
> >>>>>>>>> Many thanks in advance for your help.
> >>>>>>>>>
> >>>>>>>>> Agnese
> 
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
> 


-- 
Heriot-Watt University is a Scottish charity
registered under charity number SC000278.

