
From: Stas Kolenikov <skolenik@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: bootstrapping and time series
Date: Fri, 8 Oct 2004 15:05:34 -0400

On Fri, 8 Oct 2004 18:50:31 +0100, Nick Cox <n.j.cox@durham.ac.uk> wrote:
> I'm sure Stas (and Jeff Pitblado) are right here.
>
> However, do note that there is literature on special
> bootstrapping methods for time series (see e.g. Politis _Statistical
> Science_ 2003). The point is that Stata's
> -bootstrap- implements none of these methods.
>
> How far such methods extend to panel data I
> do not know.

Well, if you can assume that your panels (i.e., individuals observed over time) are independent, then the panel is an appropriate unit to resample, and that can be handled by Stata's -bsample-. If -id- is the panel ID and -year- is your time variable, so that your data set is

    tsset id year

then you can resample your data by

    bsample ... , ...cluster(id) newcluster(newid)

and then set it up with

    tsset newid year

within your estimation routine, i.e. before -ivreg- or whatever you want to do with it. Note that this necessarily involves programming, so that your program has at least two lines:

    tsset newid year
    ivreg2 ....

Then, with the estimation results still in memory, you can still use _b[whatever] in the -bs-'s -exp_list-.

Additional statistical inconveniences arise, however. It is known that the bootstrap distribution converges faster for pivotal statistics (i.e. those that converge to a fully known distribution). The distinction here is between saving the coefficients _b[whatever] and the t-statistics _b[whatever]/_se[whatever]. The latter will converge to N(0,1), while the former will converge to N(true beta, sampling variance of beta-hat).

Wait a second. The t-statistic will converge to that nice N(0,1) only if there is no effect of the -whatever- variable. You need to resample under the null hypothesis, so we need to sample from the distribution that has all our explanatory variables as they are, and the dependent variable equal to 0+0*x1+0*x2+error, where the error follows exactly the same distribution it has in our data, which is not observed... and so on, and so on.

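For readers who want to see the mechanics of the panel resampling step outside Stata, here is a minimal sketch in Python (standard library only). It is not Stata code and not what -bsample- does internally; the data, the function names resample_panels and slope, and the pooled OLS slope used as a stand-in estimator are all made up for illustration. The point it shows is why a fresh panel identifier is needed: when the same panel is drawn twice, the original (id, year) pairs repeat, while (newid, year) stays unique.

```python
import random
from collections import defaultdict


def resample_panels(rows, rng):
    """Draw whole panels with replacement and label each draw with a new id."""
    panels = defaultdict(list)
    for row in rows:
        panels[row["id"]].append(row)
    ids = sorted(panels)
    # One draw per original panel, with replacement.
    drawn = [rng.choice(ids) for _ in ids]
    out = []
    for newid, old_id in enumerate(drawn, start=1):
        for row in panels[old_id]:
            # Keep the original id, add a unique newid for this draw.
            out.append({**row, "newid": newid})
    return out


def slope(rows):
    """Pooled OLS slope of y on x (hypothetical stand-in for the real estimator)."""
    n = len(rows)
    mean_x = sum(r["x"] for r in rows) / n
    mean_y = sum(r["y"] for r in rows) / n
    sxx = sum((r["x"] - mean_x) ** 2 for r in rows)
    sxy = sum((r["x"] - mean_x) * (r["y"] - mean_y) for r in rows)
    return sxy / sxx


# Toy balanced panel: 6 individuals observed over 4 years (invented data).
rng = random.Random(2004)
data = [
    {"id": i, "year": 2000 + t, "x": float(t + i), "y": 2.0 * (t + i) + rng.gauss(0, 1)}
    for i in range(1, 7)
    for t in range(4)
]

# Bootstrap replicates of the coefficient, resampling panels rather than rows.
boot = [slope(resample_panels(data, rng)) for _ in range(200)]
```

Each replicate keeps every panel's time series intact, so the within-panel dependence is preserved; only the independence across panels is assumed.
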
To sum up again: doing the bootstrap properly involves quite a few more assumptions than one usually seems to think it does, and I have not yet gotten into the discussion of whether the bootstrap will give you a reasonable estimate of the variance / distribution (which involves yet another layer of highly technical asymptotic results that still require some regularity and mixing assumptions).

My personal take on the bootstrap is to use it ONLY when
(i) you know the standard errors provided by Stata are totally wrong, and there is no way to correct them analytically (and for two-stage econometric applications that people often complain they cannot get standard errors for, Murphy-Topel standard errors should work; I hope Mark Schaffer can correct me if that's not so);
(ii) you totally know what you are doing with the bootstrap and what assumptions are implied by the bootstrap procedure.

(Think of installing yet another piece of software and clicking the "I agree" button without reading the small print. In the second screen that you have not read, they have: "You are entitled to use this software for 30 days for evaluation purposes. If you have not purchased the full license by the 31st day, your hard drive will be formatted".)

Please don't get me wrong: the bootstrap is a very powerful technique, but as with all powerful techniques, you need to know how to use it. I know enough to warn against using it when I see reasons for it to break down, such as dependent / heterogeneous data.

--
Stas Kolenikov
http://stas.kolenikov.name

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/

**References**: **RE: st: bootstrapping and time series**, *From:* "Nick Cox" <n.j.cox@durham.ac.uk>

