
# Re: st: bootstrap test, combined bootstrap datasets, statistical properties of the bootstrap

From: Stas Kolenikov
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: bootstrap test, combined bootstrap datasets, statistical properties of the bootstrap
Date: Tue, 4 May 2010 09:11:16 -0500

```
On Tue, May 4, 2010 at 3:43 AM, Gianluca Cafiso <gcafiso@unict.it> wrote:
> My doubt is the following:
> Is the test based on the unique dataset (as generated at point 3) still
> valid? Or, for the so-generated dataset, do the usual distributional
> properties -on which Bootstrap-based tests are based- not hold?

Which properties exactly are you worried about?

I am sure one can construct a valid bootstrap scheme for your
situation. I would be more worried about other issues, however.

1. It looks like you are dealing with time series. If that's the case,
you are dealing with dependent data, and you have to use some sort of
blocking scheme, such as a block bootstrap.

2. If these are indeed time series, and the two series overlap, then
you should ask yourself, "Do I want to try and resample the same
periods in both series?" If the series are cross-correlated, you'd
have to do that.

3. Using the bootstrap for hypothesis testing is quite complicated. To
get the bootstrap distribution of the test statistic, you have to
sample from a distribution in which the null is satisfied. Your setup
is very odd in this respect: you don't have a point null, and you
don't even have a null that contains the boundary point (as the
worst-case null), so I cannot even conceptually think of a test that
could work in your case. If you really have an open interval for your
null, then asymptotically all probabilities are either 0 or 1, so
one cannot construct an asymptotic test that would have an
intermediate size like 5%.

In other situations, there are ways to get the bootstrap p-value of
some reasonable tests. But to get there, you need to transform your
data so that the null is satisfied. In testing the mean, you'd need to
shift your data; in multivariate analysis, you need to rotate them; in
time series, you probably need to filter them somehow (even though
that might kill the very effect you are trying to test). If you do the
bootstrap with the raw data, you can only get a confidence interval of
a kind for the observed value of your statistic, but that'll be it.
You cannot get the null distribution of the test statistic or a
p-value unless you resample under the null.

--
Stas Kolenikov, also found at http://stas.kolenikov.name
Small print: I use this email account for mailing lists only.
```
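Two of the ideas in the reply can be illustrated with a rough sketch: resampling the same time blocks in both series, and shifting the data so that the null holds before resampling. This is not code from the thread. The function names `paired_block_resample` and `bootstrap_pvalue_mean`, the moving-block scheme, and the unstudentized mean statistic are all assumptions chosen for illustration. The p-value function also assumes i.i.d. data, so for time series it would have to be combined with a blocking scheme.

```python
import random
import statistics


def paired_block_resample(x, y, block_len, rng):
    """Moving-block bootstrap drawing the SAME time blocks from both series.

    Hypothetical helper: sharing the block start indices keeps any
    cross-correlation between x and y intact within each block.
    """
    if len(x) != len(y):
        raise ValueError("series must cover the same periods")
    n = len(x)
    if not 1 <= block_len <= n:
        raise ValueError("block_len must be between 1 and len(x)")
    n_starts = n - block_len + 1          # admissible block start positions
    idx = []
    while len(idx) < n:
        start = rng.randrange(n_starts)   # one start index shared by x and y
        idx.extend(range(start, start + block_len))
    idx = idx[:n]                         # trim the last block to length n
    return [x[i] for i in idx], [y[i] for i in idx]


def bootstrap_pvalue_mean(data, mu0, n_boot, rng):
    """Two-sided bootstrap p-value for H0: mean == mu0 (i.i.d. data assumed)."""
    n = len(data)
    sample_mean = statistics.fmean(data)
    observed = abs(sample_mean - mu0)
    # Shift the data so that the null holds exactly in the resampled population.
    null_data = [v + (mu0 - sample_mean) for v in data]
    exceed = 0
    for _ in range(n_boot):
        sample = [null_data[rng.randrange(n)] for _ in range(n)]
        if abs(statistics.fmean(sample) - mu0) >= observed:
            exceed += 1
    return (exceed + 1) / (n_boot + 1)    # add-one correction keeps p > 0
```

Resampling `data` directly, without the shift, would centre the bootstrap distribution on the observed mean rather than on `mu0`. That gives a confidence interval of a kind but not a null distribution, which is the distinction drawn in the reply.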