Re: st: bootstrap test, combined bootstrap datasets, statistical properties of the bootstrap

From	Stas Kolenikov <[email protected]>
To	[email protected]
Subject	Re: st: bootstrap test, combined bootstrap datasets, statistical properties of the bootstrap
Date	Tue, 4 May 2010 09:11:16 -0500

On Tue, May 4, 2010 at 3:43 AM, Gianluca Cafiso <[email protected]> wrote:
> My doubt is the following:
> Is the test based on the unique dataset (as generated at point 3) still
> valid? Or, for the so-generated dataset, do the usual distributional
> properties -on which Bootstrap-based tests are based- not hold?

Which properties exactly are you worried about?

I am sure one can construct a valid bootstrap scheme for your
situation. I would be more worried about other issues, however.

1. It looks like you are dealing with time series. If that's the case,
you are dealing with dependent data, and you have to use some sort of
blocking scheme.

2. If these are indeed time series, and the two series overlap, then
you should ask yourself, "Do I want to resample the same periods in
both series?" If the series are cross-correlated, you'd have to do
that.

3. Using the bootstrap for hypothesis testing is quite complicated. To
get the bootstrap distribution of the test statistic, you have to
sample from a distribution in which the null is satisfied. Your setup
is very odd in this respect: you don't have a point null, and you
don't even have a null that contains the boundary point (as the
worst case null), so I cannot even conceptually think of a test that
could work in your case. If you really have an open interval for your
null, then asymptotically all probabilities are either 0 or 1, so
you cannot construct an asymptotic test that would have an
intermediate size like 5%.
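
To make points 1 and 2 concrete, here is a minimal sketch of a
moving-block bootstrap that reuses the same block positions for both
series, so resampled periods stay paired. It is written in Python
rather than Stata purely for illustration; the function name
paired_block_bootstrap, the block length of 4, and the toy series are
all assumptions, not anything from your setup, and choosing the block
length for real data is a separate problem.

```python
import random

def paired_block_bootstrap(x, y, block_len, rng):
    """Return one resample of (x, y) built from overlapping blocks.

    The same block start indices are used for both series, so each
    resampled period keeps its original x-y pairing.
    """
    n = len(x)
    if len(y) != n:
        raise ValueError("series must cover the same periods")
    if not 1 <= block_len <= n:
        raise ValueError("block_len must be between 1 and len(x)")
    n_starts = n - block_len + 1          # admissible block starts
    idx = []
    while len(idx) < n:
        start = rng.randrange(n_starts)   # draw one block start
        idx.extend(range(start, start + block_len))
    idx = idx[:n]                         # trim to the original length
    return [x[i] for i in idx], [y[i] for i in idx]

rng = random.Random(12345)                # arbitrary seed (assumption)
x = [float(t) for t in range(20)]         # toy series indexed by period
y = [xt + 100.0 for xt in x]              # toy series tied to x by period
xb, yb = paired_block_bootstrap(x, y, block_len=4, rng=rng)
```

Within each block the original time ordering is kept, which is what
preserves the short-range dependence; resampling the two series with
independent block draws would destroy any cross-correlation.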

In other situations, there are ways to get the bootstrap p-value of
some reasonable tests. But to get there, you need to transform your
data so that the null is satisfied. In testing the mean, you'd need to
shift your data; in multivariate analysis, you need to rotate them; in
time series, you probably need to filter them somehow (even though
that might kill the very effect you are trying to test). If you do the
bootstrap with the raw data, you can only get a confidence interval of
a kind for the observed value of your statistic, but that'll be it.
You cannot get the null distribution of the test statistic or a
p-value unless you resample under the null.
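
For the simplest case mentioned above, testing a mean, the shift can
be sketched as follows. This is an illustration under assumptions, not
a recipe for your data: it treats the observations as independent (so
it is not appropriate for time series as is), it uses the
unstudentized difference between the mean and mu0 for brevity, and the
function name bootstrap_mean_pvalue and the toy data are made up.
Again it is Python rather than Stata.

```python
import random
import statistics

def bootstrap_mean_pvalue(data, mu0, reps, rng):
    """Two-sided bootstrap p-value for H0: mean == mu0.

    The data are shifted so that their sample mean equals mu0, and
    resamples are drawn from the shifted data, i.e. under the null.
    """
    n = len(data)
    xbar = statistics.fmean(data)
    d_obs = abs(xbar - mu0)                    # observed statistic
    shifted = [v - xbar + mu0 for v in data]   # impose the null
    extreme = 0
    for _ in range(reps):
        sample = rng.choices(shifted, k=n)     # draw with replacement
        if abs(statistics.fmean(sample) - mu0) >= d_obs:
            extreme += 1
    # add-one correction keeps the p-value strictly positive
    return (extreme + 1) / (reps + 1)

rng = random.Random(2010)                      # arbitrary seed (assumption)
data = [0.1 * i for i in range(-10, 11)]       # toy data centered at 0
p_null = bootstrap_mean_pvalue(data, mu0=0.0, reps=999, rng=rng)
p_far = bootstrap_mean_pvalue(data, mu0=5.0, reps=999, rng=rng)
```

Resampling the raw, unshifted data instead would center the bootstrap
distribution at the observed mean rather than at mu0, which gives a
confidence interval of a kind but not the null distribution.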

-- 
Stas Kolenikov, also found at http://stas.kolenikov.name
Small print: I use this email account for mailing lists only.
