
# st: bootstrap test, combined bootstrap datasets, statistical properties of the bootstrap

From: Gianluca Cafiso
To: Stata List
Subject: st: bootstrap test, combined bootstrap datasets, statistical properties of the bootstrap
Date: Tue, 04 May 2010 10:43:39 +0200

Dear Statalisters,

I have a question about a bootstrap test I am developing. My question concerns the statistical properties of the test as I have envisaged it, not how to implement it technically in Stata.
```
____________________________________

My statistic of interest is the product of two sub-statistics; it is:
dif_L = dif_TF * dif_GCI

```
where the "dif" prefix denotes a time difference, TF is a mean value, and GCI is an entropy index.
```
I am interested in testing "H0: dif_L > 0" against "H1: dif_L <= 0".

```
The series of data used to compute "dif_TF" has a different size (N2 observations) than the series used for "dif_GCI" (N1 observations).

If the two sub-statistics TF and GCI were computed from two series of the same size, this would be the standard case: I would simply write a programme that computes the overall "dif_L" statistic and bootstrap it as usual. Since this is not the case, I plan to do the following:

1- First bootstrap, for "dif_TF": I compute the "dif_TF" statistic, bootstrap it (R repetitions), and store the dataset produced by the bootstrap. (The bootstrap draws R samples of size N2 but yields a single dataset with R observations of the estimated dif_TF: the 1st dataset.)

2- Second bootstrap, for "dif_GCI": I compute the "dif_GCI" statistic, bootstrap it (R repetitions), and store the dataset produced by the bootstrap. (The bootstrap draws R samples of size N1 but yields a single dataset with R observations of the estimated dif_GCI: the 2nd dataset.)

3- Multiply the two datasets: since each of the R observations in datasets 1 and 2 is an estimate of dif_TF and dif_GCI respectively, multiplying the two datasets observation by observation gives a single dataset with R observations of the combined statistic:
```
dif_Lj = dif_TFj * dif_GCIj, where j = 1, ..., R.

```
4- Apply the percentile bootstrap to the dataset generated in step 3 for the combined statistic "dif_Lj" to test the null hypothesis.
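To make the four steps concrete, here is a minimal sketch of the procedure in Python (used only for illustration; it is not Stata code). Everything in it is an assumption rather than part of my actual setup: the simulated data, the sample sizes, and the two placeholder statistics `mean_change` and `entropy_change`, which merely stand in for dif_TF and dif_GCI. Note that the two series are resampled independently, so the pairing of replicates by the index j is arbitrary.

```python
import math
import random
import statistics


def bootstrap_replicates(data, stat, reps, rng):
    """Return `reps` bootstrap estimates of stat(data), resampling rows with replacement."""
    n = len(data)
    return [stat([data[rng.randrange(n)] for _ in range(n)]) for _ in range(reps)]


def percentile_interval(values, alpha=0.05):
    """Percentile interval read off the sorted replicates (simple rank rule)."""
    ordered = sorted(values)
    last = len(ordered) - 1
    return ordered[int((alpha / 2) * last)], ordered[int((1 - alpha / 2) * last)]


def mean_change(pairs):
    """Hypothetical stand-in for dif_TF: change in the mean between two periods."""
    before = [a for a, _ in pairs]
    after = [b for _, b in pairs]
    return statistics.fmean(after) - statistics.fmean(before)


def entropy(values):
    """Shannon entropy of the shares implied by strictly positive values."""
    total = sum(values)
    return -sum((v / total) * math.log(v / total) for v in values)


def entropy_change(pairs):
    """Hypothetical stand-in for dif_GCI: change in an entropy index between two periods."""
    return entropy([b for _, b in pairs]) - entropy([a for a, _ in pairs])


rng = random.Random(12345)  # fixed seed so the sketch is reproducible
R = 1000                    # number of bootstrap repetitions

# Simulated (before, after) observations; the sizes N2 = 80 and N1 = 50 are arbitrary.
tf_data = [(rng.gauss(10, 2), rng.gauss(11, 2)) for _ in range(80)]
gci_data = [(rng.uniform(1, 10), rng.uniform(1, 10)) for _ in range(50)]

# Steps 1 and 2: bootstrap each sub-statistic separately, keeping the R replicates.
dif_tf_reps = bootstrap_replicates(tf_data, mean_change, R, rng)
dif_gci_reps = bootstrap_replicates(gci_data, entropy_change, R, rng)

# Step 3: multiply the two replicate datasets observation by observation.
dif_l_reps = [a * b for a, b in zip(dif_tf_reps, dif_gci_reps)]

# Step 4: percentile summary of the combined replicates.
low, high = percentile_interval(dif_l_reps)
share_nonpositive = sum(v <= 0 for v in dif_l_reps) / R
print(f"95% percentile interval for dif_L: [{low:.4f}, {high:.4f}]")
print(f"share of replicates with dif_L <= 0: {share_nonpositive:.3f}")
```

In Stata terms, the lists `dif_tf_reps` and `dif_gci_reps` correspond to the two stored bootstrap datasets, and `dif_l_reps` to the dataset obtained at step 3.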
```
My doubt is the following:
```
Is a test based on this single combined dataset (generated in step 3) still valid? Or do the usual distributional properties on which bootstrap-based tests rely fail to hold for a dataset generated in this way?
```

Any help, suggestion, or reference on this would be very welcome.

Many thanks. Gianluca Cafiso

___________________________________________________________
Dr. Gianluca Cafiso
Research Fellow, Economics Department-University of Catania.
Corso Italia 55, Catania, Italy.

e-mail: gcafiso@unict.it
tel.:  +39 0957537745

```