



st: bootstrap test, combined bootstrap datasets, statistical properties of the bootstrap


From   Gianluca Cafiso <gcafiso@unict.it>
To   Stata List <statalist@hsphsun2.harvard.edu>
Subject   st: bootstrap test, combined bootstrap datasets, statistical properties of the bootstrap
Date   Tue, 04 May 2010 10:43:39 +0200

Dear Statalisters,

I have a question about a bootstrap test I am developing. My question concerns the statistical properties of the bootstrap test as I have designed it, not how to implement it in Stata.

____________________________________

My statistic of interest is the product of two sub-statistics; it is:
dif_L = dif_TF * dif_GCI

where the "dif" prefix denotes a time difference, TF is a mean value, and GCI an Entropy Index.

I am interested in testing "H0: dif_L > 0" against "H1: dif_L <= 0".

The series of data used to compute "dif_TF" has a different size (N2 observations) than the series used for "dif_GCI" (N1 observations).

If the two sub-statistics TF and GCI were computed from two series of the same size, I would be in the standard case: I would simply write a programme to compute the overall statistic "dif_L" and bootstrap it as usual. Since this is not the case, I have thought of doing the following:

1- First bootstrap, for "dif_TF": I compute the "dif_TF" statistic, bootstrap it (R repetitions) and store the dataset generated by the bootstrap (the bootstrap draws R samples of size N2, but yields a single dataset with R observations of the estimated dif_TF; this is the 1st dataset).

2- Second bootstrap, for "dif_GCI": I compute the "dif_GCI" statistic, bootstrap it (R repetitions) and store the dataset generated by the bootstrap (the bootstrap draws R samples of size N1, but yields a single dataset with R observations of the estimated dif_GCI; this is the 2nd dataset).

3- Multiply the two datasets: since each of the R observations in datasets 1 and 2 is an estimate of dif_TF and dif_GCI respectively, multiplying the two datasets observation by observation gives a single dataset with R observations of the combined statistic:

dif_Lj = dif_TFj * dif_GCIj  where j = 1, ..., R.

4- Apply the percentile bootstrap to the dataset generated at point 3 for the combined statistic "dif_Lj" to test the null hypothesis.
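
For concreteness, the four steps above can be sketched roughly as follows. This is only an illustrative sketch in Python, not the Stata implementation: the data are simulated, the layout (each observation holding a value for two periods) is an assumption, and dif_mean and dif_entropy are hypothetical stand-ins for the actual dif_TF and dif_GCI computations. Note that drawing the two sets of resamples separately treats the two series as independent of each other.

```python
import math
import random


def bootstrap_replicates(rows, statistic, reps, rng):
    """Resample rows with replacement `reps` times; return the statistic of each resample."""
    n = len(rows)
    return [
        statistic([rows[rng.randrange(n)] for _ in range(n)])
        for _ in range(reps)
    ]


def dif_mean(rows):
    """Hypothetical stand-in for dif_TF: change in the mean between two periods."""
    n = len(rows)
    return sum(after for _, after in rows) / n - sum(before for before, _ in rows) / n


def entropy(values):
    """Shannon entropy of the shares values / sum(values); values must be positive."""
    total = sum(values)
    return -sum((v / total) * math.log(v / total) for v in values)


def dif_entropy(rows):
    """Hypothetical stand-in for dif_GCI: change in an entropy index between two periods."""
    return entropy([after for _, after in rows]) - entropy([before for before, _ in rows])


def percentile_interval(replicates, alpha=0.05):
    """Percentile interval read off the sorted bootstrap replicates."""
    ordered = sorted(replicates)
    r = len(ordered)
    lower = ordered[int(math.floor(alpha / 2 * r))]
    upper = ordered[min(r - 1, int(math.ceil((1 - alpha / 2) * r)) - 1)]
    return lower, upper


rng = random.Random(12345)          # fixed seed so the sketch is reproducible
N1, N2, R = 40, 60, 1000            # assumed sizes, for illustration only

# Simulated (before, after) pairs; real data would replace these.
tf_rows = [(rng.uniform(1.0, 2.0), rng.uniform(1.2, 2.2)) for _ in range(N2)]
gci_rows = [(rng.uniform(1.0, 5.0), rng.uniform(1.0, 5.0)) for _ in range(N1)]

dif_TF_reps = bootstrap_replicates(tf_rows, dif_mean, R, rng)        # step 1
dif_GCI_reps = bootstrap_replicates(gci_rows, dif_entropy, R, rng)   # step 2
dif_L_reps = [a * b for a, b in zip(dif_TF_reps, dif_GCI_reps)]      # step 3
lower, upper = percentile_interval(dif_L_reps)                       # step 4

print("percentile interval for dif_L:", lower, upper)
```

In this sketch replicate j of dif_TF is paired with replicate j of dif_GCI; since the two sets of resamples are drawn separately, the pairing itself is arbitrary.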

My question is the following:
Is the test based on the single dataset (as generated at point 3) still valid? Or do the usual distributional properties on which bootstrap-based tests rely fail to hold for a dataset generated in this way?


Any help, suggestion, or reference on this would be very welcome.

Many thanks. Gianluca Cafiso

___________________________________________________________
Dr. Gianluca Cafiso
Research Fellow, Economics Department-University of Catania.
Corso Italia 55, Catania, Italy.

e-mail: gcafiso@unict.it
tel.:  +39 0957537745

