# st: on bootstrap

From: "Rubil Ivica"
Subject: st: on bootstrap
Date: Thu, 27 Dec 2012 09:58:47 +0100

```
Dear Statalisters,

I have a question about the bootstrap. I have cross-sectional data on incomes for two countries, y0 and y1. I would like to obtain bootstrap standard errors for the ratio of the medians of these two income distributions. Two approaches come to mind, but I am not sure which one is more appropriate:

Option 1:
I create a dataset with variables y0 and y1. Since the two distributions come from different countries, the order in which each variable is sorted is arbitrary, so any pairing of y0 and y1 values is possible. However, the sorting matters for bootstrapping, because bootstrap resamples whole observations, and different sortings imply different pairs (y0, y1) in each "observation". As a result, I get different bootstrap results for different sortings.
The code I use is the following:

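cap prog drop medratio
prog medratio, rclass
    * the detail option is needed: plain summarize does not return r(p50)
    qui sum y0, detail
    scalar med0 = r(p50)
    qui sum y1, detail
    scalar med1 = r(p50)
    * ratio of the country 1 median to the country 0 median
    return scalar med_ratio = med1 / med0
end
bootstrap r(med_ratio), seed(1234) reps(500): medratio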

Option 2:
I append y1 to y0 and get a single income variable, y. In addition, I create a dummy variable, dummy1, equal to 1 for incomes from country 1 and 0 for incomes from country 0. Then I do the bootstrap using the following code:

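cap prog drop medratio1
prog medratio1, rclass
    * the detail option is needed: plain summarize does not return r(p50)
    qui sum y if dummy1 == 1, detail
    scalar med1 = r(p50)
    qui sum y if dummy1 == 0, detail
    scalar med0 = r(p50)
    * ratio of the country 1 median to the country 0 median
    return scalar med_ratio = med1 / med0
end
bootstrap r(med_ratio), seed(1234) reps(500): medratio1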

So, which of these two options seems more appropriate?

Thanks.

--
Ivica Rubil
Ekonomski institut || The Institute of Economics, Zagreb
Trg J. F. Kennedyja 7, 10 000 Zagreb, Croatia
tel. +385-1-2362-269 || fax. +385-1-2335-165
irubil@eizg.hr || www.eizg.hr

```
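The data preparation described under Option 2 (one income variable y plus the indicator dummy1) is not shown in the post. A minimal sketch of one way to build that layout follows, assuming that y0 and y1 start out side by side in the same dataset, as in Option 1. That starting layout is an assumption, and the post may well have used a different route, such as append.

```stata
* Assumption: y0 and y1 are in memory as two columns (the Option 1 layout).
* stack puts the y0 values on top of the y1 values in a new variable y
* and creates _stack, equal to 1 for rows from y0 and 2 for rows from y1.
stack y0 y1, into(y) clear

* The two samples need not be the same size, so drop the padding rows.
drop if missing(y)

* Indicator used by medratio1: 1 = country 1, 0 = country 0.
generate byte dummy1 = (_stack == 2)
drop _stack

* Quick check of the two group sizes.
tabulate dummy1
```

With the data in this shape, the post's call `bootstrap r(med_ratio), seed(1234) reps(500): medratio1` can be run unchanged. Not part of the original post: bootstrap also has a strata() option, for example strata(dummy1), which draws each replicate separately within each country and so keeps both sample sizes fixed.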