Statalist The Stata Listserver



Re: st: Construct Null Datasets through Bootstrap Resampling


From   Erik Ingelsson <[email protected]>
To   [email protected]
Subject   Re: st: Construct Null Datasets through Bootstrap Resampling
Date   Fri, 01 Dec 2006 23:32:05 -0500

Great, it works perfectly! Now I have two options for my resampling procedure, one with replacement and one without. They seem to give very similar answers, as they should.
Many thanks for all the help!
Erik Ingelsson


Quoting Michael Blasnik <[email protected]>:


Sure, here's a quick way to scramble the observations while keeping
groups of variables together.  It will work with up to four groups.
It's not the most elegant code, but it works (I think).  To specify
the groups, enclose each group's variable list in its own pair of
double quotes.

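program define scramblegrp
   version 9.2
   * each argument is one quoted list of variables that must stay together
   args grp1 grp2 grp3 grp4
   tempvar hold order rand
   * double avoids rounding long or double variables (numeric variables only)
   qui gen double `hold'=.
   gen long `order'=_n
   foreach grp in "`grp1'"  "`grp2'"  "`grp3'" "`grp4'" {
      * a fresh random sort turns the stored row numbers into a new permutation
      qui gen double `rand'=uniform()
      sort `rand'
      foreach var of local grp {
         qui replace `hold'=`var'
         qui replace `var'=`hold'[`order']
      }
      drop `rand'
   }
end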

Here's a quick test using two groups:

sysuse auto
scramblegrp "price mpg" "weight length turn"
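
One way to convince yourself that the groups really stay together is to compare correlations before and after the call. This is only a rough sketch, assuming scramblegrp has already been defined in the current session as above; the seed and the scalar names (within0, across0) are arbitrary choices made for the illustration. The price-mpg correlation (same group) should be unchanged up to rounding, while the price-weight correlation (different groups) should typically move toward zero.

```stata
sysuse auto, clear
set seed 12345

* correlations before scrambling
quietly correlate price mpg
scalar within0 = r(rho)
quietly correlate price weight
scalar across0 = r(rho)

scramblegrp "price mpg" "weight length turn"

* same group: rows were moved together, so the correlation is preserved
quietly correlate price mpg
display "price-mpg     before: " within0 "  after: " r(rho)

* different groups: the link between rows has been broken
quietly correlate price weight
display "price-weight  before: " across0 "  after: " r(rho)
```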

Michael Blasnik


----- Original Message -----
From: "Erik Ingelsson" <[email protected]>
To: <[email protected]>
Sent: Friday, December 01, 2006 5:17 PM
Subject: Re: st: Construct Null Datasets through Bootstrap Resampling


Thanks Michael,

After discussing this with our senior statistician today, I will probably go with sampling with replacement, since this is how they have done it before in SAS. However, do you think you could easily explain to me how to keep groups together in the scramble code as well, in case I need to do it that way on a later occasion?

Best,
Erik Ingelsson


Quoting Michael Blasnik <[email protected]>:


One more difference between the approaches that you should recognize is
that bsample samples with replacement (a value from a given observation
can appear more than once) while the scramble program I wrote does not
-- it simply re-arranges the order.  I'm not sure which is preferred
for your application.
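
The difference can be seen in a small sketch. It is not the original scramble program, only an illustration of the same idea; the variables id, copies, u, order and price_s are hypothetical helpers introduced here, and the seed is arbitrary. After bsample some observations appear several times and others not at all, whereas a re-ordered copy of price contains every original value exactly once.

```stata
sysuse auto, clear
set seed 12345
gen long id = _n                  // original row identifier

* with replacement: bsample draws _N observations, so some ids repeat
preserve
bsample
bysort id: gen long copies = _N
quietly count if copies > 1
display "observations that are repeated draws: " r(N)
restore

* without replacement: re-arrange price relative to the other variables
gen long order = _n
gen double u = uniform()
sort u
gen price_s = price[order]        // order is now a random permutation of 1.._N

* the scrambled copy holds the same values, only in a different order
summarize price price_s
isid id                           // every original observation is still present once
```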

If you end up wanting to sample without replacement (as scramble does)
but want to keep groups of variables together, a relatively modest
change to the scramble code would do the trick.

Michael Blasnik
*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/


---
Erik Ingelsson, MD, PhD

Current affiliation (until June 30, 2007):
Framingham Heart Study
73 Mt. Wayte Avenue, Suite 2
Framingham, MA 01702-5827
Phone: 508-935-3453
Fax: 508-626-1262
Cell: 508-202-8493

Permanent affiliation:
Uppsala University, Department of Public Health and Caring Sciences, Uppsala
Science Park, SE-751 85 Uppsala, SWEDEN.
Fax: +46-18-611 79 76
E-mail: [email protected]
---




