
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.



Re: st: Bootstrap command when used with cluster and strata options


From   "Chris Frost" <[email protected]>
To   <[email protected]>
Subject   Re: st: Bootstrap command when used with cluster and strata options
Date   Fri, 25 Oct 2013 07:51:47 +0100

Dear Jeff
 
Thanks for this. I just wanted to point out that the problem applies to the command "bootstrap" as well as "bsample". I hope that this command will be updated too.
 
Many thanks.
 
Chris  

>>> Jeff Pitblado, StataCorp LP <[email protected]> 24/10/2013 19:54 >>>
Chris Frost <[email protected]> is using -bootstrap- with options
-strata()-, -cluster()- and -idcluster()-, and noticed that the new cluster
variable repeats ID values (starting from 1) between the strata:

> I think that there is a problem with the bootstrap command when used in
> conjunction with the "cluster" and "strata" options. The problem arises
> because the command "bootstrap, strata(group) cluster(id) idcluster(newid)
> ....." creates a variable "newid" which is only unique (at the cluster
> level) within each stratum. For example, if there are 1000 subjects (with
> multiple measures per subject) each with a unique id but in two equal size
> groups the above command will result in each bootstrap sample having only
> 500 values of newid with subjects being erroneously paired up: this will
> lead to incorrect variance estimates with a command such as
>
> . bootstrap, strata(group) cluster(id) idcluster(newid):
>		 mixed outcome i.group || newid: 
> 
> Am I correct? Can this be fixed?

Austin Nichols <[email protected]> verified this, and pointed out that
-bsample- is the command that is producing the new cluster id variable.

Jeph Herrin <[email protected]> ran across this behavior in a reply to a
Statalist thread earlier this year.  Sorry I missed that thread, Jeph.

The documentation for -idcluster()- for -bsample- says:

	idcluster(newvar) creates a new variable containing a unique
		identifier for each resampled cluster.

This description agrees with Chris's expectation.  As such, we will update
-bsample- to behave as expected.
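
Until an updated -bsample- is installed, one possible workaround is to combine
the stratum variable with the per-stratum cluster id after resampling. The
following is only a minimal sketch for a single bootstrap draw, not an official
fix: the variable names outcome, group and id follow Chris's example, and newid2
is a hypothetical name introduced here for illustration.

```stata
* Sketch only: draw one stratified cluster bootstrap sample by hand.
preserve
bsample, strata(group) cluster(id) idcluster(newid)

* newid restarts at 1 within each stratum, so pair it with the stratum
* variable to obtain an identifier that is unique across the whole sample.
egen long newid2 = group(group newid)

* Fit the model using the identifier that is unique across strata.
mixed outcome i.group || newid2:
restore
```

To use the same idea with repeated draws, these steps would need to be wrapped
in a user-written program that is then looped over or passed to -simulate-;
that wrapper is not shown here.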

--Jeff
[email protected] 
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search 
*   http://www.stata.com/support/faqs/resources/statalist-faq/ 
*   http://www.ats.ucla.edu/stat/stata/



