
Re: st: too many duplicates with bsample, weight()?

From	[email protected] (Jeff Pitblado, StataCorp LP)
To	[email protected]
Subject	Re: st: too many duplicates with bsample, weight()?
Date	Tue, 28 Feb 2006 01:58:27 -0600

Matissa Hollister <[email protected]> wrote:

> I've been experimenting a bit with the bootstrap commands and there seems to
> be something wrong with the bsample command when the weight option is used.
> As I understand it, ...

Matissa has found a problem in -bsample- when used with the -weight()- option
and an expression that results in a resample size that is less than the sample
size.  While

	. bsample, weight(w)

is returning the correct frequency weights for a simple random sample with
replacement of the _N observations,

	. bsample 10, weight(w)

is not when _N >> 10 (for example).
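
One way to check this, as a rough sketch: with -weight()-, -bsample- leaves the
data in memory and writes the sampling frequencies into an existing variable,
so the frequencies should sum to the requested resample size.  The auto
dataset, the seed, and the resample size of 10 are arbitrary choices for
illustration.

	. sysuse auto, clear
	. set seed 12345
	. generate w = .
	. bsample 10, weight(w)
	. summarize w
	. display r(sum)

If the displayed total is not 10, the installed -bsample- most likely still
has the problem described above.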

We have fixed the problem, and the updated -bsample- will be available in the
next ado-file update.

> On a related note, is there the equivalent of the
> weight option for the bootstrap command? A way to
> leave the full dataset in memory?  I saw the -nodrop-
> option but it's not completely clear to me what it
> does.

In short, no.  The -nodrop- option prevents -bootstrap- from dropping
out-of-sample observations specified in the -if- and -in- conditions.  This
option is mostly useful for something like

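	program myboot, rclass
		* y is the outcome; group is a 0/1 indicator variable
		args y group
		* mean of y in group 0 (constant-only regression)
		reg `y' if `group' == 0
		local m0 = _b[_cons]
		* mean of y in group 1; e(sample) now marks only group 1
		reg `y' if `group' == 1
		return scalar diff = _b[_cons] - `m0'
	end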

	. sysuse auto
	. bootstrap diff=r(diff), nodrop reps(100) : myboot mpg foreign

Without the -nodrop- option, -bootstrap- would drop all the domestic cars.
This is because -bootstrap- assumes that -e(sample)-, as left behind by its
first call to the prefixed command, identifies the estimation sample, and it
drops out-of-sample observations by default.  Here the last regression in the
program uses only the foreign cars, so -e(sample)- marks only those
observations.
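
For keeping the full dataset in memory, one possible workaround (a sketch
only, not a feature of -bootstrap-) is to loop over -bsample, weight()-
yourself and pass the frequencies to the estimation command as -fweight-s.
The model, the number of replications, and the results file name bsresults
are hypothetical choices for illustration; run this from a do-file.

	sysuse auto, clear
	set seed 12345
	generate w = .
	tempname results
	postfile `results' b_weight using bsresults, replace
	forvalues i = 1/100 {
		* overwrite w with a new set of sampling frequencies
		bsample, weight(w)
		quietly regress mpg weight [fweight=w]
		post `results' (_b[weight])
	}
	postclose `results'

The replicates saved in bsresults.dta can then be summarized to get a
bootstrap standard error, for example with -summarize b_weight- after
-use bsresults-.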

--Jeff
[email protected]