Re: st: bs problems

From	[email protected] (Jeff Pitblado, Stata Corp.)
To	[email protected]
Subject	Re: st: bs problems
Date	Tue, 22 Jul 2003 19:29:38 -0500
Jun Xu <[email protected]> asks why the observed number of replications is
sometimes less than the number requested:

> Thanks for Stata Corp people (Jeff) responding to my problem. I would be 
> interested in knowing when the update will be realized.

The fix to -bootstrap- is in the wings for the next ado-update.

> One more problem, why the reps(30) I specified is not consistent with the
> number under the Reps column (22) (as well as the matrix e(reps)? Also,
> though the ereturn can be modified to have the e(size) something like that,
> is that possible you could add the size return matrix too for next update?
> Really appreciate.

I'll answer the second question first.  Jun Xu can save the estimation sample
size by including it on the list of expressions to bootstrap.  For example,

	. bootstrap "logit ..." _b size=e(N), ...
	                           ^^^^^^^^^

There are two reasons why the observed number of replications for a given
bootstrapped statistic may be less than the number of requested
replications:

	1.  The expression cannot be calculated after the command is executed
	    using some of the bootstrap data sets.

	2.  The command failed for some of the bootstrap data sets.

I believe it is the second that Jun Xu is observing.  Some of the logistic
regressions are failing because -bootstrap- is supplying -logit- with a
dependent variable that is either all 0's or all 1's.

The following code will reproduce this behavior:

	. sysuse auto, clear
	. keep in -25/l
	. tabulate for
	. set seed 1234
	. bootstrap "logit for mpg" _b size=e(N), reps(30) size(20)

Using the auto data, I remove all but the last 25 observations.  The
-tabulate- command shows that only 3 out of the 25 cars are Domestic.  A
bootstrap sample from this data set may well consist of only Foreign cars,
even more so if we randomly sample only 20 of the 25 cars.  The -noisily-
option will display the output from -logit- for each bootstrap sample.  When
this option is given, -bootstrap- will also output a message indicating that
it will be posting missing values when either (1) or (2) above occurs.
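
To look at a single resample by hand, here is a minimal sketch using
-bsample- on the same 25 cars.  These lines are an illustration and are not
part of the reproduction; -preserve- and -restore- are there only to keep the
data in memory intact.

	. preserve
	. bsample 20
	. tabulate for
	. restore

If the table lists only Foreign cars, -logit for mpg- on that sample stops
with an error because the outcome does not vary.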

A log of the reproduction commands, run with -noisily-, follows:

***** BEGIN: log
. sysuse auto, clear
(1978 Automobile Data)

. keep in -25/l
(49 observations deleted)

. tab for

   Car type |      Freq.     Percent        Cum.
------------+-----------------------------------
   Domestic |          3       12.00       12.00
    Foreign |         22       88.00      100.00
------------+-----------------------------------
      Total |         25      100.00

. set seed 1234

. bootstrap "qui logit for mpg" _b size=e(N), reps(30) size(20) noi

bootstrap: First call to (qui logit for mpg) with data as is:

. qui logit for mpg

bootstrap header:

command:      qui logit for mpg
statistics:   b_mpg      = _b[mpg]
              b_cons     = _b[_cons]
              size       = e(N)

30 calls to (qui logit for mpg) with bootstrap samples:

. qui logit for mpg
. qui logit for mpg
captured error running (qui logit for mpg), posting missing values
. qui logit for mpg
. qui logit for mpg
captured error running (qui logit for mpg), posting missing values
. qui logit for mpg
. qui logit for mpg
. qui logit for mpg
captured error running (qui logit for mpg), posting missing values
. qui logit for mpg
. qui logit for mpg
. qui logit for mpg
captured error running (qui logit for mpg), posting missing values
. qui logit for mpg
. qui logit for mpg
. qui logit for mpg
captured error running (qui logit for mpg), posting missing values
. qui logit for mpg
. qui logit for mpg
. qui logit for mpg
. qui logit for mpg
. qui logit for mpg
. qui logit for mpg
. qui logit for mpg
. qui logit for mpg
. qui logit for mpg
. qui logit for mpg
. qui logit for mpg
. qui logit for mpg
. qui logit for mpg
. qui logit for mpg
. qui logit for mpg
. qui logit for mpg
. qui logit for mpg

Bootstrap statistics                              Number of obs    =        25
                                                  Replications     =        30

------------------------------------------------------------------------------
Variable     |  Reps  Observed      Bias  Std. Err. [95% Conf. Interval]
-------------+----------------------------------------------------------------
       b_mpg |    25  .1438087  .1078638   .198525  -.2659268   .5535442   (N)
             |                                       .0174609    .890745   (P)
             |                                       .0174609   .5159226  (BC)
      b_cons |    25 -1.241594  -2.21652  4.356838  -10.23366   7.750477   (N)
             |                                      -18.76268   1.817488   (P)
             |                                      -8.075786   1.817488  (BC)
        size |    25        25        -5         0         25         25   (N)
             |                                             20         20   (P)
             |                                              .          .  (BC)
------------------------------------------------------------------------------
Note:  N   = normal
       P   = percentile
       BC  = bias-corrected

***** END: log
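
The 5 captured errors in the log account for the gap between requested and
observed replications: 30 - 5 = 25.  As a rough check on how often this should
be expected, and assuming each bootstrap sample is 20 independent draws with
replacement from the 25 cars, the chance that a sample contains no Domestic
car is (22/25)^20, or about 0.08.  -display- will do the arithmetic; the
second line gives the chance that at least one of 30 samples is all Foreign,
about 0.91.

	. display (22/25)^20
	. display 1 - (1 - (22/25)^20)^30

This counts only samples with no Domestic cars, so it is a ballpark figure and
not a prediction of the exact number of failed replications.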

--Jeff
[email protected]