


RE: st: Bootstrapping question

From   "Ilian, Henry (ACS)" <>
To   "''" <>
Subject   RE: st: Bootstrapping question
Date   Fri, 8 Feb 2013 11:08:06 -0500

Maarten, This helps a great deal. Bootstrapping is a new idea for me. I saw it as a way to solve the problem of the fuzziness around the frequencies we are reporting, since people are making decisions based on those frequencies. If the proportion we report could be nearly 20 percentage points in either direction, then it doesn't tell us very much about what's going on in the real world. People might be doing poorly in an area reflected by the proportion, but they might also be doing well. Fortunately, we're supplementing the case-reading frequencies with focus groups among the people who work on the cases. If there is some agreement, then we can feel better about the numbers we're reporting, and the decision makers can feel more confident that they're making the right decisions. Right now, the process has the quality of fumbling around in the dark. I'm aware that people use small samples for many purposes, and if you didn't have small samples, you'd have no samples. I was looking for a better way of saying what's going on. This may not be it.


-----Original Message-----
From: [] On Behalf Of Maarten Buis
Sent: Friday, February 08, 2013 3:49 AM
Subject: Re: st: Bootstrapping question

On Thu, Feb 7, 2013 at 10:28 PM, Ilian, Henry (ACS) wrote:
> I looked at the table of contents. The book is clearly worth having, but it doesn't seem to cover the sample-size problem--which actually may not be a problem, since the sample size is what it is, and there isn't a way to make it any larger. By improved, I meant narrower, although that's such an obvious answer I don't think it was what you were asking me. If bootstrapping won't result in narrower confidence intervals, then I'll have to live with the confidence intervals as they are.

It is not obvious that smaller confidence intervals represent an
improvement. A confidence interval is based on a thought experiment:
what if I could draw many new samples of the same size from my
population and compute my statistic in each of these samples? Each of
these statistics would be slightly different, as they are based on a
different random sample from the population. The 95% confidence
interval is an estimate of the interval within which 95% of these
hypothetical statistics will fall. This is an estimate of the
uncertainty you have about your estimate, and the source of that
uncertainty is the fact that you don't have the entire population but
only a sample from that population. If you are unhappy about the size
of that interval, then the obvious way to reduce it is to increase
the sample size. There are other cute ways of improving the precision
of your estimate, e.g. stratified sampling, but don't expect too much
from that: there is no way around the fact that a sample size of 27 is
small and any estimate based on that sample size will be uncertain.
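To make the point concrete (a quick illustrative sketch, not something from the thread): for a proportion, the half-width of the usual normal-approximation interval scales with the square root of the sample size, so halving the interval requires quadrupling n. The numbers below assume the worst case p = 0.5 and the n = 27 mentioned above.

```python
import math

def wald_half_width(p_hat, n, z=1.96):
    """Half-width of the normal-approximation (Wald) 95% CI for a proportion."""
    return z * math.sqrt(p_hat * (1 - p_hat) / n)

# Worst case (p = 0.5) with the sample size from the thread:
print(round(wald_half_width(0.5, 27), 3))   # about 0.189, i.e. nearly 19 percentage points
# Quadrupling n only halves the interval:
print(round(wald_half_width(0.5, 108), 3))  # about 0.094
```

This matches the "nearly 20 percentage points in either direction" figure quoted earlier: it is simply what n = 27 buys you.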

If you say "improving the confidence interval", then that would mean
to me making sure that the probability that the statistic computed on
a random draw from the population falls within the 95% confidence
interval is indeed 95%. This may seem trivial, but for many estimates
of confidence intervals it is not strictly true. Some confidence
intervals are based on a computation that assumes an infinitely large
sample, and then the question becomes how large the sample has to
be before this approximation becomes reasonable. Improving the
confidence interval would in that case mean some sort of adjustment
that takes into account that you have a sample of finite size (which
would typically widen the confidence interval rather than narrow
it). For other problems, computing confidence intervals
is just very, very hard, and all existing estimates are approximate. The
estimate of a proportion is a good example: if we have N
observations, then our estimate of the proportion can only take one of
N+1 possible values: 0/N, 1/N, 2/N, ..., or N/N. This discreteness
makes the computation of an interval with exactly 95% coverage very
hard. Paradoxically, the estimates of that interval that are called
"exact" have far worse coverage than many approximate methods.

Bootstrap confidence intervals can be said to be better at dealing
with small samples in the sense that they tend to make fewer
assumptions. They are not better in the sense that they will lead to
smaller confidence intervals; that might or might not be the case,
depending on the type of violation of assumptions in the method with
which you compare the bootstrap estimate.
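For the record, here is what a percentile bootstrap for a proportion actually does (an illustrative sketch, not Stata's -bootstrap- command; the sample of 27 readings with 9 "acceptable" outcomes is hypothetical): resample the data with replacement many times, recompute the proportion each time, and take the middle 95% of those replicates as the interval.

```python
import random

# Hypothetical sample of 27 binary case readings (1 = acceptable): 9 of 27.
sample = [1] * 9 + [0] * 18

def bootstrap_percentile_ci(data, reps=5000, alpha=0.05, seed=12345):
    """Percentile bootstrap CI: resample with replacement, recompute the
    proportion in each replicate, take the central (1 - alpha) share."""
    rng = random.Random(seed)
    stats = sorted(
        sum(rng.choices(data, k=len(data))) / len(data) for _ in range(reps)
    )
    lo = stats[int(alpha / 2 * reps)]
    hi = stats[int((1 - alpha / 2) * reps) - 1]
    return lo, hi

lo, hi = bootstrap_percentile_ci(sample)
print(lo, hi)  # still a wide interval: n = 27 limits precision, bootstrap or not
```

Note that the resulting interval is roughly as wide as the normal-approximation one: the bootstrap changes the assumptions, not the amount of information in 27 observations.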

Hope this helps,

Maarten L. Buis
Reichpietschufer 50
10785 Berlin

