Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.

# Re: st: Bias: Monte Carlo

From: John Antonakis
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Bias: Monte Carlo
Date: Mon, 06 May 2013 22:32:18 +0200

```Thanks Stas, Maarten.

Best,
J.

__________________________________________

John Antonakis
Professor of Organizational Behavior
Director, Ph.D. Program in Management

University of Lausanne
Internef #618
CH-1015 Lausanne-Dorigny
Switzerland
Tel ++41 (0)21 692-3438
Fax ++41 (0)21 692-3305
http://www.hec.unil.ch/people/jantonakis

Associate Editor
__________________________________________

On 06.05.2013 17:04, Stas Kolenikov wrote:
```
```Continuing on Maarten's note, my -bsweights- package produces
first-order balanced bootstrap weights that remove the random
simulation error from the bootstrap results for the point estimates.
See http://stata-journal.com/article.html?article=st0187 and
references on the balanced bootstrap therein.

I think the bias in parameter estimates should be put into the context
of MSE: if the bias component is greater than the variance component,
then the estimator is relatively useless. If the magnitude of bias is
say 1/2 that of the standard deviation of the sampling distribution of
your estimator, so that the contribution of bias to the total MSE is
20%, I would personally be able to live with that. With bias this big,
though, your estimation procedure should report the standard errors
that are based on MSE, not on the variance alone. For lots of
"regularly behaving" estimators, the sampling standard deviation
(estimated by the standard error) is O(n^{-1/2}), and bias is
O(n^{-1}), going to zero faster with the sample size than the standard
deviation, so in large samples, the bias asymptotically disappears.
Then the question of "how large bias is tolerable" becomes the
question of "how large my sample size should be" (for the normal
approximations to make sense).

Finally, the % bias is an awful measure when the true value of the
parameter is zero (and the whole situation is shift invariant, as in
the mean of the normal distribution, so the initial point on the scale
is arbitrary). It's probably OK in the constant CV situation of
heavily skewed distributions, but may not make sense in other
situations.

-- Stas Kolenikov, PhD, PStat (SSC)
-- Senior Survey Statistician, Abt SRBI
-- Opinions stated in this email are mine only, and do not reflect the
position of my employer
-- http://stas.kolenikov.name

On Mon, May 6, 2013 at 3:25 AM, Maarten Buis <maartenlbuis@gmail.com> wrote:
```
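The arithmetic in Stas's note (bias equal to half the sampling standard deviation contributing 20% of the MSE) follows from MSE = bias^2 + variance. The Python sketch below only illustrates that decomposition; the function names and the constants `b` and `s` are hypothetical and come from neither the thread nor any Stata package.

```python
import math

def bias_share_of_mse(bias, sd):
    """Fraction of MSE = bias**2 + sd**2 that is due to squared bias."""
    return bias**2 / (bias**2 + sd**2)

def rmse(bias, sd):
    """Root mean squared error, usable as an MSE-based 'standard error'."""
    return math.sqrt(bias**2 + sd**2)

# Bias half as large as the sampling standard deviation:
# 0.25 / (0.25 + 1.0) = 0.2, i.e. 20% of the MSE.
share = bias_share_of_mse(0.5, 1.0)

# Assumed rates for a "regularly behaving" estimator:
# bias = b / n and sd = s / sqrt(n), with hypothetical constants b and s.
b, s = 1.0, 1.0
shares_by_n = {
    n: bias_share_of_mse(b / n, s / math.sqrt(n))
    for n in (10, 100, 1000)
}
```

Under these assumed rates the bias share of the MSE shrinks as n grows, which is the sense in which the question of tolerable bias turns into a question about sample size.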
```On Mon, May 6, 2013 at 9:49 AM, John Antonakis wrote:
```
```I am running some Monte Carlos where I am interested in observing the bias
in parameter estimates across manipulated conditions. By bias I mean the
absolute percentage difference of the simulated value from the true value.

I was wondering whether there has been anything written about how much bias
is "acceptable"--I know that this is like asking how long is a piece of
string and that there is no statistical fiat that can give a definitive
answer, because it also is a very field specific issue.
```
```It is probably not quite the answer you are looking for (and I think
you are right by wondering whether such an answer can exist), but one
thing you can do is take into account that a Monte Carlo experiment
contains a random component, so if you repeat the experiment (with a
different seed) you will get a slightly different estimate of your
bias. The logic behind this variation between Monte Carlo experiments
is pretty much the same as the logic behind statistical testing: so
you can compute standard errors and confidence intervals. This is the
idea behind: Ian R. White (2010) "simsum: Analyses of simulation
studies including Monte Carlo error" The Stata Journal, 10(3):369--385
and <http://www.maartenbuis.nl/software/simpplot.html>. It is not very
useful as a definition of what amount of bias is "acceptable" as you
can arbitrarily make the bounds around your estimate of the bias
smaller by increasing the number of iterations, but at least this type
of bounds prevents you from over-interpreting the result from your
simulation, as happened here:
<http://stats.stackexchange.com/questions/55676#55676>.

Hope this helps,
Maarten

---------------------------------
Maarten L. Buis
WZB
Reichpietschufer 50
10785 Berlin
Germany

http://www.maartenbuis.nl
---------------------------------
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/faqs/resources/statalist-faq/
*   http://www.ats.ucla.edu/stat/stata/
```
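Maarten's point, that the estimated bias from a Monte Carlo experiment carries its own sampling error, can be sketched in a few lines of Python. This is an illustration only, not the implementation used by -simsum- or -simpplot-. The choice of estimator (the variance estimator that divides by n, applied to normal data), the function name, the seed, and the replication counts are all assumptions made for the example.

```python
import math
import random
import statistics

def simulate_bias(n_obs, n_reps, seed, true_var=1.0):
    """Estimate the bias of the divide-by-n variance estimator, together
    with the Monte Carlo standard error of that bias estimate."""
    rng = random.Random(seed)
    sd = math.sqrt(true_var)
    estimates = []
    for _ in range(n_reps):
        sample = [rng.gauss(0.0, sd) for _ in range(n_obs)]
        # pvariance divides by n, so it is biased downward by true_var / n
        estimates.append(statistics.pvariance(sample))
    bias = statistics.fmean(estimates) - true_var
    # Monte Carlo standard error of the mean of the estimates
    mcse = statistics.stdev(estimates) / math.sqrt(n_reps)
    return bias, mcse

bias, mcse = simulate_bias(n_obs=10, n_reps=2000, seed=12345)
# Normal-approximation 95% interval for the simulated bias
ci = (bias - 1.96 * mcse, bias + 1.96 * mcse)
```

Rerunning with a different seed gives a slightly different bias estimate, and raising `n_reps` narrows the interval. That is why such bounds guard against over-reading one simulation run but cannot define how much bias is "acceptable".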