Re: st: Unbiased standard deviation in summarize


From   Nick Cox <njcoxstata@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: Unbiased standard deviation in summarize
Date   Fri, 16 Nov 2012 22:52:38 +0000

The formula used by -summarize- is documented. Even if it weren't,
experiment would show that the SD is calculated as the square root of
the variance, itself calculated with a divisor of (n - 1).

. set obs 7
obs was 0, now 7

. gen y = _n

. su y

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
           y |         7           4    2.160247          1          7

. mata
------------------------------------------------- mata (type end to exit) ---------------------------------------------
: y = 1::7

: sqrt(mean((y :- mean(y)):^2))
  2
: sqrt(sum((y :- mean(y)):^2)/6)
  2.160246899
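
The same check can be made without Mata, using the results that
-summarize- leaves behind. This is only a sketch: it rebuilds the toy
data from above and assumes nothing beyond the stored results r(N),
r(Var) and r(sd).

```stata
* Rebuild the toy data used above: y = 1, ..., 7
clear
set obs 7
generate y = _n

quietly summarize y

* SD as reported by -summarize-, i.e. with divisor (n - 1)
display r(sd)

* Rescale the variance to divisor n before taking the square root
display sqrt(r(Var) * (r(N) - 1) / r(N))
```

The two numbers displayed should match the 2.160247 and 2 shown above,
in that order.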

Although a divisor of (n - 1) does give an unbiased estimate of
variance, its square root is _not_ an unbiased estimate of SD, and you
would need to program a correction factor yourself.

: n = 7

: sqrt(2 / (n - 1)) * exp(lngamma(n / 2)) / exp(lngamma((n - 1)/2))
  .9593687887
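
As a sketch of how that factor might be applied to -summarize- output:
the factor above is the ratio of the expected sample SD to the true SD
when the data are normally distributed, so dividing the reported SD by
it gives an estimate that is unbiased under that assumption only. The
scalar names n, sd and c4 are just illustrative choices.

```stata
* Rebuild the toy data used above: y = 1, ..., 7
clear
set obs 7
generate y = _n

quietly summarize y
scalar n  = r(N)
scalar sd = r(sd)

* Correction factor; assumes normally distributed data.
* Differencing lngamma() before exponentiating avoids overflow for large n.
scalar c4 = sqrt(2 / (n - 1)) * exp(lngamma(n / 2) - lngamma((n - 1) / 2))

display "correction factor = " c4
display "corrected SD      = " sd / c4
```

The factor approaches 1 quickly as n grows, so the correction matters
mainly for small samples.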

This is not quite what is given at

http://en.wikipedia.org/wiki/Unbiased_estimation_of_standard_deviation

It's also documented everywhere that Stata is called "Stata".

Nick

On Fri, Nov 16, 2012 at 10:00 PM, Daniel Almar de Sneijder
<dasneijder@gmail.com> wrote:
> Hello STATA,
>
> First of all, my initial guess is that the computed standard deviation in
> STATA does not correct for possible biases. However I am not sure as I
> haven't found any good documentation on this issue. My questions are:
>
> 1. Does anybody know whether the standard deviations, computed with the
> command summarize, are unbiased?
>
> 2. If so, how can I correct for this?