
st: RE: RE: RE: highly skewed, highly zeroed data

From   "Nick Cox" <>
To   <>
Subject   st: RE: RE: RE: highly skewed, highly zeroed data
Date   Thu, 26 Nov 2009 10:34:10 -0000

Jay makes an interesting point, although in turn it can be restated to
acknowledge that the central limit theorem comes in numerous different
flavours depending on quite what assumptions are being made. (For
example, there are flavours allowing various kinds of dependence.)
Alternatively, purists might want to talk of a family of central limit
theorems.

However, my guess is that this is not the central issue. (That pun was
unintentional in my first draft and deliberate in my second.) Although
with lots of zeros and strong skew the distribution concerned is awkward
practically, I'd be surprised if it was pathological mathematically, or
indicative of an underlying distribution that was. The point could be
explored a little by e.g. bootstrapping. 
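
As a rough illustration of that suggestion, here is a minimal Python sketch of bootstrapping the mean. The data are NOT the original poster's: `make_sample` is a hypothetical generator (70% exact zeros, the rest lognormal) chosen only to mimic "lots of zeros and strong skew", and the function names, seed, and parameter values are all assumptions made for the example.

```python
import random
import statistics


def make_sample(n, rng, p_zero=0.7, sigma=1.5):
    """Hypothetical stand-in data: zero with probability p_zero, else lognormal."""
    return [0.0 if rng.random() < p_zero else rng.lognormvariate(0.0, sigma)
            for _ in range(n)]


def bootstrap_means(data, rng, reps=2000):
    """Resample the data with replacement and record the mean of each resample."""
    n = len(data)
    return [statistics.fmean(rng.choices(data, k=n)) for _ in range(reps)]


def percentile_interval(values, level=0.95):
    """Simple percentile interval read off the sorted bootstrap means."""
    ordered = sorted(values)
    alpha = (1.0 - level) / 2.0
    last = len(ordered) - 1
    return ordered[int(alpha * last)], ordered[int((1.0 - alpha) * last)]


rng = random.Random(2009)  # fixed seed so the sketch is reproducible
sample = make_sample(200, rng)
means = bootstrap_means(sample, rng)
lo, hi = percentile_interval(means)

print("sample mean   :", statistics.fmean(sample))
print("sample median :", statistics.median(sample))
print("bootstrap 95% percentile interval for the mean:", (lo, hi))
```

Because every resample is drawn from non-negative values, the percentile interval cannot fall below zero, and a histogram of `means` gives a direct look at how far the sampling distribution of the mean is from symmetric at this sample size.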

The median in the sample data was clearly zero! 


Verkuilen, Jay

Kieran McCaul wrote:

>The skew in the data does not stop you from calculating the mean, nor
does it stop you from calculating a 95% CI around the mean.
Regardless of the skew in the data, the sampling distribution of the
mean will be Normal.<

Not true. It will tend towards normality (in the sense of convergence in
distribution) assuming regularity conditions for the central limit
theorem hold, which for highly skewed variables is often NOT the case.
But that convergence may be VERY slow and the resulting confidence
interval for the mean may be extremely poor (incredibly wide) or even
ludicrous (e.g., below the lower bound of the data). 
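
One way to see how slow that convergence can be is a small simulation, sketched here in Python under loudly stated assumptions: the population is a hypothetical mixture (70% zeros, otherwise lognormal with sigma 1.5), not anyone's real data, and `draw`, `normal_ci`, `simulate`, and the sample sizes are invented for the illustration. It counts how often the usual mean plus or minus 1.96 standard errors interval covers the true mean, and how often its lower limit drops below zero.

```python
import math
import random

P_ZERO, SIGMA = 0.7, 1.5  # assumed mixture parameters, for illustration only
# True mean of the mixture: (1 - P_ZERO) * E[lognormal(0, SIGMA)]
TRUE_MEAN = (1.0 - P_ZERO) * math.exp(SIGMA ** 2 / 2.0)


def draw(n, rng):
    """One sample of size n from the hypothetical zero-heavy, skewed population."""
    return [0.0 if rng.random() < P_ZERO else rng.lognormvariate(0.0, SIGMA)
            for _ in range(n)]


def normal_ci(data, z=1.96):
    """Normal-theory interval for the mean: mean +/- z * sd / sqrt(n)."""
    n = len(data)
    mean = math.fsum(data) / n
    var = math.fsum((x - mean) ** 2 for x in data) / (n - 1)
    half_width = z * math.sqrt(var / n)
    return mean - half_width, mean + half_width


def simulate(n, reps, rng):
    """Return (share of intervals covering TRUE_MEAN, share with lower limit < 0)."""
    covered = below_zero = 0
    for _ in range(reps):
        lo, hi = normal_ci(draw(n, rng))
        covered += lo <= TRUE_MEAN <= hi
        below_zero += lo < 0.0
    return covered / reps, below_zero / reps


rng = random.Random(26112009)  # fixed seed so the sketch is reproducible
results = {n: simulate(n, 1000, rng) for n in (20, 200, 2000)}
for n, (coverage, negative) in results.items():
    print(f"n={n:5d}  coverage={coverage:.3f}  lower limit below zero={negative:.3f}")
```

Comparing the printed coverage with the nominal 0.95 across sample sizes shows how much data this particular made-up population needs before the normal approximation becomes serviceable; a different degree of skew or share of zeros would change the picture.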

I would wonder whether the original poster might want to estimate a
median instead of a mean?
