# Re: st: RE: RE: RE: highly skewed, highly zeroed data

From: Maarten Buis
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: RE: RE: RE: highly skewed, highly zeroed data
Date: Thu, 26 Nov 2009 03:29:29 -0800 (PST)

```
--- On Thu, 26/11/09, Nick Cox wrote:
> Although with lots of zeros and strong skew the distribution
> concerned is awkward practically, I'd be surprised if it was
> pathological mathematically, or indicative of an underlying
> distribution that was. The point could be explored a little
> by e.g. bootstrapping.

Here is an attempt to do that. It looks at computing the
mean of a chi-square distribution with .1 degrees of freedom.
It shows that the sampling distribution of the mean is better
approximated by the normal distribution when the mean is
computed in larger samples, which doesn't surprise me. This
example requires the user-written -hangroot- package; to
install it, type -ssc install hangroot-.

*--------------- begin example --------------------
// see the chi^2(.1) distribution
// looks pretty skewed to me...
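// A possible way to draw it (a sketch, not part of the original
// post): plot the density with chi2den(), assuming a Stata
// version that provides that function. The density is unbounded
// at 0 when df < 2, so the range starts just above 0.
twoway function y = chi2den(.1, x), range(.01 5)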

// simulate the sampling distribution
set more off
program drop _all
program define sim, rclass
    // `1' = sample size, `2' = degrees of freedom
    drop _all
    set obs `1'
    gen y = rchi2(`2')
    sum y, meanonly
    return scalar m = r(mean)   // mean of this sample
end

simulate m=r(m), reps(10000) : sim 500 .1
hangroot m , susp ci theoropt(lpattern(-))
// sampling distribution looks a bit off

simulate m=r(m), reps(10000) : sim 5000 .1
hangroot m , susp ci theoropt(lpattern(-))
// but in a larger sample it looks about fine
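
// A rough check on why (an added sketch, not from the original
// post): the sum of n independent chi^2(k) draws is chi^2(n*k),
// so the mean is chi^2(n*k)/n, with skewness sqrt(8/(n*k)).
// A normal distribution has skewness 0, so smaller is better.
display sqrt(8/(500*.1))    // skewness of the mean, n = 500
display sqrt(8/(5000*.1))   // skewness of the mean, n = 5000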
*------------------ end example -----------------
(For more on how to use examples I send to Statalist, see:
http://www.maartenbuis.nl/stata/exampleFAQ.html )

Hope this helps,
Maarten

--------------------------
Maarten L. Buis
Institut fuer Soziologie
Universitaet Tuebingen
Wilhelmstrasse 36
72074 Tuebingen
Germany

http://www.maartenbuis.nl
--------------------------

```