Re: st: RE: RE: RE: highly skewed, highly zeroed data

From   Maarten Buis
Subject   Re: st: RE: RE: RE: highly skewed, highly zeroed data
Date   Thu, 26 Nov 2009 03:29:29 -0800 (PST)

--- On Thu, 26/11/09, Nick Cox wrote:
> Although with lots of zeros and strong skew the distribution
> concerned is awkward practically, I'd be surprised if it was
> pathological mathematically, or indicative of an underlying 
> distribution that was. The point could be explored a little
> by e.g. bootstrapping. 

Here is an attempt to do that. It looks at computing the
mean of samples from a chi-square distribution with .1 degrees
of freedom. It shows that the sampling distribution of the mean
is better approximated by the normal distribution when the mean
is computed in larger samples, which doesn't surprise me. This
example requires the -hangroot- package, which can be installed
by typing in Stata:
-ssc install hangroot-

*--------------- begin example --------------------
// see the chi^2(.1) distribution
twoway function y =  gammaden(`=.1/2',2,0,x)
// looks pretty skewed to me...

// simulate the sampling distribution
set more off
program drop _all
program define sim, rclass
	// `1' = sample size, `2' = degrees of freedom
	drop _all
	set obs `1'
	gen y = rchi2(`2')
	sum y, meanonly
	return scalar m = r(mean)
end

simulate m=r(m), reps(10000) : sim 500 .1
hangroot m , susp ci theoropt(lpattern(-))
// sampling distribution looks a bit off

simulate m=r(m), reps(10000) : sim 5000 .1
hangroot m , susp ci theoropt(lpattern(-))
// but in a larger sample it looks about fine
*------------------ end example -----------------
( For more on how to use examples I sent to statalist see: )
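
A numerical check can go alongside the graphs. This is only a
sketch, and it assumes the -sim- program from the example above
is still in memory. The skewness of a chi^2(k) variable is
sqrt(8/k), so the mean of n independent draws has skewness
sqrt(8/(k*n)). For k = .1 that is 0.4 when n = 500 and about
0.126 when n = 5000; the normal distribution has skewness 0.

*--------------- begin example --------------------
// compare simulated and theoretical skewness of the mean
// (reuses the -sim- program defined in the example above)
foreach n of numlist 500 5000 {
	simulate m=r(m), reps(10000) : sim `n' .1
	quietly sum m, detail
	di "n = `n'" ///
	   "  simulated skewness: "   %5.3f r(skewness) ///
	   "  theoretical skewness: " %5.3f sqrt(8/(.1*`n'))
}
*------------------ end example -----------------

The simulated values will vary a little from run to run unless
you fix the seed with -set seed- first.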

Hope this helps,

Maarten L. Buis
Institut fuer Soziologie
Universitaet Tuebingen
Wilhelmstrasse 36
72074 Tuebingen

