I would generate lots of values from a Gaussian
and then take a big bite out of it, with some
randomness about the biting.
. set obs 1000
. gen normal = invnorm(uniform())
. qnorm normal
. histogram normal if (abs(normal) * uniform()) > 0.5
. histogram normal if (abs(normal) * uniform()) > 0.6
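If the clusters should sit exactly on the intervals given in the question, something along these lines may be closer. This is only a sketch: it assumes values are uniform within each interval (the question says only "smoothly distributed"), and the variable name x and the seed are arbitrary choices.

```stata
clear
set seed 12345
set obs 1000

* Assumption: uniform within each interval.
* First 900 observations fall on [-0.2, 1], the last 100 on [-1, -0.8].
gen x = cond(_n <= 900, -0.2 + 1.2 * uniform(), -1 + 0.2 * uniform())

* Histogram with a normal density (same mean and SD as x) superimposed.
histogram x, normal
```

The fitted normal curve should then put visible density in the empty (-0.8, -0.2) gap.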
Nick
n.j.cox@durham.ac.uk
D.Christodoulou
>
> I need to illustrate the failure of the normal distribution,
> in terms of
> linear scaling, to describe a distribution with fairly distinguished
> clusters.
>
> To go to the extreme and make this more obvious, I consider
> the following
> illustration:
> Let us assume a sample of 1000 observations which are
> distributed in a [-1, 1]
> interval. Suppose that there is a large but smoothly
> distributed cluster of
> 900 observations that take values in the [-0.2, 1] interval,
> and the rest
> of the 100 observations lie in the [-1, -0.8] interval.
> Thirty percent of
> the linear scaling will be used for non-existent values. The normal
> distribution will fit a large volume of variation in that gap.
> (The example is merely to make clear the possibility of
> mis-scaling and how
> it works, I'm not bothered with the obvious mixture of distributions)
>
> I need to generate the appropriate data and do this on a graph, e.g. a
> histogram with a superimposed normal density (the graph is
> not a problem).
> I have been playing around with the -invnorm(uniform())- function to
> generate the data but I didn't even get near to what I want to do. Any
> suggestions are gratefully appreciated!