Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: uniform distribution
From: David Hoaglin <[email protected]>
To: [email protected]
Subject: Re: st: uniform distribution
Date: Sat, 9 Nov 2013 08:43:23 -0500

Nikos,

No approximation to the binomial distribution is involved.

The approach uses a basic property of (continuous) probability
distributions. If X is an observation from a distribution whose
cumulative distribution function (c.d.f.) is F, then U = F(X) has a
uniform(0,1) distribution. That is, I am transforming X by using the
c.d.f. of its own distribution. This holds for any continuous
distribution, not just the normal distribution.
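
A minimal sketch of this property, assuming the standard normal as the example distribution and meant to be run from a do-file. The seed, the sample size, and the variable names x and u are arbitrary choices for illustration, not part of the original suggestion.

```stata
* Sketch: U = F(X) is uniform(0,1) when X is continuous with c.d.f. F
clear
set seed 12345                      // arbitrary seed, for reproducibility only
set obs 5000                        // arbitrary sample size
generate x = rnormal()              // X drawn from the standard normal
generate u = normal(x)              // U = F(X); normal() is the standard normal c.d.f.
summarize u                         // expect mean near 0.5 and sd near 0.289
histogram u, width(0.05) start(0)   // bars should be roughly level
```

The same check should work for any other continuous distribution by swapping in its random-number function and the matching c.d.f.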

The reverse of the above process starts with an observation U from
uniform(0,1) and transforms it by the inverse of the c.d.f. of the
particular distribution (call it Finv). Then X = Finv(U) is an
observation from the particular distribution. This is what Fernando
suggested. Of course, he did not assume that, when compressed onto
the interval [0,1], mpg would have a uniform distribution. The idea
is that a departure from uniformity will show up as a departure from
normality after transforming the uniformized data by invnormal().

A small problem may arise at the ends of the interval, though:
theoretically, invnormal(0) = minus infinity and invnormal(1) = infinity.
People often make "probability plots" and handle that problem by using
"plotting positions" that do not go quite as low as 0 or as high as 1.

In making a probability plot (or "quantile-quantile plot") for a
sample of n observations vs. the uniform distribution, I would do the
following:

1. Sort the observations from smallest to largest, index them with
   i = 1 through i = n, and denote them by x(1), ..., x(n).
2. Calculate the corresponding plotting positions from the formula
   pp(i) = (i - (1/3))/(n + (1/3)).
3. Make a scatterplot of the points (pp(i), x(i)).
4. Assess departures from uniformity by comparing the pattern in that
   plot against a straight line.
5. To get a feel for how such plots look when the data are actually
   uniform, simulate a number of samples of n from the uniform(0,1)
   distribution and make that plot for each sample.
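
The five steps might be sketched as follows, assuming a do-file and simulated uniform(0,1) data standing in for a real variable. The names x, pp, sample0, and sim1 through sim4, the seed, and n = 50 are all illustrative. The line y = x is the reference only because the stand-in data are uniform(0,1).

```stata
* Steps 1-4 for one sample
clear
set seed 42                               // arbitrary seed
set obs 50                                // n = 50, arbitrary
generate x = runiform()                   // stand-in data; substitute your own variable
sort x                                    // step 1: x(1) <= ... <= x(n)
generate pp = (_n - 1/3) / (_N + 1/3)     // step 2: plotting positions
twoway (scatter x pp) (function y = x, range(0 1)), ///
    xtitle("Plotting position") ytitle("Ordered observation") ///
    legend(off) name(sample0, replace)    // steps 3-4: points vs. a straight line

* Step 5: the same plot for a few fresh uniform(0,1) samples
forvalues s = 1/4 {
    quietly replace x = runiform()
    sort x
    quietly replace pp = (_n - 1/3) / (_N + 1/3)   // recompute after re-sorting
    twoway (scatter x pp) (function y = x, range(0 1)), ///
        legend(off) title("Simulated sample `s'") name(sim`s', replace)
}
graph combine sim1 sim2 sim3 sim4
```

Comparing the first plot with the four simulated ones gives a rough sense of how much wobble around the line is ordinary for a sample of this size.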

(Quantile-quantile plots for non-uniform distributions use the same
approach, but with Finv(pp(i)) as the horizontal coordinate of the plot.)
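
For the normal case, a sketch using mpg from the auto dataset that ships with Stata, assuming a do-file. The variable names pp and z are illustrative, and the fitted line from lfit is only a visual guide, not a formal test.

```stata
* Sketch: normal quantile-quantile plot with the same plotting positions
sysuse auto, clear                        // example dataset shipped with Stata
sort mpg                                  // ordered observations
generate pp = (_n - 1/3) / (_N + 1/3)     // strictly between 0 and 1
generate z = invnormal(pp)                // Finv(pp(i)) for the standard normal
twoway (scatter mpg z) (lfit mpg z), ///
    xtitle("Standard normal quantile") ytitle("Ordered mpg") legend(off)
```

Stata's built-in qnorm command draws a plot of the same kind, although its plotting positions may not match the formula used here.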
David Hoaglin
On Sat, Nov 9, 2013 at 7:58 AM, Nikos Kakouros <[email protected]> wrote:
> Fernando,
>
> That seems to work pretty well (did a run below).
> I'm not entirely sure why it should work though.
>
> Is it because the normal distribution in this case works as an
> approximation to the binomial distribution?
>
> Nikos
>
>
>
> set obs 50000
> gen test=runiform()
> sort test
> histogram test
> gen n_test=invnormal(test)
> histogram n_test, normal
> swilk n_test