Re: st: uniform distribution

From	David Hoaglin <[email protected]>
To	[email protected]
Subject	Re: st: uniform distribution
Date	Sat, 9 Nov 2013 08:43:23 -0500

Nikos,

No approximation to the binomial distribution is involved.

The approach uses a basic property of (continuous) probability
distributions.  If X is an observation from a distribution whose
cumulative distribution function (c.d.f.) is F, then U = F(X) has a
uniform(0,1) distribution.  That is, I am transforming X by using the
c.d.f. of its own distribution.  This holds for any continuous
distribution, not just the normal distribution.
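A minimal Stata sketch of this property, assuming a standard normal X (the seed, sample size, and variable names are arbitrary choices for illustration, not part of the original thread):

```stata
clear
set seed 2013                    // arbitrary seed, only for reproducibility
set obs 10000                    // arbitrary sample size

gen double x = rnormal()         // X drawn from the standard normal
gen double u = normal(x)         // U = F(X), with F the standard normal c.d.f.

summarize u                      // a uniform(0,1) variable has mean 1/2 and sd about 0.289
histogram u, width(0.05) start(0)   // bars should be roughly level
```

Replacing rnormal() and normal() with the generator and c.d.f. of another continuous distribution should give the same flat histogram.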

The reverse of the above process starts with an observation U from
uniform(0,1) and transforms it by the inverse of the c.d.f. of the
particular distribution (call it Finv).  Then X = Finv(U) is an
observation from the particular distribution.  This is what Fernando
suggested.  Of course, he did not assume that, when compressed onto
the interval [0,1], mpg would have a uniform distribution.  The idea
is that a departure from uniformity will show up as a departure from
normality after transforming the uniformized data by invnormal().  A
little problem may arise at the ends of the interval, though:
theoretically, invnormal(0) = minus infinity and invnormal(1) = plus
infinity.
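A rough sketch of both directions in Stata.  It assumes that "compressed onto the interval [0,1]" means min-max rescaling of mpg in the auto dataset; that reading, and the variable names u01, z, and x_sim, are assumptions for illustration and may not match what Fernando actually ran.

```stata
sysuse auto, clear

* Compress mpg onto [0,1] by min-max rescaling (an assumed reading)
summarize mpg, meanonly
gen double u01 = (mpg - r(min)) / (r(max) - r(min))

* Transform by the inverse of the standard normal c.d.f.
gen double z = invnormal(u01)

* The smallest and largest mpg map to exactly 0 and 1, where the
* inverse c.d.f. is not finite; Stata is expected to return missing there
count if missing(z)
list mpg u01 z if missing(z)

* The reverse process on its own: X = Finv(U) with U from uniform(0,1)
set seed 2013                      // arbitrary seed
gen double x_sim = invnormal(runiform())
```

The observations listed as missing are the ones that plotting positions strictly inside (0,1) are designed to avoid.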

People often make "probability plots" and handle that problem by using
"plotting positions" that do not go quite as low as 0 or as high as 1.
In making a probability plot (or "quantile-quantile plot") for a
sample of n observations vs. the uniform distribution, I would do the
following:
1. Sort the observations from smallest to largest, index those with i
= 1 through i = n, and denote them by x(1), ..., x(n).
2. Calculate the corresponding plotting positions from the formula
pp(i) = (i - (1/3))/(n + (1/3)).
3. Make a scatterplot of the points (pp(i), x(i)).
4. Assess departures from uniformity by comparing the pattern in that
plot against a straight line.
5. To get a feel for how such plots look when the data are actually
uniform, simulate a number of samples of n from the uniform(0,1)
distribution and make that plot for each sample.
(Quantile-quantile plots for non-uniform distributions use the same
approach.  They use Finv(pp(i)) in place of pp(i) as the horizontal
coordinate of the plot.)
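The steps above can be sketched in Stata roughly as follows.  The sample size of 50, the seed, the four replications, and the graph names sim1 to sim4 are arbitrary choices for illustration; the data are simulated from uniform(0,1), so the loop doubles as step 5.

```stata
set seed 2013                                  // arbitrary seed

forvalues s = 1/4 {
    clear
    set obs 50                                 // n = 50, an arbitrary choice
    gen double x = runiform()                  // a sample that really is uniform(0,1)

    sort x                                     // step 1: x(1) <= ... <= x(n)
    gen double pp = (_n - 1/3) / (_N + 1/3)    // step 2: plotting positions

    * steps 3 and 4: points (pp(i), x(i)) against the line y = x
    twoway (scatter x pp) (function y = x, range(0 1)), ///
        legend(off) name(sim`s', replace)
}

graph combine sim1 sim2 sim3 sim4              // step 5: compare several samples
```

For real data not already on [0,1], the reference line would no longer be y = x, but departures from a straight line are still what to look for.  For a normal reference distribution, the horizontal coordinate would instead be invnormal(pp), which stays finite because pp never reaches 0 or 1.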

David Hoaglin

On Sat, Nov 9, 2013 at 7:58 AM, Nikos Kakouros <[email protected]> wrote:
> Fernando,
>
> That seems to work pretty well (did a run below).
> I'm not entirely sure why it should work though.
>
> Is it because the normal distribution in this case works as an
> approximation to the binomial distribution?
>
> Nikos
>
>
>
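> * 50,000 draws from uniform(0,1)
> set obs 50000
> gen test=runiform()
> sort test
> histogram test
> * transform by the inverse of the standard normal c.d.f.
> gen n_test=invnormal(test)
> * compare with the normal density, then test normality
> histogram n_test, normal
> swilk n_test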