Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


From: Nick Cox <[email protected]>
To: "[email protected]" <[email protected]>
Subject: Re: st: uniform distribution
Date: Sat, 9 Nov 2013 13:31:14 +0000

Let's take this more slowly. It looks like a case of answering a poster's question when the real problem is otherwise.

1. I would be interested to learn of examples to the contrary, but the hypothesis of a uniform distribution (unqualified) does not seem to arise naturally. In contrast, the hypothesis that a variable is uniform on some interval [a, b] does arise, and in that case a, b are known constants that follow from the nature of the variable.

2. Panos wants to scale values by (value - min) / (max - min) to [0,1], which amounts to arguing that the uniform being tested for has known extremes, namely the sample extremes. That needs a story.

3. Panos wants to plug the scaled values into -invnormal()-. However, -invnormal(0)- and -invnormal(1)- are indeterminate, and the scaling in (2) maps the sample minimum and maximum to exactly 0 and 1. Usually when people plug probabilities into -invnormal()- they ensure that the arguments belong to (0,1), e.g. by using a recipe such as (rank - 0.5) / sample size.

4. Panos's examples are time series:

    MONTH  MS_COHO  UK_MS
    Apri   396      62986
    Aug    330      67503
    Dec    342      65218
    Feb    348      59491.83
    Jan    379      65502.33
    Jul    377      68214.5
    Jun    368      65511.33
    Mar    419      65112.17
    May    423      66152.34
    Nov    328      65107.67
    Oct    347      68344.16
    Sep    356      67597.34

What these variables are is not made clear, but my guess is that the problem is not about testing uniformity of distribution at all, but about testing for seasonality, which is a quite different problem. Ignoring the serial order is pointless in that case; it is a vital part of the information.

5. Regardless of whether that guess about the real problem is correct, Panos can't assume _independence_ of observations willy-nilly; that is an assumption that has to be justified. Whatever the answer to (4), a P-value from e.g. Shapiro-Wilk can't be taken very seriously here because of the fudges involved in translating the original problem to a quite different one.
Nick
[email protected]

On 9 November 2013 12:58, Nikos Kakouros <[email protected]> wrote:
> Fernando,
>
> That seems to work pretty well (did a run below).
> I'm not entirely sure why it should work though.
>
> Is it because the normal distribution in this case works as an
> approximation to the binomial distribution?
>
> Nikos
>
> set obs 50000
> gen test=runiform()
> sort test
> histogram test
> gen n_test=invnormal(test)
> histogram n_test, normal
> swilk n_test
>
> On Fri, Nov 8, 2013 at 3:58 PM, Fernando Rios Avila <[email protected]> wrote:
>> What about standardizing the variable toward an index from 0 to 1.
>> say:
>> sum mpg
>> gen mpg_s=(mpg-r(min))/(r(max)-r(min))
>> Transform it into a normal
>> gen n_mpg_s=invnormal(mpg_s)
>> and then make a normality test of this variable
>> sktest n_mpg_s
>> HTH
>> Fernando
>>
>> On Fri, Nov 8, 2013 at 3:53 PM, Nick Cox <[email protected]> wrote:
>>> -egen, count()- on a variable just puts a constant in a variable,
>>> namely the sum of non-missing values, which is useless for your
>>> purpose.
>>>
>>> The best test of uniformity is graphical: -quantile- by accident if
>>> not design yields the appropriate graph. Otherwise think of
>>> chi-square, Kolmogorov-Smirnov, etc.
>>>
>>> For "STATA" read "Stata".
>>>
>>> Nick
>>> [email protected]
>>>
>>> On 8 November 2013 18:09, PAPANIKOLAOU P. <[email protected]> wrote:
>>>
>>>> I am a fairly new user to STATA. I have got to check whether each of
>>>> these two variables (column 2: MS_COHO; column 3: UK_MS) follow the
>>>> uniform distribution.
>>>> For each for them, I used the following code, properly adjusted:
>>>>
>>>> egen n = count (mpg) // use MS_COHO and UK_MS each time ... drop n i
>>>> surprisingly, the results were identical in both attempts, though the
>>>> script was applied to two different variables.
>>>>
>>>> MONTH  MS_COHO  UK_MS
>>>> Apri   396      62986
>>>> Aug    330      67503
>>>> Dec    342      65218
>>>> Feb    348      59491.83
>>>> Jan    379      65502.33
>>>> Jul    377      68214.5
>>>> Jun    368      65511.33
>>>> Mar    419      65112.17
>>>> May    423      66152.34
>>>> Nov    328      65107.67
>>>> Oct    347      68344.16
>>>> Sep    356      67597.34
>>> *
>>> * For searches and help try:
>>> * http://www.stata.com/help.cgi?search
>>> * http://www.stata.com/support/faqs/resources/statalist-faq/
>>> * http://www.ats.ucla.edu/stat/stata/
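
Points 1 to 3 of the reply at the top of this message can be illustrated with a minimal sketch on simulated data. Everything specific here is an assumption for illustration only and does not come from the thread: the bounds a = 10 and b = 15, the seed, the sample size, and the variable names x, s, rank, and p. It shows a one-sample Kolmogorov-Smirnov test against a uniform with known bounds, and then compares min-max scaling with the (rank - 0.5) / sample size recipe as arguments to -invnormal()-.

```stata
* Sketch only: simulated data, hypothetical known bounds a = 10 and b = 15
clear
set seed 2013
set obs 200
generate x = 10 + 5 * runiform()

* Point 1: with known bounds, compare x with the uniform CDF on [a, b]
scalar a = 10
scalar b = 15
ksmirnov x = (x - scalar(a)) / (scalar(b) - scalar(a))

* Graphical check: -quantile- plots ordered values against fraction of the data
quantile x

* Points 2 and 3: min-max scaling maps the sample extremes to exactly 0 and 1
summarize x, meanonly
generate s = (x - r(min)) / (r(max) - r(min))
count if missing(invnormal(s))

* Plotting positions (rank - 0.5) / n lie strictly inside (0, 1)
* (_N equals the number of nonmissing values here because x has no missings)
egen rank = rank(x)
generate p = (rank - 0.5) / _N
count if missing(invnormal(p))
```

The two -count- commands report how many observations each recipe loses when passed to -invnormal()-. None of this addresses points 4 and 5: if the real question is seasonality, or if the observations are not independent, these tests do not answer it.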

**Follow-Ups**:
- **Re: st: uniform distribution** *From:* Nikos Kakouros <[email protected]>

**References**:
- **st: uniform distribution** *From:* "PAPANIKOLAOU P." <[email protected]>
- **Re: st: uniform distribution** *From:* Nick Cox <[email protected]>
- **Re: st: uniform distribution** *From:* Fernando Rios Avila <[email protected]>
- **Re: st: uniform distribution** *From:* Nikos Kakouros <[email protected]>
