



Re: st: Plotting with cumuli and doing Kolmogorov-Smirnov


From   Nick Cox <njcoxstata@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: Plotting with cumuli and doing Kolmogorov-Smirnov
Date   Tue, 14 Aug 2012 10:04:40 +0100

-cumul- (not "cumuli") is an official command. -distplot- is a
user-written command from the Stata Journal; the Statalist FAQ asks
you to explain where such commands come from.

Morrison is correct. There is no -equal- option in -distplot- and it
does not plot _exactly_ what -cumul, equal- calculates. But use the
option -c(J J J)- (for three variables) and you will have to work
really hard to spot the difference.

Let's think about the general problem of plotting a cumulative
distribution function. If all values are distinct and the number of
values n is reasonably large, then the cumulative probabilities run
1/n (1/n) 1, that is, from 1/n to 1 in steps of 1/n. The smallest,
1/n, is almost 0 and it's immaterial in practice exactly how you plot
those probabilities. But differences between plotting conventions
become more obvious with lots of ties or a very small sample size.

If you plot the cumulative as -cumul, equal- calculates it, the lowest
cumulative probability plotted is k/n where k is the frequency of the
lowest value. This is perfectly logical but in practice leads to
questions from learners or naive users on why the cumulative doesn't
start at 0.
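
To check that for yourself, here is a minimal sketch. It assumes the
auto dataset shipped with Stata; the variable name cdf_mpg and the
local macro names are just illustrative.

```stata
* Sketch only: auto.dta ships with Stata; cdf_mpg is an illustrative name
sysuse auto, clear

* -equal- gives tied values the same cumulative probability
cumul mpg, generate(cdf_mpg) equal

* n = number of nonmissing values, k = frequency of the lowest value
quietly summarize mpg
local n = r(N)
local lowest = r(min)
quietly count if mpg == `lowest'
local k = r(N)

* the smallest cumulative probability should match k/n
* (up to float precision), not 1/n
quietly summarize cdf_mpg
display "lowest cumulative = " r(min) "   k/n = " `k'/`n'
```

If the lowest value occurs more than once, the two displayed numbers
should agree with each other and exceed 1/n.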

All this is immaterial to -ksmirnov-, which does its own calculations.
As Maarten hints, quite what you do for inference with three
distributions is arguable. If there is a variant of Kolmogorov-Smirnov
for three distributions, then it is not supported by -ksmirnov-.
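
One possible workaround, offered only as a sketch and not as a
three-sample test, is to run -ksmirnov- on each pair of groups in
turn. The auto dataset and the use of rep78 values 3, 4 and 5 as the
three groups are assumptions for illustration.

```stata
* Sketch only: auto.dta and the three-level grouping are illustrative
sysuse auto, clear

* -ksmirnov- with by() needs exactly two groups, so compare pairs in turn
* rep78 values 3, 4 and 5 stand in for three distributions
ksmirnov mpg if inlist(rep78, 3, 4), by(rep78)
ksmirnov mpg if inlist(rep78, 3, 5), by(rep78)
ksmirnov mpg if inlist(rep78, 4, 5), by(rep78)
```

Three pairwise tests raise the usual multiple-comparisons question, so
the P-values need cautious interpretation.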

Nick

On Tue, Aug 14, 2012 at 8:25 AM, Maarten Buis <maartenlbuis@gmail.com> wrote:
> On Tue, Aug 14, 2012 at 4:44 AM, Morrison Hodges wrote:
>> I can plot with 'distplot', but this apparently does not allow the 'equal' option.
>
> What do you want an -equal- option to do?
>
>> But I can't figure out how to use the K-S test, either before or after the plotting.
>
> You can compare two groups with -ksmirnov-. You can also look at -sts
> test- for related approaches with more than 2 groups.
>