In error I posted my reply directly to Stace yesterday rather than the
list, and am now posting it to the list.

Apologies,

Neil

> On Sun, 24 Jul 2005 17:58:21 -0700 (PDT) Stace Maples
> <[email protected]> wrote...
>
> > I have a question about interpretation of the ksmirnov
> > statistic in STATA...
> > I have a dataset that records a variable called
> > VISPROM for an entire population (approx 190,000
> > observations), and of that population, I have several
> > samples such as paleo (n=43), archaic (n=1026),
> > anasazi (n=5412). I have created dummy variables for
> > sorting, giving the sample the value of 0 and the
> > population the value 1, so that when I sort on the
> > dummy variable, my sample is the first value of, say,
> > the dummy variable paleo, and my population is the
> > second value of the dummy variable. Using the
> > ksmirnov test, to compare the two samples as follows:
> >
> > ksmirnov visprom, by(paleo)
> >
> > I get the following result:
> >
> > . sort paleo
> >
> > . ksmirnov visprom, by(paleo)
> >
> > Two-sample Kolmogorov-Smirnov test for equality of
> > distribution functions:
> >
> >  Smaller group       D       P-value  Corrected
> >  ----------------------------------------------
> >  0:                  0.0000    1.000
> >  1:                 -0.1509    0.177
> >  Combined K-S:       0.1509    0.353      0.285
> >
> > Now, according to my text (Stats in Geography, Ebdon),
> > my critical value for D with DoF of 43, is .21 at a
> > significance level of .05. Further, the book says
> > that a KS D statistic that is GREATER than the
> > critical value indicates that the Null Hyp that the
> > distributions are equal can be rejected, and that
> > there is evidence of a non-random pattern.
> >
> > I interpret the above STATA output as follows:
> > The D statistic for paleo compared with the
> > population is .1509, which is less than the critical
> > value of .21 for 43 degrees of freedom.
> > Therefore, I cannot reject the null hypothesis that the
> > sample and population distributions are equal at a
> > significance level of .05.
>
> > Does that sound right?
>
> Your interpretation seems fine to me, but you don't need to refer to
> statistical tables to determine the critical value, as Stata calculates
> the p-value for the test you are performing (and will, if desired,
> calculate the exact p-value).
>
> > What about the p-value of 1 for H0?
>
> I think there is some confusion: the 0 and 1 are under the column
> "Smaller group", and these are one-sided tests. The 0 row tests whether
> group 0 (the group that comes first when dividing your data by the
> variable paleo) contains smaller values than the second group. The 1
> row tests whether the second group contains smaller values.
>
> > What is the significance of the Corrected values (and
> > what is the Combined)?
>
> Combined is the two-sided test and asks "Is there a difference between
> these two distributions?" without any regard for which group is the
> smaller/larger.
>
> Details of the corrected p-value are given in the manual; it is
> "...obtained by modifying the asymptotic p-value using a numerical
> approximation technique."
>
> [Formulae omitted]
>
> > I sure wish there was an annotated output for ksmirnov
> > on the stata site.
>
> The annotated output can be found in the manuals, [R] ksmirnov
> (pp230-233) (which also includes a couple of short biogs on Kolmogorov
> and Smirnov). The manuals are an invaluable resource and very often
> contain annotated examples. There have been discussions on the list in
> the past about making the printed manuals available in electronic
> format, but for the various reasons discussed in these postings they
> are not currently available in this format (search the archives if
> interested).
>
> I found the following invaluable when I first came across this group
> of tests...
>
> Conover WJ (1999) Practical Nonparametric Statistics. John Wiley & Sons.
>
> It's a great book and everything I've read in it is explained with
> exceptional clarity.
>
> > Here is another output that I REALLY don't know what
> > to do with...
> >
> > . sort archaic
> >
> > . ksmirnov visprom, by(archaic)
> >
> > Two-sample Kolmogorov-Smirnov test for equality of
> > distribution functions:
> >
> >  Smaller group       D       P-value  Corrected
> >  ----------------------------------------------
> >  0:                  0.0631    0.000
> >  1:                 -0.0645    0.000
> >  Combined K-S:       0.0645    0.000      0.000
>
> Hopefully the interpretation of this is now clearer.
>
> HTH,
>
> Neil
>
> P.S. - It's Stata not STATA (see
> http://www.stata.com/support/faqs/res/statalist.html#spell)

Neil Shephard
Genetics Statistician
ARC Epidemiology Unit, University of Manchester
[email protected]
[email protected]

"If your result needs a statistician then you should design a better
experiment" - Ernest Rutherford

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
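
As a rough cross-check on the textbook critical value quoted in the question, the usual large-sample approximation for the two-sample Kolmogorov-Smirnov test can be worked by hand. This is only a sketch under assumptions not stated in the thread: it takes the group sizes as given in the question (43, 1026 and roughly 190,000), whereas Stata works from the actual non-missing counts, and the symbols n_1, n_2, c(alpha) and D_alpha are introduced here purely for illustration.

```latex
% Large-sample two-sided critical value (assumed approximation)
\[
D_{\alpha} \approx c(\alpha)\sqrt{\frac{n_1 + n_2}{n_1 n_2}},
\qquad
c(\alpha) = \sqrt{-\tfrac{1}{2}\ln\!\left(\tfrac{\alpha}{2}\right)},
\qquad
c(0.05) \approx 1.358
\]

% paleo: n_1 = 43, n_2 \approx 190000 (sizes as quoted in the question)
\[
D_{0.05} \approx 1.358\sqrt{\tfrac{1}{43} + \tfrac{1}{190000}}
\approx 1.358 \times 0.1525 \approx 0.207
\]

% archaic: n_1 = 1026, n_2 \approx 190000
\[
D_{0.05} \approx 1.358\sqrt{\tfrac{1}{1026} + \tfrac{1}{190000}}
\approx 1.358 \times 0.0313 \approx 0.0425
\]
```

On these figures, D = 0.1509 falls below about 0.207 for paleo (do not reject), while D = 0.0645 exceeds about 0.0425 for archaic (reject). Both conclusions agree with the combined p-values of 0.353 and 0.000 in the two outputs, and the paleo value rounds to the .21 quoted from the textbook.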

