The Stata listserver

(Fwd) Re: st: ksmirnov output in STATA question...


From   "Neil Shephard" <[email protected]>
To   [email protected]
Subject   (Fwd) Re: st: ksmirnov output in STATA question...
Date   Tue, 26 Jul 2005 10:30:24 +0100

In error I posted my reply directly to Stace yesterday rather than the list and am now 
posting it to the list. Apologies, 

Neil

> On Sun, 24 Jul 2005 17:58:21 -0700 (PDT) Stace
> Maples <[email protected]> wrote...
> 
> > I have a question about interpretation of the ksmirnov
> > statistic in STATA...
> > I have a dataset that records a variable called
> > VISPROM for an entire population (approx 190,000
> > observations), and of that population, I have several
> > samples such as paleo (n=43), archaic (n=1026),
> > anasazi (n=5412).  I have created dummy variables for
> > sorting, giving the sample the value of 0 and the
> > population the value 1, so that when I sort on the
> > dummy variable, my sample is the first value of, say,
> > the dummy variable paleo, and my population is the
> > second value of the dummy variable.  Using the
> > ksmirnov test to compare the two samples as follows:
> > 
> > ksmirnov visprom, by(paleo)
> > 
> > I get the following result:
> > 
> > . sort paleo
> > 
> > . ksmirnov visprom, by(paleo)
> > 
> > Two-sample Kolmogorov-Smirnov test for equality of
> > distribution functions:
> > 
> >  Smaller group       D       P-value  Corrected
> >  ----------------------------------------------
> >  0:                  0.0000    1.000
> >  1:                 -0.1509    0.177
> >  Combined K-S:       0.1509    0.353      0.285
> > 
> > 
> > Now, according to my text (Stats in Geography, Ebdon),
> > my critical value for D with DoF of 43, is .21 at a
> > significance level of .05.  Further, the book says
> > that a KS D statistic that is GREATER than the
> > critical value indicates that the Null Hyp that the
> > distributions are equal can be rejected, and that
> > there is evidence of a non-random pattern.
> > 
> > I interpret the above STATA output as follows:
> > The D statistic for paleo compared with the
> > population is .1509, which is less than the critical
> > value of .21 for 43 degrees of freedom.  Therefore, I
> > cannot reject the null hypothesis that the sample and
> > population distributions are equal at a significance
> > level of .05.
> 
> > Does that sound right?
> 
> Your interpretation seems fine to me, but you don't need to refer to
> statistical tables to determine the critical value, as Stata calculates
> the p-value for the test you are performing (and will, if desired,
> calculate the exact p-value).
> 
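
As a cross-check on the textbook value of .21, the usual large-sample 5% critical value for a two-sample K-S test can be computed directly in Stata. This is a minimal sketch, assuming the standard asymptotic approximation; the sample sizes (43 and roughly 190,000) come from the post, and the scalar names are made up for illustration.

```stata
* Large-sample critical value for a two-sample K-S test at level alpha:
*   sqrt(-ln(alpha/2)/2) * sqrt((n + m) / (n * m))
* n_paleo and n_pop are taken from the post (n_pop is approximate).
scalar n_paleo = 43
scalar n_pop   = 190000
scalar c_alpha = sqrt(-ln(0.05/2)/2)      // about 1.358 for alpha = 0.05
scalar d_crit  = c_alpha * sqrt((n_paleo + n_pop) / (n_paleo * n_pop))

display "5% critical value for D = " %6.4f d_crit

* 1 = observed D exceeds the critical value, 0 = it does not
display "Observed D of 0.1509 exceeds it: " (0.1509 > d_crit)
```

The result is close to the tabulated .21, so the table lookup and the Combined K-S p-value of 0.353 point to the same conclusion.
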
> > What about the p-value of 1 for H0?
> 
> I think there is some confusion here. The 0 and 1 are values listed
> under the column "Smaller group"; they do not refer to H0 and H1. Each
> of these rows is a one-sided test. The 0 row tests whether group 0 (the
> group that comes first when dividing your data by the variable paleo)
> contains smaller values than the second group. The 1 row tests whether
> the second group contains the smaller values.
> 
> > What is the significance of the Corrected values (and
> > what is the Combined)?
> > 
> 
> Combined is the two-sided test and asks "Is there a difference between
> these two distributions?" without any regard for which group is the
> smaller/larger.
> 
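
The three rows of the table can also be read back from the results that ksmirnov leaves behind. The sketch below uses simulated stand-in data (the variables x and grp are hypothetical, not the VISPROM data) and assumes a Stata version that provides rnormal().

```stata
* Simulated stand-in data: 43 "sample" rows coded 0 and 5,000
* "population" rows coded 1, mirroring the dummy coding in the thread.
clear
set seed 2005
set obs 5043
generate byte grp = (_n > 43)
generate double x = rnormal()

ksmirnov x, by(grp)

* One-sided rows first, then the combined two-sided test
display "Row 0:    D = " %7.4f r(D_1) "   p = " %5.3f r(p_1)
display "Row 1:    D = " %7.4f r(D_2) "   p = " %5.3f r(p_2)
display "Combined: D = " %7.4f r(D)   "   p = " %5.3f r(p)
```

The combined D is the larger of the two one-sided D values in absolute terms, which matches the paleo output quoted earlier (0.1509 against 0.0000 and -0.1509).
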
> Details of the corrected p-value are given in the manual; it is
> "...obtained by modifying the asymptotic p-value using a numerical
> approximation technique."
> 
> [ Formulae omitted]
> 
> > I sure wish there was an annotated output for ksmirnov
> > on the stata site.
> 
> The annotated output can be found in the manuals, [R] ksmirnov
> (pp. 230-233), which also includes a couple of short biographies of
> Kolmogorov and Smirnov.  The manuals are an invaluable resource and very
> often contain annotated examples.  There have been discussions on the
> list in the past about making the printed manuals available in
> electronic format, but for the various reasons discussed in those
> postings they are not currently available in this format (search the
> archives if interested).
> 
> I found the following invaluable when I first came across this group of
> tests...
> 
> Conover WJ (1999) Practical Nonparametric Statistics.  John Wiley & Sons.
> 
> It's a great book and everything I've read in it is explained with
> exceptional clarity.
> 
> > 
> > Here is another output that I REALLY don't know what
> > to do with...
> > 
> > 
> > . sort archaic
> > 
> > . ksmirnov visprom, by(archaic)
> > 
> > Two-sample Kolmogorov-Smirnov test for equality of
> > distribution functions:
> > 
> >  Smaller group       D       P-value  Corrected
> >  ----------------------------------------------
> >  0:                  0.0631    0.000
> >  1:                 -0.0645    0.000
> >  Combined K-S:       0.0645    0.000      0.000
> > 
> 
> Hopefully the interpretation of this is now clearer.
> 
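
One way to see why a D as small as 0.0645 gives a p-value of 0.000 for archaic is that the large-sample critical value shrinks as the sample grows. The loop below is a rough sketch under the same asymptotic approximation (1.358 is the 5% constant); the sample sizes are those quoted in the original question, and the population size of 190,000 is approximate.

```stata
* Asymptotic 5% critical value for each sample size mentioned in the
* thread, each compared with a population of about 190,000 observations.
local m = 190000
foreach n of numlist 43 1026 5412 {
    local crit = 1.358 * sqrt((`n' + `m') / (`n' * `m'))
    display "n = `n': reject at the 5% level if D > " %6.4f `crit'
}
```

For archaic (n=1026) the threshold comes out at about 0.043, so the combined D of 0.0645 exceeds it and the hypothesis of equal distributions is rejected.
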
> HTH's
> 
> Neil
> 
> P.S. - It's Stata, not STATA (see
> http://www.stata.com/support/faqs/res/statalist.html#spell)

Neil Shephard
Genetics Statistician
ARC Epidemiology Unit, University of Manchester
[email protected]
[email protected]

"If your result needs a statistician then you should design a better experiment" - 
Ernest Rutherford
