



Re: st: Ksmirnov one-sided test interpretation


From   Nick Cox <njcoxstata@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: Ksmirnov one-sided test interpretation
Date   Fri, 1 Mar 2013 09:49:08 +0000

As a testing problem this seems closer to Mann-Whitney-Wilcoxon. Even
better to recast it as a problem for -somersd- (Roger Newson, SSC
etc.)
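
A minimal sketch of what that could look like for a two-group comparison. The data are simulated and the variable names g and x are hypothetical; -somersd- is user-written and has to be installed from SSC first. This is only an illustration of the two commands, not a recipe for the one-sample problem described below.

```stata
* Sketch only: simulated data, hypothetical variable names g and x
clear
set obs 200
set seed 1234
gen byte g = runiform() > .5        // two arbitrary groups, coded 0 and 1
gen x = rnormal(10, 5) + 2*g        // group 1 is shifted upward by 2

* Mann-Whitney-Wilcoxon rank-sum test; porder adds an estimate of
* P(x in first group > x in second group)
ranksum x, by(g) porder

* Somers' D of x with respect to g, with a confidence interval
* (install once with: ssc install somersd)
somersd g x
```

The appeal of the -somersd- route is that it reports an effect size with a confidence interval rather than only a P-value.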

Nick

On Fri, Mar 1, 2013 at 9:30 AM, Tsankova, Teodora <TsankovT@ebrd.com> wrote:
> Thank you Joerg, for your comment. I am using the test not as an
> equality of distributions check but as a one-sided (inequality) check.
>
> In my case I want to check whether a parameter is higher than a random
> uniform distribution would suggest. So, I basically need to prove that
> its values are higher than if they were chosen at random in the range
> observed. I am not using a simple ttest because I would like to prove
> that not only the mean is higher but that also all the values tend to be
> higher than the uniform distribution. Also, it is difficult to deduce
> this information from the CDF graphs as I have a limited number of
> observations which are sometimes above and sometimes below the 45 degree
> line which would represent the random uniform distribution.
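
For reference, -ksmirnov- also has a one-sample form, ksmirnov varname = exp, where exp is the hypothesized CDF evaluated at each observation. The sketch below is an assumption about how the comparison described above might be set up: x is simulated here, and the uniform bounds are taken from the observed minimum and maximum. Because the bounds are estimated from the same sample, the reported P-values should be treated as approximate at best.

```stata
* Sketch only: x is simulated; in practice it would be the observed parameter
clear
set obs 50
set seed 1234
gen x = rbeta(2, 1)                 // values that tend to be high within (0,1)

* Bounds of the reference uniform distribution (assumed: observed range)
summarize x, meanonly
scalar a = r(min)
scalar b = r(max)

* One-sample K-S test against the uniform CDF F(x) = (x - a)/(b - a)
ksmirnov x = (x - scalar(a))/(scalar(b) - scalar(a))
```

As I read the manual, the output line labelled with the variable name tests whether x contains smaller values than the reference distribution, and the line labelled Cumulative tests the opposite direction, so values that tend to be higher than uniform should show up on the Cumulative line.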
>
> That being said, most interpretations of the KS test are for a
> two-sided test and this is why I have trouble drawing conclusions.
>
> Thank you again,
>
> Teodora
>
> -----Original Message-----
> From: owner-statalist@hsphsun2.harvard.edu
> [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Joerg
> Luedicke
> Sent: 28 February 2013 18:38
> To: statalist@hsphsun2.harvard.edu
> Subject: Re: st: Ksmirnov one-sided test interpretation
>
> Yes, why not just look at your data?
>
> That aside, I am wondering what the point of such a test is? What does
> it even mean that one distribution is "lower" than another? Or to quote
> the Stata manual, version 11: "We wish to use the two-sample
> Kolmogorov-Smirnov test to determine if there are any differences in the
> distribution of x for these two groups..." "Any" differences seem to
> pick up a mix of differences with regard to the location and shape of
> distributions. What is the motivation behind this? If there are
> differences in two distributions, why not just look at what these
> differences are? But even if there was a good reason for using this
> test, I am wondering what it is telling us. I did not try hard to come
> up with the following example:
>
> Let's generate some data for two groups where the distribution in group
> one is normal with mean 10 and SD 5, while the distribution in the other
> group is a gamma with shape 5 and scale 2:
>
> *---------------
> clear
> set obs 200
> set seed 1234
>
> gen u = runiform()>.5
> gen x = rnormal(10,5) if u==0
> replace x=rgamma(5,2) if u==1
> *---------------
>
> and have a look at the empirical distribution for this data realization:
>
> *---------------
> tw kdensity x if u==0 || kdensity x if u==1
> *---------------
>
> As expected, these distributions surely look different to me. We can
> also have a look at the true functions:
>
> *---------------
> tw      function y = gammaden(5,2,0,x) , range(0 25) || ///
>         function y = normalden(x,10,5) , range(-5 25) ///
>         legend(order(1 "Gamma" 2 "Gauss"))
> *---------------
>
> Yet, if we run the K-S test:
>
> *---------------
> ksmirnov x, by(u) exact
> *---------------
>
> we would conclude that we cannot reject the hypothesis that the
> distributions are the same? That does not sound right to me.
>
> So, my bottom line is: a) that I wonder why one would use this test in
> the first place, and b) even if there was a good reason, I probably
> would not trust it. I may very well be missing something here as I have
> never used or studied this test before, so others, please correct me if
> I am wrong here with something.
>
> Joerg
>
>
>
> On Thu, Feb 28, 2013 at 1:06 PM, Nick Cox <njcoxstata@gmail.com> wrote:
>> Why not plot the data to show what is going on?
>>
>> Nick
>>
>> On Thu, Feb 28, 2013 at 5:23 PM, Tsankova, Teodora <TsankovT@ebrd.com> wrote:
>>
>>> I have a question related to a previous post:
>>>
>>> http://www.stata.com/statalist/archive/2009-01/msg00525.html
>>>
>>> The Stata output from this message is as follows:
>>>
>>> Two-sample Kolmogorov-Smirnov test for equality of distribution functions:
>>>
>>> Smaller group       D       P-value  Corrected
>>> ----------------------------------------------
>>> male:               0.2468    0.002
>>> female:             0.0000    1.000
>>> Combined K-S:       0.2468    0.005      0.003
>>>
>>>
>>> From the one-sided tests (first two lines) one can say which
>>> distribution tends to be lower - for males or for females. However, I am
>>> not sure how to interpret it.
>>>
>>> Given that the p-value from the first line is low and that D in the
>>> second line is 0, can we say that this is proof that the distribution
>>> of male is lower than that of female? To rephrase it - can we claim that
>>> the distribution of male stochastically dominates the one of female,
>>> which would imply that the values of the underlying variable tend to be
>>> larger for male than for female? Or do we interpret it in exactly the
>>> opposite way - that the values for male tend to be lower than the values
>>> for female?
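
One way to check the direction of the one-sided lines is to simulate data where the answer is known. In this sketch (simulated data; g and x are hypothetical names), group 0 is built to contain smaller values than group 1. As I read the manual, each one-sided line names the group hypothesized to contain the smaller values, so the line for group 0 should carry the large D and the small P-value, while the line for group 1 should show a D near zero and a P-value near 1.

```stata
* Sketch only: simulated data with a known ordering of the two groups
clear
set obs 200
set seed 1234
gen byte g = _n > 100               // g==0: first 100 obs; g==1: last 100
gen x = rnormal(0, 1) + g           // group 1 is shifted upward by 1

* Two-sample K-S test; compare the two one-sided lines with the known shift
ksmirnov x, by(g)
```

On that reading, the output quoted above would suggest that values for male tend to be smaller than those for female, not larger.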
>