



[no subject]



I am not sure where the 14 came from.  A result from the theory of
order statistics may be useful.  If we start with a sample of n
observations from the uniform distribution on (0, 1),  x1, x2, ...,
xn, and arrange the observations in nondecreasing order, we obtain the
order statistics of the sample, often denoted by x(1) <= x(2) <= ...
<= x(n).  For the ith order statistic in a sample of n from Uniform(0,
1), the average value is i/(n + 1).  See, for example, David and
Nagaraja (2003).  Thus, you could plot your ordered observations
against those values.  A reasonable alternative plotting value for
x(i) is (i - (1/3))/(n + (1/3)), which (except for i = 1 and i = n) is
a close approximation to the median of the sampling distribution of
x(i).
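
In Stata, the two sets of plotting positions could be computed along
the following lines. This is only a sketch: the variable names x,
pp_mean, and pp_median are made up for illustration, and the 15 values
of x are simulated here purely so that the example runs by itself (you
would use your own variable instead). Run it from a do-file, since it
uses // comments and /// line continuation.

```stata
* Hypothetical data: 15 simulated values stand in for the real sample
clear
set seed 12345
set obs 15
generate x = runiform()

* Order the sample so that observation i holds the ith order statistic x(i)
sort x

* Expected value of x(i) in a sample of n from Uniform(0, 1)
generate pp_mean = _n/(_N + 1)

* Approximate median of the sampling distribution of x(i)
generate pp_median = (_n - 1/3)/(_N + 1/3)

* Ordered observations against i/(n + 1), with the line y = x for reference
twoway (scatter x pp_mean) (function y = x, range(0 1)), ///
    xtitle("i/(n + 1)") ytitle("Ordered observations") legend(off)
```

Points that stay close to the reference line are consistent with
Uniform(0, 1); replacing pp_mean by pp_median in the plot gives the
alternative version.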

I don't recall seeing the values of your 15 observations, but the
output for -ksmirnov- seems to be saying that your sample contains
more small values than one would expect in a sample of 15 from
Uniform(0, 1) AND more large values than one would expect in such a
sample.

If you regard your sample as a population and draw single observations
randomly from it and from Uniform(0, 1), it is straightforward to show
that the probability that a random observation from Uniform(0, 1) is
smaller than a random observation from your "population" is equal to
the mean of your "population" (i.e., sample).  To replace "smaller"
with "larger," simply subtract that mean from 1.  It is not necessary
to generate a new sample and use -ranksum-.  Indeed, that approach
introduces additional variability in the result.
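
To see why, assume every observation lies between 0 and 1. For a
fixed value x(i), the probability that a Uniform(0, 1) draw U is
smaller than x(i) is just x(i). Averaging over the n equally likely
values of the "population" gives

    P(U < X) = (1/n) * [x(1) + x(2) + ... + x(n)] = mean of the sample.

A minimal Stata sketch of the calculation follows. The three data
values and the scalar name p_less are hypothetical, chosen only so
that the example is self-contained; with real data one would simply
-summarize- the observed variable.

```stata
* Hypothetical "population" of three values in (0, 1)
clear
input x
0.1
0.4
0.7
end

* The sample mean is P(uniform draw < draw from the sample)
summarize x, meanonly
scalar p_less = r(mean)

display "P(uniform draw < sample draw) = " %5.3f p_less
display "P(uniform draw > sample draw) = " %5.3f (1 - p_less)
```

Because the result is computed exactly from the observed values, it
carries none of the extra variability that comes from constructing a
second sample.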

David Hoaglin

H.A. David and H.N. Nagaraja (2003). Order Statistics, 3rd ed.
Hoboken, NJ: Wiley.

On Sat, Mar 9, 2013 at 8:49 AM, Tsankova, Teodora <TsankovT@ebrd.com> wrote:
> Dear David,
>
> Thank you for the suggestion.
>
> What I mean is that I create a uniform distribution between 0 and 1 with
> 15 observations. Given that every value should have the same probability
> under a uniform distribution I divide 1 by 14 and create those equally
> spaced 15 values. Plotting the CDF of those values would result in a
> straight diagonal line which is ultimately what the ksmirnov test would
> test against as well.
>
> The output from the ksmirnov test is as follows:
>
> ksmirnov mean_random_BTWGr_Fx=uniform()
>
> One-sample Kolmogorov-Smirnov test against theoretical distribution
>            uniform()
>
>  Smaller group       D       P-value  Corrected
>  ----------------------------------------------
>  mean_ra~r_Fx:       0.8221    0.000
>  Cumulative:        -0.8983    0.000
>  Combined K-S:       0.8983    0.000      0.000
>
> So, it seems that although I can reject the equality of the two
> distributions, I cannot say anything about which one tends to have
> larger values.
>
> In Stata the -porder- option of the ranksum command gives the
> probability that a random draw from the first sample is larger than a
> random draw from the second sample. I like this as it seems very
> intuitive. I use those constructed values to perform this test. My
> results are as follows:
>
> ranksum mean_random_BTWGr_Fx, by( ObservedORUniform) porder
>
> Two-sample Wilcoxon rank-sum (Mann-Whitney) test
>
> ObservedOR~m |      obs    rank sum    expected
> -------------+---------------------------------
>     Observed |       15         259       232.5
>      Uniform |       15         206       232.5
> -------------+---------------------------------
>     combined |       30         465         465
>
> unadjusted variance      581.25
> adjustment for ties        0.00
>                      ----------
> adjusted variance        581.25
>
> Ho: mea~r_Fx(Observ~m==Observed) = mea~r_Fx(Observ~m==Uniform)
>              z =   1.099
>     Prob > |z| =   0.2717
>
> P{mea~r_Fx(Observ~m==Observed) > mea~r_Fx(Observ~m==Uniform)} = 0.618
>
> Those results, although not very strong, seem much easier to interpret.
>
> Thank you again,
>
> Teodora

