



Re: st: curious problem with 2 sample smirnov test


From   Nick Cox <njcoxstata@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: curious problem with 2 sample smirnov test
Date   Tue, 27 Sep 2011 10:16:22 +0100

It may seem slightly odd that -ksmirnov- doesn't just error out with
"no observations".

A defence is simply that a check for two distinct values of the -by()-
variable will catch that problem indirectly, so the check is
redundant.
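
A minimal sketch of how to confirm this before running the test, using the auto data from the example quoted below. It only assumes the standard -count-, -tabulate- and -capture- commands; the message text in the -display- line is illustrative, not Stata's own.

```stata
sysuse auto, clear

* How many observations satisfy the -if- condition?
* rep78 runs from 1 to 5 in auto.dta, and missing counts as larger
* than any number, so nothing satisfies rep78 < 1.
count if rep78 < 1

* Which groups of the -by()- variable survive the condition?
tabulate foreign if rep78 < 1

* Run the test but trap the error rather than stopping a do-file.
capture noisily ksmirnov mpg if rep78 < 1, by(foreign)
if _rc == 450 {
    display as text "by() variable does not take on 2 values in this subsample"
}
```

For the commands in the original question, the analogous first step would be -count if povrat0 >= 0 & povrat0 < 20-, followed by -tabulate majw0- under the same condition.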

On Tue, Sep 27, 2011 at 7:27 AM, Nick Cox <njcoxstata@gmail.com> wrote:
> If I understand you correctly, you have 0 observations that satisfy
> your conditions. So there are no (0) distinct values of your -by()-
> variable. An analogue is
>
> . sysuse auto
> (1978 Automobile Data)
>
> . ksmirnov mpg if rep78 <1, by(foreign)
> foreign takes on 0 values, not 2
> r(450);
>
> The error message may be surprising to you, but it is reasonable.
>
> Nick
>
> On Tue, Sep 27, 2011 at 5:42 AM, Anjanette Chan Tack <amc75@uchicago.edu> wrote:
>
>> I am running a 2 sample smirnov test that tests for equality of distributions on an outcome for white and non-white majority neighborhoods at 5 different poverty levels. The test works smoothly for all but one poverty level, at which point, I get an error message saying: "majw7 takes on 0 values, not 2". When I click for more info, it says:
>>
>> "Return code 450 __________ is not a 0/1 variable; number of successes invalid; p invalid; __________ takes on __________ values, not 2; You have used a command, such as bitest, that requires the variable take on only the values 0, 1, or missing, but the variable you specified does not meet that restriction. (You can also get this message from, for example, bitesti, when you specify a number of successes greater than the number of observations or a probability not between 0 and 1.)"
>>
>> What is curious about this message is that it rejects my group variable "majw7" as a 0/1 variable at this poverty level, although it had no problem with "majw7" at the other poverty levels. If someone can suggest what might be going on and how I can work around this, I'd be grateful. For more info, see below:
>>
>> Example of commands that worked:
>> ksmirnov LgAllSt7 if povrat7 >= 80 & povrat7 <= 100, by(majw7)
>>
>> The command that did not work:
>> ksmirnov LgAllSt0 if povrat0 >= 0 & povrat0 < 20, by(majw0)
>>
>> I am wondering if the problem may be the limited number of cases I have at this poverty level. The crosstabs below show the distribution of neighborhoods (white majority = 1, and non-white majority = 0) across different poverty levels (e.g. 20 means 0 to 20%, etc)
>>
>>
>>    poverty |
>>     rate,5 |
>>    ordinal | White maj neighb = 1
>>     groups |         0          1 |     Total
>> -----------+----------------------+----------
>>         20 |       154        467 |       621
>>         40 |       154         26 |       180
>>         60 |        37          3 |        40
>>         80 |         3          1 |         4
>> -----------+----------------------+----------
>>      Total |       348        497 |       845
>>
>
