Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


From: Nick Cox <njcoxstata@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: curious problem with 2 sample smirnov test
Date: Tue, 27 Sep 2011 10:16:22 +0100

It may seem slightly odd that -ksmirnov- doesn't just error out with "no observations". A defence is simply that a check for two distinct values of the -by()- variable will catch that problem indirectly, so the check is redundant.

On Tue, Sep 27, 2011 at 7:27 AM, Nick Cox <njcoxstata@gmail.com> wrote:
> If I understand you correctly, you have 0 observations that satisfy
> your conditions. So there are no (0) distinct values of your -by()-
> variable. An analogue is
>
> . sysuse auto
> (1978 Automobile Data)
>
> . ksmirnov mpg if rep78 <1, by(foreign)
> foreign takes on 0 values, not 2
> r(450);
>
> The error message may be surprising to you, but it is reasonable.
>
> Nick
>
> On Tue, Sep 27, 2011 at 5:42 AM, Anjanette Chan Tack <amc75@uchicago.edu> wrote:
>
>> I am running a 2 sample smirnov test that tests for equality of distributions on an outcome for white and non-white majority neighborhoods at 5 different poverty levels. The test works smoothly for all but one poverty level, at which point, I get an error message saying: "majw7 takes on 0 values, not 2". When I click for more info, it says:
>>
>> "Return code 450
>> __________ is not a 0/1 variable;
>> number of successes invalid;
>> p invalid;
>> __________ takes on __________ values, not 2;
>> You have used a command, such as bitest, that requires the variable take on only the values 0, 1, or missing, but the variable you specified does not meet that restriction. (You can also get this message from, for example, bitesti, when you specify a number of successes greater than the number of observations or a probability not between 0 and 1.)"
>>
>> What is curious about this message is that it rejects my group variable "majw7" as a 0/1 variable at this poverty level, although it had no problem with "majw7" at the other poverty levels. If someone can suggest what might be going on and how I can work around this, I'd be grateful. For more info, see below:
>>
>> Example of commands that worked:
>> ksmirnov LgAllSt7 if povrat7 >= 80 & povrat7 <= 100, by(majw7)
>>
>> The command that did not work:
>> ksmirnov LgAllSt0 if povrat0 >= 0 & povrat0 < 20, by(majw0)
>>
>> I am wondering if the problem may be the limited number of cases I have at this poverty level. The crosstabs below show the distribution of neighborhoods (white majority = 1, and non-white majority = 0) across different poverty levels (e.g. 20 means 0 to 20%, etc)
>>
>>
>>    poverty |
>>     rate,5 |
>>    ordinal | White maj neighb = 1
>>     groups |         0          1 |     Total
>> -----------+----------------------+----------
>>         20 |       154        467 |       621
>>         40 |       154         26 |       180
>>         60 |        37          3 |        40
>>         80 |         3          1 |         4
>> -----------+----------------------+----------
>>      Total |       348        497 |       845
>>
>
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
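The diagnosis in this thread (no observations satisfy the -if- condition, so the -by()- variable takes on 0 distinct values) can be checked before running the test. What follows is a minimal sketch, not a prescribed procedure: the first part uses the auto dataset from the example above, and the second part reuses the variable names from the original post (LgAllSt0, povrat0, majw0), which are assumed to exist in the poster's dataset. It also assumes that observations with missing values on the outcome or the group variable are excluded from the test, so they are excluded from the counts here too.

```stata
* Reproducible analogue: count the observations the -if- condition selects.
sysuse auto, clear
count if rep78 < 1
* r(N) holds the count; a count of 0 is consistent with the
* "takes on 0 values, not 2" message shown above.
display r(N)

* Same checks with the names from the original post
* (hypothetical here: they must exist in the data in memory).
count if povrat0 >= 0 & povrat0 < 20 & !missing(LgAllSt0, majw0)
tabulate majw0 if povrat0 >= 0 & povrat0 < 20 & !missing(LgAllSt0), missing

* Optionally trap the error rather than stopping a do-file.
capture noisily ksmirnov LgAllSt0 if povrat0 >= 0 & povrat0 < 20, by(majw0)
if _rc == 450 {
    display as error "by() variable does not take on exactly 2 values in this subsample"
}
```

If the count is 0, or -tabulate- shows only one group, the r(450) error follows from the selected subsample and not from how the group variable is coded.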

**References**:

- **st: curious problem with 2 sample smirnov test**, *From:* Anjanette Chan Tack <amc75@uchicago.edu>
- **Re: st: curious problem with 2 sample smirnov test**, *From:* Nick Cox <njcoxstata@gmail.com>
