Statalist



st: RE: -collapse- with no observations?


From   "Nick Cox" <n.j.cox@durham.ac.uk>
To   <statalist@hsphsun2.harvard.edu>
Subject   st: RE: -collapse- with no observations?
Date   Fri, 2 Oct 2009 18:36:59 +0100

I've got to say that while Neil has a good logical case, I also regard
this as a reasonable default behaviour. 

My guess is that when a -collapse- or -contract- asks for combinations
that don't exist, it is usually the result of a typo (the user didn't
mean to type what was issued) or of a misconception (the user thought
there were some such observations). So the commands in question issue
error 2000 and leave your dataset as is, which is usually what you
should want to happen. I don't think an option to return results for
non-existent observations (a zero count or missing summary statistics)
would be used enough to justify changing the code, but that's a
personal view, and these are naturally official commands. There are
work-arounds too.
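
A minimal sketch of one such work-around, using Neil's example: trap
the error with -capture- and, when the return code is 2000, build the
one-observation result by hand. It assumes that a dataset holding a
single observation with n = 0 is what is wanted in that case; that
choice is an assumption for illustration, not anything official.

```stata
sysuse auto, clear

* -capture- suppresses the error and stores the return code in _rc
capture collapse (count) n = mpg if price < 2000

if _rc == 2000 {
    * no observations satisfied the condition, so the data are untouched;
    * replace them with one observation recording a count of zero
    * (assumption: this is the result you want)
    clear
    set obs 1
    generate long n = 0
}
else if _rc {
    * any other problem is still reported as an error
    error _rc
}

list
```

If only the number itself is needed, -count if price < 2000- leaves it
in r(N) and does not change the data in memory.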

Nick 
n.j.cox@durham.ac.uk 

Neil Shephard

I've encountered some unexpected behaviour whilst using -collapse- on
different subsets of data.  An example to demonstrate the problem....

sysuse auto, clear
sum price
collapse (count) n = mpg if price < 2000

This results in an error as there are no cars with price < 2000...

. sysuse auto, clear
(1978 Automobile Data)

. sum price

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
       price |        74    6165.257    2949.496       3291      15906


. collapse (count) n = mpg if price < 2000
no observations
r(2000);

I would have expected the command to run and return a count of zero
(0) as this is a valid number of counts (to my mind at least).

There's a mention at http://www.stata.com/help.cgi?whatsnew10 that
indicates that trying to open the data editor with an if qualifier
that resulted in zero observations used to cause Stata to crash, but
that has been fixed.

Searching the archives/FAQ I haven't been able to find whether this
has been discussed before, or whether it is reasonable behaviour (as
I say, I would have expected to get a count of zero returned).
