The Stata listserver

RE: st: Collapse & Missing Values


From   "Eric G. Wruck" <ewruck@econalytics.com>
To   statalist@hsphsun2.harvard.edu
Subject   RE: st: Collapse & Missing Values
Date   Wed, 28 Sep 2005 20:46:22 -0400

Thank you Nick for your valiant effort to characterize the treatment of missings as a feature.  And thank you, Friedrich, for your work-around (& again to you Nick for your help on that too).

Let me just try to explain why this wasn't a feature for me today.  Using the -collapse- command, I was aggregating various amount fields by day.  There could be (and usually were) multiple transactions per day.  Once I had the aggregated amounts, I was interested in their correlations, especially the correlation of one amount with the lagged amount of another.  When -collapse- reports a sum of zero for a day on which every amount was missing, those erroneous zeros enter the calculation, so my correlations are biased & certainly not correct.

In fact, I discovered this because a colleague was computing the same correlations in SAS, and for some reason I had more observations than he did.  I now know that my "extra" observations were the result of collapse's treatment of missing values.  I was able to get the same correlations as my colleague by deleting the observations with missing amounts, but then I also lose the information on the number of transactions on those days (albeit with incomplete data).  So yes, I emphatically agree with your diagnosis:

>I guess what Eric would in effect like Stata to do
>is to keep track of all the occurrences of
>missing so that -sum()- would produce say
>
>. + . + . + . + . + . + 42 = 42
>
>but
>
>. + . + . + . + . + . + . = .
>
>Thus, at the end of a set that were all missing,
>-sum()- would be morally compelled to say,
>"No, that initial guess of 0 doesn't apply here.
>These values are all missing, so the sum must
>be missing. I changed my mind!"   


Failing such a radical change to collapse, perhaps there could be an "allmiss" option that would make the sum missing whenever every value being summed is missing.
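
To make the request concrete, here is a minimal sketch of the two behaviours, written in Python rather than Stata. It is only an illustration of the semantics, not how collapse is implemented. The names sum_default, sum_allmiss and collapse_by_day are hypothetical, None stands in for Stata's missing value ".", and the two days of transactions are made-up toy data.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List, Optional, Tuple

Amount = Optional[float]  # None stands in for Stata's missing value "."


def sum_default(values: Iterable[Amount]) -> float:
    """Skip missings and start from 0, so an all-missing group sums to 0."""
    return sum(v for v in values if v is not None)


def sum_allmiss(values: Iterable[Amount]) -> Amount:
    """Skip missings, but return missing when no value was present at all."""
    present = [v for v in values if v is not None]
    return sum(present) if present else None


def collapse_by_day(
    rows: Iterable[Tuple[str, Amount]],
    summer: Callable[[List[Amount]], Amount],
) -> Dict[str, Amount]:
    """Aggregate (day, amount) rows to one total per day using `summer`."""
    groups: Dict[str, List[Amount]] = defaultdict(list)
    for day, amount in rows:
        groups[day].append(amount)
    return {day: summer(vals) for day, vals in groups.items()}


# Toy data (assumed): one day with a single known amount, one day all missing.
transactions = [
    ("2005-09-26", None), ("2005-09-26", 42.0),
    ("2005-09-27", None), ("2005-09-27", None),
]

default_totals = collapse_by_day(transactions, sum_default)
allmiss_totals = collapse_by_day(transactions, sum_allmiss)

print(default_totals)  # the all-missing day gets a total of 0
print(allmiss_totals)  # the all-missing day stays missing (None)
```

With the second rule the all-missing day still appears as an observation, so the count of transactions on that day is not lost, but its total is missing and so would drop out of a correlation rather than entering it as a zero.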


Eric

-- 

===================================================

       Eric G. Wruck
       Econalytics
       2535 Sherwood Road
       Columbus, OH  43209

       ph:      614.231.5034
       cell:    614.330.8846
       eFax:    614.573.6639
       eMail:   ewruck@econalytics.com
       website: http://www.econalytics.com

====================================================



