"Nick Cox" <n.j.cox@durham.ac.uk>

<statalist@hsphsun2.harvard.edu>

RE: st: Collapse & Missing Values

Thu, 29 Sep 2005 01:54:08 +0100

bysort group : egen nonmiss = total(myvar < .)
by group: egen total = total(myvar)
replace total = . if nonmiss == 0
egen tag = tag(group)
corr <whatever> if tag

Nick
n.j.cox@durham.ac.uk

Eric G. Wruck

> Thank you Nick for your valiant effort to characterize the
> treatment of missings as a feature. And thank you,
> Friedrich, for your work-around (& again to you Nick for your
> help on that too).
>
> Let me just try to explain why this wasn't a feature for me
> today. Using the collapse statement, I was aggregating
> various amount fields by day. There could be multiple (and
> usually were) transactions per day. Once I had the
> aggregated amounts, I was interested in their correlations,
> especially the correlation of one amount with the lagged
> amount of another. When I start introducing erroneous zero
> amounts, my correlations will not be unbiased, & certainly
> not correct. In fact, the way I discovered this is that one
> colleague was computing the same correlations in SAS. For
> some reason, I had more observations than he. I now know
> that my "extra" observations were the result of collapse's
> treatment of missing values. I was able to get the same
> correlations as my colleague by deleting the observations
> with missing amounts but then I also lose the information on
> the number of transactions on those days (albeit with
> incomplete data). So yes, I emphatically agree with your
> diagnosis:
>
> >I guess what Eric would in effect like Stata to do
> >is to keep track of all the occurrences of
> >missing so that -sum()- would produce say
> >
> >. + . + . + . + . + . + 42 = 42
> >
> >but
> >
> >. + . + . + . + . + . + . = .
> >
> >Thus, at the end of a set that were all missing,
> >-sum()- would be morally compelled to say,
> >"No, that initial guess of 0 doesn't apply here.
> >These values are all missing, so the sum must
> >be missing. I changed my mind!"
> >
> Failing such a radical change to collapse, perhaps there
> could be an "allmiss" parameter that would make the sum of
> totally missing values equal to missing.
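
Absent such an option, much the same effect can probably be had from -collapse- itself by carrying along a count of nonmissing values. What follows is only a sketch: the names group, myvar, total and nonmiss are the illustrative ones used in the code at the top of this message, not anything from Eric's data.

```stata
* Sketch only: group and myvar are placeholder names, as above.
* (sum) returns 0 for a group whose values are all missing;
* (count) returns the number of nonmissing values in each group.
collapse (sum) total=myvar (count) nonmiss=myvar, by(group)

* Reset the sum to missing wherever no nonmissing value contributed.
replace total = . if nonmiss == 0
```

This leaves one observation per group, so no tagging is needed before -corr-, and groups with only missing amounts stay in the data with a missing total rather than a zero.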

