From | Nick Cox <n.j.cox@durham.ac.uk> |
To | "'statalist@hsphsun2.harvard.edu'" <statalist@hsphsun2.harvard.edu> |
Subject | RE: st: St: collapse by _N |
Date | Wed, 20 Oct 2010 11:31:31 +0100 |
All good advice, and here is some more:

1. I echo Michael in noting that -collapse- can produce a count variable, so that there is no need to set up your own. Of course, you would then need to drop data based on small samples after the -collapse-.

2. Be aware of -contract-. It has precisely the role of collapsing to frequencies, and so by default produces a count variable. By implication Ric here wants mostly to -collapse- to means, but I've often seen people use -collapse- when their objective was more directly matched by -contract-.

Nick
n.j.cox@durham.ac.uk

Michael Mitchell
================

In addition to the great answers Chris and Ulrich sent, I might suggest that you include a variable that counts the number of valid observations. After having the collapsed file, you could then decide what you might want to use as a threshold for the data being too unreliable. You can see more examples about collapsing, including examples using count, at http://www.ats.ucla.edu/stat/stata/modules/collapse.htm .

Ulrich Kohler
=============

. bysort geocode: gen n = _N
. collapse (mean) varlist if n >= 20, by(geocode)

Chris Parker
============

You could count the observations in each geocode, then drop if there are too few observations, then collapse.

bysort geocode: gen numobs=_N
drop if numobs < 20
collapse varlist, by(geocode)

Eric Uslaner
============

> I have a survey data set with respondents geocoded. I want to collapse the data set to the geocode level, so the simple command would be:
>
> collapse varlist,by(geocode)
>
> However some geocodes barely have any respondents and any collapsed data would be unreliable. Is there a straightforward way to collapse only if the number of respondents is > 20 (e.g.)?
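
Points 1 and 2 in Nick's reply can be sketched roughly as follows. This is only an illustration on made-up data: the variable income, the new names mean_income and n, and the toy geocode sizes (60, 35 and 5 respondents) are all assumptions, not anything from the original data set. Note also that (count) counts nonmissing values of the named variable, which can differ from _N when that variable has missing values.

```stata
* Toy data (hypothetical): three geocodes with 60, 35 and 5 respondents
clear
set obs 100
set seed 12345
gen geocode = cond(_n <= 60, 1, cond(_n <= 95, 2, 3))
gen income = rnormal(50, 10)

* Point 1: let -collapse- produce the count itself, then drop small groups.
* (count) counts nonmissing values of income within each geocode.
preserve
collapse (mean) mean_income = income (count) n = income, by(geocode)
drop if n < 20
list
restore

* Point 2: -contract- collapses to frequencies; by default the count
* variable is named _freq.
contract geocode
list geocode _freq if _freq >= 20
```

With the toy data above, the third geocode has only 5 respondents, so it should be dropped by the first approach and excluded from the listing in the second.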