


RE: st: St: collapse by _N

From   Nick Cox <>
To   "''" <>
Subject   RE: st: St: collapse by _N
Date   Wed, 20 Oct 2010 11:31:31 +0100

All good advice, and here is some more:

1. I echo Michael in noting that -collapse- can produce a count variable, so that there is no need to set up your own. Of course, you would then need to drop data based on small samples after the -collapse-. 

2. Be aware of -contract-. It has precisely the role of collapsing to frequencies, and so by default produces a count variable. By implication Ric here wants mostly to -collapse- to means, but I've often seen people use -collapse- when their objective was more directly matched by -contract-. 
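A minimal sketch of both points, not taken from the thread: it assumes the shipped auto dataset as a stand-in for the survey data, with rep78 playing the role of geocode and price and mpg the role of varlist. The threshold of 20 follows the earlier replies.

```stata
* Sketch only: auto.dta and rep78 are stand-ins for the survey data and geocode.
sysuse auto, clear

* Point 1: let -collapse- compute the count alongside the means,
* then drop the groups that rest on too few observations.
preserve
collapse (mean) price mpg (count) n = price, by(rep78)
drop if n < 20
list
restore

* Point 2: -contract- reduces the data to one observation per group
* plus a frequency variable (named _freq unless freq() is given).
contract rep78, freq(n)
list
```

Note that (count) counts nonmissing values of the named variable, whereas _N counts every observation in the group, so the two can differ when the counted variable has missing values.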


Michael Mitchell

   In addition to the great answers Chris and Ulrich sent, I might suggest that you 
include a variable that counts the number of valid observations. After having the 
collapsed file, you could then decide what you might want to use as a threshold for the 
data being too unreliable. You can see more examples about collapsing, including examples 
using count, at .

Ulrich Kohler

. bysort geocode: gen n = _N
. collapse (mean) varlist if n >= 20, by(geocode)

Chris Parker

You could count the observations in each geocode, then drop if there are too few observations then collapse.

bysort geocode: gen numobs=_N
drop if numobs < 20
collapse varlist, by(geocode)

Eric Uslaner

> I have a survey data set with respondents geocoded.  I want to collapse the data set to the geocode level, so the simple command would be:
> collapse varlist,by(geocode)
> However some geocodes barely have any respondents and any collapsed data would be unreliable. Is there a straightforward way to collapse only if the number of respondents is > 20 (e.g.)?
