
# Re: st: Counting observations within groups

From: Austin Nichols
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Counting observations within groups
Date: Thu, 29 Nov 2012 17:59:38 -0500

Daniel Escher <descher@nd.edu>:
Make an empty variable, loop over counties, filling in values as you
go, something like this:

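* overall mean of production; meanonly leaves it in r(mean)
su totprod, mean
loc m=r(mean)
* distinct county codes (as written, this assumes fips is numeric)
qui levelsof fips, loc(fs)
g long nbig=.
foreach f of loc fs {
    * count above-mean, nonmissing mines of the two SIC types in county `f'
    qui count if (totprod>`m'&totprod<.)&(sic==12110|sic==11110)&fips==`f'
    qui replace nbig = r(N) if fips==`f'
}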

Sometimes the list to loop over can get too long, in which case:

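su totprod, mean
loc m=r(mean)
* nbig must exist before the loop; skip this line if it was already created above
g long nbig=.
* number the counties 1, 2, ... so the loop needs no list of codes
egen i=group(fips)
su i, mean
forv i=1/`r(max)' {
    qui count if (totprod>`m'&totprod<.)&(sic==12110|sic==11110)&i==`i'
    qui replace nbig = r(N) if i==`i'
}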

is an alternative.
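For comparison, the same count can probably be had without a loop by summing a 0/1 indicator within each county. This is only a sketch, not something tested on the poster's data: the names big, small, nbig2 and nsmall are made up for illustration, and nsmall assumes that "below" means strictly below the overall mean for the same two SIC codes.

```stata
* overall mean of production, kept in a local macro
summarize totprod, meanonly
local m = r(mean)

* 1 if the mine qualifies, 0 otherwise; the parentheses keep & and | apart
generate byte big   = (totprod > `m' & totprod < .) & (sic == 12110 | sic == 11110)
generate byte small = (totprod < `m') & (sic == 12110 | sic == 11110)

* sum the indicators within each county (hypothetical variable names)
bysort fips: egen long nbig2  = total(big)
bysort fips: egen long nsmall = total(small)
```

Missing values of totprod count as larger than any number in Stata, so they are excluded from big by the `totprod < .` condition and from small by the `<` comparison itself.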

On Thu, Nov 29, 2012 at 5:48 PM, Daniel Escher <descher@nd.edu> wrote:
> Hello,
>
> I am trying to count the number of mines in a county by production.
> I.e., I'd like the number of mines in each county that are above the
> overall mean of production, and the number that are below. There are
> multiple mines per county, which is identified by its FIPS code.
> Missing data are marked by . The data are in long format.
>
> Here's what I have so far:
> . *bigmines = # of mines in a county above the overall mean
> . *totprod = total production per mine
> . *sic = type of mine
>
> . *ATTEMPT ONE
> . sort fips
> . su totprod // to get mean
> . by fips: egen bigmines = count(inrange(totprod, r(mean), .) &
> sic==12110 | sic==11110)  // This gives me total number of mines per
> FIPS code - not those that meet the criteria
> . drop bigmines
>
> . *ATTEMPT TWO
> . su totprod // to get mean
> . by fips: egen bigmines = total(mshahrs > r(mean) & sic==12110 |
> sic==11110) // This gives me the total number of mines per FIPS code
> if any mine exceeds the mean
> . drop bigmines
>
> . *ATTEMPT THREE
> (http://www.stata-journal.com/sjpdf.html?articlenum=pr0029) which
> clued me in to -count-:
> . gen bigmines = 0
> . su totprod
> . count if inrange(totprod, r(mean), .) & sic==12110 | sic==11110
> . replace bigmines = r(N)
>
> The last attempt is what I want, and it "works." However, I don't know
> how to -count- and then store r(N) for each FIPS code. Using -by- does
> not seem to work. This probably requires a loop like...
>
> forvalues j = all values of fips {
>         count if inrange(mshahrs, r(mean), .) & sic==12110 | sic==11110
>         replace bigmines_hrs = r(N)
> }
>
> Is this close? Thank you so much for your help and time.
>
> Gratefully,
> Daniel