
# Re: st: Counting observations within groups

From: Daniel Escher
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Counting observations within groups
Date: Fri, 30 Nov 2012 08:12:10 -0500

```
Austin,

Thank you so much! I had forgotten about using levelsof to create a
local of all values in a variable. In this case, your third option was
computationally quickest, but I'll keep the first two options in my
head for later situations. For some reason, totprod>`m' needed to be
changed to totprod>r(mean). Thus,

su totprod, mean
g big=(totprod>r(mean)&totprod<.)&(sic==12110|sic==11110)
by fips: g sbig=sum(big)
by fips: replace sbig=sbig[_N]

On Thu, Nov 29, 2012 at 6:03 PM, Austin Nichols <austinnichols@gmail.com> wrote:
> Daniel Escher <descher@nd.edu>:
>
> I sent my prior post a bit prematurely... I meant to go on to say--
> but one does not need a loop for this particular problem.
>
> Make a dummy, sum within county:
>
> su totprod, mean
> g big=(totprod>`m'&totprod<.)&(sic==12110|sic==11110)
> bys fips: g sbig=sum(big)
> by fips: replace sbig=sbig[_N]
>
> On Thu, Nov 29, 2012 at 5:48 PM, Daniel Escher <descher@nd.edu> wrote:
>> Hello,
>>
>> I am trying to count the number of mines in a county by production.
>> I.e., I'd like the number of mines in each county that are above the
>> overall mean of production, and the number that are below. There are
>> multiple mines per county, which is identified by its FIPS code.
>> Missing data are marked by . The data are in long format.
>>
>> Here's what I have so far:
>> . *bigmines = # of mines in a county above the overall mean
>> . *totprod = total production per mine
>> . *sic = type of mine
>>
>> . *ATTEMPT ONE
>> . sort fips
>> . su totprod // to get mean
>> . by fips: egen bigmines = count(inrange(totprod, r(mean), .) & sic==12110 | sic==11110)
>> // This gives me total number of mines per FIPS code - not those that meet the criteria
>> . drop bigmines
>>
>> . *ATTEMPT TWO
>> . su totprod // to get mean
>> . by fips: egen bigmines = total(mshahrs > r(mean) & sic==12110 | sic==11110)
>> // This gives me the total number of mines per FIPS code if any mine exceeds the mean
>> . drop bigmines
>>
>> . *ATTEMPT THREE
>> (http://www.stata-journal.com/sjpdf.html?articlenum=pr0029) which
>> clued me in to -count-:
>> . gen bigmines = 0
>> . su totprod
>> . count if inrange(totprod, r(mean), .) & sic==12110 | sic==11110
>> . replace bigmines = r(N)
>>
>> The last attempt is what I want, and it "works." However, I don't know
>> how to -count- and then store r(N) for each FIPS code. Using -by- does
>> not seem to work. This probably requires a loop like...
>>
>> forvalues j = all values of fips {
>>         count if inrange(mshahrs, r(mean), .) & sic==12110 | sic==11110
>>         replace bigmines_hrs = r(N)
>> }
>>
>> Is this close? Thank you so much for your help and time.
>>
>> Gratefully,
>> Daniel
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/faqs/resources/statalist-faq/
> *   http://www.ats.ucla.edu/stat/stata/
```
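A few points in the thread may be easier to follow with a small worked example. The sketch below uses made-up toy data (the values are assumptions, not Daniel's data) and reuses the thread's variable names `fips`, `sic`, and `totprod`; the names `small`, `smallmines`, and `bigmines_loop` are hypothetical additions. Three things seem worth noting. First, the snippet as quoted never defines a local macro `m`, so `` `m' `` would expand to nothing there, which likely explains why `r(mean)` had to be substituted; storing `r(mean)` in a local right after `summarize` avoids that. Second, in Stata `&` binds more tightly than `|`, so `A & sic==12110 | sic==11110` is read as `(A & sic==12110) | sic==11110`; the explicit parentheses below group the two SIC codes first. Third, `egen ... count()` counts nonmissing values of its argument, and a logical expression is always 0 or 1, so it returns the group size; `total()` of a 0/1 indicator gives the count of matches instead.

```stata
* Toy data (hypothetical values) with the thread's variable names
clear
input fips sic totprod
1001 12110 10
1001 12110 50
1001 11110 90
1001 14000 95
1003 12110 20
1003 11110 .
1003 11110 80
end

* Overall mean of production, saved in a local before r() is overwritten
summarize totprod, meanonly
local m = r(mean)

* 0/1 indicators; parentheses force the two SIC codes to be grouped first.
* totprod < . excludes missing values, which Stata treats as larger than any number.
generate byte big   = (totprod > `m' & totprod < .) & (sic == 12110 | sic == 11110)
generate byte small = (totprod <= `m') & (sic == 12110 | sic == 11110)

* Per-county counts, repeated on every row of the county
bysort fips: egen bigmines   = total(big)
bysort fips: egen smallmines = total(small)

* Loop alternative using levelsof and count, as asked about in the original post
generate bigmines_loop = .
levelsof fips, local(counties)
foreach c of local counties {
    count if big & fips == `c'
    replace bigmines_loop = r(N) if fips == `c'
}

list, sepby(fips)
```

The `egen` version and the running-sum version from the thread (`sum(big)` followed by `sbig[_N]`) should give the same per-county totals; the loop is shown only because the original question asked about it, and it will generally be slower when there are many counties.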