

From: Austin Nichols <austinnichols@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Counting observations within groups
Date: Thu, 29 Nov 2012 17:59:38 -0500

Daniel Escher <descher@nd.edu>:
Make an empty variable, loop over counties, filling in values as you
go, something like this:

    su totprod, mean
    loc m=r(mean)
    qui levelsof fips, loc(fs)
    g long nbig=.
    foreach f of loc fs {
        qui count if (totprod>`m'&totprod<.)&(sic==12110|sic==11110) &fips==`f'
        replace nbig = r(N) if fips==`f'
    }

Sometimes the list to loop over can get too long, in which case:

    su totprod, mean
    loc m=r(mean)
    egen i=group(fips)
    su i, mean
    forv i=1/`r(max)' {
        qui count if (totprod>`m'&totprod<.)&(sic==12110|sic==11110) &i==`i'
        replace nbig = r(N) if i==`i'
    }

is an alternative.

On Thu, Nov 29, 2012 at 5:48 PM, Daniel Escher <descher@nd.edu> wrote:
> Hello,
>
> I am trying to count the number of mines in a county by production.
> I.e., I'd like the number of mines in each county that are above the
> overall mean of production, and the number that are below. There are
> multiple mines per county, which is identified by its FIPS code.
> Missing data are marked by . The data are in long format.
>
> Here's what I have so far:
> . *bigmines = # of mines in a county above the overall mean
> . *totprod = total production per mine
> . *sic = type of mine
>
> . *ATTEMPT ONE
> . sort fips
> . su totprod // to get mean
> . by fips: egen bigmines = count(inrange(totprod, r(mean), .) &
> sic==12110 | sic==11110) // This gives me total number of mines per
> FIPS code - not those that meet the criteria
> . drop bigmines
>
> . *ATTEMPT TWO
> . su totprod // to get mean
> . by fips: egen bigmines = total(mshahrs > r(mean) & sic==12110 |
> sic==11110) // This gives me the total number of mines per FIPS code
> if any mine exceeds the mean
> . drop bigmines
>
> . *ATTEMPT THREE
> . *Then I read Nick Cox's helpful article
> (http://www.stata-journal.com/sjpdf.html?articlenum=pr0029) which
> clued me in to -count-:
> . gen bigmines = 0
> . su totprod
> . count if inrange(totprod, r(mean), .) & sic==12110 | sic==11110
> . replace bigmines = r(N)
>
> The last attempt is what I want, and it "works." However, I don't know
> how to -count- and then store r(N) for each FIPS code. Using -by- does
> not seem to work. This probably requires a loop like...
>
> forvalues j = all values of fips {
> count if inrange(mshahrs, r(mean), .) & sic==12110 | sic==11110
> replace bigmines_hrs = r(N)
> }
>
> Is this close? Thank you so much for your help and time.
>
> Gratefully,
> Daniel

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/faqs/resources/statalist-faq/
*   http://www.ats.ucla.edu/stat/stata/
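A loop-free variant of the same count may also be worth trying. It is only a sketch and is not part of the reply above. Two things seem to explain why the quoted -egen- attempts misbehave. First, -egen-'s count() counts nonmissing values of its argument, and a true/false expression is never missing, so every observation gets counted. Second, `&` binds more tightly than `|`, so `a & b | c` is read as `(a & b) | c`. Summing an explicitly parenthesized condition with total() avoids both. The toy data and the variable name nsmall below are made up for illustration; fips, sic, totprod and nbig follow the thread.

```stata
* Hypothetical toy data: two counties, three mines each
clear
input fips sic totprod
1 12110 10
1 12110 100
1 11110 .
2 12110 200
2 99999 300
2 11110 5
end

* Overall mean of production (missing values are ignored by -summarize-)
su totprod, mean
loc m = r(mean)

* The condition evaluates to 1 or 0 for each mine; total() sums it by county.
* totprod < . excludes missing values, which Stata treats as larger than any number.
egen nbig = total((totprod > `m' & totprod < .) & inlist(sic, 12110, 11110)), by(fips)

* Mines at or below the mean; missing totprod fails the <= test on its own
egen nsmall = total(totprod <= `m' & inlist(sic, 12110, 11110)), by(fips)

list fips sic totprod nbig nsmall, sepby(fips)
```

With these made-up values the overall mean is 123, so county 1 should show nbig = 0 and nsmall = 2, and county 2 should show nbig = 1 and nsmall = 1. The mine with sic 99999 is excluded from both counts by the inlist() condition.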
