From
Roberto Ferrer <refp16@gmail.com>

To
Stata Help <statalist@hsphsun2.harvard.edu>

Subject
Re: st: Creating a mean of x entries in variable1 if there are at least x entries in variable1 by person id

Date
Wed, 9 Oct 2013 06:04:26 +0100

*----------------------------- input -------------------------------------------
clear
input id exposure year
1 3 1
1 3 2
1 4 3
1 3 4
1 3 5
2 4 1
2 4 2
2 3 3
2 4 4
3 1 1
3 2 2
3 3 3
3 2 4
4 1 1
4 3 2
end

*---------------------- what you want ------------------------------------------
* number of observations per id
bysort id: gen count = _N
* one row per id with the mean of exposure, keeping only ids with 3 or more observations
collapse (mean) exposure if count >= 3, by(id)
*-------------------------- end ------------------------------------------------

See also:
http://www.stata.com/support/faqs/data-management/number-of-distinct-observations/

On Wed, Oct 9, 2013 at 1:07 AM, Nicholas Winters <nicholas.winters@mail.mcgill.ca> wrote:
> Sorry for the complicated subject, this is what my data looks like:
>
> person id   exposure   year
> 1           3          1
> 1           3          2
> 1           4          3
> 1           3          4
> 1           3          5
> 2           4          1
> 2           4          2
> 2           3          3
> 2           4          4
> 3           1          1
> 3           2          2
> 3           3          3
> 3           2          4
>
> I want to take an average of 'exposure' if there are at least, say, 3 entries in 'year' per person id. So I want to end up with a new variable, mean_exposure, that looks like this:
>
> person id   mean_exposure
> 1           3
> 2           4
> 3           3
> 4           2
>
> I've tried by person id: egen mean_exposure=mean(exposure) if _____________
> but I want the mean of ALL values in exposure only if 'year' has at least x number of observations
> please help, I'm stumped...?
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/faqs/resources/statalist-faq/
> * http://www.ats.ucla.edu/stat/stata/
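
A possible variant, if the original rows should be kept: collapse replaces the dataset with one row per id, whereas the question asks for a new variable. The sketch below assumes the same id and exposure variables as in the input block of this reply; the names n_obs and mean_exposure are chosen here for illustration only.

```stata
* Assumption: id and exposure exist as in the example data of this thread.
* n_obs and mean_exposure are illustrative names, not from the original post.

* number of observations per id (counts rows, including any with missing exposure)
bysort id: gen n_obs = _N

* per-id mean of exposure, filled in only for ids with at least 3 observations;
* ids with fewer observations get missing (.)
by id: egen mean_exposure = mean(exposure) if n_obs >= 3
```

With the example data, this should give mean_exposure of 3.2 for id 1, 3.75 for id 2, 2 for id 3, and missing for id 4, which has only two observations. If exposure can be missing, counting with egen's count() function instead of _N may be closer to the intent.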

