Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.

st: RE: Creating a mean of x entries in variable1 if there are at least x entries in variable1 by person id

From: Joe Canner <[email protected]>
To: "[email protected]" <[email protected]>
Subject: st: RE: Creating a mean of x entries in variable1 if there are at least x entries in variable1 by person id
Date: Wed, 9 Oct 2013 02:13:59 +0000

```
Nicholas,

Try:

bys person_id: gen count=_N
bys person_id: egen mean_exposure=mean(exposure) if count>=3
bys person_id: keep if _n==1

It's not clear from your example whether it is sufficient to count the number of observations for each person or whether you need to count only observations that have nonmissing values of -year-.  If the latter, you could substitute the following for the first line above:

bys person_id: egen count=count(year)

Regards,
Joe Canner
Johns Hopkins University School of Medicine
________________________________________
From: [email protected] [[email protected]] on behalf of Nicholas Winters [[email protected]]
Sent: Tuesday, October 08, 2013 8:07 PM
To: [email protected]
Subject: st: Creating a mean of x entries in variable1 if there are at least x entries in variable1 by person id

Sorry for the complicated subject. This is what my data look like:

person id       exposure        year
1                3                 1
1                3                 2
1                4                 3
1                3                 4
1                3                 5
2                4                 1
2                4                 2
2                3                 3
2                4                 4
3                1                 1
3                2                 2
3                3                 3
3                2                 4

I want to take an average of 'exposure' if there are at least, say, 3 entries in 'year' per person id. So I want to end up with a new variable, mean_exposure, that looks like this:
person id       mean_exposure
1                 3
2                 4
3                 3
4                 2

I've tried by person id: egen mean_exposure=mean(exposure) if _____________
but I want the mean of ALL values in exposure, computed only if 'year' has at least x observations for that person id.
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/faqs/resources/statalist-faq/
*   http://www.ats.ucla.edu/stat/stata/

```
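
The suggestion in the reply can be tried on the example data from the question. What follows is only a minimal sketch of a self-contained do-file, not a definitive implementation: it assumes the variables are named person_id, exposure and year, that the threshold is 3, and that counting nonmissing values of -year- is what is wanted. The counter name n_years is hypothetical and does not appear in the original posts.

```stata
* Sketch under stated assumptions: variable names and the threshold of 3
* follow the thread; n_years is a hypothetical helper variable.
clear
input person_id exposure year
1 3 1
1 3 2
1 4 3
1 3 4
1 3 5
2 4 1
2 4 2
2 3 3
2 4 4
3 1 1
3 2 2
3 3 3
3 2 4
end

* Count nonmissing values of year within each person
bysort person_id: egen n_years = count(year)

* Mean of ALL exposure values, computed only for persons with >= 3 years
bysort person_id: egen mean_exposure = mean(exposure) if n_years >= 3

* Reduce to one observation per person and show the result
bysort person_id: keep if _n == 1
keep person_id mean_exposure
list, noobs
```

With these data every person has at least 3 nonmissing years, so the means should come out as 3.2, 3.75 and 2 for persons 1, 2 and 3. A person with fewer than 3 years would be kept with a missing mean_exposure; adding -drop if missing(mean_exposure)- before -list- would remove such rows if they are unwanted.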
