Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.

# Re: st: Creating a mean of x entries in variable1 if there are at least x entries in variable1 by person id

From: Roberto Ferrer <[email protected]>

To: Stata Help <[email protected]>

Subject: Re: st: Creating a mean of x entries in variable1 if there are at least x entries in variable1 by person id

Date: Wed, 9 Oct 2013 06:04:26 +0100

```
*----------------------------- input -------------------------------------------
clear

input id       exposure        year
1                3                 1
1                3                 2
1                4                 3
1                3                 4
1                3                 5
2                4                 1
2                4                 2
2                3                 3
2                4                 4
3                1                 1
3                2                 2
3                3                 3
3                2                 4
4                1                 1
4                3                 2
end

*---------------------- what you want ------------------------------------------

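* Count observations per id, then keep only ids with at least 3 observations
bysort id: gen count = _N
* One row per id; the mean is stored as mean_exposure (id 4, with 2 rows, is dropped)
collapse (mean) mean_exposure = exposure if count >= 3, by(id)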

*-------------------------- end ------------------------------------------------

http://www.stata.com/support/faqs/data-management/number-of-distinct-observations/

On Wed, Oct 9, 2013 at 1:07 AM, Nicholas Winters
<[email protected]> wrote:
> Sorry for the complicated subject, this is what my data looks like:
>
> person id       exposure        year
> 1                3                 1
> 1                3                 2
> 1                4                 3
> 1                3                 4
> 1                3                 5
> 2                4                 1
> 2                4                 2
> 2                3                 3
> 2                4                 4
> 3                1                 1
> 3                2                 2
> 3                3                 3
> 3                2                 4
>
> I want to take an average of 'exposure' if there are at least, say, 3 entries in 'year' per person id. So I want to end up with a new variable, mean_exposure, that looks like this:
> person id       mean_exposure
> 1                 3
> 2                 4
> 3                 3
> 4                 2
>
> I've tried by person id: egen mean_exposure=mean(exposure) if _____________
> but I want the mean of ALL values in exposure only if 'year' has at least x number of observations
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/faqs/resources/statalist-faq/
> *   http://www.ats.ucla.edu/stat/stata/
```
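
The original question asks for a new variable rather than a collapsed dataset, so a variant that keeps every observation may also be useful. The following is only a sketch along the same lines as the answer above: it assumes the example data from the input block are in memory (before any `collapse`), and `n_obs` is a helper name introduced here for illustration, not something from the original post.

```stata
* Sketch only: run on the example data above, before any collapse
* n_obs is a hypothetical helper variable holding the number of
* nonmissing exposure values for each id
bysort id: egen n_obs = count(exposure)

* Mean of all exposure values within id, filled in only where the id
* has at least 3 nonmissing values; other ids are left missing
by id: egen mean_exposure = mean(exposure) if n_obs >= 3

list id exposure year mean_exposure, sepby(id)
```

With the example data, mean_exposure should come out as 3.2 for id 1, 3.75 for id 2 and 2 for id 3, and stay missing for id 4, which has only two observations. Note that `count(exposure)` ignores missing values of exposure, whereas `_N` counts all rows for the id; which one is appropriate depends on how missing values should be treated.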
