
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.



st: RE: Creating a mean of x entries in variable1 if there are at least x entries in variable1 by person id


From   Joe Canner <[email protected]>
To   "[email protected]" <[email protected]>
Subject   st: RE: Creating a mean of x entries in variable1 if there are at least x entries in variable1 by person id
Date   Wed, 9 Oct 2013 02:13:59 +0000

Nicholas,

Try:

bys person_id: gen count=_N
bys person_id: egen mean_exposure=mean(exposure) if count>=3
bys person_id: keep if _n==1

It's not clear from your example whether it is sufficient to just count the number of observations for each person or whether you need to only count observations that have nonmissing values of -year-.  If the latter, you could substitute the following for the first line above:

bys person_id: egen count=count(year)
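
To see how the three lines above behave, here is a minimal sketch that types in the data from the original question and runs them. Person 9, with only two observations, is a hypothetical addition (not in the posted data) included only to show what the `if count>=3` condition does; the threshold of 3 is the "say, 3" from the question.

```stata
* Data from the original post, plus a hypothetical person 9 with only
* two observations (an assumption, added to exercise the count>=3 rule)
clear
input person_id exposure year
1 3 1
1 3 2
1 4 3
1 3 4
1 3 5
2 4 1
2 4 2
2 3 3
2 4 4
3 1 1
3 2 2
3 3 3
3 2 4
9 5 1
9 1 2
end

* Number of observations per person (use egen count=count(year) instead
* if only nonmissing values of year should be counted)
bysort person_id: gen count = _N

* Per-person mean of ALL exposure values, filled in only where count>=3
bysort person_id: egen mean_exposure = mean(exposure) if count >= 3

* Collapse to one row per person
bysort person_id: keep if _n == 1

list person_id count mean_exposure, noobs
```

With these data the means for persons 1, 2, and 3 are 16/5 = 3.2, 15/4 = 3.75, and 8/4 = 2. Person 9 is still in the data, with mean_exposure missing; if such persons should not appear at all, `drop if missing(mean_exposure)` after the last step would remove them.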

Regards,
Joe Canner
Johns Hopkins University School of Medicine
________________________________________
From: [email protected] [[email protected]] on behalf of Nicholas Winters [[email protected]]
Sent: Tuesday, October 08, 2013 8:07 PM
To: [email protected]
Subject: st: Creating a mean of x entries in variable1 if there are at least x entries in variable1 by person id

Sorry for the complicated subject, this is what my data looks like:

person id       exposure        year
1                3                 1
1                3                 2
1                4                 3
1                3                 4
1                3                 5
2                4                 1
2                4                 2
2                3                 3
2                4                 4
3                1                 1
3                2                 2
3                3                 3
3                2                 4

I want to take an average of 'exposure' if there are at least, say, 3 entries in 'year' per person id. So I want to end up with a new variable, mean_exposure, that looks like this:
person id       mean_exposure
1                 3
2                 4
3                 3
4                 2

I've tried by person id: egen mean_exposure=mean(exposure) if _____________
but I want the mean of ALL values in exposure, computed only if 'year' has at least x observations for that person.
Please help, I'm stumped.
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/faqs/resources/statalist-faq/
*   http://www.ats.ucla.edu/stat/stata/


