
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.



Re: st: Creating a mean of x entries in variable1 if there are at least x entries in variable1 by person id (Out of office: Vacation )


From   "Newton Cheng" <ncheng@aafp.org>
To   "Stata Help" <statalist@hsphsun2.harvard.edu>
Subject   Re: st: Creating a mean of x entries in variable1 if there are at least x entries in variable1 by person id (Out of office: Vacation )
Date   Wed, 09 Oct 2013 00:05:25 -0500

To whom it may concern,
I am out of the office.
I will get back to your inquiry in the week of 9/10/2012.

>>> Roberto Ferrer <refp16@gmail.com> 10/09/13 00:04 >>>

*----------------------------- input -------------------------------------------
clear

input id       exposure        year
1                3                 1
1                3                 2
1                4                 3
1                3                 4
1                3                 5
2                4                 1
2                4                 2
2                3                 3
2                4                 4
3                1                 1
3                2                 2
3                3                 3
3                2                 4
4                1                 1
4                3                 2
end

*---------------------- what you want ------------------------------------------

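* number of observations (years) recorded for each id
bysort id: gen count = _N
* one row per id with the mean exposure, keeping only ids with 3 or more observations
collapse (mean) exposure if count >= 3, by(id)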

*-------------------------- end ------------------------------------------------
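
If the goal is a new variable alongside the original observations rather than a collapsed dataset, a minimal sketch along the following lines may work. It is not part of the original reply; it assumes the example data above are in memory and is run in place of the collapse step. The names n_obs and mean_exposure are illustrative only.

```stata
* number of observations (years) recorded for each id
* (n_obs is an illustrative name, not from the original reply)
bysort id: gen n_obs = _N

* group mean of exposure, filled in only for ids with 3 or more observations;
* ids with fewer observations (id 4 in the example) are left missing
egen mean_exposure = mean(exposure) if n_obs >= 3, by(id)
```

Because n_obs is constant within each id, the if condition keeps or drops whole groups, so the mean is taken over all of an id's exposure values.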

See also:
http://www.stata.com/support/faqs/data-management/number-of-distinct-observations/

On Wed, Oct 9, 2013 at 1:07 AM, Nicholas Winters
<nicholas.winters@mail.mcgill.ca> wrote:
> Sorry for the complicated subject, this is what my data looks like:
>
> person id       exposure        year
> 1                3                 1
> 1                3                 2
> 1                4                 3
> 1                3                 4
> 1                3                 5
> 2                4                 1
> 2                4                 2
> 2                3                 3
> 2                4                 4
> 3                1                 1
> 3                2                 2
> 3                3                 3
> 3                2                 4
>
> I want to take an average of 'exposure' if there are at least, say, 3 entries in 'year' per person id. So I want to end up with a new variable, mean_exposure, that looks like this:
> person id       mean_exposure
> 1                 3
> 2                 4
> 3                 3
> 4                 2
>
> I've tried by person id: egen mean_exposure=mean(exposure) if _____________
> but I want the mean of ALL values in exposure only if 'year' has at least x number of observations
> please help, I'm stumped...?
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/faqs/resources/statalist-faq/
> *   http://www.ats.ucla.edu/stat/stata/

