Notice: On March 31, it was announced that Statalist is moving from an email list to a forum. The old list will shut down at the end of May, and its replacement, statalist.org is already up and running.



Re: st: excluding single measurements in time


From   Nick Cox <njcoxstata@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: excluding single measurements in time
Date   Wed, 27 Feb 2013 10:44:20 +0000

Assuming a -school- identifier and given your -syr- variable, the
diagnostic is this: once observations are sorted on -syr- within
-school-, a school with only one year represented has equal first and
last values of -syr-. So:

bysort school (syr) : drop if syr[1] == syr[_N]
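
To see the logic at work, here is a minimal sketch on made-up data. The
-school- and -pnr- values below are hypothetical and not taken from the
poster's dataset; only the variable names follow the thread. School 2 is
observed in 2010 only, so it should be the one that goes.

```stata
* Toy data (hypothetical values): school 2 has a single year, 2010
clear
input school pnr syr
1 101 2009
1 101 2010
1 102 2010
2 201 2010
2 202 2010
3 301 2010
3 301 2011
end

* Optional non-destructive check first: flag schools with more than one year
bysort school (syr) : gen byte multiyear = syr[1] != syr[_N]
tabulate school multiyear

* Within each school, sorted on syr, equal first and last values
* mean that only one year is represented, so drop those schools
bysort school (syr) : drop if syr[1] == syr[_N]

* Schools 1 and 3 should be the only ones left
tabulate school syr
```

Generating the flag first is a cautious design choice: it lets you inspect
which schools would go before anything is dropped. Note that missing
values of -syr- would sort last within a school; the -xtdescribe- output
in the question suggests there are none here, but that is an assumption
worth checking in your own data.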

This is also an FAQ:

FAQ     . . . . . .  Listing observations in a group that differ on a variable
        . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  N. J. Cox
        11/01   How do I list observations in a group that differ
                on a variable?
http://www.stata.com/support/faqs/data-management/listing-observations-in-group/

Don't be misled by the FAQ title. It does cover what I think is your
problem, except that you want to -drop-, not -list-, and you want
groups whose values are all the same, which the FAQ also covers.

Nick

On Wed, Feb 27, 2013 at 9:56 AM, John Singhammer <singhammer@gmail.com> wrote:

> I am investigating growth in physical fitness among 58,000 children in
> 206 schools.
>
> Individual-level data were collected in 2009, 2010 and 2011.
> Individuals are nested within classes, within schools.
> The schools volunteered to participate in the study and hence, data
> are unbalanced as can be seen from the output below.
>
> For a particular analysis, I want to select the schools (children
> within schools) with repeated measures. That is, I want to exclude
> schools with measurements at only a single point in time.
>
> How do I go about that in Stata?

> . xtdescribe
>
>      pnr:  137400, 347246, ..., 1.231e+10                    n =      58041
>      syr:  2009, 2010, ..., 2011                             T =          3
>            Delta(syr) = 1 unit
>            Span(syr)  = 3 periods
>            (pnr*syr uniquely identifies each observation)
>
> Distribution of T_i:   min      5%     25%       50%       75%     95%     max
>                          1       1       1         2         2       2       3
>
>      Freq.  Percent    Cum. |  Pattern
>  ---------------------------+---------
>     27306     47.05   47.05 |  .11
>     16079     27.70   74.75 |  ..1
>     11318     19.50   94.25 |  .1.
>      2109      3.63   97.88 |  111
>       597      1.03   98.91 |  1.1
>       346      0.60   99.51 |  1..
>       286      0.49  100.00 |  11.
>  ---------------------------+---------
>     58041    100.00         |  XXX

