Statalist The Stata Listserver


Re: st: how to keep maximum value


From   "Neil Shephard" <nshephard@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: how to keep maximum value
Date   Mon, 25 Sep 2006 09:48:08 +0800

On 9/25/06, FUKUGAWA, N. <nfukugawa@gmail.com> wrote:
> Dear all,
> Suppose we have data as follows,
>
> year    var1
> 1997    14
> 1997    32
> 1997    19
> 1998    18
> 1998    42
> 1998    50
> 1999    3
> 1999    23
> 1999    37
>
> we want to keep observations by year if var1 is maximum within
> the same year.
>
> year    var1
> 1997    32
> 1998    50
> 1999    37

There are two possible approaches, and which you use depends on
whether there are other variables in your data set which you wish to
retain.

1) No other variables need to be retained.

In this instance you can simply -collapse- your dataset...

. list

    +-------------+
    | year   var1 |
    |-------------|
 1. | 1997     14 |
 2. | 1997     32 |
 3. | 1997     19 |
 4. | 1998     18 |
 5. | 1998     42 |
    |-------------|
 6. | 1998     50 |
 7. | 1999      3 |
 8. | 1999     23 |
 9. | 1999     37 |
    +-------------+

. collapse (max) var1, by(year)

. list

    +-------------+
    | year   var1 |
    |-------------|
 1. | 1997     32 |
 2. | 1998     50 |
 3. | 1999     37 |
    +-------------+
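
Note that -collapse- replaces the data in memory with the collapsed
version. A minimal sketch, assuming you want to look at the yearly
maxima and then return to the full dataset, wraps the call in
-preserve- and -restore-. The data are re-entered here with -input-
only to make the example self-contained.

```stata
* Re-enter the example data so the sketch runs on its own
clear
input year var1
1997 14
1997 32
1997 19
1998 18
1998 42
1998 50
1999 3
1999 23
1999 37
end

* Take a snapshot of the data before collapsing
preserve

* One observation per year, holding the maximum of var1
collapse (max) var1, by(year)
list

* Bring back the original observations
restore
```

After -restore- the original observations are back in memory, so this
variant only suits inspecting the maxima, not keeping them.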

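2) Other variables need to be retained.

In this instance you can generate the within-year maximum with -egen-,
keep the observations that match it, and then drop the temporary
variable...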

. list

    +---------------------------+
    | year   var1   var2   var3 |
    |---------------------------|
 1. | 1997     14      a    .12 |
 2. | 1997     32      b    .14 |
 3. | 1997     19      c    .15 |
 4. | 1998     18      a    .09 |
 5. | 1998     42      b     .1 |
    |---------------------------|
 6. | 1998     50      b    .16 |
 7. | 1999      3      c    .11 |
 8. | 1999     23      a    .12 |
 9. | 1999     37      a     .1 |
    +---------------------------+

. bysort year: egen t = max(var1)

. list

    +--------------------------------+
    | year   var1   var2   var3    t |
    |--------------------------------|
 1. | 1997     14      a    .12   32 |
 2. | 1997     32      b    .14   32 |
 3. | 1997     19      c    .15   32 |
 4. | 1998     18      a    .09   50 |
 5. | 1998     42      b     .1   50 |
    |--------------------------------|
 6. | 1998     50      b    .16   50 |
 7. | 1999      3      c    .11   37 |
 8. | 1999     23      a    .12   37 |
 9. | 1999     37      a     .1   37 |
    +--------------------------------+

. keep if(var1 == t)
(6 observations deleted)

. drop t

. list

    +---------------------------+
    | year   var1   var2   var3 |
    |---------------------------|
 1. | 1997     32      b    .14 |
 2. | 1998     50      b    .16 |
 3. | 1999     37      a     .1 |
    +---------------------------+

(Although I suspect this code could probably be condensed, it's
hopefully illustrative.)
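
One possible condensed form, offered as a sketch rather than a drop-in
replacement: sort by var1 within year and keep the last observation of
each group. It differs from the -egen- approach in two ways that are
assumptions about your data. If several observations tie for the
maximum, only one of them is kept, and because Stata sorts numeric
missing values last, a missing var1 would be treated as the largest.

```stata
* Re-enter the example data so the sketch runs on its own
clear
input year var1 str1 var2 var3
1997 14 a .12
1997 32 b .14
1997 19 c .15
1998 18 a .09
1998 42 b .1
1998 50 b .16
1999 3 c .11
1999 23 a .12
1999 37 a .1
end

* Within each year, sort on var1 and keep the last (largest) observation
bysort year (var1): keep if _n == _N
list
```

If ties should all be retained, the two-step -egen- version above is
the safer choice.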

Neil
--
()  ascii ribbon campaign - against html mail
/\                        - against microsoft attachments
(www.gnu.org/philosophy/no-word-attachments.html)

Email - nshephard@gmail.com / neilshep@cyllene.uwa.edu.au
Website - http://slack.ser.man.ac.uk/