



Re: st: How to drop individuals from sample?


From   Nick Cox <njcoxstata@gmail.com>
To   "statalist@hsphsun2.harvard.edu" <statalist@hsphsun2.harvard.edu>
Subject   Re: st: How to drop individuals from sample?
Date   Wed, 25 Jul 2012 10:41:22 +0100

In addition, -egen- is happy with arguments such as -total(event == 1)-, so the code can be simplified by not creating an indicator variable.
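
A minimal sketch of that simplification, assuming (as in the code quoted below) a numeric variable event in which 1 codes "married". The five records typed in with -input- are made up for illustration only:

```stata
clear
input person event
1 1
1 2
1 1
2 1
2 3
end

* count marriages per person directly; no indicator variable is needed
* because total() accepts the expression event == 1
bysort person : egen tot_married = total(event == 1)

list, sepby(person)
```

The expression event == 1 evaluates to 1 or 0 on each record, so total() adds up the marriages within each person.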

Nick

On 25 Jul 2012, at 08:25, Maarten Buis <maartenlbuis@gmail.com> wrote:

On Wed, Jul 25, 2012 at 8:54 AM, Dheepan Ratha Krishnan wrote:
I am looking for a Stata command that will enable me to exclude
individuals who have experienced a particular life event more than
once in my dataset.

That depends on how your data is structured. I will assume that the
data has the following structure: Each event has its own record, so
the same individual can have multiple records. The type of event is
stored in a separate variable, for example:

person  event_date  event_type  other_vars
1       02Jan2012   married     ...
1       03Jan2012   divorced    ...
1       04Jan2012   married     ...
2       01Feb1950   married     ...
2       01Jun2012   widowed     ...

In this case you would want to ignore person 1, as he married twice (in
three days...). Typically the event type is stored not as a string, as
in event_type above, but as a numeric variable with value labels
attached. Say that variable is called event, that 1 is married, 2 is
divorced, and 3 is widowed, and that you are interested in the event
married. Then you type:

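* indicator: 1 if this record is a marriage, 0 otherwise
gen byte married = event == 1
* number of marriages per person, repeated on each of that person's records
bys person : egen tot_married = total(married)
drop married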

The variable tot_married counts the number of times a person got
married, so you want to ignore all persons with tot_married > 1. To do
so, add the condition -if tot_married < 2- to any subsequent command.
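
As a rough illustration, here is the same approach run on a typed-in version of the example records, assuming the numeric coding 1 = married, 2 = divorced, 3 = widowed. The last line is an optional extra that is only appropriate if the persons really should be removed from the data in memory rather than skipped by an -if- condition:

```stata
clear
input person event
1 1
1 2
1 1
2 1
2 3
end

gen byte married = event == 1
bysort person : egen tot_married = total(married)
drop married

* option 1: restrict a single command to persons who married at most once
count if tot_married < 2

* option 2: remove persons with more than one marriage from the dataset
drop if tot_married > 1
```

With these records, person 1 has tot_married equal to 2 and is removed, while both records of person 2 remain.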

This way you ignore person 1 entirely. But say you want to use the
first marriage of person 1, just not any subsequent marriages. Then you
type:

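* indicator: 1 if this record is a marriage, 0 otherwise
gen byte married = event == 1
* running count of marriages within person, in order of event_date
bys person (event_date) : gen tot_married = sum(married)
drop married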
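
A sketch of how this running count behaves on the example records, again assuming the coding 1 = married, 2 = divorced, 3 = widowed. The string variable date_str is a hypothetical helper used only to type the dates in. Here sum() is a running sum, so tot_married reaches 2 only at the second marriage, and the condition -if tot_married < 2- keeps each person's records up to, but not including, that second marriage:

```stata
clear
input person str9 date_str event
1 "02Jan2012" 1
1 "03Jan2012" 2
1 "04Jan2012" 1
2 "01Feb1950" 1
2 "01Jun2012" 3
end

* convert the typed-in strings to Stata daily dates
gen event_date = date(date_str, "DMY")
format event_date %td
drop date_str

gen byte married = event == 1
* running count of marriages within person, ordered by date
bysort person (event_date) : gen tot_married = sum(married)
drop married

* records before any second marriage
list person event_date event tot_married if tot_married < 2, sepby(person)
```

For person 1 the running count is 1, 1, 2, so the first marriage and the divorce are listed and the second marriage is not.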

To read more about this see:
http://www.stata.com/support/faqs/data-management/true-and-false/
help egen
help sum()
help if

