



st: find last location across a set of records for a person


From   Cathy Antonakos <cathya@umich.edu>
To   statalist@hsphsun2.harvard.edu
Subject   st: find last location across a set of records for a person
Date   Fri, 15 Jun 2012 13:24:02 -0400

I am working with a file that contains multiple records per person
indicating residential locations over a number of years. For instance, the
example below contains 3 people with 3 locations each. The code of "1" under
"yr95", "yr96", "yr97" indicates that I have Census data for the person for
the specific year at the specific location. The Census data from the
person's last location will be used to create another variable I'm after. So
I'm trying to figure out an efficient way to identify the last valid
location and the year when it occurred for each person in the data set.

   id   location   yr95   yr96   yr97
    1          1      1      .      .
    1          2      .      1      .
    1          3      .      .      1
    2          1      1      1      .
    2          2      .      1      .
    2          3      .      .      1
    3          1      1      .      .
    3          2      .      1      .
    3          3      .      .      .

Aggregated to person level by taking the sum of indicators for each year,
the data would look like this:

   id   yr95   yr96   yr97
    1      1      1      1
    2      1      2      1
    3      1      1      0
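
A minimal sketch of that aggregation, assuming the variable names shown in the example table (id, location, yr95, yr96, yr97). The -input- block simply re-enters the example data so the snippet runs on its own; it is not the real file.

```stata
* Re-enter the example data from the table above (illustration only)
clear
input id location yr95 yr96 yr97
1 1 1 . .
1 2 . 1 .
1 3 . . 1
2 1 1 1 .
2 2 . 1 .
2 3 . . 1
3 1 1 . .
3 2 . 1 .
3 3 . . .
end

* Collapse temporarily to one record per person, then return to the
* original multiple-record file
preserve
collapse (sum) yr95 yr96 yr97, by(id)
list, noobs
restore
```

Because -collapse (sum)- treats missing values as zero, a person with no data in a given year gets 0 for that year, as for id 3 in yr97.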

My next step would be to create a pattern variable to bring back into the
main data set (the one with multiple records per case). I would use the
pattern variable to identify the last location or locations for the
individual. The pattern variables for the data above would look like this:

   id   pattern
    1      111
    2      121
    3      110
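
One way the pattern variable might be built and brought back into the main file is sketched below. It assumes the variable names from the example table; the tempfile name patterns is hypothetical, and the approach assumes no yearly count exceeds 9, so that each year occupies exactly one character of the pattern.

```stata
* Re-enter the example data (illustration only)
clear
input id location yr95 yr96 yr97
1 1 1 . .
1 2 . 1 .
1 3 . . 1
2 1 1 1 .
2 2 . 1 .
2 3 . . 1
3 1 1 . .
3 2 . 1 .
3 3 . . .
end

preserve
collapse (sum) yr95 yr96 yr97, by(id)

* Concatenate the yearly counts into one string, one character per year
gen pattern = ""
foreach v of varlist yr95 yr96 yr97 {
    replace pattern = pattern + string(`v')
}

keep id pattern
tempfile patterns
save `patterns'
restore

* Attach each person's pattern to every one of that person's records
merge m:1 id using `patterns', nogenerate
list id location pattern, sepby(id) noobs
```

Keeping the pattern as a string rather than a number preserves leading zeros, which matters for patterns such as 00111100.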

The problem is that there are actually 8 years of data, and some cases have
runs of records with missing information about location. For instance,
examples of patterns include:

00111100  *here, I want the location in the 6th year to serve as the last location*
11121120  *here, I'd use the 7th year; both locations in that year, actually*
12111121

Other than a temporary -collapse- (using -preserve- and -restore-), I want
to maintain the original multiple-records-per-case file, because people
sometimes live at more than one location in a single year and I will need to
use data from each location within that "last location" year to generate
analysis variables. Also, analysis of data associated with the person's last
location is just one part of the work I'm doing.
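
A possible alternative that never leaves the multiple-record file is sketched here: find, for each person, the position of the last year with any indicator equal to 1, then flag every record that has data in that year. The names recpos, lastpos, and lastloc are hypothetical, and the year variables are assumed to be listed in chronological order in each loop.

```stata
* Re-enter the example data (illustration only)
clear
input id location yr95 yr96 yr97
1 1 1 . .
1 2 . 1 .
1 3 . . 1
2 1 1 1 .
2 2 . 1 .
2 3 . . 1
3 1 1 . .
3 2 . 1 .
3 3 . . .
end

* Position (1 = first year) of the latest year with data on each record;
* later years overwrite earlier ones
gen recpos = .
local k = 0
foreach v of varlist yr95 yr96 yr97 {
    local ++k
    replace recpos = `k' if `v' == 1
}

* Latest such position across all of a person's records
bysort id: egen lastpos = max(recpos)

* Flag every record with data in the person's last year, so that two
* locations in that year are both kept
gen byte lastloc = 0
local k = 0
foreach v of varlist yr95 yr96 yr97 {
    local ++k
    replace lastloc = 1 if `v' == 1 & lastpos == `k'
}

list id location lastpos lastloc, sepby(id) noobs
```

With 8 years of data, the same loops would run over all eight indicator variables. Working with positions rather than the two-digit suffixes avoids trouble if the years cross a century boundary, and leading years with no data (as in 00111100) do not affect the result because -egen- ignores missing values when taking the maximum.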

I hope I'm providing enough information. Many thanks for any help you can
provide.

Cathy
