



Re: st: RE: find last location across a set of records for a person


From   Cathy Antonakos <cathya@umich.edu>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: RE: find last location across a set of records for a person
Date   Fri, 15 Jun 2012 15:48:35 -0400

Thank you so much. This looks like what I need.

Cathy

On Fri, Jun 15, 2012 at 2:47 PM, Nick Cox <n.j.cox@durham.ac.uk> wrote:
> Consider this. It hinges on locations being positive integers. I would have used -egen, max()- to simplify the code, but it appears to have been broken by the last update. (StataCorp are working on this right now.)
>
> The main idea is to work backwards. In this example, the location in 97 is used if there was one; if not, we look at 96; and so forth.
>
> clear
> input id   location   yr95   yr96   yr97
>    1          1      1      .      .
>    1          2      .      1      .
>    1          3      .      .      1
>    2          1      1      1      .
>    2          2      .      1      .
>    2          3      .      .      1
>    3          1      1      .      .
>    3          2      .      1      .
>    3          3      .      .      .
> end
>
> quietly {
>
>        * work = location if the record has data for 97, else 0
>        gen work = !missing(yr97) * location
>        * sort so the largest value is last within id; use it if nonzero
>        bysort id (work) : gen lastloc = work[_N] if work[_N]
>        gen lastyr = 97 if !missing(lastloc)
>
>        * step back one year at a time, filling in only ids still unresolved
>        forval y = 96(-1)95 {
>                replace work = !missing(yr`y') * location
>                bysort id (work) : replace work = work[_N]
>                by id: replace lastyr = `y' if work & missing(lastyr)
>                by id: replace lastloc = work if work & missing(lastloc)
>        }
>
>        sort id location
> }
>
> list
>
>
>
> Nick
> n.j.cox@durham.ac.uk
>
>
> -----Original Message-----
> From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Cathy Antonakos
> Sent: 15 June 2012 18:24
> To: statalist@hsphsun2.harvard.edu
> Subject: st: find last location across a set of records for a person
>
> I am working with a file that contains multiple records per person
> indicating residential locations over a number of years. For instance, the
> example below contains 3 people with 3 locations each. The code of "1" under
> "yr95", "yr96", "yr97" indicates that I have Census data for the person for
> the specific year at the specific location. The Census data from the
> person's last location will be used to create another variable I'm after. So
> I'm trying to figure out an efficient way to identify the last valid
> location and the year when it occurred for each person in the data set.
>
>    id   location   yr95   yr96   yr97
>     1          1      1      .      .
>     1          2      .      1      .
>     1          3      .      .      1
>     2          1      1      1      .
>     2          2      .      1      .
>     2          3      .      .      1
>     3          1      1      .      .
>     3          2      .      1      .
>     3          3      .      .      .
>
> Aggregated to person level by taking the sum of indicators for each year,
> the data would look like this:
>
>    id   yr95   yr96   yr97
>     1      1      1      1
>     2      1      2      1
>     3      1      1      0
>
> My next step would be to create a pattern variable to bring back into the
> main data set (the one with multiple records per case). I would use the
> pattern variable to identify the last location or locations for the
> individual. The pattern variables for the data above would look like this:
>
>    id   pattern
>     1      111
>     2      121
>     3      110
>
> The problem is that there are actually 8 years of data, and some cases have
> runs of records with missing information about location. For instance,
> examples of patterns include:
>
> 00111100  *here, I want to capture the 6th location to use as last location*
> 11121120  *here, I'd use the 7th location -- both locations in that year, actually*
> 12111121
>
> Other than a temporary -collapse- (using -preserve- and -restore-), I want
> to maintain the original multiple-record-per case file, because people
> sometimes live at more than one location in a single year and I will need to
> use data from each location within that "last location" year to generate
> analysis variables. Also, analysis of data associated with the person's last
> location is just one part of the work I'm doing.
>
> I hope I'm providing enough information. Many thanks for any help you can
> provide.
>
> Cathy
>
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/

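
The original question notes that a person can have more than one location in the last year with data, and that every record from that year is needed. What follows is a minimal sketch of one way to flag all such records while keeping the multiple-record file intact. It assumes, as in the example data, that a non-missing value in yr95-yr97 means data exist for that record and year. The variable names rowlast, lastyr and inlast are illustrative choices, not names from the thread. The sketch relies only on -by- and explicit subscripting, so it does not need -egen, max()-.

```stata
clear
input id location yr95 yr96 yr97
1 1 1 . .
1 2 . 1 .
1 3 . . 1
2 1 1 1 .
2 2 . 1 .
2 3 . . 1
3 1 1 . .
3 2 . 1 .
3 3 . . .
end

* Latest year with data on each record (0 if the record has none).
* Looping upwards means later years overwrite earlier ones.
gen rowlast = 0
forval y = 95/97 {
    replace rowlast = `y' if !missing(yr`y')
}

* Latest year with data for each person: sort within id and take
* the last (largest) value; leave it missing if that value is 0.
bysort id (rowlast) : gen lastyr = rowlast[_N] if rowlast[_N]

* Flag every record that has data in the person's last year,
* so ties (several locations in that year) are all kept.
gen byte inlast = 0
forval y = 95/97 {
    replace inlast = 1 if lastyr == `y' & !missing(yr`y')
}

sort id location
list id location lastyr inlast, sepby(id)
```

For the full data with 8 years, the two -forval- ranges would need to cover all of the year variables. Unlike the approach quoted above, this does not depend on locations being positive integers, because the location value itself is never used as a flag.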


© Copyright 1996–2014 StataCorp LP