
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.



Re: st: Looping/Searching across Rows and Columns


From   Sergiy Radyakin <[email protected]>
To   "[email protected]" <[email protected]>
Subject   Re: st: Looping/Searching across Rows and Columns
Date   Mon, 18 Nov 2013 20:41:45 -0500

Nanlesta,
may I suggest reshaping your ID reference file from wide to long
(reshape long)? I think you can arrange it as follows:

roundid    uniqueid
1      201
2      201
3      201
4      202
5      202
....
66    213
66    217

Then, when you merge, match on roundid and pull in uniqueid. That
will work as long as your ids were not reused for other people in other
rounds, which I assume is true; otherwise you would need some
applicability constraints for your 5 reference ids.

With this you will need minimal coding, relying on Stata's standard
commands (reshape and merge) and avoiding explicit loops.
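
A minimal sketch of that reshape-and-merge approach, using the three example rows from the original post (quoted below). The names id0-id2, slot, and lookup are hypothetical choices for illustration, and the sketch assumes the ids are stored as strings (to keep the leading zeros) and that no id is ever reused for a different person:

```stata
* Reference file in wide form, as in the original post.
* Ids are strings so that leading zeros are preserved (assumption).
clear
input long newid str5 currentid str5 altid1 str5 altid2
7391 "01203" "01202" "01209"
7438 "01377" "01379" "01413"
7454 "01405" "01415" "01503"
end

* Give every id column a common stub so reshape can stack them.
rename currentid id0
rename altid1    id1
rename altid2    id2

* Wide -> long: one row per (person, id ever used).
reshape long id, i(newid) j(slot)
drop if missing(id)
drop slot
duplicates drop

* Stop here if any id maps to more than one person.
rename id currentid
isid currentid

tempfile lookup
save `lookup'

* Round X data, as in the original post.
clear
input str5 currentid
"01202"
"01203"
"01377"
"01405"
"01415"
end

* Pull newid for whatever id the person carried in this round.
merge m:1 currentid using `lookup', keep(master match) nogenerate
list currentid newid
```

In this toy example, 01202 and 01203 both receive newid 7391, and 01405 and 01415 both receive 7454, so duplicates within a round can then be found with something like `duplicates tag newid, generate(dup)`. With more alternate ids, only the rename step grows.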

Hope this helps.
Best, Sergiy Radyakin

On Mon, Nov 18, 2013 at 6:51 PM, Nanlesta Pilgrim <[email protected]> wrote:
> Dear all,
> I would be grateful for help in writing looping code for
> the following situation (or for references that I can use to write the
> code).  I am working with a large longitudinal data set (approx. 25
> years) in which some participants' ids were not kept constant over
> time (due to movement in and out of communities and households). Ids
> are created based on the community and household one is living in at that
> survey round. I do have all the ids that a participant may have had
> over the time period in a separate file.  However, in a given round of
> the data, an individual might be present two or more times but under
> different ids.  I'm trying to reconcile this issue. Thus far, I've
> linked the file containing all the ids that a person might have had
> over time and have created a unique id for the individual.  The file
> looks like this:
>
> newid currentid    altid1     altid2
> 7391    01203      01202    01209
> 7438    01377      01379    01413
> 7454    01405      01415    01503
>
> newid: a unique id I created for each participant when reshaping the
> file from long format to wide format
> currentid:  id for the current round
> altid1: alternate id used at some previous round
> altid2: alternate id used at some previous round
>
> Note that a person can have up to 5 alternate ids.
>
> Where the actual data is stored, I can merge on currentid.
>
> Example of round X datafile.
> currentid
>  01202
>  01203
>  01377
>  01405
>  01415
>
> When I merge my created data file with the round X data file using
> currentid, I get:
>
> newid currentid    altid1     altid2
> .          01202         .             .
> 7391    01203      01202    01209
> 7438    01377      01379    01413
> 7454    01405      01415    01503
> .          01415          .           .
>
> What I would like to create is code that looks across the columns
> and rows to identify which records belong to the same person and
> indicates this by placing "newid" next to that individual.
> For example: set newid=7391 if any of the alternate ids (01202 or
> 01209) also appears as a currentid, essentially searching up to 5
> columns of data but many rows.
>
> Is this feasible?  Is there an alternative solution that would not be
> extremely time consuming given the number of rounds of data?
>
> With thanks!
> Nanlesta
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/faqs/resources/statalist-faq/
> *   http://www.ats.ucla.edu/stat/stata/

