st: Linking Cross-Sectional and Time Series IDs

From   Paul
Subject   st: Linking Cross-Sectional and Time Series IDs
Date   Fri, 10 Aug 2012 15:20:51 -0400

I have repeated cross-sectional data at the individual*firm level,
where individuals may work at multiple firms and are linked within
years by a unique ID, which I'll refer to as crosssection_id.  Each
year the IDs are randomly assigned, so individuals are not linked over
time.  I've painstakingly linked individual*firm observations over
time based on characteristics such as experience and ethnicity,
generating what I'll call year_id.

However, the IDs I've created only link individual*firm observations
over time.  I'd like to now link the IDs so each individual, appearing
in multiple firms and years, has one ID.

I've devised the following (probably inefficient) method for linking IDs:

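sort year crosssection_id year_id
* start at 2 so that [`i'-1] always refers to an existing observation
forvalues i = 2/`=_N' {
    if crosssection_id[`i'] == crosssection_id[`i'-1] & year[`i'] == year[`i'-1] ///
        & year_id[`i'] != year_id[`i'-1] {
        * same person in the same year but two different IDs: merge them
        local newid = year_id[`i'-1]
        local oldid = year_id[`i']
        replace year_id = `newid' if year_id == `oldid'
    }
}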

My questions are:
Given that I've done the yearly linking as I have, is there a more
efficient method of linking these IDs?  I think my scheme is correct,
but it takes a very long time, since every merge runs a replace over
the whole dataset.
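
One possible alternative, sketched here under assumptions rather than as a
tested solution for the real data: treat the problem as finding connected
groups, and repeatedly push the smallest ID through both groupings (same
person within a year, same individual*firm link over time) until nothing
changes.  The names person_id, m1 and m2 and the toy data are made up for
illustration; the sketch assumes year_id is numeric.

```stata
* Toy data (hypothetical values) standing in for the real file
clear
input year crosssection_id year_id
2010 1 101
2010 1 102
2011 7 102
2011 7 103
2011 8 104
2012 3 104
2012 4 105
end

* Start from the existing individual*firm links
generate long person_id = year_id

* Propagate the smallest label through both groupings until stable
local changed = 1
while `changed' {
    quietly {
        * smallest label among rows for the same person within a year
        bysort year crosssection_id: egen long m1 = min(person_id)
        * smallest label among rows sharing an individual*firm link
        bysort year_id: egen long m2 = min(m1)
        * number of rows whose label moved in this pass
        count if m2 != person_id
        local changed = r(N)
        replace person_id = m2
        drop m1 m2
    }
}

sort person_id year
list, sepby(person_id)
```

Each pass costs a couple of sorts rather than one replace per merged pair,
and the loop stops once a pass changes no labels.  The number of passes
depends on how long the chains of linked IDs are, so whether this is faster
on the real data is something to check rather than assume.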

Would there have been a more efficient means of linking the IDs over
time?  I didn't do it at the individual level because within a firm
it seems less probable that two people would have the same
ethnicity, experience, etc.

Thanks for any advice!