Bookmark and Share

Notice: On March 31, it was announced that Statalist is moving from an email list to a forum. The old list will shut down on April 23, and its replacement, statalist.org is already up and running.


[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: st: conditional merging


From   Jeph Herrin <stata@spandrel.net>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: conditional merging
Date   Wed, 07 Nov 2012 16:09:31 -0500

how about:

 u sales, clear
 merge 1:m home_id using condition
 gen lag = sale_year-inspection_year
 bys id_home : egen minimum_lag=min(lag)
 gen sale_condition if lag==minimum_lag & lag>=0
 bys home_id (sale_condition) : keep if _n==1

?
this should keep one record per home_id, with closest inspection_year prior to the sale date and the corresponding sale_condition.

hth,
jeph

On 11/6/2012 2:33 PM, Ben Hoen wrote:
I have two files sales.dta and condition.dta.  sales.dta has two variables
(home_id saleyear), and condition.dta has three variables (home_id
inspection_year condition).  The variable inspection_year can take the vales
of 2000-2011 for any home but for many homes only some years are present (in
many years the home was not inspected. Therefore a sample of the data might
look like:

home_id	inspection_year	condition
50121		2002			4
50121		2006			4
50121		2011			3
50681		2004			2
50681		2010			3
51040		2006			2
51040		2010			2
51040		2011			3

I would like to populate the sales.dta file with the condition of the parcel
in the inspection_year that is the closest to, but not following the
sale_year.

So, for example, the following dataset would result

home_id	sale_year	condition
50121		2007		4
50121		2011		3
50681		2008		2
51040		2003		.
51040		2010		3

I know I am not the first person to have this problem, but could not find
threads on this.  Maybe I am using the wrong search terms.  Any help would
be greatly appreciated.

(As I wrote this I realized one not as elegant work-around would be to
fill-in missing data for each missing year in the condition.dta file,
potentially using the user-written "carryforward" or even imputing the data
using, e.g., mi impute, and then matching home_id sale_year to home_id
inspection_year.)

Thanks, in advance!

Ben Hoen
Principal Research Associate
Lawrence Berkeley National Laboratory
Office: 845-758-1896
Cell: 718-812-7589
bhoen@lbl.gov
http://emp.lbl.gov/staff/ben-hoen

Visit our publications at:
http://emp.lbl.gov/publications





*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/faqs/resources/statalist-faq/
*   http://www.ats.ucla.edu/stat/stata/


*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/faqs/resources/statalist-faq/
*   http://www.ats.ucla.edu/stat/stata/


© Copyright 1996–2014 StataCorp LP   |   Terms of use   |   Privacy   |   Contact us   |   Site index