Bookmark and Share

Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

RE: st: conditional merging


From   "Ben Hoen" <[email protected]>
To   <[email protected]>
Subject   RE: st: conditional merging
Date   Thu, 8 Nov 2012 11:44:21 -0500

Jeph & Robert,

Taking both of your suggestions and making slight alterations I was able to
get two sets of code working fine.

Thank you!

Ben

***********************************************************

clear
input home_id inspyear condition
 50121 2002 4
 50121 2006 4
 50121 2011 3
 50681 2004 2
 50681 2010 3
 51040 2006 2
 51040 2010 2
 51040 2011 3
end

tempfile condition
save "`condition'"

clear
input home_id saleyear
 50121 2007
 50121 2011
 50681 2008
 51040 2003
 51040 2010
end

tempfile sales
save "`sales'"

*========= jeph's method
joinby home_id using "`condition'"
gen lag = saleyear-inspyear
bys home_id saleyear: egen maxlag=max(lag)
bys home_id saleyear: replace condition=. if maxlag<0 & _n==1
bys home_id saleyear: replace inspyear=. if maxlag<0 & _n==1
drop if lag<0 & condition!=.
bys home_id saleyear: egen minlag=min(lag)
bys home_id saleyear: keep if minlag==lag|condition==.
drop lag minlag maxlag
list

*====== robert's method
use "`sales'", clear
append using "`condition'"

gen year = cond(mi(saleyear), inspyear, saleyear)
sort home_id year inspyear
by home_id: replace condition = condition[_n-1] if mi(condition)
by home_id: replace inspyear = inspyear[_n-1] if mi(inspyear)
keep if !mi(saleyear)
keep home_id saleyear condition inspyear
list



Ben Hoen
LBNL
Office: 845-758-1896
Cell: 718-812-7589


-----Original Message-----
From: [email protected]
[mailto:[email protected]] On Behalf Of Jeph Herrin
Sent: Wednesday, November 07, 2012 4:10 PM
To: [email protected]
Subject: Re: st: conditional merging

how about:

  u sales, clear
  merge 1:m home_id using condition
  gen lag = sale_year-inspection_year
  bys id_home : egen minimum_lag=min(lag)
  gen sale_condition if lag==minimum_lag & lag>=0
  bys home_id (sale_condition) : keep if _n==1

?
this should keep one record per home_id, with closest inspection_year 
prior to the sale date and the corresponding sale_condition.

hth,
jeph

On 11/6/2012 2:33 PM, Ben Hoen wrote:
> I have two files sales.dta and condition.dta.  sales.dta has two variables
> (home_id saleyear), and condition.dta has three variables (home_id
> inspection_year condition).  The variable inspection_year can take the
vales
> of 2000-2011 for any home but for many homes only some years are present
(in
> many years the home was not inspected. Therefore a sample of the data
might
> look like:
>
> home_id	inspection_year	condition
> 50121		2002			4
> 50121		2006			4
> 50121		2011			3
> 50681		2004			2
> 50681		2010			3
> 51040		2006			2
> 51040		2010			2
> 51040		2011			3
>
> I would like to populate the sales.dta file with the condition of the
parcel
> in the inspection_year that is the closest to, but not following the
> sale_year.
>
> So, for example, the following dataset would result
>
> home_id	sale_year	condition
> 50121		2007		4
> 50121		2011		3
> 50681		2008		2
> 51040		2003		.
> 51040		2010		3
>
> I know I am not the first person to have this problem, but could not find
> threads on this.  Maybe I am using the wrong search terms.  Any help would
> be greatly appreciated.
>
> (As I wrote this I realized one not as elegant work-around would be to
> fill-in missing data for each missing year in the condition.dta file,
> potentially using the user-written "carryforward" or even imputing the
data
> using, e.g., mi impute, and then matching home_id sale_year to home_id
> inspection_year.)
>
> Thanks, in advance!
>
> Ben Hoen
> Principal Research Associate
> Lawrence Berkeley National Laboratory
> Office: 845-758-1896
> Cell: 718-812-7589
> [email protected]
> http://emp.lbl.gov/staff/ben-hoen
>
> Visit our publications at:
> http://emp.lbl.gov/publications
>
>
>
>
>
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/faqs/resources/statalist-faq/
> *   http://www.ats.ucla.edu/stat/stata/
>
>
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/faqs/resources/statalist-faq/
*   http://www.ats.ucla.edu/stat/stata/

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/faqs/resources/statalist-faq/
*   http://www.ats.ucla.edu/stat/stata/


© Copyright 1996–2018 StataCorp LLC   |   Terms of use   |   Privacy   |   Contact us   |   Site index