
From: Phil Schumm <pschumm@uchicago.edu>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Merge by range of values
Date: Mon, 13 Jun 2011 19:57:26 -0500

On Jun 13, 2011, at 4:49 PM, Jeremy A. Grey wrote:

I am trying to find a way to merge data sets according to a range of values, sort of a combination of m:1 merge and inrange().

In one data set, each observation represents a subject with the individual's value for variable X.

In another data set, each observation represents a range of values for variable X. The start and end values of the range are separate variables, such as start_X and end_X. The remaining variables contain the values of Y and Z for all values of X within that range.

Is there a way to merge the Y and Z data from the second data set into the first by comparing the value of X to the range specified by start_X and end_X?

I thought of transforming the second data set in order to create new variables, such as start_X_1, end_X_1, Y_1, Z_1, start_X_2, end_X_2, Y_2, Z_2, etc., adding those data to each observation in the first dataset, and using a loop and inrange() in order to compute Y and Z for each subject, but there are about 3,000,000 different ranges of X in the second data set, so this is impractical.

Here is the first approach:

    use dataset1
    cross using dataset2
    keep if inrange(x,start_x,end_x)

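Here is a second approach, which avoids forming the full cross product:

    use dataset1
    * place the range bounds alongside the subjects, row by row
    merge 1:1 _n using dataset2, keepusing(start_x end_x) nogen
    gen start = .
    gen end = .
    forv i=1/`c(N)' {
        * stop once the ranges from dataset2 are exhausted
        if mi(start_x[`i']) continue, break
        * record the bounds of range i for every subject whose x falls in it
        replace start = start_x[`i'] if inrange(x,start_x[`i'],end_x[`i'])
        replace end = end_x[`i'] if inrange(x,start_x[`i'],end_x[`i'])
    }
    drop start_x end_x
    ren start start_x
    ren end end_x
    * bring in Y and Z by matching on the recorded bounds
    merge m:1 start_x end_x using dataset2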

