Re: st: [merging US industry level data]
You don't give a whole lot of information about your data-set, but there
are a few things that can be said.
1) You need to generate the same industry level variable in both
data-sets, i.e. you need to generate a 3-digit level industry code
inside the data-set that holds the 4-digit level data (let's call this
data-set the 'master' data-set).
It is not clear how the 4-digit and the 3-digit industry variables
relate to each other, but let's assume that you can simply cut off the
last digit of the 4-digit variable to derive the 3-digit variable (e.g.,
codes 1230 to 1239 at the 4-digit level correspond with code 123 at the
3-digit level).
Assuming this, as well as that your 4-digit level industry variable is
coded in integers (and called industry_4d), you could get the 3-digit
level variable with something like this:
gen int industry_3d = real(substr(string(industry_4d),1,3))
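One caveat with the string approach: it keeps the first three characters, so a code stored with fewer than four digits (say 123, standing for 0123) comes through unchanged instead of being cut down to 12. If that could happen in your data, integer division is a possible alternative. A small sketch with made-up codes (the variable names industry_3d_str and industry_3d_div are only for illustration, and I am assuming non-negative integer codes):

```stata
* Toy data: three hypothetical 4-digit codes, the last one stored without its leading zero
clear
input int industry_4d
1230
1239
123
end

* String-based version, as in the line above
generate int industry_3d_str = real(substr(string(industry_4d), 1, 3))

* Arithmetic version: drop the last digit by dividing by 10 and rounding down
generate int industry_3d_div = floor(industry_4d/10)

* Rows where the two versions disagree are codes with fewer than four digits
list if industry_3d_str != industry_3d_div
```

Both versions give 123 for the first two rows; only the third row differs (123 from the string version, 12 from the division). Which of the two is right depends on how your codes are actually stored, which I cannot tell from here.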
In your other data-set, you also need to have a variable that is called
"industry_3d" (and you need to make sure that it is equivalently coded,
of course - which I assumed above).
2) Depending on what type of merge you want to do, you probably need to
sort both data-sets by the identifier variables (the variables you want
to merge on). Assuming you want to merge on, say, "year" and
"industry_3d", you would need to sort both data-sets by "year industry_3d".
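A rough sketch of the sorting step, assuming the files are called master.dta and using.dta, that master.dta already contains industry_3d, and that the 3-digit file has exactly one row per year and industry (all of that is my assumption, not something you told us):

```stata
* 3-digit data: check that year and industry_3d identify the rows, then sort and save
use using.dta, clear
* isid stops with an error if the combination is not unique
isid year industry_3d
sort year industry_3d
save using.dta, replace

* 4-digit (master) data: sort on the same variables and save
use master.dta, clear
sort year industry_3d
save master.dta, replace
```

Note that "save, replace" overwrites the files on disk, so you may prefer to save under new names.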
3) Then you can merge, along the following lines:
use master.dta, clear
merge year industry_3d using using.dta
(where the data-set with the original 3-digit level industry variable
is called "using.dta").
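If you are on Stata 11 or later, the merge command wants the type of match spelled out. A sketch, assuming several 4-digit industries in the master map to a single year and industry_3d row in using.dta (a many-to-one merge):

```stata
use master.dta, clear

* m:1 = many master rows per key, one using row per key
merge m:1 year industry_3d using using.dta

* _merge shows which rows matched and which came from only one file
tabulate _merge
```

With this syntax the data-sets do not have to be sorted beforehand, and merge stops with an error if year and industry_3d do not uniquely identify the rows in using.dta.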
Mine is a very preliminary question. I am working with US industry level
data and I want to merge variables from the 4-digit industry level to the
3-digit level, and also create a variable for the 3-digit level.
Could anybody help me with that?