Statalist



Re: st: matching databases


From   "Rafal Raciborski" <rraciborski@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: matching databases
Date   Mon, 25 Aug 2008 13:13:55 -0400

Search for "concordance files"; for example:

http://www.macalester.edu/research/economics/page/haveman/Trade.Resources/tradeconcordances.html
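
Once a concordance is in hand, the two rules described in the quoted message can be sketched as follows. This is only an illustration in plain Python, not a tested recipe: the function name `convert` and the dictionary layout are made up for this example, the numbers are the year-95 rows and the correspondences from the quoted message, and a missing value (Stata's ".") is assumed to be stored as None. Additive variables (v1, v2) are split equally across the ISIC3 targets and then summed; the growth rate (v3) is copied to each target and then averaged.

```python
from collections import defaultdict

# Year-95 rows of database A, copied from the quoted example.
rows = [
    {"sic87": 2011, "yr": 95, "v1": 125.8, "v2": 51.4, "v3": -0.0102257},
    {"sic87": 2013, "yr": 95, "v1": 48.5, "v2": 2.0, "v3": 0.0065023},
    {"sic87": 2014, "yr": 95, "v1": 9.6, "v2": 1.6, "v3": 0.068254},
    {"sic87": 2015, "yr": 95, "v1": 8.2, "v2": 5.3, "v3": 0.0935813},
]

# (sic87, isic3) pairs, copied from the quoted correspondences.
concordance = [
    (2011, 2020), (2011, 2022), (2011, 2026),
    (2013, 2100), (2014, 2100), (2015, 2100),
]


def convert(rows, concordance):
    """Re-express SIC87 rows on an ISIC3 basis (hypothetical helper)."""
    # Map each SIC87 code to the list of ISIC3 codes it feeds into.
    targets = defaultdict(list)
    for sic, isic in concordance:
        targets[sic].append(isic)

    totals = defaultdict(lambda: {"v1": 0.0, "v2": 0.0})
    rates = defaultdict(list)
    for r in rows:
        # Assumption: SIC87 codes absent from the concordance are dropped.
        isics = targets.get(r["sic87"], [])
        n = len(isics)
        for isic in isics:
            key = (isic, r["yr"])
            # Rule (a): split additive variables equally, then sum.
            totals[key]["v1"] += r["v1"] / n
            totals[key]["v2"] += r["v2"] / n
            # Rule (b): carry the rate over unchanged; average later.
            if r["v3"] is not None:
                rates[key].append(r["v3"])

    out = {}
    for key, t in totals.items():
        vals = rates[key]
        v3 = sum(vals) / len(vals) if vals else None
        out[key] = {"v1": t["v1"], "v2": t["v2"], "v3": v3}
    return out


result = convert(rows, concordance)
for key in sorted(result):
    print(key, result[key])
```

In Stata itself, the same steps would roughly correspond to -joinby- on sic87 to form all SIC87/ISIC3 pairs, dividing v1 and v2 by the number of pairs per SIC87 code and year, and then -collapse (sum) v1 v2 (mean) v3, by(isic3 yr)-.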




On Mon, Aug 25, 2008 at 12:55 PM, kokootchke <kokootchke@hotmail.com> wrote:
> Hello!
>
> I have two manufacturing databases that I need to put together. The problem is that each database is classified under a different coding system. I do have the codes to match the observations accordingly, but I am not sure what the best way to do the matching is.
>
> Database A contains variables such as total # of employees by industrial sector (v1), total value of shipments by industrial sector (v2), and annual growth rates of the industrial sector (v3). These industrial sectors are classified according to the SIC87 industry classification, so the database would look like this:
>
> sic87   yr      v1      v2          v3
> 2011    93   124.4    53.3    .0043177
> 2011    94   119.5    50.7   -.0043294
> 2011    95   125.8    51.4   -.0102257
> 2011    96     130    51.6   -.0452671
> 2013    93    48.7     2.1           .
> 2013    94    49.6     2.4     .047534
> 2013    95    48.5       2    .0065023
> 2014    95     9.6     1.6     .068254
> 2015    95     8.2     5.3    .0935813
>
> I need to translate this entire database into the ISIC3 industry classification. The problem is that one SIC87 category can map into several ISIC3 categories, and several SIC87 categories can map into a single ISIC3 category.
>
> For instance, suppose that my correspondences are as follows:
>
> sic87   isic3
> 2011   2020
> 2011   2022
> 2011   2026
> 2013   2100
> 2014   2100
> 2015   2100
>
> This means that sic87 category 2011 is now considered 3 separate categories (2020, 2022, and 2026), while all three categories 2013, 2014, and 2015 are now considered only one category 2100.
>
> I want to do the matching in two separate ways:
>
> (a) The first way deals with variables that one can easily add by sector, like the total # of employees by sector (v1) or the value of shipments by sector (v2). In this case, if multiple SIC87 categories are now classified as just one ISIC3 category, we can just add the numbers across categories; if just one SIC87 category is now classified as several ISIC3 categories, we can divide the SIC87 number equally among the new ISIC3 categories.
>
> (b) The second way deals with variables that cannot simply be added because the sum would be meaningless. For example, in the case of v3, when multiple SIC87 categories have different growth rates and these categories translate into only one ISIC3 category, we can take the average by sector. On the other hand, if one SIC87 category translates into several ISIC3 categories, each new category keeps the same growth rate, as in the first table below.
>
> So, if we look at SIC87 category 2011 for year 95, I want my code to do the following calculations:
>
> isic3  yr   v1            v2           v3
> 2020  95  =125.8/3  =51.4/3   =-.0102257
> 2022  95  =125.8/3  =51.4/3   =-.0102257
> 2026  95  =125.8/3  =51.4/3   =-.0102257
>
>
> while SIC87 categories 2013, 2014, and 2015 for the same year would all fuse into one ISIC3 category to look like this:
>
> isic3  yr   v1                     v2                 v3
> 2100  95  =48.5+9.6+8.2  =2+1.6+5.3   =(.0065023+.068254+.0935813)/3
>
> Any ideas on how to achieve this?
>
> Thank you.
> Adrian
>
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
>


