
From: kokootchke <kokootchke@hotmail.com>
To: <statalist@hsphsun2.harvard.edu>
Subject: st: matching databases
Date: Mon, 25 Aug 2008 12:55:48 -0400

Hello!

I have two manufacturing databases that I need to put together. The problem is that each database is classified under a different coding system. I do have the codes to match the observations, but I am not sure what the best way to do the matching is.

Database A contains variables such as the total number of employees by industrial sector (v1), the total value of shipments by industrial sector (v2), and the annual growth rate of the industrial sector (v3). The industrial sectors follow the SIC87 industry classification, so the database looks like this:

sic87  yr     v1    v2         v3
 2011  93  124.4  53.3   .0043177
 2011  94  119.5  50.7  -.0043294
 2011  95  125.8  51.4  -.0102257
 2011  96    130  51.6  -.0452671
 2013  93   48.7   2.1          .
 2013  94   49.6   2.4    .047534
 2013  95   48.5     2   .0065023
 2014  95    9.6   1.6    .068254
 2015  95    8.2   5.3   .0935813

I need to translate this entire database into the ISIC3 industry classification. The problem is that one SIC87 category can map to several ISIC3 categories, and several SIC87 categories can map to a single ISIC3 category. For instance, suppose that my correspondences are as follows:

sic87  isic3
 2011   2020
 2011   2022
 2011   2026
 2013   2100
 2014   2100
 2015   2100

This means that SIC87 category 2011 is now considered three separate categories (2020, 2022, and 2026), while the three categories 2013, 2014, and 2015 are now considered a single category, 2100.

I want to do the matching in two separate ways:

(a) The first way deals with variables that one can easily add by sector, like the total number of employees by sector (v1) or the value of shipments by sector (v2). In this case, if multiple SIC87 categories are now classified as just one ISIC3 category, we can add the numbers across categories; if one SIC87 category is now classified as several ISIC3 categories, we can divide the SIC87 number by the number of new ISIC3 categories.

(b) The second way deals with variables that cannot simply be added because the sum would be meaningless.
For example, in the case of v3, when multiple SIC87 categories have different growth rates and these categories translate into only one ISIC3 category, we can take the average across those categories. On the other hand, if one SIC87 category translates into several ISIC3 categories, each new category keeps the original growth rate.

So, if we look at SIC87 category 2011 for year 95, I want my code to do the following calculations:

isic3  yr  v1        v2       v3
 2020  95  =125.8/3  =51.4/3  =-.0102257
 2022  95  =125.8/3  =51.4/3  =-.0102257
 2026  95  =125.8/3  =51.4/3  =-.0102257

while SIC87 categories 2013, 2014, and 2015 for the same year would all merge into one ISIC3 category, like this:

isic3  yr  v1             v2          v3
 2100  95  =48.5+9.6+8.2  =2+1.6+5.3  =(.0065023+.068254+.0935813)/3

Any ideas on how to achieve this?

Thank you.
Adrian
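To pin down the arithmetic being asked for, here is a minimal sketch in plain Python rather than Stata. It is only an illustration of the two rules described in the post, not a Stata solution. The names `xwalk`, `rows`, and `convert` are made up for this example, and two points are assumptions the post does not settle: a missing growth rate (`.`) is skipped when averaging, and a SIC87 code with no correspondence is silently dropped.

```python
from collections import defaultdict

# Correspondence pairs (sic87, isic3), copied from the example in the post.
xwalk = [(2011, 2020), (2011, 2022), (2011, 2026),
         (2013, 2100), (2014, 2100), (2015, 2100)]

# Database A rows: (sic87, yr, v1, v2, v3). None stands for the missing value ".".
rows = [
    (2011, 93, 124.4, 53.3, 0.0043177),
    (2011, 94, 119.5, 50.7, -0.0043294),
    (2011, 95, 125.8, 51.4, -0.0102257),
    (2011, 96, 130.0, 51.6, -0.0452671),
    (2013, 93, 48.7, 2.1, None),
    (2013, 94, 49.6, 2.4, 0.047534),
    (2013, 95, 48.5, 2.0, 0.0065023),
    (2014, 95, 9.6, 1.6, 0.068254),
    (2015, 95, 8.2, 5.3, 0.0935813),
]


def convert(rows, xwalk):
    """Return {(isic3, yr): (v1, v2, v3)} using the rules (a) and (b) from the post."""
    # List every ISIC3 target for each SIC87 code.
    targets = defaultdict(list)
    for sic, isic in xwalk:
        targets[sic].append(isic)

    totals = defaultdict(lambda: [0.0, 0.0])  # running sums of v1 and v2
    rates = defaultdict(list)                 # growth rates to be averaged

    for sic, yr, v1, v2, v3 in rows:
        n = len(targets[sic])  # number of ISIC3 codes this SIC87 code splits into
        # Assumption: if n == 0 (no correspondence), the row is dropped.
        for isic in targets[sic]:
            key = (isic, yr)
            # Rule (a): split additive variables evenly, then sum by ISIC3 code.
            totals[key][0] += v1 / n
            totals[key][1] += v2 / n
            # Rule (b): collect rates; assumption: missing rates are skipped.
            if v3 is not None:
                rates[key].append(v3)

    out = {}
    for key in sorted(totals):
        r = rates[key]
        mean_rate = sum(r) / len(r) if r else None  # simple unweighted average
        out[key] = (totals[key][0], totals[key][1], mean_rate)
    return out


result = convert(rows, xwalk)
for (isic, yr), (v1, v2, v3) in result.items():
    print(isic, yr, round(v1, 4), round(v2, 4), v3)
```

The same two steps (expand each SIC87 row to all of its ISIC3 codes while dividing v1 and v2 by the number of codes, then sum v1 and v2 and average v3 within each ISIC3 code and year) are what any Stata implementation would need to reproduce. Note that the average here is unweighted, as in the example for category 2100.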

