
From: "Hua Pan" <panhua@gmx.de>
To: statalist@hsphsun2.harvard.edu
Subject: st: Build groups with the same first two numbers of SIC
Date: Thu, 19 Mar 2009 17:49:45 +0100

Dear Statalisters,

I have a list of firms with a four-digit SIC code, permno (an identification number for firms), date, and return. I wish to get the daily mean return within each group of firms that share the same first two digits of the SIC code.

My dataset looks like this:

    sic    permno   date         ret   …
    3674   10012    5.Jan.2004
           10012    6.Jan.2004
           10012    7.Jan.2004
    3674   10259    5.Jan.2004
           10259    6.Jan.2004
           10259    7.Jan.2004
    3674   10299
           10299
           10299
    3674   10302
           10302
           10302
    ------------------------------
    3714   10667
           10667
           10667
    ------------------------------
    3728   10145
           10145
           10145
    ------------------------------
    3861   10163
           10163
           10163
    ------------------------------
    4213   10379
           10379
           10379
    4213   10649
           10649
           10649

First I want to build several groups. The firms within each group have one thing in common: the first two digits of their SIC codes are identical. For the example above:

             sic          permno
    Group1:  3674         10012, 10259, 10299, 10302
    Group2:  3714, 3728   10667, 10145
    Group3:  3861         10163
    Group4:  4213         10379, 10649

Then I wish to get the mean daily return for each group.

So I tried to split the big dataset into several sub-datasets and calculate the daily mean return for each of them. Afterwards I would put the sub-datasets back together with "append".

For the first step, I did:

    local n=3600
    while `n' <4300 {
        use "D:\sic.dta", clear
        keep if sic >=`n' & sic < `n'+100
        by date, sort: egen meanret=mean(ret)
        save "D:\ph\sic\sic_`n'.dta"
        local n=`n'+100
    }

This works for 36xx and 37xx. But when `n' == 3900, all observations in the complete file "D:\sic.dta" are deleted, because none of them satisfies sic >= 3900 & sic < 4000, so there is an error:

    (y observations deleted)
    __000001 not found

where y is the number of observations in the complete dataset.

There is a huge number of observations, so I can't do it one by one. Does anyone here have an idea how to solve this problem?

Or is there an easier way to generate such groups (I have also tried, but failed), so that I can get the daily mean return with:

    by group date, sort: egen meanret=mean(ret)

By the way, I'm using Stata 10.

Thank you very much for your help.

Best regards,
Hua
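One possible way to build the groups asked about above, without splitting the file at all, is to derive a two-digit group variable from the SIC code and then use it in the by-list. The sketch below is only an illustration under assumptions: it assumes `sic` is numeric and filled in on every row, the variable names `sic2` and `day` are hypothetical, and the toy returns are made up solely so the example runs on its own.

```stata
* Toy data standing in for D:\sic.dta (the ret values are invented for illustration)
clear
input sic permno day ret
3674 10012 5  0.01
3674 10012 6  0.02
3674 10259 5  0.03
3674 10259 6  0.04
3714 10667 5  0.02
3714 10667 6 -0.01
3728 10145 5  0.04
3728 10145 6  0.03
4213 10379 5  0.00
4213 10379 6  0.05
end

* Build a daily date for January 2004 from the hypothetical day variable
generate date = mdy(1, day, 2004)
format date %td

* First two digits of the four-digit SIC code (assumes sic is numeric)
generate sic2 = floor(sic/100)

* Daily mean return within each two-digit SIC group
bysort sic2 date: egen meanret = mean(ret)

list sic2 date permno ret meanret, sepby(sic2)
```

Because only the group/date combinations that actually occur in the data are formed, empty ranges such as 39xx never come up. If `sic` were stored as a string instead, something like `real(substr(sic, 1, 2))` could play the role of `floor(sic/100)`. If the loop is kept anyway, running `count if sic >= `n' & sic < `n' + 100` and checking `r(N)` before the `keep` would be one way to skip ranges with no observations.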

**Follow-Ups**:

- **Re: st: Build groups with the same first two numbers of SIC** *From:* "Martin Weiss" <Martin.Weiss1@gmx.de>
- **st: RE: Build groups with the same first two numbers of SIC** *From:* "Nick Cox" <n.j.cox@durham.ac.uk>

