From
"Seed, Paul" <paul.seed@kcl.ac.uk>

To
"statalist@hsphsun2.harvard.edu" <statalist@hsphsun2.harvard.edu>

Subject
RE: st: Find the fiscal year for each obs

Date
Sun, 7 Apr 2013 10:44:56 +0000

One option is to start by restructuring file B wide, so that there is only one record per firm. Then match on firm, and use a loop to set the fiscal year. The resulting file will have the same number of records as fileA, and relatively few extra variables, so the extra memory requirements (beyond those for reading fileA) are relatively low.

Something like this (code not checked):

**************Start Stata code ***************
use fileB, clear
* Range of fiscal years, used to drive the loop below
summarize Fiscal, meanonly
local b = r(min)
local e = r(max)
keep Begin End Firm Fiscal
* One record per firm, with a Begin/End pair for each fiscal year
reshape wide Begin End, i(Firm) j(Fiscal)
compress
save fileB_wide, replace

use fileA, clear
compress
* keep(master match) stops firms found only in fileB from adding records
merge m:1 Firm using fileB_wide, keep(master match) nogenerate
gen Fiscal = .
foreach fy of numlist `b'/`e' {
    replace Fiscal = `fy' if Date >= Begin`fy' & Date <= End`fy'
    drop Begin`fy' End`fy'
}
**************End Stata code ***************

An alternative requiring even less core memory would be to split fileB into separate files, one for each fiscal year. Then merge each one in turn with fileA, update Fiscal appropriately, drop the extra variables and repeat.

Something like this (code again not checked):

**************Start Stata code ***************
use fileB, clear
summarize Fiscal, meanonly
local b = r(min)
local e = r(max)
* Write one small file per fiscal year
foreach fy of numlist `b'/`e' {
    preserve
    keep if Fiscal == `fy'
    keep Firm Begin End
    compress
    save fileB`fy', replace
    restore
}

use fileA, clear
compress
gen Fiscal = .
* Merge in one fiscal year at a time, so only one Begin/End pair is ever in memory
foreach fy of numlist `b'/`e' {
    merge m:1 Firm using fileB`fy', keep(master match) nogenerate
    replace Fiscal = `fy' if Date >= Begin & Date <= End
    drop Begin End
}
**************End Stata code ***************

On Sat, Apr 6, 2013 at 4:24 PM, Yu Chen, PhD <profyuchen@gmail.com> wrote:
> Nick,
> Thank you so much for the code. It works great. However, I have a
> further question. Assume File A has 5000 firms, and each has 360 daily
> stock price observations for 5 years, then there are 9,000,000
> observations in File A.
> Assume File B has 5 years data (i.e., fiscal
> year definition) for each firm, then there are 25,000 observations in
> File B. If I use -merge-, the resulting file will be too large for the
> computer to handle. So I think some kind of loop structure might need
> to be used. Or there may be a better solution. Could you please help
> me in this regard? What code will be good to handle this problem?
> Best,
>

