Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
From | Nick Cox <njcoxstata@gmail.com> |
To | "statalist@hsphsun2.harvard.edu" <statalist@hsphsun2.harvard.edu> |
Subject | Re: st: FW: Aggregating values back to a certain row |
Date | Thu, 20 Mar 2014 13:34:39 +0000 |
This is a nice idea, although the example data showed that 1. two or more items can be observed for the same firm in the same year (still, what that implies is just a prior -collapse-, which might be a good move any way) 2. there can be gaps (again, there are remedies) Nick njcoxstata@gmail.com On 20 March 2014 13:14, Jeph Herrin <info@flyingbuttress.net> wrote: > > If the window is really a moderate 5 or so years, you might get by with > 'reshaping' the data first: > > forv y=1/5 { > bys firm_id (year): gen value_`y'=value[_n-`y'] /// > if inrange(year[_n-`y'],year,year-5) > } > egen value_agg = rowtotal(value_1-value_5) > > > You will have (at most 5) new variables populated on each row, containing > the values from the last (at most) 5 years. Then just aggregate across the > row. > > J > > > > On 3/19/2014 4:03 PM, Nick Cox wrote: >> >> This kind of question is often asked here. >> >> Your code can't easily be rescued. A loop over observations doesn't >> really mix with a -while- loop. Also note, for example, that -if- >> qualifiers are illegal and indeed make no sense in defining -local-s. >> >> You don't spell it out, and it's not even tacit in your code, but I am >> guessing that all calculations must be for the same firm. >> >> Consider this: >> >> 1. Initialise all aggregate values to 0. >> >> gen firm_value_agg = 0 >> >> 2. Loop over all the observations. >> >> quietly forvalues n=1/`=_N' { >> >> 3. Summarize value if the firm is the same as that in the present >> observation and investment was in the previous 5 years. >> >> Here previous five years is interpreted as meaning for 2013 (say) >> 2013, 2012, 2011, 2010, 2009. >> >> su value if firm == firm[`n'] & inrange(year, year[`n'] - 4, >> year), meanonly >> >> If you want to exclude the present year, it would be >> >> su value if firm == firm[`n'] & inrange(year, year[`n'] - 5, >> year[`n'] - 1), meanonly >> >> The -meanonly- option is helpful to speed up what is likely to be slow >> code. Despite its name, -meanonly- does include calculation of the >> sum. >> >> 4. Replace the aggregated value in the current observation. >> >> replace firm_value_agg = r(sum) in `n' >> } >> >> Counts would mean using r(N) not r(sum). >> >> There's a lengthier sermon in >> http://www.stata-journal.com/sjpdf.html?articlenum=pr0033 which you >> should cite if you find it useful. (Many people will cite a paper for >> just mentioning one idea they allude to, but not a paper that saved >> them hard work in data management.) >> >> Nick >> njcoxstata@gmail.com >> >> >> On 19 March 2014 19:03, Daniel Stefan Hain <dsh@business.aau.dk> wrote: >>> >>> Dear Stata users, >>> >>> I currently have a problem I find myself unable to solve. >>> >>> I have longitudinal data on investments made by companies, each >>> investment a single row. It looks sort of like: >>> >>> invest_id firm_id value >>> year >>> 1 1 1,5 >>> 2001 >>> 2 1 2,7 >>> 2001 >>> 3 1 3,4 >>> 2002 >>> 4 2 0,9 >>> 1998 >>> 5 2 5,5 >>> 2003 >>> >>> I want to introduce a "investment experience", which aggregates the >>> number (or value) of previous investments over the last X (lets say 5) years >>> previous to the current one. Firms have very different investment activity, >>> so might the one have 100, the next only one per year. I up to now tried >>> different versions of something like >>> >>> >>> local N =_N >>> forvalues n=1/`N' { >>> local i=1 >>> local t=0 >>> while t=0 { >>> replace firm_value_agg = firm_value_agg >>> + value[_n - `i']) if `n'=_n >>> local i = `i' + 1 >>> local t = 1 if year[_n] - 5 >>> == year[_n - `i'] >>> } >>> } >>> >>> Here I try to write a loop that goes over every row (first forvalues), >>> and then (in theory) aggregates previous investment values in >>> "firm_value_agg", as long till it comes to a row where the year is smaller >>> than the current year minus 5 (since I want to aggregate the last 5 years). >>> I tried some different versions, but always get some error messages like "no >>> weights allowed", or the code just produces unintended results. For every >>> piece of advice how to make this work I would be very grateful. >>> >>> Best wishes >>> >>> >>> Daniel S. Hain >>> >>> Ph.D. Fellow, M.Sc. Economics, Dipl.-Wirt.-Ing. >>> Department of Business and Management | IKE | DRUID | EIS >>> T: (+45) 9940 2724 | Email: dsh@business.aau.dk >>> Web: www.ike.aau.dk | www.eis-all.dk | www.innoresource.org >>> Aalborg University | Fibigerstræde 11, Room 91 | 9220 Aalborg, Denmark >>> >>> >>> * >>> * For searches and help try: >>> * http://www.stata.com/help.cgi?search >>> * http://www.stata.com/support/faqs/resources/statalist-faq/ >>> * http://www.ats.ucla.edu/stat/stata/ >> >> >> * >> * For searches and help try: >> * http://www.stata.com/help.cgi?search >> * http://www.stata.com/support/faqs/resources/statalist-faq/ >> * http://www.ats.ucla.edu/stat/stata/ >> > * > * For searches and help try: > * http://www.stata.com/help.cgi?search > * http://www.stata.com/support/faqs/resources/statalist-faq/ > * http://www.ats.ucla.edu/stat/stata/ * * For searches and help try: * http://www.stata.com/help.cgi?search * http://www.stata.com/support/faqs/resources/statalist-faq/ * http://www.ats.ucla.edu/stat/stata/