Bookmark and Share

Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: st: FW: Aggregating values back to a certain row


From   Jeph Herrin <[email protected]>
To   [email protected]
Subject   Re: st: FW: Aggregating values back to a certain row
Date   Thu, 20 Mar 2014 09:14:33 -0400


If the window is really a moderate 5 or so years, you might get by with 'reshaping' the data first:

 forv y=1/5 {
          bys firm_id (year): gen value_`y'=value[_n-`y'] ///
                     if inrange(year[_n-`y'],year,year-5)
 }
 egen value_agg = rowtotal(value_1-value_5)


You will have (at most 5) new variables populated on each row, containing the values from the last (at most) 5 years. Then just aggregate across the row.

J


On 3/19/2014 4:03 PM, Nick Cox wrote:
This kind of question is often asked here.

Your code can't easily be rescued. A loop over observations doesn't
really mix with a -while- loop. Also note, for example, that -if-
qualifiers are illegal and indeed make no sense in defining -local-s.

You don't spell it out, and it's not even tacit in your code, but I am
guessing that all calculations must be for the same firm.

Consider this:

1. Initialise all aggregate values to 0.

gen firm_value_agg = 0

2. Loop over all the observations.

quietly forvalues n=1/`=_N' {

3. Summarize value if the firm is the same as that in the present
observation and investment was in the previous 5 years.

Here previous five years is interpreted as meaning for 2013 (say)
2013, 2012, 2011, 2010, 2009.

           su value if firm == firm[`n'] & inrange(year, year[`n'] - 4,
year), meanonly

If you want to exclude the present year, it would be

           su value if firm == firm[`n'] & inrange(year, year[`n'] - 5,
year[`n'] - 1), meanonly

The -meanonly- option is helpful to speed up what is likely to be slow
code. Despite its name, -meanonly- does include calculation of the
sum.

4. Replace the aggregated value in the current observation.

           replace firm_value_agg = r(sum) in `n'
}

Counts would mean using r(N) not r(sum).

There's a lengthier sermon in
http://www.stata-journal.com/sjpdf.html?articlenum=pr0033 which you
should cite if you find it useful. (Many people will cite a paper for
just mentioning one idea they allude to, but not a paper that saved
them hard work in data management.)

Nick
[email protected]


On 19 March 2014 19:03, Daniel Stefan Hain <[email protected]> wrote:
Dear Stata users,

I currently have a problem I find myself unable to solve.

I have longitudinal data on investments made by companies, each investment a single row. It looks sort of like:

invest_id             firm_id                 value                                     year
1                              1                              1,5                                          2001
2                              1                              2,7                                          2001
3                              1                              3,4                                          2002
4                              2                              0,9                                          1998
5                              2                              5,5                                          2003

I want to introduce a "investment experience", which aggregates the number (or value) of previous investments over the last X (lets say 5) years previous to the current one. Firms have very different investment activity, so might the one have 100, the next only one per year. I up to now tried different versions of something like


local N =_N
forvalues n=1/`N' {
                 local i=1
                 local t=0
        while t=0 {
                                 replace firm_value_agg = firm_value_agg + value[_n - `i']) if `n'=_n
                                 local i = `i' + 1
                                 local t = 1             if year[_n] - 5 == year[_n - `i']
                 }
}

Here I try to write a loop that goes over every row (first forvalues), and then (in theory) aggregates previous investment values in "firm_value_agg", as long till it comes to a row where the year is smaller than the current year minus 5 (since I want to aggregate the last 5 years). I tried some different versions, but always get some error messages like "no weights allowed", or the code just produces unintended results. For every piece of advice how to make this work I would be very grateful.

Best wishes


Daniel S. Hain

Ph.D. Fellow, M.Sc. Economics, Dipl.-Wirt.-Ing.
Department of Business and Management | IKE | DRUID | EIS
T: (+45) 9940 2724  | Email: [email protected]
Web: www.ike.aau.dk | www.eis-all.dk | www.innoresource.org
Aalborg University | Fibigerstræde 11, Room 91 | 9220 Aalborg, Denmark


*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/faqs/resources/statalist-faq/
*   http://www.ats.ucla.edu/stat/stata/

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/faqs/resources/statalist-faq/
*   http://www.ats.ucla.edu/stat/stata/

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/faqs/resources/statalist-faq/
*   http://www.ats.ucla.edu/stat/stata/


© Copyright 1996–2018 StataCorp LLC   |   Terms of use   |   Privacy   |   Contact us   |   Site index