Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


From: Nick Cox <n.j.cox@durham.ac.uk>
To: "'statalist@hsphsun2.harvard.edu'" <statalist@hsphsun2.harvard.edu>
Subject: RE: RE: st: RE: RE: Calculating moving windows over time with conditions
Date: Fri, 4 Feb 2011 14:29:10 +0000

Your question about -by:-. The answer is this: Take away the -by:- and what difference does it make? None, because the calculation is a row sum for each observation. The answer is the same however you do it, whether observation by observation; or observation by observation within blocks of observations. Where the data came from is irrelevant.

You already have a time series solution, meaning one based on L. etc., in a previous reply.

Nick
n.j.cox@durham.ac.uk

erik.aadland@bi.no

Thanks again, Nick. Sorry about the typo. My third argument was of course:

    bysort id (year): gen var_x_3yrs = (var_x + lag_var_x + lag2_var_x) ;

If I use this argument, or omit "by" as it is unnecessary as you say:

    gen var_x_3yrs = (var_x + lag_var_x + lag2_var_x) ;

Would this produce an acceptable solution, or is it plain wrong? Remember that I have only one observation of each unique id per year if the id is indeed in that year. If this is wrong, I'll have to go into time series commands.

All the best,
Erik.

-----Forwarded by Erik Aadland/people/BISTIFT on 02/04/2011 03:12PM -----

To: "'statalist@hsphsun2.harvard.edu'" <statalist@hsphsun2.harvard.edu>
From: Nick Cox <n.j.cox@durham.ac.uk>
Sent by: owner-statalist@hsphsun2.harvard.edu
Date: 02/04/2011 03:01PM
Subject: RE: st: RE: RE: Calculating moving windows over time with conditions

You are recoding the wrong variable in your third statement. Once the variables are created your last -by:- is unnecessary.

I guess you are seeking something more like

    tsset id year
    gen var_x_3yrs = var_x + cond(L.var_x < ., L.var_x, 0) + cond(L2.var_x < ., L2.var_x, 0)

-- but that is not guaranteed to work the way you want if there are gaps. In many ways you are likely to get better results by averaging non-missing values and multiplying up by 3.

Nick
n.j.cox@durham.ac.uk

erik.aadland@bi.no

Thank you very much for your help and input. If I don't get it right, I'll try to go for the time series commands. I just created the following code. Does this look acceptable?

    sort id year ;
    bysort id: gen lag_var_x = var_x[_n-1] if year==year[_n-1]+1 ;
    recode var_x (. = 0) ;
    bysort id: gen lag2_var_x = var_x[_n-2] if year==year[_n-2]+2 ;
    recode lag2_var_x (. = 0) ;
    bysort individual_id (year): gen var_x_3yrs = (var_x + lag_var_x + lag2_var_x) ;

From: Nick Cox <n.j.cox@durham.ac.uk>

Commenting now on the code,

0. Your basic structure is

    by id year:

There is only one observation in each of those combinations. You need

    by id (year):

1. A key thing is that -egen-'s "functions" do not behave at all like Stata's functions. Thus you must refer to just _one_ function on the right-hand side of an = sign. The syntax of -egen- is given in the help.

    egen [type] newvar = fcn(arguments) [if] [in] [, options]

So the minimal call is

    egen newvar = fcn(arguments)

There is no scope for more than one -fcn()- call.

2. -if- is allowed just once in any Stata command. -if- never appears _inside_ anything else.

3. You could use -cond(,)- as part of an expression to express branching. In this case, it would get messy almost beyond belief.

I'd back off from this approach and use L. directly as Johannes suggested or -rolling- or -mvsumm- (SSC) as I suggested earlier.

Nick
n.j.cox@durham.ac.uk

Nick Cox

Consider also using -rolling- or -mvsumm- (SSC). Writing your own code for problems like this is instructive, but not necessary.

erik.aadland@bi.no

I have an unbalanced panel dataset in which I need to calculate a 3 year moving window for a variable for each actor in the dataset. I have already calculated the annual total sum for the variable for each year (var_x). I have tagged individuals by year and removed all observations but one per year. Now I need to sum the annual totals up for each actor by year in 3 year moving windows. As the dataset is unbalanced, I need to make sure that observation _n-1 is indeed the year before _n, and not several years prior to _n. I don't get it quite right. I use Stata 10.

Here is the code so far:

    sort id year ;
    egen tag_id_year = tag(id year) ;
    keep if tag_id_year == 1;
    sort id year ;
    bysort id year: egen var_3yrs = total(var_x) & total(var_x[_n-1]if year==year[_n-1]+1) & total(var_x[_n-2]if year==year[_n-2]+2) ;

I have also tried:

    bysort id year: egen var_3yrs = total(var_x) + total(var_x[_n-1]if year==year[_n-1]+1) + total(var_x[_n-2]if year==year[_n-2]+2) ;

And:

    bysort id year: egen var_3yrs = total(var_x + var_x[_n-1]if year==year[_n-1]+1 + var_x[_n-2]if year==year[_n-2]+2) ;

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
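
For readers following the thread, here is a minimal sketch of the L.-based approach Nick describes, run on a small made-up panel. The data values and the names n_avail and var_x_3yrs_scaled are hypothetical and not from the thread. Treating an absent year as 0 is an assumption, and the rescaled version is only one possible reading of the "averaging non-missing values and multiplying up by 3" remark; it also assumes var_x itself is never missing.

```stata
* Toy unbalanced panel (hypothetical values): id 1 has no observation for 2003.
clear
input id year var_x
1 2000 10
1 2001 20
1 2002 30
1 2004 40
2 2001  5
2 2002  7
end

* Declare the panel so that L. and L2. look up values by id and calendar year,
* not by position in the data. A lag that does not exist evaluates to missing.
tsset id year

* Three-year window: current year plus the two preceding calendar years,
* with any nonexistent lag counted as 0 (an assumption, see above).
gen var_x_3yrs = var_x + cond(L.var_x < ., L.var_x, 0) + cond(L2.var_x < ., L2.var_x, 0)

* Alternative: mean of the years actually present in the window, times 3.
gen n_avail = 1 + (L.var_x < .) + (L2.var_x < .)
gen var_x_3yrs_scaled = 3 * var_x_3yrs / n_avail

list id year var_x var_x_3yrs n_avail var_x_3yrs_scaled, sepby(id)
```

For id 1 in 2004, L.var_x is missing because 2003 is absent, so the window sum uses only 2002 and 2004 and var_x_3yrs is 70. No -by:- prefix is needed in the -gen- statements, since -tsset- already tells the time-series operators which observations belong to which id.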

**References**: **RE: RE: st: RE: RE: Calculating moving windows over time with conditions** *From:* erik.aadland@bi.no
