Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
From | KOTa <kota.alba@gmail.com> |
To | statalist@hsphsun2.harvard.edu |
Subject | Re: st: calculating cumulative values of other observations |
Date | Sat, 7 Apr 2012 10:42:23 -0400 |
thanks Erik, Nick both ways work on this sample, and i managed to adjust for a real database also. few followup question: 1. Eric, any specific reason why you convert time into date format? I used it without conversion and it looks fine. Also on real data your solution somehow goes into infinite loop after finishing all calculations. 2. Nick, the solution you proposed with "sum" was the one i tried initially. The problem is that in real data I have 2 columns for date (start_date end_date) and i want to compare start_date [_n] >= end_date[all previous] Eric's solution works longer, but i could adjust it to work with different date columns. your solution is works much faster in case there is one date column, but i could not figure out how to do the same with 2 different columns. regards, K. El día 7 de abril de 2012 04:08, Nick Cox <njcoxstata@gmail.com> escribió: > Why not > > bysort id activity (time): gen priorhrs = sum(time) - time > > Nick > > > > > On 7 Apr 2012, at 02:29, Eric Booth <eric.a.booth@gmail.com> wrote: > >> <> >> >> >> ******************! >> >> //input example data// >> clear >> inp ID str18 date str4 activity hours >> 1 20071230 A 10 >> 1 20071122 A 11 >> 2 20071120 A 5 >> 3 20071210 B 6 >> 3 20071111 B 24 >> 3 20071130 F 20 >> 3 20070511 B 49 >> 4 20070611 A 50 >> end >> >> //cleanup// >> g date2 = date(date, "YMD") >> format date2 %td >> sort ID activity date2 >> >> //gen prior hours var// >> g priorhrs = 0 >> forval n = 1/`=_N' { >> by ID activity: replace priorhrs = /// >> priorhrs + hours[_n-`n'] /// >> if date2[_n]>=date2[_n-`n'] >> } >> l >> ********************! >> >> >> - Eric >> >> __ >> Eric A. Booth >> Public Policy Research Institute >> Texas A&M University >> ebooth@ppri.tamu.edu >> +979.845.6754 >> >> >> On Apr 6, 2012, at 6:09 PM, KOTa wrote: >> >>> Hi all, >>> >>> i have a data looking like this: >>> >>> ID date activity hours prior result should >>> be: >>> 1 20071230 A 10 ? <-- 11 >>> 1 20071122 A 11 ? 0 >>> 2 20071120 A 5 ? 0 >>> 3 20071210 B 6 ? 73 >>> 3 20071111 B 24 ? 49 >>> 3 20071130 F 20 ? 0 >>> 3 20070511 B 49 ? 0 >>> 4 20070611 A 50 ? 0 >>> >>> (better picture here >>> http://gyazo.com/5dc1218534d8b185ab22989069034b8a.png ) >>> >>> i need to fill "prior" column with following: >>> - amount of hours spent on same activity by same user(ID) prior to >>> date of current activity. >>> >>> example: >>> in case of observation 4(line 4) activity is B, so prior = sum of >>> hours spent on activity B before it took place (i.e. before 20071210) >>> there are 2 such observations, one with 49 hour, another with 24. >>> Therefore, prior = 24+49=73. >>> >>> >>> any suggestions how to make this simple? >>> >>> i was thinking to use "sort" and "by:", but i can't figure out how to >>> conditionally("if") sum up values from different observations into >>> "prior" var >>> its not a first time i encounter similar problem and always took some >>> way around, but this time i decided to ask, maybe there is proper, >>> simple way to do this >>> >>> thank you in advance >>> >>> K. >>> * >>> * For searches and help try: >>> * http://www.stata.com/help.cgi?search >>> * http://www.stata.com/support/statalist/faq >>> * http://www.ats.ucla.edu/stat/stata/ >> >> >> >> * >> * For searches and help try: >> * http://www.stata.com/help.cgi?search >> * http://www.stata.com/support/statalist/faq >> * http://www.ats.ucla.edu/stat/stata/ > > * > * For searches and help try: > * http://www.stata.com/help.cgi?search > * http://www.stata.com/support/statalist/faq > * http://www.ats.ucla.edu/stat/stata/ * * For searches and help try: * http://www.stata.com/help.cgi?search * http://www.stata.com/support/statalist/faq * http://www.ats.ucla.edu/stat/stata/