Tue, 12 Aug 2008

Thanks for this. Posting code that doesn't work together with the
problem is a good Statalist strategy too. If people don't understand
the code or want to start afresh, they can ignore the code and
concentrate on the problem. What is certain is that people can't
comment very helpfully on code they can't see.

To repeat: -egen, tag()- happens to tag the first of any group, first
meaning first in the current sort order. Whether that is the first in
any other sense necessarily depends on that sort order.

Nick
n.j.cox@durham.ac.uk

Marietta Jones

Thank you very much for the suggested solution. I tried writing my own
do file based on your suggestions to my previous post, but I couldn't
get the code to work. My main problem was that while I knew how to tag
the first observation of a group, I did not know how to tag the last
observation. -gsort- appears to offer a good solution that I was not
aware of.

On Tue, Aug 12, 2008 at 1:51 PM, Nick Cox <n.j.cox@durham.ac.uk> wrote:
> If I understand your question correctly, it yields to a minor variation
> on the technique documented in
>
> SJ-7-3  pr0033  . . . . . . . . . . Stata tip 51: Events in intervals
>         . . . . . . . . . . . . . . . . . . . . . . . . . .  N. J. Cox
>         Q3/07   SJ 7(3):440--443                         (no commands)
>         tip for counting or summarizing irregularly spaced
>         events in intervals
>
> and explained in my posting replying to your previous question:
>
> <http://www.hsph.harvard.edu/cgi-bin/lwgate/STATALIST/archives/statalist.0808/date/article-20.html>
>
>     local N = _N
>     * newest date first within each company, so that tag() picks the latest
>     gsort Company_no -Date
>     gen avex = .
>     qui forval i = 1/`N' {
>         * one observation per investor: same company, 90 days up to Date[i]
>         egen tag = tag(Investor_id) ///
>             if Company_no == Company_no[`i'] & ///
>             inrange(Date, Date[`i'] - 90, Date[`i'])
>
>         su x if tag, meanonly
>
>         replace avex = r(mean) in `i'
>
>         drop tag
>     }
>
> The obvious variations on the previous example are using -summarize,
> meanonly- to get the mean and specifying the previous 90 days within an
> -inrange()- condition.
> (This code includes the current date; modify according to taste.)
>
> The not-so-obvious variation is using -egen, tag()- to tag the _last_
> relevant observation for each distinct Investor_id.
>
> -egen, tag()- tags just one of any repetitions of each distinct value of
> the variable specified. It's not documented, but the observation tagged
> is in fact the _first_ such occurrence in the dataset in the current
> sort order. Hence use -gsort- first to reverse date order in each panel,
> so that tagging catches the last such occurrence in time.
>
> Nick
> n.j.cox@durham.ac.uk
>
> Marietta Jones
>
> I have a new problem related to the dataset I presented in my original
> email (see below).
>
> For each observation in the dataset, I would like to calculate the
> mean of variable x for the same company_no on a rolling prior 90 day
> basis.
> The extra complexity arises from the fact that there may be more than
> one observation of variable x for the same investor_id for a given
> company_no.
> I would only like to include the last observation per investor_id for
> a given company_no in the calculation of the mean.
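
The point about -egen, tag()- and sort order can be checked on a toy dataset before relying on it inside a loop. The following is only a sketch: the data values and the variable names last and last_by are made up for illustration, and the claim that tag() marks the first occurrence in the current sort order is taken from the post above, not from the documentation. The second variable uses by: with _n == _N, which is documented, as a cross-check.

```stata
* Toy data (hypothetical values): one company, two investors, numeric dates
clear
input Company_no Investor_id Date x
1 1 100 10
1 1 150 20
1 2 120 30
1 2 170 40
end

* Newest date first within company, as in the posted code
gsort Company_no -Date

* Per the post, tag() should now mark each investor's latest observation
egen last = tag(Investor_id)

* Documented cross-check: flag the last observation by date within investor
bysort Company_no Investor_id (Date): gen byte last_by = _n == _N

* Compare the two flags row by row
list Company_no Investor_id Date x last last_by, noobs sepby(Investor_id)
```

If the two flags agree on your own data, the -gsort- plus tag() approach is doing what the post describes. The by: version cannot take over inside the loop directly, because the 90-day window changes with each observation, which is why the loop uses tag() with an -if- condition.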

