The Stata listserver

Re: st: RE: (Moving Maximum in Unbalanced panel) Max of a variable in the last 12 months?


From   Sergio Correia <[email protected]>
To   [email protected]
Subject   Re: st: RE: (Moving Maximum in Unbalanced panel) Max of a variable in the last 12 months?
Date   Wed, 6 Jul 2005 16:11:26 -0500

As you know, I was unable to run the program because of the very large
number of observations (hundreds of thousands of individuals in a
panel with about 2 million observations in total). Maybe the cause is
the summarize call used in the program: running the program for
500,000 individuals is then equivalent to issuing 500,000 summarize
commands.

Well, just for the record, the "not elegant at all but it works"
solution I had to come up with is something like this:

*********************************
* The moving maximum is calculated with respect to variable "x".
* tsset was applied previously
* The "t" macro represents dates ("ym" format), but I was too lazy to
* code them elegantly in the loop.
* Even if there is only one non-missing value, the "maximum" can be
* calculated, as I'm just ignoring missing values.

	gen max=.
	gen temp=.
	forvalues t = 504/539 {
		replace temp = x
		* max() ignores missing arguments, so gaps and missing x are skipped
		replace temp = max(temp, l.temp) ///
					if (date>=`t'-10) & (date<=`t')
* I used a 12-month window; if another window is used, the 10 needs to
* be replaced with "windowsize-2"
		replace max = temp if date==`t'
	}
	drop temp
************************

Obviously this has plenty of weaknesses (like using "temp" instead of
creating a tempvar), but my main question is whether there are more
efficient and robust ways to do this.
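
One possible alternative, offered only as a sketch and not timed on
data of this size: loop over the lags instead of over the dates. It
assumes the data are already -tsset- with a monthly date variable and
that "x" is the variable of interest; the name "max12" is made up for
illustration. The time-series operator returns missing wherever the
panel has a gap, and max() ignores missing arguments, so no -tsfill- is
needed.

```stata
* Sketch: 12-month moving maximum (current month plus the 11 before it).
* Assumes -tsset id date- was run beforehand; "max12" is a hypothetical name.
gen max12 = x
forvalues k = 1/11 {
	* L`k'.x is missing where month t-k is absent for that individual;
	* max() skips missing arguments, so such months are simply ignored.
	replace max12 = max(max12, L`k'.x)
}
```

This makes 11 passes over the data regardless of how many individuals
or dates there are. For a different window, the upper limit of the
loop would be "windowsize-1".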

Thanks,
Sergio Correia


On 7/6/05, Kit Baum <[email protected]> wrote:
> I recently encountered the issue re tsfill in using mvsumm. It is not
> handled by mvsumm, but nothing prevents you from issuing a tsfill
> command before invoking mvsumm.
> 
> mvsumm can be very slow in a large data set, as it is ado-file code. I
> have also encountered that issue.
> 
> Kit Baum, Boston College Economics
> http://ideas.repec.org/e/pba1.html
> 
> On Jul 6, 2005, at 2:09 PM, Nick Cox wrote:
> 
> >> Nick,
> >>
> >> I've got two questions about your program.
> >>
> >> I first ran it with 10k obs and worked fine, but in the entire panel I
> >> have around 1.5 million observations, and Stata isn't able to run the
> >> mvsumm with that amount of data (Stata just "freezes"... it ran for 4
> >> hours until I -break-ed it.
> >>
> >> So, do you know the max # of obs that the prog. supports?
> >
> > -mvsumm- as such has no limits. The limits of the program are
> > in essence those of Stata together with your machine set-up.
> >
> >> By the way, my panel is heavy unbalanced so using the tsfill
> >> complicates things a lot. How difficult would it be if I wanted to
> >> modify it to support this w/out the tsfill ?
> >
> > As you say you are a Stata novice, my guess is very
> > difficult.
> >
> >
> 
>



