st: A Rolling Regression with Missing Values


From   Charles Clarke <Charles.Clarke@business.uconn.edu>
To   "statalist@hsphsun2.harvard.edu" <statalist@hsphsun2.harvard.edu>
Subject   st: A Rolling Regression with Missing Values
Date   Thu, 9 Aug 2012 19:23:04 +0000

Dear Statalist,

I have a panel data set of several companies with monthly observations.

For each company, I would like to run several regressions of the form:  reg Y X1 X2

The regressions need to roll over the time period with a moving window. The window starts at 24 months and expands until it reaches 60 months. Once it reaches 60 months, its length stays fixed and it moves forward: month 1 is dropped and months 2 through 61 are used in the next regression, and so on until the end of the estimation period.
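
The window rule above can be sketched as follows. This is a plain Python illustration rather than Stata code; the function name `window_bounds` and its parameters are hypothetical, and months are assumed to be numbered from 1 within each company.

```python
def window_bounds(t, min_window=24, max_window=60):
    """Return (first_month, last_month) for the regression ending at month t.

    Months are numbered from 1 within a company (an assumption).
    Returns None while fewer than min_window months are available.
    """
    if t < min_window:
        return None
    # Expanding phase: start stays at month 1 until the window is max_window long.
    # Moving phase: the start advances so the window keeps max_window months.
    first = max(1, t - max_window + 1)
    return first, t
```

For example, `window_bounds(61)` gives months 2 through 61, which matches the description above.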

I doubt it is the best way, but I cobbled together an approach as follows:

rollreg Y X1 X2, add(24) stub(rolladd)

rollreg Y X1 X2, move(60) stub(rollmov)

The betas from the regressions are the important part. Where the second (moving-window) regression produces no beta, I use the beta from the first (expanding-window) regression. [I would love to learn a more efficient way.]

The problem is that the above doesn’t work when data are missing. None of my X variables have missing values, but Y often does. I would like the regressions to still run if, after excluding missing values, the number of observations in the window is above a certain threshold (say 24).

That is, if ACME Corp. has 10 missing observations over a 60-month window, then the regression should cover the same 60-month window but use n = 50, ignoring the months with missing Y values. If the 60-month window has 45 missing values, leaving only 15 observations (fewer than the required number), the code should produce a missing value.
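
To make the intended behaviour concrete, here is a minimal sketch of the rule for a single company, written in plain Python rather than Stata. Everything in it is an assumption for illustration: the names `ols` and `rolling_betas` are hypothetical, missing Y values are encoded as `None`, the threshold is read as "at least `min_obs` observations", and the regression is solved through the normal equations only to keep the example short.

```python
def ols(rows):
    """OLS of y on a constant, x1 and x2. rows is a list of (y, x1, x2).

    Solves the normal equations by Gauss-Jordan elimination.
    Returns [b0, b1, b2], or None if the regressors are collinear.
    """
    k = 3
    a = [[0.0] * (k + 1) for _ in range(k)]  # augmented matrix [X'X | X'y]
    for y, x1, x2 in rows:
        x = (1.0, x1, x2)
        for i in range(k):
            for j in range(k):
                a[i][j] += x[i] * x[j]
            a[i][k] += x[i] * y
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(a[r][c]))  # partial pivoting
        if abs(a[p][c]) < 1e-12:
            return None
        a[c], a[p] = a[p], a[c]
        for r in range(k):
            if r != c:
                f = a[r][c] / a[c][c]
                for j in range(c, k + 1):
                    a[r][j] -= f * a[c][j]
    return [a[i][k] / a[i][i] for i in range(k)]


def rolling_betas(y, x1, x2, min_window=24, max_window=60, min_obs=24):
    """One entry per month: [b0, b1, b2], or None when no estimate is produced."""
    out = []
    for end in range(len(y)):
        if end + 1 < min_window:
            out.append(None)  # fewer than min_window calendar months so far
            continue
        start = max(0, end - max_window + 1)  # expanding, then moving window
        # Keep the calendar window fixed; drop only the months with missing Y.
        rows = [(y[i], x1[i], x2[i])
                for i in range(start, end + 1) if y[i] is not None]
        out.append(ols(rows) if len(rows) >= min_obs else None)
    return out
```

The point of the design is that the window is defined by calendar months, while the threshold is applied to the observations that remain after the missing Y values are dropped.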

I'm not sure whether I should try to get rolling or rollreg to work, or restructure the code with loops. The data set is large (1.5 million observations). I’d appreciate any help offered.

Kind regards,

Charlie

