Dear users, I need to estimate an equation (originally found in a paper), but I need some preliminary help.
My dataset is made up of a certain number of firms and contains variables over the period from 1995 to 2000. The panel is unbalanced: in order to mitigate survivorship bias, companies are included in the analysis even if data are not available for every year.
After sorting the dataset and declaring it as a panel ( -sort id year- , -tsset id year- ), the regression equation to estimate is:
y_i,t = f( X1_i,t-3 , X2_i,t-3, ...)
where: "i" refers to the individual firms, "t" to the time period of the variable (measured at the accounting year end), and "t-3" to the average for the previous three years.
I have tried to compute the independent variables using -tssmooth ma X1av = X1, window(2 1)- and then lagging the result by one period. However, because firms' observations are missing at random points, the resulting number is frequently the average of only one or two observations, not the three required. I suppose that the author uses only complete three-year averages. In that case, how can I obtain lagged three-year averages that are computed only when all three years are available? More generally, is there any mistake in my approach?
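To make the question concrete, here is a minimal sketch of an alternative I am considering, built on Stata's lag operators rather than -tssmooth-. I am not sure it matches what the author of the paper did. The toy data and the variable name X1av3 are made up purely for illustration. The idea is that, after -tsset-, L#.X1 is missing whenever that year is absent for the firm, so the sum (and therefore the average) is missing unless all three previous years are present.

```stata
* Toy unbalanced panel (made-up numbers, for illustration only)
clear
input id year X1
1 1995 10
1 1996 20
1 1997 30
1 1998 40
1 1999 50
2 1995 1
2 1997 3
2 1998 4
2 1999 5
2 2000 6
end
tsset id year

* Average of X1 over the three preceding years (t-1, t-2, t-3).
* If any of the three lags is missing, the result is missing.
generate X1av3 = (L1.X1 + L2.X1 + L3.X1) / 3

list id year X1 X1av3, sepby(id)
```

In this toy example firm 2 has no observation for 1996, so X1av3 should be missing for that firm in 1998 and 1999 and available only in 2000, where it averages 1997 to 1999.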
Thanks in advance for any suggestion.
Carmine,
Napoli (Italia)
