
Re: st: RE: regression: fade rate residual income

From   Michael Hanson <>
Subject   Re: st: RE: regression: fade rate residual income
Date   Thu, 23 Oct 2008 21:48:19 -0400


That explanation is clearer than your earlier messages in terms of what you intend to achieve, but whether your objective makes sense is less clear: not enough information is provided on that issue.

The model you (appear to) propose is simply a pooled regression over your panel of firms and years. Thus,

reg residual_income L.residual_income

should give you one single estimate for b1 (the coefficient on lagged residual income) for your entire (unbalanced) sample. Indeed, for what you have described, -rollreg- (or -rolling-) is exactly *not* what you want to do. (It can be used to create a time series of cross-sectional estimates of b1 (a different estimate for b1 per year), for example.)
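To make the contrast concrete, here is a rough sketch on simulated data. Everything in it is an assumption for illustration: the panel dimensions, the seed, the true value of 0.7, and the helper variable lag_ri. It also uses -statsby- rather than -rollreg- or -rolling- to get the year-by-year estimates, simply because that is the shortest way to show the idea.

```stata
* Sketch on simulated data; names, sizes, and the value 0.7 are assumptions.
clear
set seed 2008
set obs 100                                  // 100 hypothetical firms
generate name = _n
expand 10                                    // 10 yearly observations per firm
bysort name: generate year = 1996 + _n       // years 1997 to 2006
xtset name year

* Simulate a first-order autoregression with b1 = 0.7
generate residual_income = rnormal() if year == 1997
bysort name (year): replace residual_income = 0.7*residual_income[_n-1] + rnormal() if _n > 1

* Build the lag once, while the panel structure is intact
* (lag_ri is a hypothetical helper variable)
generate lag_ri = L.residual_income

* Pooled: one single b1 for the whole sample
regress residual_income lag_ri

* Year by year: a time series of cross-sectional b1 estimates
drop if missing(lag_ri)                      // the first year has no lag
statsby b1=_b[lag_ri], by(year) clear: regress residual_income lag_ri
list year b1
```

The pooled regression returns one coefficient; the -statsby- call leaves a small data set with one b1 per year, which is the kind of output you said you do not want.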

Some comments, however:

1. Any (firm, year) pair that is missing will not be included in the regression. So Stata already automatically takes care of your concern about missing consecutive observations in computing b1. This is a consequence of having -xtset- (or, equivalently, -tsset-) your data, so Stata constructs the lagged values correctly. (You can test this claim by making a copy of your data that only includes (say) even years, then try estimating your model again. It should fail to produce an estimate, since L.residual_income is undefined for every even-yeared value of residual_income in this synthetic data set.)

2. What you call a "fade rate" is probably more generally known as an "autoregressive parameter". Some textbooks may discuss the "rate of decay" implied by the value of the autoregressive parameter. The larger b1 is (for values between 0 and 1), the longer it takes for the effects of any given shock to e(i, t+1) to dissipate from the residual_income variable. Hence, b1 is also known as a "measure of persistence" of the shocks to e(i, t+1).

3. It is not obvious that a pooled OLS estimator for b1 is most appropriate. As you have a panel data structure, you might as well try to productively exploit it. I don't know what your exposure to panel data estimators might be, but a large number of textbooks will cover this topic, even at the intermediate/advanced undergraduate level. (This is particularly true in econometrics, which one might reasonably guess is fairly close to your research area given you have data on firms.) The basic question to ask yourself in deciding what estimator to use is what do you hypothesize are the properties of your error term, e(i, t+1)? Once you have some familiarity with some basic panel data estimators, take a look at the -xt- commands for Stata, starting with -xtreg-.

4. That said, you have a lagged endogenous regressor in your equation. Depending on how you model the error term and what your purposes are, that could be a significant problem. The issues involved with lagged endogenous regressors ("dynamic panel data") are more advanced and only some graduate-level econometrics textbooks cover them. In Stata 10, see -xtdpd- and related commands for more information.
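The four comments above can be sketched in one short do-file on simulated data. Treat it as an illustration under stated assumptions, not as a recommendation for your data: the panel size, the years, the seed, and the true b1 of 0.7 are all made up, there is no firm effect in the simulation, and -xtabond- is just one of the commands related to -xtdpd- that I picked for brevity. The half-life line uses the standard first-order autoregression result ln(0.5)/ln(b1).

```stata
* Illustrative sketch on simulated data; every name and number is an assumption.
clear
set seed 12345
set obs 200                                  // 200 hypothetical firms
generate name = _n
expand 12                                    // 12 yearly observations per firm
bysort name: generate year = 1994 + _n       // years 1995 to 2006
xtset name year

* Simulate residual_income(i,t+1) = 0.7*residual_income(i,t) + e(i,t+1)
generate residual_income = rnormal() if year == 1995
bysort name (year): replace residual_income = 0.7*residual_income[_n-1] + rnormal() if _n > 1

* Comment 1: punch a gap and confirm the lag is missing right after it,
* so that firm-year silently drops out of the regression
preserve
drop if name == 1 & year == 2000
assert missing(L.residual_income) if name == 1 & year == 2001
regress residual_income L.residual_income
restore

* Comment 2: pooled estimate of b1 and the implied half-life of a shock
regress residual_income L.residual_income
display "half-life of a shock (years): " ln(0.5)/ln(_b[L.residual_income])

* Comment 3: basic panel estimators (fixed and random effects)
xtreg residual_income L.residual_income, fe
xtreg residual_income L.residual_income, re

* Comment 4: a dynamic panel estimator that instruments the lagged
* dependent variable (Arellano-Bond)
xtabond residual_income, lags(1) vce(robust)
```

Run it as a do-file rather than line by line, since -preserve- and -restore- are meant to be used together in one run. With b1 = 0.7 the half-life formula gives roughly 1.9 years, so about half of a shock is gone after two years.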

Hope this helps,

On Oct 23, 2008, at 5:30 PM, wrote:

Dear Nick (statalisters),

Thank you for your time. Let me be more clear this time.

I would like to examine the autoregressive properties of abnormal earnings (= residual income), i.e. a first-order abnormal earnings autoregression. So I want to use a pooled analysis with one lag, i.e. residual_income(i, t+1) = b0 + b1 * residual_income(i, t) + e(i, t+1), where i is a specific company ("name" as identifier) and t is the year of the observation ("year"). What I want to get is a fade rate b1, which describes the reversal of residual_income. b1 should be one single value in order to predict future residual incomes in another sample (i.e. residual_income next year equals b1 times residual_income this year). I expect b1 to be about 0.7 (b0 = 0).

When I say the regression should run over every two consecutive years for a company, I mean that the regression should ignore cases in which there is more than one year between two observations, because b1 should be the fade rate of residual_income from one year to the following year. The identifier for company is "name" and the year is given by "year". I used:

tsset name year

.panel variable:  name, 1000 to 270705
.time variable:  year, 1974 to 2006, but with gaps

rollreg residual_income l.residual_income, move(2) stub(a)

.sample may not contain gaps


Well, I don't know whether my idea is an appropriate way to solve this problem and to get one single value of b1. Perhaps someone can tell me whether it is, and how to get rid of the gaps (because -rollreg- from SSC does not support gaps in the data).

Thanks for your consideration.
Greg B.

-------- Original Message --------
Date: Mon, 20 Oct 2008 13:37:03 +0100
From: "Nick Cox" <>
Subject: st: RE: regression: fade rate residual income

I think you have problems at various levels.

The most obvious is that -rollreg- from SSC [please remember to explain where user-written programs you discuss come from] does not support data
with gaps. When you -tsset- your data you should have seen a comment
that your data include gaps.

The next is what you are trying to do. If I read this correctly, you
want to look at regressions for pairs of values within each panel. That
gives you at most two distinct data points and you should be able to
solve for the coefficients directly. You will get perfect fits, except when points coincide, in which case the regression will be indeterminate. Also, there
is no question of an error term.

On the other hand, I doubt that I am reading you correctly.

You posted on this topic a week ago. In response both Michael Hanson and I hinted that you may need to explain what you expect in more detail to
get better answers.


I would like to run a regression on residual_income. I have yearly
observations of residual income for firms. The year is given in variable
"year", the identifier for firm is "name".

I'd like to run the regression residual_income(year) = b0 + b1 *
residual_income(year-1) + e The regression should run on
"residual_income" over every two consecutive years ("year") within each
identifier "name" (whenever there are values for at least two
consecutive years for a given name).

I used the following:

drop if missing(residual_income)
tsset name year
rollreg residual_income l.residual_income, move(2) stub(a)

I hoped this command would do what I want, but unfortunately Stata always reports:
sample may not contain gaps

What might be the problem?
