
From: Trond.Petersen@hsphsun2.harvard.edu
To: JA@hsphsun2.harvard.edu
Subject: st: Re: log of variables in xtreg
Date: Tue October 15, 2002

I received your email from a colleague in Stockholm, Peter Hedstrom. I cc the rest of the Stata list so that people can see that the query has been answered. I also send you, not the rest of the list, a paper I wrote on these issues, as a PDF file, based on notes made while teaching regression analysis. There is also a note, an addendum to this paper, written by Leo Goodman. It gives the exact conditions under which a coefficient changes sign as one goes from unlogged to logged values, in the case of comparing two groups. But I don't have a PDF version of it.

What you describe happens with some regularity. I have run across it in teaching graduate classes in regression. The reason is this.

Consider a simple example with a continuous dependent variable y_i and a dichotomous independent variable D_i, for individual i. The point made carries over to multivariate regression and to continuous independent variables.

1. Unlogged form

In unlogged form you estimate

    y_i = b_0 + b_1*D_i + error

Here, b_1 gives the impact on the conditional mean of y_i, that is, the mean difference in y between those with D_i=0 and D_i=1.

2. Logged form

In logged form you estimate

    ln(y_i) = a_0 + a_1*D_i + error

Here, a_1 gives the impact on the conditional mean of ln(y_i), that is, the mean difference in the logarithm of y between those with D_i=0 and D_i=1.

There are two correct interpretations of a_1. The first is the one given above: it gives the impact on the conditional mean of ln(y_i). The second is that it gives the relative impact on the conditional GEOMETRIC mean of the unlogged values, that is, the relative difference in geometric means between those with D_i=0 and D_i=1. To get this, one often computes exp[a_1] - 1.

When researchers interpret a_1 as giving the relative impact on the unlogged y itself, that is, on its conditional arithmetic mean, that is a misinterpretation.

3. Reason 1 for sign change

The interpretational difference identified in 2 is probably the source of the sign change you observe. A variable may have a positive impact on the conditional arithmetic mean of a variable but a negative impact on the conditional geometric mean of the same variable.

4. Reason 2 for sign change

A second reason for a sign change is that in going from unlogged to logged form, or vice versa, you would need to include interaction terms to make the two formulations consistent. Take the impact of age on wages controlling for sex, with a positive coefficient for age. In the unlogged form, the age lines for the sexes are parallel. In the logged form, the difference between the sexes, when transformed back to unlogged values, increases with age. Some kind of interaction term would then be needed in the logged form in order to make the lines parallel for the retransformed variable.

5. Solutions

There are two solutions when the sign change occurred for the first reason.

Solution 1: Estimate an exponential regression

    y_i = exp(a_0 + a_1*D_i) + error_i

Solution 2: Estimate a GLM

    y_i = exp(a_0 + a_1*D_i)*error_i

where error_i in the GLM is gamma distributed.

In Stata you can do both. You need to program the exponential regression yourself, and you would need to include dummy variables for years and countries to get fixed effects. You could develop a random effects estimator, but it would take programming. In the GLM you would also have to include dummy variables for years and countries to get fixed effects. But Stata has already implemented a canned random effects estimator, as I remember it.
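A minimal numerical sketch of Reason 1 and of Solution 1, in Python rather than Stata and with made-up data (the values in y0 and y1 are purely illustrative, not taken from the original query). With a single dummy regressor the fitted values of each model are group summaries, so the slopes can be computed directly from group means without running a regression routine.

```python
import math
from statistics import mean

# Hypothetical data, chosen only to make the sign change visible.
y0 = [10.0, 10.0, 10.0]   # outcomes for D_i = 0
y1 = [1.0, 1.0, 40.0]     # outcomes for D_i = 1 (one large value, two small)

# Unlogged OLS on a dummy: b_1 is the difference in arithmetic means.
b1 = mean(y1) - mean(y0)

# Logged OLS on a dummy: a_1 is the difference in mean logarithms,
# i.e. the log of the ratio of geometric means.
a1 = mean([math.log(v) for v in y1]) - mean([math.log(v) for v in y0])

# Relative difference in geometric means, exp[a_1] - 1.
rel_geometric = math.exp(a1) - 1

# Exponential regression y = exp(a_0 + a_1*D) + error with one dummy:
# least squares fits each group by its arithmetic mean, so the slope
# is the log of the ratio of arithmetic means.
a1_exp = math.log(mean(y1) / mean(y0))

print(f"b_1 (unlogged OLS):        {b1:+.3f}")
print(f"a_1 (logged OLS):          {a1:+.3f}")
print(f"exp[a_1] - 1:              {rel_geometric:+.3f}")
print(f"a_1 (exponential regr.):   {a1_exp:+.3f}")
```

In this toy case the D_i=1 group has the higher arithmetic mean but the lower geometric mean, so b_1 is positive while a_1 from the logged regression is negative. The slope from the exponential regression has the same sign as b_1, because both describe the arithmetic mean.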
Trond Petersen
Professor
Department of Sociology
UC Berkeley

----- Original Message -----
From: "Javier Aparicio" <fjaparicio@altavista.net>
To: <statalist@hsphsun2.harvard.edu>
Sent: Tuesday, October 15, 2002 8:06 AM
Subject: st: log of variables in xtreg

> Dear list,
>
> I am running xtreg with time and country effects in a panel with 50 years and 45 countries. My depvar and two covariates are in constant dollars per capita already.
>
> I am testing some policy indicator variables, and some coefficients switch signs when I take the log of my dollar-valued variables. What worries me is that the estimates are significant either in the log transform, or without it--but with opposite signs.
>
> Any suggestions of what should I do?
>
> Thanks,
>
> -JA

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/

