Statalist The Stata Listserver



Re: st: RE: RE: Re: Generating predicted values for OLS with transformed dependent variables


From   smerryman@kc.rr.com
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: RE: RE: Re: Generating predicted values for OLS with transformed dependent variables
Date   Thu, 13 Apr 2006 06:31:37 -0500

Daniel,

Here are a few references you might find useful:

Lane, P.W., 2002, "Generalized linear models in soil science," 
European Journal of Soil Science 53, 241-251.

Abstract: Classical linear models are easy to understand and fit. 
However, when assumptions are not met, violence should not be used on 
the data to force them into the linear mould. Transformation of 
variables may allow successful linear modeling, but it affects several 
aspects of the model simultaneously. In particular, it can interfere 
with the scientific interpretation of the model. Generalized linear 
models are a wider class, and they retain the concept of additive 
explanatory effects. They provide generalizations of the 
distributional assumptions of the response variable, while at the same 
time allowing a transformed scale on which the explanatory effects 
combine. These models can be fitted reliably with standard software, 
and the analysis is readily interpreted in an analogous way to that of 
linear models. Many further generalizations to the generalized linear 
model have been proposed, extending them to deal with smooth effects, 
non-linear parameters, and extra components of variation. Though the 
extra complexity of generalized linear 
models gives rise to some additional difficulties in analysis, these 
difficulties are outweighed by the flexibility of the models and ease 
of interpretation. The generalizations allow the intuitively more 
appealing approach to analysis of adjusting the model rather than 
adjusting the data. 


Manning, W.G., 1998, "The logged dependent variable, 
heteroscedasticity, and the retransformation problem," 
Journal of Health Economics 17(3), 283-295.


Manning, W.G. and J. Mullahy, 1999, "Estimating log models: To 
transform or not to transform?" NBER Technical Working Paper 0246, 
National Bureau of Economic Research.

Abstract:  Data on health care expenditures, length of stay, 
utilization of health services, consumption of unhealthy commodities, 
etc. are typically characterized by: (a) nonnegative outcomes; (b) 
nontrivial fractions of zero outcomes in the population (and sample); 
and (c) positively-skewed distributions of the nonzero realizations. 
Similar data structures are encountered in labor economics as well. 
This paper provides simulation-based evidence on the finite-sample 
behavior of two sets of estimators designed to look at the effect of a 
set of covariates x on the expected outcome, E(y|x), under a range of 
data problems encountered in everyday practice: generalized linear 
models (GLM), a subset of which can simply be viewed as differentially 
weighted nonlinear least-squares estimators, and those derived from 
least-squares estimators for the ln(y). We consider the first- and 
second- order behavior of these candidate estimators under alternative 
assumptions on the data generating processes. Our results indicate 
that the choice of estimator for 
models of ln(E(y|x)) can have major implications for empirical results 
if the estimator is not designed to deal with the specific data 
generating mechanism. Garden-variety statistical problems - skewness, 
kurtosis, and heteroscedasticity - can lead to an appreciable bias for 
some estimators or appreciable losses in precision for others.
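
As a rough illustration of the retransformation issue these papers 
discuss, here is a minimal sketch in Python rather than Stata. The 
simulated data, the coefficient values, and all variable and function 
names are assumptions made for illustration; none of them come from the 
papers. It fits OLS to ln(y), compares the naive prediction exp(xb) with 
a smearing-corrected prediction (Duan's smearing factor, the mean of the 
exponentiated residuals), and fits a log-link GLM with Poisson-type 
variance by iteratively reweighted least squares.

```python
import math
import random

random.seed(0)

# Simulated data (assumed for illustration): ln(y) = 0.5 + 1.0*x + e, e ~ N(0, 1)
n = 20000
x = [random.random() for _ in range(n)]
y = [math.exp(0.5 + 1.0 * xi + random.gauss(0.0, 1.0)) for xi in x]


def wls(x, z, w):
    """Weighted least squares of z on a constant and x; returns (intercept, slope)."""
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    zbar = sum(wi * zi for wi, zi in zip(w, z)) / sw
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    sxz = sum(wi * (xi - xbar) * (zi - zbar) for wi, xi, zi in zip(w, x, z))
    slope = sxz / sxx
    return zbar - slope * xbar, slope


# 1. OLS on the log scale
ln_y = [math.log(yi) for yi in y]
a, b = wls(x, ln_y, [1.0] * n)

# 2. Naive retransformation: exp(xb) predicts less than E(y|x) here
naive = [math.exp(a + b * xi) for xi in x]

# 3. Smearing correction: scale by the mean of exp(residual)
resid = [li - (a + b * xi) for li, xi in zip(ln_y, x)]
smear_factor = sum(math.exp(r) for r in resid) / n
smeared = [smear_factor * v for v in naive]

# 4. Log-link GLM with Poisson-type variance, fitted by IRLS
g0, g1 = math.log(sum(y) / n), 0.0
for _ in range(25):
    eta = [g0 + g1 * xi for xi in x]
    mu = [math.exp(e) for e in eta]
    z = [e + (yi - m) / m for e, yi, m in zip(eta, y, mu)]  # working response
    g0, g1 = wls(x, z, mu)                                  # weights equal mu
glm = [math.exp(g0 + g1 * xi) for xi in x]

y_mean = sum(y) / n
naive_mean = sum(naive) / n
smear_mean = sum(smeared) / n
glm_mean = sum(glm) / n

print("mean of y:              ", round(y_mean, 3))
print("naive exp(xb):          ", round(naive_mean, 3))
print("smearing-corrected:     ", round(smear_mean, 3))
print("log-link GLM prediction:", round(glm_mean, 3))
```

In this homoscedastic setup the smearing correction and the GLM should 
both track the mean of y closely, while the naive prediction falls 
short. With heteroscedastic errors on the log scale a single smearing 
factor is no longer adequate, which is the case the Manning papers 
focus on.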

Scott



