From: Cameron McIntosh <email@example.com>
To: STATA LIST <firstname.lastname@example.org>
Subject: RE: st: visual guide to variable transformations?
Date: Thu, 7 Jun 2012 21:52:23 -0400
Not exactly what you're searching for either, but if you're thinking about transforming variables in the regression context, I might also recommend taking a look at:
Ip, W.C., Wong, H., Wang, S.-G., & Jia, Z.-Z. (2004). A GIC rule for assessing data transformation in regression. Statistics & Probability Letters, 68(1), 105–110.
Cheng, T.-C. (2005). Robust regression diagnostics with data transformations. Computational Statistics & Data Analysis, 49(3), 875–891.
da Silva, M.V., Van Tassell, C.P., Sonstegard, T.S., Cobuci, J.A., & Gasbarre, L.C. (2012). Box–Cox Transformation and Random Regression Models for Fecal Egg Count Data. Frontiers in Genetics, 2: 112.
Riani, M., & Atkinson, A.C. (2000). Robust Diagnostic Data Analysis: Transformations in Regression. Technometrics, 42(4), 384–394.
Dastan, A., & Horne, R.N. (2011). Robust Well-Test Interpretation by Using Nonlinear Regression With Parameter and Data Transformations. SPE Journal, 16(3), 698–712.
Stöckl, D., & Thienpont, L.M. (2008). Introduction of non-linearity by data transformation in method comparison and commutability studies. Clinical Chemistry and Laboratory Medicine, 46(12), 1784–1785.
Zhou, X.-H., Lin, H., & Johnson, E. (2008). Non-parametric heteroscedastic transformation regression models for skewed data with an application to health care costs. Journal of the Royal Statistical Society, Series B (Statistical Methodology), 70(5), 1029–1047.
> Date: Fri, 8 Jun 2012 00:55:11 +0100
> Subject: Re: st: visual guide to variable transformations?
> From: email@example.com
> To: firstname.lastname@example.org
> I agree with Austin and David. The business of why, when and how to
> transform is rather too complicated to reduce easily to a very concise
> statement. Nevertheless I wrote a Stata-linked guide to
> transformations that is downloadable as a help file. It can be found
> on SSC at -transint-.
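For anyone who has not pulled a package from SSC before, a minimal sketch of fetching and opening that guide is below. It assumes an internet connection and a Stata installation that includes the -ssc- command; nothing here goes beyond the package name given above.

```stata
* Download the -transint- package from SSC (requires internet access)
ssc install transint

* Open the guide, which is distributed as a help file
help transint
```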
> David is too modest to underline that some of the best expository
> material on transformations is still that to be found in a book he
> co-edited with Frederick Mosteller and John Tukey:
> not to mention the evergreen
> Hoaglin, D.C. 1988. Transformations in everyday experience. Chance
> 1(4): 40--45.
> Austin Nichols
> > Why not? Such advice would be generically incorrect.
> > You are assuming only a bivariate relationship among continuous variables,
> > but even in such a restricted setting, linearity and normality are far
> > from required, and it is unclear how you would discern from
> > most scatterplots how to get there even if they were indicated.
> > E.g.
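> > * simulate 1,000 draws of two independent standard normals
> > clear all
> > set seed 1
> > drawnorm z e, n(1000)
> > * x is uniform on (0,1); y has a multiplicative lognormal error
> > generate x = normal(z)
> > generate y = x*exp(e)
> > lpoly y x
> > * y2 has an additive error with mean exp(e) and SD equal to x
> > generate y2 = x + rnormal(exp(e), x)
> > lpoly y2 x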
> > That said, a review of available -glm- links and
> > common -nl- specifications might make a good FAQ.
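As a rough illustration of the -glm- and -nl- alternatives mentioned here, the sketch below fits the same exponential mean function three ways. The simulated data, the coefficient values, and the variable names (x, y, lny) are all hypothetical and chosen only for this example; it is not a recommendation of any one specification.

```stata
* Hypothetical simulated data: y has a multiplicative (lognormal) error
clear all
set seed 1
set obs 1000
generate x = runiform()
generate y = exp(0.5 + 1.0*x + 0.5*rnormal())

* Option 1: transform the response, then fit by least squares
generate lny = ln(y)
regress lny x

* Option 2: leave y untransformed and model log E(y|x) with a log link
glm y x, family(gamma) link(log)

* Option 3: the same mean function fit by nonlinear least squares
nl (y = exp({b0=0} + {b1=1}*x))
```

Note that option 1 models the mean of ln(y), while options 2 and 3 model the log of the mean of y, so the intercepts need not agree even when the slopes are similar.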
> David Hoaglin
> Quite a lot has been written about transformations, including their
> role in regression modeling. I'll have to look for material that
> approaches "a visual guide."
> For now, I would like to correct the misimpression that, after
> transformation, the data on an independent variable should resemble a
> normal distribution. I would not transform an independent variable
> for that reason.
> In the context of a regression model, the main aim in transforming an
> independent variable is to promote linearity of the relation between
> the dependent variable and the independent variable (as you describe
> for Figure 1d). Promoting linearity is also an important aim in
> transforming the dependent variable. Also, if the model involves more
> than one independent variable, transforming the dependent variable may
> make the contributions of the independent variables more nearly
> additive (i.e., reduce or remove interactions among the independent
> variables).
> Another reason for transforming the dependent variable is to make
> residual variability more nearly constant across the range of that
> variable. One usually checks on this by making various plots of
> residuals.
> Choosing transformations often requires thought. It should not be
> reduced to a simple rule. The transformations need to make sense in
> the context of the data.
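A small sketch of the kind of check David describes, comparing residual-versus-fitted plots before and after transforming the dependent variable. The data are simulated under an assumed multiplicative error, and every name and number in the block is hypothetical; real data will rarely behave this cleanly.

```stata
* Hypothetical data with multiplicative error, so spread grows with the mean
clear all
set seed 2012
set obs 500
generate x = 1 + 9*runiform()
generate y = exp(0.2*x + 0.4*rnormal())

* Raw scale: look for curvature and for residual spread that fans out
regress y x
rvfplot, yline(0) name(raw, replace)

* Log scale: by construction the relation is linear with constant spread
generate lny = ln(y)
regress lny x
rvfplot, yline(0) name(logged, replace)
```

Comparing the two graphs side by side is only a starting point; whether a log (or any other) transformation makes sense still depends on the context of the data.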
> Lloyd Dumont
> >> Does anyone know of a visual guide to variable transformations? I have seen many decent verbal explanations of whether, when, and specifically how to transform variables. But is there a single resource that shows which transformation is appropriate when? For example, something like...
> >> When an indep variable is distributed as it is in Figure 1a and is related to the dep var as shown here in Figure 1b, then you should use the _____ transformation. Then, the transformed indep variable will be displayed as in Figure 1c (which I imagine will almost always be something like a normal distribution) and the relationship between the transformed variable and the dep var will be as displayed in Figure 1d (which I imagine will almost always be linear).
> >> Of course, it all gets a little more complicated if we start talking about transforming the dep var, though this sort of transformation could also easily be displayed and explained visually.
> >> Does anyone know of such a resource? If not, why not?
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/