

From: Cameron McIntosh <cnm100@hotmail.com>
To: STATA LIST <statalist@hsphsun2.harvard.edu>
Subject: RE: st: visual guide to variable transformations?
Date: Thu, 7 Jun 2012 21:52:23 -0400

Lloyd,

Not exactly what you're searching for either, but if you're thinking about transforming variables in the regression context, I might also recommend taking a look at:

Ip, W.C., Wong, H., Wang, S.-G., & Jia, Z.-Z. (2004). A GIC rule for assessing data transformation in regression. Statistics & Probability Letters, 68(1), 105–110.

Cheng, T.-C. (2005). Robust regression diagnostics with data transformations. Computational Statistics & Data Analysis, 49(3), 875–891.

da Silva, M.V., Van Tassell, C.P., Sonstegard, T.S., Cobuci, J.A., & Gasbarre, L.C. (2012). Box–Cox Transformation and Random Regression Models for Fecal egg Count Data. Frontiers in Genetics, 2: 112.

Riani, M., & Atkinson, A.C. (2000). Robust Diagnostic Data Analysis: Transformations in Regression. Technometrics, 42(4), 384–394.

Dastan, A., & Horne, R.N. (2011). Robust Well-Test Interpretation by Using Nonlinear Regression With Parameter and Data Transformations. SPE Journal, 16(3), 698–712.

Stöckl, D., & Thienpont, L.M. (2008). Introduction of non-linearity by data transformation in method comparison and commutability studies. Clinical Chemistry and Laboratory Medicine, 46(12), 1784–1785.

Zhou, X.-H., Lin, H., & Johnson, E. (2008). Non-parametric heteroscedastic transformation regression models for skewed data with an application to health care costs. Journal of the Royal Statistical Society, Series B (Statistical Methodology), 70(5), 1029–1047.

Cam

> Date: Fri, 8 Jun 2012 00:55:11 +0100
> Subject: Re: st: visual guide to variable transformations?
> From: njcoxstata@gmail.com
> To: statalist@hsphsun2.harvard.edu
>
> I agree with Austin and David. The business of why, when and how to
> transform is rather too complicated to reduce easily to a very concise
> statement. Nevertheless I wrote a Stata-linked guide to
> transformations that is downloadable as a help file. It can be found
> on SSC at -transint-.
>
> David is too modest to underline that some of the best expository
> material on transformations is still that to be found in a book he
> co-edited with Frederick Mosteller and John Tukey:
>
> http://eu.wiley.com/WileyCDA/WileyTitle/productCd-0471384917,descCd-tableOfContents.html
>
> not to mention the evergreen
>
> Hoaglin, D.C. 1988. Transformations in everyday experience. Chance
> 1(4): 40--45.
>
> Nick
>
> Austin Nichols
>
> > Why not? Such advice would be generically incorrect.
> > You are assuming only a bivariate relationship among continuous variables,
> > but even in such a restricted setting, linearity and normality are far
> > from required, and it is unclear how you would discern from
> > most scatterplots how to get there even if they were indicated.
> >
> > E.g.
> >
> > clear all
> > set seed 1
> > drawnorm z e, n(1000)
> > g x=normal(z)
> > g y=x*exp(e)
> > lpoly y x
> > g y2=x+rnormal(exp(e),x)
> > lpoly y2 x
> >
> > That said, a review of available -glm- links and
> > common -nl- specifications might make a good FAQ.
>
> David Hoaglin
>
> Quite a lot has been written about transformations, including their
> role in regression modeling. I'll have to look for material that
> approaches "a visual guide."
>
> For now, I would like to correct the misimpression that, after
> transformation, the data on an independent variable should resemble a
> normal distribution. I would not transform an independent variable
> for that reason.
>
> In the context of a regression model, the main aim in transforming an
> independent variable is to promote linearity of the relation between
> the dependent variable and the independent variable (as you describe
> for Figure 1d). Promoting linearity is also an important aim in
> transforming the dependent variable. Also, if the model involves more
> than one independent variable, transforming the dependent variable may
> make the contributions of the independent variables more nearly
> additive (i.e., reduce or remove interactions among the independent
> variables).
>
> Another reason for transforming the dependent variable is to make
> residual variability more nearly constant across the range of that
> variable. One usually checks on this by making various plots of
> residuals.
>
> Choosing transformations often requires thought. It should not be
> reduced to a simple rule. The transformations need to make sense in
> the context of the data.
>
> Lloyd Dumont
>
> >> Does anyone know of a visual guide to variable transformations? I have seen many decent verbal explanations of whether, when, and specifically how to transform variables. But, is there a single resource that shows which transformation is appropriate when. For example, something like...
> >>
> >> When an indep variable is distributed as it is in Figure 1a and is related to the dep var as shown here in Figure 1b, then you should use the _____ transformation. Then, the transformed indep variable will be displayed as in Figure 1c (which I imagine will almost always be something like a normal distribution) and the relationship between the transformed variable and the dep var will be as displayed in Figure 1d (which I imagine will almost always be linear).
> >>
> >> Of course, it all gets a little more complicated if we start talking about transforming the dep var, though this sort of transformation could also easily be displayed and explained visually.
> >>
> >> Does anyone know of such a resource? If not, why not?
> >>
>
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
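
As a rough illustration of the point made in the thread about checking residual variability with plots of residuals, here is a minimal Stata sketch. The simulated data, the variable names (x, y, lny) and the choice of a log transformation are assumptions made purely for illustration; none of them come from the posts above, and a log is only one of many transformations one might consider.

```stata
* Sketch: residual-versus-fitted plots before and after logging the
* dependent variable. Data are simulated; all names are hypothetical.
clear all
set seed 12345
set obs 1000
generate x = runiform()
* Multiplicative errors, so variability of y grows with its level
generate y = exp(1 + 2*x + rnormal(0, 0.5))

* Raw scale: residual spread should fan out as fitted values increase
regress y x
rvfplot, yline(0) name(raw, replace)

* Log scale: residual spread should look roughly constant
generate lny = ln(y)
regress lny x
rvfplot, yline(0) name(logged, replace)

* Show the two diagnostic plots side by side
graph combine raw logged
```

With data generated this way, the first plot would be expected to show a funnel shape and the second a more even band, but whether any particular transformation makes sense still depends on the context of the data, as stressed in the thread.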

**Follow-Ups**:
- **Re: st: visual guide to variable transformations?** *From:* Lloyd Dumont <lloyddumont@yahoo.com>

**References**:
- **st: visual guide to variable transformations?** *From:* Lloyd Dumont <lloyddumont@yahoo.com>
- **Re: st: visual guide to variable transformations?** *From:* Austin Nichols <austinnichols@gmail.com>
- **Re: st: visual guide to variable transformations?** *From:* Nick Cox <njcoxstata@gmail.com>
