Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


From: Alan Acock <acock@mac.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: compare effect size between dummys and metrics variables in logistic regression
Date: Mon, 27 Sep 2010 08:35:14 -0700

In Maarten's response below, one of his approaches, as I understand it, is to standardize the predictors, and Maarten says "All of these standardizations can in principle be computed for dummy variables." Does he mean to standardize a dummy independent variable? If your dummy variable represents a clear discrete dichotomy (e.g., gender) with no underlying continuum, then I don't understand what a one-standard-deviation change in the dummy variable means. Going up or down one standard deviation on such a variable is not often an interesting substantive topic.

I have this concern because I often see people report standardized beta weights when they have a dummy predictor and compare those beta weights to the beta weights for continuous predictors. They might say a one-standard-deviation change in race produces a .2 standard deviation change in Y, whereas a one-standard-deviation change in education produces a .3 standard deviation change in Y.

--Alan Acock

On Sep 27, 2010, Maarten buis wrote:

> --- On Sun, 26/9/10, Joerg Eulenberger wrote:
>> I want to calculate a binary logistic regression. I have
>> all metric variables z-transformed (mean = 0, std = 1) to
>> compare the effect size between the independent variables.
>> But I also have dummies in my regression model. What can I
>> do to compare the effect size of the dummies with the
>> effect size of the metric independent variables? Or is
>> that completely impossible?
>
> That is more a conceptual problem, even for metric variables.
> There are many answers possible, none of them will work for
> all situations, and for many (some will say all) situations
> there is simply no answer.
>
> 1) My default is not to compare the effects of variables
> unless I have a specific interest in it.
>
> 2) My default is not to standardize variables that have
> a natural unit. It is just much more informative to say that
> the average income will increase by x euros for every extra
> year of education than to say that the average income will
> increase y standard deviations for every standard deviation
> increase in education. The only exception would be point 1).
>
> 3) If I have a situation where a comparison of coefficients
> is of substantive interest, it is almost always limited to
> only a few variables. In that case I would tailor my
> standardization to those variables alone, and leave the
> remaining variables untouched.
>
> The aim would be to make the units of our variables comparable.
> This can be achieved in many ways, none of these will work in
> all situations. So you would need to look at every pair and
> conceptually think about what makes substantive sense. Some
> examples of such standardizations are:
>
> 3a) Compute z-scores. You basically assume that a 1 standard
> deviation change in one variable is comparable with a 1
> standard deviation change in another variable.
>
> 3b) Standardize on range. You basically assume that a movement
> from the minimum to the maximum in one variable is comparable
> with the same move in another variable. This often does not
> work well when one variable has a restricted number of categories
> while the other has a much larger number of categories. A dummy
> variable is an example of a variable with an extremely limited
> number of categories.
>
> 3c) Standardize on percentile rank scores. You are looking at
> the proportion of respondents that has a value less than your
> own value. This sometimes also makes substantive sense: you
> can have a theoretical reason to believe that people do not
> react to the absolute value of a certain variable but to how
> well they do compared to the rest of the population.
>
> All of these standardizations can in principle be computed for
> dummy variables, but I would be least uncomfortable with
> percentile rank scores.
>
> *-------------------- begin example -------------------------
> sysuse nlsw88, clear
> gen black = race == 2 if race <= 2
> gen byte baseline = 1
>
> local rhs "black grade"
> gen byte touse = !missing(union,black,grade)
> tempvar n i
> gen long `n' = .
> gen long `i' = .
> foreach var of varlist `rhs' {
>     // standardize by standard deviation
>     sum `var' if touse
>     gen z_`var' = (`var' - r(mean))/r(sd)
>     local z_rhs "`z_rhs' z_`var'"
>
>     // standardize by range
>     gen r_`var' = (`var' - r(min))/(r(max)-r(min))
>     local r_rhs "`r_rhs' r_`var'"
>
>     // standardize by percentile rank score
>     drop `n' `i'
>     egen long `n' = count(`var')
>     egen long `i' = rank(`var')
>     gen h_`var' = (`i' - 0.5) / `n'
>     local h_rhs "`h_rhs' h_`var'"
> }
>
> list `rhs' `z_rhs' `r_rhs' `h_rhs' in 1/10
>
> qui logit union `rhs' baseline, nocons
> est store non
>
> qui logit union `z_rhs' baseline, nocons
> est store z
>
> qui logit union `r_rhs' baseline, nocons
> est store range
>
> qui logit union `h_rhs' baseline, nocons
> est store hazen
>
> est tab non z range hazen, eform b(%9.3f)
> *---------------- end example -----------------------
> (For more on examples I sent to the Statalist see:
> http://www.maartenbuis.nl/example_faq )
>
> 4) A special case occurs when a set of your dummy variables
> represents a categorical variable, e.g. race or religion. In
> those cases you might look at -sheafcoef-, see
> -ssc desc sheafcoef- and
> <http://www.maartenbuis.nl/software/sheafcoef.html>
>
> Hope this helps,
> Maarten
>
> --------------------------
> Maarten L. Buis
> Institut fuer Soziologie
> Universitaet Tuebingen
> Wilhelmstrasse 36
> 72074 Tuebingen
> Germany
>
> http://www.maartenbuis.nl
> --------------------------
>
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/
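
The concern about a one-standard-deviation change in a dummy can be made concrete with a short sketch. It assumes the same nlsw88 data and the same black and touse definitions as the quoted example; the scalar names p_black and sd_black are made up for illustration. For a 0/1 variable the standard deviation is set by the share of ones (roughly sqrt(p(1-p)), apart from the n/(n-1) factor in the sample formula), so the coefficient on a z-scored dummy is just the raw 0-to-1 coefficient multiplied by that number.

```stata
* Sketch only: same data and dummy as in the quoted example
sysuse nlsw88, clear
gen byte black = race == 2 if race <= 2
gen byte touse = !missing(union, black, grade)

* the sd of a 0/1 variable is driven by the share of ones
quietly summarize black if touse
scalar p_black  = r(mean)
scalar sd_black = r(sd)
display "share of ones (p) = " %6.3f p_black
display "sample sd         = " %6.3f sd_black
display "sqrt(p*(1-p))     = " %6.3f sqrt(p_black*(1 - p_black))

* raw coefficient (0 -> 1) versus the coefficient per one sd of the dummy
quietly logit union black grade if touse
display "b, black 0 -> 1    = " %6.3f _b[black]
display "b, one sd of black = " %6.3f _b[black]*sd_black
```

The last two numbers differ only by the factor sd_black, which depends on how the sample happens to split between the two groups. That dependence is why a standardized coefficient for a dummy is hard to read next to one for a continuous predictor.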

