Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


From: Maarten buis <maartenbuis@yahoo.co.uk>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: compare effect size between dummys and metrics variables in logistic regression
Date: Mon, 27 Sep 2010 09:20:06 +0000 (GMT)

--- On Sun, 26/9/10, Joerg Eulenberger wrote:
> I want to calculate an binary logistic regression. I have
> all metric variables z-transformed (mean = 0, std=1) to
> compare the effect size between the independent variable.
> But I have also dummys in my Regressionmodell. What can i
> do to compare the effect size of the dummy's with the
> effect size of the metric independent variables? Or is
> that completely impossible?

That is more of a conceptual problem, even for metric variables. There are many possible answers, none of them will work in all situations, and for many (some will say all) situations there is simply no answer.

1) My default is not to compare the effects of variables unless I have a specific interest in that comparison.

2) My default is not to standardize variables that have a natural unit. It is just much more informative to say that the average income will increase by x euros for every extra year of education than to say that the average income will increase by y standard deviations for every standard deviation increase in education. The only exception would be the situation in point 1), where the comparison itself is of interest.

3) If I have a situation where a comparison of coefficients is of substantive interest, it is almost always limited to only a few variables. In that case I would tailor my standardization to those variables alone, and leave the remaining variables untouched. The aim would be to make the units of our variables comparable. This can be achieved in many ways, none of which will work in all situations. So you would need to look at every pair and think conceptually about what makes substantive sense. Some examples of such standardizations are:

3a) Compute z-scores. You basically assume that a 1 standard deviation change in one variable is comparable with a 1 standard deviation change in another variable.

3b) Standardize on range. You basically assume that a movement from the minimum to the maximum in one variable is comparable with the same move in another variable.
This often does not work well when one variable has a restricted number of categories while the other has a much larger number of categories. A dummy variable is an example of a variable with an extremely limited number of categories.

3c) Standardize on percentile rank scores. You are looking at the proportion of respondents that has a value less than your own value. This sometimes also makes substantive sense: you can have a theoretical reason to believe that people do not react to the absolute value of a certain variable but to how well they do compared to the rest of the population.

All of these standardizations can in principle be computed for dummy variables, but I would be least uncomfortable with percentile rank scores.

*-------------------- begin example -------------------------
sysuse nlsw88, clear
gen black = race == 2 if race <= 2
gen byte baseline = 1
local rhs "black grade"

gen byte touse = !missing(union,black,grade)
tempvar n i
gen long `n' = .
gen long `i' = .
foreach var of varlist `rhs' {
    // standardize by standard deviation
    sum `var' if touse
    gen z_`var' = (`var' - r(mean))/r(sd)
    local z_rhs "`z_rhs' z_`var'"

    // standardize by range
    gen r_`var' = (`var' - r(min))/(r(max)-r(min))
    local r_rhs "`r_rhs' r_`var'"

    // standardize by percentile rank score
    drop `n' `i'
    egen long `n' = count(`var')
    egen long `i' = rank(`var')
    gen h_`var' = (`i' - 0.5) / `n'
    local h_rhs "`h_rhs' h_`var'"
}
list `rhs' `z_rhs' `r_rhs' `h_rhs' in 1/10

qui logit union `rhs' baseline, nocons
est store non
qui logit union `z_rhs' baseline, nocons
est store z
qui logit union `r_rhs' baseline, nocons
est store range
qui logit union `h_rhs' baseline, nocons
est store hazen

est tab non z range hazen, eform b(%9.3f)
*---------------- end example -----------------------
(For more on examples I sent to the Statalist see:
http://www.maartenbuis.nl/example_faq )

4) A special case occurs when a set of your dummy variables represents a categorical variable, e.g. race or religion.
In that case you might look at -sheafcoef-, see -ssc desc sheafcoef- and <http://www.maartenbuis.nl/software/sheafcoef.html>

Hope this helps,
Maarten

--------------------------
Maarten L. Buis
Institut fuer Soziologie
Universitaet Tuebingen
Wilhelmstrasse 36
72074 Tuebingen
Germany

http://www.maartenbuis.nl
--------------------------

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/

**Follow-Ups**: **Re: st: compare effect size between dummys and metrics variables in logistic regression** *From:* Alan Acock <acock@mac.com>

**References**: **st: compare effect size between dummys and metrics variables in logistic regression** *From:* Jörg Eulenberger <j.eulenberger@web.de>
