# Re: st: compare effect size between dummys and metrics variables in logistic regression

From: Alan Acock
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: compare effect size between dummys and metrics variables in logistic regression
Date: Mon, 27 Sep 2010 08:35:14 -0700

```In Maarten's response below, one of his approaches, as I understand it, is to standardize the predictors and Maarten says "All of these standardization can in principle be computed for dummy variables." Does he mean to standardize a dummy independent variable? If your dummy variable represents a clear discrete dichotomy (e.g. gender) where there is no underlying continuum then I don't understand a one standard deviation change in the dummy variable. Going up or down one standard deviation on such a variable is not often an interesting substantive topic.

I have this concern because I often see people report standardized beta weights when they have a dummy predictor and compare those beta weights to the beta weights for continuous predictors. They might say a one standard deviation change in race produces a .2 standard deviation change in Y whereas a one standard deviation change in education produces a .3 standard deviation change in Y.

--Alan Acock

On Sep 27, 2010, at Mon Sep 5 2:20, Maarten Buis wrote:

> --- On Sun, 26/9/10, Joerg Eulenberger wrote:
>> I want to estimate a binary logistic regression. I have
>> z-transformed all metric variables (mean = 0, sd = 1) to
>> compare effect sizes across the independent variables.
>> But I also have dummies in my regression model. What can I
>> do to compare the effect sizes of the dummies with the
>> effect sizes of the metric independent variables? Or is
>> that completely impossible?
>
> That is more a conceptual problem, even for metric variables.
> There are many answers possible, none of them will work for
> all situations, and for many (some will say all) situations
> there is simply no answer.
>
> 1) My default is not to compare the effects of variables
> unless I have a specific interest in it.
>
> 2) My default is not to standardize variables that have
> a natural unit. It is just much more informative to say that
> the average income will increase with x euros for every year
> extra education than to say that the average income will
> increase y standard deviations for every standard deviation
> increase in education. The only exception would be point 1).
>
> 3) If I have a situation where a comparison of coefficients
> is of substantive interest, it is almost always limited to
> only a few variables. In that case I would tailor my
> standardization to those variables alone, and leave the
> remaining variables untouched.
>
> The aim would be to make the unit of our variables comparable.
> This can be achieved in many ways, none of these will work in
> all situations. So you would need to look at every pair and
> conceptually think about what makes substantive sense. Some
> examples of such standardizations are:
>
> 3a) compute z-scores. You basically assume that a standard
> deviation change in one variable is comparable with 1
> standard deviation change in another variable.
>
> 3b) standardize on range. You basically assume that a movement
> from the minimum to the maximum in one variable is comparable
> with the same move in another variable. This often does not
> work well when one variable has a restricted number of categories
> while the other has a much larger number of categories. A dummy
> variable is an example of a variable with an extremely limited
> number of categories.
>
> 3c) Standardize on percentile rank scores. You are looking at
> the proportion of respondents that has a value less than your
> own value. This sometimes also makes substantive sense: you
> can have a theoretical reason to believe that people do not
> react to the absolute value of a certain variable but to how
> well they do compared to the rest of the population.
>
> All of these standardization can in principle be computed for
> dummy variables, but I would be least uncomfortable with
> percentile rank scores.
>
> *-------------------- begin example -------------------------
> sysuse nlsw88, clear
> gen black = race == 2 if race <= 2
> gen byte baseline = 1
>
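> // the archived post never defines the local rhs; the list below
> // is inferred from the touse line and is an assumption
> local rhs "black grade"
> gen byte touse = !missing(union,black,grade)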
> tempvar n i
> gen long `n' = .
> gen long `i' = .
> foreach var of varlist `rhs' {
>    // standardize by standard deviation
>    sum `var' if touse
>    gen z_`var' = (`var' - r(mean))/r(sd)
>    local z_rhs "`z_rhs' z_`var'"
>
>    // standardize by range
>    gen r_`var' = (`var' - r(min))/(r(max)-r(min))
>    local r_rhs "`r_rhs' r_`var'"
>
>    // standardize by percentile rank score
>    drop `n' `i'
>    egen long `n' = count(`var')
>    egen long `i' = rank(`var')
>    gen h_`var' = (`i' - 0.5) / `n'
>    local h_rhs "`h_rhs' h_`var'"
> }
>
> list `rhs' `z_rhs' `r_rhs' `h_rhs' in 1/10
>
> qui logit union `rhs' baseline, nocons
> est store non
>
> qui logit union `z_rhs' baseline, nocons
> est store z
>
> qui logit union `r_rhs' baseline, nocons
> est store range
>
> qui logit union `h_rhs' baseline, nocons
> est store hazen
>
> est tab  non z range hazen, eform b(%9.3f)
> *---------------- end example -----------------------
> (For more on examples I sent to the Statalist see:
> http://www.maartenbuis.nl/example_faq )
>
> 4) A special case occurs when a set of your dummy variables
> represents a categorical variable, e.g. race or religion. In
> those cases you might look at -sheafcoef-, see
> -ssc desc sheafcoef- and
> <http://www.maartenbuis.nl/software/sheafcoef.html>
>
> Hope this helps,
> Maarten
>
> --------------------------
> Maarten L. Buis
> Institut fuer Soziologie
> Universitaet Tuebingen
> Wilhelmstrasse 36
> 72074 Tuebingen
> Germany
>
> http://www.maartenbuis.nl
> --------------------------
>
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/

```
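As a rough sketch of what the three standardizations in Maarten's example do to a 0/1 dummy, and of why a "one standard deviation change" in a dummy is hard to interpret: assume a dummy $d$ with a share $p$ of ones among $n$ nonmissing observations, with both values present. The symbols $p$, $n$, $s$ and $b_d$ (the raw logit coefficient of $d$) are introduced here for illustration and do not appear in the thread. Two further assumptions: $s$ uses the population formula for the standard deviation (-summarize- divides by $n-1$, so its value is larger by a factor $\sqrt{n/(n-1)}$), and ties receive their mean rank, which as far as I know is the default of -egen, rank()-.

$$
\begin{aligned}
\text{z-score:} \quad & s = \sqrt{p(1-p)}, & z(1) - z(0) &= \frac{1}{s}, & b_z &= b_d \, s \\
\text{range:} \quad & r(0) = 0,\; r(1) = 1, & r(1) - r(0) &= 1, & b_r &= b_d \\
\text{percentile rank:} \quad & h(0) = \frac{1-p}{2},\; h(1) = 1 - \frac{p}{2}, & h(1) - h(0) &= \frac{1}{2}, & b_h &= 2\, b_d
\end{aligned}
$$

Under these assumptions, switching the dummy from 0 to 1 is a move of $1/s$ standard deviations: 2 standard deviations when $p = 0.5$, and about 3.3 when $p = 0.1$. The coefficient of a z-scored dummy therefore depends on how the sample splits across the two groups, not only on the difference between them. The percentile-rank version always moves by one half, whatever the split, and the range version leaves a 0/1 dummy unchanged.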