Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.

Re: st: compare effect size between dummys and metrics variables in logistic regression

From	Alan Acock <[email protected]>
To	[email protected]
Subject	Re: st: compare effect size between dummys and metrics variables in logistic regression
Date	Mon, 27 Sep 2010 08:35:14 -0700

In Maarten's response below, one of his approaches, as I understand it, is to standardize the predictors, and Maarten says "All of these standardizations can in principle be computed for dummy variables." Does he mean standardizing a dummy independent variable? If your dummy variable represents a clear discrete dichotomy (e.g., gender) with no underlying continuum, then I don't see what a one-standard-deviation change in the dummy variable means. Going up or down one standard deviation on such a variable is rarely a question of substantive interest.

I have this concern because I often see people report standardized beta weights for a dummy predictor and compare them with the beta weights for continuous predictors. They might say a one-standard-deviation change in race produces a 0.2 standard deviation change in Y, whereas a one-standard-deviation change in education produces a 0.3 standard deviation change in Y.
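
To make the concern concrete, here is a minimal sketch. It is an illustration only, not part of Maarten's example: the dataset and the coding of the dummy are borrowed from his code below as an assumption. For a 0/1 variable with a share p of ones, the standard deviation is approximately sqrt(p*(1-p)), so what "one standard deviation" means depends entirely on how the sample splits between the two groups.

```stata
* Illustrative sketch only; dataset and variable coding are assumptions.
sysuse nlsw88, clear
gen byte black = race == 2 if race <= 2      // 1 = black, 0 = white, missing otherwise

quietly summarize black
local p  = r(mean)                           // share of observations coded 1
local sd = r(sd)                             // sample SD, close to sqrt(p*(1-p))

display "share coded 1:         " %6.3f `p'
display "SD of the dummy:       " %6.3f `sd'
display "sqrt(p*(1-p)):         " %6.3f sqrt(`p'*(1-`p'))
display "0 -> 1 switch, in SDs: " %6.3f 1/`sd'

* The same 0 -> 1 switch is a different number of SDs at other shares.
foreach q of numlist 0.1 0.3 0.5 {
    display "p = " %3.1f `q' ":  0 -> 1 equals " %5.2f 1/sqrt(`q'*(1-`q')) " SDs"
}
```

With an even split (p = 0.5) the full switch from 0 to 1 is 2 standard deviations; with p = 0.1 it is about 3.3. So the standardized coefficient for the same dummy, with the same raw effect, changes with the group shares alone.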

--Alan Acock

On Sep 27, 2010, Maarten Buis wrote:

> --- On Sun, 26/9/10, Joerg Eulenberger wrote:
>> I want to estimate a binary logistic regression. I have
>> z-transformed all metric variables (mean = 0, sd = 1) to
>> compare the effect sizes of the independent variables.
>> But I also have dummies in my regression model. What can I
>> do to compare the effect size of the dummies with the
>> effect size of the metric independent variables? Or is
>> that completely impossible?
> 
> That is more a conceptual problem, even for metric variables.
> Many answers are possible, but none of them will work in
> all situations, and for many (some would say all) situations
> there is simply no answer.
> 
> 1) My default is not to compare the effects of variables
> unless I have a specific interest in doing so.
> 
> 2) My default is not to standardize variables that have
> a natural unit. It is much more informative to say that
> average income increases by x euros for every extra year
> of education than to say that average income increases by
> y standard deviations for every standard deviation
> increase in education. The only exception would be point 1).
> 
> 3) If I have a situation where a comparison of coefficients
> is of substantive interest, it is almost always limited to
> only a few variables. In that case I would tailor my 
> standardization to those variables alone, and leave the 
> remaining variables untouched.
> 
> The aim would be to make the units of the variables comparable.
> This can be achieved in many ways, none of which will work in
> all situations. So you would need to look at every pair and
> think conceptually about what makes substantive sense. Some
> examples of such standardizations are:
> 
> 3a) compute z-scores. You basically assume that a standard
> deviation change in one variable is comparable with 1 
> standard deviation change in another variable.
> 
> 3b) standardize on range. You basically assume that a movement
> from the minimum to the maximum in one variable is comparable
> with the same move in another variable. This often does not
> work well when one variable has a restricted number of categories
> while the other has a much larger number of categories. A dummy
> variable is an example of a variable with an extremely limited
> number of categories.
> 
> 3c) standardize on percentile rank scores. You are looking at
> the proportion of respondents that has a value less than your
> own value. This sometimes also makes substantive sense: you
> can have a theoretical reason to believe that people do not
> react to the absolute value of a certain variable but to how
> well they do compared to the rest of the population.
> 
> All of these standardizations can in principle be computed for
> dummy variables, but I would be least uncomfortable with
> percentile rank scores.
> 
> *-------------------- begin example -------------------------
> sysuse nlsw88, clear
> gen black = race == 2 if race <= 2
> gen byte baseline = 1
> 
> local rhs "black grade"
> gen byte touse = !missing(union,black,grade) 
> tempvar n i
> gen long `n' = .
> gen long `i' = .
> foreach var of varlist `rhs' {
>    // standardize by standard deviation
>    sum `var' if touse
>    gen z_`var' = (`var' - r(mean))/r(sd)
>    local z_rhs "`z_rhs' z_`var'"
> 	
>    // standardize by range
>    gen r_`var' = (`var' - r(min))/(r(max)-r(min))
>    local r_rhs "`r_rhs' r_`var'"
> 	
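>    // standardize by percentile rank score (Hazen: (rank - 0.5)/n)
>    // restrict to the estimation sample; store ranks as double
>    // because tied values receive fractional mean ranks
>    drop `n' `i'
>    egen long `n' = count(`var') if touse
>    egen double `i' = rank(`var') if touse
>    gen h_`var' = (`i' - 0.5) / `n'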
>    local h_rhs "`h_rhs' h_`var'"
> }
> 
> list `rhs' `z_rhs' `r_rhs' `h_rhs' in 1/10
> 
> qui logit union `rhs' baseline, nocons
> est store non
> 
> qui logit union `z_rhs' baseline, nocons
> est store z
> 
> qui logit union `r_rhs' baseline, nocons
> est store range
> 
> qui logit union `h_rhs' baseline, nocons
> est store hazen
> 
> est tab  non z range hazen, eform b(%9.3f)
> *---------------- end example -----------------------
> (For more on examples I sent to the Statalist see: 
> http://www.maartenbuis.nl/example_faq )
> 
> 4) A special case occurs when a set of your dummy variables
> represents a categorical variable, e.g. race or religion. In
> those cases you might look at -sheafcoef-, see
> -ssc desc sheafcoef- and 
> <http://www.maartenbuis.nl/software/sheafcoef.html>
> 
> Hope this helps,
> Maarten
> 
> --------------------------
> Maarten L. Buis
> Institut fuer Soziologie
> Universitaet Tuebingen
> Wilhelmstrasse 36
> 72074 Tuebingen
> Germany
> 
> http://www.maartenbuis.nl
> --------------------------
> 
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
