Statalist



RE: st: logistic transformation, proportion variables


From   "Nick Cox" <n.j.cox@durham.ac.uk>
To   <statalist@hsphsun2.harvard.edu>
Subject   RE: st: logistic transformation, proportion variables
Date   Thu, 13 Dec 2007 14:17:45 -0000

"Little" is not the adjective that springs to mind
for that help file. 

More important, I don't think that help file answers
much of the question here. 

As 0 and 1 are attainable, logit in the strict sense is 
out of the question. 
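
A quick way to see the problem, as a minimal sketch: precmon is taken from the model quoted below, and logit_precmon is a name invented here for illustration. Stata's logit() function computes ln(x/(1-x)) and returns missing when x is exactly 0 or 1, so every boundary observation would drop out of any model using the transformed variable.

```stata
* How many observations sit exactly on the boundaries?
count if precmon == 0 | precmon == 1

* logit() is built in; it is defined only for 0 < x < 1
generate double logit_precmon = logit(precmon)

* Observations lost purely because of the transformation
count if missing(logit_precmon) & !missing(precmon)
```

If the second count is not zero, the transformed model is being fitted to a different sample from the original one.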

It seems to me that the main issue with a predictor that is 
a proportion is what is the shape of the function relating 

response | other predictors 

to 

proportional predictor | other predictors 

and, setting aside the instrumental variable aspect here, 
one handle on that might be given by added variable plots
after a plain multiple regression -- or graphical near
equivalents such as -mrunning- or -mlowess-. Use -findit- 
to locate these user-written programs. 
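
A sketch of that check, using the variable names from the model quoted below and treating precmon as an ordinary predictor (the instrument is ignored on purpose). -avplot- is official Stata; the -mlowess- line assumes the program has been installed and that its syntax is response first, then predictors, so check its help file before relying on it.

```stata
* Plain multiple regression, setting the instrument aside for now
regress pstrmon price maturity age coupon pstrmonprev pstrprev ///
    intrest ivol compl precmon

* Added variable plot for the proportional predictor
avplot precmon

* Locate the user-written alternatives
findit mrunning
findit mlowess

* Assumed syntax after installation: response, then predictors
mlowess pstrmon precmon price maturity age coupon
```

Curvature in the precmon panel is the thing to look for; a roughly straight pattern suggests no transformation of the predictor is needed.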

My first stab at this would be to consider some power of 
the predictor, say root or square. That way 0 and 1 stay 
as they are but you can bend the scale in the middle. 
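
As a minimal sketch of that first stab (precmon_root and precmon_sq are names made up for illustration, and precmon comes from the model quoted below):

```stata
* Root and square of the proportion
generate double precmon_root = sqrt(precmon)
generate double precmon_sq   = precmon^2

* Both transformations leave the endpoints where they are
assert precmon_root == precmon if inlist(precmon, 0, 1)
assert precmon_sq   == precmon if inlist(precmon, 0, 1)

* See how each one bends the scale between 0 and 1
twoway line precmon_root precmon precmon_sq precmon, sort
```

Either new variable can then replace precmon in the regression, and the added variable plot can be redrawn to see whether the relationship looks straighter.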

Nick 
n.j.cox@durham.ac.uk 

David Airey


Nick Cox has a little Stata help file on transformations.

ssc install transint

Marck Bulter

> I have a question that is not entirely related to Stata. I do hope
> that you will forgive me.
>
> Assume the following model,
>
> ivreg pstrmon price maturity age coupon pstrmonprev pstrprev
> intrest ivol compl (precmon = precmonprev)
>
> Here pstrmon, pstrmonprev, precmon and precmonprev are all
> proportions, for example value of bond A / total value of bonds.
> Therefore, they can take any value between 0 and 1, 0 and 1 included.
> These last 4 variables are heavily left skewed. After estimation,
> the residuals are heteroskedastic and not normally distributed.
> On the Statalist server I have found several references to logistic
> transformations, ln(y/(1-y)):
> - http://www.stata.com/statalist/archive/2003-07/msg00285.html
> - home.fsw.vu.nl/m.buis/presentations/UKsug06.pdf
> - http://www.stata.com/statalist/archive/2006-02/msg00150.html
>
> If I transform the 4 variables using the logistic transformation, the 4
> variables are no longer skewed, and the residuals are almost
> homoskedastic and almost normally distributed.
> But my question is: is this transformation allowed, given that I have
> mostly seen references only to transforming the dependent variable?
> In addition, the transformation makes the interpretation of the
> coefficients hard. Any comments on this?



