Re: st: logistic transformation, proportion variables
Nick Cox wrote:
"Little" is not the adjective that springs to mind
for that help file.
More important, I don't think that help file answers
much of the question here.
As 0 and 1 are attainable, logit in the strict sense is
out of the question.
It seems to me that the main issue with a predictor that is
a proportion is what is the shape of the function relating
response | other predictors
to
proportional predictor | other predictors
and, setting aside the instrumental variable aspect here,
one handle on that might be given by added variable plots
after a plain multiple regression -- or graphical near
equivalents such as -mrunning- or -mlowess-. Use -findit-
to locate these user-written programs.
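A minimal sketch of the added variable plot idea, using simulated data
rather than the poster's bond data. The variable names y, p and x and the
data-generating process are assumptions made purely for illustration:

```stata
* Simulated data: p is a proportion predictor, x another predictor
clear
set seed 12345
set obs 200
generate x = rnormal()
generate p = runiform()
generate y = 1 + 0.5*x + 2*sqrt(p) + rnormal(0, 0.3)

* Plain multiple regression, then the added variable plot for p,
* which shows y against p after both are adjusted for x
regress y p x
avplot p
```

Curvature in the plot would suggest that p should enter the model on a
transformed scale; -avplots- draws the plot for every predictor at once.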
My first stab at this would be to consider some power of
the predictor, say root or square. That way 0 and 1 stay
as they are but you can bend the scale in the middle.
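A small sketch of why powers are convenient here, on a made-up grid of
proportions (the names p, p_root and p_sq are hypothetical):

```stata
* Five illustrative proportions: 0, .25, .5, .75, 1
clear
set obs 5
generate p = (_n - 1)/4

* Root and square leave 0 and 1 fixed and bend the scale in between
generate p_root = sqrt(p)
generate p_sq   = p^2
list
```

Either transformed variable can then replace p in the regression, and the
added variable plot can be redrawn to see whether the relation looks
straighter.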
Nick
David Airey
Nick Cox has a little Stata help file on transformations.
ssc install transint
Marck Bulter
I have a question that is not entirely related to Stata. I hope
that you will forgive me.
Assume the following model,
ivreg pstrmon price maturity age coupon pstrmonprev pstrprev
intrest ivol compl (precmon = precmonprev)
Here pstrmon, pstrmonprev, precmon and precmonprev are all
proportions (in this case, value of bond A / total value of bonds, etc.).
Each can therefore take any value between 0 and 1, with 0 and 1 included.
These four variables are heavily left-skewed. After estimation, the
residuals are heteroskedastic and not normally distributed.
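One way such residual checks might be run is sketched below on simulated
data. The names y, x, z, res and fit are hypothetical, and -ivregress
2sls- (the newer replacement for -ivreg-, available from Stata 10) is
assumed:

```stata
* Simulated IV setup: x is endogenous, z is its instrument
clear
set seed 2007
set obs 300
generate z = rnormal()
generate u = rnormal()
generate x = 0.8*z + 0.5*u + rnormal()
generate y = 1 + 2*x + u

ivregress 2sls y (x = z)

* Residuals and fitted values from the IV fit
predict double res, residuals
predict double fit, xb

* Skewness/kurtosis test and normal quantile plot for the residuals
sktest res
qnorm res

* Residuals against fitted values: a fan shape hints at heteroskedasticity
scatter res fit
```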
On the Statalist server I have found several references to logistic
transformations, ln(y/(1-y)):
- http://www.stata.com/statalist/archive/2003-07/msg00285.html
- home.fsw.vu.nl/m.buis/presentations/UKsug06.pdf
- http://www.stata.com/statalist/archive/2006-02/msg00150.html
If I transform the 4 variables using the logistic transformation, they
are no longer skewed, and the residuals are almost homoskedastic and
almost normally distributed.
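For reference, a sketch of the transformation on a made-up grid of
proportions (p and lp are hypothetical names). Stata's logit() function
computes ln(p/(1-p)) and returns missing at exactly 0 and exactly 1, so
any such observations drop out of a later estimation:

```stata
* Five illustrative proportions: 0, .25, .5, .75, 1
clear
set obs 5
generate p = (_n - 1)/4

* logit(p) = ln(p/(1-p)); undefined, hence missing, at p = 0 and p = 1
generate lp = logit(p)
list

* How many observations were lost to the transformation
count if missing(lp)
```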
My question is whether this transformation is allowed, as I have mostly
seen references only to transforming the dependent variable.
In addition, the transformation makes the coefficients hard to
interpret. Any comments on this?
*
* For searches and help try:
* http://www.stata.com/support/faqs/res/findit.html
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/
Dear Nick,
I have read the transint files; they are very informative. Thank you for
sharing, and thanks to David Airey for pointing me to transint. But
indeed, they do not answer my question entirely.
Strictly, 100% is possible, but the proportion data I have range from 0
to 0.8. The author of the following published article,
http://www.cepr.org/pubs/new-dps/dplist.asp?dpno=5153
converts 0 values to 0.001 and 1 to 0.999. Not the prettiest solution,
but then, strictly, the logistic transformation is no longer out of the
question.
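A sketch of that recoding on made-up proportions (p, p_adj and lp are
hypothetical names; 0.001 and 0.999 are the values quoted above, not a
recommendation):

```stata
* Five illustrative proportions: 0, .25, .5, .75, 1
clear
set obs 5
generate double p = (_n - 1)/4

* Pull the boundary values just inside the unit interval
generate double p_adj = p
replace p_adj = 0.001 if p == 0
replace p_adj = 0.999 if p == 1

* The logit is now defined for every observation
generate double lp = logit(p_adj)
list
```

Because the logit changes very steeply near 0 and 1, the recoded points
end up far out in the tails, so it may be worth refitting with another
constant (say 0.01 and 0.99) to see how sensitive the results are to that
arbitrary choice.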
My master's thesis is an extension of previous research, in which the
author also used proportions as dependent and independent variables, but
he did not explain whether he transformed the variables or, if so, how.
As for your suggestion of root and square: sqrt does improve things a
bit, but of course the 0 values are problematic, and in addition the
assumptions on the residuals are problematic.
Do you think that the conversion to 0.001 is appropriate? And, more
importantly, is it appropriate to use logistic-transformed variables both
as dependent and independent variables?
Sorry for not being entirely accurate the first time.
Regards,
Marck Bulter
mlowess is currently running; it is a bit computationally intensive.