
From |
Marck Bulter <177316mb@student.eur.nl> |

To |
statalist@hsphsun2.harvard.edu |

Subject |
Re: st: logistic tranformation, proportion variables |

Date |
Fri, 14 Dec 2007 01:23:30 +0100 |

Nick Cox wrote:

I agree with Austin here. The fudge mapping zero -> smidgen before logit cannot be harmless, as logit(smidgen) will become very large negative as smidgen becomes very small. I can't see an easy trade-off there. Otherwise put, the smaller smidgen is, the more you create outliers in your predictor space. Better not to transform or to use a transform that is not problematic at 0. And square roots are not.
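To make the point about outliers concrete, here is a minimal Python sketch (standard library only). The three smidgen values are illustrative choices, not values taken from any dataset in this thread.

```python
import math

def logit(p):
    """Log-odds of a proportion strictly between 0 and 1."""
    return math.log(p / (1.0 - p))

# Illustrative "smidgen" values that might stand in for an exact zero.
smidgens = [1e-3, 1e-6, 1e-30]

for s in smidgens:
    # The logit keeps falling as the smidgen shrinks; the square root stays near 0.
    print(f"smidgen={s:g}  logit={logit(s):8.2f}  sqrt={math.sqrt(s):.6f}")

# For comparison: an interior proportion, and the square root at an exact zero.
print(f"logit(0.2)={logit(0.2):.2f}  sqrt(0)={math.sqrt(0.0):.1f}")
```

The choice of smidgen alone moves the transformed zeros from about -7 to about -69 on the logit scale, while an interior value such as 0.2 sits near -1.4. The square root needs no substitution at 0.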

In addition, exact zeros sometimes convey qualitative information. If a predictor is proportion of income spent on tobacco, then the people with zeros presumably don't smoke and pretending that they do

(even a little) is a distortion of the data.

Without doubt, people do this kind of fudge, and sometimes the

argument is that they can't think of a better way, but I won't sign up to approve.

Nick

P.S. -mlowess- is intensive because -lowess- is.

Austin Nichols wrote:

Marck--

No, replacing 0 with .001 is not appropriate, unless replacing it with

.0001 or .0000001 or 1e-30 etc. instead has no impact on the results,

in which case you could just drop the zeros and get the same results.

Also: Why is the sqrt(0) problematic?

My guess is that a better solution to your problem would be grounded in theory. What is this regression supposed to measure the effects of? If y is a proportion and x1 and x2 are proportions, and they "want to be" transformed via logits, perhaps you should be using the logs of the two counts that make up each proportion: the numerator a and its complement b (the denominator a+b minus the numerator), since

logit(a/(a+b))=ln(a)-ln(b)

so including the logit of a proportion X as an explanatory var is the same as including ln(a) and ln(b) separately and constraining their coefficients N and D to satisfy N+D=0, which is a testable restriction. Using the logit of a proportion Y as the depvar is the same as using ln(a) as the depvar and ln(b) as a regressor and constraining the coefficient on ln(b) to be 1, which is also a testable restriction.
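A quick numerical check of how a logit splits into logs of counts, as a Python sketch. The counts a and b are made-up values: a is the numerator and b is the remainder, so the proportion is a/(a+b).

```python
import math

def logit(p):
    """Log-odds of a proportion strictly between 0 and 1."""
    return math.log(p / (1.0 - p))

# Hypothetical counts: numerator a and complement b, so p = a / (a + b).
a, b = 12.0, 48.0
p = a / (a + b)

lhs = logit(p)                       # logit of the proportion
rhs = math.log(a) - math.log(b)      # log numerator minus log complement
log_p = math.log(a) - math.log(a + b)  # this is ln(p), not logit(p)

print(f"logit(p)      = {lhs:.6f}")
print(f"ln(a) - ln(b) = {rhs:.6f}")
print(f"ln(a) - ln(a+b) = {log_p:.6f}  (equals ln(p) = {math.log(p):.6f})")
```

Under these assumptions the first two quantities agree, so a regressor logit(X) with coefficient beta is equivalent to ln(a) and ln(b) entering with coefficients beta and -beta. The difference ln(a)-ln(a+b) is the log of the proportion itself.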

Of course, if the numerator is zero, the log is undefined and those

obs will drop out of the estimation. Theory can also help you here

sometimes--in particular, perhaps the sqrt(X) is actually what has a

linear effect on Y, not X, as Nick suggests.

On Dec 13, 2007 11:58 AM, Marck Bulter <177316mb@student.eur.nl> wrote:

Nick Cox wrote:

"Little" is not the adjective that springs to mind for that help file.

More important, I don't think that help file answers

much of the question here.

As 0 and 1 are attainable, logit in the strict sense is

out of the question.

It seems to me that the main issue with a predictor that is

a proportion is what is the shape of the function relating

response | other predictors

to

proportional predictor | other predictors

and, setting aside the instrumental variable aspect here,

one handle on that might be given by added variable plots

after a plain multiple regression -- or graphical near

equivalents such as -mrunning- or -mlowess-. Use -findit-

to locate these user-written programs.

My first stab at this would be to consider some power of

the predictor, say root or square. That way 0 and 1 stay

as they are but you can bend the scale in the middle.
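A small Python sketch of that idea, using an arbitrary grid of proportions chosen for illustration. The helper name power_transform is hypothetical.

```python
def power_transform(p, k):
    """Raise a proportion in [0, 1] to the power k (k=0.5 is the root, k=2 the square)."""
    return p ** k

# Arbitrary illustrative grid covering both endpoints.
grid = [0.0, 0.25, 0.5, 0.75, 1.0]

for p in grid:
    root = power_transform(p, 0.5)
    square = power_transform(p, 2)
    # Endpoints map to themselves; interior points are pulled up (root) or down (square).
    print(f"p={p:.2f}  root={root:.4f}  square={square:.4f}")
```

Both endpoints are fixed points of any positive power, so no substitution for exact zeros or ones is needed. Only the spacing of the interior values changes.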

Nick

n.j.cox@durham.ac.uk

Dear Nick,

I have read the transint files, these are very informative. Thank you for sharing. And thanks to David Airey for pointing me to transint. But indeed, these do not answer my question entirely.

Strictly, 100% is possible, but the proportion data I have range from 0 to 0.8. The author of the following published article,

http://www.cepr.org/pubs/new-dps/dplist.asp?dpno=5153

converts 0 values to 0.001 and 1 to 0.999. Not the prettiest solution, but strictly logistic trans. is no longer out of the question.

My master's thesis is an extension of previous research, where the author also used proportion dependent and independent variables, but did not explain if, and if he did, how he transformed the variables.

For your suggestion on root and square, Sqrt does improve things a bit, but of course the 0 values are problematic, in addition both resid assumptions are problematic.

Do you think that the conversion to 0.001 is appropriate? And more important, is it appropriate to use logistic transformed variables as dependent and independent variables?

Sorry for not being entirely accurate the first time.

Regards,

Marck Bulter

Currently, -mlowess- is running; it is a bit computer intensive.

* For searches and help try:

* http://www.stata.com/support/faqs/res/findit.html

* http://www.stata.com/support/statalist/faq

* http://www.ats.ucla.edu/stat/stata/

Nick,

I totally agree, conversion is an awful solution, fitting the data to the model. But still I have to do something with the heteroskedasticity and the non-normal residuals.

As suggested in your transint files, I will give a folded transformation a try.

Thanks for your comment,

Marck Bulter
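For readers unfamiliar with folded transformations, here is a minimal Python sketch of one common member of the family, the folded root sqrt(p) - sqrt(1 - p). Treating this as the transformation meant above is an assumption; the test values are illustrative.

```python
import math

def folded_root(p):
    """Folded square root: sqrt(p) - sqrt(1 - p), defined on the closed interval [0, 1]."""
    return math.sqrt(p) - math.sqrt(1.0 - p)

def folded_log(p):
    """Folded log, which is the logit; undefined at exactly 0 or 1."""
    return math.log(p) - math.log(1.0 - p)

# Illustrative proportions, including the exact 0 that breaks the logit.
for p in [0.0, 0.1, 0.5, 0.8, 1.0]:
    fr = folded_root(p)
    # Only evaluate the folded log strictly inside (0, 1).
    fl = f"{folded_log(p):.4f}" if 0.0 < p < 1.0 else "undefined"
    print(f"p={p:.1f}  folded_root={fr:.4f}  folded_log={fl}")
```

The folded root treats p and 1 - p symmetrically, as the logit does, but stays finite at 0 and 1, so exact zeros need no replacement value.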


