Statalist



Re: st: Dependent continuous variable with bounded range


From   "Pavlos C. Symeou" <p.symeou@jbs.cam.ac.uk>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: Dependent continuous variable with bounded range
Date   Wed, 16 Apr 2008 01:06:24 +0100

Dear Nick,

Stata does not allow any "family" other than "binomial" to be used with the "logit" link function. In particular, the table on page 67 of the "Stata Cross-Sectional Time-Series" Reference Manual, Release 8, lists the allowed pairings of link function and family. I tried the "Gaussian" and "gamma" distributions with the "logit" link function and, as expected, this produced an error. Given my problem with bounded values, would you suggest a different link function that does allow the "Gaussian" or "gamma" distributions (these would be "identity", "log", "power", and "reciprocal")? Otherwise, should I continue with my OLS model, given that the predicted values stay well within the possible range?

Yours truly,

Pavlos

Nick Cox wrote:

I don't consider the binomial to be a continuous distribution. However, it often happens that quite what error family you use is not that important. I'd play with normal (Gaussian) or gamma.
Paradoxically, the fact that your final model does not fit very well -- although well enough to be interesting -- helps you here, as it means that predictions stay well within the possible range.
Downstream of this, in a thesis, paper or oral presentation, it would often be a good idea to disarm potential critics by mentioning the question of violating the outcome range only to dismiss it as not biting in practice.
Pavlos C. Symeou wrote:

Dear Nick,

Thank you for this. I have tried your suggestion below (to confirm, for the option "link" I use "logit" and for the option "family" I use "binomial"). However, none of the coefficients was statistically significant, and after trying a series of permutations it seemed to me that the model could not fit the data adequately. I therefore returned to my original random-effects OLS regression, which you suggest for simplicity. The OLS model's results are also consistent with my theoretical arguments. But I still need to check whether the predicted values lie in [0,10]. I have used the command - predict, xb - to save the fitted values in a new variable. The fitted values range from 5.58 to 6.93. The observed values of my variable range from 2.95 to 8.32. Would this suggest that my model does not suffer from the limitations you note below?
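
The range check described above can be sketched in Stata as follows. This is only an illustration: the variable name yhat is hypothetical, and it assumes the random-effects model has just been fitted so that predict has estimates to work from.

```stata
* Assumes the random-effects model has just been fitted and is still in memory
* Linear prediction; yhat is a hypothetical variable name
predict double yhat, xb
* The reported min and max give the range of the fitted values
summarize yhat
* Missing values sort above every number in Stata, so exclude them explicitly
count if !missing(yhat) & (yhat < 0 | yhat > 10)
```

A count of zero means that no fitted value falls outside [0,10] in the data used for prediction.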

Yours truly,

Pavlos

Nick Cox wrote:

The numeric result for skewness doesn't quite match the fact that the mean is nearer the maximum than the minimum, not that that need be the case.
You possibly have a bit of a tail of fairly lousy firms, but otherwise this distribution looks quite healthy to me. How about
gen repute = reputation / 10
xtgee repute ..., link(logit) family(<continuous>)
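
A hedged sketch of how this suggestion might be carried through in current Stata, using the binomial family with the logit link (the one pairing with logit that xtgee is reported elsewhere in this thread to accept). The names firm, year, x1, and x2 are hypothetical placeholders, and the exchangeable correlation structure and robust standard errors are assumptions, not part of the original advice.

```stata
* firm, year, x1 and x2 are hypothetical; substitute your own variables
xtset firm year
* Rescale the 0-10 score to the unit interval
generate double repute = reputation / 10
* The logit link keeps the fitted mean strictly inside (0,1)
xtgee repute x1 x2, family(binomial) link(logit) corr(exchangeable) vce(robust)
* Predicted mean on the (0,1) scale, then back on the original 0-10 scale
predict double mu_hat, mu
generate double reputation_hat = 10 * mu_hat
```

Because the inverse logit maps any linear predictor into (0,1), reputation_hat cannot leave the 0 to 10 range, whatever the covariate values.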
*
* For searches and help try:
* http://www.stata.com/support/faqs/res/findit.html
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/



