
From: Eva Poen <eva.poen@gmail.com>
To: Statalist <statalist@hsphsun2.harvard.edu>
Subject: st: Opinions on fractional logit versus tobit - prediction and model fit
Date: Thu, 2 Apr 2009 19:27:39 +0100

I'm looking at different ways to model my outcome variable, which is bounded between zero and one (zero and 20, actually, but I don't mind modelling the fraction). It's panel data, and I would like to model individual heterogeneity in the form of random effects (both random intercepts and random slopes). There are a lot of observations at zero and at one. I'm reasonably confident that the random effects are independent of the other variables in the model.

So far I have been looking at the fractional logit model, as introduced by Papke and Wooldridge in their 1996 Journal of Applied Econometrics paper. I use -gllamm- to estimate a model with random effects. I have also been looking at the tobit model, which I again estimate using -gllamm- with random effects.

I have a few doubts about the fractional logit model (FLM), and would like to hear other people's opinions:

- Although it appears to be a very elegant solution, some people say that FLM is not well suited for problems with a lot of zeros or ones; for example, Maarten Buis said so in this post (but didn't provide a reference):
  http://www.stata.com/statalist/archive/2007-07/msg00786.html
  If someone knows any references where this is discussed, I'd be grateful to receive them.

- Since FLM is quasi-likelihood, any likelihood-based approaches to model fit are ruled out. For the tobit model I can use those measures. The only other option I can think of for FLM is to compare predicted values with actual values. However, do predicted values in FLM make sense? We know that the distributional assumption is not true, so I'm wondering how meaningful predicted values are in this context.

- I am getting sensible estimates for the random effects with the tobit approach, and not so sensible ones with FLM. In fact, FLM estimates two of the three to be zero. Is this a sign of my model being incorrectly specified, or could it be a sign of FLM not handling the zeros and ones very well?
Many thanks,
Eva
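The idea of comparing FLM predicted values with actual values can be sketched in a few lines. This is not the -gllamm- model described in the post: it is a pooled fractional logit with a single covariate and no random effects, fitted by Newton's method on the Bernoulli quasi-log-likelihood, written in plain Python. The data are invented for illustration (counts out of 20, including zeros and twenties), and every name in it (`fit_fractional_logit`, `rmse`, `x`, `counts`) is hypothetical.

```python
import math


def logistic(z):
    """Inverse logit link."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_fractional_logit(x, y, max_iter=50, tol=1e-10):
    """Maximise sum(y*log(p) + (1-y)*log(1-p)) with p = logistic(b0 + b1*x).

    y may be any fraction in [0, 1], including exact zeros and ones.
    Returns the intercept b0 and slope b1.
    """
    b0, b1 = 0.0, 0.0
    for _ in range(max_iter):
        p = [logistic(b0 + b1 * xi) for xi in x]
        # Score vector (gradient of the quasi-log-likelihood).
        g0 = sum(yi - pi for yi, pi in zip(y, p))
        g1 = sum((yi - pi) * xi for xi, yi, pi in zip(x, y, p))
        # Negative Hessian, a 2x2 matrix [[h00, h01], [h01, h11]].
        w = [pi * (1.0 - pi) for pi in p]
        h00 = sum(w)
        h01 = sum(wi * xi for wi, xi in zip(w, x))
        h11 = sum(wi * xi * xi for wi, xi in zip(w, x))
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system explicitly.
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0 += d0
        b1 += d1
        if max(abs(d0), abs(d1)) < tol:
            break
    return b0, b1


def rmse(actual, predicted):
    """Root mean squared difference between actual and predicted values."""
    n = len(actual)
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, predicted)) / n)


# Invented illustration data: successes out of 20 at nine covariate values.
x = [-2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0]
counts = [0, 0, 2, 5, 9, 14, 18, 20, 20]
y = [c / 20.0 for c in counts]

b0, b1 = fit_fractional_logit(x, y)
fitted = [logistic(b0 + b1 * xi) for xi in x]
mean_actual = sum(y) / len(y)
mean_fitted = sum(fitted) / len(fitted)
fit_rmse = rmse(y, fitted)

print("b0 =", b0, "b1 =", b1)
print("mean actual =", mean_actual, "mean fitted =", mean_fitted)
print("RMSE =", fit_rmse)
```

Two points the sketch is meant to show. First, the fitted values are estimates of the conditional mean E[y|x], and the quasi-likelihood estimator relies only on that mean being correctly specified, not on the outcome actually being Bernoulli; the fitted values always lie strictly between zero and one, even where the observed fraction is exactly zero or one. Second, with an intercept in the model, the first-order condition forces the average fitted value to equal the average observed value, so a summary such as the RMSE is only a descriptive measure of fit, not a likelihood-based one. Whether this carries over to a random-effects specification is not addressed here.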

**Follow-Ups**:

- Re: st: Opinions on fractional logit versus tobit - prediction and model fit
  From: Stas Kolenikov <skolenik@gmail.com>
- st: RE: Opinions on fractional logit versus tobit - prediction and model fit
  From: "Verkuilen, Jay" <JVerkuilen@gc.cuny.edu>
