Notice: On March 31, it was **announced** that Statalist is moving from an email list to a **forum**. The old list will shut down at the end of May, and its replacement, **statalist.org**, is already up and running.


From: Nick Cox <njcoxstata@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: ordered logistic integration problems
Date: Thu, 21 Mar 2013 15:03:22 +0000

I agree with the implication that -glm- is astonishingly little known as a way of handling responses that are continuous proportions. The FAQs cited both predate an excellent concise review:

SJ-8-2  st0147  Stata tip 63: Modeling proportions
        C. F. Baum
        Q2/08   SJ 8(2):299--303   (no commands)
        tip on how to model a response variable that appears as a proportion or fraction

But I don't think that your response is statistically that odd-ball. It is a fraction or proportion based on counts, and few things could be more statistical. (I wouldn't call it a percent, but that is a small language issue.)

There also seems nothing unusual in the idea that different proportions arise from different combinations. 1/1 of cars in our household have four seats and 3/3 cars in my friend's household. The same fraction, different situations, some information loss on data reduction.

Whether a fraction is the best scale for research is a good scientific question, naturally. However, it sounds as if your distribution is rather lumpy, which won't make anything easier, but it is difficult to guess whether that will be really problematic.

-glm- does not entail numerical integration. That doesn't mean it always converges to sensible results.

Nick

On Thu, Mar 21, 2013 at 2:18 PM, Bontempo, Daniel E <deb193@ku.edu> wrote:

> Thanks. I had not realized the glm command could handle the 0's and 1's. That may be the best distribution, although the DV is such an oddball animal: half count, half proportion, and a bit standardized to each person. Recall it is the percent correct of the count of spontaneously attempted past tense verb forms in a given period of recording their speech.
>
> Also, unlike many proportions in developmental science showing floor and ceiling effects, where the variance is small for all 0's early on, large in the middle, and small again as kids score mostly 1's later on, this is very odd because of the "spontaneous" aspect. The kids are clever, and they choose easier verbs (e.g., put) in the middle, with the consequence that percent does not always mean the same thing, because it leaves out the dimension of "difficulty" of the attempts.
>
> Returning to the issue of integration: like ologit, glm seems to be running fine. I do not think numerical integration is involved in the iterations these routines are doing. The ones doing numerical integration seem to have the trouble with this data.
>
> My lingering question is: do I take the integration difficulties in some routines as a reason to suspect the results of glm when it runs without issue?

Richard Williams

> My guess is that you are spreading the data too thin. If I follow you, the DV has 12 values, and 90% of the cases are a 1, which means the other 11 values average less than 1% of the cases. With gologit2 you are estimating 11 sets of coefficients. I am not surprised you have to collapse to only 3 categories.
>
> But why are you using an ordinal model in the first place? Why not a model specifically designed for proportions? See, for example,
>
> http://www.stata.com/support/faqs/statistics/logit-transformation/
>
> http://www.ats.ucla.edu/stat/stata/faq/proportion.htm

Bontempo, Daniel E

>> Can anyone explain the kind of data conditions that cause gllamm or glogit2 to spit out:
>>
>> flat or discontinuous region encountered
>> numerical derivatives are approximate
>> nearby values are missing
>> could not calculate numerical derivatives
>> missing values encountered
>> r(430);
>>
>> I have a colleague with proportion data that only has about 12 discrete values between 0 and 1 with about 90% 1's. Skew -3.27, Kurtosis > 15.
>>
>> We want to model for 3 groups (between) and 3 occasions (within). Prior work published in 2000 had similar proportions and used HML (Gaussian) and got interpretable results. After looking at the distributions, I suggested ologit might be more appropriate than regress.
>>
>> I was already concerned about these proportion DVs because my colleague has calculated proportion correct of however many scorable events there were, and the number of events differs a lot from subject to subject. Some have 2, some have 10. BUT - my question for the moment is technical difficulty with numerical derivatives.
>>
>> Since there is occasion nested within person, I was interested in gllamm with the ologit link, as well as robust ologit with "cluster(subject)". I also tried glogit2 because I was unsure the parallel regression assumption was met.
>>
>> I easily get ologit to run. However, both gllamm and glogit2 make similar complaints about missing or discontinuous numerical derivatives and do not complete. I tried the log-log link in glogit2 since the values rise slowly from 0 and suddenly go to 1. I kept rounding to get fewer levels.
>>
>> I have to collapse to only 3 levels to get glogit2 to run. gllamm keeps telling me to use trace and check initial model, but when I do I see reasonable fixed effect values.
>>
>> Is ologit able to use an estimation method that avoids these integration issues?
>>
>> I am trying to get the disaggregated data so multilevel logistic regressions can be done, but it is not clear disaggregated data will be available.
>>
>> Any pointers, advice, suggestions, references ... all appreciated.

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/faqs/resources/statalist-faq/
*   http://www.ats.ucla.edu/stat/stata/
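
A minimal sketch of the -glm- approach discussed in this thread, for readers who want to see what the call might look like. The variable names (prop, group, occasion, subject, ncorrect, nattempt) are hypothetical and do not appear in the original posts, and the model specification is only one plausible reading of the "3 groups by 3 occasions" design, not the posters' actual analysis.

```stata
* Assumed (hypothetical) variables:
*   prop      proportion correct, between 0 and 1 inclusive
*   group     between-subject factor with 3 levels
*   occasion  within-subject factor with 3 levels
*   subject   person identifier

* Fractional logit: binomial family, logit link, with standard errors
* clustered on person to allow for repeated occasions per subject.
glm prop i.group##i.occasion, family(binomial) link(logit) vce(cluster subject)

* If the underlying counts are available (assumed here: ncorrect successes
* out of nattempt scorable attempts), the binomial denominator can be
* supplied directly, so a subject with 10 attempts carries more
* information than a subject with 2.
glm ncorrect i.group##i.occasion, family(binomial nattempt) link(logit) vce(cluster subject)
```

Neither call involves numerical integration, since no random effects are integrated out; the clustered standard errors allow for within-person dependence but do not model it, so this is not a substitute for a multilevel model where one is needed.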

**References**:

- **st: ordered logistic integration problems** *From:* "Bontempo, Daniel E" <deb193@ku.edu>
- **Re: st: ordered logistic integration problems** *From:* Richard Williams <richardwilliams.ndu@gmail.com>
- **RE: st: ordered logistic integration problems** *From:* "Bontempo, Daniel E" <deb193@ku.edu>
