Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


From: Marcos Almeida <virtual.596@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Ocratio gives neither AIC nor BIC
Date: Sun, 29 Dec 2013 17:33:12 -0200

Dear David and dear Statalisters,

Thank you for your reply and for the tips.

Regarding the details of the data set of around 1800 adults: the independent variables are age_group, defined by age span (5 groups: 40-49, 50-59, 60-69, 70-79, and > 80 years); gender, diabetes, hypertension, and dyslipidemia (all four binary variables); and the continuous bmi (body mass index).

The dependent variable comes from time-domain analysis of heart rate variability. It is called pnn50. Pnn50 is the result of (validated) computerized measurements made over a 24-hour electrocardiogram and reflects parasympathetic flow: the higher the value, the higher the parasympathetic flow. In this dataset, pnn50 has mean = 9, SD = 15, min = 0.01, and max = 213. Pnn50 was not normally distributed, even after log transformation (Shapiro-Francia test). It was also positively skewed.

The scatterplot of age versus pnn50 showed pnn50 tending to increase with age, as expected. But there was too much variance and there were many extreme values (as we see, from 0.01 to 213). Also, the residuals were not normally distributed, even after log-transforming. By the way, I chose the log transformation after examining the output of the "ladder" command: graphically and numerically, it seemed the best option compared with the other transformations.

Regarding the glm, since the dependent variable was continuous, I chose the Gaussian family. Indeed, as you rightly noticed, the link must be log, which is what I actually specified: I just mistyped "logit" in my last message. This was the command:

. glm pnn50 gender bmi diabetes hyper dyslipidemia group, family(gaussian) link(log)

Afterwards, I obtained the residuals (with "predict") and examined their histogram. I also plotted them against the predictors, for example:

. twoway (scatter resid bmi)

That showed many atypical observations; I mean, a substantial percentage of them were widely scattered rather than concentrated.
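The fit-and-check sequence described above can be sketched as follows. This is only a minimal sketch, not the original analysis: it assumes the auto dataset shipped with Stata as a stand-in, with price playing the role of the skewed outcome and mpg and weight as arbitrary predictors. The variable names mu_hat and dres are illustrative.

```stata
* Stand-in data (assumption): auto.dta ships with Stata; price is a
* strictly positive, positively skewed outcome standing in for pnn50.
sysuse auto, clear

* Gaussian family with a log link, as in the command quoted above
glm price mpg weight, family(gaussian) link(log)

* Fitted means and deviance residuals (variable names are illustrative)
predict double mu_hat, mu
predict double dres, deviance

* Residuals against fitted values and against one predictor
twoway scatter dres mu_hat, yline(0)
twoway scatter dres weight, yline(0)
```

If the spread of the residuals grows with the fitted values in the first plot, that would be one sign that the constant-variance assumption of the Gaussian family is doubtful for such an outcome.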
The scatterplots against the other independent variables were not very promising either, because of the same problem: too much variation. The issues regarding the violation of the model's assumptions are the same ones I described in my first message (quoted below). I gather that my awkward mistyping (logit, when I meant log) made that message confusing. I hope it is clearer now.

That said, I thought the extreme variance of the dependent variable could be somewhat "adjusted" by categorizing it according to its quartiles, so I did that. That was the reason I tried multinomial, ordered, and generalized ordered models, and I used AIC and BIC to pick the best model. The generalized ordered model with partial proportional odds fitted best. But when I used the user-written ocratio (for continuation-ratio models), I could get neither AIC nor BIC, and I really don't know why. I want them so that I can compare this model with the others.

The troubleshooting was as follows: after ocratio, Stata 13 did not display AIC and BIC. I understand they can be displayed by typing "aic". I also typed "estat ic" and even tried the user-written fitstat. But I got neither AIC nor BIC, only the message in red: "estimates not found". That is what puzzled me.

If this kind of model is inappropriate for the task, please let me know. Indeed, this is the first time I have delved into polytomous models.

Thank you, David, for your considerations. I still hope to get further advice and suggestions from you and the other members of Statalist!

Best regards,

Marcos Almeida
Associate Professor of Medicine
UNIT
Brazil

>Date: Sat, 28 Dec 2013 21:14:43 -0500
>From: David Hoaglin <dchoaglin@gmail.com>
>Subject: Re: st: Ocratio gives neither AIC nor BIC
>
>Dear Marcos,
>
>I have not seen any replies so far to your posting. Perhaps I can
>make a start, though I have more questions than answers.
>I did not see any information on your dependent variable, other than
>that it is continuous and very positively skewed. It would help me
>(and it may help others) to know more about the nature of that
>variable.
>
>Skewness of the dependent variable, considered alone, does not
>necessarily prevent you from using it in a regression model. It is
>more important to examine (e.g., in scatterplots) the relation between
>the dependent variable and each of the predictor variables. Some of
>those relations may account for the apparent skewness.
>
>After fitting an initial model, you should examine the residuals. The
>various plots may suggest that you transform the dependent variable
>(e.g., to a logarithmic scale).
>
>A GLM is a common alternative to using a transformation, but I don't
>understand why you chose the logit link. With a continuous dependent
>variable, I would have expected a log link.
>
>I will stop here. The rest of your analysis goes in a direction whose
>logic I do not understand.
>
>David Hoaglin
>
>On Fri, Dec 27, 2013 at 5:02 PM, Marcos Almeida
><virtual.596@gmail.com> wrote:
>> Hello, Statalisters,
>>
>> I have a dataset whose continuous dependent variable is very
>> positively skewed. I decided to eschew regression analysis, even after
>> log-transforming it, because I gather a generalized linear model gives
>> better adjustments for this "situation".
>>
>> After testing with glm family (gaussian) link (logit), it still
>> presented signs of needing a better-fit model. Then, I took the
>> decision to create a new variable, that is, I transformed the
>> dependent variable in quartiles. After that, I got 4 categories (up to
>> the 25th percentile; from the 25th to the median; from the median to
>> the 75th percentile; from the 75th up to the highest value).
>>
>> And now comes my question.
>> I compared several models: the multinomial logit (mlogit), the
>> ordinal logit (ologit), the generalized ordered model (the
>> user-written gologit2 command), and finally gologit2 with partial
>> proportional odds (autofit option), which pleased me most. I say so
>> because the multinomial logit violated the IIA assumption (by the
>> much-debated Hausman test), the ologit violated the
>> proportional-odds assumption, and gologit2 with the autofit option
>> duly accommodated partial proportional odds.
>>
>> After fitting each model, I calculated the AIC and BIC without any
>> trouble. However, just for a last try, I decided to fit a
>> continuation-ratio model. At first, I found it a reasonable option,
>> theoretically speaking.
>>
>> After installing the user-written ocratio, I did the estimations and
>> all seemed to be just fine. But I noticed something wrong: the report
>> didn't show the AIC statistic. That came as a surprise. I really don't
>> understand what might have happened. I tried (almost) everything I
>> knew in terms of commands: estat ic, for example. I also installed
>> user-written commands, like fitstat, unfortunately to no avail. By
>> the way, I carefully read a book (Generalized Linear Models and
>> Extensions, Hardin and Hilbe, Stata Press, page 343), and, lo and
>> behold, there ocratio gave the AIC after typing "aic". With much hope,
>> I typed this command, again to no avail. Sadly enough, all I got was
>> the message in red: estimates not found.
>>
>> I checked the FAQs on the matter as well as potential queries on the
>> Web, but found nothing related to this. And I'm still perplexed.
>>
>> My software is Stata 13 IC, updated weekly. I wonder if you could give
>> me some advice.
>>
>> Finally, I heartily thank you for your consideration.
>>
>> Best regards,
>>
>> Marcos Almeida

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/faqs/resources/statalist-faq/
*   http://www.ats.ucla.edu/stat/stata/
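When a command does not support estat ic, AIC and BIC can still be computed by hand from a log likelihood, a parameter count, and a sample size, using the definitions that estat ic itself applies: AIC = -2 lnL + 2k and BIC = -2 lnL + k ln(N). The sketch below checks that arithmetic on a model where estat ic is available. It is an illustration under assumptions, not a fix for ocratio: auto.dta stands in for the real data, price_q (quartile groups of price) stands in for the categorized pnn50, and the scalar names are made up. It does not assume that ocratio leaves a log likelihood in e(ll).

```stata
* Stand-in data (assumption): auto.dta ships with Stata
sysuse auto, clear

* Cut a continuous outcome into quartile groups, as described in the post
xtile price_q = price, nquantiles(4)

* An ordered logit, which does support -estat ic-
ologit price_q mpg weight
estat ic

* The same quantities by hand, from the stored results.
* e(rank) counts all estimated parameters, cutpoints included.
scalar k   = e(rank)
scalar aic = -2*e(ll) + 2*k
scalar bic = -2*e(ll) + k*ln(e(N))
display "AIC = " %9.3f aic "   BIC = " %9.3f bic
```

For a model fitted by a command that does not post these results, e(ll), e(rank), and e(N) would be replaced by the log likelihood, the number of estimated parameters, and the number of observations read from its output. The comparison with the other models is meaningful only if that log likelihood refers to the same outcome and the same observations.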

**Follow-Ups**: **Re: st: Ocratio gives neither AIC nor BIC** *From:* Phil Schumm <pschumm@uchicago.edu>
