Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


From: Marcos Almeida <[email protected]>
To: [email protected]
Subject: Re: re: st: Ocratio gives neither AIC nor BIC
Date: Wed, 1 Jan 2014 11:56:22 -0200

Hello Phil and all Statalisters,

Dear Phil, thank you very much for your reply and your suggestions, and also for the research you did on heart rate variability.

Regarding pnn50, as I underlined, we are dealing with long-run (24 h) "real-life" electrocardiograms, and the variable represents the mean value over the whole period. It is therefore different from the recordings we would take in 15 minutes with patients resting on a bed. There is no standard for the elderly population, and my aim is to shed some light on how this variable changes across age groups, taking a few covariates into consideration.

The "problem" was that there were many outliers, and I felt we should not simply "dismiss" them, for many reasons, not least the very fact that we still have no standard for the elderly population, particularly for 24 h reports.

I absolutely agree with your and David's recommendations, and I believe that, to some extent, as I mentioned, they were put into practice, including the log transformation and the checking of residuals. To my dismay, I am afraid these steps were not enough to cope with such a huge variation. That prompted me to accept losing some information by using the quartiles, with the reward of a much more "representative" estimation. Or "stable", if you will.

So, for the first time, I "plunged" into a whole bunch of multinomial response models. Unfortunately, the proportional-odds assumption did not favour the ordered logit model, nor did the IIA assumption favour the multinomial logit model. Then I tested the generalized ordered outcome model, and the adjustments for partial proportional odds pleased me most. I did this by installing the user-written gologit2 and adding the option "autofit" to the Stata command. The reason was that only some covariates violated the proportional-odds assumption, so I found it reasonable to relax the constraint for those covariates alone and keep it for the others.
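
For readers who want to reproduce this kind of comparison, here is a minimal sketch, not the analysis actually run for this post. It uses simulated data, and the variable names (pnn50, age, bmi, pnn50_q) and stored-estimate names (po, ppo, mnl) are hypothetical. It assumes gologit2 has been installed from SSC and that its results can be stored with -estimates store- in the same way as those of official estimation commands.

```stata
* Sketch only: simulated data and hypothetical variable names.
* gologit2 is user-written; install it first with: ssc install gologit2
clear
set seed 2014
set obs 1800
generate age   = 60 + 30*runiform()
generate bmi   = rnormal(26, 4)
generate pnn50 = exp(1.5 - 0.02*age + 0.01*bmi + rnormal())

* Collapse the continuous outcome into sample quartiles (coded 1 to 4)
xtile pnn50_q = pnn50, nquantiles(4)

* Proportional-odds (ordered logit) model
ologit pnn50_q age bmi
estimates store po

* Partial proportional odds: autofit relaxes the parallel-lines
* constraint only for covariates that appear to violate it
gologit2 pnn50_q age bmi, autofit
estimates store ppo

* Multinomial logit for comparison
mlogit pnn50_q age bmi
estimates store mnl

* One table with log likelihood, degrees of freedom, AIC and BIC
estimates stats po ppo mnl
```

Storing each fit and calling -estimates stats- once keeps the AIC and BIC values side by side, computed on the same estimation sample, which is what makes them comparable.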

By the way, I believe I dutifully followed the instructions in the book I mentioned: Generalized Linear Models and Extensions, Hardin and Hilbe, Stata Press, third edition, 2012. Indeed, no sooner had I read the chapter on continuation-ratio models than I decided to give it a try and compare the results with my "collection" of AICs and BICs from the models evaluated so far. I installed the user-written command ocratio, as suggested on pages 339-344. According to the output presented in the book, we should easily get the AIC by just typing "aic" after the estimation. That did not happen, though, not even when typing "estat ic" or after installing the user-written "fitstat". All I got was the message in red: "estimates not found".

Since my software is a weekly updated Stata 13 IC, and I gather the version used in the above-mentioned book might well be Stata 11 or 12, I decided to share this "situation" on Statalist, in case it is some kind of troubleshooting matter. Hopefully you can give me some further advice on how to finally get the AIC after ocratio.

Thank you again for all the consideration and thoughtful suggestions! I wish all Statalisters an excellent 2014!

Best regards,

Marcos Almeida
Associate Professor of Medicine
UNIT
Brazil

Date: Mon, 30 Dec 2013 12:09:36 -0600
From: Phil Schumm <[email protected]>
Subject: Re: st: Ocratio gives neither AIC nor BIC

On Dec 29, 2013, at 1:33 PM, Marcos Almeida <[email protected]> wrote:
> The dependent variable relates to heart rate variability software in time domain analysis. It is called pnn50. Pnn50 is a result of (validated) computerized measurements done over a 24 hour electrocardiogram and conveys the parasympathetic flow: the higher the values, the higher the parasympathetic flow.
>
> In this dataset, the mean of pnn50 is 9; the SD = 15; min = 0.01; max = 213.

My understanding was that pNN50 is a percentage. The mean and SD you cite sound plausible (e.g., Ramaekers et al. 1998), but the minimum (and obviously the maximum) do not. How does your measure differ from the standard pNN50 calculation? As David said, understanding how your dependent variable is measured/calculated is typically the first step in determining a reasonable model. For example, if you are indeed modeling a percentage, then a logistic model (i.e., glm with logit link and binomial variance function) might be a plausible candidate.

Another important aspect of choosing an appropriate model is the goal of your analysis. Are you looking merely to test a null hypothesis, or are you looking for a richer description of the relationships in the data? Are you primarily interested in estimating how the response changes (possibly on a specific scale) according to changes in your covariates? Are you looking to replicate a previous analysis, or to provide information that can be used in subsequent studies or possibly for clinical purposes (e.g., a nomogram)? Or, are you primarily interested in prediction? Having a clear idea of your analytic goal(s) is also an important part of model-building.

David gave excellent advice WRT using plots to examine the distribution of your dependent variable *conditional on* the covariates (as opposed to only the marginal distribution of the dependent variable). The most important features here are the mean of the distribution (which determines the appropriate link function) and the variance (which determines the appropriate variance function or distribution family). In particular, so-called component plus residual plots are excellent for examining how the mean changes with the covariates, while a smoothed plot of the absolute (standardized) residuals can be helpful in identifying an appropriate variance function (I presume that these strategies or similar alternatives are discussed in Hardin and Hilbe's book).
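
As an illustration of the two ideas above (a logit link with a binomial variance function for a proportion, and a smoothed plot of absolute residuals to check the variance function), here is a minimal sketch. It is not taken from the thread: the data are simulated, the variable names (age, pnn50_prop, mu, rp, abs_rp) are hypothetical, and it assumes the outcome has already been rescaled to lie between 0 and 1, which the minimum and maximum quoted above would not satisfy. It uses plain Pearson residuals rather than standardized ones.

```stata
* Sketch only: simulated data and hypothetical variable names.
clear
set seed 2014
set obs 1800
generate age = 60 + 30*runiform()
* Outcome generated as a proportion strictly between 0 and 1
generate double pnn50_prop = invlogit(-1 - 0.02*age + rnormal())

* Logit link with binomial variance function; robust standard errors
* because the outcome is a proportion rather than a binomial count
glm pnn50_prop age, family(binomial) link(logit) vce(robust)

* Fitted means and Pearson residuals (already scaled by the
* square root of the assumed variance function)
predict double mu, mu
predict double rp, pearson
generate double abs_rp = abs(rp)

* Smoothed absolute residuals against the fitted mean: a roughly
* flat curve suggests the variance function is adequate
lowess abs_rp mu
```

A clear upward or downward trend in the smoothed curve would point towards a different family, or towards transforming the dependent variable instead.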

Whether you ultimately transform the dependent variable or use an appropriate combination of link/family to obtain a model depends in part on your analytic goals, but either should, if properly performed, give similar conclusions.

Personally, I wouldn't transform pNN50 into quartiles (at least not without a compelling reason for doing so). This throws away information, and ties your conclusions to the observed quartiles in your sample, which may not be relevant in other samples/populations. On a related note, I'm not surprised that the proportional odds model doesn't fit, since that assumes an underlying logistic distribution (albeit conditional on the covariates), and from what you've said, it doesn't sound like your dependent variable is symmetric (as is the logistic distribution). As an alternative, if you want to explore how the quantiles of your dependent variable are related to your covariates, you could use quantile regression (-qreg- in Stata). Koenker (2005) illustrates the value of plotting the coefficient from quantile regression against the quantile, which can be very informative.

In sum, your dataset is large enough (n = 1,800) to provide a lot of information about which models fit well (and which do not), and, once you have a good model, to yield relatively precise estimates. My advice (consistent with David's) would be to spend some more time thinking about the precise nature of your dependent variable, and examining how the (conditional) distribution of this changes with the covariates. Based on this, it is likely that you can come up with a reasonable regression model (either by transforming the dependent variable or by an appropriate choice of link/family), which will serve as a good baseline model even if you decide to pursue other approaches.

-- Phil

References
----------

Hardin, J. W. & Hilbe, J. M. (2007). Generalized Linear Models and Extensions, 2nd edition. College Station, TX: Stata Press.

Koenker, R. (2005). Quantile Regression. Cambridge: Cambridge University Press.

Ramaekers, D., Ector, H., Aubert, A. E., Rubens, A., & Van de Werf, F. (1998). Heart rate variability and heart rate in healthy volunteers. European Heart Journal, 19, 1334-1341.

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/faqs/resources/statalist-faq/
*   http://www.ats.ucla.edu/stat/stata/
