



RE: st: q-q plots, theoretical distribution with values higher than the sample's cutoff point


From   Nick Cox <n.j.cox@durham.ac.uk>
To   "'statalist@hsphsun2.harvard.edu'" <statalist@hsphsun2.harvard.edu>
Subject   RE: st: q-q plots, theoretical distribution with values higher than the sample's cutoff point
Date   Fri, 20 Jul 2012 17:53:50 +0100

I don't disagree with anything important here. 

I should have said "closed form or canned as accessible functions". What is considered as closed form is historically contingent. The status of e.g. log x has shifted considerably from the 16th century to now. The status of what Stata calls -invnormal()- has varied too. If I can write Stata code that calls -invnormal()-, that is in practice on all fours with e.g. writing down a polynomial. 

Nick 
n.j.cox@durham.ac.uk 


-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of David Hoaglin
Sent: 20 July 2012 17:45
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: q-q plots, theoretical distribution with values higher than the sample's cutoff point

Nick,

You're correct that, in general, the g-and-h distributions do not have closed-form densities or cumulative distribution functions.  The quantile function doesn't exist in closed form either, but only because the quantile function of the normal distribution is not closed-form.
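To make the point about the quantile function concrete, here is a minimal sketch in Python rather than Stata. It assumes the usual Tukey parameterization, Q(p) = A + B * (exp(g*z) - 1)/g * exp(h*z^2/2) with z the standard normal quantile of p, and it uses the standard library's NormalDist().inv_cdf() where Stata code would call -invnormal()-. The function name and argument names are illustrative only.

```python
import math
from statistics import NormalDist


def g_and_h_quantile(p, a=0.0, b=1.0, g=0.0, h=0.0):
    """Quantile of a g-and-h distribution (assumed Tukey parameterization).

    a: location, b: scale, g: skewness, h: tail heaviness (h >= 0).
    The only ingredient that is not elementary is the normal quantile z.
    """
    z = NormalDist().inv_cdf(p)  # plays the role of Stata's invnormal(p)
    if g == 0.0:
        skew_part = z  # limit of (exp(g*z) - 1)/g as g -> 0
    else:
        skew_part = math.expm1(g * z) / g
    return a + b * skew_part * math.exp(h * z * z / 2.0)


if __name__ == "__main__":
    # With g = h = 0 this reduces to the normal quantile function.
    for p in (0.1, 0.5, 0.9):
        print(p, g_and_h_quantile(p), g_and_h_quantile(p, g=0.5, h=0.1))
```

Once the normal quantile is available as a function, the rest is a one-line formula, which is the sense in which the quantile function is "canned" even though it is not closed-form.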

For reasons of resistance and robustness, I usually prefer to work with quantiles.  Fitting by maximum likelihood opens you up to problems when the distribution has heavy tails and the data may contain outliers.  Nowadays, fitting a g-and-h distribution by maximum likelihood is not a major problem, but it is not just a few lines of code!  I don't know how much has been done on fitting models that involve predictors.  In any event, the g-and-h distributions are a valuable part of my toolkit, but not a panacea.
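As an illustration of working with quantiles rather than the likelihood, the following Python sketch estimates g from a single pair of symmetric quantiles. It is not necessarily the procedure meant above. It assumes the Tukey parameterization Q(p) = A + B * (exp(g*z) - 1)/g * exp(h*z^2/2) with B > 0, under which the ratio of the upper spread to the lower spread about the median at depth p equals exp(-g*z_p), whatever the value of h. The function name is hypothetical.

```python
import math
from statistics import NormalDist


def estimate_g(lower_q, median, upper_q, p):
    """Estimate g from the quantiles at p and 1 - p, with 0 < p < 0.5.

    Under the assumed g-and-h form,
        (upper_q - median) / (median - lower_q) = exp(-g * z_p),
    where z_p is the standard normal quantile of p (negative here).
    """
    if not 0.0 < p < 0.5:
        raise ValueError("p must lie strictly between 0 and 0.5")
    upper_spread = upper_q - median
    lower_spread = median - lower_q
    if upper_spread <= 0.0 or lower_spread <= 0.0:
        raise ValueError("quantiles must satisfy lower_q < median < upper_q")
    z_p = NormalDist().inv_cdf(p)
    return -math.log(upper_spread / lower_spread) / z_p


if __name__ == "__main__":
    # Symmetric spreads about the median give an estimate of zero.
    print(estimate_g(-1.0, 0.0, 1.0, 0.25))
```

Because only a few sample quantiles enter, a handful of wild values in the extreme tails has little effect on an estimate taken at moderate p. In practice one would compare the estimates across several values of p rather than rely on a single one.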

I have no basic problem with maximum likelihood.  I've made heavy use of it, in Stata and elsewhere.  But good data analysis is iterative: one should look at data and residuals at various stages.

David Hoaglin

On Fri, Jul 20, 2012 at 10:29 AM, Nick Cox <n.j.cox@durham.ac.uk> wrote:
> Fair question for me at the end. I mean that g- and h- distributions are, despite their flexibility, rather awkward or elusive customers. It may be just psychology or convenience, but I like distributions with relatively simple closed-form definitions of density, distribution and quantile functions, so that I can write a few lines of code to fit them by maximum likelihood, etc. Correct me if I am wrong, but g- and h- don't score well under that heading. As David implies, the practical problem is usually fitting a distribution given predictors, and fitting easily into the ML framework is to me highly desirable.

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/


