Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


From: David Hoaglin <[email protected]>
To: [email protected]
Subject: Re: st: CDF plot with normal probability axis
Date: Thu, 14 Nov 2013 07:17:16 -0500

Nick,

For plotting positions, I prefer (i - (1/3))/(n + (1/3)). John Tukey introduced these after analyzing the sampling distributions of the order statistics in a sample of n from the uniform distribution on (0,1). The expression above is a good approximation for the median of the sampling distribution of the i-th order statistic in such a sample (a slight modification improves the approximation when i = 1 and i = n).

In a Q-Q plot against a distribution with c.d.f. F, the plotting positions (from any definition) are transformed by F-inverse. Since monotonic transformations preserve medians, the transformed plotting positions are good approximations for the medians of the sampling distributions of the order statistics of a sample from the chosen distribution.

David Hoaglin

On Thu, Nov 14, 2013 at 4:21 AM, Nick Cox <[email protected]> wrote:
> -distplot- (SJ), -cdfplot- (STB originally, SSC now): as always,
> please explain the origin of the user-written commands you refer to.
>
> -qplot- (SJ) can do this, roughly.
>
> . sysuse auto
> (1978 Automobile Data)
>
> . qplot turn trunk, trscale(invnormal(@))
>
> . qplot turn trunk, trscale(invnormal(@)) xtitle(standard normal deviate) xla(-2/2)
>
> The axes are the other way round from what you ask; I'd argue that is
> better practice, or at least consistent with -qnorm-. (-ysc(log)- is
> also possible.)
>
> Note that you should not expect cumulative distribution plots to do
> this by default as they usually plot cumulative probabilities as 1/n,
> ..., n/n and -invnormal(n/n)- is -invnormal(1)- and as such
> indeterminate.
>
> But it is as easy to do this pretty much from first principles. See e.g.
>
> http://www.stata.com/support/faqs/statistics/percentile-ranks-and-plotting-positions/index.html
>
> http://www.stata-journal.com/sjpdf.html?articlenum=gr0027
>
> http://www.stata-journal.com/sjpdf.html?articlenum=gr0032
>
> I will cheat slightly and use -mylabels- (SSC).
>
> Here is some code. Any number of possible small variations should be evident.
>
> sysuse auto, clear
>
> replace price = price/1000
>
> foreach v in price mpg {
>     egen y`v' = rank(`v')
>     su `v', meanonly
>     replace y`v' = invnormal((y`v' - 0.5) / r(N))
>     label var y`v' "`: var label `v''"
> }
>
> mylabels 1 5 10(10)90 95 99, myscale(invnormal(@/100)) local(labels)
>
> twoway connect yprice price, ms(Dh) sort || ///
> connect ympg mpg, sort ms(Th) xsc(log) yla(`labels', ang(h)) xla(5 10 20 40) ///
> ytitle(Cumulative percent)
>
> Nick
> [email protected]
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/faqs/resources/statalist-faq/
*   http://www.ats.ucla.edu/stat/stata/
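
The plotting positions (i - (1/3))/(n + (1/3)) discussed in the reply can be tried out with a few lines of Stata. What follows is only a minimal sketch, not code from either poster: it assumes the auto dataset, a variable with no missing values, and the illustrative names rank_mpg, pp and z. It leaves out the slight modification for i = 1 and i = n, which the reply mentions but does not spell out.

```stata
* Sketch: plotting positions (i - 1/3)/(n + 1/3) on a normal probability scale.
* rank_mpg, pp and z are illustrative names, not part of any command.
sysuse auto, clear

* rank of each observation (egen's rank() gives tied values their average rank)
egen rank_mpg = rank(mpg)

* number of nonmissing observations
quietly count if !missing(mpg)
local n = r(N)

* plotting positions: approximate medians of the uniform order statistics
generate pp = (rank_mpg - 1/3) / (`n' + 1/3)

* transform by the inverse of the normal c.d.f. (F-inverse in the reply)
generate z = invnormal(pp)

* observed values against the transformed plotting positions
scatter mpg z, sort xtitle("standard normal deviate") ytitle("Mileage (mpg)")
```

Because these positions lie strictly between 0 and 1, invnormal() returns a finite value for every observation, which sidesteps the invnormal(1) problem noted in the quoted message.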

**Follow-Ups**:
- **Re: st: CDF plot with normal probability axis** *From:* Nick Cox <[email protected]>

**References**:
- **st: CDF plot with normal probability axis** *From:* "Livingston, Michael (TP)" <[email protected]>
- **Re: st: CDF plot with normal probability axis** *From:* Nick Cox <[email protected]>
