Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
From | Nick Cox <n.j.cox@durham.ac.uk> |
To | Nick Cox <n.j.cox@durham.ac.uk>, "'statalist@hsphsun2.harvard.edu'" <statalist@hsphsun2.harvard.edu> |
Subject | RE: st: qnorm |
Date | Mon, 5 Mar 2012 17:00:41 +0000 |
Sorry; please ignore that version. It won't work. Nick n.j.cox@durham.ac.uk -----Original Message----- From: Nick Cox Sent: 05 March 2012 17:00 To: 'statalist@hsphsun2.harvard.edu' Subject: RE: st: qnorm I pushed this a bit further. This is just a helper program that calculates variables defining an envelope; it deliberately stops short of the graphics, which -qplot- (SJ) can do for you. A help file will follow in due course. *! 1.0.0 NJC 5 March 2012 program qnormenv version 9 syntax varname(numeric) [if] [in], GENerate(str) [ reps(int 100) level(int 95) ] marksample touse qui count if `touse' if r(N) == 0 error 2000 tokenize "`generate'" if "`2'" == "" | "`3'" != "" { di as err "two names required in generate()" exit 198 } confirm new var `generate' mata : _qnormenv("`varlist'", "`touse'", "`generate'", `reps', `level') end mata: void _qnormenv( string scalar varname, string scalar tousename, string rowvector newnames, real scalar reps, real scalar level) { real matrix compare real colvector y real scalar mean, sd, n, l1, l2, u1, u2 y = st_view(., varname, tousename) mean = mean(y) sd = sqrt(variance(y)) n = rows(y) compare = J(n, 0, .) for (j = 1; j <= reps; j++) { compare = compare, sort(rnormal(n, 1, mean, sd), 1) } level1 = (100 - level) / 200 level2 = (100 + level) / 200 l1 = floor(reps * level1) l2 = ceil(reps * level1) u1 = floor(reps * level2) u2 = ceil(reps * level2) envelope = J(n, 2, .) for (i = 1; i <= n; i++) { x = sort(compare[i,]', 1) envelope[i,] = ((x[l1] + x[l2])/2, (x[u1] + x[u2])/2) } newnames = tokens(newnames) (void) st_addvar("float", newnames) st_store(., newnames, tousename, envelope) } end Nick n.j.cox@durham.ac.uk -----Original Message----- From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Nick Cox Sent: 05 March 2012 15:24 To: 'statalist@hsphsun2.harvard.edu' Subject: RE: st: qnorm The approach in Maarten's program is to generate a number of random samples and show the lot as replicates. Here as an alternative is some example code for individual 95% confidence intervals for each plotted point. -qplot- used at the end is from SJ. The code isn't smart about missing values, but it could easily be made smarter. I also guess the code could be shortened in the middle. sysuse auto, clear mata y = sort(st_data(., "mpg"), 1) mean = mean(y) sd = sqrt(variance(y)) n = rows(y) compare = J(n, 0, .) for (j = 1; j <= 100; j++) { compare = compare, sort(rnormal(n, 1, mean, sd), 1) } envelope = J(n, 2, .) for (i = 1; i <= n; i++) { x = sort(compare[i,]', 1) envelope[i,] = ((x[2] + x[3])/2, (x[97] + x[98])/2) } names = tokens("_lower _upper") (void) st_addvar("float", names) st_store(., names, envelope) end qplot mpg _lower _upper, ms(O i i) c(. J J) legend(off) ytitle("`: var label mpg'") Nick n.j.cox@durham.ac.uk Maarten Buis On Mon, Mar 5, 2012 at 10:03 AM, Nick Cox wrote: > 1. Simulate several samples from a distribution with the same mean and > standard deviation (or more generally an appropriate mean and standard > deviation) and use the resulting portfolio of plots in assessing what > kind of variability is to be expected. An easy way to do so is to use the -margdistfit- package, which you can install by typing in Stata -ssc install margdistfit-. The default is actually to first sample the mean and the standard deviation from its sampling distribution and than sample a new variable with those sampled means and standard deviation. I suspect that this makes sense in most cases, though I also suspect that it won't matter much. If you want to do exactly what Nick proposes you can add the -noparsamp- option. Here is an example of what such a graph would look like: *-------- begin example ----------- sysuse auto, clear reg mpg margdistfit, qq name(qq) margdistfit, pp name(pp) margdistfit, hangroot name(hangr) margdistfit, cumul name(cumul) *--------- end example ------------ (For more on examples I sent to the Statalist see: http://www.maartenbuis.nl/example_faq ) * * For searches and help try: * http://www.stata.com/help.cgi?search * http://www.stata.com/support/statalist/faq * http://www.ats.ucla.edu/stat/stata/