Just an additional small question: is there some way to extract all the
proportions and their lower and upper bounds from the simulation by using
some sort of loop? So far I have done it by repeating the code for each
group, i.e. [x:1], [x:2] and so on. To make the code a bit leaner when I
have many groups, I assume there is some way to loop this, but my attempts
with -foreach var- did not work well. Any suggestions?

Richard

2010/9/17 Richard Moverare <richard.moverare@gmail.com>
>
> Thank you Maarten, this really helped me.
>
> All the best,
> Richard
>
> 2010/9/17 Maarten Buis <maartenbuis@yahoo.co.uk>
>>
>> --- On Thu, 16/9/10, Richard Moverare wrote:
>> > I would like to illustrate the uncertainty of a SRS
>> > (without replacement) by first creating a dataset with
>> > one variable that identifies a number of different
>> > groups in the population (N), e.g. 415 units in group A,
>> > 634 units in group B, and so forth. Then I would like to
>> > draw a number of samples from that population, e.g. 20
>> > different samples and get estimates for the proportion of
>> > the population belonging to group A, B, ..., and the
>> > confidence interval (95 percent) for those estimates. And
>> > finally I would like to, in a graph, illustrate the true
>> > population proportion and the 20 different samples with
>> > their confidence intervals. This in order to illustrate
>> > the uncertainty but also that the confidence intervals
>> > sometimes do not include the true population value.
>>
>> As I understand Simple Random Sampling, it would be sampling
>> with replacement (but if the population is large compared
>> to the sample that should not matter too much).
>>
>> For such an exercise I would use the -simulate- command,
>> like in the example below. I recovered the confidence
>> intervals as discussed in (Buis 2007).
>>
>> *------------------- begin example --------------------
>> program drop _all
>> program define sim, rclass
>>
>>     // create population
>>     drop _all
>>     set obs 10000
>>     gen x = cond(_n <= 500, 1, ///
>>             cond(_n <= 5000, 2, 3))
>>
>>     // draw a 1% sample without replacement
>>     sample 1
>>
>>     // estimate the proportions and return the results
>>     proportion x
>>     return scalar p  = _b[x:1]
>>     return scalar lb = _b[x:1] - invttail(e(df_r),0.025)*_se[x:1]
>>     return scalar ub = _b[x:1] + invttail(e(df_r),0.025)*_se[x:1]
>> end
>>
>> // repeat this 20 times and store the results in a dataset
>> simulate p=r(p) lb=r(lb) ub=r(ub), reps(20) : sim
>>
>> // graph the results
>> gen sample = _n
>> twoway scatter sample p || ///
>>        rcap lb ub sample, horizontal xline(.05)
>> *-------------------- end example --------------------------------
>> (For more on examples I sent to the Statalist see:
>> http://www.maartenbuis.nl/example_faq )
>>
>> Hope this helps,
>> Maarten
>>
>> M.L. Buis (2007), "Stata tip 54: Where did my p-values go?",
>> The Stata Journal, 7(4), pp.584-586.
>>
>> --------------------------
>> Maarten L. Buis
>> Institut fuer Soziologie
>> Universitaet Tuebingen
>> Wilhelmstrasse 36
>> 72074 Tuebingen
>> Germany
>>
>> http://www.maartenbuis.nl
>> --------------------------
>>
>> *
>> * For searches and help try:
>> * http://www.stata.com/help.cgi?search
>> * http://www.stata.com/support/statalist/faq
>> * http://www.ats.ucla.edu/stat/stata/
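
One possible way to loop over the groups, as a minimal sketch rather than a
definitive answer: it assumes the three groups are coded 1, 2, 3 as in the
quoted example, and it uses -forvalues- over the group codes instead of
-foreach var-, since the groups are levels of one variable rather than
separate variables. The names p1, lb1, ub1, ... and the local macro `exp'
are made up for illustration. The coefficient names ([x:1] etc.) are copied
from the quoted example and may be labelled differently in other Stata
versions.

```stata
program drop _all
program define sim, rclass
    // create population and draw a 1% sample (as in the quoted example)
    drop _all
    set obs 10000
    gen x = cond(_n <= 500, 1, cond(_n <= 5000, 2, 3))
    sample 1

    // estimate the proportions
    proportion x

    // loop over the group codes (assumed here to be 1, 2, 3)
    // and return the estimate and both bounds for each group
    forvalues i = 1/3 {
        return scalar p`i'  = _b[x:`i']
        return scalar lb`i' = _b[x:`i'] - invttail(e(df_r),0.025)*_se[x:`i']
        return scalar ub`i' = _b[x:`i'] + invttail(e(df_r),0.025)*_se[x:`i']
    }
end

// build the expression list for -simulate- with the same loop
local exp
forvalues i = 1/3 {
    local exp `exp' p`i'=r(p`i') lb`i'=r(lb`i') ub`i'=r(ub`i')
}

// repeat 20 times; the resulting dataset holds p1 lb1 ub1 ... p3 lb3 ub3
simulate `exp', reps(20) : sim
```

With more groups, only the upper limit of the two -forvalues- loops would
need to change. If the group codes are not consecutive integers, a
-foreach i of numlist ...- loop over the actual codes should work the same
way.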

