[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

From |
"Martin Weiss" <martin.weiss1@gmx.de> |

To |
<statalist@hsphsun2.harvard.edu> |

Subject |
st: AW: R: AW: R: RE: odd results after insample |

Date |
Sat, 26 Sep 2009 21:47:23 +0200 |

<> I am not sure that you can do anything about tables 3 and 4, as they are out of reach as long as you do not hold the specific data used in these tables in hand. What you can do, though, is replicate the (left panel of) table 1, which is an interesting exercise in Stata programming and the use of -postfile- and -tabdisp-. I am not sure whether I have done this efficiently, but it sure works, and under a -version- other than mine at that. Note that I am using 100 replications for the -simulate- command, instead of 10,000 as in the original article, so that this thing finishes in finite time. You want to check this _very_ carefully: ************* clear* vers 9.2 //lognormal part capt prog drop lognorm prog def lognorm, rclass vers 9.2 syntax [,obs(integer 100) cov(real 2)] drop _all set obs `obs' loc sd = sqrt(log(`cov'^2+1)) loc mean = log(1000)-.5*`sd'^2 tempvar z gen `z'=exp(invnormal(uniform())*`sd'+`mean') su `z' ret sca rmse=sqrt((`r(mean)'-1000)^2) end //gamma part capt prog drop gam prog def gam, rclass vers 9.2 syntax [,obs(integer 100) cov(real 2)] drop _all set obs `obs' loc shape = (`cov')^(-2) loc scale = 1000/`shape' tempvar z gengamma `z', alpha(`shape') beta(`scale') su `z', mean ret sca rmse=sqrt((`r(mean)'-1000)^2) end //postfile tempname hdle tempfile info postfile `hdle' str15 distr cov /* */ size meanrmse using `info' //gamma foreach cv in .25 .5 1 1.5 2{ foreach size in 20 50 200 500 2000{ simulate rmse=r(rmse), reps(100): /* */ gam, obs(`size') cov(`cv') sum rmse, mean post `hdle' ("Gamma") (`cv') (`size') (r(mean)) } } //lognormal foreach cv in .25 .5 1 1.5 2{ foreach size in 20 50 200 500 2000{ simulate rmse=r(rmse), reps(100): /* */ lognorm, obs(`size') cov(`cv') sum rmse, mean post `hdle' ("Lognormal") (`cv') (`size') (r(mean)) } } postclose `hdle' preserve use `info', clear replace meanrmse=round(meanrmse) tabdisp cov size, cellvar(meanrmse) /* */ by(distr) restore ************* Do you have any idea how to replicate the right panel? Somehow these results are out of reach for me :-( HTH Martin -----Ursprüngliche Nachricht----- Von: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] Im Auftrag von Carlo Lazzaro Gesendet: Samstag, 26. September 2009 20:56 An: statalist@hsphsun2.harvard.edu Cc: 'Martin Weiss' Betreff: st: R: AW: R: RE: odd results after insample Dear Martin, many thanks for your unvaluable efforts. However, I am current interested in data reported in Table 3 and 4 of the article. I have not access to the related data sets and I am figuring out a way to mimick them and random sampling from the obtained (far fetched) distributions. Thanks a lot again, especially for downloading the paper and devoting your time to think it over? Out of curiosity: is the economic evaluation of health care programmes one of your research fields? Kind Regards, Carlo -----Messaggio originale----- Da: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] Per conto di Martin Weiss Inviato: sabato 26 settembre 2009 19.51 A: statalist@hsphsun2.harvard.edu Oggetto: st: AW: R: RE: odd results after insample <> So if you wanted the entire figure 1 under Stata -version- 9.2, you would probably want to install Bobby`s -findit gendist- and then: ************* clear* vers 9.2 //lognormal part capt prog drop myprog prog def myprog, rclass vers 9.2 syntax newvarname(numeric max=1), [obs(integer 100) cov(real 2)] set obs `obs' loc sd = sqrt(log(`cov'^2+1)) loc mean = log(1000)-.5*`sd'^2 gen `varlist'=exp(invnormal(uniform())*`sd'+`mean') qui su `varlist' ret sca mean=r(mean) ret sca cv=r(sd)/r(mean) end loc gra loc j 1 //for sample size 2000 foreach cv in 0.25 0.5 1 1.5 2{ myprog lognorm`j', obs(2000) cov(`cv') loc gra `gra' (kdensity lognorm`j' if lognorm`j'<3000) loc ++j } //see the mean and coeff of variation tabstat _all, stat(mean cv sd) tw `gra', legend(off) nodraw /* */ name(lognormal, replace) //gamma part capt prog drop mynewprog prog def mynewprog, rclass vers 9.2 syntax newvarname(numeric max=1) [,obs(integer 100) cov(real 2)] set obs `obs' loc shape = (`cov')^(-2) loc scale = 1000/`shape' gengamma `varlist', alpha(`shape') beta(`scale') qui su `varlist' ret sca mean=r(mean) ret sca cv=r(sd)/r(mean) end loc gra loc j 1 //for sample size 2000 foreach cv in 0.25 0.5 1 1.5 2{ mynewprog gamma`j', obs(2000) cov(`cv') loc gra `gra' (kdensity gamma`j' if gamma`j'<3000) loc ++j } //see the mean and coeff of variation tabstat _all, stat(mean cv sd) tw `gra', legend(off) /* nodraw */ name(gamma, replace) //combine 'em gr combine lognormal gamma, /* */ cols(1) ************* HTH Martin -----Ursprüngliche Nachricht----- Von: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] Im Auftrag von Carlo Lazzaro Gesendet: Samstag, 26. September 2009 16:20 An: statalist@hsphsun2.harvard.edu Cc: 'Martin Weiss' Betreff: st: R: RE: odd results after insample Dear Martin, thanks a lot for your kind reply. The approach sketched in my previous message follows the one suggested by: Briggs, A. and Nixon, R. and Dixon, S. and Thompson, S. (2005) Parametric modelling of cost data: some simulation evidence. Health Economics 14(4):pp. 421-428. So far, I have been quite successful with other Stata procedures for drawing random samples from a given distribution (for instance, -simulate-), including the approach you kindly advice me about. Unfortunately, I cannot figure out what went wrong with this last do_file. Thanks a lot again and enjoy your W_E. Kind Regards, Carlo -----Messaggio originale----- Da: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] Per conto di Martin Weiss Inviato: sabato 26 settembre 2009 15.21 A: statalist@hsphsun2.harvard.edu Oggetto: st: RE: odd results after insample <> Just out of curiosity: If you want 20 obs per sample, and 2,000 samples, should that not lead to 40,000 observations overall? HTH Martin -----Original Message----- From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Carlo Lazzaro Sent: Samstag, 26. September 2009 15:00 To: statalist@hsphsun2.harvard.edu Subject: st: odd results after insample Dear Statalisters, as an alternative to - simulate - , I have written the following do file (for Stata 9.2/SE) to draw 2000 random samples, 20 observations each, from a normal distribution: drop _all set more off set obs 2000 obs was 0, now 2000 g double ln_g_20=. g double ln_sd_g_20=. set seed 999 qui gen A=5.37 + 1.19*invnorm(uniform()) in 1/972 qui forvalues i = 1(1)2000 { qui gen ln_20`i'=A qui generate random`i' = uniform() qui sort random`i' qui generate insample`i' = _n <= 20 qui sum ln_20`i' if insample`i' == 1 replace ln_g_20=r(mean) in `i' replace ln_sd_g_20=r(sd) in `i' drop ln_20`i' drop random`i' drop insample`i' } drop A However, as a result I have obtained 1721 observations instead of the expected 2000. sum ln_g_20 ln_sd_g_20 Variable | Obs Mean Std. Dev. Min Max -------------+-------------------------------------------------------- ln_g_20 | 1271 5.314033 .3800687 3.79247 6.587941 ln_sd_g_20 | 1271 1.101084 .2835007 .0260279 2.161299 Besides, results are even more puzzling when I increase the number of samples (again 20 observations each), in that I get a different number of observation for ln_g and ln_sd_g. Comments are gratefully acknowledged. Thanks a lot for your kindness and for your time. Kind Regards, Carlo * * For searches and help try: * http://www.stata.com/help.cgi?search * http://www.stata.com/support/statalist/faq * http://www.ats.ucla.edu/stat/stata/ * * For searches and help try: * http://www.stata.com/help.cgi?search * http://www.stata.com/support/statalist/faq * http://www.ats.ucla.edu/stat/stata/ * * For searches and help try: * http://www.stata.com/help.cgi?search * http://www.stata.com/support/statalist/faq * http://www.ats.ucla.edu/stat/stata/ * * For searches and help try: * http://www.stata.com/help.cgi?search * http://www.stata.com/support/statalist/faq * http://www.ats.ucla.edu/stat/stata/ * * For searches and help try: * http://www.stata.com/help.cgi?search * http://www.stata.com/support/statalist/faq * http://www.ats.ucla.edu/stat/stata/ * * For searches and help try: * http://www.stata.com/help.cgi?search * http://www.stata.com/support/statalist/faq * http://www.ats.ucla.edu/stat/stata/

**Follow-Ups**:**st: R: AW: R: AW: R: RE: odd results after insample***From:*"Carlo Lazzaro" <carlo.lazzaro@tin.it>

**References**:**st: AW: R: RE: odd results after insample***From:*"Martin Weiss" <martin.weiss1@gmx.de>

**st: R: AW: R: RE: odd results after insample***From:*"Carlo Lazzaro" <carlo.lazzaro@tin.it>

- Prev by Date:
**Re: st: multiple imputation in Stata 11** - Next by Date:
**Re: st: svyset problem 2: using svy with partially complete surveys** - Previous by thread:
**st: AW: R: AW: R: RE: odd results after insample** - Next by thread:
**st: R: AW: R: AW: R: RE: odd results after insample** - Index(es):

© Copyright 1996–2014 StataCorp LP | Terms of use | Privacy | Contact us | What's new | Site index |