

From: Cameron McIntosh <cnm100@hotmail.com>
To: STATA LIST <statalist@hsphsun2.harvard.edu>
Subject: RE: st: Bootstrap to compare ROC area on imputed dataset
Date: Thu, 17 Nov 2011 15:36:22 -0500

Roland,

You're asking for both specific Stata code and more general methodological guidance. I can try to take a bit of a crack at the latter. Bootstrapping in conjunction with imputation is quite intensive, although it can of course be done (after all, the two are similar in a number of ways):

Efron, B. (1994). Missing Data, imputation, and the bootstrap. Journal of the American Statistical Association, 89(426), 463-475.

Heymans, M.W., van Buuren, S., Knol, D.K., van Mechelen, M., & de Vet, H.C.W. (2007). Variable selection under multiple imputation using the bootstrap in a prognostic study. BMC Medical Research Methodology, 7:33.
http://www.biomedcentral.com/content/pdf/1471-2288-7-33.pdf

Kim, J.K., Brick, J.M., Fuller, W.A., & Kalton, G. (2006). On the bias of the multiple-imputation variance estimator in survey sampling. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 68(3), 509–521.

Kim, J.K., & Rao, J.N.K. (2009). A unified approach to linearization variance estimation from survey data after imputation for item nonresponse. Biometrika, 96(4), 917-932.

Davison, A.C., & Sardy, S. (2007). Resampling Variance Estimation in Surveys with Missing Data. Journal of Official Statistics, 23(3), 371–386.

Eltinge, J.L. (1996). On Variance Estimation With Imputed Survey Data: Comment. Journal of the American Statistical Association, 91(434), 513-515.

Saigo, H., Shao, J., & Sitter, R.R. (2001). A Repeated Half-Sample Bootstrap and Balanced Repeated Replications for Randomly Imputed Data. Survey Methodology, 27(2), 189-196.
http://www.statcan.gc.ca/ads-annonces/12-001-x/6095-eng.pdf

Shao, J., & Sitter, R.R. (1996). Bootstrap for imputed survey data. Journal of the American Statistical Association, 91, 1278-1288.

Chen, J., Rao, J.N.K., & Sitter, R.R. (2000). Adjusted imputation for missing data in complex surveys. Statistica Sinica, 10, 1153-1169.

Essentially, what you want is D = (AUC1 - AUC2)/SE_b, where AUC1 and AUC2 are the original AUCs from the two models being compared, and SE_b is the standard error of the bootstrapped AUC differences. I don't imagine this would be very hard to program, perhaps in R if not in Stata. I think you would just bootstrap from each imputed data set, so this would expand the number of replications as follows: k imputations * b bootstrap samples.

You also definitely need to see (and you might want to try the empirical likelihood approach):

Long, Q., Zhang, X., & Hsu, C.-H. (2011). Nonparametric multiple imputation for receiver operating characteristics analysis when some biomarker values are missing at random. Statistics in Medicine, Early View.
http://onlinelibrary.wiley.com/doi/10.1002/sim.4338/abstract;jsessionid=63E100FD9A64CCB7B6C8E6D57CA08581.d01t02

Liu, D., & Zhou, X.-H. (January 21, 2011). Semiparametric Estimation of the Covariate-Specific ROC Curve in Presence of Ignorable Verification Bias. UW Biostatistics Working Paper Series. Working Paper 374. Seattle, WA: University of Washington - Seattle Campus.
http://www.bepress.com/cgi/viewcontent.cgi?article=1213&context=uwbiostat

An, Y. (2011). Empirical Likelihood Confidence Intervals for ROC Curves with Missing Data. Mathematics Theses. Paper 95.
http://digitalarchive.gsu.edu/math_theses/95

Liu, X. (2010). Semi-Empirical Likelihood Confidence Intervals for the ROC Curve with Missing Data. Mathematics Theses. Paper 89.
http://digitalarchive.gsu.edu/math_theses/89

Janssen, K.J.M., Vergouwe, Y., Donders, A.R.T., Harrell, F.E., Jr., Chen, Q., Grobbee, D.E., & Moons, K.G.M. (2009). Dealing with Missing Predictor Values When Applying Clinical Prediction Models. Clinical Chemistry, 55, 994-1001.
http://www.clinchem.org/cgi/reprint/55/5/994
http://www.clinchem.org/cgi/content/full/clinchem.2008.115345/DC1

Liu, D., & Zhou, X.-H. (2010). A model for adjusting for nonignorable verification bias in estimation of the ROC curve and its area with likelihood-based approach. Biometrics, 66(4), 1119-1128.

Hope this helps,

Cam

> Date: Thu, 17 Nov 2011 18:04:56 +0100
> Subject: st: Bootstrap to compare ROC area on imputed dataset
> From: rolandersson@gmail.com
> To: statalist@hsphsun2.harvard.edu
>
> We are analysing discriminating capacity of a clinical score. Because of some missing values we had to use imputed dataset. We have now constructed a new clinical score and want to compare the new with an old, using bootstrap.
>
> We have used mim, category(combine) est(r(area)) se(r(se)) : roctab diagnosis score1, summary to analyse the combined ROC area of the imputed datasets. However we want to compare two different models and would normally use roctab for this, but this does not work with mim, category(combine).
>
> We also want to make a bootstrapped analysis of the diagnostic properties of a new clinical score on the imputed dataset.
>
> We would appreciate any help on how to do the bootstrapping of the ROC areas and comparing two areas on the imputed dataset.
>
> Regards
>
> Roland Andersson
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/
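
A minimal sketch of the statistic D = (AUC1 - AUC2)/SE_b described in the reply above, written in Python rather than Stata, since the reply leaves the language open. It rests on several assumptions that are not in the thread: each imputed data set is a list of (diagnosis, score1, score2) rows, the numerator is the average AUC difference across the k imputed data sets, and SE_b is the standard deviation of all k * b bootstrapped differences pooled together. The function names are hypothetical, and this is one possible reading of the suggestion, not a validated procedure.

```python
import random
import statistics


def auc(labels, scores):
    """Mann-Whitney estimate of the ROC area; tied scores count one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one non-case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def auc_difference(rows):
    """AUC1 - AUC2 for rows of (diagnosis, score1, score2)."""
    y = [r[0] for r in rows]
    return auc(y, [r[1] for r in rows]) - auc(y, [r[2] for r in rows])


def bootstrap_differences(rows, b, rng):
    """AUC1 - AUC2 in b resamples of whole rows, which keeps the two
    scores paired within each patient."""
    diffs = []
    while len(diffs) < b:
        sample = rng.choices(rows, k=len(rows))
        if len({r[0] for r in sample}) == 2:  # redraw one-class resamples
            diffs.append(auc_difference(sample))
    return diffs


def compare_auc_imputed(imputed_sets, b=200, seed=1):
    """Return (difference, SE_b, D) over k imputed data sets.

    Assumption: the k * b bootstrapped differences are pooled and their
    standard deviation is used as SE_b.
    """
    rng = random.Random(seed)
    observed = [auc_difference(rows) for rows in imputed_sets]
    pooled = []
    for rows in imputed_sets:
        pooled.extend(bootstrap_differences(rows, b, rng))
    diff = statistics.mean(observed)
    se_b = statistics.stdev(pooled)
    return diff, se_b, diff / se_b
```

Calling compare_auc_imputed(imputed_sets, b=1000) on a list of k imputed data sets returns the averaged difference, SE_b, and D. How D should be referred to a reference distribution, and whether pooling the replicates adequately reflects between-imputation variability, are questions the papers cited in the reply address.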
