

From: Steve Samuels <sjsamuels@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Postestimation Analysis in Survey Data
Date: Sat, 17 Nov 2012 17:34:55 -0600

For descriptive statistics that are means (e.g., proportions), you can create frequency weights equivalent to probability weights to any degree of accuracy. The idea is Austin Nichols's.

*********************************
gen fwt = 10^k*round(your_wt,10^(-k))
// e.g. Match to two decimal places
di 100 * round(123.456789,1/100)
*********************************

-svyset- your data and use -svy: logistic- for your analysis. Then:

You can also do ROC curves directly from the results: http://www.stata.com/statalist/archive/2011-02/msg00082.html, with reference to Roger Newson's -senspec- (from SSC).

• predictions: see the help for -logistic postestimation-
• influential points: you can compute a weighted version of DFBETA using the code below, explained at http://www.stata.com/statalist/archive/2012-04/msg01230.html

Note that if your goal is prediction, then the ROC curve is optimistic. You'll need a leave-some-out validation procedure to have unbiased ROC curves.

Steve

Code follows:

*************CODE BEGINS*************
/*
Explanation: Based on jackknife pseudo values.
X_i is a statistic based on all obs except the i-th
dfbeta_i = (b - b_i)/se_i(b) officially
This version uses se(b) from svy: logistic regression
in place of se_i(b).
*/
sysuse auto, clear
gen makr = substr(make,1,2)
svyset makr [pw = trunk]

local lhs foreign       // response
local xvars turn price  // predictors

// get pseudo values:
// Not svy: jackknife, because units are clusters
jackknife , keep: logistic `lhs' `xvars' [pw = trunk]

// get betas and se for all data
svy: logistic `lhs' `xvars'
foreach z of varlist `xvars' {
    gen dfb_`z' = ///
        (1/_se[`z'])*(`lhs'_b_`z'-_b[`z'])/(e(N)-1)
}
sum dfb*  // dfbetas
***********CODE ENDS*******************

On Nov 16, 2012, at 5:22 PM, jcalder wrote:

Hello,

I am doing a logistic regression analysis on a complex survey dataset and would like to do the following post estimation analyses:

1. ROC
2. classification table
3. Sensitivity curve
4. Get the predicted probabilities of a positive outcome
5. Evaluate influential data

Numbers 1-3 require frequency weights and I have pweights. Is there another way to get these in STATA? What command do you suggest using for numbers 4 & 5 with the SVY command?

Dr. Jennifer Calder
jcvet08@gmail.com

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/faqs/resources/statalist-faq/
*   http://www.ats.ucla.edu/stat/stata/
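
Editorial sketch of the frequency-weight approximation described in the reply above, under stated assumptions: the variable pwt is a made-up probability weight built from -trunk- in the auto data, and k = 2 is an arbitrary choice of decimal places. It multiplies before rounding (rather than rounding first, as in the one-liner above) as a guard against floating-point residue that could leave a non-integer weight; that change is an assumption of this sketch, not part of the original suggestion.

```stata
* A sketch only: pwt is a hypothetical probability weight, not real survey data.
sysuse auto, clear
generate double pwt = trunk + 0.37       // hypothetical non-integer pweight
local k = 2                              // decimal places of pwt to preserve
generate long fwt = round(pwt * 10^`k')  // integer frequency weight

* The two weighted means should agree; only the standard errors differ.
mean foreign [pweight = pwt]
mean foreign [fweight = fwt]

* Frequency-weighted fit, then the postestimation tools that need fweights.
logistic foreign turn price [fweight = fwt]
estat classification
lroc, nograph
lsens
```

Only the point estimates and descriptive curves from the frequency-weighted fit are meaningful here. Its standard errors treat each observation as fwt independent copies, so inference should still come from -svy: logistic-.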
