Stata: Data Analysis and Statistical Software

Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


Re: st: Postestimation Analysis in Survey Data

From	Steve Samuels <[email protected]>
To	[email protected]
Subject	Re: st: Postestimation Analysis in Survey Data
Date	Sat, 17 Nov 2012 17:34:55 -0600

For descriptive statistics that are means (e.g. proportions), you can
create frequency weights that match your probability weights to any
desired degree of accuracy. The idea is Austin Nichols's.

*********************************
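local k 2   // number of decimal places of your_wt to preserve
// same idea as 10^k*round(your_wt,10^(-k)), but rounding last
// guarantees an integer, which fweights require
gen long fwt = round(10^`k'*your_wt)

// e.g. Match to two decimal places: displays 12346
di round(100*123.456789)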
*********************************
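
As a rough illustration of how the converted weights might be used for the ROC curve, classification table, and sensitivity curve, here is a minimal sketch on the auto data. The weight your_wt = trunk/3 is made up purely for the example; the point is only that -lroc-, -estat classification-, and -lsens- accept fweights. Point estimates should agree with the pweighted fit up to the rounding of the weights, but standard errors and tests from an fweighted fit treat the sum of the weights as the sample size and should not be reported.

```stata
sysuse auto, clear

// hypothetical non-integer probability weight, for illustration only
gen double your_wt = trunk/3

// integer frequency weight matching your_wt to two decimal places
local k 2
gen long fwt = round(10^`k'*your_wt)

// weighted means agree up to the rounding of the weights
mean price [pw = your_wt]
mean price [fw = fwt]

// fit with fweights only to get at the fweight-only postestimation tools;
// ignore the standard errors from this fit
logistic foreign turn price [fw = fwt]
lroc [fw = fwt], nograph          // area under the ROC curve
estat classification [fw = fwt]   // classification table
lsens [fw = fwt], nograph         // sensitivity/specificity vs. cutoff
```

Take the inference itself (standard errors, confidence intervals, tests) from -svy: logistic-, not from the fweighted fit.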

-svyset- your data and use -svy: logistic- for your analysis. Then:

• ROC curves: you can also get these directly from the results; see
http://www.stata.com/statalist/archive/2011-02/msg00082.html, with
reference to Roger Newson's -senspec- (from SSC).

• predictions: see the help for -logistic postestimation-

• influential points: you can compute a weighted version of DFBETA using
the code below, explained at http://www.stata.com/statalist/archive/2012-04/msg01230.html
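
For the predictions item, a minimal sketch of getting predicted probabilities after -svy: logistic- might look like the following. It reuses the toy design from the code at the end of this message (clusters from the first two letters of make, trunk as a stand-in pweight); the variable name phat is just a placeholder.

```stata
sysuse auto, clear
gen makr = substr(make,1,2)     // toy cluster identifier
svyset makr [pw = trunk]        // trunk stands in for a real pweight

svy: logistic foreign turn price

// predicted probability of a positive outcome (foreign == 1)
predict double phat, pr
summarize phat
```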

Note that if your goal is prediction, then the ROC curve is optimistic. You'll need
a leave-some-out validation procedure to have unbiased ROC curves.
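
One way such a leave-some-out procedure could be sketched is k-fold cross-validation with whole clusters held out together. Everything specific here is an assumption made for illustration: the five folds, the deterministic assignment of clusters to folds, and the names fold and phat_cv. In the toy data trunk happens to be integer-valued, so it can be passed to -roctab- as an fweight directly; with real pweights you would first convert them to frequency weights as described at the top of this message.

```stata
sysuse auto, clear
gen makr = substr(make,1,2)     // toy cluster identifier
egen psu = group(makr)          // numeric cluster id
gen fold = 1 + mod(psu, 5)      // hypothetical 5-fold split, by cluster

gen double phat_cv = .
forvalues f = 1/5 {
    // fit the model with one fold left out ...
    quietly logistic foreign turn price [pw = trunk] if fold != `f'
    // ... and predict only for the held-out fold
    quietly predict double p_tmp if fold == `f', pr
    quietly replace phat_cv = p_tmp if fold == `f'
    drop p_tmp
}

// ROC analysis of the out-of-sample predictions;
// trunk is integer-valued here, so it is a valid fweight
roctab foreign phat_cv [fw = trunk]
```

Holding out whole clusters keeps observations from the same PSU from appearing in both the fitting and the validation sets.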

Steve

Code follows:

*************CODE BEGINS*************
/* Explanation:
Based on jackknife pseudovalues.
b_i is the estimate b computed from all obs except the i-th.
dfbeta_i = (b - b_i)/se_i(b) officially.
This version uses se(b) from svy: logistic regression
in place of se_i(b).
The pseudovalue is p_i = N*b - (N-1)*b_i,
so b - b_i = (p_i - b)/(N-1).
*/

sysuse auto, clear
gen makr = substr(make,1,2)
svyset makr [pw = trunk]
local lhs foreign //response
local xvars  turn price  // predictors

// get pseudovalues:
// not svy jackknife, because there the units left out are clusters
jackknife, keep: logistic `lhs' `xvars' [pw = trunk]
// get betas and se for all data
svy: logistic `lhs' `xvars'
foreach z of varlist `xvars' {
    gen dfb_`z' = ///
        (1/_se[`z'])*(`lhs'_b_`z'-_b[`z'])/(e(N)-1)
}

sum dfb*   //dfbetas
***********CODE ENDS*******************




On Nov 16, 2012, at 5:22 PM, jcalder wrote:

Hello,
I am doing a logistic regression analysis on a complex survey dataset and would like to do the following postestimation analyses:
	1. ROC
	2. Classification table
	3. Sensitivity curve
	4. Get the predicted probabilities of a positive outcome
	5. Evaluate influential data

Numbers 1-3 require frequency weights and I have pweights. Is there another way to get these in Stata?

Which commands do you suggest for numbers 4 and 5 with the SVY command?

Dr. Jennifer Calder
[email protected]






*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/faqs/resources/statalist-faq/
*   http://www.ats.ucla.edu/stat/stata/



