Re: st: Postestimation Analysis in Survey Data

From   Steve Samuels
Subject   Re: st: Postestimation Analysis in Survey Data
Date   Sat, 17 Nov 2012 17:34:55 -0600

For descriptive statistics that are means (e.g. proportions), you can
create frequency weights equivalent to probability weights to any degree
of accuracy. The idea is Austin Nichols's.

gen fwt = 10^k*round(your_wt,10^(-k))

// e.g. Match to two decimal places
di 100 * round(123.456789,1/100)
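
As a hedged illustration of the weight conversion above, here is a minimal sketch on the auto data. The variable your_wt is a made-up probability weight (trunk/3) and k = 2 is an arbitrary choice. Wrapping the whole product in one round() is an assumption added here, so that the result is stored as an exact integer, since Stata rejects noninteger fweights.

```stata
sysuse auto, clear
* hypothetical probability weight with decimals (illustration only)
gen double your_wt = trunk/3
* keep k = 2 decimal places of the weight
local k = 2
gen long fwt = round(10^`k'*your_wt)
* weighted proportions: the two point estimates should agree closely
mean foreign [pw = your_wt]
mean foreign [fw = fwt]
* descriptive postestimation that needs fweights
logistic foreign turn price [fw = fwt]
estat classification
lroc, nograph
```

The fweighted fit treats the sum of fwt as the sample size, so its standard errors are far too small. Use it only for the point estimates, the classification table, and the ROC area, and take inference from -svy: logistic-.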

-svyset- your data and use -svy: logistic- for your analysis. Then:

• ROC curves: you can also do these directly from the results, with
reference to Roger Newson's -senspec- (from SSC).

• predictions: see the help for -logistic postestimation-

• influential points: you can compute a weighted version of DFBETA using
the code below, explained at

Note that if your goal is prediction, then the ROC curve is optimistic, because
it is computed on the same data used to fit the model. You'll need
a leave-some-out validation procedure to get unbiased ROC curves.
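
One possible leave-some-out procedure is k-fold cross-validation. The sketch below is only an illustration under stated assumptions, not a method given in this thread: five random folds, the auto data, and trunk used both as the pweight in the fits and, because it happens to be integer-valued, as the fweight in -roctab-. The names u, fold, p_tmp and p_cv are hypothetical.

```stata
sysuse auto, clear
set seed 12345
* assign observations to 5 random folds
gen double u = runiform()
sort u
gen byte fold = 1 + mod(_n, 5)
* out-of-fold predicted probabilities
gen double p_cv = .
forvalues f = 1/5 {
    * fit on all folds except f
    quietly logistic foreign turn price [pw = trunk] if fold != `f'
    * predict only for the held-out fold
    quietly predict double p_tmp if fold == `f', pr
    quietly replace p_cv = p_tmp if fold == `f'
    drop p_tmp
}
* ROC from held-out predictions; trunk is integer, so it can serve as an fweight
roctab foreign p_cv [fw = trunk]
```

With real survey data it would probably be better to assign whole PSUs to folds, and to convert noninteger pweights to fweights by rounding before calling -roctab-.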


Code follows:

*************CODE BEGINS*************
/* Explanation:
Based on jackknife pseudo values.
X_i is a statistic based on all obs except the i-th
dfbeta_i = (b - b_i)/se_i(b) officially
This version uses se(b) from svy: logistic regression
in place of se_i(b).
*/

sysuse auto, clear
gen makr = substr(make,1,2)
svyset makr [pw = trunk]
local lhs foreign //response
local xvars  turn price  // predictors

//get pseudo values:
// Not svy: jackknife, because units are clusters
jackknife , keep:  logistic `lhs'   `xvars' [pw = trunk]
// get betas and se for all data
svy: logistic `lhs' `xvars'
foreach z of varlist `xvars'{
gen dfb_`z' = ///

sum dfb*   //dfbetas
***********CODE ENDS*******************
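
The body of the -foreach- loop in the listing above is cut off in this archived copy. The sketch below is not a reconstruction of it. It is a brute-force alternative that applies the formula from the header comment, dfbeta_i = (b - b_i)/se(b), by refitting the model once per omitted observation instead of using jackknife pseudovalues. The scalar names b_`z' and se_`z' are hypothetical.

```stata
sysuse auto, clear
gen makr = substr(make,1,2)
svyset makr [pw = trunk]
local lhs foreign        // response
local xvars turn price   // predictors

* full-sample coefficients and design-based standard errors
svy: logistic `lhs' `xvars'
foreach z of local xvars {
    scalar b_`z'  = _b[`z']
    scalar se_`z' = _se[`z']
    gen double dfb_`z' = .
}

* refit without observation i and store (b - b_i)/se(b)
forvalues i = 1/`=_N' {
    quietly logistic `lhs' `xvars' [pw = trunk] if _n != `i'
    foreach z of local xvars {
        quietly replace dfb_`z' = (scalar(b_`z') - _b[`z'])/scalar(se_`z') in `i'
    }
}
sum dfb_*   // weighted dfbetas
```

This is slow for large datasets because it fits the model N times. It deletes single observations rather than clusters, in line with the comment in the original code.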

On Nov 16, 2012, at 5:22 PM, jcalder wrote:

I am doing a logistic regression analysis on a complex survey dataset and would like to do the following postestimation analyses:
	1. ROC
	2. Classification table
	3. Sensitivity curve
	4. Get the predicted probabilities of a positive outcome
	5. Evaluate influential data

Numbers 1-3 require frequency weights and I have pweights. Is there another way to get these in Stata?

What commands do you suggest using for numbers 4 and 5 with the svy commands?

Dr. Jennifer Calder
