# Re: st: svy and pweight postestimation tools

From: Steven Samuels
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: svy and pweight postestimation tools
Date: Sun, 18 Jan 2009 13:49:32 -0500

Carissa,

I think that the legitimacy is "obvious" from inspection of the formulas for weighted data. Still, here's a demonstration that -lroc- with frequency weights produces the same area under the ROC curve as a properly probability-weighted estimate.

I computed probability-weighted versions of the ROC curve and AUC with Roger Newson's programs -somersd- and -senspec-, available at SSC. -somersd- computes the AUC (he calls it the "c" statistic), and -senspec- produces sensitivities and specificities for all cut points. Both take pweights, and -somersd- will take a cluster variable, so that you can compute a proper CI for the area under the curve. I had to add a zero-zero point to Roger's results before plotting. If you want to completely satisfy your committee, just use the probability-weighted versions. Be sure to zap gremlins before trying this code.
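
The formulas behind this claim can be sketched as follows. The notation is an assumption introduced here for illustration (it is not taken from the -lroc-, -somersd-, or -senspec- documentation): $y_i$ is the 0/1 outcome, $\hat p_i$ the predicted probability, $w_i$ the weight, and $c$ a cut point.

```latex
\begin{align*}
\text{sens}(c) &= \frac{\sum_{i:\,y_i=1} w_i \,\mathbf{1}[\hat p_i \ge c]}{\sum_{i:\,y_i=1} w_i}, \qquad
\text{spec}(c) = \frac{\sum_{j:\,y_j=0} w_j \,\mathbf{1}[\hat p_j < c]}{\sum_{j:\,y_j=0} w_j}, \\[4pt]
\text{AUC} &= \frac{\sum_{i:\,y_i=1}\sum_{j:\,y_j=0} w_i w_j
  \left(\mathbf{1}[\hat p_i > \hat p_j] + \tfrac12\,\mathbf{1}[\hat p_i = \hat p_j]\right)}
  {\left(\sum_{i:\,y_i=1} w_i\right)\left(\sum_{j:\,y_j=0} w_j\right)}.
\end{align*}
```

Each quantity depends on the data only through weighted sums, so reading an integer $w_i$ as "this row occurs $w_i$ times" (an fweight) or as "this row represents $w_i$ population units" (a pweight) gives the same point estimates. The same holds for the logistic coefficients, since both weight types solve the same weighted score equations. What differs is the variance: frequency weights treat $\sum_i w_i$ as the number of independent observations, so standard errors and CIs from the fweighted run should not be carried over to survey data.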
-Steve

```
**************************CODE BEGINS**************************
sysuse auto,clear
****************************************************
* Frequency weighted analysis
****************************************************
logistic foreign mpg [fw=rep78]
predict phat0
lroc [fw=rep78]

****************************************************
* Probability weights
****************************************************
svyset _n [pweight=rep78]
quietly svy: logistic foreign mpg
predict phat

somersd foreign phat [pweight=rep78], tr(c)
matrix b = e(b)
local auc = b[1,1]
di   "Area under the Curve: " %6.5f `auc'

****************************************************
*  Graph ROC Curve with probability weights
****************************************************
senspec foreign phat [pweight=rep78], sensitivity(sens) specificity(spec)
tempfile t1
save `t1'
clear
input spec sens
1 0
end
append using `t1'
gen ispec=1-spec

twoway (scatter sens ispec , sort(sens ispec) connect(L) mlab(mpg)) (line sens sens)
***************************CODE ENDS***************************
```

On Jan 17, 2009, at 5:23 PM, Carissa Moffat Miller wrote:

Steve,

I was able to create the ROC curves using your advice about converting the pweights to fweights. However, a dissertation committee member has now asked me to justify (provide documentation for) the legitimacy of such a conversion. Is the conversion just a way to put the pweight into a format that the ROC command will accept, artificially calling it an "fweight"?

I was not able to find this specific issue addressed in the reference below, and I have not been able to find another reference. Do you have any suggested citations?

Carissa

```
From: sjhsamuels@earthlink.net
Subject: Re: st: svy and pweight postestimation tools
Date: Sun, 23 Nov 2008 12:13:01 -0500
To: statalist@hsphsun2.harvard.edu

Carissa, consider ROC curves (the classification tables are not very
useful in my experience). ROC curves show the trade-off between
sensitivity and specificity. You would usually want population
estimates of these probabilities, so ignoring the weights wouldn't be
wise.

My previous post describes how you can compute residuals. These are
inherently unweighted, because observations with the same covariate
pattern will have the same predicted value, and so have only two
values of residuals (for events and non-events). If you are
comparing mean residuals, you might choose to weight them. See Korn
& Graubard, Analysis of Health Surveys, Wiley, 1999, pp 105-115.

-Steve

On Nov 23, 2008, at 10:40 AM, Carissa Moffat Miller wrote:

```
```

Steve and Joao,

I found the goodness of fit measure do file from your discussions
(svylogitgof) and thought there might be something similar for the estat
clas or residuals for svy.

All I was trying to say in my note is that the strata and PSUs account for
so little difference in the outcome that if it were possible to run
residuals or classification tables using just pweights, I wanted to keep
that option open. Such as:

xi: logistic aepart i.agecat i.Incomequ i.HIGHEDUC female [pweight=FAWT]

But it appears that I will have the same issues. Thank you so much for
your responses and help.

Carissa

```
```

2008/11/22 Steven Samuels :
```
```--

Carissa:

-help logistic postestimation- will show you which commands are available
after -svy: logistic-. The -estat clas- command is not one of them in
Stata 9 or 10. -predict- with a -residuals- option is valid in Stata 10.1
but not in Stata 9. You _can_ compute your own weighted survey -
of fit.

predict hat, xb
gen hat2 = hat*hat
svy: logistic aepart hat hat2 //link test is the significance of hat2

You can also construct ROC Curves. Use -logistic- with fweights, the survey
weights rounded to the nearest integer. See the thread at:
http://www.stata.com/statalist/archive/2007-08/msg00739.html#_jmp0_ .

-Steve

On Nov 21, 2008, at 11:45 AM, Carissa Moffat Miller wrote:

```
```
StataList:

I am conducting logistic regression for a complex survey design using
Stata version 9. I have found in your past discussions and the user manuals
that many postestimation tests are not appropriate with svy commands. I
have been unable to get the following commands to work either with an svy
command or by just using the pweights in Stata.

I have been able to get these to work in another software program using
the weights, but I'm concerned it isn't appropriately applied. Can someone
tell me: 1) if these tests are appropriate with complex survey data or just
pweights, and 2) if so, what are the commands or where would I find them?
or 3) if not appropriate, a reference I might cite?

(Note: The strata and PSUs, when analyzed separately, provide design
effects almost equal to 1 so the effects in my model are almost entirely
from the weighting. So, I could get results - except for standard errors -
using just the weights.)

Cheers, Carissa

Syntax and error messages:

svyset APSU [pweight=FAWT], strata (ASTRATUM)
xi: svy: logistic aepart i.agecat i.Incomequ i.HIGHEDUC employed female urban

estat clas

{ERROR}: invalid subcommand clas

predict r, residuals
summarize r, detail

{ERROR}: option residuals not allowed

*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/
```
```

--
----------------------------------------
Joao Ricardo Lima, D.Sc.
Professor
UFPB-CCA-DCFS
Fone: +553138923914
Skype: joao_ricardo_lima
----------------------------------------
```