Statalist


[Date Prev][Date Next][Thread Prev][Thread Next][Date index][Thread index]

Re: st: Predicted probabilities in graph


From   Steven Samuels <sjhsamuels@earthlink.net>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: Predicted probabilities in graph
Date   Thu, 4 Sep 2008 20:19:13 -0400

Use this one.
It didn't take me long to find a bad type.
-Steve


On Aug 28, 2008, at 8:07 AM, Richard Ohrvall wrote:

Hi!

I am trying to illustrate predicted probabilities for a successful
outcome in a logistic regression, e.g. when one independent variable
goes from min to max when all other variables are kept at their mean.

This can be done by using the user-written program -prgen- by J. Scott
Long and J. Freese and the way that they explain in their book
Regression Models for Categorical Depedent Variables Using Stata
(2006), Stata Press. However, according to the book, the program can
not be used when data comes from a complex sample, i.e.  when prefix
svy is used.

Since I mainly use survey data with complex sample design, this is a
problem for me if I want to present the confidence intervals in the
graph. Therefore, I wonder if there is some other program or some
other way to make predicted probabilities for such graphs when taking
the sampling design into account when estimating the confidence
interval?

Here is a do file for Richard. It differs from the previous solution I submitted in two main ways:
1. It adds the confidence interval endpoints that Richard requested
2. It adjusts predictors to their survey-estimated means, not to their raw sample means.

The second feature is the primary contribution. Stata's -adjust- can do all that the previous version did, and more. I simply was not familiar with -adjust- and didn't recognize that it met most of Richard's needs--except the need to adjust to survey-estimated means. In fact, wiith only a few covariates, Richard could have added those means by hand.


/**************************START CODE***********************************/

local pname pprob02 //SET PROGRAM NAME FOR USE IN LOG & GRAPH
capture log close
set more off
log using `pname', replace text

/ ************************************************************************ ***
This do file e creates predicted probabilities from survey logistic regression which vary in one predictor at a time, holding others at their estimated population means. Also computed are upper and lower confidence interval endpoints for the prediction. An optional graphics section plots predictions and CI endpoints for each variable against that variable.

The code takes advantage of Stata's -adjust- command to compute adjusted logit values and their standard errors.
************************************************************************ ****/

/* -svyset- the auto dat */
sysuse auto,clear
svyset _n [pweight=rep78]

/ ************************************************************************ ***
Define a predictor list for the right-hand-side of the svy: logit equation
************************************************************************ ****/
local rhs mpg weight trunk

/ ************************************************************************ ***
Compute estimated means of each predictor to use in the adjustment. These are the variable names prefixed by m_ ************************************************************************ ****/

svy: mean `rhs'
foreach v of varlist `rhs' {
local m_`v' = _b[`v']
}

/ ************************************************************************ ***
For each varying predictor, build up parts of the -adjust- command.
If mpg is the predictor, for example, the adjust command requires:

adjust weight = 2941.106382978724 trunk = 13.73191489361702
The constants are the means from -svy: mean- for variables OTHER than mpg.
************************************************************************ ****/

foreach v of varlist `rhs' {
local less : list rhs -v
local listx " "
foreach w of varlist `less' {
local lw "`w' = `m_`w'' " // e.g. "weight = 2941.106382978724"
local listx `listx' `lw' // Augment the list
}
local list_`v' `listx'
// di "`v'"
// di "`list_`v''"
}
/ ************************************************************************ ***
Now the survey logit command itself:
************************************************************************ ****/
svy: logit foreign `rhs'

/ ************************************************************************ ***
Create a constant to multiply the standard error of the prediction
************************************************************************ ****/

local mult = invttail(e(df_r), (1- .95)/2)

/ ************************************************************************ ***
For each predictor variable v (e.g. mpg), use -adjust- to compute:

xb_v : the logit of the predicted value
se_v : the se of the logit

From these, get:

pred_v : predicted value for each observation's value of v, holding other
predictors at their mean
llim_v : lower 95% confidence interval endpoint
ulim_v : upper 95% confidence interval endpoint
************************************************************************ ****/

foreach v of varlist `rhs' {
adjust "`list_`v''" , g(xb_`v' se_`v') se

gen pred_`v' = 1/(1+exp(-xb_`v'))

gen llim_`v' = xb_`v' - `mult'*se_`v'
gen ulim_`v' = xb_`v' + `mult'*se_`v'

replace llim_`v' = 1/(1+exp(-llim_`v'))
replace ulim_`v' = 1/(1+exp(-ulim_`v'))

label var pred_`v' "Predicted"
label var llim_`v' "Lower CI"
label var ulim_`v' "Upper CI"
}

sum xb_* se_*
drop xb_* se_* //optionally drop

/ ************************************************************************ ***
Save the data if you want to graph later.
************************************************************************ ****/

// save (choose a new name)

/ ************************************************************************ ***
Optional Graph Module
Graph the prediction and confidence interval endpoints for each v against v, and save the graph.
************************************************************************ ****/


foreach v of varlist `rhs' {
twoway scatter pred_`v' llim_`v' ulim_`v' `v', msymbol(o o o) title ("Plot of Prediction against `:variable label `v''" "Other Predictors at their Mean") note("Program: `pname'. Data: ") saving (graph_`v', replace)
}

log close
set more on

/************************END CODE ***********************************/

*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/




© Copyright 1996–2014 StataCorp LP   |   Terms of use   |   Privacy   |   Contact us   |   What's new   |   Site index