Statalist


[Date Prev][Date Next][Thread Prev][Thread Next][Date index][Thread index]

Re: st: Predicted probabilities in graph


From   "Richard Ohrvall" <richard.ohrvall@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: Predicted probabilities in graph
Date   Fri, 5 Sep 2008 10:31:39 +0200

Steven,

Thank you for your extensive reply. I will look into this during the weekend.

All the best,
Richard

On 9/5/08, Steven Samuels <sjhsamuels@earthlink.net> wrote:
> Use this one.
> It didn't take me long to find a bad type.
> -Steve
>
>
> On Aug 28, 2008, at 8:07 AM, Richard Ohrvall wrote:
>
> > Hi!
> >
> > I am trying to illustrate predicted probabilities for a successful
> > outcome in a logistic regression, e.g. when one independent variable
> > goes from min to max when all other variables are kept at their mean.
> >
> > This can be done by using the user-written program -prgen- by J. Scott
> > Long and J. Freese and the way that they explain in their book
> > Regression Models for Categorical Depedent Variables Using Stata
> > (2006), Stata Press. However, according to the book, the program can
> > not be used when data comes from a complex sample, i.e.  when prefix
> > svy is used.
> >
> > Since I mainly use survey data with complex sample design, this is a
> > problem for me if I want to present the confidence intervals in the
> > graph. Therefore, I wonder if there is some other program or some
> > other way to make predicted probabilities for such graphs when taking
> > the sampling design into account when estimating the confidence
> > interval?
> >
> >
> Here is a do file for Richard. It differs from the previous solution I
> submitted in two main ways:
> 1. It adds the confidence interval endpoints that Richard requested
> 2. It adjusts predictors to their survey-estimated means, not to their raw
> sample means.
>
> The second feature is the primary contribution. Stata's -adjust- can do all
> that the previous version did, and more. I simply was not familiar with
> -adjust- and didn't recognize that it met most of Richard's needs--except
> the need to adjust to survey-estimated means. In fact, wiith only a few
> covariates, Richard could have added those means by hand.
>
>
> /**************************START
> CODE***********************************/
>
> local pname pprob02    //SET PROGRAM NAME FOR USE IN LOG & GRAPH
> capture log close
> set more off
> log using `pname', replace text
>
> /***************************************************************************
> This do file e creates predicted probabilities from survey logistic
> regression which vary in one predictor at a time, holding others at their
> estimated population means.  Also computed are upper and lower confidence
> interval endpoints for the prediction.  An optional  graphics section plots
> predictions and CI endpoints for each variable against that variable.
>
>
>  The code takes advantage of Stata's -adjust- command to compute adjusted
> logit values and their standard errors.
> ****************************************************************************/
>
>  /* -svyset- the auto dat      */
> sysuse auto,clear
> svyset _n [pweight=rep78]
>
> /***************************************************************************
> Define  a predictor list for the right-hand-side of the svy: logit equation
> ****************************************************************************/
> local  rhs mpg weight trunk
>
> /***************************************************************************
> Compute estimated means of each predictor to use in the adjustment. These
> are the variable names prefixed by m_
> ****************************************************************************/
>
> svy: mean `rhs'
> foreach v of varlist `rhs' {
> local m_`v' = _b[`v']
> }
>
> /***************************************************************************
> For each varying predictor, build up  parts of the  -adjust- command.
> If mpg is the predictor, for example, the adjust command requires:
>
>  adjust weight =  2941.106382978724 trunk =  13.73191489361702
> The constants are the means from -svy: mean- for variables OTHER than mpg.
> ****************************************************************************/
>
> foreach v of varlist `rhs' {
> local less : list rhs -v
> local listx " "
> foreach w of varlist `less' {
> local lw "`w' =  `m_`w''  "  // e.g. "weight =  2941.106382978724"
> local listx `listx' `lw'     // Augment the list
> }
> local list_`v' `listx'
> // di "`v'"
> // di  "`list_`v''"
>  }
> /***************************************************************************
>  Now the survey logit command itself:
> ****************************************************************************/
> svy: logit foreign `rhs'
>
> /***************************************************************************
>  Create a constant to multiply the standard error of the prediction
> ****************************************************************************/
>
> local mult = invttail(e(df_r), (1- .95)/2)
>
> /***************************************************************************
> For each predictor variable v (e.g. mpg), use -adjust- to compute:
>
> xb_v      : the logit of the predicted value
> se_v      : the se of the logit
>
> From these, get:
>
> pred_v    : predicted value for each observation's value of v, holding other
>            predictors at their mean
> llim_v    : lower 95% confidence interval endpoint
> ulim_v    : upper 95% confidence interval endpoint
> ****************************************************************************/
>
> foreach v of varlist `rhs' {
> adjust "`list_`v''" , g(xb_`v' se_`v') se
>
> gen     pred_`v' = 1/(1+exp(-xb_`v'))
>
> gen     llim_`v' = xb_`v' - `mult'*se_`v'
> gen     ulim_`v' = xb_`v' + `mult'*se_`v'
>
> replace llim_`v' = 1/(1+exp(-llim_`v'))
> replace ulim_`v' = 1/(1+exp(-ulim_`v'))
>
> label var pred_`v' "Predicted"
> label var llim_`v' "Lower CI"
> label var ulim_`v' "Upper CI"
>  }
>
> sum xb_* se_*
> drop xb_* se_*   //optionally drop
>
> /***************************************************************************
> Save the data if you want to graph later.
> ****************************************************************************/
>
> // save (choose a new name)
>
> /***************************************************************************
> Optional Graph Module
> Graph the prediction and confidence interval endpoints for each v against v,
> and save the graph.
> ****************************************************************************/
>
>
> foreach v of varlist `rhs' {
> twoway scatter pred_`v'  llim_`v' ulim_`v' `v', msymbol(o o o)  title("Plot
> of Prediction against `:variable label `v''" "Other Predictors at their
> Mean")  note("Program: `pname'. Data:    ")  saving(graph_`v', replace)
> }
>
> log close
> set more on
>
> /************************END CODE
> ***********************************/
>
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
>
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/



© Copyright 1996–2014 StataCorp LP   |   Terms of use   |   Privacy   |   Contact us   |   What's new   |   Site index