Statalist



Re: st: predicted probabilities


From   "Mona Mowafi" <mmowafi@hsph.harvard.edu>
To   <statalist@hsphsun2.harvard.edu>
Subject   Re: st: predicted probabilities
Date   Mon, 17 Nov 2008 14:17:30 -0500

Dear Joao, Maarten, all,

Thank you for your help.  It seems I am getting the same values regardless of whether I set the other variables constant or not, and this seems odd.  Here's what I did:

*//GETTING PREDICTED PROBABILITIES OF BMICAT FROM THE FULL MODEL*//

* these are the do-files for my multinomial model
do "C:\DOCUME~1\MONAMO~1\LOCALS~1\Temp\STD06000000.tmp"
do "C:\DOCUME~1\MONAMO~1\LOCALS~1\Temp\STD06000000.tmp"
predict pbmi2 pbmi3 pbmi4, pr
sum pbmi2
sum pbmi3
sum pbmi4
table ED2, c(m pbmi2 m pbmi3 m pbmi4)
table WB_pov, c(m pbmi2 m pbmi3 m pbmi4)
table ASSET_INDEX, c(m pbmi2 m pbmi3 m pbmi4)
table PCAwealthindex, c(m pbmi2 m pbmi3 m pbmi4)
describe pbmi2
describe pbmi3
describe pbmi4

*//GETTING PREDICTED PROBABILITIES OF BMICAT KEEPING OTHER VARIABLES CONSTANT (SET AT LOWEST RISK GROUP FOR ALL)*//

preserve
replace WB_pov=4
replace ASSET_INDEX=1
replace PCAwealthindex=1
replace  AGECAT4=1
replace FATHERED=1
replace GENHEALTH_PAST=1
predict pbmiset2 pbmiset3 pbmiset4, p
table ED2, c(m pbmiset2 m pbmiset3 m pbmiset4)
table WB_pov, c(m pbmi2 m pbmi3 m pbmi4)
table ASSET_INDEX, c(m pbmi2 m pbmi3 m pbmi4)
table PCAwealthindex, c(m pbmi2 m pbmi3 m pbmi4)
restore
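
A possible explanation for the identical values, sketched on the nhanes2f data from Maarten's example rather than on the original survey. If the model was fit with xi: (as in the earlier post), -predict- reads the _I* dummies that xi created, so replacing the source variables (WB_pov, ASSET_INDEX, and so on) would leave the predictions unchanged. Note also that three of the four tables in the block above summarize pbmi2-pbmi4 rather than pbmiset2-pbmiset4. The stub names pobs, pwrong and pfix and the model specification below are made up for illustration.

```stata
* Sketch on nhanes2f; stub names pobs, pwrong, pfix are hypothetical
webuse nhanes2f, clear
svyset psuid [pweight=finalwgt], strata(stratid)
xi: svy: mlogit health i.race i.rural age

* predictions at the observed covariate values
predict pobs*, pr

preserve
* changing the source variable does not touch the xi dummy _Irural_1,
* so these predictions are identical to pobs*
replace rural = 0
predict pwrong*, pr
quietly count if pwrong1 != pobs1
local ndiff_wrong = r(N)
display "observations whose prediction changed: `ndiff_wrong'"

* changing the dummy that is actually in the model alters the predictions
replace _Irural_1 = 0
predict pfix*, pr
quietly count if pfix1 != pobs1
local ndiff_fix = r(N)
display "observations whose prediction changed: `ndiff_fix'"

* tabulate the NEW predictions, not the original ones
tabstat pfix1 pfix2 pfix3 pfix4 pfix5, by(race) statistics(mean)
restore
```

-tabstat- is used here instead of -table, c()- only because it was mentioned in the thread as an equivalent alternative; either should give the same means.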

I have 2 follow-up questions:

1) Does it make sense that I would get the same predicted probabilities whether or not I fixed the other variables in the model?

2) Do you know how I can get 95% CIs for these means? (I did not see that among the options in the Stata help.)
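
On question 2, one rough way to attach 95% CIs to the group means is sketched below on the nhanes2f example data. It is not a definitive answer: -svy: mean- treats the predicted probabilities as if they were observed data, so the interval reflects the survey design but ignores the uncertainty in the estimated coefficients. The -margins- line is an alternative that only exists in Stata 11 or later, which postdates this thread. The stored-estimates name mlog is made up for illustration.

```stata
* Sketch on nhanes2f; the stored-estimates name mlog is hypothetical
webuse nhanes2f, clear
svyset psuid [pweight=finalwgt], strata(stratid)
svy: mlogit health rural black orace sex age
predict pr*, pr
estimates store mlog

* design-based mean and 95% CI of one predicted probability by group;
* this treats pr1 as data and ignores uncertainty in the coefficients
svy: mean pr1, over(race)

* svy: mean replaced the mlogit results in e(), so bring them back
estimates restore mlog

* Stata 11 or later only: delta-method CIs computed from the model itself
margins, over(race) predict(outcome(1))
```

Repeating the -svy: mean- and -margins- lines for the other outcomes (pr2, outcome(2), and so on) would cover the remaining categories.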

A million thanks,
Mona




>>> "Joao Ricardo F. Lima" <jricardofl@gmail.com> 11/17/2008 6:14 AM >>>
Dear Mona, Maarten and Statalisters,

reading Maarten's answer, I would like to ask if this procedure is correct:

******
" // creating predictions while keeping other variables constant
 // predicted probabilities of urban women of average age
 preserve
 sum age if e(sample), meanonly
 replace age = r(mean)
 replace female = 1
 replace rural = 0

 predict pra*, pr
 table race , c(m pra1 m pra2 m pra3 m pra4 m pra5)

 restore"
***************
I ask because the value of r(mean) (the sample mean) differs from the estimate given by svy: mean (the population mean):

webuse nhanes2f, clear
svyset psuid [pweight=finalwgt], strata(stratid)
. svy: mean age
(running mean on estimation sample)

Survey: Mean estimation

Number of strata =      31       Number of obs    =      10337
Number of PSUs   =      62       Population size  =  117023659
                                 Design df        =         31

--------------------------------------------------------------
             |             Linearized
             |       Mean   Std. Err.     [95% Conf. Interval]
-------------+------------------------------------------------
         age |   42.23732   .3034412      41.61844    42.85619
--------------------------------------------------------------

. sum age if e(sample)

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
         age |     10337     47.5637    17.21678         20         74


If I am using svy: mlogit, shouldn't the population mean be used instead?
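
If the population mean is the reference value wanted, a minimal sketch of how it could be plugged into the quoted block follows. Both the sample mean and the weighted mean are defensible reference values; the choice only changes which hypothetical person the predictions describe. -summarize- does not accept pweights, so aweights are used here, which give the same point estimate of the mean. The sketch also replaces sex rather than female, because sex is the variable that appears in the model; the coding 1 = male, 2 = female is an assumption that should be checked. The local name popmean is made up for illustration.

```stata
* Sketch on nhanes2f; the local name popmean is hypothetical
webuse nhanes2f, clear
svyset psuid [pweight=finalwgt], strata(stratid)
svy: mlogit health rural black orace sex age

preserve
* weighted mean of age over the estimation sample; unlike svy: mean,
* summarize leaves the mlogit results in e() untouched
summarize age [aweight=finalwgt] if e(sample), meanonly
local popmean = r(mean)
display "weighted mean age: `popmean'"
replace age = `popmean'

* sex, not female, is the variable in the model above
* (assumption: sex is coded 1 = male, 2 = female; check with -label list-)
replace sex = 2
replace rural = 0

predict pra*, pr
tabstat pra1 pra2 pra3 pra4 pra5, by(race) statistics(mean)
restore
```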

Thanks a lot,

Best Regards,

Joao Lima


2008/11/16 Maarten buis <maartenbuis@yahoo.co.uk>:
> --- Mona Mowafi <mmowafi@hsph.harvard.edu> wrote:
>> I am seeking to attain predicted probabilities of my outcome (BMI
>> cats - normal, overweight, obese) for four main independent
>> variables.  I am not sure how to do it, but here is what I have
>> tried:
>>
>> svyset [pweight=femaleweight], strata(order) psu(place)
>>
>> xi: svymlogit BMICAT i.AGECAT4 i.ED2 i.WB_pov i.ASSET_INDEX ///
>>     i.PCAwealthindex i.FATHERED i.GENHEALTH_PAST, basecategory(2) nolog
>> svymlogit, rrr
>>
>> predict p1 p2 p3
>> sort ED2
>> by ED2: sum p1
>> by ED2: sum p2
>> by ED2: sum p3
>>
>> Here are my main questions:
>>
>> 1) With this syntax, does p1 refer to my reference outcome = normal
>> weight; p2 = overweight, p3 = obese?  I want to make sure that I am
>> interpreting what p1, p2, and p3 are properly.
>
> You can see what category the variables refer to by looking at the
> labels that -predict- has attached to them. You can see those by typing
> -desc p*- (which will describe all variables whose names start with p;
> if there are too many of those, type -desc p1 p2 p3-).
>
>> 2) If I sort and sum by p1, p2, and p3 - is this giving me the mean
>> predicted probability of each of my three outcomes for all
>> individuals in each of those three sub-categories (of education, for
>> example, as seen above)?  That is what I'm trying to do.
>
> Yes, but there is a subtle issue here: the differences between the
> educational categories may be due to the effect of education but can
> also be due to differences between the educational categories in the
> distribution of the other explanatory variables. For instance, the lower
> educational categories will consist of individuals from a lower social
> background, and these tend to have a higher BMI. You
> can keep the other variables constant by first replacing the other
> variables by some number, e.g. the mean, then predicting, and then
> making the tables.
>
> Both methods are illustrated below (I used -table- in this example as
> it creates more compact tables, but -by ...: sum...- will work too,
> another alternative would be -tabstat-).
>
> *---------------------- begin example ---------------------
> webuse nhanes2f, clear
> svyset psuid [pweight=finalwgt], strata(stratid)
> tab health
> svy: mlogit health rural black orace sex age
>
> // create predictions without keeping other variables constant
> predict pr*, pr
>
> // the labels show which variable belongs to which category
> desc pr*
>
> // comparing the average predicted probabilities with the observed
> // percentages
> sum pr*
> tab health
>
> table race , c(m pr1 m pr2 m pr3 m pr4 m pr5)
>
>
> // creating predictions while keeping other variables constant
> // predicted probabilities of urban women of average age
> preserve
> sum age if e(sample), meanonly
> replace age = r(mean)
> replace female = 1
> replace rural = 0
>
> predict pra*, pr
> table race , c(m pra1 m pra2 m pra3 m pra4 m pra5)
>
> restore
> *--------------------- end example -------------------
> (For more on how to use examples I sent to the Statalist, see
> http://home.fsw.vu.nl/m.buis/stata/exampleFAQ.html )
>
> Hope this helps,
> Maarten
>
> -----------------------------------------
> Maarten L. Buis
> Department of Social Research Methodology
> Vrije Universiteit Amsterdam
> Boelelaan 1081
> 1081 HV Amsterdam
> The Netherlands
>
> visiting address:
> Buitenveldertselaan 3 (Metropolitan), room N515
>
> +31 20 5986715
>
> http://home.fsw.vu.nl/m.buis/ 
> -----------------------------------------
>
>
>
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search 
> *   http://www.stata.com/support/statalist/faq 
> *   http://www.ats.ucla.edu/stat/stata/ 
>



-- 
-------------------------------
Joao Ricardo Lima
Professor
UFPB-CCA-DCFS
+553138923914
-------------------------------

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search 
*   http://www.stata.com/support/statalist/faq 
*   http://www.ats.ucla.edu/stat/stata/



© Copyright 1996–2014 StataCorp LP