Statalist



Re: st: prediction of model using svy-commands


From   sjsamuels@gmail.com
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: prediction of model using svy-commands
Date   Sat, 10 Oct 2009 11:12:44 -0400

I was wrong, and Michael is correct.  -predictnl- is a function of
the data and will not work.  -adjust- or -margins- will solve
Rune's problem:
_________________________________________
 use http://www.stata-press.com/data/r11/margex
 mlogit group sex age
 margins, at(sex=1 age=40) predict(outcome(1))
_________________________________________
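
The same approach should carry over to a survey linear regression like
Rune's. A minimal sketch, reusing the -svyset- setup from Michael's
example quoted below: the auto data and the randomly generated psu
variable are stand-ins, not Rune's design, and -margins- requires
Stata 11 or later.

```stata
* Illustrative survey design on the auto data (made-up PSUs and weights)
clear all
sysuse auto
set seed 20091009
bysort foreign: gen psu = runiform() >= .5
svyset psu [pw=mpg], strata(foreign)

* Fit the survey regression, then predict at weight = 2000
svy: regress price weight
margins, at(weight=2000)
```

-margins- reports the linear prediction at weight=2000 together with a
delta-method standard error and confidence interval based on the survey
variance estimate.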

-Steve




On Sat, Oct 10, 2009 at 9:30 AM,  <sjsamuels@gmail.com> wrote:
> The answer to Rune's question is "No".  -prvalue- will not give the
> proper CIs with survey data.
>
> -adjust- is not the answer. For one,  Rune wants to specify the entire
> prediction equation, something that -adjust- doesn't do.  Also,
> neither  -prvalue-  nor -adjust- will hold the other predictors at
> their -svy- weighted means.   (See:
> http://www.stata.com/statalist/archive/2008-09/msg00667.html for code
> that makes -adjust- do it for -svy: logit-.)
>
> The solution is to use  -predictnl- after -svy: mlogit- .
>
> -Steve
>
>
>
> On Fri, Oct 9, 2009 at 5:23 PM, Michael I. Lichter <mlichter@buffalo.edu> wrote:
>> Rune,
>>
>> Here's one additional idea. Use -adjust-. I believe that the standard error
>> of a predicted Y-hat is slightly different from the standard error of an
>> adjusted Y-bar, but perhaps the latter will be good enough for you. Note
>> that you will have to tell -adjust- to restrict its calculations to the
>> estimation sample if that's what you want; -prvalue- does this
>> automatically. You should get identical point estimates from both, but wider
>> confidence intervals from -adjust-.
>>
>> The following program shows that -adjust- and -prvalue- generate the same
>> point estimates, and that they generate *nearly* the same CIs for the
>> non-survey regression, but the confidence intervals using -adjust- for the
>> survey regression are clearly wider, as you would expect.
>>
>> -----
>> clear all
>> sysuse auto
>> set seed 20091009
>> bysort foreign: gen psu = runiform() >= .5
>> svyset psu [pw=mpg], strata(foreign)
>> reg price weight
>> prvalue, x(weight=2000)
>> adjust weight=2000 if e(sample), ci
>> svy: reg price weight
>> prvalue, x(weight=2000)
>> adjust weight=2000 if e(sample), ci
>> -----
>>
>> Output edited to show only -adjust- and -prvalue-
>>
>>
>> . reg price weight
>> . prvalue, x(weight=2000)
>>
>> regress: Predictions for price
>>
>>                               95% Conf. Interval
>>  Predicted y:        4081.4   [   3137,    5025.9]
>>
>>   weight
>> x=    2000
>>
>> . adjust weight=2000 if e(sample), ci
>>
>> -------------------------------------------------------------------------------
>>    Dependent variable: price     Command: regress
>> Covariate set to value: weight = 2000
>> -------------------------------------------------------------------------------
>>
>> ----------------------------------------------
>>     All |         xb          lb          ub
>> ----------+-----------------------------------
>>         |    4081.42    [3120.83    5042.01]
>> ----------------------------------------------
>>    Key:  xb         =  Linear Prediction
>>          [lb , ub]  =  [95% Confidence Interval]
>>
>>
>>
>> . svy: reg price weight
>>
>> . prvalue, x(weight=2000)
>>
>> regress: Predictions for price
>>
>>                               95% Conf. Interval
>>  Predicted y:        4241.1   [ 4067.2,    4415.1]
>>
>>   weight
>> x=    2000
>>
>> . adjust weight=2000 if e(sample), ci
>>
>> -------------------------------------------------------------------------------
>>    Dependent variable: price     Command: regress
>> Covariate set to value: weight = 2000
>> -------------------------------------------------------------------------------
>>
>> ----------------------------------------------
>>     All |         xb          lb          ub
>> ----------+-----------------------------------
>>         |    4241.12    [3859.26    4622.99]
>> ----------------------------------------------
>>    Key:  xb         =  Linear Prediction
>>          [lb , ub]  =  [95% Confidence Interval]
>>
>>
>> Michael
>>
>>
>> Rune Nielsen wrote:
>>>
>>> Michael,
>>> your solution works out fine, thank you. I created dummies for my
>>> variables, and then set version to 10.1.
>>>
>>> However, in their 2006 "Regression Models for Categorical Dependent
>>> Variables Using Stata", Long and Freese say that "Unfortunately, our SPost
>>> commands do not work when the svy prefix is specified."  So, now my question
>>> is whether I can trust the confidence intervals that -prvalue- produces?
>>>
>>> Grateful if anyone could answer this.
>>>
>>> Best wishes
>>>
>>> Rune
>>>
>>>
>>>
>>>
>>>
>>> On 8 Oct 2009, at 19:31, Michael I. Lichter wrote:
>>>
>>>> Rune,
>>>>
>>>> I was hoping somebody with Stata 11.0 would answer you. That not being
>>>> the case ...
>>>>
>>>> 1. On J Scott Long's SPost web site, he says "Stata 11 news: The package
>>>> spost9_ado does not work with factor variables. We have updated prchange (v
>>>> 1.8.1) and countfit (0.8.1) to work with Stata 11. We are still working with
>>>> StataCorp to make SPost work with mlogit under Stata 11. In the short run,
>>>> the easiest thing is run mlogit under version control. So instead of:
>>>> "mlogit y a b c" use "version 10.1: mlogit y a b c" Then the SPost commands
>>>> should work fine. If you encounter other problems using Stata 11, let us
>>>> know. 10Sep09." (http://www.indiana.edu/~jslsoc/spost.htm)
>>>>
>>>> 2. I presume you were aware of this and that's why you did not use factor
>>>> variables directly in -prvalue-. Nevertheless, -prvalue- is complaining
>>>> about factor variables. Have you tried creating dummy variables and using
>>>> those instead? You can use, e.g., -tab GOLDogrestr, gen(GOLD)- to do this.
>>>> You don't have to try it with the whole model ... unless it works with a
>>>> partial model.
>>>>
>>>> 3. You could also try what he suggests for mlogit: "version 10.1: svy:
>>>> regress y a b c".
>>>>
>>>> 4. If I knew how to calculate confidence intervals for predictions I
>>>> would tell you. Perhaps somebody else will chime in. Short of that, I can
>>>> tell you that prvalue.ado calls _peciml.ado to find the upper and lower
>>>> bounds, and you can see it with -viewsource _peciml.ado-.
>>>>
>>>> Michael
>>>>
>>>>
>>>> Rune Nielsen wrote:
>>>>>
>>>>> Hi, I'm using Stata 11.0 on a Mac with OS X 10.6. I'm using SPost 9. After
>>>>> setting the survey parameters I try running this regression:
>>>>> svyset [pweight=vekt_alle], strata(kilde)
>>>>> svy: regress log_kost i.GOLDogrestr i.kjonn_omv i.smoke_bin alder2006
>>>>> i.utdannelse_omv if har364dg==1 & n474==100
>>>>>
>>>>> When I try using -prvalue-:
>>>>> prvalue, x(_IGOLDogres_2=0 _IGOLDogres_3=0 _IGOLDogres_4=0
>>>>> _Ikjonn_omv_1=0.415 _Ismoke_bin_1=0.559 _Iutdannels_1=0.479
>>>>> _Iutdannels_2=0.3047 alder2006=61.868)
>>>>>
>>>>> I get this error:
>>>>>
>>>>> factor variables and time-series operators not allowed
>>>>>
>>>>> However, if I try doing this without the svy-prefix, spost manages to
>>>>> predict nicely. I'm not that skilled, so perhaps there is some kind of
>>>>> obvious error in my commands?
>>>>>
>>>>> Best wishes,
>>>>>
>>>>> Rune
>>>>>
>>>>>
>>>>> On 8 Oct 2009, at 09:42, Michael I. Lichter wrote:
>>>>>
>>>>>> What version of Stata are you using, what version of spost, what
>>>>>> command line are you using, and what do you mean by "not possible"? In Stata
>>>>>> 10.1, using spost9, I just ran -svy: reg depvar age gender- followed by
>>>>>> -prvalue, x(age=50)- and got a sensible answer from -prvalue- (I did not,
>>>>>> however, verify that the answer was correct).
>>>>>>
>>>>>> Michael
>>>>>>
>>>>>> Rune Nielsen wrote:
>>>>>>>
>>>>>>> Dear statalist-members,
>>>>>>> I'm running a model (multiple linear regression) using the
>>>>>>> -svy- prefix to adjust for a stratified sample. The outcome is
>>>>>>> log-transformed due to large left-skewing.
>>>>>>> I would like to do some predictions using -prvalue- from the
>>>>>>> SPost package. However, this is not possible when I've used the
>>>>>>> survey version of -regress-. Does anybody know a simple solution to this?
>>>>>>> I've already tried running the model without svy, but specifying pweight and
>>>>>>> vce(cluster clustervar), but then I don't get reliable confidence intervals.
>>>>>>> Grateful for any answers (in simple language for a non-statistician)
>>>>>>>
>>>>>>> Best wishes,
>>>>>>>
>>>>>>> Rune Nielsen
>>>>>>>
>>>>>>> ---
>>>>>>> Rune Nielsen, MD, research fellow
>>>>>>> Institute of Medicine, Bergen, Norway
>>>>>>
>>>>>> --
>>>>>> Michael I. Lichter, Ph.D. <mlichter@buffalo.edu>
>>>>>> Research Assistant Professor & NRSA Fellow
>>>>>> UB Department of Family Medicine / Primary Care Research Institute
>>>>>> UB Clinical Center, 462 Grider Street, Buffalo, NY 14215
>>>>>> Office: CC 126 / Phone: 716-898-4751 / FAX: 716-898-3536
>>>>>>
>>>>>> *
>>>>>> *   For searches and help try:
>>>>>> *   http://www.stata.com/help.cgi?search
>>>>>> *   http://www.stata.com/support/statalist/faq
>>>>>> *   http://www.ats.ucla.edu/stat/stata/
>>>>>
>>>>
>>>
>>
>>
>
>
>
> --
> Steven Samuels
> sjsamuels@gmail.com
> 18 Cantine's Island
> Saugerties NY 12477
> USA
> 845-246-0774
>






