Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.

# Re: st: Query about finding predicted change in probability after logit for changing two variables

 From Urmi Bhattacharya To statalist@hsphsun2.harvard.edu Subject Re: st: Query about finding predicted change in probability after logit for changing two variables Date Fri, 10 Jun 2011 12:39:12 -0400

Hi Austin and fellow Statalisters,

I ran the same code again and it worked!. But I got a note (as
indicated in the log below)

prog drop _all

. prog mymarg, rclass
1. u if rep78<. using `c(sysdir_base)'a/auto, clear
2. bsample
3. g byte medium=inlist(rep78,3)
4. g byte high=inlist(rep78,45)
5. g byte turnhi=(turn>35)
6. logit foreign high medium turnhi headroom mpg trunk
7. replace medium=1
8. replace high=0
9. replace turnhi=0
10. predict p1
11. replace medium=0
12. replace high=1
13. replace turnhi=1
14. predict p2
15. g dp=p2-p1 if e(sample)
16. su dp, mean
17. ret scalar dp=r(mean)
18. eret clear
19. end

. simulate, reps(141) seed(123): mymarg

command:  mymarg
dp:  r(dp)

Simulations (141)
----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5
...............x...x.......x............x.........    50
.........................x........................   100
.........x.................x.............

. bstat, stat(dp) n(`n')

Bootstrap results                               Number of obs      =        69
Replications       =       134

------------------------------------------------------------------------------
|   Observed   Bootstrap                         Normal-based
|      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
dp |   .1620932   .2280807     0.71   0.477    -.2849368    .6091231
------------------------------------------------------------------------------
Note: one or more parameters could not be estimated in some of the bootstrap
replicates; standard-error estimates include only complete replications.

I understand, it may happen because of the sampling with replacement
that happens with the bsample in the program code. But since it also
mentions that the standard error estimates include only complete
replications, I think that the results can be trusted. I would be glad

Thank you,

Urmi

On Fri, Jun 10, 2011 at 11:25 AM, Urmi Bhattacharya <ub3@indiana.edu> wrote:
> Hi Austin,
>
> Thank you so much for both your responses. I understand I need to
> calculate the change in predicted probability at each duration of risk
> separately, using only the sample of people who are at risk in the
> duration.
>
> So I need to modify the code below in such a way do as to figure out
> how to do the same for a subset of the sample which is what i am
> struggling to do.
>
> However, when i run the code below just as you have typed it,I get the
> following error message:
>
> u if rep78<. using `c(sysdir_base)'a/auto
> (1978 Automobile Data)
>
> .
> . g byte medium=inlist(rep78,3,4)
>
> . g byte high=inlist(rep78,5)
>
> . g byte turnhi=(turn>35)
>
> . logit foreign high medium turnhi headroom mpg trunk
>
> Iteration 0:   log likelihood = -42.400729
> Iteration 1:   log likelihood =  -26.95755
> Iteration 2:   log likelihood = -25.645315
> Iteration 3:   log likelihood = -25.378962
> Iteration 4:   log likelihood = -25.326588
> Iteration 5:   log likelihood = -25.315298
> Iteration 6:   log likelihood = -25.313291
> Iteration 7:   log likelihood = -25.312837
> Iteration 8:   log likelihood = -25.312726
> Iteration 9:   log likelihood = -25.312703
> Iteration 10:  log likelihood = -25.312698
> Iteration 11:  log likelihood = -25.312697
>
> Logistic regression                               Number of obs   =         69
>                                                  LR chi2(6)      =      34.18
>                                                  Prob > chi2     =     0.0000
> Log likelihood = -25.312697                       Pseudo R2       =     0.4030
>
> ------------------------------------------------------------------------------
>     foreign |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
> -------------+----------------------------------------------------------------
>        high |   19.23396   2458.202     0.01   0.994    -4798.753    4837.221
>      medium |   16.98224   2458.202     0.01   0.994    -4801.005    4834.969
>      turnhi |   -1.44119   .9947506    -1.45   0.147    -3.390866     .508485
>    headroom |  -.5846668   .6258184    -0.93   0.350    -1.811248    .6419148
>         mpg |  -.0071336   .0863283    -0.08   0.934    -.1763339    .1620667
>       trunk |  -.1430355   .1284915    -1.11   0.266    -.3948742    .1088032
>       _cons |  -13.22992   2458.203    -0.01   0.996     -4831.22     4804.76
> ------------------------------------------------------------------------------
> Note: 6 failures and 0 successes completely determined.
>
> . replace medium=1
>
> . replace high=0
>
> . replace turnhi=0
>
> . predict p1
> (option pr assumed; Pr(foreign))
>
> . replace medium=0
>
> . replace high=1
>
> . replace turnhi=1
>
> . predict p2
> (option pr assumed; Pr(foreign))
>
> . g dp=p2-p1
>
> .
> . mat dp=r(mean)
>
> . loc n=r(N)
>
> . prog drop _all
>
> . prog mymarg, rclass
>  1. u if rep78<. using `c(sysdir_base)'a/auto, clear
>  2. bsample
>  3. g byte medium=inlist(rep78,3)
>  4. g byte high=inlist(rep78,45)
>  5. g byte turnhi=(turn>35)
>  6. logit foreign high medium turnhi headroom mpg trunk
>  7. replace medium=1
>  8. replace high=0
>  9. replace turnhi=0
>  10. predict p1
>  11. replace medium=0
>  12. replace high=1
>  13. replace turnhi=1
>  14. predict p2
>  15. g dp=p2-p1
>  16. su dp, mean
>  17. ret scalar dp=r(mean)
>  18. eret clear
>  19. end
>
> . simulate, reps(141) seed(123): mymarg
>
>      command:  mymarg
>           dp:  r(dp)
>
> Simulations (141)
> ----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5
> ...............x...x.......x............x.........    50
> .........................x........................   100
> .........x.................x.............
>
> . bstat, stat(dp) n(`n')
> estimates post: matrix has missing values
> r(504);
>
> end of do-file
>
> r(504);
>
> Looking at the error code it says
>
>
> [P]     error . . . . . . . . . . . . . . . . . . . . . . . .  Return code 504
>        matrix has missing values;
>        You have issued a matrix command attempting a matrix operation
>        that, were it carried out, would result in a matrix with missing
>        values; dividing by zero is a good example.
>
>
> I was wondering if you could tell me why am I getting this? I wanted
> to get this code to work, so that i can try and modify it to get the
> change in predicted probability for subsamples of the dataset.
>
> Thank you so much for your kind advice on this so far.
>
> Best
>
> Urmi
>
>
>
> On Thu, Jun 9, 2011 at 3:20 PM, Austin Nichols <austinnichols@gmail.com> wrote:
>> Urmi Bhattacharya <ub3@indiana.edu> wrote:
>> "now i want to find what is the predicted change in the
>> probability of foreign for those cars with cqual_high=1 with
>> turn_high=1 from those with cqual_medium=1 and turn_low=1"
>>
>> The meaning of the above is totally unclear to me,
>> and I cannot understand you cqual_ coding is supposed to be,
>> but perhaps you mean to:
>> compare (1) medium==1 and turnhi==0
>>     to (2) high==1   and turnhi==1
>> like so:
>>
>> clear all
>> u if rep78<. using `c(sysdir_base)'a/auto
>> g byte medium=inlist(rep78,3,4)
>> g byte high=inlist(rep78,5)
>> g byte turnhi=(turn>35)
>> logit foreign high medium turnhi headroom mpg trunk
>> replace medium=1
>> replace high=0
>> replace turnhi=0
>> predict p1
>> replace medium=0
>> replace high=1
>> replace turnhi=1
>> predict p2
>> g dp=p2-p1 if e(sample)
>> su dp, mean
>> mat dp=r(mean)
>> loc n=r(N)
>> prog drop _all
>> prog mymarg, rclass
>> u if rep78<. using `c(sysdir_base)'a/auto, clear
>> bsample
>> g byte medium=inlist(rep78,3)
>> g byte high=inlist(rep78,45)
>> g byte turnhi=(turn>35)
>> logit foreign high medium turnhi headroom mpg trunk
>> replace medium=1
>> replace high=0
>> replace turnhi=0
>> predict p1
>> replace medium=0
>> replace high=1
>> replace turnhi=1
>> predict p2
>> g dp=p2-p1 if e(sample)
>> su dp, mean
>> ret scalar dp=r(mean)
>> eret clear
>> end
>> simulate, reps(141) seed(123): mymarg
>> bstat, stat(dp) n(`n')
>>
>> *But I want to reiterate that you should look at each duration's risk
>> set separately:
>> http://www.stata.com/statalist/archive/2011-06/msg00465.html
>> *
>> *   For searches and help try:
>> *   http://www.stata.com/help.cgi?search
>> *   http://www.stata.com/support/statalist/faq
>> *   http://www.ats.ucla.edu/stat/stata/
>>
>

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/