



Re: st: Post-estimation predicted probability and predicted probability calculated by hand do not match after running cloglog


From   Austin Nichols <austinnichols@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: Post-estimation predicted probability and predicted probability calculated by hand do not match after running cloglog
Date   Fri, 27 May 2011 15:40:51 -0400

Urmi Bhattacharya <ub3@indiana.edu>:
Presumably, you are miscalculating the linear index.

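webuse lbw, clear
cloglog low age lwt smoke ptl ht ui
* predicted probability from -predict-
predict p1, p
* linear index x'b from -predict-
predict z, xb
* inverse complementary log-log transform of the linear index
g p2=1-exp(-exp(z))
* p1 and p2 should have identical summary statistics
su low p1 p2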
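
If the linear index is built from the stored coefficients rather than from -predict, xb-, the usual pitfall is the factor-variable terms and their base levels. A minimal sketch, assuming the same lbw example data with i.race added as a regressor (this is not your model, and the names xb_hand and p_hand are made up for illustration):

```stata
webuse lbw, clear
cloglog low age lwt i.race smoke, nolog
predict double p1, p

* linear index by hand: continuous terms and the constant first
gen double xb_hand = _b[_cons] + _b[age]*age + _b[lwt]*lwt + _b[smoke]*smoke

* factor-variable terms: one coefficient per non-base level (base is race==1)
replace xb_hand = xb_hand + _b[2.race]*(race==2) + _b[3.race]*(race==3)

* inverse complementary log-log transform
gen double p_hand = 1 - exp(-exp(xb_hand))

* the two predictions should agree up to rounding
summarize p1 p_hand
assert reldif(p1, p_hand) < 1e-6
```

In a model fit with -nocons- and a full set of duration dummies, there is no _b[_cons] term; the duration dummy coefficients take its place in the sum.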

Why do you need to calculate by hand?  Are you trying to calculate
median or mean survival?  If so, you need to assume the last dummy for
elapsed time is the baseline hazard for all future periods as well.
What is the omitted category of time there--zero?  You will have to
loop over all possible dates, -preserve- and -replace-, and calculate
the survival probability, perhaps.
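
If survival is the goal, the predicted hazards have to be cumulated within each person, since in discrete time S(t) is the product over j<=t of (1 - h_j). A rough sketch on a tiny made-up person-period dataset (id, period, and the values of h are hypothetical, not taken from the model below):

```stata
clear
input id period h
1 1 .10
1 2 .20
1 3 .30
2 1 .05
2 2 .10
end

* running product of (1 - h) within id, in period order,
* computed as exp of the running sum of logs
bysort id (period): gen double S = exp(sum(ln(1 - h)))

list, sepby(id)
```

With real data, h would be the prediction from -predict, p- evaluated at each period's covariate values.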


On Fri, May 27, 2011 at 3:12 PM, Urmi Bhattacharya <ub3@indiana.edu> wrote:
> Dear Statalisters,
> I am estimating a discrete-time proportional hazards model using a
> cloglog model. I use a fully nonparametric specification for the
> baseline hazard. My variables include continuous as well as categorical
> ones. Of the categorical variables, three (dad_edu, mom_edu and
> caste) have three categories each.
> I run the following specification (durat1-durat11 are time dummies):
>
>
> cloglog school_left childage i.child_female i.urban i.caste
>     b3.dad_edu b3.mom_edu wage_less_primary wage_compl~5 wage_compl~8
>     wage_beyon~9 distance_p~l distance_m~l distance_h~l month_percap_cons
>     durat1 durat2 durat3 durat4 durat5 durat6 durat7 durat8 durat9 durat10
>     durat11, nocons nolog
> Complementary log-log regression                Number of obs     =      47569
>                                                Zero outcomes     =      40967
>                                                Nonzero outcomes  =       6602
>                                                Wald chi2(28)     =   17650.53
> Log likelihood = -14452.672                     Prob > chi2       =     0.0000
> ------------------------------------------------------------------------------
>  school_left |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
> -------------+----------------------------------------------------------------
>    childage |  -.1256983   .0052831   -23.79   0.000    -.1360531   -.1153435
> 1.child_fe~e |   .0038576   .0257962     0.15   0.881     -.046702    .0544172
>     1.urban |   .0133847   .0309483     0.43   0.665    -.0472728    .0740421
>             |
>       caste |
>          2  |   1.822595      .1166    15.63   0.000     1.594063    2.051126
>          3  |   1.743429    .118398    14.73   0.000     1.511373    1.975484
>             |
>     dad_edu |
>          1  |   .5320739   .0372011    14.30   0.000     .4591611    .6049868
>          2  |   .2982183   .0428693     6.96   0.000      .214196    .3822406
>             |
>     mom_edu |
>          1  |   1.104712   .0750107    14.73   0.000     .9576939    1.251731
>          2  |   .7798729   .0818774     9.52   0.000     .6193962    .9403496
>             |
> wage_less_~y |  -.0207552   .0052278    -3.97   0.000    -.0310015   -.0105088
> wage_compl~5 |  -.0049688   .0038456    -1.29   0.196     -.012506    .0025684
> wage_compl~8 |   .0170525   .0036375     4.69   0.000     .0099231    .0241818
> wage_beyon~9 |    .011931   .0018516     6.44   0.000     .0083018    .0155601
> distance_p~l |  -.0294569   .0135674    -2.17   0.030    -.0560485   -.0028654
> distance_m~l |  -.0046905   .0093872    -0.50   0.617     -.023089    .0137081
> distance_h~l |   .0163331   .0033153     4.93   0.000     .0098353    .0228309
> month_perc~s |  -.0001359   .0000245    -5.55   0.000    -.0001838   -.0000879
>      durat1 |  -5.423936   .1248589   -43.44   0.000    -5.668655   -5.179218
>      durat2 |  -4.619776   .1035593   -44.61   0.000    -4.822748   -4.416803
>      durat3 |  -4.371627   .1000453   -43.70   0.000    -4.567713   -4.175542
>      durat4 |  -3.766473   .0925507   -40.70   0.000    -3.947869   -3.585077
>      durat5 |  -2.848097   .0862321   -33.03   0.000    -3.017108   -2.679085
>      durat6 |  -3.315598     .09182   -36.11   0.000    -3.495562   -3.135634
>      durat7 |  -2.654227   .0876545   -30.28   0.000    -2.826027   -2.482428
>      durat8 |  -2.156138   .0864863   -24.93   0.000    -2.325648   -1.986628
>      durat9 |   -1.53362   .0858737   -17.86   0.000     -1.70193   -1.365311
>     durat10 |   -.831892   .0873459    -9.52   0.000    -1.003087   -.6606973
>     durat11 |  -2.114238   .1222815   -17.29   0.000    -2.353905   -1.874571
> ------------------------------------------------------------------------------
>
> Since I want the predicted hazards, I type the following:
> predict h, p
> However, when I check this by calculating p by hand, using the formula
>
> p = 1 - exp(-exp(b'x + duration parameters))
>
> I get values very different from h, although they should be the same.
> I would appreciate any help in figuring out what I am missing here.
>
> Thanks in advance for any possible advice.
> Urmi Bhattacharya


