From: Urmi Bhattacharya <ub3@indiana.edu>
To: statalist@hsphsun2.harvard.edu
Subject: st: Post-estimation predicted probability and predicted probability calculated by hand do not match after running cloglog
Date: Fri, 27 May 2011 15:12:13 -0400

Dear Statalisters,

I am estimating a discrete-time proportional hazard model using cloglog, with a fully nonparametric specification for the baseline hazard. My variables include continuous as well as categorical ones. Three of the categorical variables (dad_edu, mom_edu and caste) have three categories each.

I run the following specification (durat1-durat11 are time dummies):

cloglog school_left childage i.child_female i.urban i.caste b3.dad_edu b3.mom_edu ///
    wage_less_primary wage_compl~5 wage_compl~8 wage_beyon~9 ///
    distance_p~l distance_m~l distance_h~l month_percap_cons ///
    durat1 durat2 durat3 durat4 durat5 durat6 durat7 durat8 durat9 durat10 durat11, ///
    nocons nolog

Complementary log-log regression                Number of obs     =      47569
                                                Zero outcomes     =      40967
                                                Nonzero outcomes  =       6602

                                                Wald chi2(28)     =   17650.53
Log likelihood = -14452.672                     Prob > chi2       =     0.0000

------------------------------------------------------------------------------
 school_left |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    childage |  -.1256983   .0052831   -23.79   0.000    -.1360531   -.1153435
1.child_fe~e |   .0038576   .0257962     0.15   0.881     -.046702    .0544172
     1.urban |   .0133847   .0309483     0.43   0.665    -.0472728    .0740421
             |
       caste |
          2  |   1.822595      .1166    15.63   0.000     1.594063    2.051126
          3  |   1.743429    .118398    14.73   0.000     1.511373    1.975484
             |
     dad_edu |
          1  |   .5320739   .0372011    14.30   0.000     .4591611    .6049868
          2  |   .2982183   .0428693     6.96   0.000      .214196    .3822406
             |
     mom_edu |
          1  |   1.104712   .0750107    14.73   0.000     .9576939    1.251731
          2  |   .7798729   .0818774     9.52   0.000     .6193962    .9403496
             |
wage_less_~y |  -.0207552   .0052278    -3.97   0.000    -.0310015   -.0105088
wage_compl~5 |  -.0049688   .0038456    -1.29   0.196     -.012506    .0025684
wage_compl~8 |   .0170525   .0036375     4.69   0.000     .0099231    .0241818
wage_beyon~9 |    .011931   .0018516     6.44   0.000     .0083018    .0155601
distance_p~l |  -.0294569   .0135674    -2.17   0.030    -.0560485   -.0028654
distance_m~l |  -.0046905   .0093872    -0.50   0.617     -.023089    .0137081
distance_h~l |   .0163331   .0033153     4.93   0.000     .0098353    .0228309
month_perc~s |  -.0001359   .0000245    -5.55   0.000    -.0001838   -.0000879
      durat1 |  -5.423936   .1248589   -43.44   0.000    -5.668655   -5.179218
      durat2 |  -4.619776   .1035593   -44.61   0.000    -4.822748   -4.416803
      durat3 |  -4.371627   .1000453   -43.70   0.000    -4.567713   -4.175542
      durat4 |  -3.766473   .0925507   -40.70   0.000    -3.947869   -3.585077
      durat5 |  -2.848097   .0862321   -33.03   0.000    -3.017108   -2.679085
      durat6 |  -3.315598     .09182   -36.11   0.000    -3.495562   -3.135634
      durat7 |  -2.654227   .0876545   -30.28   0.000    -2.826027   -2.482428
      durat8 |  -2.156138   .0864863   -24.93   0.000    -2.325648   -1.986628
      durat9 |   -1.53362   .0858737   -17.86   0.000     -1.70193   -1.365311
     durat10 |   -.831892   .0873459    -9.52   0.000    -1.003087   -.6606973
     durat11 |  -2.114238   .1222815   -17.29   0.000    -2.353905   -1.874571
------------------------------------------------------------------------------

Since I want the predicted hazards, I type the following:

predict h, p

However, when I check this by calculating p by hand, using the formula

p = 1 - exp(-exp(b'x + duration parameters))

I get values that are very different from h, although the two should be the same.

I would appreciate any help in figuring out what I am missing here.

Thanks in advance for any possible advice.

Urmi Bhattacharya
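
A minimal sketch of the check described above, assuming the cloglog results are still the active estimates and the lines are run from a do-file. The variable names xb_hat, h_hat and h_hand are hypothetical. The idea is to let Stata build the linear index itself, so that the factor-variable base levels and the single relevant duration dummy enter exactly as they did in estimation, and then apply the formula to that index.

```stata
* Sketch only: run right after the cloglog command, from a do-file.
* xb_hat, h_hat and h_hand are hypothetical names, not from the original post.

predict double xb_hat, xb      // linear index: b'x plus the duration term
predict double h_hat, pr       // Stata's own predicted probability (hazard)

* Apply the complementary log-log inverse link by hand.
* invcloglog(xb_hat) is the built-in equivalent of this expression.
generate double h_hand = 1 - exp(-exp(xb_hat))

* The two columns should agree up to rounding on the estimation sample.
summarize h_hat h_hand if e(sample)
assert reldif(h_hat, h_hand) < 1e-6 if e(sample)
```

If this version agrees with h but a sum typed out coefficient by coefficient does not, the gap probably lies in how the index was assembled by hand. Possible places to look, offered only as guesses: the omitted base levels (caste 1, dad_edu 3 and mom_edu 3 contribute zero), adding more than one duration coefficient per observation, or using the rounded coefficients from the table instead of the stored ones in _b[].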

