
Re: st: mistake in weibhet_glfa


From   Weihua Guan <wguan@stata.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: mistake in weibhet_glfa
Date   Thu, 22 Aug 2002 11:30:58 -0500

--Alfonso Miranda<alfonsomirand4@yahoo.com> wrote:                    

> While doing an extension of the code weibhet_glfa I
> found what I believe is a mistake in the
> log-likelihood expression of this code. 
[...]
> I have written down the log-likelihood obtaining  (in an intermediate
> expression that helps comparison) the expression:
> 
> Logl = ln{1+theta*exp[-xb*p]*t^(p)}^(-(1/theta + d))
>        + ln{exp[-xb*p]*p*t^(p-1)}                        (3)
> 
> Where x is the vector of observed characteristics and
> b is its corresponding vector of coefficients. In the
> weibhet_glfa code the log-likelihood is written as 
> 
> Logl = ln{1+theta*exp[-xb*p]*t^(p)}^(-(1/theta + d))
>        + ln{exp[-xb*p]*p*t^(p)}                          (4)
[...]

Alfonso noticed the difference between equations (3) and (4) and wondered
whether -weibhet_glfa- contains a mistake.

Short answer:

ML will get the same estimated parameters from either equation (3) or (4).  The
program uses equation (4) because, in this form, the log-likelihood value stays
constant when the scale of survival time changes.

Long answer:

Comparing the two equations, the only difference is a term ln(t).  ML finds
the estimates by setting the first derivatives with respect to the parameters
(such as b and p) to zero, and ln(t) does not appear in those derivatives
because it involves no parameters.  The two log-likelihood functions therefore
give the same estimation results; only the reported log-likelihood value
differs.
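
To spell out that step, here is a short derivation sketch.  The notation is
introduced only for illustration: l3_i and l4_i (written \ell_i^{(3)} and
\ell_i^{(4)}) are the contributions of observation i under equations (3) and
(4) as quoted, and psi stands for any one of the parameters b, p, or theta.
The first term is identical in both equations, so it cancels in the
difference.

```latex
% Per-observation difference between the quoted forms (4) and (3)
\begin{align*}
\ell_i^{(4)} - \ell_i^{(3)}
  &= \ln\{e^{-x_i b\,p}\, p\, t_i^{\,p}\}
   - \ln\{e^{-x_i b\,p}\, p\, t_i^{\,p-1}\} \\
  &= p \ln t_i - (p-1) \ln t_i \;=\; \ln t_i , \\[6pt]
% ln(t_i) is data only, so it contributes nothing to the score
\frac{\partial \ell_i^{(4)}}{\partial \psi}
  &= \frac{\partial \ell_i^{(3)}}{\partial \psi}
   + \frac{\partial \ln t_i}{\partial \psi}
   \;=\; \frac{\partial \ell_i^{(3)}}{\partial \psi},
  \qquad \psi \in \{b,\, p,\, \theta\}.
\end{align*}
```

Both forms thus share the same score equations and the same maximizer.  The
quoted equations do not show the failure indicator on the second term; if it
applies only to failures, as the adjustment at the end of this post assumes,
the difference is d_i*ln(t_i) rather than ln(t_i), and the argument is
unchanged.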

In fact, the program is written this way on purpose, so that the value of the
log-likelihood is invariant to the scale of survival time.  Here is a simple
example with a Weibull model:

. use http://www.stata-press.com/data/r7/cancer, clear
(Patient Survival in Drug Trial)

. stset studytime, fail(died)
(output omitted)

. streg drug age, dist(weibull) nolog

         failure _d:  died
   analysis time _t:  studytime


Weibull regression -- log relative-hazard form 

No. of subjects =           48                     Number of obs   =        48
No. of failures =           31
Time at risk    =          744
                                                   LR chi2(2)      =     35.92
Log likelihood  =   -42.662838                     Prob > chi2     =    0.0000
[...]

Now we divide the time variable "_t" by 100 and re-estimate the model:

. replace _t = _t/100
_t was byte now float
(48 real changes made)

. streg drug age, dist(weibull) nolog

         failure _d:  died
   analysis time _t:  studytime


Weibull regression -- log relative-hazard form 

No. of subjects =           48                     Number of obs   =        48
No. of failures =           31
Time at risk    =  7.439999977
                                                   LR chi2(2)      =     35.92
Log likelihood  =   -42.662838                     Prob > chi2     =    0.0000
[...]

The value of the log-likelihood does not change when the scale of _t changes.
The same trick is used in some other models fit by -streg-.  To get back the
original value of the log-likelihood, we simply subtract the sum of ln(t) over
the failures from the reported log-likelihood.

. gen double lnt = _d*ln(_t)

. summarize lnt if e(sample)

. di e(ll) - r(sum)
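
Why adding the ln(t) term makes the reported value scale-invariant can be
sketched as follows.  This is an illustrative derivation under stated
assumptions, not the program's documentation: c > 0 is a rescaling constant,
t*_i = t_i/c is the rescaled time, d_i is the failure indicator, N_d is the
number of failures, f and S are the density and survivor function, there is
right-censoring only (no delayed entry), and the model family is closed under
rescaling of time (for the Weibull, the change of scale is absorbed by the
constant term).  ln L denotes the maximized density-based log likelihood,
that is, the form without the added term.

```latex
\begin{align*}
% density and survivor function of the rescaled time T^* = T / c
f^*(t^*) &= c\, f(c\, t^*), \qquad S^*(t^*) = S(c\, t^*), \\[4pt]
% each failure picks up a Jacobian term ln(c); censored obs. do not
\ln L^* &= \sum_i \bigl\{ d_i \ln f^*(t_i^*) + (1 - d_i) \ln S^*(t_i^*) \bigr\}
         \;=\; \ln L + N_d \ln c , \\[4pt]
% the added term moves by exactly the opposite amount
\sum_i d_i \ln t_i^* &= \sum_i d_i \ln t_i \;-\; N_d \ln c , \\[4pt]
% so the sum reported by the program is unchanged
\ln L^* + \sum_i d_i \ln t_i^* &= \ln L + \sum_i d_i \ln t_i .
\end{align*}
```

In the example above there are 31 failures and c = 100, so the density-based
value would shift by 31*ln(100), approximately 142.76, between the two fits,
while the reported value stays at -42.662838.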



Weihua Guan <wguan@stata.com>
Stata Corp.
