Note: This FAQ is for users of Stata 5, an older version of Stata. It is not relevant for more recent versions.

Stata 5: What is the pseudo R2 in the weibull output?

Title   Stata 5: Pseudo R2 in weibull
Author William Sribney, StataCorp
Date March 1997

The weibull command differs from some other ML commands in that its first iteration does not correspond to the constant-only model. For weibull (unlike ereg), there is no simple, quick way to compute the constant-only model; it has to be fitted iteratively.

In the following sample output, iterations 0–3 compute the constant-only model:

 Iteration 0: Log Likelihood = -61.34299 Log(sigma)=         0
 Iteration 1: Log Likelihood = -60.65666 Log(sigma)= -.2060869
 Iteration 2: Log Likelihood = -60.62434 Log(sigma)= -.1859658
 Iteration 3: Log Likelihood = -60.62402 Log(sigma)= -.1882137

The remaining iterations compute the full model, starting from the constant-only model. Note that the log likelihood at iteration 4 is the same as that at iteration 3:

 Iteration 4: Log Likelihood = -60.62402 Log(sigma)= -.1882429
 Iteration 5: Log Likelihood = -50.56099 Log(sigma)= -.2763068
 Iteration 6: Log Likelihood = -47.38802 Log(sigma)= -.0988737
 Iteration 7: Log Likelihood = -44.35852 Log(sigma)= -.6532183
 Iteration 8: Log Likelihood =  -42.8621 Log(sigma)= -.5385912
 Iteration 9: Log Likelihood = -42.66388 Log(sigma)= -.5630554
 Iteration 10: Log Likelihood = -42.66284 Log(sigma)= -.5639635
 
 Weibull regression (log relative hazard form) Number of obs    =      48
 Sigma                       =     0.569       Model chi2(2)    =  35.922
 Std Err(Sigma)              =     0.074       Prob > chi2      =  0.0000
 Log Likelihood              =   -42.663       Pseudo R2        =  0.3132

Thus the model chi2 is 2*(-42.66284 - (-60.62402)) = 35.92236, which is reported as 35.922 in the output.
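The arithmetic above can be checked with a short Python sketch. The variable names are ours, not Stata's, and the log likelihoods are copied from iterations 3 and 10 of the log. The p-value line relies on the fact that the upper tail of a chi-squared distribution with 2 degrees of freedom is exp(-x/2); it is included only to show why the output prints Prob > chi2 as 0.0000.

```python
import math

# Log likelihoods copied from the iteration log above
ll_constant_only = -60.62402   # iteration 3: constant-only model
ll_full = -42.66284            # iteration 10: full model

# Likelihood-ratio statistic comparing the full and constant-only models
model_chi2 = 2 * (ll_full - ll_constant_only)

# Upper-tail probability for chi2 with 2 degrees of freedom: exp(-x/2)
p_value = math.exp(-model_chi2 / 2)

print(f"model chi2(2) = {model_chi2:.3f}")
print(f"Prob > chi2   = {p_value:.4f}")
```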

Concerning the pseudo-R2, for ML models with discrete outcomes, we use the formula

        pseudo-R2 = 1 - L1/L0

where L0 and L1 are the constant-only and full model log likelihoods, respectively. For discrete outcomes, the log likelihood is the log of a probability, so it is never positive. For continuous outcomes, the log likelihood is the log of a density. Since density functions can be greater than 1 (cf. a normal density with a small standard deviation, evaluated at its mean), the log likelihood can be positive or negative. Thus the formula 1 - L1/L0 could give a value greater than 1!
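A small Python sketch makes the problem concrete. The function name and all log-likelihood values below are hypothetical, chosen only to illustrate the sign behavior; they do not come from any fitted model.

```python
def lr_pseudo_r2(ll0, ll1):
    """Pseudo R2 based on the ratio of log likelihoods: 1 - L1/L0."""
    return 1 - ll1 / ll0

# Hypothetical discrete-outcome model: both log likelihoods are negative,
# and the full model fits at least as well, so the result lies in [0, 1].
discrete = lr_pseudo_r2(ll0=-100.0, ll1=-80.0)

# Hypothetical continuous-outcome model whose log likelihood changes sign:
# the same formula now gives a value greater than 1.
sign_change = lr_pseudo_r2(ll0=-5.0, ll1=10.0)

# Hypothetical continuous-outcome model with both log likelihoods positive:
# the formula gives a negative value.
both_positive = lr_pseudo_r2(ll0=5.0, ll1=10.0)

print(discrete, sign_change, both_positive)
```

The second and third cases show that once the log likelihood can be positive, the ratio formula is no longer confined to the interval from 0 to 1.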

Thus there are no standard formulas for pseudo-R2s for continuous ML models. So as a rough guide for model fitting, we have devised ad hoc formulas for some commands like weibull that take advantage of the form of the likelihood. For weibull, the formula is

        pseudo R2 = 1 - sigma/sigma0

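where sigma is the estimate from the full model and sigma0 is the estimate from the constant-only model. The reasoning behind this choice of a pseudo R2 is as follows. The Weibull model can be written as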

        y = a + x*beta + sigma*e

where y = log(time) and e is an error term. Hence, the more terms one includes in the model, the smaller the estimate of sigma, so sigma <= sigma0. This implies 0 <= pseudo R2 <= 1.
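As a check, the formula 1 - sigma/sigma0 can be applied to the sample output shown earlier. This Python sketch assumes that sigma0 is the exponential of Log(sigma) at the last constant-only iteration (iteration 3) and that sigma is the exponential of Log(sigma) at the final iteration (iteration 10); the variable names are ours.

```python
import math

# Log(sigma) values copied from the iteration log above
log_sigma_constant_only = -0.1882137   # iteration 3: constant-only model
log_sigma_full = -0.5639635            # iteration 10: full model

sigma0 = math.exp(log_sigma_constant_only)
sigma = math.exp(log_sigma_full)

# Ad hoc pseudo R2 for weibull
pseudo_r2 = 1 - sigma / sigma0

print(f"sigma0    = {sigma0:.3f}")
print(f"sigma     = {sigma:.3f}")
print(f"pseudo R2 = {pseudo_r2:.4f}")
```

Under these assumptions, the computed sigma and pseudo R2 agree with the reported values of 0.569 and 0.3132 to the printed precision.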

We do not intend that our pseudo R2 should be reported in formal write-ups of results. The idea of a pseudo R2 came from economists who wanted some rough measure of explanatory power of the model. So it’s really just a guide for fitting models. A small pseudo R2 should make one humble about the model's explanatory ability, but a big pseudo R2 should not be taken as something necessarily wonderful.

Regarding the log likelihoods from the examples in the manual, they are indeed positive. The maximum of the likelihood is at a point where the density of the joint distribution is greater than 1.
