Note: This FAQ is for users of Stata 5, an older version of Stata.
It is not relevant for more recent versions.
Stata 5: What is the pseudo R2 in the weibull output?
Title   Stata 5: Pseudo R2 in weibull
Author  William Sribney, StataCorp
Date    March 1997
The weibull command differs from some other ML commands in that
its first iteration does not correspond to the constant-only model. For
weibull (unlike ereg), there is no simple, quick way to compute
the constant-only model; it has to be fit iteratively.
In the following sample output, iterations 0–3 compute the
constant-only model:
Iteration 0: Log Likelihood = -61.34299 Log(sigma)= 0
Iteration 1: Log Likelihood = -60.65666 Log(sigma)= -.2060869
Iteration 2: Log Likelihood = -60.62434 Log(sigma)= -.1859658
Iteration 3: Log Likelihood = -60.62402 Log(sigma)= -.1882137
The remaining iterations compute the full model, starting from the
constant-only estimates. Note that the log likelihood at iteration 4 is the
same as at iteration 3:
Iteration 4: Log Likelihood = -60.62402 Log(sigma)= -.1882429
Iteration 5: Log Likelihood = -50.56099 Log(sigma)= -.2763068
Iteration 6: Log Likelihood = -47.38802 Log(sigma)= -.0988737
Iteration 7: Log Likelihood = -44.35852 Log(sigma)= -.6532183
Iteration 8: Log Likelihood = -42.8621 Log(sigma)= -.5385912
Iteration 9: Log Likelihood = -42.66388 Log(sigma)= -.5630554
Iteration 10: Log Likelihood = -42.66284 Log(sigma)= -.5639635
Weibull regression (log relative hazard form)    Number of obs =     48
Sigma          =   0.569                         Model chi2(2) = 35.922
Std Err(Sigma) =   0.074                         Prob > chi2   = 0.0000
Log Likelihood = -42.663                         Pseudo R2     = 0.3132
Thus the model chi2 is twice the difference between the full-model and
constant-only log likelihoods: 2*(-42.66284 - (-60.62402)) = 35.92236.
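The arithmetic can be checked with a short script. This is only a sketch in Python (not Stata code); the variable names are illustrative, and the two log likelihoods are copied from iterations 3 and 10 of the log above. The p-value line relies on the fact that, for 2 degrees of freedom, the upper tail of the chi-squared distribution is exp(-x/2).

```python
import math

# Log likelihoods copied from the iteration log above.
ll_constant_only = -60.62402  # iteration 3: constant-only model
ll_full = -42.66284           # iteration 10: full model

# Likelihood-ratio statistic: twice the gain in log likelihood.
model_chi2 = 2 * (ll_full - ll_constant_only)

# With 2 degrees of freedom, the chi-squared upper-tail probability
# has the closed form exp(-x / 2), so no statistics library is needed.
p_value = math.exp(-model_chi2 / 2)

print(f"Model chi2(2) = {model_chi2:.3f}")
print(f"Prob > chi2   = {p_value:.4f}")
```

The printed values agree with the Model chi2(2) and Prob > chi2 entries in the output header.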
Concerning the pseudo-R2, for ML models with discrete
outcomes, we use the formula
pseudo-R2 = 1 - L1/L0
where L0 and L1 are the constant-only and full-model log likelihoods,
respectively. For discrete outcomes, the log likelihood is the log of a
probability, so it is never positive. For continuous outcomes, the log
likelihood is the log of a density. Since density functions can be greater
than 1 (cf. the normal density at 0), the log likelihood can be positive or
negative. Thus the formula 1 - L1/L0 could give a value greater than 1
(when L0 < 0 < L1) or a value below 0 (when both L0 and L1 are positive)!
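To see how this can happen, here is a small Python sketch using a normal model rather than weibull. It uses the standard result that the maximized log likelihood of n normal observations with ML variance estimate s2 is -(n/2)*(log(2*pi*s2) + 1). The sample size and the two variance estimates are hypothetical values chosen only for illustration.

```python
import math

def normal_max_loglik(n, s2):
    """Maximized log likelihood of n normal observations whose
    ML estimate of the error variance is s2."""
    return -0.5 * n * (math.log(2 * math.pi * s2) + 1)

# Hypothetical values, chosen only to illustrate the sign change.
n = 10
s2_constant_only = 0.1   # error variance, constant-only model
s2_full = 0.01           # error variance, full model

L0 = normal_max_loglik(n, s2_constant_only)  # negative
L1 = normal_max_loglik(n, s2_full)           # positive

pseudo_r2 = 1 - L1 / L0
print(f"L0 = {L0:.3f}, L1 = {L1:.3f}, 1 - L1/L0 = {pseudo_r2:.3f}")
```

Here the full-model density is concentrated enough that L1 is positive while L0 is negative, so 1 - L1/L0 lands well above 1.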
For this reason, there are no standard formulas for pseudo-R2s for
continuous ML models. As a rough guide for model fitting, we have
devised ad hoc formulas for some commands, such as weibull, that take
advantage of the form of the likelihood. For weibull, the formula is
pseudo R2 = 1 - sigma/sigma0
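where sigma and sigma0 are the estimates of sigma from the full and
constant-only models, respectively. The reasoning behind this choice of a
pseudo R2 is as follows. The Weibull model can be written as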
y = a + x*beta + sigma*e
where y = log(time) and e is an error term. Hence, the more terms one
includes in the model, the smaller the estimate of sigma. Since the
full-model sigma is positive and no larger than the constant-only sigma,
this also implies 0 <= pseudo R2 < 1.
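The Pseudo R2 in the sample output can be reproduced from the iteration log. The Python sketch below assumes that the constant-only sigma is the exponential of Log(sigma) at iteration 3 and that the full-model sigma is the exponential of Log(sigma) at iteration 10; the FAQ does not state this explicitly, but the result matches the reported value.

```python
import math

# Log(sigma) values copied from the iteration log above.
log_sigma_constant_only = -0.1882137  # iteration 3 (assumed to give sigma0)
log_sigma_full = -0.5639635           # iteration 10 (final full model)

sigma0 = math.exp(log_sigma_constant_only)
sigma = math.exp(log_sigma_full)

# The ad hoc weibull formula: 1 - sigma/sigma0.
pseudo_r2 = 1 - sigma / sigma0

print(f"Sigma     = {sigma:.3f}")
print(f"Pseudo R2 = {pseudo_r2:.4f}")
```

Both printed values agree with the Sigma and Pseudo R2 entries in the output header.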
We do not intend that our pseudo R2 should be reported in
formal write-ups of results. The idea of a pseudo R2 came
from economists who wanted some rough measure of explanatory power of the
model. So it’s really just a guide for fitting models. A small pseudo
R2 should make one humble about the model's explanatory
ability, but a big pseudo R2 should not be taken as
something necessarily wonderful.
Regarding the log likelihoods from the examples in the manual, they are
indeed positive. The likelihood is maximized at a point where the
density of the joint distribution is > 1, so its log is positive.