Univariate log-likelihood tests for model identification       (STB-9: sqv5)
--------------------------------------------------------

	^unilogit^ depvar predictors, ^df(^x^)^


Description
-----------

Comparing the log-likelihood of a logistic regression model containing only
the intercept with that of a model having a single predictor provides prima
facie evidence of whether the predictor in fact contributes to the model.
The comparison statistic is provided by the likelihood ratio test.

At the early model-building stage, a p-value of .25 or under can be
considered adequate for inclusion of the variable as a main-effect predictor.
This does not rule out that transforming the variable or collapsing its
levels may later enhance its contribution to a full model.  What we are
really looking for at this stage are variables with high p-values.  If we
find one at, say, .80, we can probably exclude that variable from subsequent
analyses.  One caveat: if the variable may serve as a factor in a
significant interaction, we may need to retain it regardless of its p-value.

^unilogit^ calculates, for each variable listed after the response variable,
its coefficient, standard error, log-likelihood, chi-square (the likelihood
ratio statistic), and significance.  The intercept-only model log-likelihood
is also provided for comparison.

The intercept-only log-likelihood can be calculated directly from the
distribution of the response variable.  Let n0 and n1 represent the number
of observations having 0's and 1's, respectively, and let N be the total
number of nonmissing observations.  The log-likelihood can be determined by:

	^LL = n0 ln(n0) + n1 ln(n1) - N ln(N)

After the ^logit^ command, one can also obtain the intercept-only
log-likelihood directly from Stata by:

	^LL0 = -(_result(6)+(-2*_result(2)))/2

where _result(6) is the chi2 and _result(2) is the log-likelihood of the
model with predictor(s).

The likelihood ratio test evaluates the hypothesis that the slope coefficient
is zero.
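
The two routes to the intercept-only log-likelihood described above can be
checked by hand.  The following is a minimal sketch in Python, not part of
^unilogit^; the function names are hypothetical.  It assumes that 130
observations have low = 0 and 59 have low = 1 in the lbw data used in the
example (these counts are not stated in this help file).  The chi2 and
log-likelihood values are those of the age row of the example output.

```python
import math


def intercept_only_ll(n0, n1):
    """Intercept-only log-likelihood from the counts of 0's and 1's."""
    n = n0 + n1

    def xlnx(k):
        # k * ln(k), taking the value 0 when k is 0
        return k * math.log(k) if k > 0 else 0.0

    # LL = n0 ln(n0) + n1 ln(n1) - N ln(N)
    return xlnx(n0) + xlnx(n1) - xlnx(n)


def ll0_from_fit(chi2, ll1):
    """Intercept-only log-likelihood recovered from a fitted model.

    chi2 plays the role of _result(6) and ll1 the role of _result(2).
    """
    return -(chi2 + (-2.0 * ll1)) / 2.0


# Assumed counts for the response variable low: 130 zeros, 59 ones.
ll0_counts = intercept_only_ll(130, 59)

# chi2 and log-likelihood taken from the age row of the example output.
ll0_age = ll0_from_fit(2.7600, -115.9560)

print(round(ll0_counts, 4), round(ll0_age, 4))
```

Under these assumptions both values should agree, to four decimal places,
with the Intercept LL reported in the example output.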
Given LL1 as the log-likelihood of the model with the predictor, and LL0 as
the intercept-only log-likelihood, the likelihood ratio statistic is
determined by:

	^chi2 = 2(LL1 - LL0)


Example:

	^. use lbw
	^. d

Contains data from lbw.dta
  Obs:   189 (max= 166927)
 Vars:    12 (max=     99)
Width:    14 (max=    200)
  1. id        int    %8.0g             identification code
  2. low       byte   %8.0g             birth weight<2500g
  3. age       byte   %8.0g             age of mother
  4. lwt       int    %8.0g             weight at last menstrual period
  5. smoke     byte   %8.0g             smoked during pregnancy
  6. ptl       byte   %8.0g             premature labor history (count)
  7. ht        byte   %8.0g             has history of hypertension
  8. ui        byte   %8.0g             presence, uterine irritability
  9. race1     byte   %8.0g             race==white
 10. race2     byte   %8.0g             race==black
 11. race3     byte   %8.0g             race==other
 12. ftv       byte   %8.0g             1st trimester M.D. visits
Sorted by:

	^. unilogit low age lwt smoke ptl ht ui ftv

Univariate Logistic Regression Models
1 Degrees of Freedom

Intercept LL = -117.3360

Variable        Coeff     St Error           LL         Chi2         Prob
__________________________________________________________________________
age           -0.0512       0.0315    -115.9560       2.7600       0.0966
lwt           -0.0141       0.0062    -114.3453       5.9813       0.0145
smoke          0.7041       0.3196    -114.9023       4.8674       0.0274
ptl            0.8018       0.3172    -113.9463       6.7794       0.0092
ht             1.2135       0.6083    -115.3249       4.0221       0.0449
ui             0.9469       0.4168    -114.7979       5.0761       0.0243
ftv           -0.1351       0.1567    -116.9494       0.7731       0.3792


References:
----------

Hosmer, D. W. and S. Lemeshow. 1989. ^Applied Logistic Regression^.
	New York: John Wiley & Sons.


Author:
------

Joseph Hilbe, Editor, STB, Fax 602-860-1446
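

Worked check of the example output:

The Chi2 and Prob columns of the example output can be reproduced from the
two log-likelihoods alone.  The following is a minimal sketch in Python,
not the code ^unilogit^ itself runs; the function name is hypothetical.  It
assumes 1 degree of freedom, as in the example, so that the upper-tail
chi-square probability can be written with the complementary error function.

```python
import math


def lr_test_1df(ll1, ll0):
    """Likelihood ratio chi2 and its p-value on 1 degree of freedom."""
    # chi2 = 2(LL1 - LL0); clipped at 0 to guard against rounding
    chi2 = max(2.0 * (ll1 - ll0), 0.0)
    # For 1 df, P(X > chi2) = erfc(sqrt(chi2 / 2))
    p = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p


# LL for age and the Intercept LL, both taken from the example output.
chi2, p = lr_test_1df(-115.9560, -117.3360)

print(round(chi2, 4), round(p, 4))
```

The result should match the age row of the table to within rounding in the
last printed digit.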