
From: "Joseph Coveney" <[email protected]>
To: "Statalist" <[email protected]>
Subject: st: Has -tabulate , lrchi2- changed in Release 10?
Date: Mon, 21 Jan 2008 23:27:01 +0900

The do-file below creates a cross-tabulation with an empty cell (zero-count cell), and yet -tabulate , lrchi2- returns a result. I don't remember Stata doing that in earlier releases. Has this changed in Stata 10, or is it just forgetfulness as usual? The documentation still says, "lrchi2 displays the likelihood-ratio chi-squared statistic. The request is ignored if any cell of the table contains no observations." and the formula given in the manual still has the chi-square test statistic calculated as 2 * sum_i sum_j n_ij * ln(n_ij / m_ij).

On the same topic, I've recently encountered a dataset where -tabulate- has a couple of zero-count cells and the likelihood-ratio chi-square yields a p-value (0.304) that gives a substantially different picture from those of the Pearson chi-square test (0.044) and the Fisher test (0.068). The do-file below includes a contrived example where the opposite is the case (P = 0.02 for the likelihood-ratio test, and P > 0.05 for the other two).

What should we make of the likelihood-ratio test here? Analogously, what should we make of -logit- compared to -logit , asis-, where the likelihood-ratio tests differ because the null-model log-likelihoods differ (reflecting dropped perfect predictors)?

I wasn't able to attend the London users' group meeting in 2006; did the discussion following Ian White's presentation ( http://repec.org/usug2006/White.ppt ) conclude with any guidance? Was there any discussion (pro or con) about Firth's method (Slide 19 of Ian's presentation) in cases of zero-count cells?

Sometimes I can follow the advice given in Chapter 4 of D. W. Hosmer & S. Lemeshow, _Applied Logistic Regression_, Second Edition (New York: John Wiley & Sons, 2000). But in other cases, collapsing or omitting categories, or fudging an ordering to the categories, isn't compatible with the objective of the analysis.
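For reference, the likelihood-ratio statistic discussed above can be written out as below. This is the usual textbook form, and it assumes that m_ij denotes the expected count under independence; the limit on the last line is the standard argument for letting an empty cell contribute zero to the sum. Whether -tabulate- applies that convention is not something this sketch confirms.

```latex
G^2 = 2 \sum_{i} \sum_{j} n_{ij} \ln\!\left(\frac{n_{ij}}{m_{ij}}\right),
\qquad
m_{ij} = \frac{n_{i\cdot}\, n_{\cdot j}}{n_{\cdot\cdot}},
\qquad
\lim_{n \to 0^{+}} n \ln\!\left(\frac{n}{m}\right) = 0 \quad (m > 0).
```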
In a recent case, because of scientific interest in the interaction of two categorical predictors, I would have liked to use -exlogistic- to handle zero-count cells, but ran out of memory. Should I just continue to raise memory allocation to Stata and -exlogistic-? I'm not sure how much trouble disc swapping might become. I suppose that I could use resampling, as well: does anyone have any recommendations for choosing between a likelihood-ratio test or a Wald test for the chi-square returned by the program called by -permute-?

Joseph Coveney

clear *
set seed `=date("2008-01-21", "YMD")'
set obs 10
generate byte A = mod(_n, 5)
generate byte B = mod(_n, 2)
generate byte count = floor(uniform() * 100)
replace count = 0 in 1
tabulate A B [fweight = count], lrchi2
drop in 1
tabulate A B [fweight = count], lrchi2
quietly xi: logit B i.A [fweight = count], asis
display in smcl as text " likelihood-ratio chi2(" ///
    as result e(df_m) as text ") =" ///
    as result %9.4f e(chi2) as text " Pr = " ///
    as result %05.3f chi2tail(e(df_m), e(chi2))
quietly replace count = floor(count / 1.5)
tabulate A B [fweight = count], ///
    lrchi2 chi2 exact nolog // N > 200; which one?
exit

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
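One way to see what -tabulate , lrchi2- might be doing with an empty cell is to compute the statistic by hand and compare. The sketch below is only an illustration: the 3 x 2 table and the names a, b, n, rowtot, coltot, N, m, term, G2 and df are made up, and treating an empty cell as contributing zero is an assumption about the convention, not something the manual states.

```stata
* Hypothetical 3 x 2 table with one zero-count cell (a = 1, b = 1)
clear
input a b n
1 0 10
1 1  0
2 0  7
2 1  6
3 0  4
3 1  9
end

* Expected counts under independence: m_ij = (row total * column total) / N
egen double rowtot = total(n), by(a)
egen double coltot = total(n), by(b)
egen double N = total(n)
generate double m = rowtot * coltot / N

* Assumption: an empty cell contributes 0 (the limit of n*ln(n/m) as n -> 0)
generate double term = cond(n > 0, n * ln(n / m), 0)
quietly summarize term, meanonly
scalar G2 = 2 * r(sum)
scalar df = (3 - 1) * (2 - 1)
display as text "hand-computed LR chi2(" as result df as text ") = " ///
    as result %9.4f G2 as text "   Pr = " as result %5.3f chi2tail(df, G2)

* Compare with what -tabulate- reports for the same table
tabulate a b [fweight = n], lrchi2
```

If the two figures agree, that would suggest the empty cell is simply being skipped in the sum; if they differ, or if -tabulate- reports nothing, the documented behaviour still holds.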
