Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


If Richard's suggestion of using option -difficult- doesn't solve your problem, then consider the following.

1. Confirm that Stata's -xtlogit- and SAS's PROC LOGISTIC are seeing the same dataset. One way to verify that the data each sees are the same is to compare the log-likelihoods. Because you cannot get convergence with Stata when the predictors are included, fit a model with no predictors (-clogit Y, group(ID)-). Then compare twice the negative log-likelihood value from Stata with the null-model value shown by SAS (you mention that it's 3930972). Are they identical? If not, then there's probably a data-management error causing a difference between the two datasets. With the size of your dataset, this check might not be particularly sensitive to occasional differences, but it will detect systematic differences. (It might not even be very specific; the log-likelihoods might differ despite an identical dataset, which would be helpful to know, too: see the footnote to 2. below.)

You've already compared a subset of your dataset, and the results match, but there could be some systematic difference in the longer panels.

Also, verify that the number of singletons (and other cases with a constant response) being thrown out is the same in both packages. Stata gives you a message before the iteration begins ("note: XXX groups (YYY obs) dropped because of all positive or all negative outcomes."). SAS reports "Number of Uninformative Strata" and "Frequency Uninformative". Both numbers should agree between packages.

2. If Stata's Marquardt algorithm isn't as aggressive as SAS's (or Stata's singularity threshold is too sensitive), then you can try to side-step the Hessian altogether. Try -clogit . . . , . . . technique(bhhh)-. (See http://www.stata.com/statalist/archive/2010-03/msg01192.html for a thread by someone with the same problem as you report.*)

3. If that fails, then you can go the route that SAS used to use for conditional/fixed-effects logistic regression prior to the STRATA statement, namely, Cox regression.

    generate byte time = 2 - Y
    stset time, failure(Y == 1)
    stcox DUM CONT1 CONT2, strata(ID) exactp nohr

4. If everything fails, then you might need to use SAS's answer, as Klaus suggests. In light of the warnings from Stata, you might want to check on a couple of things in SAS's model-fit before relying extensively on it.

a. You mention two lines in your SAS output: "I obtain: Newton-Raphson Ridge Optimization Without Parameter Scaling". You didn't mention the very next line in the output, the one just after that last line. Does it say, "Convergence criterion (GCONV=1E-8) satisfied."? The same claim should be repeated in the SAS .LOG file.

b. Is everything else agreeable in the SAS .LOG file?

c. You mentioned that the omnibus tests are all P < 0.0001. What do the regression coefficients and their covariance matrix look like? Are they sensible?

d. You probably didn't ask for an iteration trace in the SAS run, but it would be good to see how things look at convergence. I haven't tried the following for a run that blows up, but I believe that you can get an idea of SAS's gradient and Hessian at convergence by feeding its regression coefficients to Stata and then not iterating at all. If it works, then it avoids re-running the model-fit in SAS. Try the steps below.

Type SAS's logit (untransformed) regression coefficients at full displayed precision into a Stata matrix.

    matrix input Beta = (<DUM's coefficient> <CONT1's coefficient> ///
        <CONT2's coefficient>)

Then,

    clogit Y DUM CONT1 CONT2, group(ID) from(Beta, copy) ///
        iterate(0) gradient hessian

Are you satisfied that the gradient's length is reasonably close to zero, that is, that SAS's GCONV was tight enough? Which predictor is Stata complaining about in the Hessian? (Probably DUM, from your description of the dataset.) Look back at 4.c. above, again, asking how sensible that predictor's coefficient and standard error are.

Joseph Coveney

*That the same problem arose twice in independent situations, combined with your observation that another software package has no trouble, raises the distinct possibility that there's a bug in Stata's -clogit-. If so, then it's a rare bug that's difficult for StataCorp to replicate and fix without help from users. I'm obviously just guessing here, but from the user's manual, the objective function that -clogit- maximizes resembles a penalized log-likelihood, and something like a problem in the recursive algorithm to compute the conditioning factor looks as if it could give rise to the kind of behavior you and Yu Xue describe. Regardless, if you're satisfied with the items in 4. above, then you might be doing everyone a favor by contacting StataCorp for follow-up.
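
As a cross-check on the comparison in item 1, the null-model value can also be computed outside both packages. The sketch below is not part of the original advice; it assumes that both Stata and SAS report the exact conditional likelihood, under which a group with n observations and k positive outcomes contributes -log C(n, k) to the null-model log-likelihood. The function name and the toy data are hypothetical.

```python
from collections import defaultdict
from math import lgamma


def null_clogit_minus2ll(ids, y):
    """Return (-2 log L, dropped groups, dropped obs) for a no-predictor
    conditional logit, assuming the exact conditional likelihood."""
    n = defaultdict(int)  # observations per group
    k = defaultdict(int)  # positive outcomes per group
    for g, yi in zip(ids, y):
        n[g] += 1
        k[g] += int(yi)

    log_lik = 0.0
    dropped_groups = 0
    dropped_obs = 0
    for g in n:
        if k[g] == 0 or k[g] == n[g]:
            # Constant response: uninformative, contributes log(1) = 0.
            dropped_groups += 1
            dropped_obs += n[g]
            continue
        # log C(n, k) via log-gamma, which avoids overflow in long panels.
        log_comb = lgamma(n[g] + 1) - lgamma(k[g] + 1) - lgamma(n[g] - k[g] + 1)
        log_lik -= log_comb
    return -2.0 * log_lik, dropped_groups, dropped_obs


# Toy data (hypothetical): group 1 is informative, groups 2 and 3 are not.
ids = [1, 1, 1, 2, 2, 3]
y = [1, 0, 0, 1, 1, 0]
minus2ll, groups, obs = null_clogit_minus2ll(ids, y)
print(minus2ll, groups, obs)
```

If the value computed this way from an export of the data disagrees with one package but not the other, that would point to where the datasets (or the likelihoods) diverge. The two counts correspond to the dropped-group note in Stata and the uninformative-strata lines in SAS.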
