
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at


Re: st: SAS vs STATA : why is xtlogit SO slow ?

From   Klaus Pforr
Subject   Re: st: SAS vs STATA : why is xtlogit SO slow ?
Date   Sun, 05 Feb 2012 15:37:15 +0100



Written on 04.02.2012 at 13:33:
Hello everyone,

Sorry for the delay.. I had to try your very interesting suggestions
before anything else...

Richard, Clyde, thank you for your interesting comments, but the from()
option doesn't help... Stata cannot converge:
Iteration 0:   log likelihood =    -1.#INF
Iteration 1:   log likelihood =    -1.#IND
Hessian is not negative semidefinite

Klaus, indeed I am trying to estimate a fixed-effects logit, not a
random-effects one. However, are you sure that Stata uses the pooled
coefficients from the plain logit estimation?
Indeed, if I run the Stata command logit Y DUM CONT, the computation
takes only a few seconds to converge, but the results are quite
different from the SAS fixed-effects logit estimation... One parameter
has the opposite sign, for example, which probably means that including
dummies by individual is important.. ;-)
xtlogit.ado with the fe option calls clogit.ado. You can see this in lines 208-248 of xtlogit.ado (version 2.12.3, 11may2010); the actual call to clogit is in line 246. In clogit.ado (version 1.6.15, 15jul2011) the starting values are handled in lines 269-304. Depending on the from() option and other settings, the default is to fit a binary logit (see line 281) to obtain the starting values.
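
To check this in your own installation, something like the following should work. It is only a sketch: the version and the line numbers in your copy of the ado-files may differ from the ones I cite above.

```stata
* Report the path and version line of each installed ado-file
which xtlogit
which clogit

* Open the ado-file source in the Viewer to read the cited line ranges
viewsource xtlogit.ado
viewsource clogit.ado
```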

But don't get me wrong: do not use the plain logit to estimate your results when you have reasons to estimate a fixed-effects model.

Another thing that raises doubts for me is your mention of including dummies by individual. You cannot use this approach for any ML-estimated fixed-effects model because of the incidental parameters problem. The conditions for the consistency of the ML estimator are not met if the length of the coefficient vector depends on N, which it does here, as you have a separate constant for almost every case. That is why you use the more complicated conditional logit approach in the first place. Please also give us the command line that you gave to Stata, and maybe a glimpse of your data.
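
As a rough sketch of what I mean, assuming a panel identifier that I call id here (a hypothetical name; Y, DUM and CONT are the variables from your logit command), the conditional approach would look like this:

```stata
* id is a hypothetical panel identifier; replace it with your own variable
xtset id

* Conditional (fixed-effects) logit: the individual constants are
* conditioned out of the likelihood, not estimated
xtlogit Y DUM CONT, fe

* The same model, calling clogit directly
clogit Y DUM CONT, group(id)

* Not recommended (incidental parameters problem): one dummy per individual
* logit Y DUM CONT i.id
```

The xtlogit and clogit calls should give the same estimates, since the first hands the work over to the second.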

By the way I have checked that there is indeed enough variation in the
DUM categorical variable so I do not think the problems are coming
from the variables...

MORE IMPORTANTLY: when I compare SAS's results with Stata's on a MUCH
(really much) smaller sample (fewer than 2000 observations, 146
individuals, 11 points on average per individual), the results are
exactly the same between the two systems (same point estimates, standard
errors and p-values)... thus suggesting that something bad is going on
when Stata tries to fit the fixed-effects logit model on a larger dataset.
So I am puzzled...

What do you think ?
Thanks again for your help



Klaus Pforr
Universität Mannheim
D - 68131 Mannheim
Tel:  +49-621-181 2797
fax:  +49-621-181 2803

Visitor address: A5, Room A309
