Re: st: SAS vs STATA : why is xtlogit SO slow ?
Sat, 4 Feb 2012 13:33:40 +0100
Sorry for the delay. I had to try your very interesting suggestions
before anything else.
Richard, Clyde, thank you for your interesting comments, but the from()
option doesn't help. Stata still cannot converge:
Iteration 0: log likelihood = -1.#INF
Iteration 1: log likelihood = -1.#IND
Hessian is not negative semidefinite
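A minimal sketch of how pooled-logit starting values can be passed
through from(), for anyone who wants to reproduce the attempt. Y, DUM
and CONT are the variables of the logit command quoted in this message;
id is a hypothetical name for the individual identifier, and the sketch
assumes the coefficient names of the two models match.

```stata
* Assumption: id is a placeholder for the individual identifier
xtset id

* Pooled logit to obtain candidate starting values
logit Y DUM CONT
matrix b0 = e(b)

* Fixed-effects logit started from the pooled estimates.
* skip ignores entries of b0 (such as _cons) that are not in the FE model
xtlogit Y DUM CONT, fe from(b0, skip)
```

If the log likelihood is still infinite at iteration 0, the documented
maximize option difficult may also be worth a try, although nothing
guarantees it helps here.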
Klaus, indeed I am trying to estimate a fixed-effects logit, not a
random-effects one. However, are you sure that Stata uses the pooled
coefficients from the plain logit estimation as starting values?
Indeed, if I run the Stata command logit Y DUM CONT, it takes only a
few seconds to converge, but the results are quite different from the
fixed-effects logit estimates from SAS. One parameter has the opposite
sign, for example, which probably means that including dummies by
individual is important ;-)
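The comparison between the pooled and the fixed-effects fit can be laid
out side by side, roughly as in the sketch below. It uses clogit, which
fits the same conditional (fixed-effects) logit as xtlogit with the fe
option; id is again a hypothetical name for the individual identifier.

```stata
* Pooled logit: ignores the panel structure
logit Y DUM CONT
estimates store pooled

* Conditional (fixed-effects) logit, grouped by individual
* Assumption: id is a placeholder for the individual identifier
clogit Y DUM CONT, group(id)
estimates store fe

* Coefficients and standard errors of both models in one table
estimates table pooled fe, se
```

A sign flip between the two columns would be consistent with
individual-level heterogeneity that the pooled model ignores.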
By the way I have checked that there is indeed enough variation in the
DUM categorical variable so I do not think the problems are coming
from the variables...
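A rough sketch of this kind of check, since for a fixed-effects logit
it is the variation within individuals that matters. The variable names
are those used above, id is a hypothetical individual identifier, and
ybar is a helper variable created only for this illustration.

```stata
* Assumption: id is a placeholder for the individual identifier
xtset id

* Overall, between and within variation of the regressors
xtsum DUM CONT

* How the values of DUM are distributed between and within individuals
xttab DUM

* Observations from individuals whose outcome never changes;
* these drop out of the fixed-effects logit
bysort id: egen ybar = mean(Y)
count if ybar == 0 | ybar == 1
```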
MORE IMPORTANTLY: when I compare SAS's results with Stata's on a MUCH
(really much) smaller sample (fewer than 2000 observations, 146
individuals, 11 points on average per individual), the results are
exactly the same in the two systems (same point estimates, standard
errors, and p-values). This suggests that something goes wrong when
Stata tries to fit the fixed-effects logit model on a larger dataset.
So I am puzzled...
What do you think?
Thanks again for your help
On 3 February 2012 18:14, Klaus Pforr <firstname.lastname@example.org> wrote:
> just some comments on this, although I hope that the person who posted this
> problem originally will eventually tell us more about the data and the
> Am 03.02.2012 17:34, schrieb Clyde B Schechter:
>> I don't really know much about how xtlogit (or any of the other xt
>> estimators) work "under the hood" [that's "under the bonnet" to Nick Cox]
>> but I have used these estimators a fair amount and have some pragmatic tips
>> for dealing with non-convergence of random effects models that have served
>> me well.
> I think that he/she wants to estimate a fixed-effects model (although I'm
> not sure whether this is generally easier or more difficult to estimate
> than RE)
>> 1. Check all of your categorical predictors. If any of them have any
>> level that is only instantiated in a small number of cases in the estimation
>> sample, the coefficient for that level can be very difficult to estimate.
>> Try combining some levels in that variable (or, if it is a dichotomous
>> variable drop it from the model.)
>> 2. Similarly, check your continuous variables to be sure they have some
>> reasonable amount of variability in the estimation sample.
>> 3. Check the scales of your continuous variables to see that they are all
>> in the same "ballpark." If two variables differ by several orders of
>> magnitude, Stata will often thrash around trying to fit coefficients and
>> ultimately fail.
>> 4. Try providing Stata with starting values of your own using the from()
>> option. Other responders have already suggested this. I have a couple of
>> specific suggestions for selecting starting values:
>> a. Try the non-xt version of the same model, in this case logit. See if
>> those values will get Stata over the hump.
>> b. Try the population averaged version of the same model. The population
>> averaged estimator is calculated using a different approach that seems to be
>> more robust to quirks in the data, and those estimates often work well as
>> starting values for the random effects model. [Which surprises me, because
>> the population averaged parameters are actually different conceptually and
>> often distant numerically from the corresponding parameters of a random
>> effects model. But my experience is that they almost always work as a
>> starting point nonetheless.]
> At least for FE, the implemented estimator uses the pooled coefficients of
> the logit model by default. Another possibility is random starting values,
> either by turning on the search option with the ml options, or by computing
> them beforehand and passing them via the from() option.
>> Hope this helps.
>> Clyde Schechter
>> Department of Family & Social Medicine
>> Albert Einstein College of Medicine
>> Bronx, New York, USA
>> * For searches and help try:
>> * http://www.stata.com/help.cgi?search
>> * http://www.stata.com/support/statalist/faq
>> * http://www.ats.ucla.edu/stat/stata/
> Klaus Pforr
> MZES AB - A
> Universität Mannheim
> D - 68131 Mannheim
> Tel: +49-621-181 2797
> fax: +49-621-181 2803
> URL: http://www.mzes.uni-mannheim.de
> Besucheranschrift: A5, Raum A309