Re: st: too good to be true : lr test in mlogit?

From	John Litfiba <[email protected]>
To	[email protected]
Subject	Re: st: too good to be true : lr test in mlogit?
Date	Mon, 16 May 2011 12:02:51 +0200

Dear Maarten,

Thank you very much again for your support!

When I run xtlogit on one year of data (1 million observations), after
first typing xtset id, where id is the unique identifier of each
individual in my database, I get the message shown below.

Yvar is categorical (Yes=1, No=0) and Xvar is also categorical
(type1=1, type2=2)

*********************************************************************************************************
. xtlogit Yvar Xvar, re

Fitting comparison model:

Iteration 0:   log likelihood = -699882.93
Iteration 1:   log likelihood = -669440.74
Iteration 2:   log likelihood = -669402.92
Iteration 3:   log likelihood = -669402.89

Fitting full model:

tau =  0.0     log likelihood = -669402.89
tau =  0.1     log likelihood =  -460383.3
tau =  0.2     log likelihood = -425672.08
tau =  0.3     log likelihood = -409117.91
tau =  0.4     log likelihood = -398842.25
tau =  0.5     log likelihood = -391638.85
tau =  0.6     log likelihood = -386752.42
tau =  0.7     log likelihood = -384063.23
tau =  0.8     log likelihood = -383816.98

initial values not feasible
r(1400);
***********************************************************************************************
and when I run a fixed-effects model I get

**************************************************************************************************
. xtlogit Yvar Xvar, fe
note: multiple positive outcomes within groups encountered.
note: 18475 groups (170046 obs) dropped because of all positive or
      all negative outcomes.

Iteration 0:   log likelihood =    -1.#INF
Iteration 1:   log likelihood =    -1.#IND
Hessian is not negative semidefinite
r(430);
************************************************************************************************************************

However, if I keep only the last 100,000 observations of my
sample, I get

************************************************************************************************************************
. xtlogit Yvar Xvar, fe

note: multiple positive outcomes within groups encountered.
note: 11791 groups (49177 obs) dropped because of all positive or
      all negative outcomes.

Iteration 0:   log likelihood = -22470.418
Iteration 1:   log likelihood = -22218.949
Iteration 2:   log likelihood = -22218.885
Iteration 3:   log likelihood = -22218.885

Conditional fixed-effects logistic regression   Number of obs      =     69669
Group variable: id2                             Number of groups   =      3794

                                                Obs per group: min =         2
                                                               avg =      18.4
                                                               max =       876

                                                LR chi2(1)         =   5266.26
Log likelihood  = -22218.885                    Prob > chi2        =    0.0000

------------------------------------------------------------------------------
        Yvar |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        Xvar |   5.315414   .1412066    37.64   0.000     5.038654    5.592174
------------------------------------------------------------------------------
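
The "groups dropped" notes arise because the conditional fixed-effects
estimator can only use individuals whose outcome varies. A minimal sketch
for counting such groups beforehand, assuming the variable names used in
this post (Yvar, id); ymin, ymax and onepergroup are hypothetical helper
names, not part of the original commands:

```stata
* smallest and largest outcome within each individual
bysort id: egen ymin = min(Yvar)
bysort id: egen ymax = max(Yvar)

* flag exactly one observation per individual
egen onepergroup = tag(id)

* individuals with no variation in Yvar (all 0 or all 1)
count if onepergroup & ymin == ymax

* observations belonging to those individuals
count if ymin == ymax
```

If the data are the same, the two counts should correspond to the numbers
of groups and observations that xtlogit, fe reports as dropped.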


and for the random-effects model I get:

***************************************************************************************************

. xtlogit Yvar Xvar, re

Fitting comparison model:

Iteration 0:   log likelihood = -79872.579
Iteration 1:   log likelihood = -70108.483
Iteration 2:   log likelihood = -69952.535
Iteration 3:   log likelihood = -69950.066
Iteration 4:   log likelihood =  -69950.06

Fitting full model:

tau =  0.0     log likelihood =  -69950.06
tau =  0.1     log likelihood = -55891.467
tau =  0.2     log likelihood = -51186.623
tau =  0.3     log likelihood = -48260.258
tau =  0.4     log likelihood = -46086.379
tau =  0.5     log likelihood = -44358.837
tau =  0.6     log likelihood = -42957.577
tau =  0.7     log likelihood = -41790.563
tau =  0.8     log likelihood = -40944.535

Iteration 0:   log likelihood = -41603.261
Iteration 1:   log likelihood = -39231.257
Iteration 2:   log likelihood =  -38979.35
Iteration 3:   log likelihood = -38947.091
Iteration 4:   log likelihood = -38947.091  (backed up)
Iteration 5:   log likelihood = -38947.026
Iteration 6:   log likelihood = -38947.026

Random-effects logistic regression              Number of obs      =    118846
Group variable: id2                             Number of groups   =     15585

Random effects u_i ~ Gaussian                   Obs per group: min =         1
                                                               avg =       7.6
                                                               max =      1255

                                                Wald chi2(1)       =   2339.74
Log likelihood  = -38947.026                    Prob > chi2        =    0.0000

------------------------------------------------------------------------------
        Yvar |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        Xvar |   9.349703   .1932923    48.37   0.000     8.970857    9.728549
       _cons |  -8.300978   .1855708   -44.73   0.000     -8.66469   -7.937266
-------------+----------------------------------------------------------------
    /lnsig2u |   2.728813   .0338054                      2.662555     2.79507
-------------+----------------------------------------------------------------
     sigma_u |   3.913399    .066147                      3.785877    4.045216
         rho |   .8231687   .0049208                      .8133168    .8326077
------------------------------------------------------------------------------
Likelihood-ratio test of rho=0: chibar2(01) =  6.2e+04 Prob >= chibar2 = 0.000
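
The last three rows of this table are transformations of one estimate:
sigma_u is the square root of exp(/lnsig2u), and rho is the share of the
latent-scale variance due to the random intercept, where the logistic
error variance is pi^2/3. A small sketch of that arithmetic, using only
the number printed above; the scalar names are illustrative:

```stata
* value copied from the /lnsig2u row of the output above
scalar lnsig2u = 2.728813

* variance of the random intercept
scalar sig2u = exp(lnsig2u)

* standard deviation of the random intercept (the sigma_u row)
display "sigma_u = " sqrt(sig2u)

* intraclass correlation on the latent scale (the rho row)
display "rho     = " sig2u / (sig2u + _pi^2/3)
```

Up to rounding, the two displayed values should match the reported
sigma_u and rho.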


Best Regards

On 16 May 2011 09:49, Maarten Buis <[email protected]> wrote:
> On Sat, May 14, 2011 at 11:31 AM, John Litfiba wrote:
>> 1) The log likelihood doesn't converge when I try to fit a random- or
>> fixed-effects model with xtlogit on my entire dataset.
>> I have to choose a very "small" sample (well, small compared to the
>> total sample size) of about 10000 observations in order to see the
>> results; otherwise I get an error message after 3 or 4 iterations
>
> If you do not tell us what the error message is, then we obviously
> cannot help you. We need to know exactly what you typed and what Stata
> told you in return.
>
>> 2) The idea of running, let's say, M regressions over randomly chosen
>> samples could be a solution, but is it statistically valid? I mean, if
>> I obtain the distribution of the parameters across my M simulations, can
>> I infer something about the parameters of the estimation that should
>> have been done on the entire dataset?
>
> No, but if you sample correctly, a single random sample of higher-level
> units will be just as valid a sample from your population as your
> large sample, just with a smaller N. The added value of additional
> observations tends to decrease with sample size, so going from 10 to
> 11 observations will have a much bigger effect on your inference than
> moving from 100 to 101 observations. There are many estimates for
> which the difference between 10000 and 10000000 observations is just
> negligible (but there are estimates where it will matter, for example
> higher-order interaction terms or a categorical variable containing a
> rarely occurring category).
>
> Hope this helps,
> Maarten
>
> --------------------------
> Maarten L. Buis
> Institut fuer Soziologie
> Universitaet Tuebingen
> Wilhelmstrasse 36
> 72074 Tuebingen
> Germany
>
>
> http://www.maartenbuis.nl
> --------------------------
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
>
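
One way to draw the single random sample of higher-level units suggested
in the quoted reply is to sample whole individuals rather than rows, so
that each selected id keeps all of its observations. This is only a
sketch under stated assumptions, not code from the thread: the seed, the
sample size of 5000 individuals and the tempfile name chosen are
arbitrary, and the variable names are those used in this post.

```stata
* arbitrary seed, set only so that the draw is reproducible
set seed 20110516

* build a list of randomly selected individuals
preserve
bysort id: keep if _n == 1        // one row per individual
sample 5000, count                // illustrative number of individuals
keep id
tempfile chosen
save `chosen'
restore

* keep every observation of the selected individuals only
merge m:1 id using `chosen', keep(match) nogenerate

* refit the model on the subsample
xtset id
xtlogit Yvar Xvar, re
```

Run it as a do-file so that the temporary file is cleaned up
automatically. Sampling ids instead of observations keeps the panel
structure within each individual intact.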

