Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: Binary panel data questions

From	Kim Peeters <[email protected]>
To	"[email protected]" <[email protected]>
Subject	Re: st: Binary panel data questions
Date	Thu, 9 Feb 2012 09:27:52 -0800 (PST)
Dear Maarten,

Thank you for your reply. Concerning your remark about data preparation and quality: it turns out that the data are correct. The ailment is not very common, and once a patient suffers from it, a cure is very unlikely.

In the meantime I fitted two different models.

First model: standard logistic regression including a time factor variable and clustered standard errors, allowing for intra-patient correlation

Logistic regression                               Number of obs   =       4526
                                                  Wald chi2(21)   =      62.63
                                                  Prob > chi2     =     0.0000
Log pseudolikelihood = -2889.4078                 Pseudo R2       =     0.0690

                                                    (Std. Err. adjusted for 588 clusters in ID)
-----------------------------------------------------------------------------------------------
                              |               Robust
                 Profitstatus |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
------------------------------+----------------------------------------------------------------
                         Year |
                        1995  |          0  (empty)
                        1996  |   .3095819   .6289854     0.49   0.623    -.9232068    1.542371
                        1997  |   .2287845   .3932461     0.58   0.561    -.5419637    .9995326
                        1998  |   .2779752   .2760959     1.01   0.314    -.2631629    .8191133
                        1999  |   .2198992   .2423209     0.91   0.364    -.2550409    .6948394
                        2000  |   .2776958   .1964845     1.41   0.158    -.1074067    .6627984
                        2001  |    .173692   .1671147     1.04   0.299    -.1538467    .5012308
                        2002  |  -.0233154   .1418964    -0.16   0.869    -.3014272    .2547964
                        2003  |   .0028641   .1155645     0.02   0.980    -.2236381    .2293663
                        2004  |   .0281098   .0998883     0.28   0.778    -.1676678    .2238873
                        2005  |   .0220868   .0823186     0.27   0.788    -.1392547    .1834282
                        2006  |   .0470962   .0740874     0.64   0.525    -.0981124    .1923049
                        2007  |    .008058   .0702793     0.11   0.909    -.1296869    .1458029
                        2008  |   .0484251   .0671299     0.72   0.471     -.083147    .1799971
                        2009  |   .0380139   .0655851     0.58   0.562    -.0905306    .1665584
                        2010  |          0  (omitted)
                              |
                          X   |
                           2  |    1.20977   .3477654     3.48   0.001     .5281622    1.891377
                           3  |   .7152767    .287351     2.49   0.013      .152079    1.278474
                           4  |   .1813765   .2763467     0.66   0.512    -.3602532    .7230061
                           5  |          0  (empty)
                           6  |    .750882   .3379602     2.22   0.026     .0884923    1.413272
                              |
                          Y   |   .2927133   .0971447     3.01   0.003     .1023131    .4831135
                          Z   |  -.9795072   .3057005    -3.20   0.001    -1.578669   -.3803452
                          A   |  -1.525683   .3984367    -3.83   0.000    -2.306604   -.7447611
                        _cons |   .5056934   .3791922     1.33   0.182    -.2375097    1.248896
-----------------------------------------------------------------------------------------------
Note: 1 failure and 3 successes completely determined.
note: 1995.Year != 0 predicts success perfectly
      1995.Year dropped and 2 obs not used
note: 5.X != 0 predicts failure perfectly
      5.X dropped and 322 obs not used
note: 2010.Year omitted because of collinearity
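
As a quick sanity check on how to read the table above, the z statistic, p-value and Wald confidence interval can be recomputed from a coefficient and its robust standard error. The following is a minimal Python sketch (not Stata code); the function name `wald_summary` is an illustrative assumption, and the inputs are the values reported for Y in the first model.

```python
import math
from statistics import NormalDist

def wald_summary(coef, se, level=0.95):
    """Return the z statistic, two-sided p-value and Wald confidence interval."""
    z = coef / se
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    crit = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% interval
    return z, p, (coef - crit * se, coef + crit * se)

# Coefficient and robust standard error for Y, copied from the first model
z, p, (lo, hi) = wald_summary(0.2927133, 0.0971447)

# On the logit scale the coefficient is a log odds ratio
odds_ratio = math.exp(0.2927133)

print(f"z = {z:.2f}, p = {p:.3f}, 95% CI = [{lo:.4f}, {hi:.4f}], OR = {odds_ratio:.3f}")
```

The same function applied to a row of the random-effects table shows how much wider those intervals are, because the standard errors there are far larger.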


Second model: -xtlogit- with random effects

Random-effects logistic regression              Number of obs      =      4850
Group variable: ID                              Number of groups   =       624

Random effects u_i ~ Gaussian                   Obs per group: min =         2
                                                               avg =       7.8
                                                               max =        16

                                                Wald chi2(8)       =     37.88
Log likelihood  = -379.48407                    Prob > chi2        =    0.0000

-----------------------------------------------------------------------------------------------
                 Profitstatus |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
------------------------------+----------------------------------------------------------------
                          X   |
                           2  |   1.952878   1.480678     1.32   0.187    -.9491981    4.854953
                           3  |   2.070424   1.322714     1.57   0.118    -.5220471    4.662896
                           4  |   .2901392   1.324056     0.22   0.827    -2.304962    2.885241
                           5  |   -55.3464   4613.941    -0.01   0.990    -9098.504    8987.812
                           6  |   3.294871   2.974562     1.11   0.268    -2.535163    9.124905
                              |
                          Y   |   .4409673   .1010158     4.37   0.000     .2429799    .6389547
                          Z   |  -1.308088   1.035164    -1.26   0.206    -3.336972    .7207964
                          A   |  -3.993116   2.400293    -1.66   0.096    -8.697604    .7113715
                        _cons |  -.0651275   2.014896    -0.03   0.974     -4.01425    3.883995
------------------------------+----------------------------------------------------------------
                     /lnsig2u |   4.505505    .116975                      4.276238    4.734772
------------------------------+----------------------------------------------------------------
                      sigma_u |   9.513887   .5564434                      8.483466    10.66946
                          rho |   .9649282   .0039586                      .9562861     .971912
-----------------------------------------------------------------------------------------------
Likelihood-ratio test of rho=0: chibar2(01) =  5026.14 Prob >= chibar2 = 0.000
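
The sigma_u and rho rows above are derived from /lnsig2u. A minimal Python sketch (not Stata code) of that arithmetic follows, assuming the usual latent-variable convention for a logit model in which the individual-level error variance is pi^2/3; the variable names are illustrative.

```python
import math

lnsig2u = 4.505505  # /lnsig2u as reported in the random-effects output

# sigma_u is the standard deviation of the patient-level random intercept
sigma_u = math.exp(lnsig2u / 2)

# Share of the total latent variance that lies between patients,
# assuming a logistic error variance of pi^2 / 3
rho = sigma_u**2 / (sigma_u**2 + math.pi**2 / 3)

print(f"sigma_u = {sigma_u:.4f}, rho = {rho:.4f}")
```

A rho this close to 1 says that almost all of the latent variation is between patients rather than within them, which is what one would expect when the outcome hardly ever changes within a patient.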

In the standard logistic regression, the variables Y, Z and A are significant. In the random-effects panel regression, however, only Y is significant, and the results for X also differ. I did not expect the two models to differ that much. Why are the results so different, or am I doing something wrong?

Thank you!


Kind regards,
Kim 




----- Original Message -----
From: Maarten Buis <[email protected]>
To: [email protected]
Cc: 
Sent: Wednesday, February 8, 2012 10:27 AM
Subject: Re: st: Binary panel data questions

On Wed, Feb 8, 2012 at 1:19 AM, Kim Peeters wrote:
> Somewhat remarkably, it turns out that none of the participants in the study experienced a transition from one state to the other state (e.g. transition from no ailment to ailment and vice versa). In other words, all patients that did not suffer from the illness at the onset of the study remained disease-free and all patients that did suffer from the illness at the onset of the study continued to be ill.
>
> Originally, I planned to use -xtlogit- with fixed effects to control for unobserved influences that differ between patients but remain constant within a given patient. However, since none of the patients experienced a transition, Stata correctly returns error code 2000: outcome does not vary in any group.
>
> At the moment, I do not know which statistical technique would be the most appropriate. Recall that I am trying to test for a relationship between the outcome (no illness vs. illness) and a group of independent variables. I thought about running a logistic regression with clustered standard errors (i.e. vce(cluster ID)). However, I do not want to discard the time dimension in the panel data, and I would like to correct for potential omitted variable bias.

In essence you do not have panel data; you could just as well use the
first observation for each person and fit a regular -logit-. I just
don't think there is any more information present in your data, and no
amount of fancy modeling can invent information that isn't present in
the data.
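
The reduction to one record per person can be sketched as follows. This is a minimal Python illustration (not Stata code): the helper name `first_observation_per_id` and the toy rows are hypothetical, and only the variable names ID, Year and Profitstatus are taken from the output in this thread.

```python
def first_observation_per_id(rows, id_key="ID", time_key="Year"):
    """Keep only the earliest record for each ID."""
    first = {}
    for row in rows:
        pid = row[id_key]
        if pid not in first or row[time_key] < first[pid][time_key]:
            first[pid] = row
    return sorted(first.values(), key=lambda r: r[id_key])

# Hypothetical toy panel in which the outcome never changes within a patient
panel = [
    {"ID": 1, "Year": 2004, "Profitstatus": 0},
    {"ID": 1, "Year": 2003, "Profitstatus": 0},
    {"ID": 2, "Year": 2001, "Profitstatus": 1},
    {"ID": 2, "Year": 2002, "Profitstatus": 1},
    {"ID": 3, "Year": 2005, "Profitstatus": 0},
]

cross_section = first_observation_per_id(panel)
for row in cross_section:
    print(row)
```

When the outcome is constant within each ID, the later rows repeat the same outcome, so the reduced data carry the same outcome information; covariates that vary over time would need separate thought.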

I would really check again whether that constant disease status isn't
some error during data preparation or some artifact of the way the
data was collected, as that a) sounds really suspicious and b) is
causing you this problem.

Hope this helps,
Maarten

--------------------------
Maarten L. Buis
Institut fuer Soziologie
Universitaet Tuebingen
Wilhelmstrasse 36
72074 Tuebingen
Germany


http://www.maartenbuis.nl
--------------------------

*
*   For searches and help try:
*  http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/


Follow-Ups:
- Re: st: Binary panel data questions
  - From: [email protected] (Brendan Halpin)
References:
- st: Binary panel data questions
  - From: Kim Peeters <[email protected]>
- Re: st: Binary panel data questions
  - From: Maarten Buis <[email protected]>