Statalist


Re: st: -xtmelogit- question


From   ymarchenko@stata.com (Yulia Marchenko, StataCorp LP)
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: -xtmelogit- question
Date   Mon, 24 Sep 2007 17:07:57 -0500

Garrard, Wendy M. <wendy.garrard@Vanderbilt.Edu> asks about the group
information reported in the output after -xtmelogit-:

> I am using -xtmelogit- to estimate a crossed-effects random intercept model.
> The data represents responses from individuals who use services at different
> agencies; those agencies operate in multiple counties.  I want to identify the
> county-specific effects and the agency-specific effects.  There are 95 total
> counties in TN, but only 70 have respondents; There are 20 agencies in the
> data. (-tab- confirms this.)
> 
> PROBLEM -- when I run the first command below , the groupings section of the
> output show 70 counties but 93 agencies and the RE estimates for the var
> output are puzzling (see the example output below my signature):
> 
>   . xtmelogit outcome varlist || tncounty: || agencynum: , options
>
> Yet, when I run this second version of the command, mimicking the new Stata10
> manual for treating the agencies as if nested in counties, it gives me the
> correct number of agencies in the groupings section of the output and the RE
> estimates for the var/std are sensible.
>
> . xtmelogit outcome varlist || _all:R.tncounty || agencynum: , options

Let me first clarify the two syntaxes for the nested- and crossed-effects
specifications of the random-effects models.  The first syntax

(1) . xtmelogit outcome varlist || tncounty: || agencynum: , options

corresponds to a mixed logistic model with two _nested_ random effects: the
random effects for agencies (defined by the level variable "agencynum") are
treated as nested within the random effects for counties (defined by the
level variable "tncounty").  The second syntax

(2a) . xtmelogit outcome varlist || _all:R.tncounty || agencynum: , options

corresponds to the _crossed_-effects specification, in which counties are
assumed to be crossed with agencies.  The above syntax is equivalent to a more
direct specification of the crossed-effects model

(2b) . xtmelogit outcome varlist || _all:R.tncounty || _all:R.agencynum , options

Here, the whole dataset is treated as one big panel, as requested by the
-_all- specifiers, and indicator variables identifying counties and agencies
are created per the -R.tncounty- and -R.agencynum- specifications.  Without
going into the details, which may be found in the sources given below, I'll
point out that syntax (2a) is merely a more efficient way of fitting the
crossed-effects model specified by (2b).  In fact, an even more efficient
way of fitting this same model is to use the -R.varname- notation with
"agencynum" as the "varname" to create indicator variables for agencies at the
first level and to use "tncounty" as the second-level variable:

(2c) . xtmelogit outcome varlist || _all:R.agencynum || tncounty: , options

Since the -R.varname- notation adds a column to the random-effects matrix for
each category of the variable "varname", it is better to use the variable with
the smaller number of categories in the -R.varname- notation.  In Wendy's
example that variable is "agencynum" (20 agencies versus 70 counties).
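
A quick way to compare the number of categories before deciding which variable
to put in the -R.varname- notation is sketched below.  Note that this counts
categories in the whole dataset, which may differ from the estimation sample
if some observations are dropped from the model:

. qui tabulate tncounty
. display as txt "Number of counties = " as res r(r)
. qui tabulate agencynum
. display as txt "Number of agencies = " as res r(r)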

For more examples and details of fitting crossed-effects models see example 5
in [XT] xtmelogit, example 6 in [XT] xtmixed, and other sources, for example:

http://www.stata.com/bookstore/mlmus.html
http://www.stata-journal.com/abstracts/st0095.pdf
http://www.stata.com/statalist/archive/2007-06/msg00837.html

Let's now investigate the group information reported for each syntax.
Consider part of the output presented by Wendy for the nested-effects model
(1):

--------------------------------------------------------------------------
                |   No. of       Observations per Group       Integration
 Group Variable |   Groups    Minimum    Average    Maximum      Points
----------------+---------------------------------------------------------
       tncounty |        70       ...
      agencynum |        93       ...
--------------------------------------------------------------------------

The number of groups for "tncounty" is 70 (as Wendy expected), and the number
of groups for "agencynum" is 93 rather than the expected 20.  Since Wendy is
fitting the nested-effects model, the number of levels for the agencies nested
within counties is determined by the number of unique combinations of counties
and agencies.  Wendy can verify this, for example, as follows.  After fitting
the model with -xtmelogit-, Wendy can type

. egen count = group(tncounty agencynum) if e(sample)
. qui tabulate count
. display as txt "Number of categories = " as res r(r)

and should obtain the same number, 93, as reported in the above output.
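
To see directly whether agencies are nested within counties or crossed with
them, one could also count how many distinct counties each agency appears in.
This is only a sketch; "pair" and "ncounties" are made-up variable names, and
the commands use the whole dataset rather than the estimation sample:

. egen pair = tag(agencynum tncounty)
. egen ncounties = total(pair), by(agencynum)
. summarize ncounties

Here -tag()- marks one observation per county-agency combination, and
-total()- adds up those marks within each agency.  A maximum of "ncounties"
above 1 means that at least one agency operates in more than one county, which
is the crossed rather than the nested situation.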

Although Wendy does not provide the output from fitting the crossed-effects
model (2a), the group information should look something like:

--------------------------------------------------------------------------
                |   No. of       Observations per Group       Integration
 Group Variable |   Groups    Minimum    Average    Maximum      Points
----------------+---------------------------------------------------------
           _all |        1        ...
      agencynum |       20        ...
--------------------------------------------------------------------------

In this model, "agencynum" is nested within one big panel (-_all-) so the
number of groups is the number of categories of "agencynum", 20.

> judging from the missing value in the RE table, I wonder if there is some sort
> of problem maximizing/estimating, but I'm not sure how or why.

According to the iteration log from the output of -xtmelogit- there is no
problem with the maximization of the likelihood; the model converged to an
answer.  A missing value in the random-effects table indicates a very large
upper bound for the estimate of the variance component for counties.

One comment on the syntax below:

> . xtmelogit vsat black || tncounty: || agencynum: , or variance cov(un)
> Note: single-variable random-effects specification; covariance structure set
>       to identity

Wendy does not need to specify the -cov(un)- option because the
"agencynum"-level equation contains only one random coefficient (the random
intercept), so there are no covariances to estimate; as the note says, the
covariance structure is set to identity.

Since Wendy wants to fit a crossed-effects model, the more efficient syntax is
(2c) above, repeated here:

. xtmelogit outcome varlist || _all:R.agencynum || tncounty: , options


-- Yulia
ymarchenko@stata.com