
# Re: st: clustering and log likelihood.

From: Stas Kolenikov | To: statalist@hsphsun2.harvard.edu | Subject: Re: st: clustering and log likelihood. | Date: Thu, 7 Mar 2013 08:30:28 -0600

```
Since your data are not i.i.d., you no longer have a true likelihood.
To make a long technical story inappropriately short:

1. What you write down is a pseudo-likelihood, and it looks exactly
the same as the likelihood would look in the i.i.d. case. Think about
OLS: you can still minimize the sum of squared errors to get some sort
of idea about the line of best fit, even when the errors are not
independent. That's what the pseudo-likelihood is for: to generate
point estimates.

2. Now, having obtained the point estimates, you need to recognize
that the function you maximized is not the true likelihood of the
data. Hence, the nice theorems about the asymptotic variance being the
inverse of the negative Hessian do not apply. Instead, you need to use
the more general theorems from M-estimation theory, which give you a
sandwich variance estimator (inverse Hessian times the variance of
scores times inverse Hessian). In the variance of scores computation,
you need to account for clustering: the scores are first added up
within each cluster, and the variance is then built from a sum over
the clusters, rather than over individual observations.

If you were to try to incorporate the cluster effects explicitly into
your likelihood, (i) the log likelihood is the sum over clusters; (ii)
each cluster contribution is the log of an integral, over the random
effect, of the product of the observation-level likelihoods
conditional on that random effect. You can assume normal random
effects, and integrate over them. What you will get in the end is
-xtprobit, re-, which is a very different model.

--
-- Stas Kolenikov, PhD, PStat (SSC)
-- Senior Survey Statistician, Abt SRBI
-- Opinions stated in this email are mine only, and do not reflect the
position of my employer

On Thu, Mar 7, 2013 at 1:54 AM, Elin Vimefall <Elin.Vimefall@oru.se> wrote:
> Hi
>
> I'm running a probit model like:
> probit enrol x1 x2 [pweight=weight], cluster(prov)
>
> I understand why I should use the cluster option but I'm really struggling with understanding what's happening more formally.
> I would like to write the log likelihood function. However, I do not understand how to incorporate the clusters.
> Will the clusters influence the likelihood function?
>
> I guess my question is stupid but I'm really struggling with this and would really appreciate any help I could get.
>
> Best regards
> /Elin Vimefall
>
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/faqs/resources/statalist-faq/
> *   http://www.ats.ucla.edu/stat/stata/
```
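The two steps in the reply above can be sketched numerically. What follows is a minimal illustration in plain Python, not Stata's implementation. The toy data, the function names, and the two-parameter model (intercept plus one covariate) are all assumptions made for the example. It maximizes the weighted probit pseudo-log-likelihood exactly as one would in the i.i.d. case, then compares the inverse negative Hessian with a sandwich estimator whose middle term is built from scores summed within each cluster. No finite-sample multiplier is applied to the sandwich, so the numbers would not be expected to match software that applies one.

```python
import math

def norm_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def norm_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Hypothetical toy data: (y, x, sampling weight, cluster id).
rows = [
    (1, 0.5, 1.0, "A"), (1, 1.2, 1.0, "A"), (0, 0.8, 2.0, "A"),
    (0, -0.7, 1.5, "B"), (0, -1.1, 1.0, "B"), (1, -0.6, 1.0, "B"),
    (1, 0.9, 2.0, "C"), (0, 0.1, 1.0, "C"), (1, 1.5, 1.0, "C"),
    (0, -0.4, 1.0, "D"), (1, 0.4, 1.5, "D"), (0, -1.3, 1.0, "D"),
]

def pseudo_loglik(b, rows):
    """Weighted probit log-likelihood, written as if observations were i.i.d."""
    total = 0.0
    for y, x, w, _ in rows:
        q = 2 * y - 1  # +1 for y = 1, -1 for y = 0
        total += w * math.log(norm_cdf(q * (b[0] + b[1] * x)))
    return total

def scores_and_hessian(b, rows):
    """Per-observation weighted scores and the summed 2x2 Hessian."""
    scores = []
    H = [[0.0, 0.0], [0.0, 0.0]]
    for y, x, w, _ in rows:
        q = 2 * y - 1
        xb = b[0] + b[1] * x
        lam = q * norm_pdf(xb) / norm_cdf(q * xb)
        xv = (1.0, x)
        scores.append((w * lam * xv[0], w * lam * xv[1]))
        h = -w * lam * (lam + xb)
        for r in range(2):
            for c in range(2):
                H[r][c] += h * xv[r] * xv[c]
    return scores, H

def inv2(m):
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]

def matmul2(a, b):
    return [[sum(a[r][k] * b[k][c] for k in range(2)) for c in range(2)]
            for r in range(2)]

def fit(rows, iters=25):
    """Newton-Raphson on the pseudo-log-likelihood, starting from zero."""
    b = [0.0, 0.0]
    for _ in range(iters):
        scores, H = scores_and_hessian(b, rows)
        S = [sum(s[k] for s in scores) for k in range(2)]
        H_inv = inv2(H)
        b = [b[r] - sum(H_inv[r][c] * S[c] for c in range(2))
             for r in range(2)]
    return b

def variances(b, rows):
    """Return (inverse negative Hessian, cluster sandwich) at b."""
    scores, H = scores_and_hessian(b, rows)
    A_inv = inv2([[-H[r][c] for c in range(2)] for r in range(2)])
    # Step 2 of the reply: add up the scores within each cluster first.
    totals = {}
    for (_, _, _, g), s in zip(rows, scores):
        t = totals.setdefault(g, [0.0, 0.0])
        t[0] += s[0]
        t[1] += s[1]
    # Middle of the sandwich: sum over clusters of outer products.
    B = [[sum(t[r] * t[c] for t in totals.values()) for c in range(2)]
         for r in range(2)]
    return A_inv, matmul2(matmul2(A_inv, B), A_inv)

b_hat = fit(rows)
naive, robust = variances(b_hat, rows)
print("point estimates:   ", b_hat)
print("inverse-Hessian SEs:", [math.sqrt(naive[k][k]) for k in range(2)])
print("cluster-robust SEs: ", [math.sqrt(robust[k][k]) for k in range(2)])
```

Note that `pseudo_loglik` and `fit` never look at the cluster id: the clusters do not change the function being maximized or the point estimates. They enter only in `variances`, through the grouping of the scores.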