The Stata listserver

Re: st: cluster in logisitc

From   Mark Schaffer <[email protected]>
To   [email protected], "Cram, Peter" <[email protected]>
Subject   Re: st: cluster in logisitc
Date   Fri, 06 Feb 2004 10:52:21 +0000 (GMT)


Quoting "Cram, Peter" <[email protected]>:

> Dear Statalist,
> I am running the following logistic regression model.
> xi: logistic mort30 SpecHosp age i.race alcohol_ex
> arrhyth_ex
> However I have hospitals clustered within regions (variable hrrnum)
> and patients clustered within hospitals (variable prov_num).
> I have been able to run my regression with a single cluster
> included:
> xi: logistic mort30 SpecHosp age i.race alcohol_ex arrhyth_ex,
> cluster
> (hrrnum)
> but am unable to run the model with 2 clusters because I receive
> error messages each time I do so.
> xi: logistic mort30 SpecHosp age i.race alcohol_ex arrhyth_ex,
> cluster (hrrnum) cluster (prov_num)
> How can I account for the 2 levels of clustering in my data in my
> regression model?

It depends on what you want to do, and on whether you want what -cluster- provides.

-cluster- allows, very generally, for arbitrary intra-cluster correlation.  
The only thing you need to assume is that disturbances across clusters are 
not correlated.

You have identified two levels of clusters, (hospitals within) regions and 
(patients within) hospitals.

If you cluster on regions using only the -cluster- option, then you are 
allowing for arbitrary intra-region correlation.  Unless I misunderstand 
the structure of the data, this would include possible correlations between 
patients who are located in the same hospital, because such patients are 
therefore also in the same region.
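
The argument above rests on every hospital belonging to exactly one region. A quick way to check that against the data is sketched below. It assumes the dataset from the question is in memory and that prov_num and hrrnum are coded as described there; nothing else about the data is assumed.

```stata
* Sketch only: assumes the dataset from the question is in memory,
* with prov_num identifying hospitals and hrrnum identifying regions.

* Within each hospital, sort by region and compare the first and last
* values. If a hospital appears in more than one region the assert
* fails, which means hospitals are not strictly nested within regions.
bysort prov_num (hrrnum): assert hrrnum[1] == hrrnum[_N]
```

If the assert passes silently, clustering on hrrnum also covers correlation among patients in the same hospital.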

In other words, clustering on just region will give you SEs that are robust 
to intra-regional, and hence intra-hospital, correlation...

...BUT, and this is crucial, you need a "respectable" number of clusters 
(in this case, regions) for the asymptotics behind -cluster- to work.  
Roughly speaking, if the number would be respectable as the number of 
observations in a regression, it would also be respectable as the number of 
clusters for the -cluster- option.  E.g., 100 could be respectable, but not 
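
To see how many clusters there actually are before relying on the clustered SEs, something along these lines should work. It is a sketch that assumes the variables named in the question; region_tag is a hypothetical helper variable introduced only for the count.

```stata
* Sketch only: assumes the dataset from the question is in memory.

* Tag one observation per region, then count the tagged observations
* to get the number of distinct clusters.
* (region_tag is a hypothetical helper name.)
egen byte region_tag = tag(hrrnum)
count if region_tag
display "Number of regions (clusters): " r(N)

* Cluster on the higher level only; with hospitals nested in regions,
* this also allows for correlation within hospitals.
xi: logistic mort30 SpecHosp age i.race alcohol_ex arrhyth_ex, cluster(hrrnum)
```

The regression output also reports the number of clusters used in the estimation sample, which can differ from the count above if some observations are dropped for missing values.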

Hope this helps.


> Thanks,
> Pete 

Prof. Mark Schaffer
Director, CERT
Department of Economics
School of Management & Languages
Heriot-Watt University, Edinburgh EH14 4AS
tel +44-131-451-3494 / fax +44-131-451-3008
email: [email protected]

