
Re: st: clustering at 2 levels

From   Steven Samuels <>
Subject   Re: st: clustering at 2 levels
Date   Thu, 24 Apr 2008 18:47:54 -0400

For some reason I did not receive Vivian's original post, so I am responding to Stas's reply.

1. I agree with Stas that meteorological stations are not part of the design. The ordinary procedure is to average the contributions of nearby stations, not to associate a single station with each village.

2. Vivian's -svyset- is incorrect. It should start by specifying the Primary Sampling Unit (PSU):

svyset vilid

I would guess that Stata issued the error message because Vivian has repeated person IDs within village. Vivian doesn't give details about strata, weights, or subsequent stages of sampling, so I cannot say what the rest of the command should be. It sounds like Vivian intends to model some outcome. In that case she should omit the finite population corrections.
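
A minimal sketch of the advice above, under stated assumptions: only vilid comes from the thread, while y, x, z and wt are hypothetical variable names standing in for the outcome, the endogenous regressor, the instrument and a sampling weight. It is an illustration, not Vivian's actual specification, and the -ivregress- lines assume a Stata release that has that command and accepts it under the -svy- prefix.

```stata
* Hypothetical names: y (outcome), x (endogenous regressor),
* z (instrument), wt (sampling weight). Only vilid is from the thread.

* Declare the village as the PSU. No fpc() is given, because the
* aim is to model an outcome rather than to describe a finite population.
svyset vilid [pweight = wt]

* With no weights or strata, the declaration reduces to:
* svyset vilid

* Design-based IV estimate using the settings declared above
svy: ivregress 2sls y (x = z)

* Alternative without -svy-: cluster the standard errors on village
ivregress 2sls y (x = z), vce(cluster vilid)
ivtobit y (x = z), ll(0) vce(cluster vilid)
```

Either route treats the village as the only level of clustering, which matches the point that the meteorological stations were matched to villages rather than sampled.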


On Apr 24, 2008, at 4:32 PM, Stas Kolenikov wrote:

> On Wed, Apr 23, 2008 at 3:54 PM, Vivian Hoffmann <> wrote:

> > Is there an easy way to introduce clustering at multiple levels when using
> > ivreg and ivtobit?

> I doubt those commands would respond to any -svy- settings. Check with
> -svyivreg- though. At any rate,...

> > My two levels of clusters are village (vilid) and meteorological station
> > (id). Villages are randomly sampled from the population of villages. There
> > are 21 meteorological stations and I have matched one of these to each
> > village by minimum distance.

> ... as you properly note below, you did not sample your stations, so
> they do not belong to the design in any least way. All there is in
> your design is clustering on villages (unless you had some
> stratification there, and you should know better whether you had
> different probabilities of selection leading to sampling weights). You
> can model your met stations as fixed effects with dummy variables, or
> you can post-stratify on them, although that is somewhat dubious a
> solution. Why do you want to do anything with those stations, anyway?

> > I tried to do 2-stage clustering using svyset <svyset id|| vilid> but got
> > the error "Note: stage 1 is sampled with replacement, all further stages
> > will be ignored".
> >
> > I checked the Stata FAQ and apparently the solution to this problem is to
> > include a finite population correction for the highest level. But an FPC
> > on the met stations doesn't make sense, since these are not sampled. In
> > fact these are really more like strata (they cover mutually exclusive
> > territory). However these are not true strata and I think it would be
> > wrong to treat them as such.

> Stas Kolenikov, also found at
>
> Small print: Please do not reply to my Gmail address as I don't check
> it regularly.