
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at


Re: st: Finite population correction with clustering of SE at a different level than the strata

From   Austin Nichols
Subject   Re: st: Finite population correction with clustering of SE at a different level than the strata
Date   Mon, 4 Jun 2012 14:24:04 -0400

Ole Dahl Rasmussen:
Not sure what is meant by "endline" here.  Is it the case that strata
have either "interested"=0 or =1 and treatment is perfectly correlated
with interested?  I would recommend superpopulation inference instead,
but with different probabilities of selection across strata,
. reg consumption treatXendline endline treat [pweight=weights], cluster(vid)
does not get the superpopulation inference exactly right.
Is it true that everyone
assigned treatment gets it?  If not, then you want some kind of ivreg
model, for which the survey stats theory is woefully underdeveloped.
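
For comparison, here is a minimal sketch of what dropping the finite population correction could look like in svy terms. It is an illustration under stated assumptions, not necessarily the specification recommended above, and it shares the limitation just noted about unequal selection probabilities across strata. The variable names are those of the quoted message. With no fpc() at the first stage, svy treats the villages as sampled with replacement and ignores later stages when estimating variances.

```stata
* Sketch only: villages (vid) as sampling units, probability weights, no fpc().
* Assumes the variables from the quoted message are already in memory.
svyset vid [pweight=weights]
svy: regress consumption treatXendline endline treat
```

The point estimates should match the weighted regression with cluster(vid); the standard errors may differ slightly because the two commands apply different small-sample adjustments.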

On Mon, Jun 4, 2012 at 9:57 AM, Ole Dahl Rasmussen wrote:
> Dear Statalist,
> As part of a cluster randomized control trial, colleagues and I are doing stratified sampling and we're not sure if we're analyzing data correctly. Great if someone has suggestions.
> We have 46 villages. Before anything else, we went to all villages and asked them if they would be interested in participating in the project we were about to implement. We wrote down the names of the interested households on lists. We then stratified the population on village and interest: on the household population lists we marked the interested households and randomly selected a fixed number in each village, 24 of the interested and 14 of the non-interested, giving 1750 households out of a total population of approximately 3000 households.  In the end we have a total of 92 village/interest combinations, which we define as our strata in the analysis. The sampling rate within the strata varies from 10% to 100%.
> Then we randomly selected 23 of the villages and implemented a project in these 23 villages.
> After two years, we surveyed everybody again.
> Finally, following Cameron/Trivedi p 817 in Microeconometrics and others, we estimate the following:
> svyset vid [pweight=weights], fpc(one) || _n, strata(strataID) fpc(f) singleunit(certainty)
> svy: reg consumption treatXendline endline treat
> where
> - vid is an ID variable for villages, the level at which I want clustered standard errors
> - weights is the inverse probability of sampling
> - one is a dummy that is equal to 1.
> - consumption is a consumption measure
> - treatXendline is the interaction between selection as treatment village and endline
> - endline is an endline dummy
> - treat is a treatment dummy
> - f is the total probability of sampling
> - strataID is an ID variable for strata which is each of the 92 village/interested combinations.
> So for the questions:
> . Are we doing it right?
> . In particular, is our finite population correction justified?
> . We want to cluster standard errors at the village level, because we think this is the relevant level, i.e. not the strata level. Is this the right way of doing it?
> Any suggestions and thoughts are appreciated.
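
The variables f and weights described in the quoted message can be built from stratum counts. The following is a minimal sketch, assuming two hypothetical variables that do not appear in the original post: N_h, the number of households on the village list in the household's stratum, and n_h, the number of households sampled from that stratum. It is written for a do-file.

```stata
* Hypothetical inputs, one row per sampled household:
*   N_h = households on the village list in this stratum (assumed name)
*   n_h = households sampled from this stratum (assumed name)
generate double f = n_h / N_h        // within-stratum sampling rate
generate double weights = 1 / f      // inverse probability of sampling

* A sampling rate must lie in (0, 1]; stop if any row violates that
assert f > 0 & f <= 1
summarize f weights
```

As an illustration with made-up stratum sizes, a stratum of 240 interested households with 24 sampled has f = 0.10 and weights = 10, while a stratum of 14 non-interested households that are all sampled has f = 1 and weights = 1. Values of an fpc() variable that are at most 1 are read by svyset as sampling rates.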

