Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


From: Stas Kolenikov <skolenik@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Finite population correction with clustering of SE at a different level than the strata
Date: Wed, 6 Jun 2012 07:42:43 -0500

On Mon, Jun 4, 2012 at 8:57 AM, Ole Dahl Rasmussen <odr@dca.dk> wrote:
> Dear Statalist,
>
> As part of a cluster randomized control trial, colleagues and I are doing stratified sampling and we're not sure if we're analyzing data correctly. Great if someone has suggestions.
>
> We have 46 villages. Before anything else, we went to all villages and asked them if they would be interested in participating in the project we were about to implement. We wrote down the names of the interested households on lists. We then stratified the population on village and interest: On household population lists we marked the interested households and randomly selected an absolute number, 24, of the interested and 14 on the non-interested in each village, 1750 household out of a total population of approximately 3000 households. In the end we have a total of 92 interested/village combination, which we define as our stratas in the analysis. The sampling rate inside the stratas vary from 10% to 100%.
>
> Then we randomly selected 23 of the villages and implemented a project in these 23 villages.
>
> After two years, we surveyed everybody again.
>
> Finally, following Cameron/Trivedi p 817 in Microeconometrics and others, we estimate the following:
>
> svyset vid [pweight=weights], fpc(one) || _n, strata(strataID) fpc(f) singleunit(certainty)

This is a weird design specification. This is what it says:

1. your PSUs are identified by -vid-, but
2. they don't contribute any variance at the first stage, since the fpc of 1 kills all variability.
3. Then, at the next stage, you have a stratified SRSWOR sample of observations, with strata given by -strataID- and fpc given by -f-. If any stratum has only one observation in use, its contribution to the variance is disregarded (that is what -singleunit(certainty)- does).

In a sense, (2) indicates that this sample is not generalizable to any population; whether that is true or not depends on where the 46 initial villages came from.
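To see why a first-stage fpc of 1 removes the village-level variance, it may help to look at the general shape of a two-stage variance estimator with finite population corrections. The notation below is an assumption introduced here for illustration and is not from the original post: f_1 is the first-stage sampling fraction, f_{2h} is the sampling fraction in second-stage stratum h, and the V-hat terms stand for the usual between-PSU and within-stratum variance components. This is a schematic sketch, not the exact formula Stata implements.

```latex
\widehat{V} \;=\; (1 - f_1)\,\widehat{V}_1 \;+\; f_1\,\widehat{V}_2,
\qquad
\widehat{V}_2 \;=\; \sum_{h} (1 - f_{2h})\,\widehat{V}_{2h}
```

With f_1 = 1 the first term vanishes and only the within-village, within-stratum component remains. By the same arithmetic, any stratum sampled at 100% has 1 - f_{2h} = 0 and adds nothing to the second-stage variance.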
If they were sampled from a larger population, then you would need to account for that in the first stage. If you somehow got stuck with them based on what the national government gave you, then it is indeed impossible to say how your microfinance project could work in the population as a whole beyond the sample that you have.

If you do care about correlations of the units within villages (which is the advice you seem to be getting from the empirical economics literature: cluster as high as you can, then come up with a justification as to why you have done so), you should omit the -fpc()- option in the first stage and pretend you sampled these villages in the first place.

Note that "stratum" is singular and "strata" is plural, so "stratas" is a non-word.

--
Stas Kolenikov -- http://stas.kolenikov.name
Senior Survey Statistician, Abt SRBI
Opinions stated in this email are mine only, and do not reflect the position of my employer
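As a minimal sketch of the suggestion to omit -fpc()- at the first stage, the declaration could look like the following. The variable names (-vid-, -weights-, -strataID-, -f-, -one-) come from the original post; the data set is assumed to be in memory already. As far as I understand Stata's documented behavior, when no fpc is given for stage 1 the PSUs are treated as sampled with replacement and later stages are ignored for variance estimation, so the second-stage part is left out here. Treat that as an assumption to check against the note -svyset- prints.

```stata
* Sketch only: variable names are taken from the original post;
* the survey data are assumed to be loaded already.

* Original declaration, kept as a comment for comparison:
* the first-stage fpc of 1 removes the village-level variance.
* svyset vid [pweight=weights], fpc(one) || _n, strata(strataID) fpc(f) singleunit(certainty)

* Alternative: no -fpc()- at the first stage, so villages are treated
* as sampled PSUs and standard errors reflect within-village correlation.
svyset vid [pweight=weights]

* Inspect the declared design before estimating anything.
svydescribe
```

Comparing standard errors from the two declarations on the same model would show how much of the uncertainty comes from the village level.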

**References**: **st: Finite population correction with clustering of SE at a different level than the strata** *From:* Ole Dahl Rasmussen <odr@dca.dk>
