Re: st: Correctly setting FPC in svyset

From   Steven Samuels <>
Subject   Re: st: Correctly setting FPC in svyset
Date   Wed, 9 Feb 2011 08:55:38 -0500

The finite population correction is not appropriate here. The theory for it applies only to a sampling fraction set by the design, e.g., if you had a population of 1,000 and selected 500 at random. In contrast, you tried to do a census, that is, to select 100%. The schools that responded are not a random sample of the total.
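
To make the designed case concrete, here is a minimal Python sketch of the textbook correction under simple random sampling without replacement, where the variance of a mean is multiplied by 1 - n/N. The function names and the sample variance of 4.0 are made up for illustration. The sketch shows what the correction does when the sampling fraction really is set by the design; it does not apply to a census with non-response.

```python
import math

def fpc_factor(n, N):
    """Finite population correction factor 1 - n/N for SRS without replacement."""
    if not 0 < n <= N:
        raise ValueError("need 0 < n <= N")
    return 1.0 - n / N

def se_mean_srs(s2, n, N):
    """Standard error of a sample mean under SRS without replacement."""
    return math.sqrt(fpc_factor(n, N) * s2 / n)

# Designed example from the text: 500 units drawn at random from 1,000.
factor = fpc_factor(500, 1000)

# s2 = 4.0 is a hypothetical sample variance, used only for illustration.
se_with_fpc = se_mean_srs(4.0, 500, 1000)
se_without_fpc = math.sqrt(4.0 / 500)

print(factor, se_with_fpc, se_without_fpc)
```

With half the population sampled by design, the variance is halved, so the standard error shrinks by a factor of sqrt(0.5). That reduction is justified only because the 500 units were chosen at random.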

Your problem is how to deal with the large bias due to non-response. For some partial remedies, consult the sections on non-response in a good sampling text. I especially suggest Lohr, "Sampling: Design and Analysis", Duxbury, 1999 or 2010, and Groves et al., "Survey Methodology", Wiley, 2010. Both contain further references.

In my experience, trying to do a census is almost always a mistake. You would have had better information (smaller mean square error) from a designed sample of 50 to 200 schools with methods intended to minimize non-response. We've needed such methods even in business surveys where participation was required by law. If it were my study, I would contact a sample of the non-responders, and I suggest that you do that if possible.


On Feb 9, 2011, at 6:44 AM, R G wrote:


I'm analyzing a census I've done of schools, no stratification or
clustering. Response rate is around 50%. I need to set the FPC (finite
population correction) in svyset and I'm not sure of the appropriate
number to use. I know some argue that there's no need to set the FPC, but
for the issues I'm surveying there is a strong case for considering a
finite population, and not a superpopulation.

According to Stata help:

fpc(varname) requests a finite population correction for the variance
estimates.  If varname has values less than or equal to 1, it is
interpreted as a stratum sampling rate f_h = n_h/N_h, where n_h = number
of units sampled from stratum h and N_h = total number of units in the
population belonging to stratum h.  If varname has values greater than or
equal to n_h, it is interpreted as containing N_h.  It is an error for
varname to have values between 1 and n_h or to have a mixture of sampling
rates and stratum sizes.
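
The rule in the help text quoted above can be sketched in Python as follows. This only mirrors the quoted wording. The function name interpret_fpc is hypothetical, and the sketch makes no claim about how Stata implements the check internally or how it treats values the quoted text does not mention (for example, negative values).

```python
def interpret_fpc(value, n_h):
    """Classify one fpc() value for a stratum with n_h sampled units.

    Returns (kind, sampling_rate), following the quoted help text:
      value <= 1    -> read as the sampling rate f_h = n_h / N_h
      value >= n_h  -> read as the population size N_h
      otherwise     -> error
    """
    if value <= 1:
        return ("sampling rate", value)
    if value >= n_h:
        return ("population size", n_h / value)
    raise ValueError("fpc value lies between 1 and n_h")

# Hypothetical stratum with 500 sampled units out of 1,000.
as_size = interpret_fpc(1000, 500)
as_rate = interpret_fpc(0.5, 500)
print(as_size, as_rate)
```

Under this reading, 1000 and 0.5 describe the same sampling fraction for 500 sampled units, while a value such as 200 would be rejected.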

The UCLA page also states that:
Stata will calculate the actual fpc for us; we just need to specify the
population total.

If there are a total of 1000 schools, with 500 responses, am I correct in
interpreting that I should set the fpc=1000?

Just need to confirm this, as I'm relatively new at Stata's svy commands.



