This FAQ applies only to Stata 8 and earlier versions of Stata.

What do I do when one of the survey estimators returns an error message, "stratum with only one PSU detected"?

Title   Stratum with only one PSU detected
Author Allen McDowell, StataCorp

A stratum with a single PSU is a fairly common problem. When a stratum contains only one PSU, there is not enough information to estimate that stratum's contribution to the variance, because variability between PSUs cannot be measured from a single PSU. As a result, the variance of an estimated parameter cannot be computed when the data are from a stratified clustered design.

There are two solutions. The first is to delete the stratum with the singleton PSU from your sample. The second is to treat the data from that stratum as though they came from another stratum. To implement either solution, you must first identify which strata are affected and which observations in the dataset belong to those strata. The svydes command identifies strata with singleton PSUs by placing an asterisk next to the number of PSUs in the affected stratum. For example, in the output below, stratum 1 is identified as having only one PSU.

 . svydes
 
 pweight:  pweight
 Strata:   strata
 PSU:      psu
                                       #Obs per PSU
  Strata                       ----------------------------
  strata     #PSUs     #Obs       min      mean       max
 --------  --------  --------  --------  --------  --------
        0         3        52        11      17.3        27
        1         1*        3         3       3.0         3
        3         2        19         9       9.5        10
 --------  --------  --------  --------  --------  --------
        3         6        74         3      12.3        27

Now I can identify the observations within that stratum by issuing the following command:

 . list  strata psu if strata==1
        
        strata       psu
  53.        1         3
  54.        1         3
  55.        1         3

I now know that I must either delete observations 53, 54, and 55 from my sample or assign the variable “strata” a different value in those observations. I could, for example, type

 . replace strata = 3 in 53/55
 (3 real changes made)

so that now, when I use svydes,

 . svydes
 
 pweight:  pweight
 Strata:   strata
 PSU:      psu
                                       #Obs per PSU
  Strata                       ----------------------------
  strata     #PSUs     #Obs       min      mean       max
 --------  --------  --------  --------  --------  --------
        0         3        52        11      17.3        27
        3         3        22         3       7.3        10
 --------  --------  --------  --------  --------  --------
        2         6        74         3      12.3        27

I no longer have any strata with a singleton PSU.
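
The first solution, deleting the stratum, is not shown above. As a minimal sketch using the same variable names, it would be run in place of the replace command:

 . drop if strata==1

Rerunning svydes afterward should then list only strata 0 and 3, neither of which has a singleton PSU in this example.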

Strata with singleton PSUs can arise for several reasons. You might, for example, run svydes on your full dataset and not detect any strata with singleton PSUs. However, suppose you then use a survey estimator, say, svymean, and within some stratum the variable whose mean you are estimating is missing for every observation except those in a single PSU. svymean would then issue an error message saying that it had detected a stratum with a singleton PSU. To detect this problem ahead of time, run svydes with a varlist containing only the variables that you will use with the survey estimator.
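
As a sketch of that check, assume the variable whose mean is to be estimated is named y (a hypothetical name, not one from the example dataset above):

 . svydes y

With a varlist specified, svydes should base its report on the observations that have nonmissing values of y, so a stratum left with only one usable PSU would be flagged before svymean is run.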

A stratum with a singleton PSU can also arise during estimation if the estimator itself drops observations from the estimation sample. This can occur, for example, with svylogit or svyprobit. When estimating a logit or probit model, observations can be excluded from the estimation sample because, in those observations, a variable or group of variables forms a perfect predictor of the dependent variable. This makes parameter estimation impossible unless those observations (and perhaps some variables) are excluded from the estimation sample. Once these offending observations are dropped, the result may be a stratum that has observations from only a single PSU. In such a case, svylogit or svyprobit would simply terminate with an error message. To verify that this is indeed the problem and to identify which observations are being dropped, use logit or probit with pweights and the cluster() option (the clusters are the same thing as PSUs). You can then use the e(sample) function to identify the estimation sample. With an indicator for the estimation sample in hand, you can then say

 . svydes varlist if e(sample)

(where the varlist represents the variables in the svylogit or svyprobit model). svydes will identify the strata with singleton PSUs. Once you have corrected the problem, you can then estimate the model using svylogit or svyprobit.
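
The whole sequence might look like the following sketch. The dependent variable y, the covariates x1 and x2, and the indicator insample are hypothetical names chosen for illustration; pweight and psu are the weight and PSU variables from the svydes output shown earlier.

 . logit y x1 x2 [pweight=pweight], cluster(psu)
 . generate byte insample = e(sample)
 . svydes y x1 x2 if insample

Saving e(sample) in a variable keeps the indicator available after later estimation commands replace the stored results.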