This FAQ is only applicable for Stata 8 or earlier versions of Stata.
What do I do when one of the survey estimators returns an error message,
"stratum with only one PSU detected"?
Title: Stratum with only one PSU detected
Author: Allen McDowell, StataCorp
Date: July 2002

Having a stratum with a single PSU is a fairly common problem. When a
stratum contains only one PSU, there is insufficient information with which
to compute an estimate of that stratum's variance. Consequently, the variance
of an estimated parameter cannot be computed for a stratified clustered
design in which any stratum has a singleton PSU.

There are two solutions. The first is to delete the stratum with the
singleton PSU from your sample. The second is to treat the data from that
stratum as though they came from another stratum. To implement either
solution, you must first identify which strata are affected and which
observations in the dataset belong to those strata. The svydes command
identifies the strata with singleton PSUs by placing an asterisk next to the
PSU count for that stratum. For example, in the output below, stratum 1 is
identified as having only 1 PSU.

. svydes

pweight:  pweight
Strata:   strata
PSU:      psu
                                      #Obs per PSU
  Strata                      ----------------------------
  strata     #PSUs      #Obs       min      mean       max
--------  --------  --------  --------  --------  --------
       0         3        52        11      17.3        27
       1         1*        3         3       3.0         3
       3         2        19         9       9.5        10
--------  --------  --------  --------  --------  --------
       3         6        74         3      12.3        27

Now I can identify the observations within that stratum by issuing the
following command:
. list strata psu if strata==1

         strata        psu
 53.          1          3
 54.          1          3
 55.          1          3

I now know that I need either to delete observations 53, 54, and 55 from my
sample or to assign the variable “strata” a different value in those same
observations. I could, for example, type
. replace strata = 3 in 53/55
(3 real changes made)
so that now, when I use svydes,
. svydes

pweight:  pweight
Strata:   strata
PSU:      psu
                                      #Obs per PSU
  Strata                      ----------------------------
  strata     #PSUs      #Obs       min      mean       max
--------  --------  --------  --------  --------  --------
       0         3        52        11      17.3        27
       3         3        22         3       7.3        10
--------  --------  --------  --------  --------  --------
       2         6        74         3      12.3        27

I no longer have any strata with a singleton PSU.
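
When a dataset has many strata, reading asterisks off the svydes output and hard-coding observation numbers can be error-prone. The following is only a sketch of one way to flag singleton strata and reassign them on a copy of the stratum variable, starting again from the original data. The names firstobs, npsu, and strata2 are hypothetical, the target stratum 3 is taken from the example above, and the svyset line assumes the Stata 8 form of the command (earlier versions set these characteristics differently).

```stata
* Sketch only: firstobs, npsu, and strata2 are hypothetical names.
* Count the distinct PSUs within each stratum.
sort strata psu
by strata psu: generate byte firstobs = (_n == 1)
by strata: generate npsu = sum(firstobs)
by strata: replace npsu = npsu[_N]

* Observations that belong to strata with a single PSU
list strata psu if npsu == 1

* Solution 1 (deletion) would be:  drop if npsu == 1
* Solution 2 (reassignment), done on a copy so the original is kept:
generate strata2 = strata
replace strata2 = 3 if strata == 1

* Assumes Stata 8 svyset syntax
svyset [pweight=pweight], strata(strata2) psu(psu)
svydes
```

Working on a copy keeps the original stratum identifiers available, so the reassignment can be revisited later.
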
Strata with singleton PSUs can arise for several reasons. You might, for
example, run svydes on your full dataset and not detect any strata with
singleton PSUs. However, suppose you then used a survey estimator, say,
svymean, and the variable whose mean you were estimating had missing values
for all observations within a particular stratum except those in a single
PSU. svymean would then issue an error message saying that it had detected a
stratum with a singleton PSU. To detect this problem ahead of time, you
should use svydes and provide it with a varlist of only those variables that
you will be using with a survey estimator.
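
As a minimal illustration of that check, assuming a hypothetical analysis variable named income that has missing values:

```stata
* income is a hypothetical variable name used for illustration.
* Describe the design using only observations where income is nonmissing.
svydes income

* If no stratum is marked with an asterisk, the estimator should not
* stop with the singleton-PSU error for this variable.
svymean income
```
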
Strata with singleton PSUs can also arise during estimation if the estimator
itself drops observations from the estimation sample. This can occur, for
example, with svylogit or svyprobit. When estimating a logit or probit
model, observations can be excluded from the estimation sample because, in
those observations, a variable or group of variables forms a perfect
predictor of the dependent variable. This makes parameter estimation
impossible unless those observations (and perhaps some variables) are
excluded from the estimation sample. Once these offending observations are
dropped, the result may be a stratum that has observations from only a
single PSU. In such a case, svylogit or svyprobit would simply terminate
with an error message. To verify that this is indeed the problem and to
identify which observations are being dropped, use logit or probit with
pweights and the cluster() option (the clusters are the same thing as PSUs).
You can then use the e(sample) function to identify the estimation sample.
With an indicator for the estimation sample in hand, you can then type
. svydes varlist if e(sample)
(where the varlist represents the variables in the svylogit or
svyprobit model). svydes will identify the strata with
singleton PSUs. Once you have corrected the problem, you can then estimate
the model using svylogit or svyprobit.
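
The steps in this last diagnosis might look like the following sketch. The model variables y, x1, and x2 and the indicator insample are hypothetical names; the weight and PSU variables are those from the svydes output shown earlier.

```stata
* y, x1, x2, and insample are hypothetical names used for illustration.
* Fit the non-survey estimator with pweights and clustering on the PSU;
* any observations dropped because of perfect prediction are reported.
logit y x1 x2 [pweight=pweight], cluster(psu)

* Save the estimation sample as an indicator variable
generate byte insample = e(sample)

* Describe the design for the estimation sample only; strata marked
* with an asterisk have a singleton PSU
svydes y x1 x2 if insample
```

Saving e(sample) in a variable is a convenience: the indicator remains available after later estimation commands replace the stored results.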