The Stata listserver

st: Re: Dealing with strata with singleton PSU's

From   "R.E. De Hoyos" <[email protected]>
To   <[email protected]>
Subject   st: Re: Dealing with strata with singleton PSU's
Date   Thu, 19 May 2005 14:47:14 +0300


If you are using Stata 8 or earlier, here is what I would do:

1. Identify the singletons for each of your age/gender cells using Jeff Pitblado's -singleton- (code provided below).

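***** BEGIN: singleton.ado
* Flag observations that belong to strata containing a single PSU
program singleton, sort
    version 8
    syntax [varlist(numeric default=none)] [if] [in], ///
        STRata(varname) gen(name) [ PSU(varname) ]
    confirm new var `gen'
    marksample touse
    * With no psu() option, treat every observation as its own PSU
    if "`psu'" == "" {
        tempvar psu
        gen `psu' = _n
    }
    tempvar u
    sort `touse' `strata' `psu'
    * Mark the first observation of each PSU, then count PSUs per stratum
    quietly by `touse' `strata' `psu': gen `u' = _n == 1
    quietly by `touse' `strata': replace `u' = sum(`u')
    * 1 if the stratum has exactly one PSU, 0 otherwise
    quietly by `touse' `strata': replace `u' = cond(`u'[_N] == 1, 1, 0)
    * Set the indicator to missing outside the selected sample
    quietly replace `u' = . if !`touse'
    rename `u' `gen'
end
***** END: singleton.ado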

2. Create a general-singleton identifier:

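gen gsingleton = (singleton_cell1 == 1) | (singleton_cell2 == 1) | ...

(Compare each indicator with 1 explicitly: -singleton- leaves it missing outside the cell's sample, and Stata treats missing as true in a logical expression.)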

3. Collapse the strata to remove the singletons identified by "gsingleton". This will allow you to use the same stratification for each of your statistics. The pweights must remain the same, since the expansion factors are associated with the PSUs, and these won't change.
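
A rough sketch of how the three steps might fit together, using the Stata 8 syntax for -svyset- and -svymean-. All variable names (grade, male, stratum, school, wt, bmi) and the stratum codes 12 and 13 are hypothetical, stratum is assumed to be numeric, and only two cells are shown. Which strata to merge is a judgement call that depends on your design, so the -replace- line is only a placeholder for your own rule.

```stata
* Hypothetical names: grade, male, stratum, school (PSU), wt (pweight), bmi

* Step 1: flag singleton strata within each age/gender cell
singleton if grade ==  9 & male == 1, strata(stratum) psu(school) gen(singleton_cell1)
singleton if grade == 10 & male == 1, strata(stratum) psu(school) gen(singleton_cell2)

* Step 2: general identifier (the flags are missing outside their own cell,
* so compare with 1 rather than relying on the raw value)
gen byte gsingleton = (singleton_cell1 == 1) | (singleton_cell2 == 1)

* Step 3: mark every stratum that is a singleton in at least one cell ...
egen byte anysingle = max(gsingleton), by(stratum)

* ... and merge it into a stratum of your choosing
* (hypothetical rule: fold stratum 13 into stratum 12)
gen newstra = stratum
replace newstra = 12 if stratum == 13 & anysingle == 1

* Check that a cell no longer has singleton strata under the new definition
singleton if grade == 9 & male == 1, strata(newstra) psu(school) gen(check1)
count if check1 == 1

* Same pweights and PSUs as before; only the strata change
svyset [pweight = wt], strata(newstra) psu(school)
svymean bmi if grade == 9 & male == 1
```

If the -count- still reports observations, the merged stratum is itself a singleton in that cell and needs to be collapsed further.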

I hope this helps,

R.E. De Hoyos
Faculty of Economics
University of Cambridge

----- Original Message -----
From: "Trish Gorely" <[email protected]>
To: <[email protected]>
Sent: Thursday, May 19, 2005 2:30 PM
Subject: st: Dealing with strata with singleton PSU's

I have a stratified data set for which I want to calculate means and proportions
using svymean and svyprop. Unfortunately I have some
strata with single PSUs, and svymean and svyprop don't like this. The
manual and help service recommend 2 ways of dealing with the singletons:
1. collapse across strata to effectively remove them (the advice being to
collapse in the way that makes most sense for your data)
2. drop the singleton PSUs

The preferred option for me is to collapse across strata and I can do this
easily enough. However I'm still not clear on the following:

1. Do you need to recalculate probability weights?
2. Do you need to use the same collapsed strata for everyone? For example,
when I do svymean for Grade 9 boys I have 3 singleton PSUs, but when I do the
same analysis for Grade 10 boys there are 4 singleton PSUs, and at grade 11
there are 7! The problem is much smaller for girls (at grade 9 there is 1, at
grade 10 there is 1, and at grade 11 there are 3). Should I collapse to remove
the singletons for year 11 boys (which would, by chance, have the net effect of
removing all the singletons in all year/gender groups), call the new strata
NEWSTRA, and then use NEWSTRA to define the data for all analyses, or should I
do the relevant collapse for each age/gender group?

Thanks for any help anyone can offer

* For searches and help try:
