

From: Steve Samuels <sjsamuels@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Treatment of missing values in surveys in Stata (subpop)
Date: Fri, 8 Mar 2013 16:04:59 -0500

Ángel:

The theory of subpopulation corrections does not apply to non-response. A subpopulation is a subset of the population that can be defined in advance (e.g. males, ages 30-40, living in rural areas). The number of subpopulation members selected by a sample will be random. For example, suppose a population of N members contains a subpopulation of M members, and an SRS of size n is taken. You should be able to work out the exact probability that the sample will contain exactly k members of the subpopulation. The theory of the subpopulation correction is an extension of this, and can be found in any good text.

In contrast, "responder" is not a characteristic, like gender, that is known in advance. It is defined only in relation to the particular sample design and protocol. For identical designs, better protocols can increase response rates. Thus, sampling theory alone cannot describe the numbers of responders and, consequently, the subpopulation correction is not applicable.

Steve
sjsamuels@gmail.com

On Mar 8, 2013, at 2:37 PM, Ángel Rodríguez Laso wrote:

Dear Statalisters,

I have found two recommended procedures for dealing with individuals with missing items ('normal' missing answers like 'DK/DA' or equipment failure) when analysing surveys with Stata:

1) One is based on the recommendation that, unless there is a very strong reason to do otherwise, whenever you analyse a group of individuals in a survey with Stata, you have to use subpop. (See for example: http://www.stata.com/meeting/mexico10/mex10sug_canette.pdf). Under this perspective, those with valid values would be a subpopulation.
From my point of view, this means that in order to prevent Stata from dropping them from the calculation of standard errors, missing codes (".") should be recoded to a numerical value (like 999) and then a command issued this way:

    svy, subpop(if var<999): command var

2) Nevertheless, most of the information I've read does not make any statement about this, which implicitly means that missing codes don't need to be recoded. I've even found this piece of advice (http://www.stata.com/statalist/archive/2012-09/msg01028.html):

'I've never seen a recommendation to consider observations with non-missing values as a subpopulation'

I wonder if anyone could throw some light on this topic.

Thank you very much.

Angel Rodriguez-Laso

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/faqs/resources/statalist-faq/
*   http://www.ats.ucla.edu/stat/stata/
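Editorial note on the SRS example in Steve's reply: the probability he refers to is presumably the hypergeometric one, C(M, k) * C(N - M, n - k) / C(N, n), for simple random sampling without replacement. The Python sketch below is only an illustration of that calculation; the function name and the numbers (N = 1000, M = 200, n = 50) are made up here and do not come from the thread.

```python
from math import comb


def prob_k_in_sample(N, M, n, k):
    """Probability that an SRS (without replacement) of size n, drawn from a
    population of N units containing M subpopulation members, includes
    exactly k subpopulation members (hypergeometric probability)."""
    if not (0 <= M <= N and 0 <= n <= N):
        raise ValueError("need 0 <= M <= N and 0 <= n <= N")
    if k < 0 or k > n:
        return 0.0
    # math.comb returns 0 when the second argument exceeds the first,
    # so impossible values of k (k > M or n - k > N - M) give probability 0.
    return comb(M, k) * comb(N - M, n - k) / comb(N, n)


# Hypothetical values, chosen only for illustration.
N, M, n = 1000, 200, 50

# Full distribution of the (random) number of subpopulation members sampled.
probs = [prob_k_in_sample(N, M, n, k) for k in range(n + 1)]

# The expected subpopulation sample size; in theory this equals n * M / N.
expected_size = sum(k * p for k, p in enumerate(probs))
```

The point of the sketch is that the subpopulation sample size is a random variable whose distribution follows from the design alone (N, M and n). No comparable calculation exists for the number of responders, which also depends on the survey protocol.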

**References**: **st: Treatment of missing values in surveys in Stata (subpop)** *From:* Ángel Rodríguez Laso <angelrlaso@gmail.com>
