

From: Steve Samuels <sjsamuels@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Treatment of missing values in surveys in Stata (subpop)
Date: Sun, 10 Mar 2013 22:55:56 -0400

You are welcome, Ángel. The bottom line is that missing a data item (or not) does not classify people in the _population_ the way that a personal attribute like gender does.

Steve

On Mar 10, 2013, at 3:12 PM, Ángel Rodríguez Laso wrote:

Thank you very much, Steve. Very convincing explanation.

Angel Rodriguez-Laso

2013/3/9 Steve Samuels <sjsamuels@gmail.com>:
> Ángel:
>
> Rereading, I see that you asked about using the subpop() option when
> there are missing values. A particular question can go unanswered for
> many reasons, including fatigue, haste, interviewer error, and data
> entry mistakes. So again, the theory of the subpopulation correction
> does not apply.
>
> You don't need to recode a missing numerical value to something like
> 999 in order to use subpop(). Such 999 coding is used only on data
> forms these days.
>
> . svy, subpop(if var < .): command var
>
> would do the job. This also takes care of extended missing values,
> like .a, since Stata orders them as . < .a < .b < ... < .z, all of
> them above any nonmissing number.
>
> Multiple imputation is the approach for handling missing values.
>
> Steve
>
> Ángel:
>
> The theory of subpopulation corrections does not apply to non-response.
>
> A subpopulation is a subset of the population that can be defined in
> advance (e.g., males, ages 30-40, living in rural areas). The number
> selected by a sample will be random. For example, suppose a population
> of N members contains a subpopulation of M members. An SRS of size n
> is taken. You should be able to work out the exact probability that
> the sample will contain exactly k members of the subpopulation. The
> theory of the subpopulation correction is an extension of this, and
> can be found in any good text.
>
> In contrast, "responder" is not a characteristic, like gender, that is
> known in advance. It is defined only in relation to the particular sample
> design and protocol. For identical designs, better protocols can
> increase response rates.
> Thus, sampling theory alone cannot describe the numbers of responders
> and, consequently, the subpopulation correction is not applicable.
>
> Steve
>
> sjsamuels@gmail.com
>
> On Mar 8, 2013, at 2:37 PM, Ángel Rodríguez Laso wrote:
>
> Dear Statalisters,
>
> I have found two recommended procedures for dealing with individuals
> with missing items ('normal' missing answers like 'DK/DA' or equipment
> failure) when analysing surveys with Stata:
>
> 1) One is based on the recommendation that, unless there is a very
> strong reason to do otherwise, whenever you analyse a group of
> individuals in a survey with Stata, you have to use subpop. (See for
> example: http://www.stata.com/meeting/mexico10/mex10sug_canette.pdf).
> Under this perspective, those with valid values would be a
> subpopulation. From my point of view, this means that in order to
> prevent Stata from dropping them from the calculation of standard
> errors, missing codes (".") should be recoded to a numerical value
> (like 999) and then a command issued this way:
>
> svy, subpop(if var<999): command var
>
> 2) Nevertheless, most of the information I've read does not make any
> statement about this, which implicitly means that missing codes don't
> need to be recoded. I've even found this piece of advice
> (http://www.stata.com/statalist/archive/2012-09/msg01028.html): 'I've
> never seen a recommendation to consider observations with non-missing
> values as a subpopulation'
>
> I wonder if anyone could throw some light on this topic.
>
> Thank you very much.
>
> Angel Rodriguez-Laso
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/faqs/resources/statalist-faq/
> * http://www.ats.ucla.edu/stat/stata/
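
A minimal do-file sketch of the missing-value ordering and the subpop() syntax discussed in the thread. The toy data, the variable names (psu, pw, y, answered), and the one-PSU-per-observation design are hypothetical and chosen only to make the commands runnable. It shows the syntax only; whether subpop() is appropriate for item non-response at all is the question the thread addresses.

```stata
* Sketch only: hypothetical toy data, not taken from the thread.
clear
set obs 8
generate byte psu = _n          // hypothetical: each observation is its own PSU
generate double pw = 10         // hypothetical constant sampling weight
generate double y = _n          // hypothetical survey item
replace y = .  in 3             // system missing
replace y = .a in 6             // extended missing
svyset psu [pweight = pw]

* Missing values sort above every number: . < .a < ... < .z,
* so "y < ." is true only for nonmissing values.
count if y < .                  // counts the 6 nonmissing observations
count if !missing(y)            // equivalent condition

* Two equivalent ways to pass the condition to subpop():
generate byte answered = y < .
svy, subpop(answered): mean y
svy, subpop(if y < .): mean y
```

Because .a, .b, and so on all sort above the system missing value, the single comparison y < . excludes every kind of missing value without recoding anything to 999.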

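The probability mentioned in the N, M, n, k example can be written out explicitly. This is a sketch under the assumption that "SRS" means simple random sampling without replacement, in which case the count K of subpopulation members in the sample follows a hypergeometric distribution. The symbol K and the numeric example are illustrative additions, not from the thread.

```latex
% Assumes simple random sampling WITHOUT replacement (an assumption, not stated in the thread).
% K = number of subpopulation members in the sample (illustrative symbol).
\[
  \Pr(K = k) \;=\; \frac{\binom{M}{k}\,\binom{N-M}{\,n-k\,}}{\binom{N}{n}},
  \qquad \max\bigl(0,\; n-(N-M)\bigr) \;\le\; k \;\le\; \min(n,\, M).
\]
% Illustrative numbers: N = 10, M = 4, n = 3, k = 2.
\[
  \Pr(K = 2) \;=\; \frac{\binom{4}{2}\binom{6}{1}}{\binom{10}{3}}
  \;=\; \frac{6 \cdot 6}{120} \;=\; 0.3 .
\]
```

The subpopulation sample size is random, but its distribution is fully determined by the design (N, M, n). That is the property that the number of responders lacks.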
